Tutorial#

How do I use logpyle?#

Consider this example of a ‘simulation’ logging various quantities:

#!/usr/bin/env python3

import logging
from random import uniform
from time import sleep
from warnings import warn

from logpyle import (GCStats, IntervalTimer, LogManager, LogQuantity,
                     add_general_quantities, add_run_info, add_simulation_quantities,
                     set_dt)


class Fifteen(LogQuantity):
    def __call__(self) -> int:
        return 15


logger = logging.getLogger(__name__)


def main() -> None:
    logmgr = LogManager("log.sqlite", "wo")

    logmgr.enable_save_on_sigterm()

    # set a run property
    logmgr.set_constant("myconst", uniform(0, 1))

    # Generic run metadata, such as command line, host, and time
    add_run_info(logmgr)

    # Time step duration, wall time, ...
    add_general_quantities(logmgr)

    # Simulation time, time step
    add_simulation_quantities(logmgr)

    # Additional quantities to log
    vis_timer = IntervalTimer("t_vis", "Time spent visualizing")
    logmgr.add_quantity(vis_timer)
    logmgr.add_quantity(Fifteen("fifteen"))
    logmgr.add_quantity(GCStats())

    # Watches are printed periodically during execution
    logmgr.add_watches(["step.max", "t_sim.max", "t_step.max", "fifteen",
                        "t_vis", "t_log", "memory_usage_hwm"])

    for istep in range(200):
        logmgr.tick_before()

        dt = uniform(0.01, 0.05)
        set_dt(logmgr, dt)
        sleep(dt)

        # Illustrate custom timers
        if istep % 10 == 0:
            with vis_timer.start_sub_timer():
                sleep(0.05)

        # Illustrate warnings capture
        if uniform(0, 1) < 0.05:
            warn("Oof. Something went awry.")

        if istep == 50:
            logger.warning("test logging")
            print("FYI: Setting watch interval to 5 seconds.")
            logmgr.set_watch_interval(5)

        if istep == 150:
            logger.error("another logging test")
            print("FYI: Setting watch interval back to 1 second.")
            logmgr.set_watch_interval(1)

        logmgr.tick_after()

    logmgr.close()


if __name__ == "__main__":
    main()

(You can find this example as examples/log.py in the logpyle source distribution.)

Running this script will give you a file log.sqlite. Note how log quantities given as “watches” are also periodically logged to the console.

To analyze this data, we need some tools that are distributed as part of this package.

You may use runalyzer to analyze the log file:

$ runalyzer log.sqlite
Creating an in memory database from provided files
Scanning...          [########################################] ETA ?
Importing...         [########################################] ETA ?
Runalyzer running on Python 3.8.5 (default, Aug  2 2020, 15:09:07)
[GCC 10.2.0]
Run .help to see help for 'magic' commands
>>>

runalyzer accepts multiple input files. Behind the scenes, these files are gathered into a single database for analysis. You can optionally do this manually by running runalyzer-gather as a script:

$ runalyzer-gather summary.sqlite log.sqlite
Scanning...          [########################################] ETA ?
Importing...         [########################################] ETA ?

Specifying multiple data files serves two important functions:

  • If you would like to compare data from multiple runs, you can simply add data files from multiple runs and analyze them side-by-side.

  • For distributed-memory runs, you can likewise give data sets from multiple ranks to be gathered in a single file for centralized analysis.

  • As well as, naturally, any combination of the above.

This is a normal Python console, but it has a few extra tricks up its sleeve. .help gives an overview. Here, we’ll focus on demonstrating two interesting features. First, lets get an overview of all the logged quantities:

>>> .quantities
id | name    | unit | description                 | rank_aggregator
---+---------+------+-----------------------------+-----------------
7  | dt      | s    | Simulation Timestep         | None
9  | fifteen | None | None                        | None
5  | step    | 1    | Timesteps                   | None
2  | t_2step | s    | Step-to-step duration       | None
3  | t_wall  | s    | Wall time                   | None
4  | t_log   | s    | Time spent updating the log | None
6  | t_sim   | s    | Simulation Time             | None
1  | t_step  | s    | Time step duration          | None
8  | t_vis   | s    | Time spent visualizing      | None

Next, we could plot, e.g. time step duration against simulation time:

>>> .plot select $t_sim, $t_2step

Next, a fact that the select may have given away: The argument to .plot is a SQL query that is passed on to sqlite3, with support for some additional syntax. You can see what this query expanded to internally:

>>> mangle_sql("select $t_sim, $t_2step")
'select t_sim.value, t_2step.value from runs  inner join t_sim on (t_sim.run_id = runs.id)  inner join t_2step on (t_2step.run_id = runs.id and t_sim.step = t_2step.step and t_sim.rank=t_2step.rank) '

As you can see, by default these queries return data from all runs. You can filter which runs you are interested in through run properties:

>>> .runprops
cmdline
date
dirname
filename
id
is_parallel
machine
myconst
rank_count
schema_version
unique_run_id
unixtime

simply by adding a SQL where clause referring to these properties.

As you can see from the demo, you can of course add your own, application-focused run properties.

This means that you have the full power of SQL available to analyze your time series data, for example:

  • Plot the solution 2-norms of all runs that were run on a Tuesday between time step 2000 and 3000

  • Make a scatterplot correlating memory transfer amounts and simulation time

In some cases (for example, when logging is optional), it may be convenient to retrieve sub-timers once and reuse them. Adapting the example above, sub-timer initialization becomes:

# Add a timer quantity and retrieve a sub-timer for later use
if logmgr:
    vis_timer = IntervalTimer("t_vis", "Time spent visualizing")
    logmgr.add_quantity(vis_timer)
    time_vis = vis_timer.get_sub_timer()
else:
    from contextlib import nullcontext
    time_vis = nullcontext()

and usage becomes:

if istep % 10 == 0:
    # Use the sub-timer
    with time_vis:
        sleep(0.05)

(Full example code can be found in examples/optional-log.py in the logpyle source distribution.) Check out the Analysis section for further information.