Profiler

Understanding latency-related issues with the engine is difficult since attaching the debugger alters the real-time performance characteristics. Similarly, frequent log output itself can induce engine latency that prevents pinpointing of problems. For this reason, the engine includes a low-overhead profiler to observe the engine retrospectively without affecting its operation. This functionality is paired with a utility in the vortex-tools package for parsing and interpreting the profiler output.

Usage

The profiler is activated by setting the environment variable VORTEX_PROFILER_LOG with a valid path, such as log.prof. When this variable is present, each session of the vortex engine (i.e., each call to Engine.start()) will write profiler output to the given path, creating intermediate directories as needed. No code changes are required to activate this functionality. THe profiler is disabled when the VORTEX_PROFILER_LOG environment variable is absent or empty.

Caution

The profiler will overwrite logs from prior sessions. Take care when running multiple sessions.

Caution

Profiler output is written to disk in a background thread. Ensure that the session shuts down completely so that all profiling events are recorded.

Tip

To confirm activation of the profiler, check the engine’s debug-level log output for messages regarding the opening and closing of the profiler log.

Interpretation

The vortex-tools package provides the profiler module for inspecting and interpreting profiler output.

> python -m vortex_tools.profiler -h
usage: profiler.py [-h] [--count COUNT] [--skip SKIP] [--pretty] [--timeline] [--waterfall] [--statistics] path

print vortex profiler log

positional arguments:
path                  path to profiler log

options:
-h, --help            show this help message and exit
--count COUNT, -c COUNT
                      number of records to show (default: None)
--skip SKIP, -s SKIP  number of records to skip (default: None)
--pretty, -p          show pretty output (default: False)
--timeline, --timing  show profiler timeline (timing) diagram (default: False)
--waterfall           show profiler waterfall diagram (default: False)
--statistics, --stats
                      show profiler statistics (default: False)
$ python3 -m vortex_tools.profiler -h
usage: profiler.py [-h] [--count COUNT] [--skip SKIP] [--pretty] [--timeline] [--waterfall] [--statistics] path

print vortex profiler log

positional arguments:
path                  path to profiler log

options:
-h, --help            show this help message and exit
--count COUNT, -c COUNT
                      number of records to show (default: None)
--skip SKIP, -s SKIP  number of records to skip (default: None)
--pretty, -p          show pretty output (default: False)
--timeline, --timing  show profiler timeline (timing) diagram (default: False)
--waterfall           show profiler waterfall diagram (default: False)
--statistics, --stats
                      show profiler statistics (default: False)

The output consists of a series of records that specific the profiling event, associated task index (e.g., for an acquisition task), the job or block number, and the timestamp in nanoseconds. The first profiling event indicates the vortex and profiler versions in the task and job fields, respectively.

> python -m vortex_tools.profiler log.prof --pretty --count 10
                          code  task  job        timestamp
0             PROFILER_VERSION   403    0                0
1                ENGINE_LAUNCH     0    0  681110124089600
2                   ENGINE_RUN     0    0  681110147807500
3                   JOB_CREATE     0    0  681110149373700
4                JOB_CLEARANCE     0    0  681110149378900
5            JOB_GENERATE_SCAN     0    0  681110149919100
6          JOB_GENERATE_STROBE     0    0  681110150233500
7   JOB_ACQUIRE_DISPATCH_BEGIN     0    0  681110150344300
8     JOB_ACQUIRE_DISPATCH_END     0    0  681110150993800
9                   JOB_CREATE     0    1  681110150997300
$ python3 -m vortex_tools.profiler log.prof --pretty --count 10
                          code  task  job        timestamp
0             PROFILER_VERSION   403    0                0
1                ENGINE_LAUNCH     0    0  681110124089600
2                   ENGINE_RUN     0    0  681110147807500
3                   JOB_CREATE     0    0  681110149373700
4                JOB_CLEARANCE     0    0  681110149378900
5            JOB_GENERATE_SCAN     0    0  681110149919100
6          JOB_GENERATE_STROBE     0    0  681110150233500
7   JOB_ACQUIRE_DISPATCH_BEGIN     0    0  681110150344300
8     JOB_ACQUIRE_DISPATCH_END     0    0  681110150993800
9                   JOB_CREATE     0    1  681110150997300

When the --timeline or --waterfall flag is passed, the profiler module generates a timeline derived from the profiler log. For the --waterfall flag, each row in the timeline corresponds to one block of samples moving through the engine’s pipeline. For the --timeline flag, each row in the timeline corresponds to all block of samples moving through a specific stage of the engine’s pipeline. The time that the engine and its components spend in various activities are represented by color-coded bars within each row. Phases of activity that have multiple tasks contain a smaller color-coded bar for each task.

When the --statistics flag is passed, the profiler module generates a table that lists the interval, rate, duration, and time usage for each task within the engine’s pipeline. In contrast to the above visual displays, this table is suitable for quantitative analysis of the engine’s performance.

Note

The number of preloaded blocks can artificially inflate the acquisition duration since blocks are queued far in advance. To compensate, the profiler module attempts to detect the number of preloaded blocks and scales the resulting values by it. The output table includes P<N>, where <N> is the detected number of preloaded blocks, when this compensation is applied. This detection may fail for short acquisitions or acquisitions that exit prematurely due to an error.

../../_images/profiler-timeline.svg

Example waterfall timeline derived from the profiler log.

Diagnostics

The profiler timeline can be used to identify configuration and electrical issues.

  • Incorrect Triggering: Lines drawn through completion events of acquisitions tasks do not remain parallel after the session start. This suggests that the hardware components are triggering at different rates.

  • Insufficient Preload: Random jitter in acquisition dispatch times is not much larger than the acquisition wait interval. The engine is likely to underflow the acquisition buffers when the computer is highly loaded.

  • Insufficient Block Size: The engine spends more time dispatching tasks than waiting for tasks to complete. This suggests that the overhead of setting up a block is too high so the block size should be increased.

  • Excessive Processing: The line drawn through completion events of recycling tasks has a smaller slope than the line drawn through acquisition tasks. This suggests that the processing pipeline is not returning blocks to the engine sufficiently fast to maintain the acquisition indefinitely.