Event log

The event log stores information about each non-deterministic event that affects the execution of the program. These events include:

  1. System calls.

  2. Reads from shared memory.

  3. Asynchronous signal delivery.

  4. Thread switches and thread interactions.

  5. Non-deterministic machine instructions.

Event log rotation

In the default (circular) mode, the Undo Engine discards events from the beginning of the event log in order to make space for new events. This means that the program can continue to run without allocating extra space for the event log.

Note

When an event is discarded from the event log, the execution history prior to the event can no longer be replayed.
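The effect of rotation can be pictured with a small model. The following Python sketch is illustrative only and does not describe how the Undo Engine is implemented: the CircularEventLog class, its byte budget, and the (bbcount, name, size) tuples are assumptions made for this example. It shows that once the oldest events are discarded to make room, the earliest point that can be replayed moves forward.

```python
from collections import deque


class CircularEventLog:
    """Toy model of a size-limited event log that discards its oldest events."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.events = deque()  # (bbcount, name, size) tuples, oldest first

    def append(self, bbcount, name, size):
        # Discard events from the beginning until the new event fits.
        while self.events and self.used_bytes + size > self.max_bytes:
            _, _, old_size = self.events.popleft()
            self.used_bytes -= old_size
        self.events.append((bbcount, name, size))
        self.used_bytes += size

    def earliest_replayable_bbcount(self):
        # In this model, history before the oldest retained event is lost.
        return self.events[0][0] if self.events else None


log = CircularEventLog(max_bytes=200)
log.append(1726, "openat", 80)
log.append(1807, "openat", 80)
log.append(2043, "openat", 80)  # does not fit, so the oldest event is dropped
print(log.earliest_replayable_bbcount())
```

In this model the program never has to stop for lack of space, at the cost of losing the oldest history.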

Note

When UDB is loading a recording, the event log mode setting is ignored and the complete event log is loaded.

Configuring event log size

Memory is allocated dynamically for the event log as required, and by default is limited to 1GB on x64 and 256MB on other platforms. To query the event log size, use the info event-log-size command. The maximum size can be configured when starting UDB, either using the --max-event-log-size command-line option, for example:

$ udb --max-event-log-size 2G

or using the UNDO_event_log_max environment variable:

$ UNDO_event_log_max=2G udb

In either case specify the maximum size as a number followed by an optional multiplier (K for kilobytes, M for megabytes, or G for gigabytes), or as 0 to choose a suitable size.
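As a rough illustration of the size syntax, the following Python sketch converts a size string to bytes. It is not UDB's parser: the parse_size function is hypothetical, and the binary multipliers (1K = 1024 bytes) are inferred from the sample output in this section, where 512M is reported as 536870912 bytes. Whether UDB accepts lower-case suffixes is not stated, so the sketch accepts upper case only. A value of 0 is returned unchanged, leaving the choice of a suitable size to the caller.

```python
import re

# Binary multipliers, assumed from the "512M = 536870912 bytes" sample output.
_MULTIPLIERS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a size such as '512M' or '2G' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS[suffix]


print(parse_size("512M"))
print(parse_size("2G"))
```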

Note

It’s recommended to set the event log size to a value that is substantially smaller than the available system memory, otherwise UDB risks being killed by the Linux out-of-memory (OOM) killer.

Note

When UDB is loading a recording, the maximum event log size setting is ignored and the complete event log is loaded.

info event-log-size

Report the current size of the event log.

For example:

recording 8,304> info event-log-size
The event log size is 67108864 bytes (64.00M).

set max-event-log-size size[K|M|G]

Set the maximum size to which the event log may grow.

For example:

recording 8,304> set max-event-log-size 1
Maximum event log size rounded up to 67108864 bytes (64.00M).
recording 8,304> set max-event-log-size 512M
Maximum event log size set to 536870912 bytes (512.00M).
recording 8,304> set max-event-log-size 1G
Maximum event log size set to 1073741824 bytes (1.00G).

show max-event-log-size

The maximum size of the event log.

For example:

recording 8,304> show max-event-log-size
Maximum event log size is 1073741824 bytes (1.00G).

Using a straight event log

If you prefer the event log not be rotated, you can switch to the “straight” event log mode. In this mode, when the event log is full, UDB stops the program and emits this message:

ERROR: The Undo Engine's event log is full, so no more history can be recorded.
You may still use UDB commands to go backwards, or alternatively:  Use "set
max-event-log-size <size>[K|M|G]" to increase the event log size, or use "set
event-log-mode circular" to use a circular event log.  The current event log
size is 67108864 bytes (64.00M).

At this point you can increase the maximum event log size using the set max-event-log-size command, or switch to a circular event log using the set event-log-mode command.

Configure the event log mode when starting UDB, either using the --event-log-mode command-line option, for example:

$ udb --event-log-mode straight

or using the UNDO_event_log_mode environment variable:

$ UNDO_event_log_mode=straight udb

set event-log-mode circular|straight

Set the event log mode.

For example:

recording 8,304> set event-log-mode straight

The default (circular) event log mode can be restored using:

recording 8,304> set event-log-mode circular

show event-log-mode

The event log mode (circular or straight).

For example:

recording 8,304> show event-log-mode
The event log mode is circular.

Event statistics

When investigating a performance issue, or trying to understand where a program is spending its time, use the info event-stats command to output a summary table of statistics for each type of event. For example:

recording 221,166> info event-stats
  Count    %Count    Total size    %Size  Event type
                        (bytes)
-------  --------  ------------  -------  -----------------
     67     46.21         4,288    27.35  CPUID
     22     15.17         1,760    11.23  write
     12      8.28         1,694    10.80  mmap
      8      5.52           512     3.27  RDTSC
      4      2.76           960     6.12  fstat
      4      2.76           512     3.27  mprotect
      3      2.07           384     2.45  brk
      3      2.07           240     1.53  openat

Events with names in lower case are system calls executed by the program as it was recorded, and events with names in upper case are other kinds of non-deterministic behavior. See Common event types below.

info event-stats [start[,stop]]

Table of event types. The optional arguments start and stop specify the range of events to summarize, and must either be bbcounts, or in the form base[[+|-]offset], where base is b (beginning), c (current) or e (end) and offset is the number of events to count from base. If stop is omitted it defaults to start + 10. If both start and stop are omitted, the range is the whole event log.
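The base[[+|-]offset] form can be read as "start from the beginning, current, or end event, then move by offset events". The Python sketch below models that reading. It is an assumption-laden illustration rather than UDB's parser: the resolve_event_position function and the integer event indexes are hypothetical, an offset without a sign is treated as positive, and the bbcount form is not handled.

```python
import re


def resolve_event_position(spec, beginning, current, end):
    """Resolve a 'base[[+|-]offset]' spec to an event index.

    beginning, current and end are hypothetical indexes of the first,
    current and last events in the log.
    """
    match = re.fullmatch(r"([bce])(?:([+-]?)(\d+))?", spec)
    if match is None:
        raise ValueError(f"not in base[[+|-]offset] form: {spec!r}")
    base, sign, offset = match.groups()
    position = {"b": beginning, "c": current, "e": end}[base]
    if offset is not None:
        delta = int(offset)
        position += -delta if sign == "-" else delta
    return position


start = resolve_event_position("c-5", beginning=0, current=500, end=999)
stop = start + 10  # the documented default when stop is omitted
print(start, stop)
```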

The output is a table with one row for each type of event, giving the count of events of that type, their proportion of events as a percentage, their total size in bytes, their proportion of the event log size as a percentage, and the name of the event type. Events with names in lower case are system calls, and events with names in upper case are other kinds of non-deterministic behavior. The table is sorted in descending order by the count.
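To make the meaning of the columns concrete, the following Python sketch computes the same kind of summary from a list of (name, size) pairs. It is only a model of the arithmetic: the event_stats function and the sample data are made up for illustration, and UDB does not expose the event log in this form.

```python
from collections import Counter


def event_stats(events):
    """Summarize (name, size) pairs into rows sorted by descending count."""
    counts = Counter()
    sizes = Counter()
    for name, size in events:
        counts[name] += 1
        sizes[name] += size
    total_count = sum(counts.values())
    total_size = sum(sizes.values())
    rows = []
    for name, count in counts.most_common():
        rows.append((
            count,                             # Count
            100.0 * count / total_count,       # %Count
            sizes[name],                       # Total size (bytes)
            100.0 * sizes[name] / total_size,  # %Size
            name,                              # Event type
        ))
    return rows


# Hypothetical sample data, not taken from a real recording.
sample = [("CPUID", 64), ("CPUID", 64), ("write", 80), ("mmap", 145)]
for count, pct_count, size, pct_size, name in event_stats(sample):
    print(f"{count:7d}  {pct_count:8.2f}  {size:12,d}  {pct_size:7.2f}  {name}")
```

Note that the two percentage columns can rank event types differently: a type with few but large events may account for a small share of the count and a large share of the log size.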

Common event types

Events with names in lower case are system calls executed by the program as it was recorded, and events with names in upper case are other kinds of non-deterministic behavior. Some common events in the latter category are described below.

CPUID

The CPUID (CPU identification) instruction on x32 and x64. This queries the features of the processor, and is used by library code to select implementations at runtime according to the available features. For example, the standard C library might select an implementation of memcpy() that uses instructions in the AVX512 extension, if those instructions are available.

In the example above, the program is short, so the event log is dominated by CPUID instructions issued by the standard C library when it was loaded.

NDETERM

This is a catch-all for any instruction that updates registers non-deterministically. Some common instructions have their own events, for example, CPUID and RDTSC, but rare instructions use the generic NDETERM event. Examples include LSL (load segment limit), IN (input from port), and OUT (output to port).

RDTSC

The RDTSC (read time-stamp counter) instruction on x32 and x64. This queries the processor’s time-stamp counter, which increments every clock cycle. This is used to implement clock_gettime() with the CLOCK_PROCESS_CPUTIME_ID argument, or more generally for high-frequency time measurement.

SHMEM_FIXUP

A non-deterministic change to shared memory. The program read the contents of a region of shared memory and it was found that the contents had changed since the last time the memory was in a known state. This means that another process changed the memory (if the region is mapped anonymously), or that the mapped file changed in the file system (if the region is mapped from a file).

SIG_TOCHILD

A signal was received by the program being recorded.

STATUS_TODEBUGGER

A signal was received by the debugger.

THREADSWITCH

The program switched threads. (The Undo Engine serializes the execution of threads, so that one thread executes at a time and the Engine switches between them.)

Event navigation

Query the event log using the info events command, and jump to the time of the next or previous event using the ugo event command.

Event condition

All the event navigation commands optionally take a condition, which is a Python expression. It is evaluated for each event considered by the command. If the condition tests true, the event is included by the command; if it tests false, the event is excluded. The expression can use the following variables:

  • address: The base address of a shared memory update, for SHMEM_FIXUP events, or 0 for other events.

  • bbcount: The bbcount.

  • name: The event type or system call name.

  • pc: The program counter.

  • result: The value returned by the system call, or 0 for other events.

  • signum: The signal number, for SIGNAL, SIG_TOCHILD and STATUS_TODEBUGGER events, or 0 for other events.

  • size: The size of the event in bytes.

  • syscall: The system call number, or None for other events.

  • tid: The thread id of the thread switched to, for NEWTHREAD and THREADSWITCH events, or 0 for other events.

  • timestamp: The time stamp counter read by the RDTSC instruction, for RDTSC events, or 0 for other events. Note that this is a monotonically increasing count of clock cycles since the processor was reset, not a wall-clock time or a time in execution history.

For example, to list the openat() system calls among the first 100 events:

recording 221,166> info events -l 100 name == 'openat'
time=1,726:0xffffffffffffffff: openat. result=0x3 size=80.
time=1,807:0xffffffffffffffff: openat. result=0x3 size=80.
time=2,043:0xffffffffffffffff: openat. result=0x3 size=80.
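Because the condition is an ordinary Python expression, variables can be combined with the usual operators. The sketch below models the filtering semantics described above, outside UDB: the matching_events function and the dictionaries standing in for events are hypothetical, and only a few of the documented variables are shown. It is not how UDB evaluates conditions internally.

```python
def matching_events(events, condition):
    """Yield the events for which the condition expression tests true."""
    code = compile(condition, "<condition>", "eval")
    for event in events:
        # Each event's fields are exposed to the expression as variables.
        if eval(code, {"__builtins__": {}}, dict(event)):
            yield event


# Hypothetical stand-ins for events, using the documented variable names.
events = [
    {"bbcount": 1726, "name": "openat", "result": 3, "size": 80},
    {"bbcount": 1809, "name": "read", "result": 0x340, "size": 928},
    {"bbcount": 1816, "name": "fstat", "result": 0, "size": 240},
]

condition = "name in ('read', 'fstat') and size > 100"
for event in matching_events(events, condition):
    print(event["bbcount"], event["name"])
```

Since every variable has a value for every event (0 or None where it does not apply), a condition such as result != 0 can be evaluated against any event without first checking the event type.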

info events [options] [condition]

Show events.

-limit N, -l N

Consider at most N events.

-min BBCOUNT|BOOKMARK, -after BBCOUNT|BOOKMARK, -a BBCOUNT|BOOKMARK

Only show events at BBCOUNT/BOOKMARK or later. A BBCOUNT may contain commas.

The string now can be used to specify the current bbcount.

If BBCOUNT starts with + or -, it is relative to the current bbcount.

-max BBCOUNT|BOOKMARK, -before BBCOUNT|BOOKMARK, -b BBCOUNT|BOOKMARK

Only show events before BBCOUNT/BOOKMARK. See the -min option.

-quiet, -q

Don’t print progress information during the search for events.

See Event condition for the condition argument.

For example, to list all the read() system call events:

recording 221,166> info events name == 'read'
time=1,809:0xffffffffffffffff: read. result=0x340 size=928.
time=2,045:0xffffffffffffffff: read. result=0x340 size=928.

To show the event at the current bbcount, if any:

0% 1,809> info events -a now -b +1
The "info events" command ignores the PC, therefore the time used may slightly
vary from the current time in execution.
time=1,809:0xffffffffffffffff: read. result=0x340 size=928.

To show five events at or after the current bbcount:

0% 1,809> info events -a now -l 5
time=1,809:0xffffffffffffffff: read. result=0x340 size=928.
time=1,816:0xffffffffffffffff: fstat. result=0x0 size=240.
time=1,861:0xffffffffffffffff: mmap. result=0x7ffff7ed1000 size=145.
time=1,864:0xffffffffffffffff: mmap. result=0x7ffff7ee1000 size=145.
time=1,867:0xffffffffffffffff: mmap. result=0x7ffff7f60000 size=145.
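The BBCOUNT forms accepted by the -min and -max options (digits with optional commas, the string now, and a leading + or - for a relative value) can be modeled as follows. This Python sketch is an illustration under those stated rules, not UDB's own parser: the parse_bbcount function is hypothetical, and bookmarks are not handled.

```python
def parse_bbcount(text, current):
    """Resolve a BBCOUNT argument against the current bbcount."""
    text = text.strip()
    if text == "now":
        return current
    value = int(text.replace(",", ""))
    # A leading + or - makes the value relative to the current bbcount.
    if text[0] in "+-":
        return current + value
    return value


print(parse_bbcount("1,809", current=5000))
print(parse_bbcount("now", current=1809))
print(parse_bbcount("+1", current=1809))
```

Under this model, -a now -b +1 selects the range from the current bbcount up to, but not including, the next one, which is why it shows only the event at the current bbcount.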

ugo event next|prev [condition]

Jump to the next or previous event.

With a condition, jump to the next or previous event matching the condition. See Event condition for details.

For example, to jump to the next write() system call event:

0% 1,809> ugo event next name == 'write'

To jump to the previous read() system call event:

3% 8,684> ugo event prev name == 'read'

If the ugo event command succeeds, the program is stopped at the instruction just before the event, so that the stepit 0 command will replay the event.

Note

In a multi-threaded program, the instruction just before the event might be in another thread. This case can be surprising because the backtrace does not show the expected function call, but the stepit 0 command can be used to step over the thread switches and replay the event.