Reverse execution is implemented by replaying a program’s execution from previously captured snapshots. Since replay is guaranteed to be deterministic (due to the event log), a snapshot that is played forwards will eventually reach the point in execution history at which the next snapshot was taken.
To minimise overhead, snapshots are created by forking the original process. This means they benefit from Linux’s copy-on-write memory semantics, so the memory cost of a snapshot is proportional to the amount of memory that the program modifies after the snapshot is taken.
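The fork-based approach can be illustrated with a minimal Python sketch. It is not how the Undo Engine is implemented; the function name `fork_snapshot_demo` and the pipe-based handshake are assumptions made for illustration. It shows only the isolation property that snapshots rely on: a forked child keeps the state as it was at the time of the fork, even after the parent modifies its own copy. It does not measure copy-on-write memory sharing, and it requires a Unix-like system.

```python
import os


def fork_snapshot_demo():
    """Fork a 'snapshot' child, mutate the parent, return both views of the state."""
    state = bytearray(b"before")
    go_r, go_w = os.pipe()    # parent -> child: "report your state now"
    out_r, out_w = os.pipe()  # child -> parent: the child's view of the state

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the snapshot (hypothetical stand-in).
        try:
            os.close(go_w)
            os.close(out_r)
            os.read(go_r, 1)               # wait until the parent has mutated
            os.write(out_w, bytes(state))  # still the state at fork time
        finally:
            os._exit(0)

    # Parent: keeps running and modifies its own copy of the memory.
    os.close(go_r)
    os.close(out_w)
    state[:] = b"after!"
    os.write(go_w, b"x")
    child_view = os.read(out_r, 64)
    os.waitpid(pid, 0)
    os.close(go_w)
    os.close(out_r)
    return child_view, bytes(state)


if __name__ == "__main__":
    snapshot_view, live_view = fork_snapshot_demo()
    print("snapshot sees:", snapshot_view)
    print("live process sees:", live_view)
```

Only the pages that the parent writes to after the fork need to be duplicated by the kernel, which is why the cost of a snapshot tracks the amount of memory modified afterwards.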
Viewing snapshot information
List the snapshots in UDB using the info snapshots command.
The command prints a table of snapshots, ordered by recording time.
The columns are as follows:
number: The snapshot number.
recording time: The point in the program’s execution history at which the snapshot was taken.
pid: The process ID of the snapshot.
memory: The memory used by the snapshot. This is the “proportional set size” (PSS), which takes the copy-on-write memory into account by allocating it proportionally among the processes that share it. (Note that tools such as ps and top typically show the “resident set size” (RSS) which is much larger for snapshot processes as it does not take into account the sharing of copy-on-write memory.)
created: The wall-clock time at which the snapshot was created.
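To compare RSS and PSS for a process yourself, the sketch below reads `/proc/<pid>/smaps_rollup`, which Linux provides from kernel 4.14 onwards. The function names are hypothetical, and this is not necessarily how UDB computes the memory column; it only illustrates where the two figures come from.

```python
def parse_smaps_rollup(text):
    """Parse the 'Name:   value kB' lines of smaps_rollup into a dict of kB values."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Data lines look like "Pss:   1234 kB"; the address-range header does not.
        if len(parts) == 3 and parts[0].endswith(":") and parts[2] == "kB":
            fields[parts[0][:-1]] = int(parts[1])
    return fields


def read_rss_and_pss_kb(pid="self"):
    """Return (RSS, PSS) in kB for the given pid (Linux only)."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        fields = parse_smaps_rollup(f.read())
    return fields["Rss"], fields["Pss"]


if __name__ == "__main__":
    rss_kb, pss_kb = read_rss_and_pss_kb()
    print(f"RSS: {rss_kb} kB, PSS: {pss_kb} kB")
```

For a snapshot process that shares most of its pages with other snapshots, the PSS figure would be expected to be much smaller than the RSS figure.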
One of the snapshots has an arrow pointing to it, and this is the snapshot that is currently being used for recording or replay. For example:
24% 24,481> info snapshots
     number  recording time              pid      memory  created
     0       1:0x00007ffff7fd5090        3203662   2.62M  14:10:59
     1       4,388:0x00007ffff7fe03b0    3203668   1.69M  14:10:59
=>   2       24,481:0x0000555555555367   3203666   3.32M  14:10:59
     3       65,537:0x0000555555555298   3203664   2.91M  14:10:59
     4       99,000:0x0000555555555298   3203656  51.59M  14:10:59
Nanny: pid=3203658; memory used=6.00M
Total memory used: 68.14M
Snapshot creation times: mean=0ms; max=0ms; previous=0ms
It is possible to configure the maximum number of snapshots to keep. This is a trade-off between memory and performance:
Increasing the number of snapshots increases memory usage, but makes reverse-execution and time-travel commands faster.
This is because reverse-execution and time-travel commands are implemented by jumping to the nearest snapshot that precedes the target time and replaying forward from it. Fewer snapshots mean that the distance between snapshots is greater, and hence more replay is required to reach a given time in execution history.
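The trade-off can be made concrete with a small sketch. It assumes that each snapshot is identified by the numeric part of its recording time and that replay cost grows with the distance from the chosen snapshot to the target; the function `plan_jump` is hypothetical and is not part of UDB.

```python
from bisect import bisect_right


def plan_jump(snapshot_times, target):
    """Return (snapshot index, distance to replay) for a jump to `target`.

    `snapshot_times` must be sorted in ascending order.
    """
    # Index of the latest snapshot taken at or before the target time.
    index = bisect_right(snapshot_times, target) - 1
    if index < 0:
        raise ValueError("target precedes the first snapshot")
    return index, target - snapshot_times[index]


if __name__ == "__main__":
    # Recording times taken from the example table above.
    many = [1, 4388, 24481, 65537, 99000]
    few = [1, 99000]
    print(plan_jump(many, 50000))  # (2, 25519)
    print(plan_jump(few, 50000))   # (0, 49999)
```

With five snapshots, the jump to time 50,000 starts from snapshot 2 and replays about half as far as it would if only the first and last snapshots had been kept.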
By default, the maximum number of snapshots is 35. This can be configured using the udb --max-snapshots command-line option or the UNDO_snapshots environment variable. For example:
$ udb --max-snapshots 20 --args examples/hashtable
The Undo Engine makes a best effort to retain no more than the configured number of snapshots, but this is subject to its own minimum requirements.
Snapshots are automatically pruned as the program runs, to avoid exhausting
system resources. If the available memory in the system falls below a
threshold, the Undo Engine prunes snapshots until the threshold is restored
(subject to the minimum requirements). The default threshold is 10%. This can
be changed by setting the
variable to the threshold percentage, up to a maximum value of 75. Set this to
0 to disable this feature.
This means that the longer the program has been running, the greater the distance between snapshots. Thus the first reverse-execution command after a long run is likely to be slow. Subsequent reverse-execution commands should be faster.
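The low-memory check described above can be sketched as follows. This is an illustration under stated assumptions, not the Undo Engine's actual logic: it assumes that available memory is measured as `MemAvailable` relative to `MemTotal` in `/proc/meminfo`, and the function names are hypothetical. The default of 10, the maximum of 75, and the meaning of 0 are taken from the text.

```python
def available_percent(meminfo_text):
    """Return available memory as a percentage of total, from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name] = int(parts[0])  # values are in kB or are plain counts
    return 100.0 * fields["MemAvailable"] / fields["MemTotal"]


def should_prune(available_pct, threshold_pct=10):
    """Decide whether snapshots should be pruned at the given threshold."""
    if not 0 <= threshold_pct <= 75:
        raise ValueError("threshold must be between 0 and 75")
    if threshold_pct == 0:
        return False  # a threshold of 0 disables low-memory pruning
    return available_pct < threshold_pct


if __name__ == "__main__":
    with open("/proc/meminfo") as f:  # Linux only
        pct = available_percent(f.read())
    print(f"available: {pct:.1f}%, prune: {should_prune(pct)}")
```

A real implementation would repeat this check while pruning, stopping once available memory is back above the threshold or the minimum number of snapshots is reached.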