Snapshots
A snapshot captures the state of a program at a particular time in its execution history.
Reverse execution is implemented by replaying a program’s execution from previously captured snapshots. Since replay is guaranteed to be deterministic (due to the event log), when a snapshot is played forwards it will eventually reach the next snapshot.
To minimize overhead, snapshots are created by forking the original process. This means they benefit from Linux’s copy-on-write memory semantics, so that the memory cost of a snapshot is proportional to the amount of memory that the program subsequently modifies.
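The effect of forking can be illustrated with a small sketch. This is not the Undo Engine's implementation; the function name `fork_snapshot_demo` and the `counter` field are hypothetical and exist only for this example. The child process keeps the state as it was at the moment of the fork, while the parent carries on and modifies its own copy.

```python
import os


def fork_snapshot_demo(state):
    """Fork a child as a stand-in 'snapshot', mutate the parent's state,
    and return (value seen by the child, value seen by the parent)."""
    go_read, go_write = os.pipe()          # parent -> child: "report now"
    report_read, report_write = os.pipe()  # child -> parent: child's view

    pid = os.fork()
    if pid == 0:
        # Child: its memory reflects the state at the time of the fork.
        status = 1
        try:
            os.close(go_write)
            os.close(report_read)
            os.read(go_read, 1)  # wait until the parent has mutated its copy
            os.write(report_write, str(state["counter"]).encode())
            status = 0
        finally:
            os._exit(status)  # never return into the caller's code

    # Parent: keeps running and modifies its own copy of the state.
    os.close(go_read)
    os.close(report_write)
    state["counter"] += 1
    os.write(go_write, b"x")
    child_view = int(os.read(report_read, 64).decode())
    os.waitpid(pid, 0)
    os.close(go_write)
    os.close(report_read)
    return child_view, state["counter"]
```

Calling `fork_snapshot_demo({"counter": 41})` returns `(41, 42)`: the child still sees the value from before the parent's change. Only the pages that differ between the two processes need separate physical memory, which is why a forked snapshot is cheap when little has changed.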
Viewing snapshot information
List the snapshots in UDB using the info snapshots command.
info snapshots
Table of snapshots, ordered by recording time.
The columns are as follows:
number: The snapshot number.
recording time: The time in execution history represented by the snapshot.
pid: The process ID of the snapshot.
memory: The memory used by the snapshot. This is the “proportional set size” (PSS), which takes the copy-on-write memory into account by allocating it proportionally among the processes that share it. (Note that tools such as ps and top typically show the “resident set size” (RSS) which is much larger for snapshot processes as it does not take into account the sharing of copy-on-write memory.)
created: The wall-clock time at which the snapshot was created.
An arrow (=>) marks the snapshot that is currently being used for recording or replay. For example:
24% 24,600> info snapshots
   number  recording time             pid      memory   created
   0            1:0x00007ffff7fe59c0  1895587  704.00K  17:23:03
   1        8,004:0x00007ffff7fd8e53  1895593  793.00K  17:23:03
=> 2       24,600:0x0000555555555374  1895591    1.37M  17:23:03
   3       65,537:0x00005555555552a5  1895589    1.00M  17:23:03
   4       99,000:0x00005555555552a5  1895579    1.40M  17:23:03
Nanny: pid=1895583; memory used=4.38M
Total memory used: 9.61M
Snapshot creation times: mean=0ms; max=1ms; previous=0ms
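To compare the PSS figure in the memory column with the RSS reported by ps or top, both values can be read from the Linux kernel's `/proc/<pid>/smaps_rollup` file (available since Linux 4.14). The sketch below is an illustration only: the documentation does not state how UDB obtains its figure, and the helper names `parse_smaps_rollup` and `read_pss_and_rss_kb` are made up for this example.

```python
def parse_smaps_rollup(text):
    """Return the 'Name: <number> kB' fields of a smaps_rollup file as a dict of kB values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   1234 kB"; this skips the header line.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_pss_and_rss_kb(pid):
    """Read (PSS, RSS) in kB for a process. Linux only; needs permission to read /proc/<pid>."""
    with open(f"/proc/{pid}/smaps_rollup") as rollup:
        fields = parse_smaps_rollup(rollup.read())
    return fields["Pss"], fields["Rss"]
```

For a snapshot process, the RSS value is expected to be much larger than the PSS value, because RSS counts every shared copy-on-write page in full for each process that maps it.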
Configuring snapshots
It is possible to configure the maximum number of snapshots to keep. This is a trade-off between memory and performance:
Reducing the number of snapshots reduces memory usage, but makes reverse-execution and time-travel commands slower.
Increasing the number of snapshots increases memory usage, but makes reverse-execution and time-travel commands faster.
Note
This is because reverse-execution and time-travel commands are implemented by jumping to the most recent snapshot before the target time and replaying forward from it. Fewer snapshots mean that the distance between snapshots is greater, so more replay is required to reach a given time in execution history.
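The trade-off can be made concrete with a small sketch. It treats each snapshot as a single number (the leading count from the recording time column in the example above) and picks the latest snapshot at or before the target time. The function name `replay_distance` is hypothetical, and this is a simplified model rather than the Undo Engine's actual selection logic.

```python
from bisect import bisect_right


def replay_distance(snapshot_times, target):
    """Return (snapshot time to start from, amount of replay needed to reach target).

    snapshot_times must be sorted in ascending order.
    """
    index = bisect_right(snapshot_times, target) - 1
    if index < 0:
        raise ValueError("no snapshot at or before the target time")
    start = snapshot_times[index]
    return start, target - start


# With five snapshots, reaching time 60,000 means replaying from 24,600.
many = replay_distance([1, 8004, 24600, 65537, 99000], 60000)

# With only two snapshots, the same target means replaying from time 1.
few = replay_distance([1, 99000], 60000)
```

Here `many` is `(24600, 35400)` and `few` is `(1, 59999)`: the same target needs considerably more replay when fewer snapshots are kept.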
By default, the maximum number of snapshots is 35. This can be configured using the udb --max-snapshots command-line option, or the UNDO_snapshots environment variable. For example:
$ udb --max-snapshots 20 --args examples/hashtable
Note
The Undo Engine makes a best effort to retain no more than the configured number of snapshots, but this is subject to its own minimum requirements.
Snapshots adaptation
Snapshots are automatically pruned as the program runs, to avoid exhausting system resources. If the available memory in the system falls below a threshold, the Undo Engine prunes snapshots until the threshold is restored (subject to the minimum requirements). The default threshold is 10%. This can be changed by setting the UNDO_memory_pressure_threshold environment variable to the threshold percentage, up to a maximum value of 75. Set this to 0 to disable this feature.
This means that the longer the program has been running, the greater the distance between snapshots. Thus the first reverse-execution command after a long run is likely to be slow. Subsequent reverse-execution commands should be faster.
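The threshold check described above can be sketched as follows. The documentation does not say how the Undo Engine measures available memory, so this example assumes the `MemAvailable` and `MemTotal` fields of `/proc/meminfo`. The function names `available_percent` and `should_prune` are hypothetical.

```python
def available_percent(meminfo_text):
    """Return available memory as a percentage of total, from /proc/meminfo text.

    Assumption: 'available memory' means MemAvailable / MemTotal.
    """
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return 100.0 * values["MemAvailable"] / values["MemTotal"]


def should_prune(percent_available, threshold=10):
    """Decide whether snapshots would be pruned under the documented rules.

    threshold is a percentage from 0 to 75; 0 disables pruning.
    """
    if not 0 <= threshold <= 75:
        raise ValueError("threshold must be between 0 and 75")
    if threshold == 0:
        return False
    return percent_available < threshold
```

On a Linux system, `should_prune(available_percent(open("/proc/meminfo").read()))` reports whether the default 10% threshold would currently trigger pruning under these assumptions.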