Snapshots
A snapshot captures the state of a program at a particular time in its execution history.
Reverse execution is implemented by replaying a program’s execution from previously captured snapshots. Since replay is guaranteed to be deterministic (due to the event log), when a snapshot is played forwards it will eventually reach the next snapshot.
To minimize overhead, snapshots are created by forking the original process. This means they benefit from Linux’s copy-on-write memory semantics, so that the memory cost of a snapshot is proportional to the amount of memory that was subsequently modified by the program.
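The fork-based approach can be illustrated with a minimal sketch. This is not the Undo Engine's implementation; it only shows the underlying operating-system behaviour on a Unix-like system, and the function name `fork_snapshot_demo` and the `state` dictionary are hypothetical. The forked child plays the role of a snapshot: it keeps seeing the state as it was at fork time, even after the parent modifies its own copy.

```python
import os


def fork_snapshot_demo():
    """Return (value seen by the parent, value seen by the forked child)."""
    state = {"counter": 1}
    go_r, go_w = os.pipe()    # parent -> child: "report your value now"
    res_r, res_w = os.pipe()  # child -> parent: the value the child observed

    pid = os.fork()
    if pid == 0:
        # Child: stands in for a snapshot of the parent at fork time.
        os.close(go_w)
        os.close(res_r)
        os.read(go_r, 1)  # block until the parent has modified its copy
        os.write(res_w, str(state["counter"]).encode())
        os._exit(0)

    # Parent: keeps running and modifies its own copy of the state.
    os.close(go_r)
    os.close(res_w)
    state["counter"] = 2
    os.write(go_w, b"x")
    child_value = int(os.read(res_r, 16).decode())
    os.waitpid(pid, 0)
    os.close(go_w)
    os.close(res_r)
    return state["counter"], child_value


if __name__ == "__main__":
    parent_value, child_value = fork_snapshot_demo()
    print("parent sees", parent_value, "- forked child still sees", child_value)
```

Pages that neither process writes to remain shared between parent and child, which is why the memory cost of a fork-based snapshot grows only with the amount of memory modified afterwards.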
Viewing snapshot information
List the snapshots in UDB using the info snapshots command.
info snapshots
Table of snapshots, ordered by recording time.
The columns are as follows:
number: The snapshot number.
recording time: The time in execution history represented by the snapshot.
pid: The process ID of the snapshot.
memory: The memory used by the snapshot. This is the “proportional set size” (PSS), which takes the copy-on-write memory into account by allocating it proportionally among the processes that share it. (Note that tools such as ps and top typically show the “resident set size” (RSS) which is much larger for snapshot processes as it does not take into account the sharing of copy-on-write memory.)
created: The wall-clock time at which the snapshot was created.
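The difference between PSS and RSS in the memory column can be made concrete with a small calculation. The sketch below is an illustration only: the function names and the sample figures are hypothetical, and it assumes the usual Linux definition in which each shared region is divided equally among the processes sharing it.

```python
def proportional_set_size(private_bytes, shared_regions):
    """PSS: private memory plus each shared region divided by its sharer count.

    shared_regions is an iterable of (size_bytes, number_of_sharing_processes).
    """
    return private_bytes + sum(size / sharers for size, sharers in shared_regions)


def resident_set_size(private_bytes, shared_regions):
    """RSS: private memory plus the full size of every shared region."""
    return private_bytes + sum(size for size, _ in shared_regions)


MIB = 1024 * 1024

# Hypothetical snapshot: 1 MiB modified privately, 40 MiB still shared
# copy-on-write among 5 processes.
private = 1 * MIB
shared = [(40 * MIB, 5)]

pss_mib = proportional_set_size(private, shared) / MIB
rss_mib = resident_set_size(private, shared) / MIB
print(f"PSS = {pss_mib:.1f} MiB, RSS = {rss_mib:.1f} MiB")
```

With these made-up numbers the snapshot is charged 9 MiB under PSS but 41 MiB under RSS, which is why summing RSS across snapshot processes greatly overstates their real memory cost.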
One of the snapshots has an arrow pointing to it, and this is the snapshot that is currently being used for recording or replay. For example:
24% 24,596> info snapshots
   number  recording time             pid      memory   created
   0       1:0x00007ffff7fe4320       3747615  802.00K  14:38:19
   1       8,004:0x00007ffff7fd6b58   3747621  894.00K  14:38:20
=> 2       24,596:0x0000555555555374  3747619  1.54M    14:38:20
   3       65,537:0x00005555555552a5  3747617  1.15M    14:38:20
   4       99,000:0x00005555555552a5  3747608  1.60M    14:38:19
Nanny: pid=3747611; memory used=3.75M
Total memory used: 9.70M
Snapshot creation times: mean=2ms; max=3ms; previous=2ms
Configuring snapshots
It is possible to configure the maximum number of snapshots to keep. This is a trade-off between memory and performance:
Reducing the number of snapshots reduces memory usage, but makes reverse-execution commands and time-travel commands slower.
Increasing the number of snapshots increases memory usage, but makes reverse-execution and time-travel commands faster.
Note
This is because reverse-execution and time-travel commands are implemented by jumping to the nearest snapshot before the target time and replaying forward from it. Fewer snapshots mean that the distance between snapshots is greater, and hence more replay is required to reach a given time in execution history.
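The trade-off can be sketched as a lookup followed by a replay distance. This is a simplified model, not UDB's actual algorithm: it assumes time can be treated as a single integer, and the function name `plan_reverse_jump` is hypothetical.

```python
from bisect import bisect_right


def plan_reverse_jump(snapshot_times, target_time):
    """Return (starting snapshot time, amount of replay needed).

    snapshot_times must be sorted in ascending order. The starting point is
    the latest snapshot taken at or before target_time.
    """
    index = bisect_right(snapshot_times, target_time) - 1
    if index < 0:
        raise ValueError("no snapshot at or before the target time")
    start = snapshot_times[index]
    return start, target_time - start


# Many snapshots: only a short replay is needed to reach time 60,000.
dense = [1, 8_004, 24_596, 65_537, 99_000]
print(plan_reverse_jump(dense, 60_000))

# Few snapshots: the same target needs a much longer replay.
sparse = [1, 99_000]
print(plan_reverse_jump(sparse, 60_000))
```

In this model the dense set replays 35,404 time units from the snapshot at 24,596, while the sparse set has to replay 59,999 units from the very first snapshot.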
By default, the maximum number of snapshots is 35. This can be configured using the udb --max-snapshots command-line option, or the UNDO_snapshots environment variable. For example:
$ udb --max-snapshots 20 --args examples/hashtable
Note
The Undo Engine makes a best effort to retain no more than the configured number of snapshots, but this is subject to its own minimum requirements.
Snapshots adaptation
Snapshots are automatically pruned as the program runs, to avoid exhausting system resources. If the available memory in the system falls below a threshold, the Undo Engine prunes snapshots until available memory rises back above the threshold (subject to the minimum requirements). The default threshold is 10%. To change it, set the UNDO_memory_pressure_threshold environment variable to the threshold percentage, up to a maximum value of 75. Set it to 0 to disable this feature.
This means that the longer the program has been running, the greater the distance between snapshots. Thus the first reverse-execution command after a long run is likely to be slow. Subsequent reverse-execution commands should be faster.
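The effect of pruning on snapshot spacing can be sketched as follows. Everything here is an assumption made for illustration: the text above does not say which snapshots the Undo Engine chooses to prune, so the thinning rule below (drop the interior snapshot whose removal leaves the smallest merged gap) is hypothetical, as are the function names and the reading of the threshold as a percentage of total system memory.

```python
def under_memory_pressure(available_bytes, total_bytes, threshold_percent=10):
    """Return True if available memory is below the threshold.

    Assumption: the threshold is a percentage of total system memory.
    A threshold of 0 disables the check; values above 75 are rejected.
    """
    if not 0 <= threshold_percent <= 75:
        raise ValueError("threshold must be between 0 and 75")
    if threshold_percent == 0:
        return False
    return available_bytes * 100 < total_bytes * threshold_percent


def prune_one(snapshot_times):
    """Hypothetical thinning rule: remove one interior snapshot.

    Keeps the first and last snapshots and drops the one whose removal
    creates the smallest gap between its neighbours.
    """
    times = list(snapshot_times)
    if len(times) <= 2:
        return times
    victim = min(range(1, len(times) - 1),
                 key=lambda i: times[i + 1] - times[i - 1])
    return times[:victim] + times[victim + 1:]


GIB = 1024 ** 3
snapshots = [1, 8_004, 24_596, 65_537, 99_000]
if under_memory_pressure(available_bytes=1 * GIB, total_bytes=16 * GIB):
    snapshots = prune_one(snapshots)
print(snapshots)
```

Under these assumptions, each pruning step widens the gap around the removed snapshot, so a long run under memory pressure ends up with fewer, more widely spaced snapshots.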