Querying recordings

After using LiveRecorder for some time, you may find that you’ve built up a library of Undo recordings which you need to query in order to find a recording of interest, to understand the context in which a recording was made, or to statistically analyze aspects of the behavior of the recorded program.

Questions like these can often be answered using undo recording-json, which outputs a description of the contents of an Undo recording in JSON (JavaScript Object Notation). The output is not intended for human inspection, but should be piped into a suitable program for further processing, for example, jq, which is convenient for simple queries on the command line, or python, using the json module.

Note

jq may be installed using your distribution’s package manager.

On Fedora and Red Hat, using Extra Packages for Enterprise Linux (EPEL):

$ sudo yum install epel-release
$ sudo yum install jq

On Ubuntu:

$ sudo apt install jq

Usage synopsis

undo recording-json [OPTIONS ...] RECORDING

Options

--help, -h

Show help summary.

--sections SECTIONS, -s SECTIONS

Output only the specified sections of the recording, where SECTIONS is a string containing one or more of these letters:

Letter  Section
b       blocks
c       cpuinfo and vDSO entry points
d       debuggee and replayable information
e       events and synthetic events
f       files
h       header and sysinfo
l       layout summaries
m       maps
n       notes

--version, -v

Print version information and exit.
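
When the command is driven from a script rather than a shell pipeline, the options above can be assembled programmatically. The following is a minimal sketch, assuming that the undo executable is on the PATH. The helper names build_command and load_recording_json are illustrative and are not part of the Undo tooling.

```python
import json
import subprocess

# Section letters accepted by --sections, as listed in the Options section.
SECTION_LETTERS = frozenset("bcdefhlmn")


def build_command(recording, sections=None):
    """Return the argument list for `undo recording-json`.

    `sections` is a string of section letters, or None to request everything.
    """
    command = ["undo", "recording-json"]
    if sections is not None:
        unknown = sorted(set(sections) - SECTION_LETTERS)
        if not sections or unknown:
            raise ValueError(f"invalid section letters: {unknown or sections!r}")
        command += ["--sections", sections]
    command.append(recording)
    return command


def load_recording_json(recording, sections=None):
    """Run the command and parse its standard output as JSON.

    Assumption: the `undo` executable is installed and on PATH.
    """
    result = subprocess.run(
        build_command(recording, sections),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For example, load_recording_json("recording.undo", "h") would request only the header and sysinfo sections, which keeps the output small when only the context of a recording is of interest.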

Output format

The output is a single JSON object with the following top-level keys.

"blocks"

List of blocks making up the recording file.

"config"

Product configuration parameters captured when the recording was saved. These are not needed for replaying the recording, but may be useful for understanding the context of the recording, or diagnosing problems in UDB itself.

"cpuinfo"

Object describing features of the CPU on which the recording was made that need to be queried when replaying the recording. (Contrast the "sysinfo" section, which contains general information about the recording machine.)

"debuggee"

General description of the recorded program.

"files"

List of files embedded in the recording that contain symbols and other debugging information needed when replaying the recording.

"header"

General information about the recording, including its format version, the Undo release that created it, the date and time that it was recorded, its compression scheme, and so on.

"notes"

List of notes added to the recording. Notes are key-value pairs that contain arbitrary data not required to replay the recording.

"replay_features"

Modifications to features when the program was recorded.

"replayables"

List of objects describing the “replayables”, data structures describing the recorded program at a particular point in execution history. There is always at least one replayable, describing the recorded program at the start of execution history. Each object in the list has the keys "events", "layouts", "maps", "replayable", and "synthetic", described next:

"events"

List of events describing non-deterministic behavior of the recorded program.

"layouts"

Memory layout summaries used to ensure that the Undo Engine’s internal data structures do not collide with addresses used by the recorded program.

"maps"

Memory maps in the recorded program.

"replayable"

Information about the replayable.

"synthetic"

List of synthetic events (that is, not corresponding to events that happened when the program was recorded) that ensure that when the recording is replayed, threads and memory are the same as when the program was recorded.

"sysinfo"

Files, environment variables, logs, and other data captured when the recording was saved. These are not needed for replaying the recording, but may be useful for understanding the context of the recording, or diagnosing problems in UDB itself. (Contrast the "cpuinfo" and "files" sections, which contain information that is needed when replaying the recording.)

"vdso"

List of vDSO entry points in the recorded program.
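
Before writing a query, it can help to see which of the top-level keys are present and how large each one is. The following is a minimal sketch that works on the already parsed JSON object. The function name summarize is illustrative, and the sample object is made up for demonstration rather than taken from a real recording.

```python
import json


def summarize(document):
    """Map each top-level key to a short description of its value."""
    summary = {}
    for key, value in document.items():
        if isinstance(value, list):
            summary[key] = f"list ({len(value)} items)"
        elif isinstance(value, dict):
            summary[key] = f"object ({len(value)} keys)"
        else:
            summary[key] = type(value).__name__
    return summary


# Made-up stand-in for the output of `undo recording-json`.
sample = json.loads('{"header": {"undo_version": "7.2.0"}, "notes": [], "replayables": [{}]}')
for key, description in sorted(summarize(sample).items()):
    print(f"{key}: {description}")
```

In real use the object would come from json.load(sys.stdin), with the output of undo recording-json piped in.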

Example queries

Using jq

Which Undo version made a recording?

$ undo recording-json -s h recording.undo | jq -r .header.undo_version
7.2.0

On which host was a recording made?

$ undo recording-json -s h recording.undo | jq -r .header.uname_nodename
test-x64-ykjfxtcr.example.com

Which instruction set extensions were used by the recorded program?

$ undo recording-json -s h recording.undo | jq -r .header.isaexts
avx avx2 bmi1 lm sse sse2 xsave xsavec

Pretty-print the vDSO memory map.

$ undo recording-json recording.undo | jq '.replayables[0].maps[] | select(.path=="[vdso]")'
{
  "begin": 140731969982464,
  "end": 140731969986560,
  "sparse": false,
  "path": "[vdso]",
  "read": true,
  "write": false,
  "execute": true,
  "shared": false,
  "grows_down": false,
  "inode": 0,
  "dev_major": 0,
  "dev_minor": 0,
  "is_shmat": false,
  "offset": 0,
  "pkey": 0
}
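
The same lookup can be done in Python, which also makes it easy to show addresses in hexadecimal. This is a minimal sketch that relies only on the keys visible in the map object above. The function names are illustrative, and it assumes that "end" is an exclusive bound, so that the size of the mapping is end minus begin.

```python
def find_maps(document, path):
    """Return all memory maps with the given path, across all replayables."""
    return [
        m
        for replayable in document.get("replayables", [])
        for m in replayable.get("maps", [])
        if m.get("path") == path
    ]


def describe_map(m):
    """Format a map as a hexadecimal address range, size and permissions."""
    permissions = "".join(
        flag if m.get(key) else "-"
        for key, flag in (("read", "r"), ("write", "w"), ("execute", "x"))
    )
    size = m["end"] - m["begin"]  # assumption: "end" is exclusive
    return f"{m['begin']:#x}-{m['end']:#x} {size} bytes {permissions}"
```

Applied to the vDSO map shown above, describe_map reports a size of 4096 bytes with permissions r-x.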

Print all events for all the replayables, one per line using the -c (compact) output format.

$ undo recording-json -s e recording.undo | jq -c '.replayables[].events[]' | tail -5
{"type":"pread64","size":80,"bbcount":770084,"syscall":17,"result":0}
{"type":"pread64","size":80,"bbcount":770091,"syscall":17,"result":0}
{"type":"close","size":72,"bbcount":771087,"syscall":3,"result":0}
{"type":"munmap","size":128,"bbcount":771114,"syscall":11,"result":0}
{"type":"write","size":80,"bbcount":771303,"syscall":1,"result":91}
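
Because the compact format puts one JSON object on each line, it can be processed as a stream, one line at a time. The following is a minimal sketch of such a filter. The function name events_of_type is illustrative, and the only keys it uses are the ones visible in the output above.

```python
import json


def events_of_type(lines, event_type):
    """Yield events of one type from compact, one-object-per-line JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the input
        event = json.loads(line)
        if event.get("type") == event_type:
            yield event
```

Passing sys.stdin as lines lets the function consume the output of the jq command directly. With the five lines shown above as input, events_of_type(lines, "pread64") yields the two pread64 events.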

Using python

What was the current directory when a recording was made?

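import json, sys

# "Child Environment" is read as a single string of NAME=VALUE lines.
for line in json.load(sys.stdin)["sysinfo"]["Child Environment"].splitlines():
    if line.startswith("PWD="):
        # str.removeprefix requires Python 3.9 or later.
        print(line.removeprefix("PWD="))

With the script saved as get_original_pwd.py:

$ undo recording-json -s h recording.undo | python get_original_pwd.py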
/testfarm/jobs/ykjfxtcr/workspace

What are the most common types of event in a recording?

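import collections, json, sys

replayables = json.load(sys.stdin)["replayables"]
# Count event types across every replayable in the recording.
counter = collections.Counter(e["type"] for r in replayables for e in r["events"])
# Print the four most frequent types with their counts.
print(counter.most_common(4))

With the script saved as most_common_events.py:

$ undo recording-json -s e recording.undo | python most_common_events.py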
[('XSAVE', 1236), ('write', 151), ('open', 57), ('mmap', 52)]

Cautions

  • The exact format of the output is subject to change without notice in new UDB releases: we do not promise to maintain backwards compatibility. Changes will be noted in the UDB changelog.

  • The output includes addresses in the recorded program, which can be arbitrary 64-bit numbers. Some JSON-processing tools, notably jq, cannot represent integers larger than 2^53 exactly, and round them to the nearest double-precision floating-point number instead. Python's json module parses integers with arbitrary precision, so it is safe in this respect.
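
The rounding described in the last caution can be reproduced in Python itself. This is a minimal sketch: the value is a hypothetical address chosen to sit just above the limit, and parse_int=float is used only to imitate a tool that stores every number as a double.

```python
import json

# Hypothetical address one above 2**53, chosen so that rounding is visible.
text = '{"begin": 9007199254740993}'

# Default behaviour: integers are parsed as arbitrary-precision int.
exact = json.loads(text)["begin"]

# parse_int=float imitates a tool that keeps all numbers as doubles.
rounded = json.loads(text, parse_int=float)["begin"]

print(exact)         # 9007199254740993
print(int(rounded))  # 9007199254740992
```

The two printed values differ by one, which is enough to make an address comparison fail silently, so address arithmetic is best done with a tool that keeps integers exact.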