Troubleshooting

Application failures are not being captured in recordings

  • If the program is multi-threaded, use the Thread Fuzzing feature. This permutes the scheduling of threads in the program while it is being recorded, to increase the frequency with which concurrency bugs are reproduced. Use the --thread-fuzzing option to live-record, or the UNDO_tf environment variable.

  • Run the program many times (tens, hundreds, or even thousands) until it fails and a recording is saved.

    When using the live-record tool, use the --retry-for option to run the program for a specified duration or until a recording is saved, and the --save-on error option to save a recording only if the program exits on a signal or exits with a non-zero status.

    When using the LiveRecorder API, call undolr_save_on_termination_cancel() if the program succeeds, to avoid unnecessarily saving a recording in this case.

  • Run the stress-ng program in another terminal to keep other CPUs busy. On Ubuntu, stress-ng can be installed using:

    $ sudo apt install stress-ng

    For details on using stress-ng, see the stress-ng manual.
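
The advice above to run the program many times until it fails can be sketched as a small shell loop. This is only an illustration for setups where the program is not started through live-record (which already offers the --retry-for option described above); the helper name run_until_failure and its arguments are hypothetical and are not part of any Undo tool.

```shell
# Run a command repeatedly until it fails or the attempt limit is reached.
# Usage: run_until_failure MAX_ATTEMPTS COMMAND [ARGS...]
# Returns 0 if a failure was observed, 1 if every attempt succeeded.
# (Hypothetical helper, written for illustration only.)
run_until_failure() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            # The run succeeded, so try again.
            attempt=$((attempt + 1))
        else
            status=$?
            echo "attempt $attempt failed with status $status"
            return 0
        fi
    done
    echo "no failure in $max_attempts attempts"
    return 1
}
```

For example, run_until_failure 500 ./my_test would stop at the first failing run of a hypothetical ./my_test binary, which is the run whose recording is worth keeping.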

Recordings don’t have debug symbols

  • If your build system does not already generate debugging information, it must be changed to do so (for example, by using the -ggdb compile option for GCC and Clang).

  • If the executables that are being recorded have been intentionally stripped, the original debug files must be provided separately in the host environment where the recording is replayed.

    • Your build system must ensure that the original debug files are retained before the executables are stripped, for example using the -gsplit-dwarf option to GCC.

    • Depending on how the original debug files are retained, you may need to use different UDB commands to load them (set sysroot, set debug-file-directory, symbol-file, add-symbol-file).

    • It can be helpful to create an integration script to automate the process of locating the original debug files and configuring UDB to load them.
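
As a rough sketch of such an integration script, the function below prints the set sysroot and set debug-file-directory commands named above for a given pair of directories. The function name debug_setup_commands and the directory layout are assumptions made for illustration; a real script would also need to locate the correct debug files for each build.

```shell
# Print debugger commands that point UDB at separately stored debug files.
# Usage: debug_setup_commands SYSROOT DEBUG_DIR
# Either argument may be empty, in which case that command is omitted.
# (Hypothetical helper, written for illustration only.)
debug_setup_commands() {
    sysroot=$1
    debug_dir=$2
    if [ -n "$sysroot" ]; then
        printf 'set sysroot %s\n' "$sysroot"
    fi
    if [ -n "$debug_dir" ]; then
        printf 'set debug-file-directory %s\n' "$debug_dir"
    fi
}
```

The output can be redirected into a command file, for example debug_setup_commands /srv/ci/sysroot /srv/ci/debug > udb-init.gdb (both paths are placeholders). Loading that file is assumed here to work as it does in GDB, for instance with the source command.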

Recordings are filling up the disks in Continuous Integration (CI)

Generating a large number of recordings can cause the storage on your test systems to fill up.

  • Consider saving recordings only if there’s evidence of an application failure. If using the LiveRecorder command-line tool, the --save-on option restricts saving to the circumstances you specify.

  • If using the LiveRecorder API, consider saving recordings in crash-handling code by calling undolr_save_on_termination() and undolr_save_on_termination_cancel() rather than in your normal termination code path.

  • Consider uploading recordings to elastic cloud storage immediately after they are generated. They can then be deleted locally to save disk space.
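
The upload-then-delete step can be sketched as follows. This is a minimal illustration, assuming that recordings carry a .undo file extension and that the upload command accepts the recording path as its last argument; the helper name upload_and_remove is hypothetical, and the upload command itself depends on your storage provider.

```shell
# Upload each recording in a directory, deleting the local copy only if
# the upload command succeeds.
# Usage: upload_and_remove DIR UPLOAD_COMMAND [ARGS...]
# The recording path is appended as the last argument of UPLOAD_COMMAND.
# (Hypothetical helper; the .undo extension is an assumption.)
upload_and_remove() {
    dir=$1
    shift
    for recording in "$dir"/*.undo; do
        # Skip the unexpanded pattern when the directory has no recordings.
        [ -e "$recording" ] || continue
        if "$@" "$recording"; then
            rm -f -- "$recording"
        else
            echo "upload failed, keeping $recording" >&2
        fi
    done
}
```

Deleting only after a successful upload means a storage outage leaves the recordings in place rather than losing them. If your upload tool expects the source path somewhere other than the last argument, wrap it in a small function that takes the path as its only argument.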