Automation API: examples

To demonstrate how to use the Automation API we will write a simple script which counts how many times a function is hit during the execution of a recorded program.

Preparation

First of all, we need to write the code on which to demonstrate the Automation API. This is a simple hello world program which prints the “Hello world” string three times:

#include <stdio.h>

int
main()
{
    for (int i = 0; i < 3; i++)
    {
        printf("[%d] Hello world!\n", i);
    }

    return 0;
}

Assuming the file with the above source code is called print-hello-world.c, it can be compiled with:

gcc -O0 -ggdb -o print-hello-world print-hello-world.c

Where:

  • -O0 disables all compiler optimisations.
  • -ggdb generates debug symbols in the most expressive format which can be loaded by GDB (and UDB).
  • -o print-hello-world makes the output file be named print-hello-world.
  • print-hello-world.c is the name of the source file.

Executing the program will produce the following output:

$ ./print-hello-world
[0] Hello world!
[1] Hello world!
[2] Hello world!

For this example, we are going to use a LiveRecorder recording of the print-hello-world program. This can be generated with the live-record tool:

$ live-record --recording-file print-hello-world.undo ./print-hello-world
live-record: Maximum event log size is 1G
[0] Hello world!
[1] Hello world!
[2] Hello world!
live-record: Saving to '/.../print-hello-world.undo'...
live-record: Termination recording written to /.../print-hello-world.undo

This generates a print-hello-world.undo recording file in the current directory.

Extending UDB to count the calls to a function

First of all, we need to start UDB and use the uload command to load the recording produced in the previous section:

$ udb
[...]
(udb) uload print-hello-world.undo

The debugged program is at the beginning of recorded history. Start debugging
from here or, to proceed towards the end, use:
    continue - to replay from the beginning
    ugo end  - to jump straight to the end of history

No more reverse-execution history.
0x00007f9ef3bb8100 in _start ()
      from /tmp/undodb.[...]/ld-linux-x86-64.so.2
(udb)

Then we need to write the code which counts the number of times a function is executed. This can be done by:

  1. Ensuring that execution is at the beginning of history.
  2. Setting a breakpoint on the function using the gdb.Breakpoint class.
  3. Continuing replay until the end of the recording is reached.
  4. Checking the value of gdb.Breakpoint.hit_count.

This can be done using the following code (saved in a file called counter.py):

import gdb

from undodb.debugger_extensions import udb


def count_calls(func_name):
    """
    Counts how many times func_name is hit during the replay of the currently
    loaded recording.
    """
    # Go to the beginning of the recording so we count function calls during
    # the whole execution history.
    udb.time.goto_start()

    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Do "continue" until we have gone through the whole recording, potentially
    # hitting the breakpoint several times.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while udb.time.get().bbcount < end_of_time:
        gdb.execute("continue")

    print(f'The recording hit "{func_name}" {bp.hit_count} time(s).')

It’s now possible to source the file inside UDB and call the count_calls function to count how many times the printf function is called:

(udb) source counter.py
(udb) python count_calls('printf')

No more reverse-execution history.
0x00007fbc0a76d100 in _start ()
   from /tmp/undodb.[...]/ld-linux-x86-64.so.2
Breakpoint 1 at 0x56103061c050

Breakpoint 1, __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
28   printf.c: No such file or directory.
[0] Hello world!

Breakpoint 1, __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
28   in printf.c
[1] Hello world!

Breakpoint 1, __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
28   in printf.c
[2] Hello world!

Program received signal SIGSTOP, Stopped (signal).
Have reached end of recorded history.
The recorded program has exited.
(You may use undodb commands to go backwards.)
0x00007fbc0a656fe4 in __GI__exit (status=status@entry=0) at ../sysdeps/unix/sysv/linux/_exit.c:31
31   ../sysdeps/unix/sysv/linux/_exit.c: No such file or directory.
The recording hit "printf" 3 time(s).
(udb)

In particular:

  • source counter.py sources the file into UDB (which recognises automatically that the file is a Python file).
  • python count_calls('printf') calls the count_calls function with the 'printf' string as argument.
  • The recording hit "printf" 3 time(s) is printed by the count_calls function and shows how many times printf was called in the recording.

Rather than calling a Python function, it’s possible to create a new UDB command by inheriting from the gdb.Command class:

import gdb

from undodb.debugger_extensions import udb


def count_calls(func_name):
    """
    Counts how many times func_name is hit during the replay of the currently
    loaded recording and returns the hit count.
    """
    # Go to the beginning of the recording so we count function calls during
    # the whole execution history.
    udb.time.goto_start()

    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Do "continue" until we have gone through the whole recording, potentially
    # hitting the breakpoint several times.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while udb.time.get().bbcount < end_of_time:
        gdb.execute("continue")

    return bp.hit_count


class CountCalls(gdb.Command):
    def __init__(self):
        # Register this class for the command called "count-calls" in the
        # category of user commands.
        super().__init__("count-calls", gdb.COMMAND_USER)

    def invoke(self, args, from_tty):
        # This method is called by GDB when the command is used.
        # The args argument is what the user types as command arguments which,
        # for this command, is the function name.
        hit_count = count_calls(args)

        print(f'The recording hit "{args}" {hit_count} time(s).')


# Register the command by allocating its class.
CountCalls()

After sourcing the modified file, it’s possible to call the count-calls command passing the function name to count as argument:

(udb) source counter.py
(udb) count-calls printf
[...]
The recording hit "printf" 3 time(s).
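
The invoke method receives everything typed after the command name as one raw string, so running count-calls with no argument would hand an empty string to gdb.Breakpoint. The following is a minimal sketch of a validation step, assuming the command should accept exactly one function name. parse_func_name is a hypothetical helper, not part of GDB or UDB:

```python
def parse_func_name(args):
    """
    Returns the function name contained in the raw argument string of the
    count-calls command (hypothetical helper, not part of GDB or UDB).
    """
    # Drop any whitespace typed around the function name.
    func_name = args.strip()
    if not func_name:
        # Nothing was typed after the command name.
        raise ValueError("Usage: count-calls FUNCTION_NAME")
    return func_name


print(parse_func_name("  printf "))
```

Inside CountCalls.invoke, the ValueError could be converted into a gdb.GdbError, which GDB reports as a plain error message rather than as a Python traceback.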

Automating UDB’s execution

Adding commands in UDB is helpful for interactive use but, in some cases, it’s more helpful to have a script usable from the shell which hides UDB completely.

For instance, in the case of the example above, it would be easier to use a script which can be invoked like this:

$ ./count_calls.py print-hello-world.undo printf
The recording hit "printf" 3 time(s).

This can be achieved by having two Python files:

  • A launcher script which sets UDB up and executes it.
  • An extension script which extends UDB and runs inside it.

The script running outside UDB can use the udb_launcher.UdbLauncher class to execute UDB. This class allows the script to:

  • Configure which command line arguments are passed to UDB.
  • Configure which program or recording to load inside UDB.
  • Configure which extension Python scripts to load inside UDB.
  • Pass data to the Python scripts running inside UDB through the run_data attribute of the UdbLauncher instance. run_data is a dictionary which can contain arbitrary data as long as it can be serialised with the Python pickle module. See Python object serialization for details.
  • Redirect the output of UDB to a file or collect it in memory.
  • Launch UDB with the run_debugger() method.
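
Because run_data must be serialisable, it can be useful to check the values before launching UDB. The following is a minimal sketch using only the standard pickle module. check_picklable and the max_hits and callback keys are hypothetical names used for illustration and are not part of the Undo API:

```python
import pickle


def check_picklable(run_data):
    """
    Returns the list of keys in run_data whose values cannot be serialised
    with pickle (hypothetical helper, not part of the Undo API).
    """
    bad_keys = []
    for key, value in run_data.items():
        try:
            pickle.dumps(value)
        except (pickle.PicklingError, AttributeError, TypeError):
            bad_keys.append(key)
    return bad_keys


candidate = {
    "func_name": "printf",
    "max_hits": 10,
    # Lambdas cannot be pickled, so this entry is reported.
    "callback": lambda: None,
}
print(check_picklable(candidate))
```

Plain strings, numbers, lists and dictionaries are safe choices; objects such as lambdas, generators or open files are not.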

Note that the script launching UDB cannot use the gdb and debugger_extensions modules as those are available only inside UDB itself.

To execute scripts using the UdbLauncher class, you need to use the udb-automate program which makes sure that the Python environment is set up correctly to execute Undo code. The easiest way to do this is to make the script executable and then add the following line at the very top of the script:

#! /usr/bin/env udb-automate

See the udb-automate section for more details.

How to write a separate runner script

Based on what was explained in the previous section, we can write a count_calls.py file which uses UdbLauncher:

#! /usr/bin/env udb-automate
"""
UDB Automation command-line script that counts the calls to a function in a
LiveRecorder recording.
"""

import sys
import textwrap

from undodb.udb_launcher import (
    REDIRECTION_COLLECT,
    UdbLauncher,
)


def main(argv):
    # Get the arguments from the command line.
    try:
        recording, func_name = argv[1:]
    except ValueError:
        # Wrong number of arguments.
        print(f"Usage: {argv[0]} RECORDING_FILE FUNCTION_NAME", file=sys.stderr)
        raise SystemExit(1)

    # Prepare for launching UDB.
    launcher = UdbLauncher()
    # Make UDB run with our recording.
    launcher.recording_file = recording
    # Make UDB load the count_calls_extension.py file from the current
    # directory.
    launcher.add_extension("count_calls_extension")
    # Tell the extension which function name it needs to check.
    # The run_data attribute is a dictionary in which arbitrary data can be
    # stored and passed to the extension (as long as it can be serialised using
    # the Python pickle module).
    launcher.run_data["func_name"] = func_name
    # Finally, launch UDB!
    # We collect the output as, in normal conditions, we don't want to show it
    # to the user but, in case of errors, we want to display it.
    res = launcher.run_debugger(redirect_debugger_output=REDIRECTION_COLLECT)

    if res.exit_code == 0:
        # All good as UDB exited with exit code 0 (i.e. no errors).
        print(
            'The recording hit "{}" {} time(s).'.format(
                func_name,
                # The result_data attribute is analogous to UdbLauncher.run_data but
                # it's used to pass information the opposite way, from the extension
                # to this script.
                res.result_data["hit-count"],
            )
        )
    else:
        # Something went wrong! Print a useful message.
        print(
            textwrap.dedent(
                """\
                Error!
                UDB exited with code {res.exit_code}.

                The output was:

                {res.output}
                """
            ).format(res=res),
            file=sys.stderr,
        )
        # Exit this script with the same error code as UDB.
        raise SystemExit(res.exit_code)


if __name__ == "__main__":
    main(sys.argv)

The count_calls.py script needs to be made executable so it can be run directly:

$ chmod +x ./count_calls.py

Now we need to write a count_calls_extension.py file (based on the counter.py file written in the previous section) to be loaded inside UDB:

"""
UDB Automation extension module for counting the number of calls to a
function.
"""

import gdb

from undodb.debugger_extensions import udb


def count_calls(func_name):
    """
    Counts how many times func_name is hit during the replay of the currently
    loaded recording and returns the hit count.
    """
    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Do "continue" until we have gone through the whole recording, potentially
    # hitting the breakpoint several times.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while udb.time.get().bbcount < end_of_time:
        gdb.execute("continue")

    return bp.hit_count


# UDB will automatically load the modules passed to UdbLauncher.add_extension
# and, if present, automatically execute any function (with no arguments) called
# "run".
def run():
    # The function where to stop is passed to us from the outer script in the
    # run_data dictionary.
    func_name = udb.run_data["func_name"]

    hit_count = count_calls(func_name)

    # Pass the number of times we hit the breakpoint back to the outer script.
    udb.result_data["hit-count"] = hit_count

It’s now possible to execute the script:

$ ./count_calls.py print-hello-world.undo printf
The recording hit "printf" 3 time(s).
$ ./count_calls.py print-hello-world.undo main
The recording hit "main" 1 time(s).
$ ./count_calls.py print-hello-world.undo this_does_not_exist
The recording hit "this_does_not_exist" 0 time(s).

Advanced output control

The UdbLauncher.run_debugger() method offers many ways of redirecting the output of the debugger and of programs running in the debugger.

In the example in the previous section, the output from UDB is quite verbose; for instance, it prints some information every time a breakpoint is hit. This is of little interest to a user who only cares about the final result (i.e. how many times a function is called), so the output is collected in memory rather than printed on standard output. The redirection is achieved by setting the redirect_debugger_output argument of run_debugger() to udb_launcher.REDIRECTION_COLLECT. In case of error, the collected output can be printed to standard error to help debugging.

It’s also possible to redirect the output to a file-like object (a file opened with open(), a StringIO, etc.) or discard it completely. See UdbLauncher.run_debugger() for details.
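
As an illustration of the file-like option, the following sketch collects the debugger output in a StringIO buffer. It assumes that launcher is an already configured UdbLauncher instance and that the debugger output is written to the buffer as text. run_and_capture is a hypothetical helper, not part of the Undo API:

```python
import io


def run_and_capture(launcher):
    """
    Runs the debugger through launcher and returns a (result, output) tuple,
    where output is the text written by the debugger (hypothetical helper).
    """
    # Any file-like object could be used here; a StringIO keeps the output
    # in memory.
    buffer = io.StringIO()
    result = launcher.run_debugger(redirect_debugger_output=buffer)
    return result, buffer.getvalue()
```

In a script executed by udb-automate, this could be called as res, output = run_and_capture(launcher) once the recording file and the extensions have been set on the launcher.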

There are cases in which it’s convenient to redirect the output like in the example, but still allow some output from the extension script to be printed directly on standard output. For instance, in count_calls_extension.py, we may want to print some extra information (the current time in execution history and the backtrace) every time the function being counted is called.

This can be achieved using the debugger_extensions.debugger_io.redirect_to_launcher_output() context manager. Anything executed in the with block will produce output directly on standard output.

The count_calls_extension.py example can be modified like this:

"""
UDB Automation extension module for counting the number of calls to a
function.
"""

import gdb

from undodb.debugger_extensions import udb
from undodb.debugger_extensions.debugger_io import redirect_to_launcher_output


def count_calls(func_name):
    """
    Counts how many times func_name is hit during the replay of the currently
    loaded recording and returns the hit count.

    Every time func_name is hit the current backtrace and time in execution
    history are printed.
    """
    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Do "continue" until we have gone through the whole recording, potentially
    # hitting the breakpoint several times.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while True:
        gdb.execute("continue")

        # Rather than having the check directly in the while condition we have
        # it here as we don't want to print the backtrace when we hit the end of
        # the recording but only when we stop at a breakpoint.
        if udb.time.get().bbcount >= end_of_time:
            break

        with redirect_to_launcher_output():
            # The output from the code in this "with" block is going to go to
            # the standard output of count_calls.py instead of being redirected
            # with the rest of the output.
            print(f"The backtrace at time {udb.time.get()} is:")
            gdb.execute("backtrace")
            print()

    return bp.hit_count


# UDB will automatically load the modules passed to UdbLauncher.add_extension
# and, if present, automatically execute any function (with no arguments) called
# "run".
def run():
    # The function where to stop is passed to us from the outer script in the
    # run_data dictionary.
    func_name = udb.run_data["func_name"]

    hit_count = count_calls(func_name)

    # Pass the number of times we hit the breakpoint back to the outer script.
    udb.result_data["hit-count"] = hit_count

The count_calls.py script will now show this output:

$ ./count_calls.py print-hello-world.undo printf
The backtrace at time 17,268:0x7fbc0a5d5d70 is:
#0  __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
#1  0x000056103061c174 in main () at print-hello-world.c:7

The backtrace at time 37,063:0x7fbc0a5d5d70 is:
#0  __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
#1  0x000056103061c174 in main () at print-hello-world.c:7

The backtrace at time 37,239:0x7fbc0a5d5d70 is:
#0  __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
#1  0x000056103061c174 in main () at print-hello-world.c:7

The recording hit "printf" 3 time(s).

Other examples

Other examples of using the Automation API are available in the undoio/addons GitHub repository.

All the addons are free and reusable, and distributed under the 3-clause BSD licence.