Automation API: examples

To demonstrate the Automation API we will count how many times a function was called in a program’s execution history.

Preparation

This is a simple program which prints “Hello, world!” three times:

#include <stdio.h>

int
main(void)
{
    for (int i = 0; i < 3; i++)
    {
        printf("[%d] Hello, world!\n", i);
    }
    return 0;
}

Use the live-record tool to make a LiveRecorder recording of the program’s execution:

$ live-record -o hello.undo examples/hello-world
live-record: Maximum event log size is 1G
[0] Hello, world!
[1] Hello, world!
[2] Hello, world!
live-record: Saving to 'hello.undo'...
live-record: Termination recording written to hello.undo
live-record: Detaching...

Counting calls

Start UDB and use the uload command to load the recording:

$ udb -q
not running> uload hello.undo

The debugged program is at the beginning of recorded history. Start debugging
from here or, to proceed towards the end, use:
    continue - to replay from the beginning
    ugo end  - to jump straight to the end of history

No more reverse-execution history.
0x0000557b3eb92550 in _start ()

To count the number of times a function is executed, we must:

  1. Go to the start of execution history, using udb.time.goto_start().
  2. Set a breakpoint on the function, using gdb.Breakpoint.
  3. Replay to the end of execution history, using gdb.execute() in a loop.
  4. Print the breakpoint’s hit_count.

This is implemented in examples/automation/count_calls.py:

import gdb

from undodb.debugger_extensions import udb


def count_calls(func_name: str):
    """
    Return count of calls to func_name in execution history.
    """
    # Go to the start of execution history so we count all calls.
    udb.time.goto_start()

    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Repeatedly "continue" until we reach the end of execution history.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while udb.time.get().bbcount < end_of_time:
        gdb.execute("continue")

    print(f"The recording hit {func_name!r} {bp.hit_count} time(s).")

Use the source command to load this file, and the python command to execute the count_calls() function.

start 1> source examples/automation/count_calls.py
start 1> python count_calls("printf")

No more reverse-execution history.
0x0000557b3eb92550 in _start ()
Breakpoint 1 at 0x7f744b957f70: file printf.c, line 28.

Breakpoint 1, __printf (format=0x557b3eb92724 "[%d] Hello, world!\n") at printf.c:28
28      printf.c: No such file or directory.
[0] Hello, world!

Breakpoint 1, __printf (format=0x557b3eb92724 "[%d] Hello, world!\n") at printf.c:28
28      in printf.c
[1] Hello, world!

Breakpoint 1, __printf (format=0x557b3eb92724 "[%d] Hello, world!\n") at printf.c:28
28      in printf.c
[2] Hello, world!

Program received signal SIGSTOP, Stopped (signal).
Have reached end of recorded history.

The recorded program has exited.
You can use UDB reverse commands to go backwards; see "help udb" for details.

0x00007f744b9d7bd6 in __GI__exit (status=0) at ../sysdeps/unix/sysv/linux/_exit.c:31
31      ../sysdeps/unix/sysv/linux/_exit.c: No such file or directory.
The recording hit 'printf' 3 time(s).

If the syntax of the python command proves inconvenient, you can implement a new UDB command by deriving from the gdb.Command class:

import gdb

from undodb.debugger_extensions import udb


def count_calls(func_name: str) -> int:
    """
    Return count of calls to func_name in execution history.
    """
    # Go to the start of execution history so we count all calls.
    udb.time.goto_start()

    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Repeatedly "continue" until we reach the end of execution history.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while udb.time.get().bbcount < end_of_time:
        gdb.execute("continue")

    return bp.hit_count


class CountCalls(gdb.Command):
    def __init__(self):
        # Register this class for the command called "count-calls" in the
        # category of user commands.
        super().__init__("count-calls", gdb.COMMAND_USER)

    def invoke(self, args: str, from_tty: bool) -> None:
        # This method is called by GDB when the command is used.
        # The args argument is what the user types as command arguments which,
        # for this command, is the function name.
        count = count_calls(args)

        print(f"The program called {args!r} {count} time(s).")


# Register the command by calling the constructor.
CountCalls()

This defines a count-calls command that takes the name of the function as an argument and passes it to count_calls().

end 4,502> source examples/automation/count_calls_command.py
end 4,502> count-calls printf
The program called 'printf' 3 time(s).
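
Because invoke() receives the argument string exactly as the user typed it, count-calls with no argument (or with several) is handed unchanged to gdb.Breakpoint. The following is a minimal sketch of checking the argument first, written in plain Python so it can be tried outside UDB. The helper name parse_function_name is hypothetical and not part of the shipped example; inside invoke() the ValueError could be re-raised as gdb.GdbError, which GDB reports without a Python traceback.

```python
import shlex


def parse_function_name(args: str) -> str:
    """Return the single function name in args, or raise ValueError.

    Hypothetical helper: not part of the Undo examples.
    """
    # shlex.split strips surrounding whitespace and honours quoting.
    words = shlex.split(args)
    if len(words) != 1:
        raise ValueError("usage: count-calls FUNCTION_NAME")
    return words[0]


print(parse_function_name("  printf "))  # prints: printf
```

With this approach a name that contains spaces, such as a C++ signature, would have to be quoted by the user.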

Automating UDB’s execution

For repetitive tasks, it can be more convenient to control UDB directly than to implement new interactive commands. For instance, we might want a script that takes a LiveRecorder recording and a function name and outputs the number of calls to the function in the recording.

$ examples/automation/count_calls_script.py hello.undo printf
The recording hit 'printf' 3 time(s).

This can be achieved by having two Python files:

  • A launcher script which sets UDB up and executes it.
  • An extension script which extends UDB and runs inside it.

The script running outside UDB can use the udb_launcher.UdbLauncher class to execute UDB. This class allows the script to:

  • Configure which command line arguments are passed to UDB.
  • Configure which program or recording to load inside UDB.
  • Configure which extension Python scripts to load inside UDB.
  • Pass data to the Python scripts running inside UDB through the run_data attribute of the UdbLauncher instance. This is a dictionary which can contain arbitrary data as long as it can be serialized with Python’s pickle module. See Python object serialization for details.
  • Redirect the output of UDB to a file or collect it in memory.
  • Launch UDB with the run_debugger() method.
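
Since run_data must be serializable, it can help to check values before launching UDB rather than discovering the problem at launch time. The following is a minimal sketch using only the standard pickle module; the helper name is_picklable is hypothetical, and the dictionary is an ordinary dict standing in for launcher.run_data.

```python
import pickle


def is_picklable(value) -> bool:
    """Return True if value survives a pickle round trip."""
    try:
        pickle.loads(pickle.dumps(value))
    except Exception:
        return False
    return True


# An ordinary dict standing in for launcher.run_data.
run_data = {"func_name": "printf", "limits": [1, 2, 3]}
print(all(is_picklable(v) for v in run_data.values()))  # prints: True

# Lambdas cannot be pickled, so they cannot be passed this way.
print(is_picklable(lambda: None))  # prints: False
```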

Note that the script launching UDB cannot use the gdb and debugger_extensions modules as those are available only inside UDB itself.

To execute scripts that use the UdbLauncher class, you need the udb-automate program, which makes sure that the Python environment is set up correctly to execute Undo code. The easiest way to do this is to make the script executable and add the following line at the very top of the script:

#! /usr/bin/env udb-automate

See the udb-automate section for more details.

How to write a separate runner script

Based on what was explained in the previous section, we can write a count_calls_script.py file which uses UdbLauncher:

#! /usr/bin/env udb-automate
"""
UDB Automation command-line script that counts the calls to a function in a
LiveRecorder recording.
"""

import os
import sys
import textwrap

from undodb.udb_launcher import (
    REDIRECTION_COLLECT,
    UdbLauncher,
)


def main(argv):
    # Get the arguments from the command line.
    try:
        recording, func_name = argv[1:]
    except ValueError:
        # Wrong number of arguments.
        print(f"{sys.argv[0]} RECORDING_FILE FUNCTION_NAME", file=sys.stderr)
        raise SystemExit(1)

    # Prepare for launching UDB.
    launcher = UdbLauncher()
    # Make UDB run with our recording.
    launcher.recording_file = recording
    # Make UDB load the count_calls_extension.py file from the same directory
    # as this script.
    sys.path.append(os.path.dirname(__file__))
    launcher.add_extension("count_calls_extension")
    # Tell the extension which function name it needs to check.
    # The run_data attribute is a dictionary in which arbitrary data can be
    # stored and passed to the extension (as long as it can be serialised using
    # the Python pickle module).
    launcher.run_data["func_name"] = func_name
    # Finally, launch UDB!
    # We collect the output as, in normal conditions, we don't want to show it
    # to the user but, in case of errors, we want to display it.
    res = launcher.run_debugger(redirect_debugger_output=REDIRECTION_COLLECT)

    if res.exit_code == 0:
        # All good as UDB exited with exit code 0 (i.e. no errors).

        # The result_data attribute is analogous to UdbLauncher.run_data but
        # it's used to pass information the opposite way, from the extension
        # to this script.
        count = res.result_data["hit-count"]
        print(f"The recording hit {func_name!r} {count} time(s).")
    else:
        # Something went wrong! Print a useful message.
        print(
            textwrap.dedent(
                """\
                Error!
                UDB exited with code {res.exit_code}.

                The output was:

                {res.output}
                """
            ).format(res=res),
            file=sys.stderr,
        )
        # Exit this script with the same error code as UDB.
        raise SystemExit(res.exit_code)


if __name__ == "__main__":
    main(sys.argv)
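
The launcher above unpacks argv by hand. As an alternative, here is a sketch of the same command line handled with the standard argparse module, which generates the usage message automatically. It assumes the script receives its arguments in sys.argv as the original does; note that argparse exits with status 2 on a usage error, whereas the original exits with 1.

```python
import argparse


def parse_args(argv):
    """Parse the launcher's arguments (argv excludes the program name)."""
    parser = argparse.ArgumentParser(
        prog="count_calls_script.py",
        description="Count the calls to a function in a LiveRecorder recording.",
    )
    parser.add_argument("recording", metavar="RECORDING_FILE")
    parser.add_argument("func_name", metavar="FUNCTION_NAME")
    return parser.parse_args(argv)


args = parse_args(["hello.undo", "printf"])
print(args.recording, args.func_name)  # prints: hello.undo printf
```

In main(), this would be called as parse_args(argv[1:]), with args.recording and args.func_name used in place of the unpacked variables.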

Now we need to write a count_calls_extension.py file (based on the count_calls.py file written in the previous section) to be loaded inside UDB:

"""
UDB Automation extension module for counting the number of calls to a
function.
"""

import gdb

from undodb.debugger_extensions import udb


def count_calls(func_name):
    """
    Counts how many times func_name is hit during the replay of the currently
    loaded recording and returns the hit count.
    """
    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Do "continue" until we have gone through the whole recording, potentially
    # hitting the breakpoint several times.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while udb.time.get().bbcount < end_of_time:
        gdb.execute("continue")

    return bp.hit_count


# UDB will automatically load the modules passed to UdbLauncher.add_extension
# and, if present, automatically execute any function (with no arguments) called
# "run".
def run():
    # The name of the function to count is passed to us from the outer script
    # in the run_data dictionary.
    func_name = udb.run_data["func_name"]

    hit_count = count_calls(func_name)

    # Pass the number of times we hit the breakpoint back to the outer script.
    udb.result_data["hit-count"] = hit_count

It’s now possible to execute the script:

$ examples/automation/count_calls_script.py hello.undo printf
The recording hit 'printf' 3 time(s).
$ examples/automation/count_calls_script.py hello.undo main
The recording hit 'main' 1 time(s).
$ examples/automation/count_calls_script.py hello.undo no_such_function
The recording hit 'no_such_function' 0 time(s).

Advanced output control

The run_debugger() method offers many ways of redirecting the output of the debugger and of programs running in the debugger.

In the example in the previous section, UDB’s output is quite verbose; for instance, it prints information every time a breakpoint is hit. This is of little interest to a user who only cares about the final result (i.e. how many times a function is called), so the output is collected in memory rather than printed on standard output. The redirection is achieved by setting the redirect_debugger_output argument of run_debugger() to udb_launcher.REDIRECTION_COLLECT. If UDB fails, the collected output can be printed to standard error to help with debugging.

It’s also possible to redirect the output to a file-like object (a file opened with open(), a StringIO, etc.) or discard it completely. See UdbLauncher.run_debugger() for details.

There are cases in which it’s convenient to redirect the output as in the example, but still allow some output from the extension script to be printed directly on standard output. For instance, in count_calls_extension.py, we may want to print some extra information (the current time in execution history and the backtrace) every time the function being counted is called.

This can be achieved using the debugger_extensions.debugger_io.redirect_to_launcher_output() context manager. Anything executed in the with block will produce output directly on standard output.
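
The idea of a nested block escaping an outer redirection can be illustrated in plain Python. The following is only an analogy built on the standard contextlib.redirect_stdout; it is not how UDB implements redirect_to_launcher_output(), and both StringIO objects are stand-ins chosen for this sketch.

```python
import contextlib
import io

collected = io.StringIO()  # plays the role of the collected debugger output
direct = io.StringIO()  # stands in for the launcher's standard output

with contextlib.redirect_stdout(collected):
    print("Breakpoint 1, __printf (...)")  # verbose output, collected
    with contextlib.redirect_stdout(direct):
        print("The backtrace is: ...")  # escapes the outer redirection
    print("Have reached end of recorded history.")  # collected again

# Only the line printed in the inner block reached the "direct" stream.
print(direct.getvalue(), end="")  # prints: The backtrace is: ...
```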

The count_calls_extension.py example can be modified like this:

"""
UDB Automation extension module for counting the number of calls to a
function.
"""

import gdb

from undodb.debugger_extensions import udb
from undodb.debugger_extensions.debugger_io import redirect_to_launcher_output


def count_calls(func_name):
    """
    Counts how many times func_name is hit during the replay of the currently
    loaded recording and returns the hit count.

    Every time func_name is hit, the current backtrace and time in execution
    history are printed.
    """
    # Set a breakpoint for the specified function.
    bp = gdb.Breakpoint(func_name)

    # Do "continue" until we have gone through the whole recording, potentially
    # hitting the breakpoint several times.
    end_of_time = udb.get_event_log_extent().max_bbcount
    while True:
        gdb.execute("continue")

        # Rather than having the check directly in the while condition we have
        # it here as we don't want to print the backtrace when we hit the end of
        # the recording but only when we stop at a breakpoint.
        if udb.time.get().bbcount >= end_of_time:
            break

        with redirect_to_launcher_output():
            # The output from the code in this "with" block goes to the
            # standard output of the launcher script instead of being
            # redirected with the rest of the output.
            print(f"The backtrace at time {udb.time.get()} is:")
            gdb.execute("backtrace")
            print()

    return bp.hit_count


# UDB will automatically load the modules passed to UdbLauncher.add_extension
# and, if present, automatically execute any function (with no arguments) called
# "run".
def run():
    # The name of the function to count is passed to us from the outer script
    # in the run_data dictionary.
    func_name = udb.run_data["func_name"]

    hit_count = count_calls(func_name)

    # Pass the number of times we hit the breakpoint back to the outer script.
    udb.result_data["hit-count"] = hit_count

The launcher script now shows this output:

$ ./count_calls.py print-hello-world.undo printf
The backtrace at time 17,268:0x7fbc0a5d5d70 is:
#0  __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
#1  0x000056103061c174 in main () at print-hello-world.c:7

The backtrace at time 37,063:0x7fbc0a5d5d70 is:
#0  __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
#1  0x000056103061c174 in main () at print-hello-world.c:7

The backtrace at time 37,239:0x7fbc0a5d5d70 is:
#0  __printf (format=0x56103061d004 "[%d] Hello world!\n") at printf.c:28
#1  0x000056103061c174 in main () at print-hello-world.c:7

The recording hit "printf" 3 time(s).

Other examples

Other examples of using the Automation API are available in the undoio/addons GitHub repository.

All the addons are free and reusable, and distributed under the 3-clause BSD licence.