Post Failure Logging

Post Failure Logging is a tool to insert logging statements into a deployed application. For instance, it’s possible to analyse a LiveRecorder recording of a failed program execution and log the value of a variable when a function is called or some other condition is met.

Post Failure Logging enables debugging of applications in settings with Personally Identifiable Information (PII) and Intellectual Property (IP) concerns.

Two features of Post Failure Logging make this possible:

  • The ability to load LiveRecorder recordings containing encrypted symbols.
  • A straight-forward probe file syntax.

These features support one of the primary use cases for Post Failure Logging. This is when an engineer without either complete knowledge of the system being debugged or access to necessary systems must diagnose a failure, potentially on a customer site.

In such situations the on-site engineer’s lack of specific system knowledge might make a traditional debugging session with UDB impractical. Yet provided with probe files written by a more experienced engineer off-site, it may be possible to gain some useful insights into why the application is failing.

Since probe files are simple text files they can easily be sent from the off-site engineer to the on-site engineer by email, they could even be dictated over the phone if necessary.

Usage

The Post Failure Logging functionality is implemented as a command line tool called postfailurelog. Tool usage is shown below:

usage: postfailurelog
       [-h] [--start TIME] [--end TIME] [--enc-symbols SYMBOLS_FILE]
       RECORDING PROBE_FILE

The postfailurelog tool takes two mandatory command line arguments: a path to a LiveRecorder Recording, and a path to a Post Failure Logging probe file.

Probe files

An example probe file:

hello.c:12 => log i

The probe file above contains a single probe entry, which specifies that when line 12 of hello.c is reached, Post Failure Logging logs the value of the variable i.

The syntax for a probe entry is as follows:

breakpoint => action [,ARGS...]

breakpoint can be anything that would also be a valid breakpoint in UDB. For details on breakpoint syntax, see the breakpoint documentation.

action is a command that will be invoked when a breakpoint is hit.

A probe file can contain multiple probe entries, each one should be written on its own line. Each probe entry associates an UDB breakpoint with an action, available actions are:

  • log [,ARGS...]

    Evaluate and print ARGS.

  • printf FORMAT [,ARGS...]

    Evaluate and print ARGS using FORMAT as a printf-style format string.

  • backtrace

    Print a backtrace.

  • fail [MESSAGE]

    Print MESSAGE followed by a backtrace. postfailurelog will exit if this command is invoked by any breakpoint.

  • call FUNCTION [,ARGS...]

    Calls the function FUNCTION, optionally with a number of ARGS.

Examples

Assuming we have a simple program hello.c

1   #include <stdio.h>
2
3   int f(int x, double y)
4   {
5       printf("in function f! x: %d, y: %f\n", x, y);
6   }
7
8   int main(int argc, char *argv[])
9   {
10      int i;
11      for (i = 0; i < 5; i++) {
12          f(42, 3.14);
13      }
14      printf("Hello world\n");
15  }

We can use postfailurelog to debug a recording of the program’s execution.

$ gcc -O0 -g -o hello hello.c
$ live-record ./hello

A probe file must also be supplied to postfailurelog:

hello.c:12 => log i

This probe entry prints the value of the variable i when the program reaches line 12 of hello.c.

Running postfailurelog with this recording and probe file as arguments produces:

$ postfailurelog hello.undo example.probe
[22,425:0x5578242aa1b4  hello.c:12] i = 0
[42,875:0x5578242aa1b4  hello.c:12] i = 1
[43,496:0x5578242aa1b4  hello.c:12] i = 2
[44,117:0x5578242aa1b4  hello.c:12] i = 3
[44,738:0x5578242aa1b4  hello.c:12] i = 4

Each line has the format:

[BBCOUNT:PC BREAKPOINT] OUTPUT

where BBCOUNT is the bbcount, PC is the address of the program counter, BREAKPOINT is the breakpoint, and OUTPUT is the output of the probe entry. For more details on bbcount and program counter, see the time notation page.

Since a breakpoint in Post Failure Logging can be any valid UDB breakpoint, conditional breakpoints may also be used. For example:

hello.c:12 if i % 2 == 0 => log i

will print the value of i only for values of i which are even numbers.