Log Jump

The Log Jump feature bridges the gap between log files and LiveRecorder recordings by matching a log entry with a time in the execution history of a program. This matching can be done in two different ways:

Using wall-clock times

The ugo wallclock command in UDB jumps to the time in execution history corresponding to a wall-clock time.

The info wallclock command displays the wall-clock time corresponding to the current time in execution history.

The info wallclock-extent command displays the range of wall-clock times covered by execution history.

ugo wallclock wall-clock-time

Jump to a time in execution history near wall-clock-time.

For example, suppose that a program was recorded by live-record and its log contains a failure:

[2022-01-27 15:51:13.388566] pass x=5.6919e-07 sinf=0.000001 sine=0.000001
[2022-01-27 15:51:13.392085] pass x=2.0994 sinf=0.863510 sine=0.863510
[2022-01-27 15:51:13.392128] pass x=0.456745 sinf=0.441029 sine=0.441029
[2022-01-27 15:51:13.392154] pass x=0.767667 sinf=0.694458 sine=0.694458
[2022-01-27 15:51:13.392159] pass x=1.60536e-06 sinf=0.000002 sine=0.000002
[2022-01-27 15:51:13.392163] pass x=4.20805e-08 sinf=0.000000 sine=0.000000
[2022-01-27 15:51:13.392169] pass x=3.8617e-10 sinf=0.000000 sine=0.000000
[2022-01-27 15:51:13.392270] fail x=2.08741e+09 sinf=-0.247028 sine=-0.247028
[2022-01-27 15:51:13.392492] pass x=834758 sinf=-0.898864 sine=-0.898864
[2022-01-27 15:51:13.392544] pass x=422311 sinf=-0.865702 sine=-0.865702

To investigate the failure, load the recording (see Loading and saving recordings) and jump to the wall-clock time for the log entry:

$ udb
not running> uload recording.undo

The debugged program is at the beginning of recorded history. Start debugging
from here or, to proceed towards the end, use:
    continue - to replay from the beginning
    ugo end  - to jump straight to the end of history

No more reverse-execution history.
0x000055f8d20ab130 in _start ()
start 1> ugo wallclock 2022-01-27 15:51:13.392270
Going to time 6,205.
0x00007ffc5abb6850 in gettimeofday ()
76% 6,205> backtrace
#0  0x00007ffc5abb6850 in gettimeofday ()
#1  0x000055f8d20ab286 in log_message (format=0x55f8d20ac085 "%s x=%g sinf=%f sine=%f") at sine.c:21
#2  0x000055f8d20ab694 in check (x=2.08740941e+09) at sine.c:80
#3  0x000055f8d20ab83d in main (argc=3, argv=0x7ffc5ab64688) at sine.c:112

The ugo wallclock command jumps to the time in execution history where the gettimeofday() function was called. The backtrace shows that this time is, as expected, inside the call to the logging function: to go back to the code which called log_message(), use the reverse-finish command:

76% 6,205> reverse-finish
0x000055f8d20ab281 in log_message (format=0x55f8d20ac085 "%s x=%g sinf=%f sine=%f") at sine.c:21
21          ssize_t e = gettimeofday(&tv, NULL);
76% 6,203> reverse-finish
0x000055f8d20ab68f in check (x=2.08740941e+09) at sine.c:80
80          log_message("%s x=%g sinf=%f sine=%f", s1 == s2 ? "pass" : "fail", x, s1, s2);
76% 6,202> print x
$1 = 2.08740941e+09

The ugo wallclock command supports many commonly used date and time formats with and without timezones, including the standard ISO 8601 or RFC 3339 format (2021-03-14T21:08:58Z). The handling of ambiguous formats is undefined.

This command is affected by the Log Jump limitations.

info wallclock

The interpolated wall-clock time for the current time in execution history.

For example:

76% 6,202> info wallclock
2022-01-27T15:51:13.392269Z +/- 101 microseconds
76% 6,202> ugo wallclock 2022-01-27 15:51:13.392270
Going to time 6,205.
0x00007ffc5abb6850 in gettimeofday ()
76% 6,205> info wallclock
2022-01-27T15:51:13.392270Z +/- 0 microseconds

This command is affected by the Log Jump limitations, which are expressed in the form of a confidence interval after the wall-clock time. In the above example, the program is at a call to gettimeofday() and so the wall-clock time is known exactly and the confidence interval is +/- 0 microseconds.

But after moving to a nearby point in execution history, for example by using the finish command, the wall-clock time is only known approximately, and so is reported with a wider confidence interval:

76% 6,205> finish
Run till exit from #0  0x00007ffc5abb6850 in gettimeofday ()
log_message (format=0x55f8d20ac085 "%s x=%g sinf=%f sine=%f") at sine.c:21
21          ssize_t e = gettimeofday(&tv, NULL);
76% 6,207> info wallclock
2022-01-27T15:51:13.392271Z +/- 222 microseconds

info wallclock-extent

The wall-clock times at the start and end of execution history.

The times are printed in the default timezone. See the set timezone command. For example:

76% 6,207> info wallclock-extent
Start time: 2022-01-27T15:51:13Z
End time:   2022-01-27T15:51:13Z

Timezones

Log Jump understands wall-clock times with explicit timezones, for example, a wall-clock time ending with +01:00 refers to Central European Time (one hour ahead of UTC), while one ending with -06:00 refers to Central Standard Time (six hours behind UTC).

If no timezone information is specified, the time is interpreted using the default timezone, which is UTC (Coordinated Universal Time) by default.

You can change the default timezone using the set timezone command.

set timezone timezone

Set the default timezone for wall-clock times.

The default timezone is used to interpret the argument to the ugo wallclock command if it lacks a timezone, and to format the output of the info wallclock and info wallclock-extent commands.

The default timezone is UTC by default. If your logs contain timestamps in local time but without an explicit timezone, use this command to set an more convenient default timezone.

Specify timezone using the format “Area/Location” employed by the IANA timezone database. See Wikipedia’s List of tz database time zones for a full list of timezone names.

For example:

76% 6,207> info wallclock
2022-01-27T15:51:13.392271Z +/- 222 microseconds
76% 6,207> set timezone Asia/Tokyo
Default timezone set to Asia/Tokyo.
76% 6,207> info wallclock
2022-01-28T00:51:13.392271+09:00 +/- 222 microseconds

After setting the default timezone, all times without an explicit timezone are interpreted as if belonging to the default timezone. For example:

76% 6,207> ugo wallclock 2022-01-28T00:51:13.388566
Going to time 670.
0x00007ffc5abb6850 in gettimeofday ()
8% 670> info wallclock
2022-01-28T00:51:13.388566+09:00 +/- 0 microseconds

The default timezone can be displayed with the show timezone command, and the default timezone can be reset to UTC with the unset timezone command.

show timezone

The default timezone for wall-clock times.

For example:

8% 670> show timezone
Default wall-clock timezone is Asia/Tokyo.

unset timezone

Reset the default timezone for wall-clock times to UTC.

Log Jump limitations

  • Log Jump correlates wall-clock time to execution history based on the presence of calls to gettimeofday() or clock_gettime() made by the program being recorded. If the program makes no calls to these functions, the Log Jump feature is not available.

    Note

    clock_gettime() is supported only when used with clk_id set to CLOCK_REALTIME. This is because a clk_id of CLOCK_REALTIME_COARSE generates low-precision wall-clock times that cannot be combined with high-precision wall-clock times, and other clk_id values such as CLOCK_MONOTONIC produce times which are not related to the wall-clock time.

    This means that an exact match between wall-clock time and execution history can be obtained only at the exact point when one of the supported functions is called. For all other points, the wall-clock time is estimated by linearly interpolating the results from nearby calls to the supported functions. This means that if calls to the supported functions are few and far between, the confidence interval can be large.

    Warning

    The interpolation algorithm assumes monotonicity of wall-clock times, meaning that if the system time is adjusted during the execution of the program, this can produce unreliable results.

Using UDB time notation

Log Jump works with any recorded program which uses the supported wall-clock time functions. However, if it is possible to modify the program, a more powerful option is to include timestamps in UDB’s time notation in log entries.

Adding timestamps in UDB’s time notation to log entries

This functionality is provided by the undoex_get_bbcount() function declared in undoex/undoex-engine-status.h in the libundoex library. The time in execution history is returned as an unsigned 64-bit integer representing the bbcount.

undoex_get_bbcount() can be used as follows:

#include <errno.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

#include "undoex-engine-status.h"

static void
print_current_bbcount(void)
{
    uint64_t bbcount = undoex_get_bbcount();
    if (bbcount > 0)
    {
        printf("The current time in execution history is: %" PRIu64 "\n", bbcount);
    }
    else if (errno == ENOTSUP)
    {
        printf("Not currently recording\n");
    }
    else
    {
        perror("Couldn't access the time in execution history");
    }
}

int
main(void)
{
    print_current_bbcount();
    return EXIT_SUCCESS;
}

The functions in the libundoex library work when the program is recorded under UDB and when recorded through the LiveRecorder API. If called when the program is not being recorded, undoex_get_bbcount() returns 0 (which is not a valid bbcount) and sets errno to ENOTSUP.

You can find the libundoex library and header files in the undoex subdirectory of the UDB release. The library is available as both a statically linked library (libundoex_pic_<arch>.a) and a dynamically linked library (libundoex_<arch>.so). There are no other dependencies, so the compilation command will be similar to this:

$ gcc -ggdb -Iundoex undoex/examples/simple_ex.c undoex/libundoex_pic_x64.a -o simple_ex

Using the bbcount

The bbcount returned by undoex_get_bbcount() can be used in UDB, after loading the corresponding LiveRecorder recording file, to jump to the bbcount in execution history when the function was called.

When debugging the code above, the output looks like this:

$ udb ./simple_ex
Reading symbols from ./simple_ex...
not running> run
Starting program: simple_ex
The current time in execution history is: 12585

Program received signal SIGSTOP, Stopped (signal).

The program has exited, but is still being debugged.
You can use UDB reverse commands to go backwards; see "help udb" for details.

0x00007ffff7eb2699 in _exit () from /lib/x86_64-linux-gnu/libc.so.6

Now you can use the ugo time command to jump to the reported bbcount. UDB automatically reverses execution from undoex_get_bbcount() to the caller, so that it ends up in your code and not inside the undoex_get_bbcount() implementation. For example:

end 13,172> ugo time 12585
0x00007ffff7edba90 in ioctl () from /lib/x86_64-linux-gnu/libc.so.6

Moving out from libundoex into user code.
You can disable this behaviour by using the "-exact" option.

print_current_bbcount () at undoex/examples/simple_ex.c:11
11          uint64_t bbcount = undoex_get_bbcount();
95% 12,522> backtrace
#0  print_current_bbcount () at undoex/examples/simple_ex.c:11
#1  0x0000555555563766 in main () at undoex/examples/simple_ex.c:29