Multi-Process Correlation for Shared Memory

Live Recorder can record accesses to shared memory from multiple processes to a single shared memory log file. This can be later used to track which process modified a region of shared memory.

First of all, set up your program to record itself using Live Recorder, as documented in using the Live Recorder API.

Enabling shared memory logging

Multi-Process Correlation for Shared Memory is currently used in the following way:

  • A process creates one or more shared maps.
  • It calls undolr_shmem_log_filename_set to set the file where shared memory accesses are going to be logged.
  • It forks some child processes which share the previously created shared memory maps.
  • All the processes call undolr_start to record themselves.
  • All the processes save their recordings by either calling undolr_save_on_termination or undolr_save.

At its simplest, this feature can be used (omitting all error checking for simplicity) as follows:

#include "undolr.h"

#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    undolr_shmem_log_filename_set("example.shmem");

    const int n_elements = 100;
    int *shmem_data = mmap(NULL,
                           sizeof(*shmem_data) * n_elements,
                           PROT_READ | PROT_WRITE,
                           MAP_ANONYMOUS | MAP_SHARED,
                           0, 0);

    pid_t pid = fork();
    if (pid == 0)
    {
            // Child.
            undolr_start(NULL);
            undolr_save_on_termination("child.undo");

            // Do some work and access shmem_data.
            shmem_data[0] = 42;
            shmem_data[1] = 123;
    }
    else
    {
            // Parent.
            undolr_start(NULL);
            undolr_save_on_termination("parent.undo");

            // Do some work and potentially access shmem_data.
            // ...

            // Wait for the child process.
            waitpid(pid, NULL, 0);

            // Do some more work.
            shmem_data[0] = 100;
    }

    return 0;
}

This example program will generate three files:

parent.undo
The recording containing the execution history for the main (parent) process.
child.undo
The recording containing the execution history for the child process.
example.shmem
Containing the details of the shared memory accesses made by both processes.

The recording files (which should have a .undo extension) can then be loaded into UndoDB.

Once a recording is loaded, the ublame command can be used to print the list of accesses to the specified address which lead to the current value in memory. For instance, after loading parent.undo and going at the end of the main function, it’s possible to do:

(udb) ublame &shmem_data[0]
    PID  Recording ID    Time in recording    Type    Address           Len
  -----  --------------  -------------------  ------  --------------  -----
    130  abcdef01-2345   354:0x55eeabe09804   write   0x7f894cb6c000      4
*   126  6789abcd-ef01   360:0x55eeabe09848   write   0x7f894cb6c000      4

This shows all accesses which lead shmem_data[0] to acquire its current value in the current process, ordered by access time to the memory.

The columns in the table printed by ublame are:

PID

The process identifier for the process which executed the access to shared memory.

The operating system may reuse a process identifier after the corresponding process terminated, so there’s a chance of two identical process identifiers for different recording IDs. Because of this, the PID column is provided only for convenience and the recording ID value should be used to uniquely identify processes and recordings.

Recording ID

A unique identifier for the recording file for the process which executed the access to shared memory.

Time in recording

The time in the history of the specified recording at which the access happened. For details on the format see the time notation documentation.

Note that times grow monotonically for each recording, but times for different recordings are unrelated. That is, each recording has its own time and times from different recordings cannot be directly compared.

Type

Whether the access was a read, write or read/write operation.

Read/write operations correspond to machine instructions which both read and write a value, for instance the cmpxchg instruction on x86 architectures used to implement atomic operations.

Address

The start address at which the access was executed.

This doesn’t necessarily match the argument passed to the ublame command as the same memory can be accessed through pointers to variables of different sizes. For instance, using ublame on a pointer to uint8_t could also report accesses made to more bytes using a pointer to an overlapping larger integer.

Len

The size of the access.

Accesses made by the process corresponding to the currently loaded recording are marked with an asterisk (*).

The ublame command only shows accesses leading to the current value of a certain memory area. This means that:

  • No accesses will be reported if the memory being blamed was never (or not yet) accessed by the current recording.
  • Accesses made at a later time in the recording are not shown.

Debugging an example program

A simple example program is provided in the examples/ directory of UndoDB. The source is in shmem_producer_consumer.c and executables are provided for the supported platforms.

  • The example program forks into two processes, a producer and a consumer.
  • The producer generates several numbers in the [0, 100) open range and writes them in an array in shared memory.
  • The consumer reads the values from shared memory and uses them.

The example program often succeeds, but it occasionally fails and aborts:

$ ./shmem_producer_consumer
Will save shared memory log to: ./log-133.shmem
Will save recording for producer to: ./producer-133.undo
Will save recording for consumer to: ./consumer-133.undo
shmem_producer_consumer: shmem_producer_consumer.c:99: consumer_run: Assertion `value >= 0 && value < 100' failed.
Child process failed.

The crash happened in the consumer, so we can load its recording in UndoDB:

$ udb consumer-133.undo
...

No more reverse-execution history.
0x00005618d2d15a65 in record_me (process_name=0x5618d2db1c0b "consumer") at shmem_producer_consumer.c:43
43      int e = undolr_start(NULL);

The current point in history is at the start of the recording, but the crash happens at the very end, so we can go to the end and check what happened:

(udb) ugo end
udb: Have reached end of recorded history.

Program received signal SIGTRAP, Trace/breakpoint trap.
udb: Have reached end of recorded history.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(udb) backtrace
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f61b01bd801 in __GI_abort () at abort.c:79
#2  0x00007f61b01ad39a in __assert_fail_base (fmt=0x7f61b03347d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5618d2db1c2e "value >= 0 && value < 100",
    file=file@entry=0x5618d2db1c14 "shmem_producer_consumer.c", line=line@entry=99, function=function@entry=0x5618d2db1d08 <__PRETTY_FUNCTION__.4403> "consumer_run") at assert.c:92
#3  0x00007f61b01ad412 in __GI___assert_fail (assertion=0x5618d2db1c2e "value >= 0 && value < 100", file=0x5618d2db1c14 "shmem_producer_consumer.c", line=99, function=0x5618d2db1d08 <__PRETTY_FUNCTION__.4403> "consumer_run")
    at assert.c:101
#4  0x00005618d2d15bef in consumer_run () at shmem_producer_consumer.c:99
#5  0x00005618d2d15da7 in main (argc=1, argv=0x7ffe6a2c91e8) at shmem_producer_consumer.c:176

We are deep inside the call stack of the assert function. To debug the problem, we need to go back to the example code. This can be done by using reverse-finish multiple times. Each time the command is used, execution is continued backwards until the calling point of the current function.

(udb) reverse-finish
79  abort.c: No such file or directory.
(udb) reverse-finish
92  assert.c: No such file or directory.
(udb) reverse-finish
101 in assert.c
(udb) reverse-finish
99          assert(value >= 0 && value < 100);

Now that we are back inside the consumer_run function, we can print the value of the variable which caused the assertion to fail:

(udb) print value
$1 = 100

The value is invalid, but what caused it? It’s possible to use a watchpoint on the value variable and continue execution backwards until the source of that value is found:

(udb) watch value
Hardware watchpoint 1: value
(udb) reverse-continue
Continuing.

Hardware watchpoint 1: value

Old value = 100
New value = 83
0x00005618d2d15bc1 in consumer_run () at shmem_producer_consumer.c:96
96          int value = shmem_data_array[index];

Unfortunately the bad value comes from shared memory and was generated by another process.

At this point we could load the recording for the producer and look for the time when the bad value was generated, but, in a less trivial program, it may be difficult to find the exact moment at which it happened.

With Multi-Process Correlation for Shared Memory, it’s possible to use the ublame command to find out when the bad value was generated:

(udb) ublame &shmem_data_array[index]
    PID  Recording ID    Time in recording     Type    Address           Len
  -----  --------------  --------------------  ------  --------------  -----
    133  292b1631-c83d   2,822:0x5618d2d15c61  write   0x7f61b0794118      4
*   137  97dd29dd-2c8f   3,040:0x5618d2d15bbf  read    0x7f61b0794118      4

From the output we can see that the current process (marked with an asterisk) read the value at time 3,040:0x5618d2d15bbf (relative to the currently loaded recording), but the value was originally written by another process, with PID 133 and recording ID 292b1631-c83d, at time 2,822:0x5618d2d15c61 (relative to that recording, not the current one).

We can switch to the other recording at the specified time to continue the investigation:

(udb) ugo inferior 292b1631-c83d 2,822:0x5618d2d15c61
udb: Loading a new recording may take a while. Are you sure? [y or n] y

...

0x00005618d2d15c61  124             shmem_data_array[index] = new_value;

Alternatively, it’s possible to load the other recording in a separate terminal using the command line printed by ublame --show-command-line.

Now we can use a watch point again to find the source of the wrong value:

(udb) print new_value
$2 = 100
(udb) watch new_value
Hardware watchpoint 2: new_value
(udb) reverse-continue
Continuing.

Hardware watchpoint 2: new_value

Old value = 100
New value = 95
0x00005618d2d15c47 in producer_run () at shmem_producer_consumer.c:123
123         int new_value = random_in_range(0, 100);

This is the source of the bug!

Looking at the random_in_range function we can see it accepts a closed range as arguments, so it will return values between 0 and 100 (both included), while the consumer only accepts values in the range 0 to 99.

APIs

Note

See the undolr.h header file for the most detailed information about available APIs.

  • undolr_shmem_log_filename_set - Set the path where to save the log for shared memory access. If this function is not called, shared memory access across processes is not saved. This needs to be called before recording starts (before undolr_start is called).
  • undolr_shmem_log_filename_get - Get the path where to save the log for shared memory access, that is the value previously set with undolr_shmem_log_filename_set.
  • undolr_shmem_log_size_set - Set the maximum size (in bytes) for the shared memory log. If not called, a suitable default will be picked. Once the maximum size is reached, old entries are discared and overwritten with new ones.
  • undolr_shmem_log_size_get - Get the current maximum size (in bytes) for the shared memory log.

Displaying the status of shared memory

UndoDB cannot know when a different, and potentially not recorded, process accesses shared memory, so its local understanding of the content of shared memory is lazily updated only when the recorded process accesses a region of memory, and only for that accessed area of memory. Due to this, printing at replay time (for instance using the print command) an area of shared memory which was not accessed earlier cannot report the correct value.

Similarly, accessing memory some time after the recorded process read it, may mean that another process has already overwritten that value, but UndoDB cannot know about it until the next time the recorded program accesses it.

To make this situation clearer, UndoDB prints extra information about variables in shared memory:

(udb) print &struct_in_shared_memory->value_we_never_read
$1 = <never accessed>
(udb) print &struct_in_shared_memory->value_we_read_a_while_ago
$2 = 42 <stale>
<never accessed>
At the current time in the recording, the memory the user is trying to print has not been accessed, so its value is not available.
… <stale>
The memory the user is trying to print was accessed at an earlier time, but, in the meantime, another process may have already overwritten it.

Current limitations

Currently, only memory mapped at the same address in all processes is supported. In practice, this means that all processes should be forked from the same initial process and that shared memory must be set up before calls to fork.