Recording an application

Requirements

Your application or test case must run under an OpenJDK-based or Oracle JRE on a supported Linux distribution.

  • JRE and its debug symbols installed:

    RHEL/CentOS/SuSE-supplied JRE:

    java-<N>-openjdk-headless package, where <N> is 1.8.0, 11 or 17.

    Debian/Ubuntu-supplied JRE:

    openjdk-<N>-jre-headless and openjdk-<N>-dbg packages, where <N> is 8, 11 or 17.

    Oracle, Azul Zulu, or other OpenJDK-based JRE:

    Debug symbols are included with the JRE download.

  • The ability to copy the Recording software into this environment.

  • If the environment is a container:
    You have access to the container build environment to modify the Java command-line arguments that launch the application.
    The container needs to have a VOLUME bind mount set up so the recording can be saved or copied out of the container.

Generating a recording

Unzip the file LR4J-Record-*.zip. This contains the LiveRecorder recording agent lr4j-record-1.0.so.

Edit the command-line used to launch Java (e.g. in the service/ant/mvn/gradle/Jenkins configuration) to add the following arguments to the start of the Java command-line (see note 1 at the end of this page):

-XX:-Inline -XX:TieredStopAtLevel=1 -XX:UseAVX=2 -agentpath:/path/to/lr4j-record-1.0.so=save_on=always

The record agent takes a number of parameters that may be used to define when the recording is made, the recording filename and so on. Alternatively, you can use the LiveRecorder API in your application to programmatically start/stop recording. The following parameters are recognized:

save_on=always

always saves a recording on exit.

save_on=failure

only saves a recording if the Java exit code is non-zero.

save_on=success

only saves a recording if the Java exit code is zero.

output=filename

specifies the name of the recording file to save to (recommended).

max_event_log_size=nnnn

specifies the maximum event log size (default 1G). The size may carry one of the suffixes k, K, m, M, g or G. See Configuring event log size for more details.

start_after_classload=classname

start recording when the given class is loaded. The name should be in canonical format e.g. java.lang.System. This option is useful if, for instance, startup is slow. See When to start recording.

min_classes_loaded=nnn

start recording once a given number of classes have been loaded. This can be useful to avoid the cost of recording startup. See When to start recording.

command_filename=filename

specifies the name of a file holding a command that is read and obeyed when a SIGQUIT signal is sent to the process, so that recording only starts on request. See Using a signal to start recording.

verbose

prints the name of each class as it is loaded, together with the running total, to stderr.

thread_mode=nnn

sets the thread fuzzing mode. See Thread Fuzzing.

save_callback_class=classname

specifies the name of a class to be notified when a recording is saved. See Recording save callbacks.

save_callback_jar=filename

specifies the name of the jar file containing save_callback_class. See Recording save callbacks.

command_port=nnn

specifies a port to listen on for commands. See REST API.

See below for advice specific to Microservices, or to Recording test failures.
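
As a rough illustration of how these parameters combine, the sketch below assembles the agent argument from a map of options. The class and method names (AgentArgs, build) are invented for this example and are not part of LiveRecorder; only the option names and the comma-separated key=value form are taken from the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of LiveRecorder: joins agent options into
// the -agentpath:<library>=<key>=<value>,<key>=<value> form shown above.
class AgentArgs {

    static String build(String agentLibrary, Map<String, String> options) {
        StringJoiner joined = new StringJoiner(",");
        for (Map.Entry<String, String> option : options.entrySet()) {
            String value = option.getValue();
            // An option mapped to null is emitted as a bare name.
            joined.add(value == null ? option.getKey() : option.getKey() + "=" + value);
        }
        String suffix = joined.length() == 0 ? "" : "=" + joined;
        return "-agentpath:" + agentLibrary + suffix;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("save_on", "failure");
        options.put("output", "/path/to/recording.undo");
        options.put("max_event_log_size", "2G");
        System.out.println(build("/path/to/lr4j-record-1.0.so", options));
    }
}
```

The resulting string still has to be preceded by the -XX options listed earlier when it is placed on the Java command-line.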

Recording filename

The recording file is saved out to disk at application exit. By default the recording file is saved in the application’s current working directory and named after the main Java class plus the current date/time. To set the filename explicitly choose one of the following:

  • Add an output parameter on the Java command-line. e.g.:

-agentpath:/path/to/lr4j-record-1.0.so=save_on=always,output=/path/to/recording.undo

This is useful when recording an application/service and you want the recording file saved to a specific directory.

  • In your application code set the Java system property io.undo.output at any point prior to application exit. e.g.:

System.setProperty("io.undo.output", "/path/to/recording.undo");

If both the output command-line parameter and the io.undo.output system property are specified, the command-line parameter takes precedence.

The output directory must already exist and be writable by the application’s UID/GID. If you don’t specify a fully qualified file name the recording file will be saved relative to the application’s current working directory.
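
The rules above can be checked before launching the application. The following is a minimal sketch, assuming you want such a pre-flight check in your own tooling; OutputPathCheck and its methods are hypothetical names, and the agent's own validation may differ.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check, not part of LiveRecorder.
class OutputPathCheck {

    // The command-line value takes precedence over the io.undo.output system
    // property; null means that the default name will be used.
    static String effectiveOutput(String commandLineOutput, String systemPropertyOutput) {
        return commandLineOutput != null ? commandLineOutput : systemPropertyOutput;
    }

    // Resolves a relative name against the working directory and checks that
    // the containing directory already exists and is writable.
    static boolean isUsable(String output, Path workingDirectory) {
        Path resolved = workingDirectory.resolve(output).toAbsolutePath().normalize();
        Path parent = resolved.getParent();
        return parent != null && Files.isDirectory(parent) && Files.isWritable(parent);
    }

    public static void main(String[] args) {
        String fromCommandLine = args.length > 0 ? args[0] : null;
        String output = effectiveOutput(fromCommandLine, System.getProperty("io.undo.output"));
        if (output == null) {
            System.out.println("No output name given; the default name would be used.");
            return;
        }
        Path workingDirectory = Paths.get("").toAbsolutePath();
        System.out.println(output + " usable: " + isUsable(output, workingDirectory));
    }
}
```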

Recording size

By default, LiveRecorder uses 1GB of system RAM for an “event log” while capturing the recording. You can increase this with the max_event_log_size parameter. For example, to set the “event log” to 2GB:

-agentpath:/path/to/lr4j-record-1.0.so=save_on=always,max_event_log_size=2G

The larger the “event log” size, the further back in time from the end of the recording that you will be able to replay.
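
If you compute the size in your own code, for example before calling UndoLR.setEventLogSize, a small converter for the suffix notation may help. This sketch assumes that the suffixes are binary multiples (K = 1024) and that the size is wanted in bytes; neither is stated on this page. EventLogSize is a hypothetical name.

```java
// Hypothetical helper, not part of LiveRecorder.
class EventLogSize {

    // Converts values such as "512M" or "2G" to a byte count.
    // Assumption: suffixes are binary multiples (K = 1024 bytes).
    static long toBytes(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        char last = Character.toUpperCase(trimmed.charAt(trimmed.length() - 1));
        long multiplier;
        switch (last) {
            case 'K': multiplier = 1L << 10; break;
            case 'M': multiplier = 1L << 20; break;
            case 'G': multiplier = 1L << 30; break;
            default:  multiplier = 1L; break; // no suffix: plain number
        }
        String digits = multiplier == 1L ? trimmed : trimmed.substring(0, trimmed.length() - 1);
        // multiplyExact throws ArithmeticException instead of overflowing silently.
        return Math.multiplyExact(Long.parseLong(digits), multiplier);
    }

    public static void main(String[] args) {
        System.out.println(toBytes("1G"));
        System.out.println(toBytes("2g"));
    }
}
```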

When to start recording

By default LiveRecorder starts recording as soon as the JVM starts. You can reduce your application startup time by telling LiveRecorder to delay starting recording until your application’s initialisation is complete. To assist with this there are three options:

  • verbose=true : Output the names of classes as they are loaded.
  • start_after_classload=classname : Only start recording once the named class has been loaded.
    The name should be a fully qualified class name such as org.springframework.boot.StartupInfoLogger.
  • min_classes_loaded=nnn : Only start recording once the given number of classes have been loaded.

Unless you already have a class name in mind, first run with verbose=true and wait until the application has completed initialisation. Pick a suitable class name from the output and supply it to start_after_classload in subsequent runs; LiveRecorder will then start recording only once that class is loaded. Alternatively, use min_classes_loaded to start recording once the given number of classes have been loaded. Be careful to choose a low enough value: a value that is too high may never be reached, in which case recording would not start.

Alternatively, use the LiveRecorder API in your application to give you full programmatic control over when to start recording or use a signal as described in Using a signal to start recording.
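
One way to get a starting point for min_classes_loaded is to ask the JVM how many classes it has loaded once initialisation is complete. The sketch below uses the standard ClassLoadingMXBean. It is an assumption that this figure is comparable to the running total the agent prints, so treat it as a rough guide and pick a lower value. LoadedClassCount is a hypothetical name.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of LiveRecorder.
class LoadedClassCount {

    // Number of classes currently loaded in this JVM, as reported by the
    // standard class-loading MXBean.
    static int current() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        // Call this at the point where initialisation is known to be complete.
        System.err.println("classes loaded after initialisation: " + current());
    }
}
```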

When to save a recording file

Adjust the save_on parameter to control the conditions under which a recording is generated:

  • save_on=always : Generate a recording at application exit.
    To force the application to exit and generate a recording use ^C, kill <application_PID> (note: not kill -9)
    or, if the application is running as a Systemd service, sudo systemctl stop <service>.
  • save_on=failure : Generate a recording at application exit only if the application/test exits with a non-zero status from System.exit().
    This is useful for recording failing tests within a CI system.
  • save_on=success : Generate a recording at application exit only if the application/test exits with a zero status from System.exit().

  • save_on omitted : Generate a recording under your application’s control using the LiveRecorder API.
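
The list above amounts to a small decision table. The following sketch models it for illustration only; it is not the agent's implementation, and SaveOnPolicy is a hypothetical name.

```java
// Illustrative model of the save_on settings, not the agent's own code.
class SaveOnPolicy {

    enum Mode { ALWAYS, FAILURE, SUCCESS, OMITTED }

    // Returns true if a recording would be generated at application exit.
    static boolean savesAtExit(Mode mode, int exitStatus) {
        switch (mode) {
            case ALWAYS:  return true;
            case FAILURE: return exitStatus != 0;
            case SUCCESS: return exitStatus == 0;
            default:      return false; // OMITTED: saving is left to the LiveRecorder API
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": exit 0 -> " + savesAtExit(mode, 0)
                    + ", exit 1 -> " + savesAtExit(mode, 1));
        }
    }
}
```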

The LiveRecorder API

The LiveRecorder API allows you to start and stop recording and save recording file(s) under your application’s control. This is useful if you want to record only specific parts of your application’s execution, or only those test cases that are failing. It is a Java wrapper around the C++ LiveRecorder API.

The API is provided by lr4j_api.jar and is supplied in LR4J-Record-*.zip. Include lr4j_api.jar in your project and add calls in your application code to the API to start recording, save out a recording file, and stop recording:

import io.undo.lr.UndoLR;
UndoLR.start();
// ... the code you want to record ...
UndoLR.save("/path/to/recording.undo");
UndoLR.stop();

The following calls are available:

UndoLR.start()

start recording the current process

UndoLR.save(String filename)

save the recording to the given filename

UndoLR.stop()

stop recording. After this another call to UndoLR.start may be made if required

UndoLR.saveOnTermination(String filename)

save the recording to the given filename when Java exits. May be used as an alternative to UndoLR.save

UndoLR.setEventLogSize(long size)

sets the maximum event log size to size

UndoLR.setThreadMode(int mode)

sets the thread fuzzing mode. See Thread Fuzzing.

The Java command line must include the following arguments:

-XX:-Inline -XX:TieredStopAtLevel=1 -XX:UseAVX=2 -agentpath:/path/to/lr4j-record-1.0.so

Refer to the Hands-on Undo GitHub project for an example of using the API in a Jakarta EE based microservice.

Refer to Recording test failures for examples of using the API in tests.
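
If the same build must also run in environments where lr4j_api.jar is not on the classpath, one option is to look up the API reflectively. The sketch below is an assumption about how you might structure this, not something supplied with LiveRecorder. It uses only the class name io.undo.lr.UndoLR and the start(), save(String) and stop() methods listed above; OptionalRecorder is a hypothetical name.

```java
import java.lang.reflect.Method;

// Hypothetical wrapper, not part of LiveRecorder. It calls io.undo.lr.UndoLR
// reflectively so the application still runs when lr4j_api.jar is absent.
class OptionalRecorder {

    private static final String API_CLASS = "io.undo.lr.UndoLR";

    static boolean isAvailable() {
        try {
            Class.forName(API_CLASS);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Invokes a static UndoLR method; returns false if the API is missing
    // or the call fails.
    private static boolean call(String name, Class<?>[] types, Object[] values) {
        try {
            Method method = Class.forName(API_CLASS).getMethod(name, types);
            method.invoke(null, values);
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            return false;
        }
    }

    static boolean start() {
        return call("start", new Class<?>[0], new Object[0]);
    }

    static boolean save(String filename) {
        return call("save", new Class<?>[] {String.class}, new Object[] {filename});
    }

    static boolean stop() {
        return call("stop", new Class<?>[0], new Object[0]);
    }

    public static void main(String[] args) {
        boolean recording = start();
        try {
            System.out.println("work to be recorded goes here");
        } finally {
            if (recording) {
                save("/path/to/recording.undo");
                stop();
            }
        }
    }
}
```

The try/finally arrangement means the recording is saved even if the recorded code throws.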

Using a signal to start recording

For cases where using The LiveRecorder API is not convenient — for instance if you do not wish to modify the code or if you only want to start recording once it has been observed that the application is misbehaving — there is a way to start and stop recording by sending a signal to the Java process.

Start the Java program using the following option:

-agentpath:/path/to/lr4j-record-1.0.so=command_filename=filename

where filename is the name of a file containing a one-shot command. This file is read when a SIGQUIT signal is sent to the process and the command is then obeyed. To send a second command, the file should be overwritten with the new command before sending another SIGQUIT. The following commands are recognised:

START

starts recording (see also UndoLR.start in The LiveRecorder API)

STOP

stops recording

SAVE_AND_STOP recording_file

first saves the recording to recording_file and then stops recording

So, for instance, to start and save a recording for a Java process with a pid of 63056 you could issue the following sequence of commands from the Linux shell (assuming we started with command_filename=/tmp/undo_command):

echo "START" > /tmp/undo_command
kill -3 63056
# wait a while
echo "SAVE_AND_STOP /tmp/recording.undo" > /tmp/undo_command
kill -3 63056

The file /tmp/recording.undo will then contain a recording of everything that happened between the two commands.

Note that SIGQUIT has the side-effect of also outputting a thread dump to stdout but this can be ignored.
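
The same sequence can be driven from Java, for example from a monitoring tool. This is a minimal sketch that mirrors the shell commands above by overwriting the command file and then running kill -3. SignalCommand is a hypothetical name, and the ten-second pause is arbitrary.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of LiveRecorder.
class SignalCommand {

    // Overwrites the command file with a single command line.
    static void write(Path commandFile, String command) throws IOException {
        Files.write(commandFile, (command + "\n").getBytes(StandardCharsets.UTF_8));
    }

    // Sends SIGQUIT by running "kill -3 <pid>", as in the shell example.
    static int sendSigquit(long pid) throws IOException, InterruptedException {
        Process kill = new ProcessBuilder("kill", "-3", Long.toString(pid)).inheritIO().start();
        return kill.waitFor();
    }

    // Usage: java SignalCommand <pid> <command_filename>
    public static void main(String[] args) throws Exception {
        long pid = Long.parseLong(args[0]);
        Path commandFile = Paths.get(args[1]);
        write(commandFile, "START");
        sendSigquit(pid);
        Thread.sleep(10_000); // let the application run while it is recorded
        write(commandFile, "SAVE_AND_STOP /tmp/recording.undo");
        sendSigquit(pid);
    }
}
```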

REST API

The agent startup option command_port=nnnn specifies a port on which a background agent thread listens for REST API commands. These commands can start recording, and stop and save a recording; they are an alternative to using the LiveRecorder API or signals.

The following commands are supported:

undolr_start

starts recording.

undolr_save/filename

stops and saves a recording to the given filename.

So if, for instance, you start Java with the following option:

-agentpath:/path/to/lr4j-record-1.0.so=command_port=9000

then to start recording, you could do:

curl http://localhost:9000/undolr_start

and to stop and save the recording to /tmp/recording.undo:

curl http://localhost:9000/undolr_save//tmp/recording.undo

(If a single / is used, the recording will be saved to a location relative to the process’s current working directory.)
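
The curl commands above translate directly to the JDK's built-in HTTP client (Java 11 or later). The sketch below assumes the endpoints accept a plain GET, which is what curl sends by default, and that the filename contains no characters that need URL encoding. The response status and body are not documented on this page, so the code only reports the status code. RecorderRestClient is a hypothetical name.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client, not part of LiveRecorder.
class RecorderRestClient {

    static URI startUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/undolr_start");
    }

    // An absolute filename yields a double slash after undolr_save,
    // as in the curl example.
    static URI saveUri(String host, int port, String filename) {
        return URI.create("http://" + host + ":" + port + "/undolr_save/" + filename);
    }

    // Issues a GET request and returns the HTTP status code.
    static int get(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("start: " + get(startUri("localhost", 9000)));
        Thread.sleep(10_000); // let the application run while it is recorded
        System.out.println("save: " + get(saveUri("localhost", 9000, "/tmp/recording.undo")));
    }
}
```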

Recording save callbacks

When not using the LiveRecorder API, it can be useful to be notified when a recording has been saved so that further action may be taken, for instance to upload the recording from a Docker container to an S3 bucket. The save may happen when the Java program being recorded exits, or when an external stimulus is received to save the recording (for example via the REST API). You may, for instance, want to upload the recording when Java exits but before the container is destroyed.

To do this, use the save_callback_class option as follows:

-agentpath:/path/to/lr4j-record-1.0.so=save_on=always,save_callback_class=classname

This class must contain a static method recordingSaved(String filename), where filename is the name of the recording which has just been saved. If the class is not in the system classpath (which is all the agent has access to) you may also pass the name of the jar file containing the class, using the save_callback_jar option. Here is a complete example of such a class:

package io.undo.examples;

import java.io.File;
import software.amazon.awssdk.auth.credentials.SystemPropertyCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

/* This is an example of a class that post-processes a saved recording to upload it to an S3 bucket */
public class UploadToS3Bucket {

    public static void recordingSaved(String filename) {
        try {
            String bucketName = System.getProperty("aws.bucketName");
            if (bucketName == null) {
                System.err.println("Property 'aws.bucketName' not set");
                return;
            }
            S3Client s3Client =
                    S3Client.builder()
                            .region(Region.EU_WEST_2)
                            .credentialsProvider(SystemPropertyCredentialsProvider.create())
                            .build();
            PutObjectRequest request =
                    PutObjectRequest.builder().bucket(bucketName).key(filename).build();
            File file = new File(filename);
            s3Client.putObject(request, file.toPath());
            System.out.println("Recording saved to " + filename);
        } catch (Exception e) {
            System.err.println("UploadToS3Bucket: caught exception " + e);
            e.printStackTrace();
        }
    }
}

To run this example you must also set the system properties aws.accessKeyId, aws.secretAccessKey and aws.bucketName.
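
For trying out the callback mechanism without any AWS dependencies, a callback that only copies the recording to another directory may be enough. This is a minimal sketch. The class name ArchiveRecording and the system property example.archiveDir are invented for the example; only the recordingSaved(String filename) signature comes from the text above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/* Example callback that copies a saved recording into an archive directory.
   The property name example.archiveDir is hypothetical. */
public class ArchiveRecording {

    public static void recordingSaved(String filename) {
        String archiveDir = System.getProperty("example.archiveDir");
        if (archiveDir == null) {
            System.err.println("Property 'example.archiveDir' not set");
            return;
        }
        try {
            Path source = Paths.get(filename);
            // Create the archive directory if it does not exist yet.
            Path targetDir = Files.createDirectories(Paths.get(archiveDir));
            Path target = targetDir.resolve(source.getFileName());
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            System.out.println("Recording copied to " + target);
        } catch (IOException e) {
            System.err.println("ArchiveRecording: caught exception " + e);
        }
    }
}
```

To use it, pass save_callback_class=ArchiveRecording to the agent and set the property with -Dexample.archiveDir=... on the Java command-line.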

Thread Fuzzing

You can set the thread-fuzzing mode either at LiveRecorder startup, by passing the thread_mode=nnn option to the agent, or at run time, by calling UndoLR.setThreadMode from the LiveRecorder API. This can be useful when trying to reproduce concurrency issues. See Thread Fuzzing for more details. The possible values are as follows:

typedef enum
{
    undolr_thread_mode_NORMAL = 0,
    undolr_thread_mode_IN_BB = 1 << 0,
    undolr_thread_mode_STARVE = 1 << 1,
    undolr_thread_mode_RANDOM = 1 << 2,
    undolr_thread_mode_SYNC = 1 << 3,
    undolr_thread_mode_THREAD_FUZZING_DEFAULT = undolr_thread_mode_RANDOM | undolr_thread_mode_STARVE | undolr_thread_mode_IN_BB,
} undolr_thread_mode_t;
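
Since UndoLR.setThreadMode takes an int, it can help to mirror the enum as Java constants. The sketch below copies the bit values from the C definition above. It is an assumption that thread_mode=nnn and setThreadMode accept the combined value as a plain decimal integer. ThreadModes is a hypothetical name.

```java
// Hypothetical constants mirroring undolr_thread_mode_t; not part of lr4j_api.jar.
class ThreadModes {

    static final int NORMAL = 0;
    static final int IN_BB = 1 << 0;
    static final int STARVE = 1 << 1;
    static final int RANDOM = 1 << 2;
    static final int SYNC = 1 << 3;
    static final int THREAD_FUZZING_DEFAULT = RANDOM | STARVE | IN_BB;

    public static void main(String[] args) {
        // Prints the combined default value in the form used by the agent option.
        System.out.println("thread_mode=" + THREAD_FUZZING_DEFAULT);
    }
}
```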

Recording IntelliJ

For IntelliJ Plugin developers on Linux, we have added a convenient way to start recording IntelliJ itself. Navigate to Tools › Recording Actions › Start Recording to start recording IntelliJ and Tools › Recording Actions › Save Recording to stop and save the recording.

(Note that IntelliJ will run more slowly while it is being recorded. If it is more than about ten times slower, check the contents of /tmp/undo_debug.<pid>.log for messages containing the word unoptimized and, if any are found, please contact Undo Support.)

Log files

There are two (usually small) log files created during recording:

/tmp/undo_agent.<pid>.log

generated by the agent itself, this contains details about recording start and stop and any errors found.

/tmp/undo_debug.<pid>.log

contains messages from the Undo core engine, only of interest to Undo support if something went wrong.
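
When checking these logs repeatedly, for example for the word unoptimized, a small helper can build the file names from a pid and filter the lines. This is a sketch under the assumption that the logs are plain text readable as UTF-8; UndoLogs is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of LiveRecorder.
class UndoLogs {

    static Path agentLog(long pid) {
        return Paths.get("/tmp/undo_agent." + pid + ".log");
    }

    static Path debugLog(long pid) {
        return Paths.get("/tmp/undo_debug." + pid + ".log");
    }

    // Returns every line of the log that contains the given word.
    static List<String> linesContaining(Path log, String word) throws IOException {
        try (Stream<String> lines = Files.lines(log)) {
            return lines.filter(line -> line.contains(word)).collect(Collectors.toList());
        }
    }

    // Usage: java UndoLogs <pid>
    public static void main(String[] args) throws IOException {
        long pid = Long.parseLong(args[0]);
        for (String line : linesContaining(debugLog(pid), "unoptimized")) {
            System.out.println(line);
        }
    }
}
```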

1

If it is not possible to modify the Java command-line, an alternative is to use the JAVA_TOOL_OPTIONS environment variable, which the JVM picks up at startup, e.g.:

JAVA_TOOL_OPTIONS="-XX:-Inline -XX:TieredStopAtLevel=1 -XX:UseAVX=2 -agentpath:/path/to/lr4j-record-1.0.so=save_on=always"
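
When the JVM is launched from another Java program, the same variable can be set on a ProcessBuilder. The sketch below is illustrative only: ToolOptionsLauncher is a hypothetical name, and the command to launch is taken from the program arguments.

```java
// Hypothetical launcher, not part of LiveRecorder.
class ToolOptionsLauncher {

    // Adds JAVA_TOOL_OPTIONS, including the recording agent, to the child
    // process environment.
    static ProcessBuilder withAgent(ProcessBuilder builder, String agentArgument) {
        builder.environment().put("JAVA_TOOL_OPTIONS",
                "-XX:-Inline -XX:TieredStopAtLevel=1 -XX:UseAVX=2 " + agentArgument);
        return builder;
    }

    // Usage: java ToolOptionsLauncher <command> [arguments...]
    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ToolOptionsLauncher <command> [arguments...]");
            return;
        }
        ProcessBuilder builder = withAgent(new ProcessBuilder(args).inheritIO(),
                "-agentpath:/path/to/lr4j-record-1.0.so=save_on=always");
        int status = builder.start().waitFor();
        System.out.println("child exited with status " + status);
    }
}
```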