Usage statistics collection

UDB collects some usage statistics to protect Undo from unlicensed or unlawful use of our software and services, plus some additional usage data to help us improve our products. This does not include any data about or from the programs being debugged.

By default, the additional usage data is anonymized. However, we encourage you to opt in to sharing full, personally-identifying usage statistics; this helps us improve both our products and our customer support.

For more details, see Undo’s privacy policy. To change which information is shared with Undo, use the set share-usage-statistics command.

Implementation details

This page explains how our usage statistics collection works and what is collected.

While running, UDB collects information about the features used and saves it into a JSON file in $XDG_DATA_HOME/undo/udb/telemetry/v2/, where $XDG_DATA_HOME defaults to ~/.local/share/ if unset. The filename consists of a random UUID followed by the .json extension. The JSON files are formatted in a human-readable way so that they are easy to inspect.
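For readers who want to inspect their own files, the location described above can be resolved with a few lines of Python. This is a minimal sketch rather than code taken from UDB: the helper names telemetry_dir and load_pending are made up for illustration, and treating an empty $XDG_DATA_HOME like an unset one is an assumption borrowed from the XDG Base Directory convention.

```python
import json
import os
from pathlib import Path
from typing import Dict, Mapping


def telemetry_dir(environ: Mapping[str, str] = os.environ) -> Path:
    """Return the directory where this page says UDB writes its JSON files."""
    # An unset XDG_DATA_HOME falls back to ~/.local/share. Treating an empty
    # value the same way is an assumption (XDG Base Directory convention).
    base = environ.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(base) / "undo" / "udb" / "telemetry" / "v2"


def load_pending(directory: Path) -> Dict[str, dict]:
    """Map each <uuid>.json file name to its parsed contents."""
    if not directory.is_dir():
        return {}
    return {
        path.name: json.loads(path.read_text())
        for path in sorted(directory.glob("*.json"))
    }
```

Calling load_pending(telemetry_dir()) returns an empty dictionary when no usage statistics files are currently stored on disk.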

When UDB quits, the JSON file is submitted to our api.undo.io server. If delivery fails (for instance, because the user is not connected to the Internet) the file is cached on disk and is submitted the next time UDB is started.
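The retry behaviour can be pictured as a loop over whatever files are still on disk. The sketch below is illustrative only and is not UDB's implementation: flush_pending and the send callback are hypothetical names, the real submission request is not described on this page, and removing a file once it has been delivered is an assumption.

```python
from pathlib import Path
from typing import Callable, List


def flush_pending(directory: Path, send: Callable[[bytes], bool]) -> List[str]:
    """Try to submit every JSON file in directory; return the names still pending.

    send is a hypothetical callback that returns True when delivery succeeded.
    """
    still_pending = []
    for path in sorted(directory.glob("*.json")):
        try:
            delivered = send(path.read_bytes())
        except OSError:
            # For example, the machine is not connected to the Internet.
            delivered = False
        if delivered:
            path.unlink()  # Assumption: delivered files are removed from disk.
        else:
            still_pending.append(path.name)  # Left on disk for the next start.
    return still_pending
```

Running the same function at startup and at exit is enough to get the "submit now, otherwise next time" behaviour described above.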

Once the server receives the usage statistics JSON file, it splits the personally-identifiable data and the additional usage data into separate collections (i.e. tables). With the default setting of share-usage-statistics, the two sets of information cannot be linked and are fully independent, which anonymizes the additional usage data. With explicit user permission (that is, when the user agrees to share full usage statistics in the prompt shown by UDB, or uses set share-usage-statistics on), the telemetry_id, license and username fields (described below) are used to create a relationship between the two sets of data.
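The split can be illustrated with a small function. This is a hypothetical sketch of the idea, not Undo's server code: split_submission and the share_full flag are invented for illustration, and the keys used are those of the licensing and to_be_anonymized objects in the JSON format description on this page.

```python
from typing import Optional, Tuple


def split_submission(doc: dict, share_full: bool) -> Tuple[dict, Optional[dict], Optional[dict]]:
    """Split one submitted document into independent records.

    Returns (licensing_row, usage_row, link). link is None unless the user
    agreed to share full usage statistics.
    """
    licensing_row = dict(doc["licensing"])
    usage = doc.get("to_be_anonymized")  # null when only licensing data is shared
    usage_row = dict(usage) if usage is not None else None

    link = None
    if share_full and usage_row is not None:
        # Only with explicit permission is a relationship recorded.
        link = {
            "telemetry_id": usage_row["telemetry_id"],
            "license": licensing_row["license"],
            "username": licensing_row["username"],
        }
    return licensing_row, usage_row, link
```

With share_full set to False, nothing in the returned usage record points back at the licensing record, which is what keeps the two collections independent.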

Changed in version 7.0: In previous versions of UDB, usage statistics collection was implemented in a different way, without support for anonymized data. The current system is meant to provide Undo with more data while improving on users’ privacy.

JSON format description

Each bullet point corresponds to a key in the JSON dictionary submitted to our API server. Indented bullet points represent nested values.

  • comment (string)

    A string referring to this page for users who accidentally discover the usage statistics JSON files and open them.

  • session_id (string)

    Unique random identifier (UUID) for this UDB session which is used to prevent multiple accidental submissions of the same file.

    This identifier is also used as part of the file name where the JSON is saved on disk before submission.

  • licensing (object)

    License enforcement data.

    This is always included and contains personally-identifiable information which is used to protect Undo from unlicensed use of our software.

    This object contains the following keys:

    • license (string)

      The UID of the license.

      Example:

      "e22bbc6c7ad0e26adb07496f60de4a603e4781d44d6c3ca0d63b96ac"
      
    • username (string or null)

      For licenses configured to use a keyserver, the username. Otherwise, null.

      Example:

      "john_smith"
      
    • keyserver_id (string or null)

      For licenses configured to use a keyserver, the identifier used for communications with the keyserver. Otherwise, null.

    • udb_version (string)

      The version of UDB.

      This information is also included in the additional usage data section (the to_be_anonymized key).

      Example:

      "7.2.1"
      
    • is_redistributable_udb (boolean)

      Whether these usage statistics were generated by Redistributable UDB, a UDB variant with limited features that can be shipped together with customers’ applications.

      This information is also included in the additional usage data section (the to_be_anonymized key).

    • start_time (date and time as string)

      The UTC wall-clock time at which the UDB session started.

      Example:

      "2023-08-10T16:05:35.123456"
      
    • end_time (date and time as string)

      The UTC wall-clock time at which the UDB session ended.

      Example:

      "2023-08-10T18:12:01.654321"
      
    • license_accepted (boolean or null)

      Whether the user has accepted the license, or null if the user could not be asked (for instance, because they are using an IDE that doesn’t support this feature).

    • used_licensable_features (object)

      Which licensable features were used by this UDB session.

      This information is also included in the additional usage data section (the to_be_anonymized key).

      This object contains the following keys:

      • started_process (boolean)

        Whether a live process was started by UDB.

      • attached_to_process (boolean)

        Whether UDB attached to a running process.

      • loaded_core (boolean)

        Whether UDB loaded a core file.

      • loaded_recording (boolean)

        Whether UDB loaded a LiveRecorder recording.

      • saved_recording (boolean)

        Whether UDB saved a process’s execution history to a LiveRecorder recording.

      • remote_debugging (boolean)

        Whether UDB used a remote server for debugging.

    • used_archs (object)

      The CPU architectures of the programs debugged or LiveRecorder recordings loaded during this UDB session.

      This information is also included in the additional usage data section (the to_be_anonymized key).

      This object contains the following keys:

      • x64 (boolean)

        Whether any of the debugged programs or loaded LiveRecorder recordings are Linux AMD/Intel x86-64 processes.

      • x32 (boolean)

        Whether any of the debugged programs or loaded LiveRecorder recordings are Linux Intel i386 processes.

        Note that this is unrelated to the rarely used x32 ABI which the Undo Engine doesn’t support.

      • arm64 (boolean)

        Whether any of the debugged programs or loaded LiveRecorder recordings are Linux ARMv8 AArch64 processes.

    • tool (string)

      The tool used for this session.

      This is "udb_plain" for direct uses of UDB but, for instance, it’s "postfailurelogging" if UDB is used via the postfailurelog tool.

      This is similar to the ui field of the to_be_anonymized object, but it is coarser because it is only used to verify that users use tools in accordance with the terms of their licenses.

    • auto_quit (string)

      Auto-quit configuration applied to this session.

      New in version 7.1.0.

    • auto_quit_attempts (list of objects)

      Record of occasions where the auto-quit timeout was reached.

      New in version 7.1.0.

      • time (date and time as string)

        The UTC wall-clock time at which the session either quit or was cancelled.

      • result (string)

        The result of triggering auto-quit.

    • auto_quit_longest_inactive_time (elapsed seconds as floating point number or null)

      The longest inactive time during this session, or null if auto-quit is not enabled.

      New in version 7.2.0.
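The start_time and end_time strings above carry no timezone suffix, so a script reading them has to attach UTC itself. The following minimal sketch shows one way to do that and to compute the session length; parse_utc is a name chosen for illustration, and the two timestamps are the example values from this page.

```python
from datetime import datetime, timezone


def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp from the usage statistics and mark it as UTC."""
    # The strings have no UTC offset; this page documents them as UTC.
    return datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc)


licensing = {
    "start_time": "2023-08-10T16:05:35.123456",
    "end_time": "2023-08-10T18:12:01.654321",
}
session_seconds = (
    parse_utc(licensing["end_time"]) - parse_utc(licensing["start_time"])
).total_seconds()
print(f"session lasted about {session_seconds / 60:.0f} minutes")
```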

  • to_be_anonymized (object)

    Additional usage data.

    By default, share-usage-statistics is set to anonymized, which means this part of the usage statistics is anonymized once the server receives it.

    If the share-usage-statistics setting is set to licensing-only, additional usage data is not collected and this field is set to null.

    This object contains the following keys:

    • telemetry_id (string)

      A randomly-generated UUID that is preserved across runs of UDB by the same user on the same machine. This is used to identify the same user across UDB sessions without revealing the user’s identity or personal data.

      Example:

      "bc8bbbcf-2e97-4de9-9f83-28e556433735"
      
    • license (string or null)

      Either the UID of the license (for larger customers) or null (for smaller ones).

      Identifying the customer in usage data allows Undo to target improvements at customer-specific use cases.

      For smaller customers, where a license could identify a single user or a few users, identifying the customer would defeat anonymization, so this field is null.

      Example:

      "e22bbc6c7ad0e26adb07496f60de4a603e4781d44d6c3ca0d63b96ac"
      

      New in version 7.2.0.

    • udb_version (string)

      The version of UDB.

      This information is also included in the mandatory licensing section (the licensing key).

      Example:

      "7.2.1"
      
    • is_redistributable_udb (boolean)

      Whether this session was generated by Redistributable UDB, a UDB variant with limited features that can be shipped together with customers’ applications.

      This information is also included in the mandatory licensing section (the licensing key).

    • start_month (string)

      The year and month (in the UTC timezone) when the UDB session started.

      This is part of the same information as the start_time field from the licensing data. The time and day are truncated for privacy reasons, so that this section of the usage statistics (the anonymized additional usage data) cannot be correlated with the licensing data.

      Example:

      "2023-08"
      

      New in version 7.2.0.

    • deferred_recording (boolean)

      Whether UDB was started in deferred-recording mode.

    • ui (string)

      The interface through which the user interacted with UDB.

      For instance, if UDB is used on the terminal with no additional UI, then this is set to "udb_console". If UDB is used via Visual Studio Code, then this is set to "vscode".

    • tui_used (boolean)

      Whether TUI (GDB’s Text User Interface) was used.

    • used_licensable_features (object)

      Which licensable features were used by this UDB session.

      This information is also included in the mandatory licensing section (the licensing key).

      • started_process (boolean)

        Whether a live process was started by UDB.

      • attached_to_process (boolean)

        Whether UDB attached to a running process.

      • loaded_core (boolean)

        Whether UDB loaded a core file.

      • loaded_recording (boolean)

        Whether UDB loaded a LiveRecorder recording.

      • saved_recording (boolean)

        Whether UDB saved a process’s execution history to a LiveRecorder recording.

      • remote_debugging (boolean)

        Whether UDB used a remote server for debugging.

    • used_archs (object)

      The CPU architectures of the programs debugged or LiveRecorder recordings loaded during this UDB session.

      This information is also included in the mandatory licensing section (the licensing key).

      • x64 (boolean)

        Whether any of the debugged programs or loaded LiveRecorder recordings are Linux AMD/Intel x86-64 processes.

      • x32 (boolean)

        Whether any of the debugged programs or loaded LiveRecorder recordings are Linux Intel i386 processes.

        Note that this is unrelated to the rarely used x32 ABI which the Undo Engine doesn’t support.

      • arm64 (boolean)

        Whether any of the debugged programs or loaded LiveRecorder recordings are Linux ARMv8 AArch64 processes.

    • commands (list of objects)

      A list of commands which were executed.

      Each object in the list contains the following keys:

      • name (string)

        The command name, without any argument.

        In case of aliases or abbreviations, the full original command name is used.

        Example:

        "reverse-next"
        
      • result (string)

        Whether the command succeeded, failed or was interrupted by the user.

        • "success": The command terminated successfully.

        • "error": The command terminated with an error. For instance, the user tried to use the step command while the debugged program was not running.

        • "interrupted": The command was interrupted by the user with Ctrl-C.

      • duration (elapsed seconds as floating point number)

        How long the command took to complete, in seconds.

      • bbcount_delta (integer or null)

        The difference, in basic blocks (BBs), between the time in execution history before and after the execution of this command.

        A positive number denotes a movement forward in time, a negative number a movement backward in time, and 0 means that the time was not changed. This value is null if there’s no execution history before or after the command execution (for instance, the debugged program was not running).

      • time_since_previous_command (elapsed seconds as floating point number)

        The inactivity time before this command.

        That is, the time between the invocation of this command and the end of the previous command’s execution. Or, if this is the first executed command after UDB started, the time since startup.

        Note that some commands are not tracked in usage data due to technical reasons or because they are not useful commands to track. These untracked commands are ignored by this measurement.

    • time_inactive_before_quit (elapsed seconds as floating point number or null)

      The inactivity time between the last command and UDB quitting.

      Can be null if UDB doesn’t quit properly (e.g. in case of a crash).

    • loaded_recordings (list of objects)

      List where each item contains information about a LiveRecorder recording that was loaded by UDB.

      Each object in the list contains the following keys:

      • wallclock_start (date and time as string)

        The UTC wall-clock time at the start of recorded history.

        Example:

        "2023-08-01T15:05:35.123456"
        
      • wallclock_end (date and time as string)

        The UTC wall-clock time at the end of recorded history.

        Example:

        "2023-08-01T15:35:01.654321"
        
      • recording_undo_version (string)

        The version of the Undo Engine used to produce this recording.

      • recording_size_bytes (integer)

        The size of the recording, in bytes.

      • recording_application_uid (string or null)

        The UID of the license used to create the recording.

      • uuids (dictionary mapping string to string)

        Random identifiers for the recording.

        Currently, the following identifiers are stored:

        • save: A random identifier which changes every time a LiveRecorder recording of a process is saved. This identifies a single recording. Can be null for old recordings.

        • run: A random identifier which persists even if multiple LiveRecorder recordings are saved. Multiple recordings with different save identifiers can share a single run identifier. Can be null for old recordings.

        • shmem_log: A random identifier for the log of shared memory accesses if Multi-Process Correlation for Shared Memory is enabled. null otherwise.

        Example:

        {
            "save": "22859b9a-526e-46ea-a1e4-c8f6eed8c41a",
            "run": "179ccc3a-6014-40e1-8171-33f6d4d97ee9",
            "shmem_log": null
        }
        
      • load_metrics (dictionary mapping string to object)

        A mapping from the name of a metric related to the loading of this recording, to a description of that metric.

        Each value in the dictionary is an object containing the following keys:

        • size (integer)

          A measure of the size of the data processed, typically a count of bytes.

        • duration (elapsed seconds as floating point number)

          How long it took to complete.

      • restored_session (object or null)

        Information about what was restored from a UDB session, or null if there is no saved session.

        New in version 7.1.0.

        • bookmarks_count (integer)

          How many bookmarks were restored.

        • time_changed (boolean)

          Whether restoring the session led to the time in execution history being changed.

        • undo_redo_stack_count (integer)

          How many items were restored into the stack used by the ugo undo and ugo redo commands.

        • breakpoints_restored (boolean)

          Whether breakpoints were restored. This can be false when running under IDEs as they have their own breakpoint saving/restoring system.

        • breakpoints_count (integer)

          How many breakpoints are listed in the session file. This doesn’t necessarily mean that the breakpoints were restored; something could have gone wrong or restoring of breakpoints may have been disabled (see the breakpoints_restored field).

        • time_limits_set (boolean)

          Whether any time limit was restored.

        • telemetry_id (string or null)

          The telemetry ID of the user who created the session file. For session state files loaded via the usession import command, this is the ID of the user who performed the corresponding usession export command. For sessions restored automatically, it is the ID of the current user.

          The telemetry ID is a randomly-generated UUID that is preserved across runs of UDB by the same user on the same machine. This is used to identify the same user across UDB sessions without revealing the user’s identity or personal data.

          Example:

          "bc8bbbcf-2e97-4de9-9f83-28e556433735"
          

          New in version 7.2.0.

      • point_recording (boolean)

        Whether the recording is a point recording.

        New in version 7.1.0.

      • save_stop_code (integer)

        Number of the stop code that describes why the recording was saved.

        New in version 7.1.0.

      • undo_tool (string or null)

        The name of the Undo tool that was used to create the recording, or null if unknown.

        Example:

        "live-record"
        

        New in version 7.2.0.

      • config (dictionary mapping string to anything)

        Configuration parameters (with non-default values) set when the recording was generated.

        The Undo Engine has several configuration parameters, some corresponding to command-line options or environment variables that users can set, others for internal use. The config dictionary contains a small subset of these parameters. The included parameters are useful, for instance, to know which features were enabled or if the configuration could have affected performance.

        Example:

        {
            "parameter_foo": true,
            "parameter_bar": 42
        }
        

        New in version 7.2.0.

      • update_count (integer)

        Count of the number of times the recording was updated (for example, using undo recording-update) since it was originally created.

        New in version 7.2.0.

    • distro (dictionary mapping string to anything)

      Information about the GNU/Linux distribution used to run UDB.

      This information is determined using the Python distro package. In particular, this dictionary is the value returned by the distro.info(best=True) function.

      Example:

      {
          "id": "ubuntu",
          "like": "debian",
          "codename": "jammy",
          "version": "22.04.3",
          "version_parts": {
              "major": "22",
              "minor": "04",
              "build_number": "3"
          }
      }
      
    • crash_logs (dictionary mapping string to string)

      Anonymized crash logs (if any component crashed or failed due to an assertion error).

      Keys are the names of the log files (each containing the name of the crashed component and the process ID) and values are the text of the crash logs.

      The crash logs contain the Undo Engine backtrace and, in case of assertion failure, the format string for the assertion message (but not its values). This means that crash logs never contain information about the user or the debugged program.

      Example (simplified and formatted in Python-like syntax for readability):

      {
          "undo_crash_log_123_udbserver.log":
              """
              **************************************************************
              Fatal error: invalid foo %d in "%s"
              Location: src/apps/udbserver/server.cpp:251:ensure_session [123:123]
              **************************************************************
      
              frame  0: frame=0x7fff7477e500 pc=0x44e054 debug_backtrace+0xb4 [debug_libunwind.c:42:debug_backtrace]
              frame  1: frame=0x7fff7477ee10 pc=0x44df8c debug_dump_telemetry_crash_log+0x18c [debug_crash_log.c:75 (discriminator 1):debug_dump_telemetry_crash_log]
              frame  2: frame=0x7fff7477ee80 pc=0x461d31 s_handle_error+0xf1 [error.c:197:s_handle_error]
              frame  3: frame=0x7fff7477eef0 pc=0x4620dd error_handle_vpanic+0x1d [error.c:239:error_handle_vpanic]
              frame  4: frame=0x7fff7477ef10 pc=0x4621e8 error_handle_panic+0x78 [error.c:260:error_handle_panic]
              frame  5: frame=0x7fff7477eff0 pc=0x4111c0 _ZN6Server14ensure_sessionEP8Debuggee.part.45+0x3f0 [server.cpp:251:_ZN6Server14ensure_sessionEP8Debuggee]
              frame  6: frame=0x7fff7477f020 pc=0x40b6b0 _ZL5main2R12ServerConfigiPPcPb.constprop.112+0x410 [main.cpp:1094:_ZL5main2R12ServerConfigiPPcPb.constprop.112]
              frame  7: frame=0x7fff7477f210 pc=0x4069f2 main+0x192 [main.cpp:1267:main]
              """,
      }
      
    • imported_sessions (list of objects)

      Information about session state files imported via the usession import command.

      New in version 7.2.0.

      • bookmarks_count (integer)

        How many bookmarks were restored.

      • time_changed (boolean)

        Whether restoring the session led to the time in execution history being changed.

      • undo_redo_stack_count (integer)

        How many items were restored into the stack used by the ugo undo and ugo redo commands.

      • breakpoints_restored (boolean)

        Whether breakpoints were restored. This can be false when running under IDEs as they have their own breakpoint saving/restoring system.

      • breakpoints_count (integer)

        How many breakpoints are listed in the session file. This doesn’t necessarily mean that the breakpoints were restored; something could have gone wrong or restoring of breakpoints may have been disabled (see the breakpoints_restored field).

      • time_limits_set (boolean)

        Whether any time limit was restored.

      • telemetry_id (string or null)

        The telemetry ID of the user who created the session file. For session state files loaded via the usession import command, this is the ID of the user who performed the corresponding usession export command. For sessions restored automatically, it is the ID of the current user.

        The telemetry ID is a randomly-generated UUID that is preserved across runs of UDB by the same user on the same machine. This is used to identify the same user across UDB sessions without revealing the user’s identity or personal data.

        Example:

        "bc8bbbcf-2e97-4de9-9f83-28e556433735"
        

        New in version 7.2.0.

    • post_failure_logging_options (object or null)

      Information about the options used for the Post Failure Logging tool undo log. null unless that tool was run.

      New in version 7.2.0.

      • standard_streams (boolean)

        Whether the user requested the recorded process’s standard output and standard error. See the --standard-streams command-line option.

        New in version 7.2.0.

    • extra (dictionary mapping string to anything)

      Field for arbitrary data that scripts not officially part of UDB can use to collect usage statistics.

      This is useful, for instance, for our addons or for scripts implementing experimental features which are not yet part of UDB.

      Example:

      {
          "feature_foo": {
              "did_bar": true,
              "did_baz": false
          }
      }
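As an illustration of how the commands list inside to_be_anonymized can be read, the sketch below aggregates it into a few totals. It is not part of UDB: summarize_commands is a hypothetical helper, and the sample entries are invented values that only follow the key names and types documented above.

```python
from collections import Counter


def summarize_commands(commands: list) -> dict:
    """Aggregate the per-command entries of a to_be_anonymized object."""
    return {
        "counts": dict(Counter(c["name"] for c in commands)),
        "errors": sum(1 for c in commands if c["result"] == "error"),
        "busy_seconds": sum(c["duration"] for c in commands),
        "idle_seconds": sum(c["time_since_previous_command"] for c in commands),
        # bbcount_delta is null when there is no execution history.
        "net_bbcount_delta": sum(c["bbcount_delta"] or 0 for c in commands),
    }


# Invented sample data for illustration only.
sample_commands = [
    {"name": "continue", "result": "success", "duration": 1.5,
     "bbcount_delta": 1200, "time_since_previous_command": 4.0},
    {"name": "reverse-next", "result": "success", "duration": 0.25,
     "bbcount_delta": -30, "time_since_previous_command": 2.0},
    {"name": "step", "result": "error", "duration": 0.0,
     "bbcount_delta": None, "time_since_previous_command": 0.5},
    {"name": "reverse-next", "result": "interrupted", "duration": 0.25,
     "bbcount_delta": -10, "time_since_previous_command": 1.5},
]
summary = summarize_commands(sample_commands)
```

A positive net_bbcount_delta means the session ended later in execution history than it started, and a negative value means it ended earlier.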
      
  • update_sharing (object or null)

    Set when the user changes the usage statistics sharing setting, so that the server can update its data. Otherwise, null.

    For example, if the user changes the share-usage-statistics setting from its default of anonymized to on, then this object is non-null. Once the server receives this JSON object, it creates a relationship between the value of the telemetry_id field (from the additional usage data), and the values of the license and username fields (from the licensing data). This means that it’s now possible to correlate usage statistics from the separate licensing and additional usage data.

    If the user then decides to revoke consent via set share-usage-statistics anonymized, the fields of the update_sharing object are set accordingly and, once the server receives this JSON object, the link between telemetry_id and license/username is broken making the two sets of data independent again.

    For some evaluation licenses, consent to sharing full usage data is given at download time and cannot be revoked without stopping the evaluation. In these cases, this field is always set, the value key is always set to on, and the required_by_license key is set to true.
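Because several UDB processes can change the setting at around the same time, a receiver needs a rule for choosing between competing updates. One plausible rule, sketched here as an assumption and not as a description of Undo's server, is to let the update with the latest time value win. The function name latest_sharing_update and the sample values are hypothetical; the keys are those of the update_sharing object listed below.

```python
from datetime import datetime
from typing import List, Optional


def latest_sharing_update(updates: List[Optional[dict]]) -> Optional[dict]:
    """Pick the most recent non-null update_sharing object, if any."""
    candidates = [u for u in updates if u is not None]
    if not candidates:
        return None
    # Compare parsed timestamps rather than arrival order, so that a file
    # delivered late cannot override a more recent choice.
    return max(candidates, key=lambda u: datetime.fromisoformat(u["time"]))


# Invented sample: consent given, then revoked five seconds later in another
# UDB process, with the files arriving out of order.
received = [
    {"value": "anonymized", "required_by_license": False,
     "time": "2023-08-10T16:12:05.000000"},
    None,
    {"value": "on", "required_by_license": False,
     "time": "2023-08-10T16:12:00.112233"},
]
effective = latest_sharing_update(received)
```

In this sample the effective setting is anonymized, even though the update setting it to on was received last.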

    If not null, this object contains the following keys:

    • value (string)

      The sharing setting chosen by the user.

      See the set share-usage-statistics command for a list of possible values.

      Example:

      "anonymized"
      
    • required_by_license (boolean)

      Whether value being set to on is required by the license. See the description of update_sharing for details.

      New in version 7.1.1.

    • time (date and time as string)

      The time when this setting was changed. This is included to avoid race conditions if the user changes the usage statistics sharing setting in separate UDB processes running at the same time.

      Example:

      "2023-08-10T16:12:00.112233"