Recorder design

Tuesday, June 20, 2017

Blog update

Updated the blog article with recent updates, e.g. switching to C.

Recorder design

Spent some time writing up a recorder design write up, comparing it to the QEMU tracing.

My flight recorder is slightly different compared to what is called a “flight recorder” in the QEMU talk. The way I understand it, you build different QEMU variants with different instrumentation being generated, e.g. “simple” or “stderr” or “nop”. The “flight recorder” mode corresponds to a systemtap build activating selected probes during the init script.

By contrast, my intent is to provide instrumentation that is inexpensive enough to be always on, including in production systems. While the code can be disabled, the only use-case for disabling it should be for performance validation runs. In HPVM, where we used a circular-buffer flight-recorder, I measured overheads below 0.2% for workloads such as SDET or TPC, and that was with all fault handlers instrumented.

Why leave instrumentation in all the time? The reason is to provide meaningful data about the context in case of a fault or crash by “dumping” things that were recorded earlier. So when there is a SIGBUS or assert, you don’t get just a stack trace or a “CRITICAL” message, you get a detailed history of what led you there. Circular buffers being limited in size, you will only get the last N events of each type, but my implementation will sort them according to a shared global order, so for some infrequent, but important events (e.g. program options), that history can go hours or days back.

Another observation is that existing instrumentation often takes the form of printf-lookalikes. Something like FOO_DEBUG(“foo=%d ”, foo). Often, it is categorized, i.e. you may have FOO_DEBUG and BAR_DEBUG with different message prefixes, or (unfortunately more rarely) FOO_DEBUG(foo_category, “foo=%d”, foo). In order to be able to leverage this existing instrumentation in various projects, I designed my flight recorder to take this kind of printf-style input.

There are obviously caveats and limitations. You don’t want to format or allocate memory in the probe. So the printf-format will be rendered late if at all, meaning for example that you have to be careful about %s arguments, and you have a maximum number of arguments (4 in the current implementation).

Now, if we assume that, if only for the flight-recorder, the code contains instrumentation that looks like

	MY_DEBUG(category, “foo=%d bar=%zu”, foo, bar)
then you can do a number of things with this beyond the flight-recorder.

  • Obviously, nop is an option
  • Recording the message in the circular buffer for “category” should be the default operation. It enables crash dumps, but also recorder dumps while in a debugger, or interactively when something “funny” happens e.g. by sending a signal to the application (as simple as hitting Control-T on BSD/macOS)
  • You can test a per-category flag, and print messages of that category if the flag is set. This lets developers show relevant traces without having to filter out irrelevant ones (without being lost, they can still be shown with a recorder dump should they later prove useful)
  • The definition of the macro being in one place, you can #ifdef it to death. So there is no conceptual problem with having conditional code for <sys/sdt.h> style probes, etc. Now, each MY_DEBUG statement automatically becomes a systemtap probe probe, with no additional development cost.
  • Once data is in circular buffers, you can have a low-priority thread that copies the buffers to some file or pumps messages through some standard logging mechanism. Some filtering may happen at that stage (e.g. based on categories, or general concepts such as debug / warning / error. That way, logging is never a bottleneck for the main application threads.
  • A bit more difficult, I’m still working on this one… You can export selected “channels” of data from the ring buffers, to be consumed for example by external visualization tools. This is the place where I think that things like a common trace format could be leveraged, although this seems complex and it's not very clear that it's widely used.

The “printf-format” constraint makes this a bit more difficult. My current thinking is that you’d have a separate “data export” declaration for recorders that would decide what data to expose to the external tool, with information such as name and measurement unit.

Reaction to build

There was a story on Slashdot about 'build'. There were of course a few nasty comments. But there was also a nice one:

Wow!

I'm slashdot user #15,884 and my make-based autoconfiguration system is at least 3x as long. I could really learn from this guy.

macSpice committed

The macOS fixes for Spice-GTK are in. I still need a special environment, which looks like:

#!/bin/sh
export PATH=$PATH:/usr/local/Cellar/gettext/0.19.8.1/bin
export CFLAGS="-ObjC -g"
export CXXFLAGS=" -g"
export CPPFLAGS="-I/usr/local/opt/jpeg-turbo/include -I/usr/local/include -I/usr/local/Cellar/openssl/1.0.2j/include"
export LDFLAGS="-g -L/usr/local/opt/jpeg-turbo/lib -L/usr/local/lib -L/usr/local/Cellar/openssl/1.0.2j/lib"

Christophe Fergeau and I discussed to see if there is a better way to do this. For example, the -ObjC option derives from an error building vncdisplaykeymap.c without it:

In file included from vncdisplaykeymap.c:95:
In file included from /usr/local/Cellar/gtk+3/3.22.7/include/gtk-3.0/gdk/gdkquartz.h:23:
In file included from /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/AppKit.framework/Headers/AppKit.h:10:
In file included from /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Foundation.framework/Headers/Foundation.h:8:
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Foundation.framework/Headers/NSObjCRuntime.h:492:1: error:
      expected identifier or '('
@class NSString, Protocol;
^

Arguably, that can be interpreted either as:

  • an issue with using a .c extension for code that depends on Objective-C in vncdisplaykeymap.c. So maybe the Cocoa-dependent code in the macOS version should be placed in some vncdisplaykeymap.m file.
  • or an issue with the .pc files for gtk. If the intent is that file <gdk/gdkquartz.h> can be included from a C source code, then that should be reflected by pkg-config --cflags:
    % pkg-config --cflags --libs gtk+-3.0
    

    -D_REENTRANT -I/usr/local/Cellar/gtk+3/3.22.7/include/gtk-3.0 -I/usr/local/Cellar/glib/2.52.2/include/gio-unix-2.0/ -I/usr/local/Cellar/cairo/1.14.8/include/cairo -I/usr/local/Cellar/libepoxy/1.3.1/include -I/usr/local/Cellar/pango/1.40.3/include/pango-1.0 -I/usr/local/Cellar/harfbuzz/1.4.1/include/harfbuzz -I/usr/local/Cellar/pango/1.40.3/include/pango-1.0 -I/usr/local/Cellar/atk/2.22.0/include/atk-1.0 -I/usr/local/Cellar/cairo/1.14.8/include/cairo -I/usr/local/Cellar/pixman/0.34.0/include/pixman-1 -I/usr/local/Cellar/fontconfig/2.12.1_2/include -I/usr/local/opt/freetype/include/freetype2 -I/usr/local/Cellar/libpng/1.6.29/include/libpng16 -I/usr/local/Cellar/gdk-pixbuf/2.36.6/include/gdk-pixbuf-2.0 -I/usr/local/Cellar/libpng/1.6.29/include/libpng16 -I/usr/local/Cellar/glib/2.52.2/include/glib-2.0 -I/usr/local/Cellar/glib/2.52.2/lib/glib-2.0/include -I/usr/local/opt/gettext/include -I/usr/local/Cellar/pcre/8.40/include -L/usr/local/Cellar/gtk+3/3.22.7/lib -L/usr/local/Cellar/pango/1.40.3/lib -L/usr/local/Cellar/atk/2.22.0/lib -L/usr/local/Cellar/cairo/1.14.8/lib -L/usr/local/Cellar/gdk-pixbuf/2.36.6/lib -L/usr/local/Cellar/glib/2.52.2/lib -L/usr/local/opt/gettext/lib -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lintl -Wl,-framework -Wl,CoreFoundation

Christophe will ask the maintainers of gtk-osx.