State transformation

Software updates frequently change the representation of state between versions. Types may be added, modified, or deleted, and state will change as a result. To fully support all possible updates, a dynamic software updating system must provide some mechanism for transforming existing state into the new version’s representation. This chapter introduces Kitsune’s mechanism, xfgen.
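To make the problem concrete, here is a minimal hand-written sketch in plain C of the kind of work a state transformer performs. The struct names, the widened and added fields, and the function transform_client are all hypothetical and are not part of Kitsune. xfgen exists so that code of this shape can be generated rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical version-1 representation of a client record. */
struct client_v1 {
    int  fd;
    char name[16];
};

/* Hypothetical version-2 representation: one field modified, one added. */
struct client_v2 {
    int  fd;
    char name[32];       /* modified: buffer grew from 16 to 32 bytes */
    long bytes_sent;     /* added: has no counterpart in version 1 */
};

/* Copy the fields that survive the update and initialize the new one. */
void transform_client(const struct client_v1 *in, struct client_v2 *out)
{
    assert(in != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    out->fd = in->fd;
    /* The new buffer is larger and already zeroed, so copying the whole
     * old buffer always leaves the result NUL-terminated. */
    memcpy(out->name, in->name, sizeof in->name);
    out->bytes_sent = 0; /* explicit initial value for the added field */
}
```

Even in this small case the transformer has to know both layouts at once, which is why xfgen takes the type summaries of both versions as input.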

Introduction to xfgen

xfgen is a tool to generate transformation code. The migration features in Kitsune discussed in the last chapter automatically call this transformation code when migration occurs.

xfgen has two parts: a set of annotations defined in the Kitsune C library, and an external tool whose binary is named xfgen. Given an xf file and the type summaries of the old and new versions (as generated by ktcc), the tool outputs a C file containing transformation code.

This output is governed by the annotations provided in the program source and by the specifications given in the xf file. These specifications take the following forms:

spec1 -> spec2: {
    /* C transformation code */
}

INIT spec1: {
    /* C initialization code */
}

(xfgen also allows developers to provide code to initialize a newly added variable, but for simplicity’s sake we will refer to all code as transformation code.)

The rest of this chapter will provide a step-by-step introduction to using xfgen. At the end, we include a reference for the xfgen language and source annotations.

Running xfgen

The xfgen program takes four arguments:

xfgen dsu.c old-types.ktt new-types.ktt program.xf
xfgen <output C file> <old version types> <new version types> <xf file>

These ktt files are Kitsune type summary files. They tell xfgen which types were changed, added, or deleted between versions, and which C variables have those types. By default, for a C file named “file.c”, the Kitsune compiler writes the type summary to “file.ktt”. You can choose a different name with the --typesfile-out=<name> argument. The name should be derived from the C source file name, because the Kitsune compiler overwrites any existing file of that name.

Before passing these type summaries to xfgen, we need to join them with the kttjoin tool:

new-types.ktt: $(OBJS)
    kttjoin new-types.ktt $(subst .o,.ktt,$(OBJS))

The old-types.ktt file is typically specified in a Makefile variable, pointing into the source directory of the old version.

The program.xf file is a file containing xfgen-language transformations as described above, and in the rest of this chapter. The .xf file is the source file for the C output of xfgen. Usually, a Makefile rule like this is used to compile the .xf file:

dsu.c: program.xf new-types.ktt
    xfgen $@ old-types.ktt new-types.ktt program.xf

(In Makefile syntax, $@ is a variable that stands for the target of the rule – in this case, “dsu.c”.)

Given these two type summary files, xfgen will generate a C file that defines transformation functions between old and new types. This file should be compiled with the ordinary C compiler rather than the Kitsune compiler:

dsu.o: dsu.c
    $(CC) $(CFLAGS) -fPIC -c $^

(Where $^ is a Makefile variable specifying the dependencies of the rule – in this case, dsu.c.)

This means that the rule to build your program’s shared object will look like this:

program.so: $(OBJS) dsu.o
    $(CC) -o $@ -shared $(LDFLAGS) $^

Adding a new variable

Initializing a newly added variable with xfgen has slightly different, and slightly simpler, syntax than full transformation.

INIT variable: {
    $out = ...;
}

Here, variable is the name of a newly added variable. $out is an xfgen extension representing the result of the transformation.

This is a trivial example, but the braces can enclose any amount of C code. This code can access symbols and types in both the old and new versions, and can draw on any aspect of the program state to determine the correct initial value of the variable.
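As an illustration of deriving an initial value from existing state, the following plain C sketch computes the starting value of a new counter from a table that already exists in the running program. The struct session type, the counter it stands for, and the function init_active_sessions are hypothetical names chosen for this example. In an xf file, logic of this kind would sit inside the INIT block and assign its result to $out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical existing state: a session table carried over from the
 * old version of the program. */
struct session {
    int active;
};

/* Derive the initial value of a hypothetical new counter from state that
 * already exists, instead of starting it at zero. */
size_t init_active_sessions(const struct session *table, size_t n)
{
    size_t i, count = 0;

    assert(table != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (table[i].active)
            count++;
    return count;
}
```

Starting such a counter at zero would be wrong for a program that is updated while sessions are open, which is why initialization code is allowed to inspect the running state.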

Initialization is also possible with types and type fields:

INIT struct toy.field: { $out = "new field"; }

Exercise 5.1

We revisit the key-value server for the exercises in chapter 5. Your first task is simple: add an INIT stanza for the newly added variable connected_count.

After you make this change, test that it works by building and running the exercise with the Kitsune driver, then initiating an update with doupd as described in chapter 1.

Refactoring with xfgen

A common and simple case of state changing between versions is a variable being renamed or moved between files. This can be handled with a single line in xfgen:

oldname -> newname

Variables can also be scoped to a single file:

main.c/old_static_variable -> main.c/new_static_variable

Or moved between files:

main.c/old_static_variable -> other.c/new_static_variable

Or lifted from functions:

function#local_variable -> main.c/global_static
other.c/function#local -> global_variable 

Or any other combination. The full set of Kitsune names is as follows:

file.c/function#local_variable
file.c/global_static_variable
global_variable

None of the renaming rules above has a block of transformation code, which indicates that no transformation is needed. Any of them could be followed by a block of code to specify both a renaming and a transformation.

Exercise 5.2

Add an xfgen rule to keyvalue.xf to handle the renaming of server_sock and client_sock to server and client. Handle the refactoring that elevated server from a local variable in main to a global variable.

After you make this change, test that it works by building and running the exercise with the Kitsune driver, then initiating an update with doupd as described in chapter 1.

Transformation Code

Transformation blocks can contain arbitrary C code. Further, any xf file may begin with a block of C code that defines functions or state common to all transformers:

{
    int common_xform(void) {
        ...
    }
}

INIT var: {...}
var1 -> var2
...

Kitsune provides several API functions and macros to make transformation code easier to write.

$xform(oldtype, newtype) and closures

XF_INVOKE(c, in, out)

/* yield closures */
XF_PTR(xf)
XF_ARRAY(nmemb, from_elem_size, to_elem_size, xf)
XF_NTARRAY(from_elem_size, to_elem_size, xf)
XF_NTSTR()
XF_FPTR()

/* yields xf-function */
$xform(oldtype, newtype)

Frequently, transformed types or variables are containers for other types. Since C is weakly typed, xfgen frequently is unable to determine which transformation is acceptable, especially when void * types are in play.

XF_INVOKE allows you to invoke a closure, c, on two sub-arguments, in and out. Frequently, &$in and &$out are used here to form a simple single-line transformation that ‘redirects’ the existing transformer to a builtin xfgen transformer such as one of the above. For example:

struct foo.bar -> struct foo.bar: {
    XF_INVOKE(XF_PTR($xform(struct baz,struct baz)), &$in, &$out);
}

This transformation causes the bar field of struct foo to be transformed as a pointer to an instance of struct baz.
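Conceptually, transforming a field "as a pointer" means following the old pointer, building a new-version object, and storing the new object's address. The sketch below shows that idea in plain C. It is not Kitsune's implementation of XF_PTR: the names xform_ptr and xform_baz and the two struct layouts are assumptions made for illustration, and the sketch ignores pointers that are shared by several owners.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical old and new layouts of the pointed-to type. */
struct baz_old { int id; };
struct baz_new { int id; int flags; };

/* Hypothetical element transformer: old struct baz -> new struct baz. */
void xform_baz(const void *in, void *out)
{
    const struct baz_old *o = in;
    struct baz_new *n = out;

    n->id = o->id;
    n->flags = 0;            /* the added field gets a default value */
}

/* Transform a pointer field: follow the old pointer, build a new-version
 * object, and store its address in the new field. Returns 0 on success
 * and -1 if allocation fails. */
int xform_ptr(void *const *in, void **out, size_t new_size,
              void (*xf)(const void *, void *))
{
    assert(in != NULL && out != NULL && xf != NULL);
    if (*in == NULL) {       /* a NULL pointer stays NULL */
        *out = NULL;
        return 0;
    }
    *out = malloc(new_size);
    if (*out == NULL)
        return -1;
    xf(*in, *out);
    return 0;
}
```

The element transformer is passed in as a function pointer, which mirrors how the xf argument of the macros above is supplied by $xform.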

The other macros listed alongside XF_PTR can be used similarly to yield closures for use with XF_INVOKE. Most take a pointer to a transformer function, named xf above.

XF_ARRAY yields a closure for an array of nmemb elements whose size is from_elem_size in the old version and to_elem_size in the new version.
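The role of the two element sizes becomes clearer in a plain C sketch of an array walk. This is an illustration under assumptions, not the code behind XF_ARRAY: the function names xform_array and xform_int_to_long are hypothetical, and the sizes are taken to be byte counts as returned by sizeof.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element transformer: widen an old int element to a long. */
void xform_int_to_long(const void *in, void *out)
{
    *(long *)out = *(const int *)in;
}

/* Walk nmemb elements, stepping the source by from_elem_size and the
 * destination by to_elem_size, and apply xf to each pair of elements. */
void xform_array(const void *in, void *out, size_t nmemb,
                 size_t from_elem_size, size_t to_elem_size,
                 void (*xf)(const void *, void *))
{
    const unsigned char *src = in;
    unsigned char *dst = out;
    size_t i;

    assert(in != NULL && out != NULL && xf != NULL);
    for (i = 0; i < nmemb; i++)
        xf(src + i * from_elem_size, dst + i * to_elem_size);
}
```

Two separate strides are needed because an element may change size between versions, so the i-th element sits at a different offset in the old and new arrays.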

XF_NTARRAY is identical to XF_ARRAY, but presumes that the array is terminated by a NULL element. XF_NTSTR is identical to XF_NTARRAY, but presumes that the state to be transformed is a C string.

XF_FPTR yields a closure that can be used to transform function pointers. This uses Kitsune’s symbol table to find the corresponding next-version function given the address from the previous version.
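The lookup described here can be pictured as two steps: map the old address to a name, then map the name to the new address. The following plain C sketch models that with two small tables. Everything in it is hypothetical (struct symbol, xform_fptr, and the two stand-in handlers). Kitsune's real symbol table and its interface are not shown in this chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Hypothetical symbol-table entry: a function name and its address. */
struct symbol {
    const char *name;
    generic_fn  addr;
};

/* Stand-ins for one handler as compiled into the old and new versions. */
int old_calls, new_calls;
void on_connect_old(void) { old_calls++; }
void on_connect_new(void) { new_calls++; }

/* Map an old-version function address to the new-version function with
 * the same name. Returns NULL if the address or the name is not found. */
generic_fn xform_fptr(generic_fn old_addr,
                      const struct symbol *old_syms, size_t n_old,
                      const struct symbol *new_syms, size_t n_new)
{
    size_t i, j;

    for (i = 0; i < n_old; i++) {
        if (old_syms[i].addr != old_addr)
            continue;
        for (j = 0; j < n_new; j++)
            if (strcmp(old_syms[i].name, new_syms[j].name) == 0)
                return new_syms[j].addr;
        return NULL;   /* the function no longer exists by that name */
    }
    return NULL;       /* the address is not a known old-version symbol */
}
```

A function pointer cannot simply be copied across an update, because the same function generally lives at a different address in the new version's code.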

In the argument list of the above macros, xf stands for transformation function. This argument is generated by the $xform statement. $xform(oldtype, newtype) will yield a function pointer to a transformation function between the corresponding types.

Type annotations

Since C is weakly typed, xfgen can only generate very simple transformers based on the type information available by default in C. To allow xfgen to generate more transformers, and more accurate ones, programmers can annotate types and variables with several macros Kitsune provides.

E_OPAQUE
E_PTRARRAY(S)
E_ARRAY(S)

E_OPAQUE causes a variable or type to be treated as opaque and copied as raw data. Transformation ceases at that point. This is most useful for library types for which complete information is unavailable at compile time.
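In plain C terms, an opaque copy amounts to a byte-for-byte copy with no traversal of the contents. The sketch below illustrates the consequence for embedded pointers. The struct lib_handle type and the function xform_opaque are hypothetical and are not taken from Kitsune.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library type whose full layout the program does not know. */
struct lib_handle {
    int   id;
    void *internal;   /* copied bit for bit; the pointee is never visited */
};

/* Opaque "transformation": a raw byte copy with no traversal. */
void xform_opaque(const void *in, void *out, size_t size)
{
    assert(in != NULL && out != NULL);
    memcpy(out, in, size);
}
```

Because the copy is shallow, any pointer inside the opaque value still refers to the same object after the update, which is the desired behavior for state owned by a library.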

E_PTRARRAY and E_ARRAY cause a variable or type to be marked as either a pointer to an array of size S, or an array of size S.

xfgen can also automatically generate transformers for the common use of void pointers as generic types in C. Consider the following linked list:

struct linked_list {
    void *data;
    struct linked_list *next;
};

Even though this list may be used with more than one type over the course of the program, each instantiation of the list is usually only a single type. xfgen provides three annotations to exploit this common behavior:

E_GENERIC(@t)
E_T(@t)
E_G(@t)

The clearest way to explain these annotations is to show their use:

struct linked_list {
    void E_T(@t) *data;
    struct linked_list E_G(@t) *next;
} E_GENERIC(@t);

Read from top to bottom, this states that data is a variable of type @t, and next is an instance of a linked list of type @t, for all types @t.

Types can have arbitrary numbers of generic types. Consider an associative list:

struct linked_list {
    void E_T(@k) *key;
    void E_T(@v) *val;
    struct linked_list E_G(@k,@v) *next;
} E_GENERIC(@k,@v);

Once a type is properly annotated, you can signal that a variable is of a particular generic type with the E_G annotation, as follows:

struct linked_list E_G(int, [opaque]) *list;

This example demonstrates the Kitsune [opaque] pseudo-type, which here causes the val field to be copied as an opaque value when it is encountered during traversal.
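To see why the element type matters, consider what a traversal of the unannotated list must do. The plain C sketch below rebuilds a list node by node and applies a caller-supplied transformer to each data pointer. Supplying that transformer is only possible when the element type is known, which is the information the generic annotations provide. The names elem_xform, xform_int_elem, xform_list, and free_list are hypothetical, and this is not the code xfgen generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct linked_list {
    void *data;
    struct linked_list *next;
};

/* Hypothetical per-element transformer: returns the new-version copy of
 * one data item, or NULL if allocation fails. */
typedef void *(*elem_xform)(const void *old_data);

/* Example element transformer for a list of int: widen each to long. */
void *xform_int_elem(const void *old_data)
{
    long *out = malloc(sizeof *out);

    if (out != NULL)
        *out = *(const int *)old_data;
    return out;
}

/* Release a list together with the data items it owns. */
void free_list(struct linked_list *list)
{
    while (list != NULL) {
        struct linked_list *next = list->next;
        free(list->data);
        free(list);
        list = next;
    }
}

/* Rebuild the list node by node, applying xf to every data pointer.
 * Returns NULL for an empty list or if any allocation fails. */
struct linked_list *xform_list(const struct linked_list *old, elem_xform xf)
{
    struct linked_list *head = NULL;
    struct linked_list **tail = &head;

    assert(xf != NULL);
    for (; old != NULL; old = old->next) {
        struct linked_list *node = malloc(sizeof *node);
        if (node == NULL) {
            free_list(head);
            return NULL;
        }
        node->data = xf(old->data);
        node->next = NULL;
        if (node->data == NULL) {
            free(node);
            free_list(head);
            return NULL;
        }
        *tail = node;
        tail = &node->next;
    }
    return head;
}
```

With a bare void * and no annotation there is nothing to select the element transformer from, so the only safe default would be an opaque copy of the pointer.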

Exercise 5.3

For this exercise, we have changed the type of store in keyvalueserver.c, allowing us to demonstrate generic data structure annotations.

Add type annotations to the struct tree type in tree.c, allowing it to be automatically traversed by xfgen.

Test these changes by building, running, and updating the exercise, as before. To update the exercise, you may need to connect to the listening port (5000 by default) via nc or telnet.

Exercise 5.4

Add transformation code to handle the addition of the entry_time field to struct treenode. Take advantage of xfgen’s traversal of the tree enabled by Exercise 5.3.

Test these changes by building, running, and updating the exercise, but this time, connect a client before the update, update the program, and observe the behavior after the update.