Part V
Automating the testing process

The authors of dont are Jade Alglave and Luc Maranget (INRIA Paris–Rocquencourt).

21 Preamble

Following Part II, we describe our tests via cycles, built from the candidate relaxations they involve. We consider a candidate relaxation to be relaxed, or non-global, when it corresponds to the weaknesses that can be observed on a system implementing A. We consider a candidate relaxation to be safe, or global, when it is guaranteed, e.g. by the documentation, never to be relaxed.

In the following, we consider an architecture A to be a pair (Relax_A, Safe_A), where Relax_A (resp. Safe_A) are the candidate relaxations relaxed (resp. safe) for A. The automated front-end dont mechanises the task of checking that a machine or executable model conforms to such an architecture, and of exploring architectures. We provide some experiment reports elsewhere. This document is intended to be a gentle introduction to dont and a partial reference.

22 A tour of dont

22.1 Checking conformance

We want to check that a given machine M is conform to an architecture A. By conform, we mean that the machine M does not exhibit more behaviours than the architecture A actually allows.

For example, let us consider an x86 machine with 2 processors. Suppose that we have been told that x86 machines are TSO, and that we want to check that. As the default values of dont options handle that very situation, we type:

$ dont -mode conform
** Step 0 **
Phase 2 in A (6 tests)
...
Phase 2 in A (6 tests)
** Step 5 **
Safe set {Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWR} is conform

The automated front-end dont, assumed the TSO safe set (the default for x86), called the diy tool (see Part II) to generate all the tests that are forbidden by TSO — up to 2 processors; ran them (5 times) with our companion litmus tool, (see Part I) against our x86 machine; and observed that the machine does not exhibit any outcome forbidden by TSO. In effect, dont in conformance check mode automates the safe tests of Sec. 6.2.

22.2 Checking non-conformance

Now, we wish to prove that an x86 machine is not sequentially consistent. To that end, we write the following configuration file x86.sc:

#General behaviour
arch = X86
mode = conform
stabilise = 1
#Cycle control
safe = Rfe,Fre,Wse,Pod**,[Rfi,PodRR]
nprocs = 2
#External tool control
litmus_opts = -a 2 -i 0
run_opts = -s 100000 -r 10,-s 5000 -r 200 -i 1
build = make -j 2 -s

Most of dont controls are set, sometimes to their default values:

arch = X86 sets the targeted architecture, mode = conform sets conformance check mode, and stablise = 1 commands performing the check round once (the default is five times, cf. supra).
safe = Rfe,Fre,Wse,Pod**,[Rfi,PodRR] defines the set of safe relaxation candidates used to generate litmus tests (up to 2 processors, by nprocs = 2).
The front-end dont calls litmus and runs the tests with the specified options. The setting litmus_opt = -a 2 -i 0 specifies that two processors are available and enables affinity control (see Sec. 4 for the description of litmus options). Tests will be run twice per check round, once with options -s 100000 -r 10, and once with options -s 5000 -r 200 -i 1 (see Sec. 2.3 for the description of test executable options). Finally, the setting build = make -j 2 -s specifies the command to use to compile the C source files that litmus produces.

We run dont configured by x86.sc as follows:

$ dont x86.sc 
** Step 0 **
Phase 2 in A (9 tests)
...
** Step 1 **
Safe set {[Rfi,PodRR], Rfe, Fre, Wse, PodWW, PodWR, PodRW, PodRR} is not conform
++ Invalidating tests ++
A006: 'Fre PodWR Fre PodWR' {Fre, PodWR}
A007: 'Fre PodWW Wse PodWR' {Fre, Wse, PodWW, PodWR}
A001: 'Rfi PodRR Fre PodWR Fre' {[Rfi,PodRR], Fre, PodWR}
A002: 'Rfi PodRR Fre PodWW Wse' {[Rfi,PodRR], Fre, Wse, PodWW}
A000: 'Rfi PodRR Fre Rfi PodRR Fre' {[Rfi,PodRR], Fre}
++++++++

The conformance check failed and the tests that invalidate the hypothesis “x86 is sequentially consistent” are listed. The check took place in directory A. Directory A contains the actual logs of litmus runs as files A.00, A.01 etc., in addition to the sources of the litmus tests:

$cat A/A006.litmus
X86 A006
"Fre PodWR Fre PodWR"
Cycle=Fre PodWR Fre PodWR
Relax=
Safe=Fre PodWR
{ }
 P0          | P1          ;
 MOV [x],$1  | MOV [y],$1  ;
 MOV EAX,[y] | MOV EAX,[x] ;
exists (0:EAX=0 /\ 1:EAX=0)

Notice that, since tests are described by their cycles, the source of tests can also be reconstructed with diyone:

% diyone -arch X86 Fre PodWR Fre PodWR
X86 a
"Fre PodWR Fre PodWR"
{ }
 P0          | P1          ;
 MOV [y],$1  | MOV [x],$1  ;
 MOV EAX,[x] | MOV EAX,[y] ;
exists (0:EAX=0 /\ 1:EAX=0)

22.3 Automatically exploring the memory model exhibited by a machine

Now suppose that we have no idea of the memory model of our 2 processors x86 machine. Another mode of our dont tool automatically explores a given machine, and outputs an architecture (i.e. a pair (Relax_A, Safe_A)) to which the machine conforms. The following configuration file x86.explo instructs dont to perform such an exploration.

#General behaviour
arch = X86
mode = explo
#Cycle control
testing = Rfe,Pod**,MFenced**,[Rfi,PodR*]
safe = Fre,Wse
nprocs = 2
#External tool control
litmus_opts = -a 2 -i 0
run_opts = -s 100000 -r 10,-s 5000 -r 200 -i 1
build = make -j 2 -s

With respect to conformance check, new or changed settings are the selection of exploration mode by mode = explo, the definition of the initial safe set by safe = Fre,Wse, and and the definition of the candidate relaxations to be tested (testing = Rfe,Pod**,MFenced**,[Rfi,PodR*]).

We launch the exploration as:

$ dont x86.explo

The whole process only takes a few minutes, mostly due to the limited number of tests induced by the setting nprocs = 2.

We now detail dont output (complete log of the exploration: x86-log.txt) . We start by a first exploration round:

** Step 0 **
Testing: {[Rfi,PodRW], [Rfi,PodRR], Rfe, PodWW, PodWR, PodRW, PodRR, MFencedWW, 
MFencedWR, MFencedRW, MFencedRR}
Relaxed: {}
Safe   : {Fre, Wse}
Phase 1 in A (6 tests)
Actually tested: {[Rfi,PodRW], [Rfi,PodRR], PodWW, PodWR, MFencedWW, MFencedWR}
Added relax: {[Rfi,PodRR], PodWR}
Added safe: {[Rfi,PodRW], PodWW, MFencedWW, MFencedWR}
Phase 2 in B (6 tests)

The log above first indicates the current status of exploration as three sets: testing, relaxed and safe. Initially, no candidate relaxation has yet been observed to be relaxed, while the testing and safe sets are as assumed. Each exploration round is divided in two phases. The aim of Phase 1 (performed in directory A) is to classify some candidate relaxations as either relaxed or safe. It here succeeds for 6 candidate relaxations, whose observed status is indicated. Phase 2 (performed in directory B) basically is a conformance check of the current safe set. The conformance check succeeds and all safe candidate relaxations found at phase 1 make it to the next round:

** Step 1 **
Testing: {Rfe, PodRW, PodRR, MFencedRW, MFencedRR}
Relaxed: {[Rfi,PodRR], PodWR}
Safe   : {[Rfi,PodRW], Fre, Wse, PodWW, MFencedWW, MFencedWR}
Phase 1 in C (10 tests)
Actually tested: {Rfe, PodRW, PodRR, MFencedRW, MFencedRR}
Added safe: {Rfe, PodRW, PodRR, MFencedRW, MFencedRR}
Phase 2 in D (17 tests)

Phase 1 (performed in directory C) can now target new candidate relaxations, because of the increased safe set. All of targeted candidate relaxations are observed to be safe, which is confirmed by phase 2. As a consequence, there does not remain any candidate relaxation to be tested and the next round reduces to a conformance check:

** Step 2 **
Testing: {}
Relaxed: {[Rfi,PodRR], PodWR}
Safe   : {[Rfi,PodRW], Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWW, MFencedWR, MFencedRW, MFencedRR}
Phase 1 in E (0 tests)
Phase 2 in D (17 tests)

The same check is performed for 4 additional rounds as governed by the default value of 5 for the setting of stabilise. Round number 6 then shows the result of exploration, (i.e. the pair (Relax_A, Safe_A)), prefixed by the list of tests that justify observed relaxations:

** Step 6 **
...
++ Witness(es) for relaxed [Rfi,PodRR] ++
A001: 'Rfi PodRR Fre Rfi PodRR Fre' {[Rfi,PodRR], Fre}
++++++++
++ Witness(es) for relaxed PodWR ++
A003: 'Fre PodWR Fre PodWR' {Fre, PodWR}
++++++++
Observed relaxed: {Rfi, PodWR}
Observed safe: {Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWW, MFencedWR, MFencedRW, MFencedRR}

And we go again for 5 additional rounds of pure conformance check:

** Now checking safe set conformance **
** Step 7 **
Phase 2 in F (17 tests)
...
* Step 12 **
Observed relaxed: {Rfi, PodWR}
Observed safe: {Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWW, MFencedWR, MFencedRW, MFencedRR}

Once exploration is complete, all litmus tests and logs of litmus runs are still present in their directories A, B, etc. For instance, the directory F contain the 10 logs of the final conformance check, as the files F.01, …, F.09:

$ ls F/F.??
F/F.00  F/F.01  F/F.02  F/F.03  F/F.04  F/F.05  F/F.06  F/F.07  F/F.08  F/F.09

The tool dont offers a convenient replay feature:

$ dont -restart
** Step 0 *
...
* Step 12 **
Observed relaxed: {Rfi, PodWR}
Observed safe: {Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWW, MFencedWR, MFencedRW, MFencedRR}

The command above takes a few seconds of time, since experiments are not run again. Instead, the logs of litmus runs are read and their interpretation is re-performed. Notice that the restart feature also permits to pursue interrupted experiments.

23 Usage of dont

In effect, the tool dont automates the complete testing procedure described in the documentation of diy proper (Sec. 6). It is to be noticed that dont requires a fully functional installation of the diy tool suite. In particular, the commands diy and litmus must be installed and runnable as “diy” and “litmus” (i.e. installed in path).

23.1 Command-line options

The automated front-end dont is configured mostly by the means of a configuration file, which dont takes as a command-line argument. Nevertheless, dont accepts the following, limited, set of options:

-v: Be verbose, repeat to increase verbosity.
-version: Show version number and exit.
-arch (X86|PPC|ARM): Set architecture. Default is X86. ARM is untested.
-mode (conform|explo): Set main mode, either conformance check or exploration. Default is explo.
-nprocs <n>: Generate tests up to <n> processors (defaults: X86=2, PPC=4)
-restart: Restart the experiment in hand in current directory.

Except for -restart command lines options are not intended for normal use. In particular, command-line options do not override values defined in configuration files.

Namely, there are many parameters to set and appropriate values for them will depend on the tested machine. In particular, litmus parameters need to be chosen carefully, by the means of preliminary experiments. For instructions on configuring litmus, refer to Sec. 2 of litmus documentation.

23.2 Configuration files

The general syntax of configurations files is a sequence of lines key = value. Comment lines are introduced by #. The tool dont recognises the following keys:

General behaviour

mode = (conform|explo): Main operating mode. Default is explo
arch = (X86|PPC|ARM): Target architecture. Default is X86.
run = (local|ssh <addr>|cross <addr1> <addr2>): Give access to the tested machine, which can be either the machine where dont runs, or remote machine <addr>, or compile C files on remote machine addr1 and execute on tests on remote machine addr2. Machine addresses are [user@]machine[:port] expressing connection elements for both ssh and scp. Default is local.
work_dir = dir: Directory for temporary files, default is /var/tmp.
stabilise = <n>: In conformance check mode, dont performs n rounds of conformance testing. In exploration mode, dont ends the exploration after n rounds without state change. Default is 5.
interactive = <bool>: In exploration mode and after n rounds without state change, dont will either assume that the whole current testing set is safe (false), or ask the user (true) to decide for some of the elements of this set to be safe. Default is true, i.e. ask user.

Controlling Cycle Generation

nprocs = <n>: Generate cycles up to n processors. Default is 2 for x86 and 4 for Power.
diy_sz = <m>: Upper limit on the size of cycles of candidate relaxations. Default is 2 × n, where n is the number of processors. With decent values of the initial candidate relaxations sets (see below), this default commands the generation of all (critical, see Sec. 10.5) cycles that involve up to n processors.
safe = <relax-list>: Define the safe set S. In exploration mode, S is the initial value of the safe set (default Fre, Wse). In conformance mode, S is the safe set checked. Ddefault is Rfe, Fre, Wse, PodR*, PodWW, MFencedWR for x86, and unspecified for other architectures.
testing = <relax-list>: Define the tested set of candidate relaxations. The tested set is relevant only in exploration mode. Default values are Rfe,Pod**,MFenced**,[Rfi,MFencedR*],[Rfi,PodR*] for x86 and unspecified for other architectures.

The syntax for relax-list above is a comma (or space) separated list of candidate relaxations. Candidate relaxations are introduced by the documentation of diy (see Part II)

Control of external tools

litmus_opts = <opts>: Define options used by dont when it calls litmus. Default is the empty string, i.e. use litmus defaults.
run_opts = <opts₁,…,opts_n>: Define options used for running litmus tests. Any set of litmus tests generated and compiled by dont, will be run n times, with specified options. More concretely, dont will run the litmus tests with commands sh run.sh opts₁, …, sh run.sh opts_n. The default is the empty string, i.e. run tests once with no option.
build = <command>: Defines the command issued by dont to compile the C source files produced by litmus. The default is sh comp.sh, i.e. runs the compilation script produced by litmus. An interesting alternative is make -s -j n for concurrent compilation, with up to n concurrent tasks.

Part V Automating the testing process