
Part IV
Converting tests with jingle7

The tool jingle7 is a litmus test converter. It translates tests from one architecture to another. For instance, a LISA test may be translated into an AArch64 (or ARMv8) test.

16  Writing conversion rules

Translation is directed by user-specified conversion rules. All the conversion rules for a specific translation (i.e. from one architecture to another) are grouped in a specific “theme” file.

16.1  Minimal conversion rules

A theme file starts by specifying the source and target architectures. For instance:

LISA to AArch64

Then conversion rules follow. For instance here are two rules that translate LISA loads and stores to AArch64 loads and stores:

"r[] %x %y" -> "LDR %x,[%y]"

"w[] %x %y" -> "STR %y,[%x]"

More generally, conversion rules are akin to rewrite rules "L" -> "R". In the simplest case, a rule associates a source instruction pattern L with a target instruction pattern R. Patterns may contain identifiers, which are bound to concrete items during matching. Identifiers starting with the character “%” represent registers.

The above rules suffice to translate the LISA (register indirect) load instruction “r[] r0 r1” into the equivalent AArch64 instruction “LDR R3,[R4]”. Observe that registers are translated: here LISA r0 and r1 are translated into AArch64 R3 and R4. If a later LISA instruction uses, say, r1 again, that register will be translated to the same AArch64 register R4. As a result, the sequence r[] r0 r1; w[] r2 r1 will get translated to LDR R3,[R4]; STR R4,[R5], since the store rule binds %x to r2 (translated to a new register R5) and %y to r1.
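The register translation described above can be sketched in a few lines of Python. This is only an illustration, not the way jingle7 is implemented: the function name translate, the token-based matching and the choice to start numbering at R3 (made solely to reproduce the register names used in the text) are all assumptions.

```python
import re

def translate(rules, program, first_reg=3):
    """Translate a list of instructions with single-instruction rules.

    Hypothetical helper: `regs` is shared by all instructions, so a source
    register seen twice is always mapped to the same target register.
    """
    regs = {}
    out = []
    for instruction in program:
        for left, right in rules:
            l_tok, i_tok = left.split(), instruction.split()
            # The first token (the mnemonic) must match literally.
            if len(l_tok) != len(i_tok) or l_tok[0] != i_tok[0]:
                continue
            # Bind each pattern identifier to the matched source register.
            bound = dict(zip(l_tok[1:], i_tok[1:]))

            def target(m):
                src = bound[m.group(0)]
                return regs.setdefault(src, f"R{first_reg + len(regs)}")

            out.append(re.sub(r"%\w+", target, right))
            break
        else:
            raise ValueError(f"no rule matches: {instruction}")
    return out

rules = [("r[] %x %y", "LDR %x,[%y]"),
         ("w[] %x %y", "STR %y,[%x]")]
converted = translate(rules, ["r[] r0 r1", "w[] r2 r1"])
```

In this sketch the first element of converted is "LDR R3,[R4]", and r1 is mapped to R4 in both instructions because the table regs outlives each rule application.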

16.2  More on identifiers

There are additional categories of identifiers:

Identifiers starting with “&” represent constants. For instance, the following rule translates the loading of a constant into a register:

"mov %r &c" -> "MOV %r,&c"

Identifiers starting with an alphabetical letter represent memory locations. Consider for instance the following LISA to AArch64 conversion rule:

"w[] x %r" -> "STR %r,[%x]"

The identifier “x” will match any symbol. In LISA, such symbols embedded in code are memory locations. By contrast, modern architectures such as AArch64 do not permit embedding memory addresses in code. As a consequence, the matched symbolic location from the left-hand side pattern is replaced by a register, written “%x” above. The tool jingle7 will initialise the corresponding AArch64 register appropriately. For instance, the LISA instruction w[] z r0 will get translated to, say, STR R1,[R2], with the initial value of R2 being specified as z in the final target litmus test.
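As a rough illustration of how locations and constants differ from registers, the following Python sketch converts one instruction and also returns the register initialisations it implies. The function convert, its first_reg parameter and the numbering scheme are hypothetical, and the sketch does not check that a matched item really is a register, a constant or a location.

```python
import re

def convert(rule, instruction, first_reg=1):
    """Apply one single-instruction rule; return (target code, initial state)."""
    left, right = rule
    l_tok, i_tok = left.split(), instruction.split()
    if len(l_tok) != len(i_tok) or l_tok[0] != i_tok[0]:
        return None
    bound, locations = {}, set()
    for p, s in zip(l_tok[1:], i_tok[1:]):
        if p[0] in "%&":
            bound[p[1:]] = s      # register or constant identifier
        else:
            bound[p] = s          # bare identifier: memory location
            locations.add(p)
    regs, init = {}, {}

    def target(m):
        sigil, name = m.group(1), m.group(2)
        src = bound[name]
        if sigil == "&":
            return "#" + src      # AArch64 immediate syntax
        if src not in regs:
            regs[src] = f"R{first_reg + len(regs)}"
            if name in locations:
                init[regs[src]] = src   # register must initially hold the location
        return regs[src]

    return re.sub(r"([%&])(\w+)", target, right), init

code, init = convert(("w[] x %r", "STR %r,[%x]"), "w[] z r0")
```

With these assumptions, code is "STR R1,[R2]" and init is {"R2": "z"}, which mirrors the initial value that the text says ends up in the target litmus test.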

16.3  Multiple instruction patterns

Sometimes, a single instruction with a couple of symbolic registers is not enough to express a conversion properly.

A single rule should suffice as an illustration:

"w[] x &c" -> "MOV %tmp,&c;
               STR %tmp,[%x]"

In AArch64, it is not possible to store a constant value directly in memory. We thus have to express the LISA instruction as two AArch64 instructions, using a register picked on the fly.

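This example illustrates two points at the same time: the right-hand side of a rule may consist of several instructions separated by “;”, and an identifier such as “%tmp”, which does not occur in the left-hand side, stands for a temporary register picked by the tool.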

In case of ambiguity, jingle7 chooses the rule to apply according to the order of the rules in the file: the earlier a rule appears, the higher its priority.
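One way to read the rule above is that any register identifier used on the right-hand side but not bound by the left-hand side needs a register picked on the fly. The Python sketch below computes that set; the helper names bound_names and temporaries are hypothetical, and treating every such identifier as a temporary is an assumption drawn from this single example.

```python
import re

def bound_names(left):
    """Names bound by a left-hand side, one instruction per ';'-separated item."""
    names = set()
    for instr in left.split(";"):
        tokens = instr.split()
        for tok in tokens[1:]:            # tokens[0] is the mnemonic
            names.add(tok.lstrip("%&"))
    return names

def temporaries(left, right):
    """Register identifiers of the right-hand side that the left side does not bind."""
    used = set(re.findall(r"%(\w+)", right))
    return used - bound_names(left)

store_const = ("w[] x &c", "MOV %tmp,&c; STR %tmp,[%x]")
tmp_regs = temporaries(*store_const)
target_instructions = [i.strip() for i in store_const[1].split(";")]
```

Here tmp_regs contains only "tmp", since %x is derived from the location x matched on the left, and target_instructions holds the two AArch64 instruction patterns.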

16.4  Multi-level patterns and structured languages

The previous sections cover what is necessary to convert most tests from one assembly language to another. However, we may also want the tool to work on higher-level languages.

The suite currently supports a relevant subset of the C language. This means that the tool must convert not only sequences of instructions but also control structures.

For this purpose, we must allow the expression of chunks of code:

C to LISA

"if(x==constvar:c)
   codevar:t;
 else
   codevar:e;"     -> "mov %test (eq %x &c);
                      b[] %test then;
                      codevar:e;
                      b[] 1 end;
                      then : codevar:t;
                      end :"

This looks awfully like a compilation process (and it is!), but in practice, conditionals are used to express control dependencies, which can have a much simpler form in assembly code.

Now, the important point here is the use of codevar: to state that arbitrary code is expected. Such code will also be converted by the same given set of rules, thus allowing us to convert arbitrarily deep code.

Notice that labels are also treated as identifiers, and that it is perfectly fine to end a pattern with a label, since the tool will see a nop-like instruction there.

The special keyword constvar: is used in C patterns for the obvious reason that & has an entirely different meaning in C.

17  Rewriting algorithm

With a well-defined theme file, we can now leave the burden of converting our thousands of tests to jingle7.

To fully understand its behaviour, we now explore its mechanisms in more detail.

17.1  Rule application and substitution mechanism

The rules we define are nothing but generic patterns: for them to hold any meaning, we have to find an application in the source program. Such an application is simply an instance of the conversion of a part of the source program. The source part must match the pattern on the left side of the rule; the instance of the conversion is the code defined on the right side, plus a set of what we call substitutions.

The substitutions are the link between the identifiers in the rule and their actual representations in both the source and target programs.

To formalise this roughly, $\mathrm{App}(R, P) = (R_{right}, \{(id, Src_{id}, Tgt_{id}) \mid id \text{ is an identifier of } R\})$ is the application of the rule $R$ to a part $P$ of the source program, where $P$ is a possible instance of $R_{left}$ and $Src_{id}$ (respectively $Tgt_{id}$) is the source (respectively target) representation of $id$.
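To make the definition concrete, here is a minimal Python sketch of an application as a data structure, instantiated with the load example of section 16.1. The class name Application and its field names are hypothetical and only mirror the pair described above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (id, Src_id, Tgt_id): a rule identifier with its source and target representations.
Substitution = Tuple[str, str, str]

@dataclass
class Application:
    right: str                         # the right side of the applied rule
    substitutions: List[Substitution]  # one triple per identifier of the rule

# Application of the rule "r[] %x %y" -> "LDR %x,[%y]" to the part "r[] r0 r1".
load_rule = ("r[] %x %y", "LDR %x,[%y]")
app = Application(
    right=load_rule[1],
    substitutions=[("%x", "r0", "R3"), ("%y", "r1", "R4")],
)
```

Note that the application keeps the right side in its abstract form; the identifiers are only replaced by their target representations at a later stage.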

17.2  Linear processing

Now that we have a first step of local rewriting, we want to convert an entire program. Thanks to the relative simplicity of the supported languages, decomposing a source program linearly is a good enough approach for our needs.

The process can be divided into two steps:

Decomposing

The program is decomposed with a greedy algorithm that applies the rules according to their priority order.

A recursive definition would be:

     
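$$
\begin{aligned}
\mathrm{Decomp}(\bullet, Rs) &= \bullet \\
\mathrm{Decomp}(P \mid Source_{rest}, Rs) &= \mathrm{App}(R, P) \mid \mathrm{Decomp}(Source_{rest}, Rs)
\end{aligned}
$$

where $R$ is the highest possible element of the rule set $Rs$ and $P \in \mathrm{Instances}(R_{left})$.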

Of course, we assume that the given rule set is sufficient to ensure that every part of the program will be matched. If not, users have to refine it.
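The greedy decomposition can be sketched recursively in Python, following the definition above. This is a simplified model under several assumptions: rules are pairs (list of left instructions, right side) listed in priority order, registers are assumed to look like r0, r1, ..., constants are assumed to be decimal numbers, and each part only records the source side of its substitutions. The names match_token, match_instruction and decomp are hypothetical.

```python
import re

def match_token(p, s, binding):
    """Match one pattern token against one source token, extending `binding`."""
    if p.startswith("%"):                       # register identifier
        ok = re.fullmatch(r"r\d+", s) is not None
    elif p.startswith("&"):                     # constant identifier
        ok = s.isdigit()
    elif p.isalnum():                           # location identifier
        ok = (re.fullmatch(r"[a-z]\w*", s) is not None
              and re.fullmatch(r"r\d+", s) is None)
    else:                                       # anything else is literal
        return p == s
    return ok and binding.setdefault(p, s) == s

def match_instruction(pattern, instr, binding):
    p, s = pattern.split(), instr.split()
    if len(p) != len(s) or p[0] != s[0]:        # mnemonics must agree
        return False
    return all(match_token(a, b, binding) for a, b in zip(p[1:], s[1:]))

def decomp(program, rules):
    """Greedy decomposition: the first rule (in file order) that matches wins."""
    if not program:
        return []
    for left, right in rules:
        chunk = program[:len(left)]
        binding = {}
        if len(chunk) == len(left) and all(
            match_instruction(p, s, binding) for p, s in zip(left, chunk)
        ):
            return [(right, binding)] + decomp(program[len(left):], rules)
    raise ValueError("no rule matches: " + "; ".join(program))

rules = [(["r[] %x y"], "LDR %x,[%y]"),
         (["w[] x &c"], "MOV %tmp,&c; STR %tmp,[%x]"),
         (["f[dmb,all,sy]"], "DMB SY")]
parts = decomp(["r[] r0 y", "f[dmb,all,sy]", "r[] r1 x"], rules)
```

When no rule applies, the sketch raises an error that lists the remaining instructions, so the first one listed is the instruction that could not be matched.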

Recomposing

The result of the previous step already looks like a converted program. For each part, we still have to replace the abstract identifiers by their representations in the target language, as given by the associated substitutions. We then simply append the results to one another, in order.

     
$$
\begin{aligned}
\mathrm{Recomp}(\bullet) &= \bullet \\
\mathrm{Recomp}((R_{right}, Subs) \mid Parts_{rest}) &= R_{right}[\{id \mapsto Tgt_{id} \mid (id, Src_{id}, Tgt_{id}) \in Subs\}] \mid \mathrm{Recomp}(Parts_{rest})
\end{aligned}
$$

Remark: The substitutions can also include code, since patterns allow it. This code is converted following the same procedure, so any code substitution has the form $(id, Src_{id}, \mathrm{Recomp}(\mathrm{Decomp}(Src_{id}, Rs)))$.
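Recomposition amounts to a textual substitution followed by concatenation. The Python sketch below assumes that each part is a pair (right side, list of (id, Src, Tgt) triples) and that identifiers keep their “%” or “&” prefix; the function name recomp is hypothetical and code substitutions are left out for brevity.

```python
import re

def recomp(parts):
    """Replace each identifier by its target representation, part by part."""
    out = []
    for right, subs in parts:
        table = {ident: tgt for ident, _src, tgt in subs}
        out.append(re.sub(r"[%&]\w+", lambda m: table[m.group(0)], right))
    return out

# The two store applications of thread P0 in the example of section 18.
store = "MOV %tmp,&c; STR %tmp,[%x]"
parts = [
    (store, [("%tmp", None, "X2"), ("&c", "1", "#1"), ("%x", "x", "X0")]),
    (store, [("%tmp", None, "X2"), ("&c", "1", "#1"), ("%x", "y", "X1")]),
]
converted = recomp(parts)
```

Under these assumptions converted is ["MOV X2,#1; STR X2,[X0]", "MOV X2,#1; STR X2,[X1]"]. The temporary %tmp has no source representation, which the sketch marks with None.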

17.3  Representation coherence and environment

There is one key aspect that we have yet to cover. Until now, we assumed that the source and target representations of a substitution are coherent with those of the other substitutions, i.e.:

$$\forall\, id_1, id_2,\quad Src_{id_1} = Src_{id_2} \Rightarrow Tgt_{id_1} = Tgt_{id_2}$$

However, while this property is guaranteed by the rule itself among the substitutions of a single application, ensuring it between different applications is not as obvious, because it would not make sense to compare pattern identifiers from different rules. Thus, we need to keep track of every (Src, Tgt) association made by applications throughout the whole program.

To do so, we define a global environment that preserves the property by delivering a target language representation for each source value:

     
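$$
\begin{aligned}
\mathrm{Get\_repr}(\Gamma, Src_{id}) &= \mathrm{fresh\_repr}(Src_{id}) && \text{if } Src_{id} \notin \Gamma \\
\mathrm{Get\_repr}(Src_{id} \mapsto Tgt_{id} \mid \Gamma, Src_{id}) &= Tgt_{id}
\end{aligned}
$$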

Nothing exotic here: this lookup is part of the application of a rule, as it is supposed to be the only safe way to obtain a target representation.
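A minimal Python sketch of such an environment follows. It assumes one environment per thread and a fresh representation that is simply the next unused register name with a given prefix; the class Environment, the prefix X and this numbering scheme are assumptions, and the sketch records each fresh representation so that later lookups return it.

```python
class Environment:
    """Keeps every (Src, Tgt) association made while converting one thread."""

    def __init__(self, prefix="X"):
        self.table = {}
        self.prefix = prefix

    def fresh_repr(self, src):
        # Hypothetical scheme: the next register name not yet handed out.
        return f"{self.prefix}{len(self.table)}"

    def get_repr(self, src):
        if src not in self.table:
            self.table[src] = self.fresh_repr(src)
        return self.table[src]

env = Environment()
# Source items in the order in which they might be met in a thread.
first_pass = [env.get_repr(s) for s in ["r0", "y", "r1", "x"]]
again = env.get_repr("y")
```

Because every target representation is obtained through get_repr, two occurrences of the same source item always receive the same target register, whichever rule application asks for it.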

18  Example

Considering the LISA litmus test MP+poplainplain+fencedmballsyplainplain.litmus:

LISA MP+poplainplain+fencedmballsyplainplain
"PodWWPlainPlain RfePlainPlain FenceDmbAllSydRRPlainPlain FrePlainPlain"
Cycle=RfePlainPlain FenceDmbAllSydRRPlainPlain FrePlainPlain PodWWPlainPlain
Relax=
Safe=Rfe Fre PodWW FencedRR
Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
Com=Rf Fr
Orig=PodWWPlainPlain RfePlainPlain FenceDmbAllSydRRPlainPlain FrePlainPlain
{
}
 P0      | P1            ;
 w[] x 1 | r[] r0 y      ;
 w[] y 1 | f[dmb,all,sy] ;
         | r[] r1 x      ;
exists
(1:r0=1 /\ 1:r1=0)

and the theme file BelltoAArch64.theme, in which the only rules relevant to the test above are:

"r[] %x y" -> "LDR %x,[%y]"

"w[] x &c" -> "MOV %tmp,&c;
               STR %tmp,[%x]"

"f[dmb,all,sy]" -> "DMB SY"

the output of a call to jingle7 with those files as arguments will be:

AArch64 MP+poplainplain+fencedmballsyplainplain
"PodWWPlainPlain RfePlainPlain FenceDmbAllSydRRPlainPlain FrePlainPlain"
Mapping=1:X2=r1,1:X0=r0
Hash=6945a3af44248d1d826a14b204ccf067
Cycle=RfePlainPlain FenceDmbAllSydRRPlainPlain FrePlainPlain PodWWPlainPlain
Relax=
Safe=Rfe Fre PodWW FencedRR
Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
Com=Rf Fr
Orig=PodWWPlainPlain RfePlainPlain FenceDmbAllSydRRPlainPlain FrePlainPlain

{0:X1=y; 0:X0=x; 1:X3=x; 1:X1=y;}

 P0          | P1          ;
 MOV X2,#1   | LDR X0,[X1] ;
 STR X2,[X0] | DMB SY      ;
 MOV X2,#1   | LDR X2,[X3] ;
 STR X2,[X1] |             ;



exists (1:X0=1 /\ 1:X2=0)

According to the algorithm described in the previous section, jingle7 has converted the three instructions of P1 with, in order, the load rule, the fence rule and the load rule again. For P0, it applies the store rule twice.

Notice that the same AArch64 register X2 is used as the temporary register for both applications. It is safe to do so as long as its value is not needed outside the scope of each application.

The initialisation and test conditions are converted as well: the registers are those of the target architecture, and the locations, directly used in LISA, are now bound to specific registers.

A Mapping metadatum is also added to allow comparison tools to work properly.

19  Usage of jingle7

19.1  Arguments

The command jingle7 handles its arguments as file names, just as herd7 does. Each file is either a single litmus test, when its name has the extension .litmus, or a list of file names, when its name is prefixed by @.

19.2  Options

There is one option that must always be used:

-theme <name>
Read the conversion rules file <name>. By convention, such files have the extension .theme.

General behaviour

-v
Be verbose.
-o <dest>
Instead of printing the result on the standard output, output test files in the existing <dest> directory. Those files have the same name as the input tests.
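For scripted use, a command line can be assembled from the options listed above. The Python helper below is a hypothetical convenience, not part of the tool: it only uses the documented -theme, -v and -o options, and it assumes that options may be placed before the file arguments. The directory name out is also an assumption.

```python
def jingle7_command(theme, tests, dest=None, verbose=False):
    """Build an argument list for jingle7 from the documented options."""
    argv = ["jingle7", "-theme", theme]
    if verbose:
        argv.append("-v")
    if dest is not None:
        argv += ["-o", dest]      # <dest> must be an existing directory
    return argv + list(tests)

argv = jingle7_command(
    "BelltoAArch64.theme",
    ["MP+poplainplain+fencedmballsyplainplain.litmus"],
    dest="out",
)
```

The resulting list can be handed to subprocess.run(argv, check=True) once the out directory exists.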

19.3  Regarding conversion errors

When the tool fails to find a conversion in a program, it prints the remaining instructions. This makes it easy for the user to pin down missing rules in their .theme file, as the first instruction printed is likely the one that cannot be matched.

Using those errors may help to build such a file incrementally, instead of trying to work it out as a whole beforehand.

