Previous Up Next

Part V
Developer documentation

21 Introducing new languages in herdtools7

A language can be a high-level programming language (e.g. C++, Java), an assembly language (e.g. AArch64, x86), or a specialised language (e.g. BPF).

To handle a target language (TL) (e.g. run litmus tests against the herd7 simulator, or against hardware using litmus7, or generate litmus tests using diy7), it is required to introduce the following files in the lib/ folder:

The lib/ folder contains modules that are shared between the three other relevant directories: herd/, gen/ and litmus/.

The TLBase.ml file defines the basic types that all tools (herd7, diy7 and litmus7) use. Using AArch64Base.ml as an example, it defines the following:

It might be further required to define memory orders for the language: an example can be found in lib/memOrder.ml, which is relative to C++.

After defining the lib/ files, the next steps are:

It does not matter which tool you should to update first: some people prioritise the simulator herd7 for rapid modeling development, others prioritise the test harness litmus7 for experimenting with hardware.

21.1 The herd7 tool

In herd/, three files are required:

21.1.1 Monadic operators in herd

Monadic operators are defined in herd/eventsMonad.ml and are used to combine different Effects and add relations among them. Memory accesses and ordering constraints enforced by synchronization operations are examples of Effects.

The monadic operators use various base operators, such as =**=, =*$=, =$$=, +|+, =|=, which are defined in herd/event.ml. These operators provide various ways of combining two Effect graphs.

Effect graphs can either represent the execution of a whole litmus test or be very small, corresponding to some of the Effects contributing to the semantics of a single instruction or command.

The operators build on three definitions:

We detail each in turn below.

Operator para_comp

Unions Effects from both Effect graphs, and their relations.

For example, para_comp is used in the semantics of a store (see do_str and its call in the definition of str in herd/AArch64Sem.ml). A store takes as input a source register rs, which contains the value v to be stored, and a target register rd, which contains the address a to be written to. In do_str, two things happen in parallel using >>|, which makes use of para_comp: an address a is extracted from a register rd (ma in do_str, instantiated to get_ea in str) and a value v is extracted from register rs (via read_reg_data in do_str).

Base operators that build on para_comp are as follows:

Operator data_comp

Introduces a causality link between an Effect graph es1 and an Effect graph es2 (see intra_causality_data). This means that the Effect(s) of es1 is (intrinsically-)before the Effect(s) of es2 in the order of operation of the instruction or command described.

Using the same str example as above, after extracting the value v and the address a in parallel, the value v is written to the address a via do_write_mem in do_str. This last Effect do_write_mem occurs after the two extracting Effects. In the herd diagrams, this is depicted with an iico arrow, which stands for intra-instruction causality order.

Base operators that build on data_comp are as follows:

Base operators that build on control_comp are as follows:

21.2 Memento of frequently used operators

OperatorDescription
>>|Intuitively a >>| b means “do a in parallel with b”. This is used when accesses do not need to be ordered. The code snippet below reads the register rs and in parallel reads the register rd, gets a value v from rs and a location x from rd, and writes that value v back to that memory location x.
(read_reg rs >>| read_reg rd) >>= fun (v,x) -> write_mem x v    
>>||>>|| is similar to >>|, but this operator does not check whether the two argument Effect graphs are disjoint.
>>=Intuitively a >>= b means “do a, collect its result and feed it into b”. The code snippet below reads the register rs, gets a value v, and writes that value v back to memory location x.
read_reg rs >>= fun v -> write_mem x v                          
>>==The operator >>= corresponds to =*$=, while >>== corresponds to =$$=. They both do data_comp over disjoint graphs, with the latter performing transitive closure on iico causality link.
>>*= and >>*==The operator >>*= corresponds to =**=, while >>*== corresponds to =*$$=. They both do control_comp over disjoint graphs, with the latter performing transitive closure on iico_control links.

21.3 The diy7 tool

The gen/ directory is used for the diy7 tool suite which enables one or multiple tests generation.

The algorithm that generates litmus tests from cycles of relaxations is parametric in the architecture, resulting in little to no additions required when introducing a new language.

To add a new backend to the general test generation algorithm, two modules are required:

To better understand TLArch_gen.ml, an example of such file is CArch_gen.ml, with function dump_typ and type exp as example candidates.

To better understand TLCompile_gen.ml, an example of such file is CCompile_gen.ml. It contains the function type_event, which checks whether an extremity is atomic or not, in the sense defined by the architecture. It also contains the function compile_access, which could, for example, output a store using function compile_store as per the architecture definition, when given W as an input. See also functions dump_exp, dump_ins, dump_c_test which print the right syntax. There needs to be an equivalent of these functions for each new language.

21.4 The litmus7 tool

The litmus/ directory contains the sources of the litmus7 tool which enables running litmus tests on hardware.

To enable the litmus7 tool to run litmus tests on hardware, it is required to add new skeletons that the litmus assembly can be slotted into.

In this case, two files are required:

These enable the litmus7 tool to create a program in the programming language that is being introduced and then run it on the target machine.


Previous Up Next