Exploration of machines and models

We ran our dont tool against several different machines and executable models, x86 and Power ones. We provide here the experimental results.

Contents

1  Description of provided files

The presentation of results follows a systematic scheme, which we introduce on the simple example of exploring a 2 core x86 machine. For each experiment, we give:

  1. A brief description of the machine: conti is an Intel Core 2 Duo (2 logical processors) running Linux.
  2. The resulting log file: for instance conti/log. The verdict of the experiment — i.e. the conformance or non conformance of the machine to the model, or the model found by exploration — is at the end of the file. For instance:
    ...
    Observed relaxed: {Rfi, PodWR}
    Observed safe: {Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWW, MFencedWR, MFencedRW, MFencedRR}
    
    That is, we get TSO described as relaxed and safe candidate relaxations.
  3. The configuration file we used: for instance conti.auto. In other words, we ran dont as “dont conti.auto”.
  4. All files produced during the experiment: for instance conti. The link points to a directory that contains subdirectories A, B, etc. Each subdirectory contains:
    1. a configuration file for diy, for instance A.conf;
    2. the tests themselves, for instance A000.litmus, A001.litmus, etc.
    3. the files created by running the tests, for instance A.00 and B.00.
    We provide those subdirectories for completeness, they are not needed for interpreting the verdict of an experiment.

    Nevertheless, one may extract interesting information from subdirectories. For instance, according to the verdict at the end of conti/log, PodWR is relaxed. Now, looking approximately at the middle of the same log file we see:

    ...
    ++ Witness(es) for relaxed PodWR ++
    A002: 'Fre PodWR Fre PodWR' {Fre, PodWR}
    ++++++++
    Observed relaxed: {Rfi, [Rfi,PodRR], PodWR}
    Observed safe: {Rfe, Fre, ...}
    ...
    

    And we know how dont observed PodWR to be relaxed: running test A002, observing the relaxed outcome (check for A002 in file A.00 for instance) and assuming that Fre is safe.

2  Exploration of machines

2.1  Exploration of x86 machines

All experiments produce the same model: TSO exprimed with relax and safe sets:

RelaxedSafe
Rfi, PodWR        Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWR

Here are experiment logs:

2.2  Exploration of Power machines

Given the variety of Power implementations we have tested, we found various models. However, as can be checked by verifying the inclusion of safe sets, all those model conform to the simplified Power model obtained by exploring our axiomatic memory model for Power.

During Power experiments we have detailed dependency candidate relaxations (“dp” in paper) as follows:

DpDatadW
Read to write data dependency: there is a data flow path from a memory load to the data stored by a following memory store.
DpAddrdW
Read to write address dependency: there is a data flow path from a memory load to the effective address of a following memory store.
DpAddrdR
Read to read address dependency: there is a data flow path from a memory load to the data stored by a following memory load.
DpCtrld{R,W}
Control dependency from read to read (resp. write). The execution of a memory load (resp. store) depends on the value read by a preceding memory load. In practice, there is a data flow dependency from a memory load to a following conditional branch, which precedes a following memory load (resp. store).
DpCtrlIsyncdR
Control dependency from read to read, with additional isync barrier between the conditional branch and the following read. Notice that DpCtrlIsyncdW is omitted, as DpCtrldW is always safe and subsumes it.

Notice that the dependency candidate relaxations defined above show only one “R” or “W”. This direction is the one of the target of the candidate relaxation, as the source always is a read and is thus left implicit.

We also performed a complete exploration of the isync barrier by including ISyncdRR, ISyncdRW, ISyncdWR and ISyncdWW in tested candidate relaxations, which we do not show in the paper.

The following table gives the safe sets of candidate relaxations found in all Power experiments:

Safesquale=Rfe1 , Fri, Fre, Wsi, Wse, PodRW, SyncdWW, SyncdWR, SyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri]
Safeabducens=ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, ABCLwSyncdRW, Fri, Fre, Wsi, Wse, PodRW, PodRR, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrldR, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri]
Safevargas=ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, ABCLwSyncdRW, Fri, Fre, Wsi, Wse, PodRW, PodRR, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrldR, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri]
Safepower7=ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, Fri, Fre, Wsi, Wse, PodRW, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri]
Safeppc=ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, Fri, Fre, Wsi, Wse, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri]

It should be noticed that Safeppc defines a Power model slightly weaker (but also more simple) than Power2010. Namely Safeppc allows some behaviours that Safeppc rejects. This is due mostly to the sophisticated (RMO-inspired) “preserved program order” defined in “Fences in Weak Memory Models” (Fig. 6).

As for any machines m Safeppc is included in Safem, we draw one important conclusion from experiments: all tested machines conform to our model: Said otherwise, no behaviour forbidden by ppc will2 show up on the tested Power machines.

One may also notice some particularities of the various implementations of the Power architecture:

2.3  Exploration of ARM dual-core machines

We test two machines: a Toshiba Folio 100 tablet and a Trim-Slice computer. Both machines are build around a NVIDIA Tegra 2 chip featuring a dual-core Cortex-A9 ARM designed CPU.

It is to be noticed that there are two cores only, which limits the scope of our tests. In particular, there is no test on two cores that targets the Rfe candidate relaxation. In other words multi-copy atomicity cannot be tested testable a dual core setting. For simplicity we assume Rfe to be safe.

The experiment verdict is the same for both machines. We give the complete classification of candidate relaxations as either relaxed or safe:

RelaxedSafe
Rfi, [Rfi,DpAddrdR],PodWW, PodWR, PodRW, PodRR, ISBdWW, ISBdWR, ISBdRW, ISBdRR, DpCtrldRRfe, Fre, Wse, Fri, Wsi, DMBdWW, DMBdWR, DMBdRW, DMBdRR, DSBdWW, DSBdWR, DSBdRW, DSBdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsbdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpDatadW;Wsi]

Notice that we here use refined candidate relaxations, similar to the ones of the Power experiments above, with the instruction synchronisation barrier isb being the ARM equivalent of Power isync.

Form this preliminary experiment, we draw a few conclusions:

3  Exploration of models

3.1  Operational model for Power

We automatically show that the (operational) model Power model introduced in “Understanding POWER Multiprocessors” (Power2011 for short), does not conform to the axiomatic model introduced in “Fences in Weak Memory Models” (Power2010 for short). To that aim we use dont in conformance mode. Given a safe set, dont in conformance mode generates tests using safe candidate relaxations only and runs the tested machine (or executable model) on them, attempting to find tests that invalidate the safe set.

A complete account of our comparison of two Power models is available as another document.

3.2  Model x86-CC+fences

x86-CC+fences the x86-CC model of “The Semantics of x86-CC Multiprocessor Machine Code”augmented with an mfence instruction that maintains write-read pairs in program order, following the semantics sketched in the x86-CC paper.

First, we automatically show that X86-CC+fences does not conform to TSO: experiment log: cc/log (look at the end for counter-examples, test A055 is sb+fence), configuration file: cc.auto and all files: cc.

We also explore X86-CC+fences, resulting in the discovery that PodRR is relaxed and that MFencedRR is still relaxed, thereby automatically finding a shortcoming of X86-CC+fences. Experiment log: cc/log configuration file: cc.auto and all files: cc.

To prove that X86-CC+fences is not a valid model of x86 machines, an additional experiment is required — see this other document.


1
Notice that for squale where Rfe is safe, we have omitted cumulativity relaxations. For instance, ACSyncdRR = [Rfe;SyncdRR] is absent, which makes sense since Rfe and SyncdRR are safe by themselves.
2
with usual disclaimer due to black box testing
3
Accordingly CtrldR is also safe for Power6

This document was translated from LATEX by HEVEA.