Exploration of machines and models |
We ran our dont tool against several different machines and executable models, x86 and Power ones. We provide here the experimental results.
The presentation of results follows a systematic scheme, which we introduce on the simple example of exploring a 2 core x86 machine. For each experiment, we give:
... Observed relaxed: {Rfi, PodWR} Observed safe: {Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWW, MFencedWR, MFencedRW, MFencedRR}That is, we get TSO described as relaxed and safe candidate relaxations.
Nevertheless, one may extract interesting information from subdirectories. For instance, according to the verdict at the end of conti/log, PodWR is relaxed. Now, looking approximately at the middle of the same log file we see:
... ++ Witness(es) for relaxed PodWR ++ A002: 'Fre PodWR Fre PodWR' {Fre, PodWR} ++++++++ Observed relaxed: {Rfi, [Rfi,PodRR], PodWR} Observed safe: {Rfe, Fre, ...} ...
And we know how dont observed PodWR to be relaxed: running test A002, observing the relaxed outcome (check for A002 in file A.00 for instance) and assuming that Fre is safe.
All experiments produce the same model: TSO exprimed with relax and safe sets:
Relaxed | Safe | |
Rfi, PodWR | Rfe, Fre, Wse, PodWW, PodRW, PodRR, MFencedWR |
Here are experiment logs:
Given the variety of Power implementations we have tested, we found various models. However, as can be checked by verifying the inclusion of safe sets, all those model conform to the simplified Power model obtained by exploring our axiomatic memory model for Power.
During Power experiments we have detailed dependency candidate relaxations (“dp” in paper) as follows:
Notice that the dependency candidate relaxations defined above show only one “R” or “W”. This direction is the one of the target of the candidate relaxation, as the source always is a read and is thus left implicit.
We also performed a complete exploration of the isync barrier by including ISyncdRR, ISyncdRW, ISyncdWR and ISyncdWW in tested candidate relaxations, which we do not show in the paper.
The following table gives the safe sets of candidate relaxations found in all Power experiments:
Safesquale | = | Rfe1 , Fri, Fre, Wsi, Wse, PodRW, SyncdWW, SyncdWR, SyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri] |
Safeabducens | = | ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, ABCLwSyncdRW, Fri, Fre, Wsi, Wse, PodRW, PodRR, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrldR, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri] |
Safevargas | = | ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, ABCLwSyncdRW, Fri, Fre, Wsi, Wse, PodRW, PodRR, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrldR, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri] |
Safepower7 | = | ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, Fri, Fre, Wsi, Wse, PodRW, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, ISyncdRW, ISyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri] |
Safeppc | = | ACSyncdRW, ABCSyncdRW, ACSyncdRR, ACLwSyncdRW, Fri, Fre, Wsi, Wse, SyncdWW, BCSyncdWW, SyncdWR, SyncdRW, BCSyncdRW, SyncdRR, LwSyncdWW, LwSyncdRW, BCLwSyncdRW, LwSyncdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsyncdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpAddrdR,Fri] |
It should be noticed that Safeppc defines a Power model slightly weaker (but also more simple) than Power2010. Namely Safeppc allows some behaviours that Safeppc rejects. This is due mostly to the sophisticated (RMO-inspired) “preserved program order” defined in “Fences in Weak Memory Models” (Fig. 6).
As for any machines m Safeppc is included in Safem, we draw one important conclusion from experiments: all tested machines conform to our model: Said otherwise, no behaviour forbidden by ppc will2 show up on the tested Power machines.
One may also notice some particularities of the various implementations of the Power architecture:
By contrast PowerG5 and Power6 are significantly less relaxed implementations of the model.
We test two machines: a Toshiba Folio 100 tablet and a Trim-Slice computer. Both machines are build around a NVIDIA Tegra 2 chip featuring a dual-core Cortex-A9 ARM designed CPU.
It is to be noticed that there are two cores only, which limits the scope of our tests. In particular, there is no test on two cores that targets the Rfe candidate relaxation. In other words multi-copy atomicity cannot be tested testable a dual core setting. For simplicity we assume Rfe to be safe.
The experiment verdict is the same for both machines. We give the complete classification of candidate relaxations as either relaxed or safe:
Relaxed | Safe |
Rfi, [Rfi,DpAddrdR],PodWW, PodWR, PodRW, PodRR, ISBdWW, ISBdWR, ISBdRW, ISBdRR, DpCtrldR | Rfe, Fre, Wse, Fri, Wsi, DMBdWW, DMBdWR, DMBdRW, DMBdRR, DSBdWW, DSBdWR, DSBdRW, DSBdRR, DpAddrdW, DpAddrdR, DpDatadW, DpCtrldW, DpCtrlIsbdR, [DpAddrdR;Fri], [DpAddrdW;Wsi], [DpDatadW;Wsi] |
Notice that we here use refined candidate relaxations, similar to the ones of the Power experiments above, with the instruction synchronisation barrier isb being the ARM equivalent of Power isync.
Form this preliminary experiment, we draw a few conclusions:
We automatically show that the (operational) model Power model introduced in “Understanding POWER Multiprocessors” (Power2011 for short), does not conform to the axiomatic model introduced in “Fences in Weak Memory Models” (Power2010 for short). To that aim we use dont in conformance mode. Given a safe set, dont in conformance mode generates tests using safe candidate relaxations only and runs the tested machine (or executable model) on them, attempting to find tests that invalidate the safe set.
A complete account of our comparison of two Power models is available as another document.
x86-CC+fences the x86-CC model of “The Semantics of x86-CC Multiprocessor Machine Code”augmented with an mfence instruction that maintains write-read pairs in program order, following the semantics sketched in the x86-CC paper.
First, we automatically show that X86-CC+fences does not conform to TSO: experiment log: cc/log (look at the end for counter-examples, test A055 is sb+fence), configuration file: cc.auto and all files: cc.
We also explore X86-CC+fences, resulting in the discovery that PodRR is relaxed and that MFencedRR is still relaxed, thereby automatically finding a shortcoming of X86-CC+fences. Experiment log: cc/log configuration file: cc.auto and all files: cc.
To prove that X86-CC+fences is not a valid model of x86 machines, an additional experiment is required — see this other document.
This document was translated from LATEX by HEVEA.