The Phat ExperimentJade Alglave Luc Maranget |
Abstract: We ran The Phat Experiment from December 2009 to January 2010. We tested 3 Power machines with our diy tool. We present here the experimental protocol and results.
Tests in plain, nc, ac and bc are allowed to yield Ok, thereby confirming our hypothesis on the relaxations of the Power architecture. We call the corresponding relaxations expected relaxations.
By contrast, tests in safe should never yield Ok. We call the corresponding relaxations non-relaxations.
Reflecting that aim, expected relaxation are tested individually, while non-relaxations are tested together. See the example of X86 in diy manual.
litmus src/@all
…During our experiment on hpcx, some tests in safe yielded Ok — see Sec. 2.2. We then followed a different procedure — see Sec. 5
Relax | Definition1 | hpcx | squale | vargas |
PosRR | rℓ →po r′ℓ | 2/40M | 3/2M | 0/4745M |
PodRR | rℓ →po r′ℓ′ | 2275/320M | 12/2M | 0/16725M |
PodRW | rℓ →po w′ℓ′ | 8653/400M3 | 178/2M3 | 0/6305M |
PodWW | wℓ →po w′ℓ′ | 2029/4M | 2299/2M | 2092501/32M |
PodWR | wℓ →po r′ℓ′ | 51085/40M | 178286/2M | 672001/32M |
Rfi | →rfi | 7286/4M | 1133/2M | 145/32M |
Rfe | →rfe | 177/400M | 0/1776M | 9/21M |
LwSyncsWR | wℓ →lwsync r′ℓ | 243423/600M | 2/40M | 385/32M |
LwSyncdWR | wℓ →lwsync r′ℓ′ | 103814/640M | 11/2M | 117670/32M |
ACLwSyncsRR | wℓ →rfe r′ℓ →lwsync r″ℓ | 11/320M | 0/960M | 1/21M |
ACLwSyncdRR | wℓ →rfe r′ℓ →lwsync r″ℓ′ | 124/400M | 0/7665M | 2/21M |
BCLwSyncsWW | wℓ →lwsync w′ℓ →rfe r″ℓ | 68/400M | 0/560M | 2/160M |
BCLwSyncdWW | wℓ →lwsync w′ℓ′ →rfe r″ℓ′ | 158/400M | 0/11715M | 1/21M |
When a relaxed relaxation is confirmed, the array cell above holds a link to the log of one run of one litmus test that confirms the relaxation. The cell also shows the number of outcomes that yielded Ok/total number of outcomes, for the selected log.
When a relaxed relaxation is not confirmed, the array cell holds 0/total number of outcomes for all tests of all performed complete runs.
Incidentally, those anomalies explain why we selected such a high number of outcomes per run for hpcx.
squale is a PowerPC G5 with 4 cores, running Mac Os X (a Power Mac G5).
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
PosRR | n/a | posrr1 | [003] 3/2M | 00 | [003] DpdR Fre SyncdWW Rfe DpdR PosRR |
Rfi | p. 661, 1st col, 1st bullet | rfi | [000] 388/2M, [001] 140/2M, [002] 1133/2M | 00 | [000] DpdR Fre Rfi DpdR Fre Rfi |
PodRR | n/a | podrr1 | [000] 12/2M | 00 | [000] Fre SyncdWW Rfe PodRR |
PodRW | n/a | podrw1 | 0/1796M3 | Logs: 01, 02, 03, 04, 05, 06, 07, 08, 09, 10,11,12 | |
PodWW | n/a | podww | [000] 2299/2M [001] 3/2M | 00 | [000] Wse PodWW Wse PodWW |
PodWR | n/a | podwr | [000] 178286/2M [001] 93911/2M | 00 | [000] Fre PodWR Fre PodWR |
Rfe | n/a | rfe | 0/1776M | Logs: 01, 02, 03, 04, 05, 06, 07, 08, 09, 10,11,12 |
When an expected relaxation is confirmed, the relevant line in the table above gives the number of Ok states observed (and the number of outcomes collected) for a few typical tests, with a link to the complete litmus log of the run that yielded Ok. When an expected relaxation is not confirmed, we rather give the total number of outcomes collected for this relaxation, and links to all logs. This number may vary a lot, since it critically depends on the number of tests produced by diy. See for instance ACLwSyncsRR and ACLwSyncdRR below.
We notice that no relaxation of Rfe is observed.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
LwSyncsWR | p. 700, 1st col, L=1 case | lwswr | [002] 2/40M | 04 | [002] DpdR Fre LwSyncsWR DpdR Fre LwSyncsWR |
LwSyncdWR | p. 700, 1st col, L=1 case | lwdwr | [020] 11/2M, [020] 81/40M | 01,02 | [020] Fre LwSyncdWR Fre LwSyncdWR |
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
ACLwSyncsRR | p. 700, 2nd col, 3rd bullet | aclwsrr | 0/960M | Logs: 04, 05, 06, 07, 08, 09, 10,11,12 | |
ACLwSyncdRR | p. 700, 2nd col, 3rd bullet | aclwdrr | 0/7665M | Logs: 02, 03, 04, 05, 06, 07, 08, 09, 10,11,12 |
Not observing expected cumulativity relaxations is compatible with the result of experiments on Rfe.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
BCLwSyncsWW | p. 700, 2nd col, 3rd bullet | bclwsww | 0/560M | Logs: 02, 04, 05, 06, 07, 08, 09, 10,11,12 | |
BCLwSyncdWW | p. 700, 2nd col, 3rd bullet | bclwdww | 0/11715M | Logs: 02, 03, 04, 05, 06, 07, 08, 09, 10,11,12 |
Again, we did not observed the expected cumulativity relaxations.
The nouts entry above counts the total number of final states (or outcomes) collected for a given test in safe. The number of outcomes varies from test to test, depending on the number of threads in the test. It also varies from log to log due to changes in litmus parameters. This number is to be compared to the number of outcomes needed to observe a relaxation (typically a few millions).
The last column shows the total number of outcomes collected.
vargas is a Power 6 with 112 nodes of 32 cores. The operating system is AIX. We run our experiments on one node, i.e. on 32 cores.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
PosRR | n/a | posrr1 | 0/4745M | Logs: 00, 01, 02, 03, 04, 10, 11, 12, 13, 14, 15 | |
Rfi | p. 661, 1st col, 1st bullet | rfi | [000] 948/32M [001] 377/32M [002] 145/32M | 00 | [002] DpdW Wse Rfi DpdW Wse Rfi |
PodRR | n/a | podrr | 0/16725M | Logs: 00, 01, 02, 03, 04, 05 10, 11, 12, 13, 14, 15 | |
PodRW | n/a | podrw | 0/6305M | Logs: 00, 01, 02, 03, 04, 10, 11, 12, 13, 14, 15 | |
PodWW | n/a | podww | [000] 2092501/32M [001] 37638/21M | 00 | [000] Wse PodWW Wse PodWW |
PodWR | n/a | podwr | [000] 672001/32M, [001] 5762/21M | 00 | [000] Fre PodWR Fre PodWR |
Rfe | n/a | rfe | [002] 1/21M [003] 9/21M | 00 | [002] DpdW Wse Rfe DpdW Rfe |
Our experiments do not confirm PodRR, which is confirmed on Power 5. This may be related to Power 6 being more in-order.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
LwSyncsWR | p. 700, 1st col, L=1 case | lwswr | [001] 17/21M [002] 385/32M [003] 5/21M | 00 | [003] DpdR Fre LwSyncsWR DpdR Fre LwSyncsWR |
LwSyncdWR | p. 700, 1st col, L=1 case | lwdwr | [020] 117670/32M | 00 | [020] Fre LwSyncdWR Fre LwSyncdWR |
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
ACLwSyncsRR | p. 700, 2nd col, 3rd bullet | aclwsrr | [000] 1/21M [002] 3/16M | 00 | [000] DpdR Fre SyncdWW Wse Rfe LwSyncsRR |
ACLwSyncdRR | p. 700, 2nd col, 3rd bullet | aclwdrr | [010] 1/16M [017] 2/21M | 00 | [010] Fre Rfe LwSyncdRR Fre Rfe LwSyncdRR |
It is to be noticed that the test aclwdrr010 is iriw with lwsync.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
BCLwSyncsWW | p. 700, 2nd col, 3rd bullet | bclwsww | [000] 2/160M | 13 | [000] DpdR Fre LwSyncsWW Rfe DpdR Fre LwSyncsWW Rfe |
BCLwSyncdWW | p. 700, 2nd col, 3rd bullet | bclwdww | [011] 1/21M | 01 | [011] LwSyncdRR Fre LwSyncdWW Wse LwSyncdWW Rfe |
The nouts entry above counts the total number of final states (or outcomes) collected for a given test in safe. The number of outcomes varies from test to test, depending on the number of threads in the test. It also varies from log to log due to changes in litmus parameters. This number is to be compared to the number of outcomes needed to observe a relaxation (typically 16–32 millions).
Due to the discovery of relaxations that do not follow our Power model, we have altered our procedure:
As a result, we find anomalous relaxations for A-cumulativity, and none for B-cumulativity nor safe plain. Thus, we may venture to assume that the problem originates in either our model of A-cumulativity, or in hpcx implementation of A-cumulativity.
hpcx is a Power 5 with 160 nodes of 16 physical cores. The operating system is AIX. We run our experiments on one node, i.e. on 16 cores, with the possibility to enable SMT.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
PosRR | n/a | posrr1 | [003] 2/40M, [005] 1/40M | 00 | [003] DpdR Fre SyncdWW Rfe DpdR PosRR |
Rfi | p. 661, 1st col, 1st bullet | rfi | [000] 15/4M, [001] 2848/4M, [002] 7286/4M | 00 | [000] DpdR Fre Rfi DpdR Fre Rfi |
PodRR | n/a | podrr12 | [000] 54328/640M, [001] 2275/320M, [002] 3596/320M,… | 02 | [000] Fre SyncdWW Rfe PodRR |
PodRW | n/a | podrw1 | 0/31680M3 | Logs: 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15, 16, 17 | |
PodWW | n/a | podww | [000] 2029/4M, [001] 26/4M | 00 | [001] Wse PodWW Wse PodWW Wse PodWW |
PodWR | n/a | podwr | [000] 5714/40M, [001] 51085/40M | 00 | [001] Fre PodWR Fre PodWR Fre PodWR |
Rfe | n/a | rfe2 | [002] 21/400M, [003] 177/400M, [004] 8/320M, [005] 6/320M, [006] 29/320M | 00 | [002] DpdW Wse Rfe DpdW Rfe |
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
LwSyncsWR | p. 700, 1st col, L=1 case | lwswr2 | [000] 137/230M, [001] 757/400M, [002] 243423/640M, [003] 944/400M | 04 | [002] DpdR Fre LwSyncsWR DpdR Fre LwSyncsWR |
LwSyncdWR | p. 700, 1st col, L=1 case | lwdwr2 | [020] 103814/640M | 02 | [020] Fre LwSyncdWR Fre LwSyncdWR |
We confirm expected relaxations.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
ACLwSyncsRR | p. 700, 2nd col, 3rd bullet | aclwsrr2 | [000] 11/320M | 19 | [000] DpdR Fre Rfe LwSyncsRR DpdR Fre Rfe LwSyncsRR |
ACLwSyncdRR | p. 700, 2nd col, 3rd bullet | aclwdrr2 | [000] 2/320M, [001] 3/320M, [002] 33/400M, [003] 2/320M,… | 02 | [002] Fre SyncdWW Wse Rfe LwSyncdRR |
All the remaining cases of A-cumulativity are anomalously relaxed.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
ACSyncsRR | p. 700, 2nd col, 3rd bullet | acssrr2 | [002] 1/320M | 04 | [002] DpdR Fre Rfe SyncsRR DpdR Fre Rfe SyncsRR |
ACSyncdRR | p. 700, 2nd col, 3rd bullet | acsdrr2 | [017] 2/400M | 02 | [017] Fre SyncdWR Fre Rfe SyncdRR |
ACSyncdRW | p. 700, 2nd col, 3rd bullet | acsdrw2 | [027] 0/400M + [027] 1/400M = 1/800M | 02, 03 | [027] Wse LwSyncdWW Wse Rfe SyncdRW |
ACLwSyncdRW | p. 700, 2nd col, 3rd bullet | aclwdrw2 | [031] 1/400M | 02 | [031] Wse SyncdWR Fre Rfe LwSyncdRW |
We easily confirm the expected relaxations involving the B-cumulativity of lwsync.
Documentation [2] | Sources | Ok | Log(s) | Example of cycle | |
BCLwSyncsWW | p. 700, 2nd col, 3rd bullet | bclwsww2 | [000] 7/320M, [001] 3/320M, [002] 6/320M, [003] 19/400M,… | 02 | [003] DpdW Wse SyncdWR Fre LwSyncsWW Rfe |
BCLwSyncdWW | p. 700, 2nd col, 3rd bullet | bclwdww2 | [001] 11/320M, [002] 8/320M, [003] 11/320M, [004] 19/400M,… | 02 | [020] pdR Fre LwSyncdWW Rfe DpdR Fre LwSyncdWW Rfe |
None of the remaining cases of B-cumulativity is observed to be relaxed.
Relax | Sources | Logs | nouts | total |
BCSyncsRW | bcssrw2 | 08, 09, 10, 11, 12, 13, 14, 16, 17 | 3360M–6400M | 172960M |
BCSyncdRW | bcsdrw2 | 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 16, 17 | 5280M–10560M | 22440M |
BCSyncsWW | bcssww2 | 02, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 16, 17 | 4640M–6200M | 66640M |
BCSyncdWW | bcsdww2 | 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 16, 17 | 5280M–10560M | 245520M |
BCLwSyncsRW | bclwsrw2 | 08, 09, 10, 11, 12, 13, 14, 16, 17 | 3360M–6400M | 172960M |
BCLwSyncdRW | bclwdrw2 | 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 16, 17 | 5280M–10560M | 22440M |
The safe set is restricted to plain relaxations.
To confirm the relaxation of PodRW, which our models predicts,
we used an advanced feature of diy:
composite relaxations.
Namely, the configuration
file podrwposwr.conf
sets the relax list to [PodRW,PosWR]
.
That is, PodRW followed by PosWR.
Observe that PosWR is not safe, hence absent from the cycles of standard experiments on PodRW. The litmus tests are in podrwposwr.
Machine | Ok | Log(s) | Example of cycle |
squale | [023] 178/2M | 13 | [0231] DpdR Fre SyncdWW Rfe PodRW PosWR |
vargas | 0/1244M | Logs: 00 | |
hpcx | [003] 179/320M,…[030] 8653/400M,… | 202 | [0301] DpdR Fre SyncdWR Fre SyncdWW Rfe PodRW PosWR |
The experiments show the sequence “PodRW followed by PosWR” to be relaxed on squale and hpcx. If PodRW was not relaxed, then PodRW followed by PosWR would not be either, by the definition of preserved programming order we give in the associated paper. As the sequence is observed to be relaxed, we consider that PodRW is observed to be relaxed.
This document was translated from LATEX by HEVEA.