A Compositional Approach to Verifying Hierarchical Cache Coherence Protocols
Xiaofang Chen (1), Yu Yang (1), Ganesh Gopalakrishnan (1), Ching-Tsun Chou (2)
(1) University of Utah; (2) Intel Corporation
* Supported in part by Intel SRC Customization Award 2005-TJ-1318
FMCAD Hierarchical Cache Coherence Protocols Chip-level protocols Inter-cluster protocols Intra-cluster protocols dir mem dir mem …
FMCAD Verification Challenges: no public-domain benchmarks; more complicated, with more corner cases and a larger state space
FMCAD Outline Two hierarchical protocols Inclusive Non-inclusive A compositional approach Abstraction Counter-example guided refinement Soundness
FMCAD A Multicore Coherence Protocol (figure: Home Cluster, Remote Cluster 1 and Remote Cluster 2, each with two L1 Caches, an L2 Cache + Local Dir, and a RAC; Global Dir; Main Memory)
FMCAD Protocol Features Both levels use MESI protocols Level-1: FLASH Level-2: DASH Silent drop on non-Modified cache lines Network channels are non-FIFO
FMCAD Livelock Problem (figure: message sequence among Dir, Agent1 and Agent2; cache states Invld and Excl): 1. Req_E; 2. Grant_E; 3. Silent-drop; 4. Req_S; 5. Fwd_Req; 6. NACK
FMCAD Blocking WB + NACK_SD (figure: message sequence among Dir, A1 and A2). Messages and events: Req_E, Gnt_E, Req_S, Modify, WB, Fwd_S, WB_Ack, NAck_SD, NAck. Cache states: (I), (E), (M), (I)
FMCAD Complexity of the Protocol: multiplicative effect of four protocols running concurrently; model checking failed after more than 161,876,000 states
FMCAD Outline Two hierarchical protocols Inclusive Non-inclusive A compositional approach Abstraction Counter-example guided refinement Soundness
FMCAD A Compositional Approach Constraining Original protocol Abstraction … Abstracted protocol
FMCAD Non-Circular Assume/Guarantee
We cannot verify directly: h ║ r1 ║ r2 ╞ Coh
Instead:
  Check-1: h ║ R1 ║ R2 ╞ Coh1 ∧ Constraints1
  Check-2: H ║ r1 ║ R2 ╞ Coh2 ∧ Constraints2
FMCAD Verification Methodology Abstraction Two abstracted protocols Fixing real bugs in M Refinement
FMCAD Abstracted Protocol #1 RAC L2 Cache+Local Dir’ Global Dir Main Memory Home Cluster Remote Cluster 1 Remote Cluster 2 RAC L2 Cache+Local Dir L1 Cache L1 Cache RAC L2 Cache+Local Dir’
FMCAD Abstracted Protocol #2 RAC L2 Cache+Local Dir’ Global Dir Main Memory Home Cluster Remote Cluster 1 Remote Cluster 2 RAC L2 Cache+Local Dir L1 Cache L1 Cache RAC L2 Cache+Local Dir’
FMCAD Abstraction States Projection Transitions Overapproximation
FMCAD Abstraction on States Intra-cluster details Inter-cluster details
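The state projection can be illustrated with a small Python sketch (Python is used here only for illustration; the slides use Murphi). The dictionary layout, the field names `rac`, `l2_state`, `l1`, and the helper `make_cluster` are hypothetical. The sketch assumes that the first abstracted protocol keeps the home cluster in full detail and the second keeps remote cluster 1, as suggested by the Check-1 and Check-2 formulas.

```python
from typing import Any, Dict, Set


def project_state(state: Dict[str, Any], concrete: Set[str]) -> Dict[str, Any]:
    """Keep every field of the clusters named in `concrete`; drop the L1
    caches (the intra-cluster detail) of all other clusters."""
    clusters = {}
    for name, cluster in state["clusters"].items():
        if name in concrete:
            clusters[name] = dict(cluster)
        else:
            clusters[name] = {k: v for k, v in cluster.items() if k != "l1"}
    return {"global_dir": state["global_dir"], "clusters": clusters}


def make_cluster(l2_state: str) -> Dict[str, Any]:
    # Hypothetical cluster layout: RAC, L2/local directory state, two L1 caches.
    return {"rac": "Inval", "l2_state": l2_state, "l1": ["Invld", "Invld"]}


full_state = {
    "global_dir": "Excl",
    "clusters": {
        "home": make_cluster("Invld"),
        "remote1": make_cluster("Excl"),
        "remote2": make_cluster("Invld"),
    },
}

m1_state = project_state(full_state, {"home"})     # assumed view of M1
m2_state = project_state(full_state, {"remote1"})  # assumed view of M2
print(sorted(m1_state["clusters"]["remote1"]))     # ['l2_state', 'rac']
```

Inter-cluster fields (the global directory, the RAC and the L2 state) survive in every cluster; only the L1 caches of abstracted clusters disappear.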
FMCAD Abstracting Transitions
Rule-based system: guard → action
Abstraction steps: relax guards; relax expression values; remove statements
Original rule:
  Procs[p].WbMsg.Cmd = WB_Wb → Procs[p].L2.Data := Procs[p].WbMsg.Data; Procs[p].L2.HeadPtr := L2; …
Abstracted rule:
  true → Procs[p].L2.Data := d; …
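A minimal Python sketch of why relaxing a guard over-approximates the concrete behavior. The two-value data domain, the command names in `WB_CMDS`, and the flat state layout are assumptions made for illustration. Only the visible part of the rule is modeled; whatever the elided "…" does is left out.

```python
from itertools import product

DATA = (0, 1)                   # hypothetical two-value data domain
WB_CMDS = ("WB_None", "WB_Wb")  # hypothetical command values


def project(s):
    """Abstract view: only the L2 fields survive; WbMsg is dropped."""
    return (s["l2_data"], s["head_ptr"])


def concrete_successors(s):
    """Concrete rule: WbMsg.Cmd = WB_Wb -> L2.Data := WbMsg.Data; L2.HeadPtr := L2."""
    if s["wb_cmd"] == "WB_Wb":
        return [dict(s, l2_data=s["wb_data"], head_ptr="L2")]
    return []


def abstract_successors(a):
    """Abstracted rule: true -> L2.Data := d (for any d); L2.HeadPtr := L2."""
    return {(d, "L2") for d in DATA}


def over_approximates():
    """Every concrete step must be matched by an abstract step."""
    for wb_cmd, wb_data, l2_data, head in product(WB_CMDS, DATA, DATA, ("L1", "L2")):
        s = {"wb_cmd": wb_cmd, "wb_data": wb_data, "l2_data": l2_data, "head_ptr": head}
        allowed = abstract_successors(project(s))
        if any(project(t) not in allowed for t in concrete_successors(s)):
            return False
    return True


print(over_approximates())  # True
```

The abstract rule also allows steps that the concrete rule never takes, which is where false alarms can come from.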
FMCAD Detecting Bugs in M: when a real error is found in an abstracted protocol M_i, fix the bug in M, regenerate the M_i's, and iterate the process
FMCAD Refinement
When a bogus error is found in an abstracted protocol M_i:
  Analyze the trace and find the problematic rule: g → a
  Locate the original rule in M: G → A
  Add a new lemma in one abstracted protocol: G => P
  Strengthen the rule into: g ∧ P → a
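The strengthening step can be sketched in Python as follows. The `Rule` class, the helper names `strengthen` and `lemma_holds`, and the state fields are hypothetical; the sketch only shows the shape of the step, turning g → a into g ∧ P → a and checking G => P separately.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

State = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    guard: Callable[[State], bool]
    action: Callable[[State], State]


def strengthen(rule: Rule, lemma: Callable[[State], bool]) -> Rule:
    """Turn g -> a into (g AND P) -> a."""
    return Rule(lambda s: rule.guard(s) and lemma(s), rule.action)


def lemma_holds(states: Iterable[State], original_guard, lemma) -> bool:
    """Check G => P on a set of states of the model that kept the detail."""
    return all(lemma(s) for s in states if original_guard(s))


def l2_is_excl(s: State) -> bool:
    # Candidate lemma P: the L2 line is exclusive.
    return s["l2_state"] == "Excl"


# Abstracted rule g -> a with g = true: the L2 data may change arbitrarily.
abstract_rule = Rule(lambda s: True, lambda s: dict(s, l2_data="new"))
refined_rule = strengthen(abstract_rule, l2_is_excl)

bogus = {"l2_state": "Invld", "l2_data": "old"}
print(abstract_rule.guard(bogus), refined_rule.guard(bogus))  # True False
```

The state that produced the bogus trace no longer enables the refined rule, while the lemma itself still has to be discharged in the abstracted protocol that retains the original guard.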
FMCAD Details of Refinement (I)
1. False alarm found in M1: remote cluster 1 can modify its L2 line arbitrarily
Abstracted rule: true → …
FMCAD Details of Refinement (II)
Locate the original rule in M before abstraction
Guard: when the local dir receives a WB from an L1 cache
Original rule: Procs[p].WbMsg.Cmd = WB → …
FMCAD Details of Refinement (III)
Strengthen the problematic rule in M1: L2 may modify its line only when the local dir is exclusive
Strengthened rule: true & Procs[p].L2.State = Excl → …
FMCAD Details of Refinement (IV): why is strengthening sound?
FMCAD Details of Refinement (V)
We can add a new lemma in M2:
  Procs[p].WbMsg.Cmd = WB => Procs[p].L2.State = Excl
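Checking such a lemma amounts to checking an invariant over the reachable states of the model that still contains the L1 detail. The Python sketch below does this by breadth-first search on a toy single-cluster model. The model, its four rules and the state encoding are hypothetical and far simpler than the protocol in the slides; only the lemma has the same shape.

```python
from collections import deque

# Toy single-cluster model; a state is (l1_state, l2_state, wb_cmd).
INIT = ("Invld", "Invld", "None")

RULES = [
    # L1 is granted exclusive ownership; the L2/local dir records Excl.
    (lambda s: s == ("Invld", "Invld", "None"), lambda s: ("Excl", "Excl", "None")),
    # L1 writes the line back: a WB message is now in flight.
    (lambda s: s[0] == "Excl", lambda s: ("Invld", s[1], "WB")),
    # The local dir consumes the WB (guard of the original rule).
    (lambda s: s[2] == "WB", lambda s: (s[0], s[1], "None")),
    # L2 gives up the line once nothing is in flight.
    (lambda s: s == ("Invld", "Excl", "None"), lambda s: ("Invld", "Invld", "None")),
]


def reachable(init, rules):
    """Explicit-state breadth-first search over guard/action rules."""
    seen = {init}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        for guard, action in rules:
            if guard(s):
                t = action(s)
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
    return seen


def lemma(s):
    """WbMsg.Cmd = WB => L2.State = Excl."""
    return s[2] != "WB" or s[1] == "Excl"


states = reachable(INIT, RULES)
print(len(states), all(lemma(s) for s in states))  # 4 True
```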
FMCAD One Detail Excl: 1 Home Cluster Remote Cluster 1Remote Cluster 2 Excl Invld Req_E2 Req_E3 Fwd_ReqE 4 Fwd_ReqE5 Gnt_E
FMCAD Original Transitions (I)
  GUniMsg[src].Cmd = RDX_RAC
  & GUniMsg[src].Cluster = r
  & Procs[r].L2.Gblock_WB = false
  & Procs[r].L2.State = Excl
  & Procs[r].L2.HeadPtr != L2
  …
  undefine GUniMsg[src];
  GUniMsg[src].Cmd := GUNI_None;
FMCAD Original Transitions (II)
  Procs[r].ShWbMsg.Cmd = SHWB_FAck & src_node = L2 …
  true
  & ABSProcs[r].L2.State = Excl
  & ABSProcs[r].RAC.State = Inval
  & ABSProcs[r].L2.Gblock_WB = false
  & GUniMsg[src].Cmd = RDX_RAC
  & GUniMsg[src].Cluster = p
  …
FMCAD Adding A Variable Excl: 1 Home Cluster Remote Cluster 1Remote Cluster 2 Excl Invld ifKeepMsg: boolean
FMCAD Soundness of the Approach. Goal: if M1 and M2 are model checked correct w.r.t. the coherence property Φ of M, then M is also correct w.r.t. Φ
FMCAD Soundness Proof: temporal induction. Initial states: each variable has the same value in M, M1 and M2; each newly added lemma is checked in M1 and M2; each property is checked. Inductive step: suppose soundness holds in state s
FMCAD Soundness Proof (II)
  M:  (h1, h2, r11, r12, r21, r22)  --[ g → a ]-->  (h1’, h2’, r11’, r12’, r21’, r22’)
  M1: (h1, h2, r12, r22)  --[ g1 & p1 → a1 ]-->  (h1’, h2’, r12’, r22’)
  M2: (h1, r11, r12, r22)  --[ g2 & p2 → a2 ]-->  (h2’, r11’, r12’, r22’)
FMCAD Experiment Results A real bug found 10 iterations of refinements The size of each error trace is < 12 One person-day of work
FMCAD Reduction
  Protocol    Number of states
  M           > 161,876,000
  M1          31,919,219
  M2          78,689,678
Platform: 64-bit Murphi on IA-64 with 20GB of memory
FMCAD Outline Two hierarchical protocols Inclusive Non-inclusive A compositional approach Abstraction Counter-example guided refinement Soundness
FMCAD Caching Hierarchy Inclusive Exclusive Non-inclusive
FMCAD A Non-Inclusive Hierarchical Protocol (figure: Home Cluster, Remote Cluster 1 and Remote Cluster 2, each with two L1 Caches, an L2 Cache + Local Dir, and a RAC; Global Dir; Main Memory)
FMCAD Protocol Differences Broadcasting channels RAC L2 Cache+Local Dir L1 Cache L1 Cache SnoopMsg[]
FMCAD Imprecise Local Directory LDir L1-1 GDir Req_S (S) S: L1-1 L1-2 (I) Swap Broadcast NAck Fwd_Req Gnt_S S: L1-2 Imprecision!
FMCAD Verification Difficulty. Coherence properties: can involve multiple L1 caches. Refinement: noninterference lemmas cannot infer L2 cache line states from local behaviors
FMCAD An Example Excl Invld Excl Invld WB L2: (Excl, data1) (Excl, data2) L2: (Invld, *) (Excl, data2)
FMCAD Two Approaches of Refinement Inferring “exclusive” from Outside the cluster Inside the cluster
FMCAD Infer exclusive From Outside (figure: cluster p with L1 states Invld, Excl, Invld and a WB; L2: (Invld, *), (Excl, data2))
IsExcl(p) ≡
    Dir.State = Excl
  & GUniMsg[p].Cmd != (ACK || IACK || ImACK)
  & GUniMsg[h].Cmd != (ACK || IACK || ImACK)
  & GWbMsg.Cmd = GWB_None
  & (   (GShWbMsg.Cmd = GSHWB_None & Dir.Headptr = p)
     || (GShWbMsg.Cmd = DXFER & GShWbMsg.Cluster = p))
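A direct Python transcription of IsExcl(p), as a sketch. The nested-dictionary encoding of the global state is hypothetical, and `h` is assumed to denote the home cluster; `!= (ACK || IACK || ImACK)` is read as "is none of these commands".

```python
ACKS = {"ACK", "IACK", "ImACK"}


def is_excl(g, p, h):
    """IsExcl(p): cluster p is exclusive, judged only from global
    (inter-cluster) state. `h` is assumed to be the home cluster."""
    return (
        g["Dir"]["State"] == "Excl"
        and g["GUniMsg"][p]["Cmd"] not in ACKS
        and g["GUniMsg"][h]["Cmd"] not in ACKS
        and g["GWbMsg"]["Cmd"] == "GWB_None"
        and (
            (g["GShWbMsg"]["Cmd"] == "GSHWB_None" and g["Dir"]["Headptr"] == p)
            or (g["GShWbMsg"]["Cmd"] == "DXFER" and g["GShWbMsg"]["Cluster"] == p)
        )
    )


# Hypothetical global state: the directory says Excl and points at cluster "r1".
g = {
    "Dir": {"State": "Excl", "Headptr": "r1"},
    "GUniMsg": {"r1": {"Cmd": "GUNI_None"}, "home": {"Cmd": "GUNI_None"}},
    "GWbMsg": {"Cmd": "GWB_None"},
    "GShWbMsg": {"Cmd": "GSHWB_None", "Cluster": None},
}
print(is_excl(g, "r1", "home"))  # True
```

An in-flight acknowledgement to `p` or `h`, or a pending global writeback, makes the predicate false, so it only fires when ownership has settled.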
FMCAD Refinement Example (figure: cluster p with L1 states Invld, Excl, Invld and a WB; L2: (Invld, *), (Excl, data2))
Lemma: p.WbMsg.Cmd = WB => IsExcl(p)
Refined L2: (Invld & IsExcl(p), *), (Excl, data2)
FMCAD Infer exclusive From Inside (figure: M1, M2)
FMCAD Definition of IE
IE(p):
     exists i: L1_caches
       (   p.L1(i).state = Excl
        or p.SnoopMsg(i).Cmd = (Put or PutX)
        or p.UniMsg(i).Cmd = PutX)
  or p.WbMsg.Cmd = WB
  or p.ShWbMsg.Cmd = ShWb
  or p.ShWbMsg.Cmd = FAck
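The IE predicate reads naturally as an existential over the L1 caches followed by checks on the cluster-level messages. The Python sketch below uses a hypothetical dictionary encoding of one cluster, and it assumes the quantifier covers only the parenthesized disjunction, which is how the slide is punctuated.

```python
def ie(cluster):
    """IE(p): some evidence inside cluster p that the cluster holds the
    line exclusively, even if the L2 line itself is Invld."""
    l1_evidence = any(
        cluster["L1"][i]["state"] == "Excl"
        or cluster["SnoopMsg"][i]["Cmd"] in ("Put", "PutX")
        or cluster["UniMsg"][i]["Cmd"] == "PutX"
        for i in range(len(cluster["L1"]))
    )
    return (
        l1_evidence
        or cluster["WbMsg"]["Cmd"] == "WB"
        or cluster["ShWbMsg"]["Cmd"] in ("ShWb", "FAck")
    )


def idle_cluster(n_l1=2):
    # Hypothetical idle cluster: every L1 is Invld and no message is in flight.
    return {
        "L1": [{"state": "Invld"} for _ in range(n_l1)],
        "SnoopMsg": [{"Cmd": "None"} for _ in range(n_l1)],
        "UniMsg": [{"Cmd": "None"} for _ in range(n_l1)],
        "WbMsg": {"Cmd": "None"},
        "ShWbMsg": {"Cmd": "None"},
    }


p = idle_cluster()
print(ie(p))             # False
p["WbMsg"]["Cmd"] = "WB"
print(ie(p))             # True
```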
FMCAD Refinement (figure: cluster p with L1 states Invld, Excl, Invld and a WB; L2: (Invld, *), (Excl, data2))
Lemma: Procs[p].WbMsg.Cmd = WB & Procs[p].L2.State = Invld => IE(p)
Refined L2: (Invld & IE(p), *), (Excl, data2)
FMCAD Soundness: still holds after adding the extra bits “IE”
FMCAD Experiment Results
17 iterations of refinements; the size of each error trace is < 8
  Protocol    Number of states
  M           > 1,521,900,000
  M1          234,478,105
  M2          283,124,383
FMCAD Outline Two hierarchical protocols Inclusive Non-inclusive A compositional approach Abstraction Counter-example guided refinement Soundness
FMCAD Conclusion Developed 2-level hierarchical protocols Proposed a compositional approach Abstraction Bug fixing Refinement Proved the soundness
FMCAD Related Work
  Chou et al., A simple method for parameterized verification of cache coherence protocols, FMCAD’04
  McMillan, Verification of infinite state systems by compositional model checking, CHARME’99
FMCAD For Details
FMCAD A Multicore Coherence Protocol (figure: Home Cluster, Remote Cluster 1 and Remote Cluster 2, each with two L1 Caches, an L2 Cache + Local Dir, and a RAC; Global Dir; Main Memory)
FMCAD About the Bug IACK
FMCAD Another Decomposing Approach Split protocols hierarchically Intra-cluster protocol Inter-cluster protocol
FMCAD Intra-cluster Protocol RAC L2 Cache+Local Dir L1 Cache L1 Cache Cluster Environment
FMCAD Inter-cluster Protocol (figure: Home Cluster, Remote Cluster 1 and Remote Cluster 2, each reduced to a RAC and an L2 Cache + Local Dir’; Global Dir; Main Memory)
FMCAD Verification Difficulty Environment RAC L2 Cache+Local Dir L1 Cache L1 Cache Global Dir Main Memory Home ClusterRemote Cluster 1Remote Cluster 2 RAC L2 Cache+Local Dir L1 Cache L1 Cache RAC L2 Cache+Local Dir L1 Cache L1 Cache
FMCAD An Example Scenario Excl: 1 Home Cluster Remote Cluster 1Remote Cluster 2 Excl Invld NACK 1 Req_E2 Req_E3 Fwd_ReqE 4 Swap5 Req_E6 Fwd_ReqE 7