Verification of Hierarchical Cache Coherence Protocols for Future Processors
Student: Xiaofang Chen
Advisor: Ganesh Gopalakrishnan
2 Outline Background Proposed solutions –High level hierarchical coherence protocol verification –Refinement check: specifications vs. RTL implementations Conclusion
3 Hierarchical Cache Coherence Protocols Chip-level protocols Inter-cluster protocols Intra-cluster protocols [Figure labels: dir, mem, dir, mem, …]
4 Modeling and Verification of Coherence Protocols High-level modeling approaches –Model checking Low-level modeling: RTL or VHDL –Simulation
5 Problems with Hierarchical Coherence Protocols For high-level modeling –Handle the complexity of hierarchical protocols For RTL implementations –Verify that an RTL implementation correctly implements its specification
6 Example: Verification Complexity (I) [Figure: Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L1 Cache, Main Mem, Global Dir]
7 Example: Verification Complexity (II) Tool: Murphi Verification –IA-64 machine –18 GB memory –40-bit hash compaction –Inconclusive after >30 hours of state enumeration
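To make the "state enumeration with hash compaction" setup concrete, here is a minimal Python sketch of explicit-state reachability that stores a 40-bit signature per state instead of the full state. It is not Murphi's actual implementation; the names `signature`, `enumerate_states`, and the toy two-counter system are assumptions made for illustration only.

```python
import hashlib
from collections import deque

def signature(state, bits=40):
    """Compress a state to a `bits`-bit signature (hash compaction)."""
    digest = hashlib.sha256(repr(state).encode()).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)

def enumerate_states(initial, successors, bits=40):
    """Breadth-first reachability that remembers signatures, not full states."""
    seen = {signature(initial, bits)}
    frontier = deque([initial])
    visited = 0
    while frontier:
        state = frontier.popleft()
        visited += 1
        for nxt in successors(state):
            sig = signature(nxt, bits)
            if sig not in seen:  # a signature collision would silently skip a state
                seen.add(sig)
                frontier.append(nxt)
    return visited

# Hypothetical toy system: two counters modulo 4, each step increments one of them.
def toy_successors(state):
    a, b = state
    return [((a + 1) % 4, b), (a, (b + 1) % 4)]

count = enumerate_states((0, 0), toy_successors)
```

Compaction trades a small probability of missed states for a much smaller visited table, which is why it is used when the full state vector is too large to store for every reachable state.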
8 Differences in Modeling: Specs vs. Impls One step in high-level Multiple steps in low-level [Figure labels: home, client, buf, local cache]
9 Differences in Execution: Specs vs. Impls Interleaving in HL Concurrency in LL
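The difference between interleaving and concurrency can be sketched in a few lines of Python. This is an illustrative model only: the two rules and the variables `x` and `y` are hypothetical, and the concurrent step assumes all rules read the same old state and that no two rules write the same variable.

```python
def interleaved_step(state, rule):
    """High-level semantics: exactly one rule fires atomically per step."""
    new_state = dict(state)
    new_state.update(rule(state))
    return new_state

def concurrent_step(state, rules):
    """Low-level semantics: all rules read the old state; writes land together."""
    new_state = dict(state)
    for rule in rules:
        new_state.update(rule(state))  # each rule still reads the old `state`
    return new_state

def copy_y_to_x(s):
    return {"x": s["y"]}

def copy_x_to_y(s):
    return {"y": s["x"]}

start = {"x": 1, "y": 2}
seq = interleaved_step(interleaved_step(start, copy_y_to_x), copy_x_to_y)
par = concurrent_step(start, [copy_y_to_x, copy_x_to_y])
```

With interleaving, the second rule sees the first rule's write, so both variables end up equal; with concurrency, the two values are swapped. The same rules therefore produce different behaviors under the two execution models.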
10 Proposed Mechanisms For high level modeling, develop –A few M-CMP coherence protocols –A compositional approach For specifications vs. implementations, develop –A formal theory –A compositional approach –A practical tool
12 Outline Background Proposed solutions –High level hierarchical coherence protocol verification –Refinement check: specifications vs. RTL implementations Conclusion
13 An M-CMP Benchmark Protocol [Figure: Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L1 Cache, Main Mem, Global Dir; protocol levels marked Inter-cluster and Intra-cluster]
14 Protocol Features Both levels use MESI protocols –Intra-cluster: FLASH –Inter-cluster: DASH Silent drop on non-Modified cache lines Network channels are non-FIFO Inclusive caches
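One reason non-FIFO channels add to the verification burden is that every delivery order of in-flight messages has to be considered. A minimal Python sketch, assuming a channel is modeled as the set of orders in which it may deliver what was sent (the message names are hypothetical):

```python
from itertools import permutations

def delivery_orders(sent, fifo):
    """Return every order in which the channel may deliver the sent messages."""
    if fifo:
        return {tuple(sent)}
    return set(permutations(sent))

sent = ["Req", "Inv", "Ack"]  # hypothetical message names
fifo_orders = delivery_orders(sent, fifo=True)
non_fifo_orders = delivery_orders(sent, fifo=False)
```

The FIFO channel admits a single order, while the non-FIFO channel admits all permutations, and the protocol must be correct for each of them.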
15 Another Benchmark: Non-inclusive Caches [Figure: Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L1 Cache, Main Mem, Global Dir]
16 Our Compositional Approach Original protocol
17 Our Compositional Approach
18 One Way to Decompose Protocols Create three abstract protocols Each with 1 detailed cluster + 2 abstracted clusters
19 Abstract Protocol #1 [Figure: Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache, Main Mem, Global Dir]
20 Abstract Protocol #2 [Figure: Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache, Main Mem, Global Dir]
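21 Problems with This Approach Every abstract protocol contains 2 protocols Duplicated behaviors in abstract protocols State space still large
[Table residue: column headers are # of states, Time (hour), Mem (GB); surviving fragments are ",088,425M1M1" and ",613,051M2M"; the remaining values were lost in extraction.]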
22 Second Way to Decompose Protocols [Figure: three abstract protocols labeled ABS #1, ABS #2, ABS #3; clusters labeled Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache, Main Mem, Global Dir]
23 Model Checking Results
24 Details of Our Approach Abstraction –States –Transitions, properties Constraining –Assume guarantee reasoning
25 Abstraction on States Intra-cluster Inter-cluster
26 State Representation [Figure: panels captioned Original cluster and Abstract clusters; component labels: L1s, Network, L2, Local Dir, Local Dir’, RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache]
27 Abstracting Transitions and Properties Rule: guard → action guard –Become more permissive action –Allow more behaviors
28 An Example of Abstraction
Original rule
  guard: Clusters[c].WbMsg.Cmd = WB
  action: Clusters[c].L2.Data := Clusters[c].WbMsg.Data; Clusters[c].L2.HeadPtr := L2; …
Abstracted rule
  guard: True
  action: Clusters[c].L2.Data := nondet; …
[Figure labels: RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache, WB; Abstract inter-cluster protocol, Abstract intra-cluster protocol]
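The writeback example on this slide can be checked on a toy model: the abstracted rule should allow every behavior of the original rule once the hidden variables are projected away. The Python sketch below is a simplification under stated assumptions: a two-value data domain, a state made of only `(WbMsg.Cmd, WbMsg.Data, L2.Data)`, and the `HeadPtr` update omitted. The function names are hypothetical.

```python
DATA = (0, 1)  # assumed two-value data domain

def concrete_successors(state):
    """Original rule: fires only on a WB message and copies its data into L2."""
    wb_cmd, wb_data, l2_data = state
    if wb_cmd == "WB":
        return {(wb_cmd, wb_data, wb_data)}
    return set()

def abstract_successors(l2_data):
    """Abstracted rule: guard is True, L2 data becomes any value (nondet)."""
    return set(DATA)

def project(state):
    """Drop the writeback-message variables hidden by the abstraction."""
    return state[2]

all_states = [(cmd, d, l2) for cmd in ("WB", "None") for d in DATA for l2 in DATA]

# Every concrete step must be matched by some abstract step.
simulated = all(
    project(nxt) in abstract_successors(project(s))
    for s in all_states
    for nxt in concrete_successors(s)
)
```

The check succeeds because the guard was weakened to `True` and the assigned value was widened to the whole domain, which is the over-approximation the slide describes.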
29 Abstraction, Now Constraining
30 An Example of Constraining
Original guard: Clusters[c].WbMsg.Cmd = WB
Constraint: Clusters[c].L2.State = Excl
Constrained abstract rule
  guard: True & Clusters[c].L2.State = Excl
  action: Clusters[c].L2.Data := nondet; …
[Figure labels: RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache, WB]
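A small Python sketch of the constraining step, under stated assumptions: the abstract state is just `(L2.State, L2.Data)`, the state names other than `Excl` are hypothetical, and the constraint is read as an assumption that must separately be shown to hold on the detailed model (here a hand-written list of concrete states stands in for that check).

```python
DATA = (0, 1)                          # assumed two-value data domain
L2_STATES = ("Invld", "Shrd", "Excl")  # only "Excl" appears in the source

def abstract_wb(state, constrained):
    """Abstracted writeback rule; returns the set of successor states."""
    l2_state, _ = state
    guard = True
    if constrained:
        guard = guard and l2_state == "Excl"
    if not guard:
        return set()
    return {(l2_state, d) for d in DATA}

def assumption_holds(concrete_states):
    """Obligation on the detailed model: a WB message implies L2 is Excl."""
    return all(l2 == "Excl" for (cmd, l2) in concrete_states if cmd == "WB")

states = [(s, d) for s in L2_STATES for d in DATA]
loose = sum(len(abstract_wb(s, constrained=False)) for s in states)
tight = sum(len(abstract_wb(s, constrained=True)) for s in states)
```

The constrained rule has fewer behaviors than the unconstrained one, which is what removes spurious counterexamples; the price is the extra obligation that the assumption is justified elsewhere.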
31 Non-inclusive Protocols: History Variables [Figure: Home Cluster, Remote Cluster 1, Remote Cluster 2; component labels: RAC, L2 Cache+Local Dir, L2 Cache+Local Dir’, L1 Cache, Main Mem, Global Dir]
32 Experimental Results
33 Outline Background Proposed solutions High level hierarchical coherence protocol verification –Refinement check: specifications vs. RTL implementations Conclusion
34 Our Approach Use a hardware language –Hardware Murphi Develop a formal theory of refinement check Develop a compositional approach –Abstraction –Assume guarantee Develop a practical tool
35 Hardware Murphi
Murphi extension by S. German and G. Janssen
A concurrent shared variable language
–On each cycle
  Multiple transitions execute concurrently
  Exclusive write to a variable
  Shared reads to variables
  Write immediately visible within the same transition
  Write visible to other transitions on the next cycle
Supports transactions, signals, etc.
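The per-cycle rules listed on this slide can be sketched as a tiny Python simulator. This is an illustrative model, not Hardware Murphi itself: `run_cycle`, the `write` callback, and the `producer`/`consumer` transitions are hypothetical names.

```python
def run_cycle(state, transitions):
    """Execute all transitions in one cycle and return the next-cycle state."""
    pending = {}  # writes collected from all transitions in this cycle
    for transition in transitions:
        local = dict(state)  # shared reads: each transition sees the old state
        writes = {}

        def write(var, value, local=local, writes=writes):
            local[var] = value   # visible immediately inside this transition
            writes[var] = value

        transition(local, write)
        for var, value in writes.items():
            if var in pending:   # exclusive write: one writer per variable
                raise ValueError(f"write-write conflict on {var!r}")
            pending[var] = value
    next_state = dict(state)
    next_state.update(pending)   # writes reach other transitions next cycle
    return next_state

def producer(local, write):
    write("buf", local["src"] + 1)
    write("src", local["buf"])   # reads its own fresh write to buf

def consumer(local, write):
    write("dst", local["buf"])   # still sees the old buf during this cycle

result = run_cycle({"src": 5, "buf": 0, "dst": -1}, [producer, consumer])
```

Here `producer` observes its own write to `buf` within the same transition, while `consumer` reads the value from the start of the cycle. Two transitions writing the same variable in one cycle raise an error, mirroring the exclusive-write rule.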
36 Transaction
Group multiple steps in impl
Transaction
  Rule-1 ….
  …
  Rule-6 …
End;
37 Workflow of Our Refinement Check [Figure labels: Hardware Murphi Impl model, Murphi Spec model, Product model in Hardware Murphi, Muv, Product model in VHDL, Property check] Check that the low-level model correctly implements the high-level model
38 Full List of Assertions for Refinement Check
1. Serializability for specifications
2. No write-write conflicts
3. Initial states containment
4. Write set variables containment
5. Enabledness for specifications
6. Joint variables match at the end of transactions
39 An Example
Impl transaction:
Transaction
  Rule-1 guard1 action1;
  Rule-2 guard2 action2;
  Rule-3 guard3 action3;
End;
Spec rule:
Rule spec_guard spec_action;
40 An Example (Cont’d)
Transaction
  Rule-1 guard1 action1; assert spec_guard; spec_action;
  Rule-2 guard2 action2;
  Rule-3 guard3 action3;
End;
assert impl_var1 = spec_var1;
assert impl_var2 = spec_var2;
…
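The instrumented transaction can be mimicked in Python to show what the generated assertions check. This is a sketch under stated assumptions, not the tool's output: the spec rule fires right after the first impl rule, as on the slide, and the toy home/client/buf model, `run_transaction`, and all variable names are hypothetical.

```python
def spec_guard(spec):
    return spec["req"]

def spec_action(spec):
    spec["cache"] = spec["mem"]  # the spec serves the request in one step
    spec["req"] = False

def run_transaction(impl, spec, rules, joint_vars):
    """Run the impl rules in order, firing the spec rule after the first one."""
    for index, (guard, action) in enumerate(rules):
        assert guard(impl), f"impl rule {index + 1} not enabled"
        action(impl)
        if index == 0:
            assert spec_guard(spec), "spec rule not enabled"
            spec_action(spec)
    for var in joint_vars:  # joint variables must match at the end
        assert impl[var] == spec[var], f"mismatch on {var}"

# The impl needs two steps: home fills the buffer, then the client drains it.
rules = [
    (lambda s: s["req"], lambda s: s.update(buf=s["mem"])),
    (lambda s: s["buf"] is not None,
     lambda s: s.update(cache=s["buf"], buf=None, req=False)),
]

impl = {"req": True, "mem": 7, "buf": None, "cache": 0}
spec = {"req": True, "mem": 7, "cache": 0}
run_transaction(impl, spec, rules, joint_vars=["req", "mem", "cache"])
```

An implementation bug that corrupts `cache` in the second step would trip the final joint-variable assertion even if no coherence property mentions it, which is the kind of check the refinement approach constructs automatically.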
41 Driving Benchmark [Figure: nodes labeled Local, Home, Remote; component labels: Buf, Dir, Cache, Mem, Router] S. German and G. Janssen, IBM Research Tech Report 2006
42 Bugs Found with Refinement Check
Benchmark satisfies cache coherence already
Bugs still found
–Bug 1: router unit loses messages
–Bug 2: home unit replies twice for one request
–Bug 3: cache unit gets updated twice from one reply
Refinement check is an automatic way of constructing checks
43 Model Checking Approaches Monolithic –Straightforward property check Compositional –Divide and conquer [Figure labels: Product model in VHDL, Monolithic, Compositional]
44 Compositional Refinement Check Reduce the verification complexity Basic Techniques –Abstraction Removing details to make verification easier –Assume guarantee A simple form of induction which introduces assumptions and justifies them
45 In More Detail Abstraction –Change variables to free input variables –E.g., change a latch to a free input signal Assume guarantee –(spec.Var = impl.Var) holds –Assume it for reads of a transaction
46 Experimental Results [Chart: Verification Time versus Datapath (1-bit, 10-bit) for the Monolithic approach and the Compositional approach; time labels: 30 min, 1-day] Configurations –2 nodes, 2 addresses, SixthSense
47 Outline Background Proposed solutions High level hierarchical coherence protocol verification Refinement check: specifications vs. RTL implementations Conclusion
49 Thank you.
50 Related Work Parameterized verification –Chou et al. Bluespec –Arvind et al. Aggregation of distributed actions –Park and Dill Compositional verification –Many previous works including McMillan, Jones, etc.