Testing HCN for PRAM
Michael Jones, Ganesh Gopalakrishnan
University of Utah, School of Computing

Outline
Goals:
– re-use an abstraction for branching topologies
– combine test model checking and abstraction
How HCN works
What was verified and how
Discussion

HCN
Directory-based hierarchical caching network.
Obeys sequential consistency (SC). PRAM is weaker than SC, so HCN should also satisfy PRAM.
Written by Arvind and Xiaowei Shen.
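The claim that PRAM is weaker than SC can be illustrated with a brute-force check on very small histories. The sketch below is in Python and is not part of the original slides. It assumes the usual definitions: SC needs one legal serialization of all operations, while PRAM only needs, for each processor, a legal serialization of its own operations together with the writes of the others. The function names and the example histories are hypothetical.

```python
def interleavings(seqs):
    """Yield every merge of the sequences that keeps each sequence's own order."""
    if all(not s for s in seqs):
        yield []
        return
    for i, s in enumerate(seqs):
        if s:
            rest = seqs[:i] + [s[1:]] + seqs[i + 1:]
            for tail in interleavings(rest):
                yield [s[0]] + tail


def legal(order, init=0):
    """A serialization is legal if every read returns the latest written value."""
    mem = {}
    for kind, addr, val in order:
        if kind == "wr":
            mem[addr] = val
        elif mem.get(addr, init) != val:
            return False
    return True


def is_sc(history):
    """SC: a single legal serialization of all operations exists."""
    return any(legal(o) for o in interleavings(list(history)))


def is_pram(history):
    """PRAM: each processor has a legal view of its own ops plus all other writes."""
    for p, own in enumerate(history):
        views = [own if q == p else [op for op in other if op[0] == "wr"]
                 for q, other in enumerate(history)]
        if not any(legal(o) for o in interleavings(views)):
            return False
    return True


# Two processors sharing one address A (assumed example, not from the slides).
weak = [[("wr", "A", 1), ("rd", "A", 2)],
        [("wr", "A", 2), ("rd", "A", 1)]]
# The second processor sees the first processor's writes out of order.
bad = [[("wr", "A", 1), ("wr", "A", 2)],
       [("rd", "A", 2), ("rd", "A", 1)]]
```

Here `weak` is accepted by `is_pram` but rejected by `is_sc`, while `bad` is rejected by `is_pram`. The enumeration is exponential, so this only works for toy histories and is no substitute for test model checking.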

HCN Model
[Figure: HCN network of processors (P) and memories (M)]

HCN Model
[Figure: HCN network of processors (P) and memories (M), with M0, M1, and M2 labeled. Messages shown: wr_req(a,2), ex-req(a)]

Testing for PRAM
Any 3 processors
– located anywhere in any HCN network
– sharing a single address
always satisfy PRAM.
Abstraction to cover all networks.
Test model check for PRAM with N=3.

Testing for PRAM
# Procs sharing address:     3
# Procs in system:           arbitrary
# Caches in system:          arbitrary
# Addresses being shared:    1
# Addresses in system:       arbitrary
Property:                    memory model

Abstraction Recipe
1. Throw away enough transactions and structure, and...
2. Merge enough structure to get a finite-state model.
3. Add enough non-determinism to get the same behavior on the remaining observed state.
(Inspired by trace inclusion refinement)
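The recipe can be sketched on a toy system. The Python example below is an assumption for illustration only and does not model HCN: the concrete system keeps a hidden hop counter for a message in transit, the abstract system throws that counter away and instead lets the observed value change non-deterministically, and the final check is trace inclusion on the observed state.

```python
def concrete_step(state):
    """Hypothetical concrete system: a message needs 3 hops before the value flips."""
    value, hops = state
    if hops > 0:
        return [(value, hops - 1)]   # message in transit, observed value unchanged
    return [(1 - value, 3)]          # message arrives, observed value flips


def abstract_step(value):
    """Abstract system: the hop counter is gone, so the value may stay or flip."""
    return [value, 1 - value]


def traces(step, init, length, observe=lambda s: s):
    """Return all observed traces with `length` states, starting from `init`."""
    frontier = {((observe(init),), init)}
    for _ in range(length - 1):
        frontier = {(tr + (observe(n),), n)
                    for tr, s in frontier
                    for n in step(s)}
    return {tr for tr, _ in frontier}


concrete = traces(concrete_step, (0, 3), 6, observe=lambda s: s[0])
abstract = traces(abstract_step, 0, 6)
# Trace inclusion: every observed concrete behavior is also an abstract behavior.
included = concrete <= abstract
```

The abstract model admits more behaviors than the concrete one, which is what makes it a safe over-approximation: a property that holds on every abstract trace also holds on every concrete trace.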

Why the Recipe Works
For some class of protocols, a “nice amount” of non-determinism is required to capture all behaviors of the observed state in the reduced model.

HCN Abstraction
[Figure sequence, four frames: memories (M) with M0, M1, and M2 labeled; processors P, Q, P appear from the second frame; the last frame shows fewer memories and no M2]

Merging Linear State
[Figure: a linear chain of memories M, M, M, ...]

HCN Abstraction
[Figure: memories (M) with M0 and M1 labeled; processor labels P, Q, P and P, P, Q]

|{Finite State Configs}| is Finite
[Figure: configurations with processor labels PPQPPQPPQ]

Modeling a TRS in Murϕ

Rule "receive wb rep and send sh rep"
  (trec[addr].req = sh_req &
   hd_in.opc = wb_rep &
   hd_in.addr = addr &
   state[addr] = ex_w &
   (current_writer(addr,m) = hd_in.src))
==>
var rep_msg : tMsg;
begin
  rep_msg.opc := sh_rep;
  rep_msg.src := m;
  rep_msg.dst := trec[addr].id;
  rep_msg.addr := addr;
  rep_msg.data := hd_in.data;
  enqueue (outq, rep_msg);
  state[addr] := ex_r;
  add_to_dir (addr, trec[addr].id, m, dir);
  add_to_dir (addr, hd_in.src, m, dir);
  clearTrec (addr, trec);
  delete (inq, 0);
end;

receive-wb-rep-and-send-sh-rep
<id, Cell(a,u,(Ex,W(idk)))|m, Msg(idk,id,Wbrep,a,v)+i, o, Trec(a,(idp,Sh-req))|t>
→ <id, Cell(a,v,(Ex,R(idk|idj)))|m, i, o+Msg(id,idj,sh-rep,a,v), t>
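For readers unfamiliar with the guard ==> action shape of such a rule, the same rule can be sketched in Python. This is only an illustration of the rule as it appears on the slide, not the authors' model: the `Memory` and `Msg` classes, their field names, and the use of plain dictionaries for `state`, `trec`, and the directory are assumptions, and `current_writer` is approximated by a dictionary lookup.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Msg:
    opc: str
    src: int
    dst: int
    addr: str
    data: int


@dataclass
class Memory:
    ident: int
    state: dict = field(default_factory=dict)    # addr -> "ex_w", "ex_r", ...
    writer: dict = field(default_factory=dict)   # addr -> child with the exclusive copy
    dirs: dict = field(default_factory=dict)     # addr -> set of children sharing addr
    trec: dict = field(default_factory=dict)     # addr -> (requester id, pending request)
    inq: deque = field(default_factory=deque)
    outq: deque = field(default_factory=deque)


def guard(m, addr):
    """True when the rule 'receive wb rep and send sh rep' is enabled for addr."""
    if not m.inq or addr not in m.trec:
        return False
    hd = m.inq[0]
    return (m.trec[addr][1] == "sh_req" and hd.opc == "wb_rep"
            and hd.addr == addr and m.state.get(addr) == "ex_w"
            and m.writer.get(addr) == hd.src)


def fire(m, addr):
    """Apply the rule's action; call only when guard(m, addr) holds."""
    hd = m.inq[0]
    requester = m.trec[addr][0]
    m.outq.append(Msg("sh_rep", m.ident, requester, addr, hd.data))
    m.state[addr] = "ex_r"
    m.dirs.setdefault(addr, set()).update({requester, hd.src})
    del m.trec[addr]     # corresponds to clearTrec
    m.inq.popleft()      # corresponds to delete (inq, 0)
```

A model checker in this style repeatedly picks any rule whose guard holds and fires it, exploring all such choices. The sketch mirrors only the statements visible in the rule above, so it does not update the stored data value even though the rewriting rule changes the cell value from u to v.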

Testing for PRAM
[Figure: test automata connected to the model checker. Edge labels: wr(A,0), wr(A,1), wr(A,2), rd(A,-), rd(A,0), rd(A,1); two states are marked E]

Inadvertently Seeded Error

Model Checking Results
Configurations: PPQ, PQP, PPQ
Columns: States, CPU time (sec)
Surviving values: 110,... ; Total: 881,...

Discussion
– There is at least one error in which topology matters.
– The abstraction carried over nicely to a non-PCI protocol.
– N=4 and 2 addresses: both too big.
  – can only explore several million states per model
– Abstraction + test model checking = more general results.

Inadvertently Seeded Error
[Figure sequence, six frames; labels per frame:]
1. read&miss, sh-req
2. read&miss, sh-req; write&miss, ex-req
3. read&miss, sh-req; write&miss, ex-req; write&miss, ex-req
4. read&miss, sh-req; write&miss, ex-req; write&miss, ex-req; ex-req(2); values 1, 0, 2
5. read&miss, sh-req; write&miss, ex-req; write&miss, ex-rep; values 1, 0, 2:0
6. read&miss, sh-req; write&miss, ex-req; write&miss, wb-req; ex-req(1); values 1, 0, 2:0

Cache State Encoding
[Figure: a memory M holds cells (cell ... cell); each cell has the fields State, Address, Value, split into a Cache part and a Home part]

“Cstate”: shared or exclusive with respect to siblings (“horizontal” state)
– Sh = shared with siblings
– Ex = has an exclusive copy

“Hstate”: which children have cached the state and why (“vertical” state)
– R(dir) = all children in dir have shared copies for reading
– W(id) = the child id has an exclusive copy for writing
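The encoding above can be written down as a small data structure. The Python sketch below is a hypothetical rendering, not the authors' code: the class and field names are invented for illustration, and child identifiers are assumed to be integers.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class R:
    """Hstate R(dir): every child in `children` has a shared copy for reading."""
    children: FrozenSet[int]


@dataclass(frozen=True)
class W:
    """Hstate W(id): the child `child` has an exclusive copy for writing."""
    child: int


@dataclass(frozen=True)
class Cell:
    address: str
    value: int
    cstate: str             # "Sh" (shared with siblings) or "Ex" (exclusive copy)
    hstate: Union[R, W]     # which children have cached the cell and why


def children_with_copy(cell):
    """Return the set of children that currently cache this cell."""
    if isinstance(cell.hstate, W):
        return frozenset({cell.hstate.child})
    return cell.hstate.children


# Example: an exclusive cell whose child 2 holds the copy for writing.
cell = Cell(address="a", value=0, cstate="Ex", hstate=W(2))
```

Frozen dataclasses keep cells hashable, which is convenient when whole protocol states need to be stored in a visited set during state exploration.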

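HCN Model
[Figure: HCN network of processors (P) and memories (M), with M0, M1, and M2 labeled]
M1 is a child of M0, so M1 is a cache for data in M0.

HCN Model
[Figure: HCN network of processors (P) and memories (M), with M0, M1, and M2 labeled]
M1 is the parent of M2, so M1 is the home of data in M2.

HCN Model
[Figure: HCN network of processors (P) and memories (M)]
Innermost memories, or L1 caches.

HCN Model
[Figure: HCN network of processors (P) and memories (M)]
Outermost memory.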

Testing for PRAM
[Figure: test automata. Edge labels: wr(A,0), wr(A,1), wr(A,2), rd(A,-), rd(A,0), rd(A,1); two states are marked E]

HCN Model
[Figure: HCN network of processors (P) and memories (M), with M0, M1, and M2 labeled. Messages shown: wr_req(a,2), ex-req(a), wb-req(a)]