Published by Patrick Baker. Modified over 11 years ago.
1
Proving cache coherence for the Alpha 21264 (EV6) processor
Paul Harter, Leslie Lamport, Mark Tuttle, Yuan Yu
Compaq Computer Corporation
September 1999 (18 slides)
2
Cache coherence protocols
Goal: prove the cache coherence protocol is correct.
[Figure: processors and their caches connected to memory. Labels: memory x=2, cache x=2, cache x=1.]
The Alpha memory model defines the ordering of reads and writes to x.
The cache coherence protocol enforces the Alpha memory model.
3
Proving cache coherence in three easy steps + two man-years
Step 1: Model Alpha memory model. (200 lines)
Step 2: Model abstract protocol. (500 lines)
–Prove implementation (550 lines, 2 months, informal)
Step 3: Model complete protocol. (2000 lines, 3 months)
–Prove implementation (5500 lines, 4+ months, incomplete)
4
Step 1: Alpha memory model
We specify the Alpha memory model:
–The official specification is an informal description of the allowed sequences of reads and writes.
–We need a precise, state-based specification.
We specify a simplified version of the model:
–Operations read and write entire cache lines.
–Operations accessing a cache line have a common point of synchronization.
5
Key definition: read/write ordering
The Before order for an execution orders reads and writes and determines what values are returned by reads.
GoodExecutionOrder defines the good Before orders, namely the orders allowed by the memory model.
6
State machine actions
ReceiveRequest(proc, req)    Receive a request
ChooseNewData(proc, idx)     Choose the return value for a request
Respond(proc, idx)           Return the value to a request
ExtendBefore                 Expand the Before relation
Actions preserve GoodExecutionOrder.
7
GoodExecutionOrder
    GoodExecutionOrder ==
      LET [some definitions deleted]
      IN  /\ (*************************************************************)
             (* Before is a partial order.                                *)
             (*************************************************************)
             /\ Before \subseteq ReqId \X ReqId
             /\ \A r1, r2 \in ReqId : IsBefore(r1, r2) => ~IsBefore(r2, r1)
             /\ \A r1, r2, r3 \in ReqId :
                   IsBefore(r1, r2) /\ IsBefore(r2, r3) => IsBefore(r1, r3)
          /\ (*************************************************************)
             (* SourceOrder implies the Before order.                     *)
             (*************************************************************)
             \A r1, r2 \in ReqId : SourceOrder(r1, r2) => IsBefore(r1, r2)
          /\ (*************************************************************)
             (* RequestOrder implies the Before order.                    *)
             (*************************************************************)
             \A r1, r2 \in ReqId : RequestOrder(r1, r2) => IsBefore(r1, r2)
This is the hard part, but look how short it is!
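The three "Before is a partial order" conjuncts can be checked mechanically on a small, finite execution. The following is a minimal Python sketch, not the authors' tooling. It assumes that Before is a set of pairs of request identifiers and that each identifier is a (processor, index) tuple. The function names are invented for illustration.

```python
from itertools import product

def is_before(before, r1, r2):
    """Return True when the pair (r1, r2) is in the Before relation."""
    return (r1, r2) in before

def is_strict_partial_order(before, req_ids):
    """Check the three partial-order conjuncts on finite sets.

    before  : set of (request id, request id) pairs (assumed encoding)
    req_ids : set of all request ids
    """
    # Before \subseteq ReqId \X ReqId
    within_domain = all(a in req_ids and b in req_ids for (a, b) in before)
    # IsBefore(r1, r2) => ~IsBefore(r2, r1)
    asymmetric = all(not is_before(before, b, a) for (a, b) in before)
    # IsBefore(r1, r2) /\ IsBefore(r2, r3) => IsBefore(r1, r3)
    transitive = all(
        is_before(before, a, d)
        for (a, b), (c, d) in product(before, repeat=2)
        if b == c
    )
    return within_domain and asymmetric and transitive
```

A check like this only examines one concrete relation. It says nothing about every reachable state, which is what the invariance proof has to establish.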
8
          /\ (*******************************************************)
             (* Writes and successful SCs to the same location that *)
             (* have issued a response are totally ordered.         *)
             (*******************************************************)
             \A r1, r2 \in ReqId :
                /\ ReqIdQ[r1].req.type \in {"Wr", "SC"}
                /\ ReqIdQ[r1].req.newData # "Failed"
                /\ ReqIdQ[r1].req.responded
                /\ ReqIdQ[r2].req.type \in {"Wr", "SC"}
                /\ ReqIdQ[r2].req.newData # "Failed"
                /\ ReqIdQ[r2].req.responded
                /\ ReqIdQ[r1].req.adr = ReqIdQ[r2].req.adr
                => IsBefore(r1, r2) \/ IsBefore(r2, r1)
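The total-order conjunct can be illustrated the same way. This Python sketch assumes each request is a dictionary with the hypothetical keys "type", "adr", "new_data", and "responded", and that Before is a set of id pairs. It compares only distinct requests, which is an assumption of the sketch: the formula on the slide quantifies over all pairs.

```python
from itertools import combinations

def writes_totally_ordered(requests, before):
    """Check that responded writes and successful SCs to one address are ordered.

    requests : dict mapping request id -> dict (assumed record layout)
    before   : set of (request id, request id) pairs
    """
    def is_completed_write(r):
        req = requests[r]
        return (req["type"] in {"Wr", "SC"}
                and req["new_data"] != "Failed"
                and req["responded"])

    writes = [r for r in requests if is_completed_write(r)]
    # Every pair of distinct completed writes to the same address
    # must be related one way or the other by Before.
    return all(
        (r1, r2) in before or (r2, r1) in before
        for r1, r2 in combinations(writes, 2)
        if requests[r1]["adr"] == requests[r2]["adr"]
    )
```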
9
          /\ (*******************************************************************)
             (* LL/SC Axiom: For each successful SC, there is a matching LL and *)
             (* there is no write to the same address from a different          *)
             (* processor between the LL and SC in the Before order.            *)
             (*******************************************************************)
             \A r2 \in ReqId :
                /\ ReqIdQ[r2].req.type = "SC"
                /\ ReqIdQ[r2].newData \notin {Failed, NotChosen}
                => \E r1 \in ReqId :
                      /\ LLSCPair(r1, r2)
                      /\ \A r \in ReqId :
                            /\ \/ ReqIdQ[r].req.type = "Wr"
                               \/ /\ ReqIdQ[r].req.type = "SC"
                                  /\ ReqIdQ[r].newData \notin {NotChosen, Failed}
                            /\ r[1] # r2[1]
                            /\ ReqIdQ[r2].req.adr = ReqIdQ[r].req.adr
                            => ~IsBefore(r1, r) \/ ~IsBefore(r, r2)
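As an illustration of how the LL/SC axiom reads on a concrete execution, here is a hedged Python sketch. LLSCPair is among the definitions deleted from the slides, so the sketch takes it as an explicit set of (LL id, SC id) pairs. The string encodings of Failed and NotChosen and the dictionary keys are also assumptions.

```python
def llsc_axiom_holds(requests, before, llsc_pairs):
    """Check the LL/SC axiom on a finite execution.

    requests   : dict mapping id (proc, idx) -> dict with "type", "adr", "new_data"
    before     : set of (id, id) pairs
    llsc_pairs : set of (ll_id, sc_id) pairs, a stand-in for LLSCPair
    """
    unsuccessful = {"Failed", "NotChosen"}

    def is_store(r):
        # A plain write, or an SC whose chosen data marks it as successful.
        req = requests[r]
        if req["type"] == "Wr":
            return True
        return req["type"] == "SC" and req["new_data"] not in unsuccessful

    def intervenes(r, ll, sc):
        # r is a store from another processor to the SC's address that
        # falls between the LL and the SC in the Before order.
        return (is_store(r)
                and r[0] != sc[0]
                and requests[r]["adr"] == requests[sc]["adr"]
                and (ll, r) in before
                and (r, sc) in before)

    for sc, req in requests.items():
        if req["type"] != "SC" or req["new_data"] in unsuccessful:
            continue
        matching = [ll for (ll, s) in llsc_pairs if s == sc]
        # Some matching LL must have no intervening store.
        if not any(all(not intervenes(r, ll, sc) for r in requests)
                   for ll in matching):
            return False
    return True
```

The TLA+ formula indexes the processor as r[1] because TLA+ tuples are 1-indexed. The sketch uses r[0] for the same component.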
10
          /\ (**************************************************************)
             (* Value Axiom: A read reads from the preceding write in the  *)
             (* Before order.                                              *)
             (**************************************************************)
             \A r1, r2 \in ReqId :
                /\ ReqIdQ[r2].source # NoSource
                /\ ReqIdQ[r1].req.type = "Wr"
                /\ ReqIdQ[r1].req.adr = ReqIdQ[r2].req.adr
                => IF ReqIdQ[r2].source = FromInitMem
                      THEN ~IsBefore(r1, r2)
                      ELSE \/ ~IsBefore(ReqIdQ[r2].source, r1)
                           \/ ~IsBefore(r1, r2)
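The Value Axiom says that no write to the same address may fall strictly between a request's source and the request itself in the Before order. A minimal Python sketch of that check follows. It assumes each request is a dictionary with hypothetical "type", "adr", and "source" keys, and it encodes NoSource and FromInitMem as strings.

```python
NO_SOURCE = "NoSource"          # assumed encoding of the NoSource constant
FROM_INIT_MEM = "FromInitMem"   # assumed encoding of the FromInitMem constant

def value_axiom_holds(requests, before):
    """Check the Value Axiom on a finite execution.

    requests : dict mapping request id -> dict with "type", "adr", "source"
    before   : set of (request id, request id) pairs
    """
    for r2, reader in requests.items():
        source = reader["source"]
        if source == NO_SOURCE:
            continue
        for r1, writer in requests.items():
            if writer["type"] != "Wr" or writer["adr"] != reader["adr"]:
                continue
            if source == FROM_INIT_MEM:
                # Reading initial memory: no write may precede the read.
                if (r1, r2) in before:
                    return False
            elif (source, r1) in before and (r1, r2) in before:
                # Some write sits between the source and the read.
                return False
    return True
```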
11
Step 2: Model abstract protocol
Like most systems, the actual protocol is an
–abstract protocol together with lots of
–implementation details
Unlike most systems,
–the abstract protocol's correctness was far from obvious
–we discovered a behavior not allowed by the model
–this turned out to be an error in the memory model
12
The high-level proof
Define the protocol's Before ordering: fairly easy.
Prove it satisfies GoodExecutionOrder: the hard part was proving that the ordering is acyclic.
Engineers had a behavioral intuition.
Writing the invariance proof was extremely hard:
–35-line invariant, based on 300 lines of definitions
–550-line proof, cases nested 10 levels deep
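To make "acyclic" concrete: for a single finite execution, acyclicity of an ordering can be tested with a depth-first search. The Python sketch below is purely illustrative and assumes the ordering is given as a set of (earlier, later) pairs. It is not a substitute for the invariance proof, which must cover every reachable state of the protocol.

```python
def is_acyclic(edges):
    """Return True when the directed graph given by (src, dst) pairs has no cycle."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
        successors.setdefault(dst, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in successors}

    def visit(node):
        colour[node] = GREY
        for nxt in successors[node]:
            if colour[nxt] == GREY:
                return False  # back edge: nxt is on the current path
            if colour[nxt] == WHITE and not visit(nxt):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in successors if colour[n] == WHITE)
```

The recursion depth grows with the longest path, which is fine for small examples like the ones a reader would write by hand.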
13
Step 3: Model complete protocol
Obstacle 1: find a single, complete description
–English documents: 20 documents, 4-inch stack
–Lisp simulator: crucial to understanding some details
No description is
–complete, precise, or
–mathematically tractable
We wrote a relatively elegant, compact description.
14
Step 3: Model complete protocol
Obstacle 2: algorithm complexity
–60 different kinds of messages
Quarks were the solution:
–15 units of functionality
–each message modeled as a set of quarks
–resolved message overloading, simplified protocol
Protocol took 9 man-months, 1900 lines of TLA+
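The slides do not list the actual quarks, so the following Python sketch only illustrates the general idea of modeling each message kind as a set of small units of functionality. Every quark name, message name, and handler below is hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List

Handler = Callable[[List[str]], None]

# Hypothetical quarks: one handler per unit of functionality.
HANDLERS: Dict[str, Handler] = {
    "fill_data": lambda log: log.append("install data in cache"),
    "invalidate": lambda log: log.append("invalidate cache line"),
    "ack": lambda log: log.append("record acknowledgement"),
}

# Hypothetical message kinds, each defined as a set of quarks.
MESSAGES: Dict[str, FrozenSet[str]] = {
    "ReadReply": frozenset({"fill_data"}),
    "InvalAck": frozenset({"invalidate", "ack"}),
    "FillAndAck": frozenset({"fill_data", "ack"}),
}

def process(message_kind: str) -> List[str]:
    """Handle a message by running the handler of each quark it contains."""
    log: List[str] = []
    # Sorting gives a deterministic order for this illustration only.
    for quark in sorted(MESSAGES[message_kind]):
        HANDLERS[quark](log)
    return log
```

With this shape, adding a new message kind means listing its quarks rather than writing a new handler, which is one plausible reading of how 60 message kinds could reduce to 15 units of functionality.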
15
The low-level proof
Complete proof impossible due to time and labor
Informal invariant was 1000 lines long
We focus on the two most difficult conjuncts (each 150 lines)
[Figure; surviving labels: cache data structure, point of synch., messages]
16
The low-level proof
Proof took 7 man-months
–one conjunct: 2000 lines, cases 13 levels deep
–second conjunct: potentially twice as long, stopped at a point of diminishing returns
Found one actual error:
–demonstration requires use of 4 processors, 2 memory locations, and 15 messages
–state space is too big for model checkers to find it
–error is too obscure for testing to find it
17
Lessons learned
Engineers can read TLA+ after an hour, write TLA+ after several hours
Engineers valued the work: the resulting confidence in the protocol was invaluable
Specification should be part of design process:
–removes ambiguity, uncovers corner cases
–describes entire system at single level of abstraction
–allows use of tools like TLC early in design stage
18
Future work
Engineers
–see the potential of formal methods
–open to including formal methods in design phase
We want to facilitate adoption by engineering
Most likely future project: analyze proposals made to standards committees
–PCI-X, …