A Systematic Methodology to Develop Resilient Cache Coherence Protocols Konstantinos Aisopos (Princeton, MIT) Li-Shiuan Peh (MIT)

1 A Systematic Methodology to Develop Resilient Cache Coherence Protocols Konstantinos Aisopos (Princeton, MIT) Li-Shiuan Peh (MIT)

2 Motivation
- The CMP era is here, enabled by aggressive transistor scaling.
- Shrinking transistor dimensions => unreliable silicon (10K-100K FITs; error frequency: months) [1,2].
[Figure: CMP diagram with a grid of routers (R); tile labels: P, P$, S$, CC, NIC.]
[1] R. Bauman (TI), IEEE Design & Test of Computers, vol. 22 (3), 2005
[2] J. Graham (MoSys), EE Times, 2002

3 Motivation
- The CMP era is here, enabled by aggressive transistor scaling.
- Shrinking transistor dimensions => unreliable silicon (10K-100K FITs; error frequency: months).
- The loss of a single coherence message leads to deadlock.
- Goal: a resilient cache coherence protocol.
[Figure: CMP diagram with a data request traveling across the grid of routers (R).]

4 Outline
Motivation
Methodology
– Walkthrough: a resilient transaction
– Defining resilience properties
– Enforcing resilience properties
Evaluation
– Overhead
– Performance
Conclusions

5 Walkthrough Example: transaction
1. The initiator (R) sends a request (M) to the directory.
2. The directory forwards the request to the sharers (S1, S2).
3. The sharers invalidate their copies and acknowledge.
4. The request completes and the initiator sends an unblock to the directory.
5. The directory updates the sharing vector and may now process succeeding requests.
[Figure: message diagram among R, S1, S2, and the directory (dir). R: S -> SM -> M; S1, S2: S -> I; directory: S{R,S1,S2} -> B_M -> M{R}. Messages: request (M), ack, unblock.]
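The five steps above can be traced with a small message-level simulation. This is a minimal sketch, not the authors' implementation: the `Directory` class and `run_transaction` function are hypothetical, acks are assumed to travel from the sharers to the initiator, and the state names (S, SM, M, I, B_M) are taken from the slide's figure.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Hypothetical directory entry for one cache block."""
    state: str = "S"
    sharers: set = field(default_factory=lambda: {"R", "S1", "S2"})


def run_transaction(directory, caches, initiator):
    """Trace one fault-free upgrade (S -> M) and return the message log."""
    log = []
    # 1. Initiator sends request (M) to the directory and turns transient.
    caches[initiator] = "SM"
    log.append((initiator, "dir", "request(M)"))
    # 2. Directory blocks and forwards the request to the other sharers.
    directory.state = "B_M"
    others = sorted(directory.sharers - {initiator})
    for sharer in others:
        log.append(("dir", sharer, "request(M)"))
    # 3. Sharers invalidate their copy and acknowledge
    #    (assumption: the ack goes to the initiator).
    for sharer in others:
        caches[sharer] = "I"
        log.append((sharer, initiator, "ack"))
    # 4. Request completes; initiator sends unblock to the directory.
    caches[initiator] = "M"
    log.append((initiator, "dir", "unblock"))
    # 5. Directory updates the sharing vector and can serve new requests.
    directory.state = "M"
    directory.sharers = {initiator}
    return log


caches = {"R": "S", "S1": "S", "S2": "S"}
directory = Directory()
for src, dst, msg in run_transaction(directory, caches, "R"):
    print(f"{src:>3} -> {dst:<3} {msg}")
```

Losing any one of the six logged messages leaves at least one node waiting forever, which is the deadlock the motivation slide refers to.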

6 Walkthrough Example: resilient transaction
1. The initiator sends a request to the directory.
2. The request is lost.
3. The initiator resends the request after a timeout.
4. The directory forwards the request to the sharers.
(…the transaction continues as before)
[Figure: message diagram among R, S1, S2, and dir; R (S -> SM) sends request (M) twice.]
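The timeout-and-resend behavior in steps 2 and 3 can be sketched with a logical clock instead of real timers. Everything here is an assumption made for illustration: the `deliver` callback stands in for the network, and the timeout of 100 ticks and the attempt limit are arbitrary values, not figures from the talk.

```python
def send_with_retry(deliver, request, timeout=100, max_attempts=5):
    """Resend `request` every `timeout` ticks until `deliver` accepts it.

    `deliver` is a hypothetical channel: it returns True if the message
    reached the directory and False if the network dropped it.
    Returns (attempts, elapsed_ticks).
    """
    elapsed = 0
    for attempt in range(1, max_attempts + 1):
        if deliver(request):
            return attempt, elapsed
        # No response arrived: wait for the timer to expire, then resend.
        elapsed += timeout
    raise RuntimeError("request lost on every attempt")


def drop_first(n):
    """Build a channel that drops the first `n` messages, then delivers."""
    remaining = [n]

    def deliver(_msg):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return deliver


attempts, elapsed = send_with_retry(drop_first(1), "request(M)")
print(attempts, elapsed)  # 2 100
```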

7 Walkthrough Example: resilient transaction
1. The initiator resends its request.
[Figure: message diagram among R, S1, S2, and dir, showing request (M) and ack messages; R: S -> SM; directory: S{R,S1,S2} -> B_M.]

8 Walkthrough Example: resilient transaction
1. The initiator resends its request.
The directory must tolerate a duplicate request: (1) transition to the same state, (2) generate the same messages.
[Figure: directory FSM fragment: S -> B_M on request (M), B_M -> M on unblock; S -> B_S on request (S), B_S -> S on unblock; a request (M) arriving in B_M is marked "?".]

9 Walkthrough Example: resilient transaction
1. The initiator resends its request.
2. The directory forwards the request to the sharers (again).
[Figure: message diagram among R, S1, S2, and dir, showing request (M) and ack messages; R: S -> SM; directory: S{R,S1,S2} -> B_M.]

10 Walkthrough Example: resilient transaction
The sharers must tolerate a duplicate request: (1) transition to the same state, (2) generate the same messages.
[Figure: sharer FSM fragment for S1 and S2: S -> I on request (M), sending ack; a request (M) received in I also produces an ack.]

11 Walkthrough Example: resilient transaction
1. The initiator resends its request.
2. The directory forwards the request to the sharers (again).
3. The sharers acknowledge (again).
(…the transaction completes as before)
[Figure: message diagram among R, S1, S2, and dir, showing request (M) and ack messages; R: S -> SM -> M.]
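The rule "transition to the same state, generate the same messages" amounts to making each handler idempotent with respect to a repeated request. The sketch below illustrates this for one block; the `Sharer` and `Directory` classes, the `pending` field, and the handling of conflicting requests are hypothetical simplifications, not the protocol described in the talk.

```python
class Sharer:
    """Hypothetical sharer cache controller for one block."""

    def __init__(self):
        self.state = "S"

    def on_request_m(self):
        # Original request: S -> I, send ack.
        # Duplicate request: already in I, stay in I, send the ack again.
        self.state = "I"
        return ["ack"]


class Directory:
    """Hypothetical directory controller for one block."""

    def __init__(self, sharers):
        self.state = "S"
        self.sharers = set(sharers)
        self.pending = None  # requester of the in-flight transaction

    def on_request_m(self, requester):
        if self.state == "S":
            self.state, self.pending = "B_M", requester
        elif self.state == "B_M" and requester == self.pending:
            pass  # duplicate: keep B_M and regenerate the messages below
        else:
            return []  # conflicting request: not modeled in this sketch
        return [("request(M)", s) for s in sorted(self.sharers - {requester})]


directory = Directory({"R", "S1", "S2"})
first = directory.on_request_m("R")
again = directory.on_request_m("R")  # the resent request
print(first == again, directory.state)  # True B_M
```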

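13 Defining the Resilience Properties
- Message loss => the transaction is suspended.
- The requestor regenerates its request after a timeout.
- Every node that receives the regenerated request must repeat the same state transition and send the same outgoing messages.
[Figure: a request from R propagates through a chain of nodes and a response returns to R; the nodes along the path are annotated "same state transition, same outgoing messages".]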

14 Defining the Resilience Properties
Property 1: the initiator remains transient throughout the transaction.
Property 2: replicate messages roll back to the same earlier state.
Property 3: nodes retain the information needed to regenerate messages.
[Figure: the initiator R goes stable -> transient on its request and back to stable on the last message; intermediate nodes X and Y receive msgA and msgB, with the replicate msgA and msgB shown again.]

16 Enforcing Property 1
Property 1: the initiator remains transient throughout a transaction, so that it can resend lost messages.
[Figure: initiator timeline: stable, request, transient …, last message, stable.]

17 Enforcing Property 1
Property 1: the initiator remains transient throughout a transaction, so that it can resend lost messages.
Counter-example: after the response, the initiator sends unblock to the directory and becomes stable; if the unblock is lost, the initiator cannot resend it.
Enforcement:
- detect every outgoing message that transitions the initiator to a stable state
- replace the stable state with a transient state, and wait for done
[Figure: initiator timeline: stable, request, transient …, response, unblock, transient (instead of stable), done from dir, stable.]
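The enforcement rule for Property 1 can be read as a mechanical rewrite of the initiator's transition table. The sketch below is one possible reading under stated assumptions: the table encoding, the `enforce_property_1` function, the event names `store`, `response`, and `timeout`, and the explicit resend-on-timeout transition are all hypothetical. The naming of the added states (for example `Md` for "M, waiting done") follows the state tables in the overhead evaluation.

```python
STABLE = {"M", "E", "S", "I"}


def enforce_property_1(fsm, stable=STABLE):
    """Return a copy of `fsm` where no message-sending transition ends stable.

    `fsm` maps (state, event) -> (next_state, outgoing_message_or_None).
    """
    resilient = {}
    for (state, event), (nxt, out) in fsm.items():
        if out is not None and nxt in stable:
            waiting = nxt + "d"  # e.g. "Md" = M, waiting for done
            resilient[(state, event)] = (waiting, out)
            # Still transient, so a lost message can be resent on timeout.
            resilient[(waiting, "timeout")] = (waiting, out)
            # Only the receiver's done moves the initiator to stable.
            resilient[(waiting, "done")] = (nxt, None)
        else:
            resilient[(state, event)] = (nxt, out)
    return resilient


base = {
    ("S", "store"): ("SM", "request(M)"),
    ("SM", "response"): ("M", "unblock"),
}
for key, value in enforce_property_1(base).items():
    print(key, "->", value)
```

In this toy table only the `unblock` transition is rewritten, because it is the only one that both sends a message and lands in a stable state.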

18 Enforcing Property 2
Property 2: replicate messages roll back to the earlier state that the original message transitioned to.
[Figure: node A receives msgA and moves on; a replicate msgA arrives later.]

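19 Enforcing Property 2
Property 2: replicate messages roll back to the earlier state that the original message transitioned to.
Problem: when two FSM branches, one through T1 and one through T2, merge into a common state T_M, a replicate msgA received in T_M cannot tell whether to roll back to T1 or T2.
Enforcement: disassociate the branches after the merging point, so that T_M becomes T_M1 on the T1 branch and T_M2 on the T2 branch.
[Figure: FSM before and after: S, msgA, T1, T2, T_M; then S, msgA, T1, T2, T_M1, T_M2.]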
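Disassociating branches after a merging point can be sketched as a renaming pass over the FSM. This is an illustrative reading, not the authors' tool: the `disassociate` function, the branch encoding, the intermediate states `TA` and `TB`, and the numeric suffix scheme are hypothetical.

```python
from collections import Counter


def disassociate(branches):
    """Give each branch a private copy of every state shared with another.

    `branches` maps a branch's first transient state (e.g. "T1") to the
    list of states that follow it. Returns (new_branches, rollback), where
    rollback[state] is the unique state a replicate msgA must return to.
    """
    uses = Counter(s for path in branches.values() for s in set(path))
    new_branches, rollback = {}, {}
    for index, (head, path) in enumerate(branches.items(), start=1):
        # Rename only the states that appear in more than one branch.
        renamed = [f"{s}{index}" if uses[s] > 1 else s for s in path]
        new_branches[head] = renamed
        rollback[head] = head
        for s in renamed:
            rollback[s] = head
    return new_branches, rollback


branches = {"T1": ["TA", "TM"], "T2": ["TB", "TM"]}
new_branches, rollback = disassociate(branches)
print(new_branches)  # {'T1': ['TA', 'TM1'], 'T2': ['TB', 'TM2']}
print(rollback["TM1"], rollback["TM2"])  # T1 T2
```

After the split, every state has exactly one roll-back target, so the "T1 or T2?" ambiguity disappears at the cost of extra states.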

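20 Enforcing Property 3
Property 3: retain the information needed to regenerate every outgoing message, in case a replicate request is received.
[Figure: a sharer holding the unique data in M receives request (M) from dir (on behalf of R), sends the unique data, and moves to I; the same request (M) is then shown arriving again. msgA and msgB are shown with their replicates.]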

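21 Enforcing Property 3
Property 3: retain the information needed to regenerate every outgoing message, in case a replicate request is received.
The sharer retains the unique data: instead of moving directly from M to I, it passes through a transient state T_I and reaches I only after the invalidate permission / invalidate ack exchange.
[Figure: sharer states M, T_I, I with request (M) from dir (on behalf of R), unique data, invalidate permission, and invalidate ack; a state T_M also appears.]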
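The idea of retaining unique data until it is safe to drop can be illustrated with a small controller model. This is a sketch under assumptions: the `Owner` class and its method names are hypothetical, and the direction of the exchange (the sharer receives an invalidate permission and answers with an invalidate ack) is inferred from the slide's labels rather than stated in it.

```python
class Owner:
    """Hypothetical cache controller holding the only copy of a block."""

    def __init__(self, data):
        self.state = "M"
        self.data = data

    def on_request_m(self):
        if self.state == "M":
            self.state = "T_I"  # transient: the data is kept, not dropped
        if self.state == "T_I":
            # Works for the original request and for any replicate.
            return ("data", self.data)
        raise RuntimeError("data already discarded; cannot regenerate")

    def on_invalidate_permission(self):
        if self.state == "M":
            raise RuntimeError("no transaction in flight")
        self.state, self.data = "I", None  # safe to discard now
        return "invalidate ack"  # regenerated identically on a replicate


owner = Owner(data=0xCAFE)
first = owner.on_request_m()
again = owner.on_request_m()  # replicate request
print(first == again, owner.state)  # True T_I
print(owner.on_invalidate_permission(), owner.state)  # invalidate ack I
```

Without the transient state, the second `on_request_m` call would find the block already invalid and the only copy of the data would be unrecoverable.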

23 Evaluation: Overhead
Directory-based protocol (static directory node, MESI):
- base stable states: Modified, Exclusive, Shared, Invalid
- base transient states: IM (I -> M), IS (I -> S), SM (S -> M), ISI (IS -> I), MI (M -> I)
- added resilient states: Md (M, waiting done), Ed (E, waiting done), Sd (S, waiting done), Id (I, waiting done), Sp (S, waiting permission), Ip (I, waiting permission), Ma (M, waiting ack), Sa (S, waiting ack)
- total: 9 to 17 states (4 to 5 bits)
Broadcast-based protocol (AMD Hammer, MOESI):
- base stable states: Modified, Owned, Exclusive, Shared, Invalid
- base transient states: IM (I -> M), IS (I -> S), SM (S -> M), SE (S -> E), SS (S -> S), OM (O -> M), WB req
- added resilient states: Md (M, waiting done), Ed (E, waiting done), Sd (S, waiting done), Id (I, waiting done), MId (MI, waiting done), Sp (S, waiting permission), Ip (I, waiting permission), Ma (M, waiting ack), Ea (E, waiting ack), Sa (S, waiting ack)
- total: 12 to 22 cache states (4 to 5 bits)
No state was introduced into the critical path of serving a request.
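The bit counts quoted for the state encodings follow from the state counts, since n distinct states need ceil(log2(n)) bits. A short check, where the helper name `state_bits` is hypothetical:

```python
def state_bits(num_states):
    """Minimum number of bits needed to encode `num_states` distinct states."""
    return max(1, (num_states - 1).bit_length())


for protocol, base, resilient in [
    ("directory (MESI)", 9, 17),
    ("broadcast (MOESI)", 12, 22),
]:
    print(f"{protocol}: {base} -> {resilient} states, "
          f"{state_bits(base)} -> {state_bits(resilient)} bits")
```

In both protocols the added states cost exactly one extra bit per state field.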

24 Evaluation: Overhead
Miss Status Holding Register (MSHR): 4-32 entries
Existing fields: PC, address, requestor, flags, state
Added fields per entry:
- timer (0 to 2^13): 13 bits
- state: 1 bit
- response bitvector: 64 bits
- trans ID: 6 bits
Total per entry: 11 bytes
Total storage overhead (*): < 0.5 KB / core (worst case: 2 KB / core)
(*) assuming a 64-node CMP with in-order cores
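The per-entry and per-core figures can be reproduced with simple arithmetic. The pairing of field names with bit widths below is a reading of the slide and should be treated as an assumption; the 2 KB worst-case figure depends on details not given in the transcript and is not derived here.

```python
# Added MSHR fields and their widths in bits (assumed pairing).
FIELD_BITS = {"timer": 13, "state": 1, "response bitvector": 64, "trans ID": 6}

bits_per_entry = sum(FIELD_BITS.values())   # 84 bits
bytes_per_entry = -(-bits_per_entry // 8)   # ceiling division: 11 bytes

for entries in (4, 32):
    print(f"{entries} MSHR entries: {entries * bytes_per_entry} bytes")
```

With 32 entries the total is 352 bytes, which is consistent with the quoted bound of less than 0.5 KB per core.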

25 Evaluation: Performance
Network-on-Chip:
- Topology: 8x8 mesh
- Channels: 64-bit
- VNets: 5
- Routing: XY
System Configuration:
- Processors: in-order SPARC cores
- L1 Caches: 64KB/node, 3 cycles
- L2 Caches: 1MB/node, 6 cycles
- Cache organization: 4-way, 64-byte blocks
- Memory: 4 controllers * 1GB, 160 cycles
Simulator: Wisconsin Multifacet GEMS

26 Evaluation: Performance
Directory protocol. Metric: runtime overhead vs. non-resilient baseline (lower is better).
[Figure: bar chart over SPLASH (fft, fmm, lu, radix, water-nsq, water-sp) and PARSEC (blackscholes, canneal, fluidanimate, swaptions, x264) benchmarks, plus AVERAGE. Labeled values: 7.4%, 11%, 1.4%, 1.8%, 1.1%, 3.5%.]

27 Evaluation: Performance
Broadcast protocol. Metric: runtime overhead vs. non-resilient baseline.
[Figure: bar chart over SPLASH (fft, fmm, lu, radix, water-nsq, water-sp) and PARSEC (blackscholes, canneal, fluidanimate, swaptions, x264) benchmarks, plus AVERAGE. Labeled values: 2.4%, 5.1%, 0.5%, 20.4%, 51%, 56%.]

29 Conclusions
We have presented a generic methodology: coherence protocol -> resilient coherence protocol, by enforcing 3 properties.
- minimal hardware overhead (< 2KB / node)
- small performance overhead at 1 fault / msec: 1.4% for the directory-based protocol, 2.4% for the broadcast-based protocol

30 Thank You! Questions?

31 BACKUP SLIDES

32 Why performance overhead?
- Transactions last longer => a request may have to wait for outstanding conflicting requests to complete.
- Data remain in caches for longer (3-way handshake) => longer cache replacement duration.
- More messages are injected into the NoC => more network traffic => higher average NoC latency.

33 Transaction Duration
B: baseline protocol, no faults
R: resilient protocol, 1 fault / 10 μsec
L1: transaction served by a sharer's L1
L2: transaction served by the directory (L2)
[Figure: transaction duration chart; labeled increases: +12%, +18%.]

34 Transaction Duration
B: baseline protocol, no faults
R: resilient protocol, 1 fault / 10 μsec
L1: transaction served by a sharer's L1
L2: transaction served by the directory (L2)
Large working sets and shared data => high number of requests (high traffic); retransmissions saturate the network (!).
[Figure: transaction duration chart; labeled values: 11%, 24%.]

35 Network Traffic
[Figure: network traffic on the most congested link and averaged over all links.]

36 Enforcing the Resilience Properties
P2: a single message type transitions to a unique state in every FSM branch.
Case 2: identical messages in the same branch. A counter distinguishes the states: msgA from X leads to T (count=1), msgA from Y leads to T (count=2).
Example: after request (M), the initiator R is in SM + acks=0; each ack moves it to SM + acks=1, then SM + acks=2, and finally to M.

37 Enforcing the Resilience Properties
P2: a single message type transitions to a unique state in every FSM branch.
Case 2: identical messages in the same branch. With a counter, msgA from X leads to T (count=1) and msgA from Y leads to T (count=2). With a per-sender bitvector, msgA from X leads to T [XYZ=100] and msgA from Y leads to T [XYZ=110].

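38 Enforcing the Resilience Properties
P2: a single message type transitions to a unique state in every FSM branch.
Case 2: identical messages in the same branch. With a per-sender bitvector, msgA from X leads to T [XYZ=100]; a second msgA from X leads to the same state T [XYZ=100] and is recognized as a duplicate, whereas a counter would have advanced from count=1 to count=2.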
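The difference between a counter and a per-sender bitvector can be seen in a few lines. This is a minimal sketch with hypothetical names (`AckTracker`, `receive`, `vector`, `complete`); the bit ordering is chosen so that the rendered string matches the [XYZ=…] notation used in the slides.

```python
class AckTracker:
    """Hypothetical per-transaction record of which nodes have responded."""

    def __init__(self, expected):
        self.expected = list(expected)  # e.g. ["X", "Y", "Z"]
        self.seen = 0                   # bit i set <=> expected[i] responded

    def receive(self, sender):
        """Record a response; return True if it was new, False if duplicate."""
        bit = 1 << self.expected.index(sender)
        is_new = not (self.seen & bit)
        self.seen |= bit
        return is_new

    def vector(self):
        """Render the bitvector in X, Y, Z order, e.g. '110'."""
        return "".join(
            "1" if self.seen & (1 << i) else "0"
            for i in range(len(self.expected))
        )

    def complete(self):
        """True once every expected node has responded at least once."""
        return self.seen == (1 << len(self.expected)) - 1


tracker = AckTracker("XYZ")
print(tracker.receive("X"), tracker.vector())  # True 100
print(tracker.receive("X"), tracker.vector())  # False 100
print(tracker.receive("Y"), tracker.vector())  # True 110
```

A plain counter would read 2 after the two messages from X and could complete the transaction before Y and Z have responded; the bitvector stays at 100.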

39 [Figure: 8x8 grid of nodes numbered 0 to 63, with nodes 18, 20, and 26 set apart from the rest.]

