Avshalom Elyada, Ran Ginosar: Pipeline Synchronization
Pipeline Synchronization, Continued
This second part is based on the recent article "Bridging Clock Domains by Synchronizing the Mice in the Mousetrap" (PATMOS, Sep. 2003) by Joep Kessels and Ad Peeters (Philips Research Laboratories, The Netherlands), together with Suk-Jin Kim (KJIST, South Korea).
Recall Seizovic’s Synchronization Pipeline
Ripple Buffer between two clock domains
–High throughput
–Embedded synchronization
–Spanning a long distance
[Figure: ME elements on the REQ and ACK paths between the A clk and B clk domains; labeled "2-phase" and "half cycle distance"]
References: Seizovic, "Pipeline Synchronization," Async 1994. Kessels, Peeters, Kim, "Bridging Clock Domains by Synchronizing the Mice in the Mousetrap," PATMOS 2003.
Which buffer to use?
Ripple Buffer
–Stream data (isochronous)
   Throughput is important, latency is not
   A steady rate is maintained on both sides
–Short distance (2-3 stages)
   Pipeline to improve throughput
–or long distance (many stages)
   Improve throughput and bridge the distance
Which buffer to use?
Pointer Buffer
–Block data
   A chunk is available at once
   Rate is not important
   No sense in rippling every word through all pipe stages
   Write a few long bursts to SRAM and read them on the other side, with pointers
–But if the distance is long, a Ripple Buffer is needed
An ME as a Synchronizer
Outputs are mutually exclusive:
–Connect ~clk and signal ‘R’ to the inputs
–‘A’ is the synced output; the other output is unused
Today we refer to an ME with ~clk as a WAIT4 component
[Figure: ME element with ports R1, R0, A1, A0; labels S, clk, X, R, A]
WAIT4
–The rising edge of A is synced to clk
–Used in 4-phase; does not sync the falling edge of A
–Used as a building block for 2-phase sync
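A behavioral sketch of an ME used as a WAIT4 component, in Python. The class names `MutexElement` and `Wait4` and the step-wise `update` interface are assumptions made for illustration. A real ME resolves simultaneous requests by filtering metastability; here that is replaced by a fixed tie-break.

```python
class MutexElement:
    """Behavioral mutual-exclusion element: at most one grant is high."""

    def __init__(self):
        self.owner = None  # index of the request currently holding the grant

    def update(self, r0, r1):
        reqs = (bool(r0), bool(r1))
        # The grant is released only when its owner withdraws its request.
        if self.owner is not None and not reqs[self.owner]:
            self.owner = None
        # A free element grants a pending request (assumed tie-break: input 0).
        if self.owner is None:
            if reqs[0]:
                self.owner = 0
            elif reqs[1]:
                self.owner = 1
        return (self.owner == 0, self.owner == 1)


class Wait4:
    """ME with ~clk on one input and R on the other; A is the grant for R."""

    def __init__(self):
        self.me = MutexElement()

    def update(self, clk, r):
        _, a = self.me.update(not clk, r)
        return a
```

In this model a request R raised while clk is low has to wait, so A rises only once clk is high. A falls as soon as R is withdrawn, whatever the clock is doing.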
One Stage
“Mousetrap Cell” as FIFO Element
–2-phase, single-rail
–Any hi/lo signal toggle indicates a change
–req ≠ ack: the sender cell is full
–req = ack: data was accepted by the receiver, the sender is empty
–The “Equal” gate implements “empty” when req = ack
–Cell empty: all 4 ctrl signals are equal
MT Behavior
Ignoring the ‘empty’ signal, MT is similar to a Muller pipeline:
   ([Rreq = Rack * Wreq ≠ Rreq]; Rreq := Wreq)*
   (receiving cell empty) * (sending cell full); capture data, send
Equivalent form:
   ([Wreq ≠ Rack * Wreq ≠ Rreq]; Rreq := Wreq)*
The condition Wreq ≠ Rreq merely prevents idle operations, so this reduces to:
   ([Wreq ≠ Rack]; Rreq := Wreq)*
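The guarded command ([Rreq = Rack * Wreq ≠ Rreq]; Rreq := Wreq)* can be tried out in a small untimed Python model. This is only a sketch: the class `MousetrapFifo`, the signal array `r`, and the `write`/`read` environment methods are hypothetical names, delays are ignored, and each cell's ack to its sender is assumed to be that cell's own request output.

```python
class MousetrapFifo:
    """Untimed model of a chain of 2-phase cells (illustrative only)."""

    def __init__(self, n_cells):
        self.n = n_cells
        # r[0] is the writer's request, r[i + 1] is Rreq of cell i,
        # r[n + 1] is the reader's ack. Cell i sees
        # Wreq = r[i], Rreq = r[i + 1], Rack = r[i + 2].
        self.r = [0] * (n_cells + 2)
        self.data = [None] * n_cells
        self.wdat = None

    def cell_full(self, i):
        return self.r[i + 1] != self.r[i + 2]  # req != ack means full

    def settle(self):
        # Fire enabled cells one at a time until no guard holds.
        changed = True
        while changed:
            changed = False
            for i in reversed(range(self.n)):
                wreq, rreq, rack = self.r[i], self.r[i + 1], self.r[i + 2]
                if rreq == rack and wreq != rreq:  # cell empty, sender full
                    self.data[i] = self.wdat if i == 0 else self.data[i - 1]
                    self.r[i + 1] = wreq           # Rreq := Wreq
                    changed = True

    def write(self, item):
        if self.r[0] != self.r[1]:   # previous request not yet accepted
            return False
        self.wdat = item
        self.r[0] ^= 1               # 2-phase: any toggle is a request
        self.settle()
        return True

    def read(self):
        if self.r[self.n] == self.r[self.n + 1]:  # last cell is empty
            return None
        item = self.data[self.n - 1]
        self.r[self.n + 1] = self.r[self.n]       # ack by making them equal
        self.settle()
        return item
```

Writing three items into a three-cell model without reading leaves every cell full, and the items then come out in the order they were written.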
Mousetrap vs. Muller
Muller
–Need to match the delay of req to the comb. logic
–For 2-phase, need a special Capture-Pass Latch
–When full, every other cell contains data
Mousetrap
–With ‘empty’, no need for a CP Latch
–‘empty’ does automatic delay-matching
–When full, all cells contain data
–No async elements (good for business)
–The receiver's Ack to the sender does NOT indicate that the latch is locked
–The latch is locked T(EQ + Hold Latch) after Ack
–A timing constraint ensures that data is not overrun
Sequence (figure annotations):
1) Sender full
2) Receiver sends Ack back & Rreq forward
3) Receiver stores data
4) Receiver gets Rack from outside
5) Receiver empties
[Figure delay labels: Latch, EQ, EQ + Hold Latch]
Delay Asymmetries
Delay of full/empty token
–Full: T(Latch), Empty: T(Latch+EQ)
–Phase shift in handshake signals
–FIFO at full speed is less than ½ full
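One way to see why the FIFO is less than ½ full at full speed, as a sketch in Python. It assumes the usual steady-state argument for handshake pipelines, which the slide does not spell out: full tokens advance one stage per forward delay, empty tokens move back one stage per backward delay, and throughput peaks where the two limits balance. The function name and the delay values used with it are hypothetical.

```python
from fractions import Fraction


def full_speed_occupancy(t_latch, t_eq):
    """Fraction of cells holding data at maximum throughput (assumed model)."""
    forward = Fraction(t_latch)                    # full token: T(Latch)
    backward = Fraction(t_latch) + Fraction(t_eq)  # empty token: T(Latch+EQ)
    # Token-limited rate k / forward equals bubble-limited rate (1 - k) / backward
    # at occupancy k = forward / (forward + backward).
    return forward / (forward + backward)
```

For any positive EQ delay the result is T(Latch) / (2·T(Latch) + T(EQ)), which is below one half. It reaches exactly one half only if the EQ delay is zero.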
Delay Asymmetries II
Different inputs of a cell have different delay-to-out
–Connect the slow EQ input to Ack to help timing, or
–…to Req to improve performance
Delay Asymmetries III
Signals’ rising/falling edges have different transition delays
–Req precedes empty, empty precedes Req
–To avoid malfunction, the ctrl-latch is always slower than the data-latch
UE4
Parallel composition of two WAIT4 -> Up-Edge 4-phase detector
–The inverter delay ensures the 2nd WAIT4 is closed before the 1st is opened
Use a FF here instead?
–It doesn’t filter out the metastability
Detect up & down edges for 2-phase
Build an edge 2-phase detector, UE2
–‘d’ifferent, ‘e’mpty
–‘U’ even though it is up-and-down
–Note the resemblance to the MT ctrl logic
Pipeline Interfaces
FIFO indicates ready:
–To receive new Wdat: Wrdy
–To send new valid Rdat: Rrdy
Environment enables:
–Send of new valid Wdat: Wenb
–Receive of new Rdat: Renb
Data transfer if both rdy and enb
–Transfer an item every clock
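The rdy/enb rule can be illustrated with a short Python model. It is a sketch only: the class `ClockedFifoModel` and its `capacity` parameter are hypothetical, and a plain queue stands in for the cells and synchronizers, so only the interface rule is shown (a transfer happens on a clock edge when both rdy and enb hold).

```python
from collections import deque


class ClockedFifoModel:
    """Interface-level model; the buffer itself is just a queue (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    @property
    def wrdy(self):
        return len(self.items) < self.capacity  # ready to receive new Wdat

    @property
    def rrdy(self):
        return len(self.items) > 0              # ready to send valid Rdat

    def wclk(self, wenb, wdat):
        """One Wclk edge: Wdat is taken only if Wrdy and Wenb both hold."""
        if self.wrdy and wenb:
            self.items.append(wdat)
            return True
        return False

    def rclk(self, renb):
        """One Rclk edge: Rdat is delivered only if Rrdy and Renb both hold."""
        if self.rrdy and renb:
            return self.items.popleft()
        return None
```

With both enables held high and the buffer neither full nor empty, each side moves one item on every edge of its own clock.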
Read-Interface
Renb enables Rclk at FF
–Z empty: Rrdy low, handshake signals equal
–Z becomes full: Rrdy hi, handshakes differ
–Upon next Rclk*Renb, FF makes handshakes equal again
Sequence (figure annotations):
–Following Rclk*Renb, Z passes new Rdat
–After T(Latch+EQ), X empties into Y
–Handshaking continues … at next Rclk, state repeats itself
Write-Interface
Wenb enables Wclk at data+ctrl FF
–‘A’ full: handshake signals differ
–‘A’ empty: Wack toggles
–Upon next Wclk*Wenb, ‘A’ receives new Wdat
Sequence (figure annotations):
1) C filled from B, ack from C waits at UE2 for Wclk
2) After Wclk, B gets ack, ‘A’ filled from outside
3) Handshaking continues … at next Wclk, state repeats itself
Integrated Synchronizing Circuit in MT Write Cell
Summary
Pipeline Synchronization
–High throughput, embedded sync, long interconnect, 2-phase
The Mousetrap Cell
Synchronization components
–WAIT4, UE4, UE2
Buffer Interfaces
–Write and Read sections
MT with integrated sync circuit