Multiple Clock Domains (MCD) Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 15, 2010.

Slides:



Advertisements
Similar presentations
An EHR based methodology for Concurrency management Arvind (with Asif Khan) Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Advertisements

March, 2007http://csg.csail.mit.edu/arvind802.11a-1 Architectural Exploration: Area-Performance tradeoff in a Transmitter Arvind Computer Science.
Folding and Pipelining complex combinational circuits Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.
Analog-to-Digital Converter (ADC) And
Combinational Circuits in Bluespec
Synchronous Digital Design Methodology and Guidelines
June 5, Architectural Exploration: a Transmitter Arvind, Nirav Dave, Steve Gerding, Mike Pellauer Computer Science & Artificial Intelligence.
Stmt FSM Richard S. Uhler Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology (based on a lecture prepared by Arvind)
Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.
February 21, 2007http://csg.csail.mit.edu/6.375/L07-1 Bluespec-4: Architectural exploration using IP lookup Arvind Computer Science & Artificial Intelligence.
March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 4, 2013
December 10, 2009 L29-1 The Semantics of Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute.
© 2003 Xilinx, Inc. All Rights Reserved FPGA Design Techniques.
Pipelining combinational circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 20, 2013http://csg.csail.mit.edu/6.375L05-1.
March 4, 2009L13-1http://csg.csail.mit.edu/6.375 Multiple Clock Domains Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of.
September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
© BYU 18 ASYNCH Page 1 ECEn 224 Handling Asynchronous Inputs.
Folded Combinational Circuits as an example of Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Multiple Clock Domains (MCD) Continued … Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology November.
March, 2007Intro-1http://csg.csail.mit.edu/arvind Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial Intelligence Lab.
March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.
Constructive Computer Architecture Tutorial 2 Advanced BSV Sizhuo Zhang TA Oct 9, 2015T02-1http://csg.csail.mit.edu/6.175.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology
September 8, 2009http://csg.csail.mit.edu/koreaL03-1 Combinational Circuits in Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Folding complex combinational circuits to save area Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Constructive Computer Architecture Sequential Circuits - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
03/30/031 ECE Digital System Design & Synthesis Lecture Design Partitioning for Synthesis Strategies  Partition for design reuse  Keep related.
REGISTER TRANSFER LANGUAGE (RTL) INTRODUCTION TO REGISTER Registers1.
March 1, 2006http://csg.csail.mit.edu/6.375/L09-1 Bluespec-3: Architecture exploration using static elaboration Arvind Computer Science & Artificial Intelligence.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Multiple Clock Domains (MCD) Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Combinational Circuits in Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 9, 2011L03-1
Computer Architecture: A Constructive Approach Bluespec execution model and concurrent rule scheduling Teacher: Yoav Etsion Taken (with permission) from.
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
Introduction to Bluespec: A new methodology for designing Hardware
Introduction to Bluespec: A new methodology for designing Hardware
REGISTER TRANSFER LANGUAGE (RTL)
Folded Combinational Circuits as an example of Sequential Circuits
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Sequential Circuits Constructive Computer Architecture Arvind
Architectural Exploration:
Sequential Circuits: Constructive Computer Architecture
Combinational Circuits in Bluespec
Stmt FSM Arvind (with the help of Nirav Dave)
Clock Domain Crossing Keon Amini.
Combinational Circuits in Bluespec
Multirule Systems and Concurrent Execution of Rules
Constructive Computer Architecture: Guards
Sequential Circuits Constructive Computer Architecture Arvind
Combinational Circuits and Simple Synchronous Pipelines
Combinational Circuits and Simple Synchronous Pipelines
Modules with Guarded Interfaces
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Introduction to Bluespec: A new methodology for designing Hardware
Stmt FSM Arvind (with the help of Nirav Dave)
Combinational Circuits in Bluespec
Multiple Clock Domains
Constructive Computer Architecture: Guards
Simple Synchronous Pipelines
Introduction to Bluespec: A new methodology for designing Hardware
Architectural Exploration:
Simple Synchronous Pipelines
Outline Registers Counters 5/11/2019.
Registers Today we’ll see some common sequential devices: counters and registers. They’re good examples of sequential analysis and design. They are also.
Presentation transcript:

Multiple Clock Domains (MCD) Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 15, L12-1

Plan Why Multiple Clock Domains a as an example How to represent multiple clocks in Bluespec MCD Syntax Revisit a Synchronizers March 15, L12-2

Why Multiple Clock Domains Arise naturally in interfacing with the outside world Needed to manage clock skew Allow parts of the design to be isolated to do selective power gating and clock gating Reduce power and energy consumption March 15, L12-3

Review: a Transmitter ControllerScramblerEncoderInterleaverMapper IFFT Cyclic Extend headers data 24 Uncoded bits Operates on 1-4 packets at a time Converts 1-4 packets into a single packet A lot of compute; n cycles/packet n depends on the IFFT implementation; for the superfolded version n  51 March 15, L12-4

Synthesis results for different microachitectures Design Area (mm 2 )Best CLK Period Throughput CLK/symbol Latency Comb ns1 Pipelined1.467 ns121 ns Folded0.838 ns324 ns S Folded 1 Radix ns ns TSMC.13 micron; numbers reported are before place and route. Single radix-4 node design is ¼ the size of combinational design but still meets the throughput requirement easily; clock can be reduced to Mhz For the same throughput SF has to run ~16 times faster than F March 15, Dave,Pellauer,Ng 2005 L12-5

Rate Matching ControllerScramblerEncoder InterleaverMapper IFFT Cyclic Extend headers data Clock speed f f/13 f/52 After the design you may discover the clocks of many boxes can be lowered without affecting the overall performance March 15, L12-6

Power-Area tradeoff The power equation: P = ½ CV 2 f V and f are not independent; one can lower the f by lowering V – linear in some limited range Typically we run the whole circuit at one voltage but can run different parts at different frequencies We can often increase the area, i.e., exploit more parallelism, and lower the frequency (power) for the same performance One would actually want to explore many relative frequency partitionings to determine the real area/power tradeoff March 15, L12-7

Plan Why Multiple Clock Domains a as an example How to represent multiple clocks in Bluespec MCD Syntax Revisit a Synchronizers March 15, L12-8

Associating circuit parts with a particular clock Two choices to split the design: Partition State Rules must operate in multiple domains Partition Rules State Elements must have methods in different clock domains It is very difficult to maintain rule atomicity with multi-clock rules. Therefore we would not examine “Partitioned State” approach further March 15, L12-9

Partitioning Rules Only touched by one domain Methods in red and green domains A method in each domain March 15, L12-10

Handling Module Hierarchy Methods added to expose needed functionality March 15, L12-11

We need a primitive MCD synchronizer: for example enq one on clock and deq/first/pop on another full/empty signals are conservative approximations may not be full when full signal is true We’ll discuss implementations later FIFO March 15, L12-12

Back to the Transmitters ControllerScramblerEncoder InterleaverMapper IFFT Cyclic Extend headers data March 15, L12-13

Domains in the Transmitter let controller <- mkController(); let scrambler <- mkScrambler_48(); let conv_encoder <- mkConvEncoder_24_48(); let interleaver <- mkInterleaver(); let mapper <- mkMapper_48_64(); let ifft <- mkIFFT_Pipe(); let cyc_extender <- mkCyclicExtender(); rule controller2scrambler(True); stitch(controller.getData,scrambler.fromControl); endrule … many such stitch rules … function Action stitch(ActionValue#(a) x, function Action f(a v)); action let v <- x; f(v); endaction endfunction These colors are just to remind us about domains March 15, L12-14

Coloring the rules? rule controller2scrambler(True); stitch(controller.getData, scrambler.fromControl); endrule rule scrambler2convEnc(True); stitch(scrambler.getData, conv_encoder.putData); endrule rule mapper2ifft(True); stitch(mapper.toIFFT, ifft.fromMapper); endrule All methods in the same domain Using different domains… March 15, L12-15

Domain Crossing rule mapper2ifft(True); stitch(mapper.toIFFT, ifft.fromMapper); endrule inline stitch Different methods in an action are on different clocks – we need to change the clock domains rule mapper2ifft(True); let x <- mapper.toIFFT(); ifft.fromMapper(x) endrule March 15, L12-16

Introduce a domain crossing module let m2ifftFF <- mkSyncFIFO(size,clkGreen, clkRed); Many such synchronizers In real syntax, one clock value is passed implicitly March 15, L12-17

Fixing the Domain Crossing rule mapper2ifft(True); let x <- mapper.toIFFT(); ifft.fromMapper(x) endrule rule mapper2fifo(True); stitch(mapper.toIFFT, m2ifftFF.enq); endrule rule fifo2ifft(True); stitch(pop(m2ifftFF), ifft.fromMapper); endrule split let m2ifftFF <- mkSyncFIFO(size,clkGreen,clkRed); synchronizer syntax is not quite correct March 15, L12-18

Similarly for IFFT to CyclicExt let ifft2ceFF <- mkSyncFIFO(size,clkRed,clkBlue); rule ifft2ff(True); stitch(ifft.toCyclicExtender, ifft2ceFF.enq); endrule rule ff2cyclicExtender(True); stitch(pop(ifft2ceFF), cyc_extender.fromIFFT); endrule Now each rule is associated with exactly one clock domain! March 15, L12-19

Plan Why Multiple Clock Domains a as an example How to represent multiple clocks in Bluespec MCD Syntax Revisit a Synchronizers March 15, L12-20

How to introduce clocks module mkTransmitter(Transmitter#(24,81)); let controller <- mkController(); let scrambler <- mkScrambler_48(); let conv_encoder <- mkConvEncoder_24_48(); let interleaver <- mkInterleaver(); let mapper <- mkMapper_48_64(); let ifft <- mkIFFT_Pipe(); let cyc_extender <- mkCyclicExtender(); // rules to stitch these modules together How should we 1. Generate different clocks? 2. Pass them to modules? 3. Introduce clock synchronizers and fix the rules? March 15, L12-21

Instantiating modules with clocks (clock is a type) Synthesized modules have an input port called CLK, which is passed to all interior instantiated modules by default However, any module can be instantiating with an explicit clock Modules can also take clocks as ordinary arguments, to be fed to interior module instantiations Clock c = … ; Reg# (Bool) b <- mkReg (True, clocked_by c); March 15, L12-22

The clockOf() function makes the implicit clock explicit c, c0, c1 and c2 are all equal Can be used interchangeably for all purposes If the expression is a constant, the result is the special value noClock noClock values can be used on in any domain Reg# (UInt# (17)) x <- mkReg (0, clocked_by c); Clock c0 <- exposeCurrentClock; let y = x + 2; Clock c1 = clockOf (x); Clock c2 = clockOf (y); March 15, L12-23

Clock Dividers interface ClockDividerIfc ; interface Clock fastClock ; // original clock interface Clock slowClock ; // derived clock method Bool clockReady ; endinterface module mkClockDivider #( Integer divisor ) ( ClockDividerIfc ifc ) ; Divisor = 3 Fast CLK Slow CLK CLK rdy March 15, L12-24

Plan Why Multiple Clock Domains a as an example How to represent multiple clocks in Bluespec MCD Syntax Revisit a Synchronizers March 15, L12-25

Step 1: Introduce Clocks module mkTransmitter(Transmitter#(24,81));... let controller <- mkController(); let scrambler <- mkScrambler_48(); let conv_encoder <- mkConvEncoder_24_48(); let interleaver <- mkInterleaver(); let mapper <- mkMapper_48_64(); let ifft <- mkIFFT_Pipe(); let cyc_extender <- mkCyclicExtender(); // rules to stitch these modules together let clockdiv13 <- mkClockDivider(13); let clockdiv52 <- mkClockDivider(52); let clk13 = clockdiv13.slowClock; let clk52 = clockdiv52.slowClock; March 15, L12-26

Step 2: Pass the Clocks module mkTransmitter(Transmitter#(24,81)); let clockdiv13 <- mkClockDivider(13); let clockdiv52 <- mkClockDivider(52); let clk13 = clockdiv13.slowClock; let clk52 = clockdiv52.slowClock; let controller <- mkController(clocked_by clk13); let scrambler <- mkScrambler_48(clocked_by clk13); let conv_encoder <- mkConvEncoder_24_48(clocked_by clk13); let interleaver <- mkInterleaver(clocked_by clk13); let mapper <- mkMapper_48_64(clocked_by clk13); let ifft <- mkIFFT_Pipe(); let cyc_extender <- mkCyclicExtender(clocked_by clk52); // rules to stitch these modules together Default Clock Now some of the stitch rules have become illegal because they call methods from different clock families Introduce Clock Synchronizers March 15, L12-27

Step 3: Introduce Clock Synchronizers module mkTransmitter(Transmitter#(24,81)); let m2ifftFF <- mkSyncFIFOToFast(2,clockdiv13); let ifft2ceSF <- mkSyncFIFOToSlow(2,clockdiv52); … let mapper <- mkMapper_48_64(clocked_by clk13); let ifft <- mkIFFT_Pipe(); let cyc_extender <- mkCyclicExtender(clocked_by clk52); rule mapper2fifo(True); //split mapper2ifft rule stitch(mapper.toIFFT, m2ifftFF.enq); endrule rule fifo2ifft(True); stitch(pop(m2ifftFF), ifft.fromMapper); endrule rule ifft2fifo(True); //split ifft2ce rule stitch(ifft.toCycExtend, ifft2ceFF.enq); endrule rule fifo2ce(True); stitch(pop(ifft2ceFF), cyc_extender.fromIFFT); endrule March 15, L12-28

Did not work... bsc -u -verilog Transmitter.bsv Error: "./Interfaces.bi", line 62, column 15: (G0045) Method getFromMAC is unusable because it is connected to a clock not available at the module boundary. Need to fix the Transmitter’s interface so that the outside world knows about the clocks that the interface methods operate on. (These clocks were defined inside the module) March 15, L12-29

The Fix – pass the clocks out interface Transmitter#(type inN, type out); method Action getFromMAC(TXMAC2ControllerInfo x); method Action getDataFromMAC(Data#(inN) x); method ActionValue#(MsgComplexFVec#(out)) toAnalogTX(); interface Clock clkMAC; interface Clock clkAnalog; endinterface March 15, L12-30

Clock Summary The Clock type, and type checking ensures that all circuits are clocked by actual clocks BSV provides ways to create, derive and manipulate clocks, safely BSV clocks are gated, and gating fits into Rule- enabling semantics (clock guards) BSV provides a full set of speed-independent data synchronizers, already tested and verified The user can define new synchronizers BSV precludes unsynchronized domain crossings March 15, L12-31

Clock Synchronizers March 15, L12-32

Moving Data Across Clock Domains Data moved across clock domains appears asynchronous to the receiving (destination) domain Asynchronous data will cause meta-stability The only safe way: use a synchronizer clk d q Meta-stable data Setup & hold violation March 15, L12-33

Synchronizers Good synchronizer design and use reduces the probability of observing meta-stable data Bluespec delivers conservative (speed independent) synchronizers User can define and use new synchronizers Bluespec does not allow unsynchronized crossings (compiler static checking error) March 15, L12-34

2 - Flop BIT-Synchronizer Most common type of (bit) synchronizer FF1 will go meta-stable, but FF2 does not look at data until a clock period later, giving FF1 time to stabilize Limitations: When moving from fast to slow clocks data may be overrun Cannot synchronize words since bits may not be seen at same time sClkdClk sDINdD_OUT FF0 FF1FF2 March 15, L12-35

Bluespec’s 2-Flop Bit-Synchronizer The designer must follow the synchronizer design guidelines: No logic between FF0 and FF1 No access to FF1’s output sClkdClk send()read() FF0 FF1FF2 mkSyncBit interface SyncBitIfc ; method Action send ( Bit#(1) bitData ) ; method Bit#(1) read () ; endinterface March 15, L12-36

Use example: MCD Counter Up/down counter: Increments when up_down_bit is one; the up_down_bit is set from a different clock domain. Registers: Reg# (Bit#(1)) up_down_bit <- mkReg(0, clocked_by ( writeClk ) ); Reg# (Bit# (32)) cntr <- mkReg(0); // Default Clk The Rule (attempt 1): rule countup ( up_down_bit == 1 ) ; cntr <= cntr + 1; endrule Illegal Clock Domain Crossing March 15, L12-37

Adding the Synchronizer SyncBitIfc sync <- mkSyncBit( writeClk, writeRst, currentClk ) ; rule transfer ( True ) ; sync.send ( up_down_bit ); endrule rule countup ( sync.read == 1 ) ; cntr <= cntr + 1; endrule clocked by writeClk clocked by currentClk Split the rule into two rules where each rule operates in one clock domain March 15, L12-38

MCD Counter module mkTopLevel#(Clock writeClk, Reset writeRst) (Top ifc); Reg# (Bit# (1)) up_down_bit <- mkReg(0, clocked_by(writeClk), reset_by(writeRst)) ; Reg# (Bit# (32)) cntr <- mkReg (0) ; // Default Clocking Clock currentClk <- exposeCurrentClock ; SyncBitIfc sync <- mkSyncBit ( writeClk, writeRst, currentClk ) ; rule transfer ( True ) ; sync.send( up_down_bit ); endrule rule countup ( sync.read == 1 ) ; cntr <= cntr + 1; endrule We won’t worry about resets for the rest of this lecture March 15, L12-39

Different Synchronizers Bit Synchronizer FIFO Synchronizer Pulse Synchronizer Word Synchronizer Asynchronous RAM Null Synchronizer Reset Synchronizers Documented in Reference Guide March 15, L12-40