Trigger Gigabit Serial Data Transfer
Walter Miller
Professor David Doughty
CNU
October 4, 2007
GOAL: A “FIFO-Like” Trigger Data Transport Mechanism
Error free
Minimal latency
Continuous data
Data rate = 500 MBytes/s
[Diagram: 16-bit trigger data at 250 MHz, carried over backplane fabric or fiber optic fabric]
INITIAL GOAL: Reduced Data Rate without Forward Error Correction
Data rate = 250 MBytes/s
[Diagram: 16-bit trigger data at 125 MHz, carried over backplane fabric or fiber optic fabric]
Aurora Protocol
Xilinx open serial protocol
Scalable, lightweight, link-layer protocol
Supports 622 Mbps to 6.5 Gbps RocketIO
8B/10B coding
Unlimited bonded lanes
Full duplex or simplex
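Two Lane Aurora Between FIFO
[Diagram: 16-bit data at 125 MHz enters the send FIFO, crosses the 3.125 Gbps Aurora link from TX to RX, and leaves the receive FIFO. TX interface signals: SOF, EOF, SRC RDY, DST RDY. RX interface signals: SOF, EOF, SRC RDY.]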
Issues
Compensation cycle
– Occurs every words
– Takes 6 clock cycles!
Receive FIFO starvation
– Not allowed: it will confuse the trigger
– The compensation cycle will periodically “tie up” the serial data
– Sufficient data must be buffered to ride through the compensation cycle
– Requires a “starting pad” in the send FIFO of at least 6 words of data
Frame size and transfer efficiency
– To keep up, we must sustain 2.0 Gbps over a 2.5 Gbps data link
– Packetizing and state machine overheads count against this
– Must be at least 80 percent efficient
Solution
Receive FIFO starvation prevented
Compensation cycle
– Pad size of 11 words at start
Frame size and transfer efficiency
– Example: 16 word frame is % efficient; does not keep up with the data
– Example: 32 word frame is % efficient; keeps up, but only just
– Data is taken with 64 word frames
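The frame-size trade-off can be sketched with a simple model in which every frame costs a fixed number of link cycles of overhead. The model, the function names, and the 8-cycle overhead are assumptions for illustration only, not figures from the slides. The 80 percent target is the 2.0 Gbps payload divided by the 2.5 Gbps data link.

```python
from fractions import Fraction

# Required efficiency: 2.0 Gbps of trigger data over a 2.5 Gbps data link.
REQUIRED = Fraction(20, 25)  # reduces to 4/5


def frame_efficiency(frame_words: int, overhead_cycles: int) -> Fraction:
    """Fraction of link cycles carrying data, assuming a fixed per-frame overhead."""
    return Fraction(frame_words, frame_words + overhead_cycles)


def min_frame_words(overhead_cycles: int, required: Fraction = REQUIRED) -> int:
    """Smallest frame size whose efficiency meets the required fraction."""
    if not 0 < required < 1:
        raise ValueError("required must be strictly between 0 and 1")
    n = 1
    while frame_efficiency(n, overhead_cycles) < required:
        n += 1
    return n


if __name__ == "__main__":
    OVERHEAD = 8  # hypothetical per-frame overhead in link cycles (assumption)
    for words in (16, 32, 64, 128):
        eff = frame_efficiency(words, OVERHEAD)
        print(f"{words:3d}-word frame: {float(eff):.1%}, keeps up: {eff >= REQUIRED}")
```

Under this assumed overhead, a 16-word frame falls short of the target and a 32-word frame only just meets it, which is the same qualitative pattern as the two examples. The real overhead also includes compensation cycles and is not stated here.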
State Machine Control
Send state machines deal with:
– Start pad
– Send FIFO starvation
– Compensation cycles
– Start and end of frame signaling
Receive state machines deal with:
– Start and end of frame signaling
– Filling the FIFO
– Validating receipt order
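A minimal behavioral sketch of the send-side responsibilities listed above, written as a per-tick transition function. The state names, the `step` function, and the choice to pause mid-frame when the FIFO is empty or a compensation cycle is active are assumptions for illustration; the actual design is an HDL state machine whose details are not given here. The pad size (11 words) and frame size (64 words) are the values quoted in this talk.

```python
from enum import Enum, auto
from typing import NamedTuple, Tuple


class State(Enum):
    WAIT_PAD = auto()   # hold off until the start pad is buffered
    IN_FRAME = auto()   # stream words, marking the first and last of each frame


class Outputs(NamedTuple):
    src_rdy: bool  # a word is presented to the link this tick
    sof: bool      # start-of-frame marker
    eof: bool      # end-of-frame marker


PAD_WORDS = 11     # start pad from the talk
FRAME_WORDS = 64   # frame size used for data taking
IDLE = Outputs(False, False, False)


def step(state: State, fifo_level: int, sent_in_frame: int,
         compensating: bool) -> Tuple[State, int, Outputs]:
    """Advance one link-clock tick; returns (next_state, sent_in_frame, outputs)."""
    if state is State.WAIT_PAD:
        if fifo_level >= PAD_WORDS:
            return State.IN_FRAME, 0, IDLE
        return State.WAIT_PAD, 0, IDLE
    # IN_FRAME: pause (assumed behavior) on FIFO starvation or compensation.
    if compensating or fifo_level == 0:
        return State.IN_FRAME, sent_in_frame, IDLE
    sof = sent_in_frame == 0
    eof = sent_in_frame == FRAME_WORDS - 1
    next_count = 0 if eof else sent_in_frame + 1
    return State.IN_FRAME, next_count, Outputs(True, sof, eof)
```

The caller is expected to pop one word from the send FIFO whenever `src_rdy` is true. A receive-side counterpart would watch the same frame markers, fill the output FIFO, and check word order.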
Initial Goal Timing
Sixteen-bit trigger word generated at 125 MHz (every 8 ns)
– Data rate of 2.0 Gbps
Aurora clocked at about 156 MHz (6.397 ns period) for 16 bits into one lane
– Yields a 2.5 Gbps data transfer rate
– Raw rate is 3.125 Gbps for each lane
Idea is to fill the output FIFO (via Aurora) faster than it drains
– Compensates for synchronization events in Aurora
– Output is never “data starved”
Requires us to drain the input FIFO (via Aurora) faster than it fills
– Requires an initial starting “pad” of data
– Start pad of 11 packets (88 ns)
– Frame (packet) size greater than 32 for needed efficiency
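The timing figures on this slide follow from a few lines of arithmetic, reproduced here as a quick check. The constant and function names are illustrative. The factor 10/8 is the 8B/10B coding overhead on the serial line.

```python
WORD_BITS = 16            # trigger word width
DATA_CLOCK_MHZ = 125.0    # trigger word rate
AURORA_PERIOD_NS = 6.397  # Aurora user clock period
PAD_WORDS = 11            # start pad, in trigger words


def gbps(bits_per_word: float, period_ns: float) -> float:
    """Bits per nanosecond, which is numerically the rate in Gbps."""
    return bits_per_word / period_ns


data_period_ns = 1000.0 / DATA_CLOCK_MHZ        # 8 ns per trigger word
payload_gbps = gbps(WORD_BITS, data_period_ns)  # 2.0 Gbps to be carried
link_gbps = gbps(WORD_BITS, AURORA_PERIOD_NS)   # about 2.5 Gbps of link data
raw_gbps = link_gbps * 10 / 8                   # about 3.125 Gbps on the wire
pad_ns = PAD_WORDS * data_period_ns             # 88 ns to accumulate the pad
aurora_clock_mhz = 1000.0 / AURORA_PERIOD_NS    # about 156 MHz

if __name__ == "__main__":
    print(f"payload {payload_gbps:.3f} Gbps, link {link_gbps:.3f} Gbps, "
          f"raw {raw_gbps:.3f} Gbps, pad {pad_ns:.0f} ns, "
          f"Aurora clock {aurora_clock_mhz:.1f} MHz")
```

The ratio of payload to link rate, roughly 0.8, is the minimum fraction of link cycles that must carry data for the output FIFO to stay ahead.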
Simulated Frame Start and End
[Simulation waveform: 125 MHz data clock, Aurora clock, data generator, and Aurora send signals, annotated with start buffer pad, frame start, frame end, FIFO empty, and data pause]
Compensation Cycle
[Simulation waveform: compensation cycle, annotated with frame end and start frame]
Receive Compensation Cycle
[Simulation waveform: compensation cycle and output data on the receive side]
Results
Works!
– Receiver never starved
– Data correctly sequenced
Test time (more coming)
– Simulated to 800 s at 128 data words
– Pad and efficiency checked to 45 s at 16, 32, 64, and 128
Latency:
– Aurora has s
– Pad minimal 88 ns
On to Goal
Bond two lanes to achieve 4 Gbps
– Handles 16 bits at 250 MHz
Add error correction
– Must be forward EC to avoid additional latency
– Worst case: double send (with offset, minor latency effect)
– Uses 4 lanes
– VXS has four lanes at each slot!
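The lane counts above can be checked with a short calculation. This is a sketch that assumes each lane carries the same 2.5 Gbps of link data as in the initial-goal timing; `lanes_needed` and its `copies` parameter are hypothetical names used only for this illustration.

```python
import math


def lanes_needed(payload_gbps: float, lane_gbps: float, copies: int = 1) -> int:
    """Lanes required to carry `copies` copies of the payload stream."""
    return math.ceil(copies * payload_gbps / lane_gbps)


PAYLOAD_GBPS = 16 * 250e6 / 1e9  # 16 bits at 250 MHz = 4.0 Gbps
LANE_GBPS = 2.5                  # assumed per-lane data rate

if __name__ == "__main__":
    single = lanes_needed(PAYLOAD_GBPS, LANE_GBPS)
    double = lanes_needed(PAYLOAD_GBPS, LANE_GBPS, copies=2)
    print(f"single send: {single} lanes, double send: {double} lanes")
```

With two bonded lanes the payload again occupies about 80 percent of the available link data rate, so the same pad and frame-size considerations apply. Sending every word twice doubles the requirement to four lanes, which matches the four lanes available at each VXS slot.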
Proposed Timing
Sixteen-bit packet generated at 125 MHz (every 8 ns): 2.0 Gbps
Aurora clocked at about 156 MHz (6.397 ns period) for 3.125 Gbps transfer on each lane
Effective rate is 2.5 Gbps for each lane
Start pad of 11 packets (88 ns)
Frame size greater than 32 for needed efficiency
Latency between send and receive s
Two Lane Aurora Between FIFO
[Diagram: 16-bit data at 250 MHz enters the send FIFO, crosses two bonded 3.125 Gbps Aurora lanes, and leaves the receive FIFO]