Published by Scot McLaughlin. Modified over 8 years ago.
1
ROACH II Review Introduction
2
ROACH II Review Agenda ● Agenda: – ROACH I, what was right, what was wrong? – Why ROACH II? – The place of ROACH II – Specifications – Financial Implications – Block diagram – Mezzanine boards – Mechanical and Thermal – Tool flow – Applications – Downgrading – Schedule and plans – Risks – Open discussion – Action items
3
Looking back at ROACH ● Good: – Architecture ● External PPC ● Good mix of memory types ● Useful interfaces – Compatibility ● ATX, mechanically and power supply ● Z-DOK ● 10GbE – Appears to be reliable – In production
4
Looking back at ROACH ● Not so good – Lantronix XPort firmware is undocumented and changes without notice – Actel Fusion hard to program, a few failures of on-board Flash – National 1GbE PHY sometimes (~5%) fails in production – Production test system is a bit of a hack – Memory bandwidth somewhat less than potential Z-DOK bandwidth – Possible to brick the board during Flash update; a JTAG tool is needed to bring it back to life – Want more digital bandwidth for next-generation samplers
5
Why ROACH II? ● It's the CASPER way ● Moore's law ● Xilinx Virtex-6 ● Fast ADCs ● CX4 is not likely to grow ● MeerKAT
6
IBOB ● Good ● Cheap ● Simple ● ZDOKs for ADC inputs ● SRAM ● Bad ● No external processor ● Small FPGA capacity ● SRAM low bandwidth ● Limited IO ● Custom housing and PSU ● Good for ● F engines ● spectrometers
7
BEE2 ● Good ● Lots of IO ● Lots of DRAM ● Bad ● Control FPGA wasted ● No ADC interface ● Expensive ● Reliability concerns ● DRAM hard to use ● Custom PSU and case ● Good for: ● X engines ● Beamformers
8
BEE3 ● Good ● Huge processing density ● Bad ● Expensive ● Closed-source ● No onboard processor ● Custom size ● Poor IO relative to processing power ● Good for ● ??
9
ROACH1 ● Good: ● External PPC ● Well-chosen FPGA ● QDR and DDR ● Health monitoring ● ATX mech compatible ● Bad ● Hard to manufacture ● CX4 dying standard ● Overly complicated ● Need JTAG for update ● Good For ● Just about everything! ● Correlators ● Beamformers ● Pulsar Machines ● Spectrometers ●...
10
MORPH: Massive Open Reconfigurable Performance Hardware (or: Morph Open Reconfigurable Performance Hardware) ● Good for: ● Low-cost pre-processors (F engines or spectrometers) ● IBOB replacement ● Good ● Low cost ● Low power ● Incorporates lessons learnt from ROACH1 ● Bad ● No QDR ● Limited DDR bandwidth ● Limited IO ● Not sure ● Student project ● Aggressive schedule
11
ROACH 2 ● Good ● All ROACH-1 wins ● Reduced complexity ● Increased performance ● Flexible IO (SFP+, CX4, RJ45) ● Careful performance matching of onboard resources. ● Bad ● Expensive?? ● Good for: ● Large systems
12
ROACH 2: Specifications Motivation ● Primary design target: MeerKAT ● Largest resource consumer: correlator ● Proportion resources for this application. ● Beamformer is low on compute resources, high on IO (depending on number of synthesised beams) ● Want to keep ROACH2 a useful platform for pulsar work, spectrometers and other basic instruments. ● Maintain backwards compatibility with existing hardware. ● Increase compute density, but keep 1 FPGA per board.
13
Correlator requirements summary ● Virtex-6 SX475 main FPGA. ● Four 36bit QDR parts of 36Mibit each (144Mibit for MeerKAT2 upgrade path). ● A 72-bit DRAM DIMM slot running faster than 250MHz DDR. ● At least four 10GbE ports (more will be needed for beamforming). ● Increased PPC-FPGA datarates (32bit bus?).
14
Logic: FPGA sizing ● Logic: As big as possible! ● More FFT channels. ● Longer X engines. ● SX-series has good ratio of BRAM/Logic ● Logic/BRAM budgets with SX475 ● 512-input correlators ● Up to 8K FFT channels at 4GHz bandwidth
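The X-engine load behind a figure like "512-input correlators" can be sketched with the standard pair count: N inputs give N(N-1)/2 cross-correlations, plus N autocorrelations if those are kept. The function name and the choice to include autocorrelations by default are assumptions for illustration, not something the slides specify.

```python
def correlation_products(n_inputs: int, include_autos: bool = True) -> int:
    """Unique input pairs to multiply-accumulate per frequency channel.

    Illustrative helper (not from the slides): cross-correlations are
    N*(N-1)/2, and autocorrelations add another N when included.
    """
    if n_inputs < 1:
        raise ValueError("n_inputs must be at least 1")
    cross = n_inputs * (n_inputs - 1) // 2
    return cross + n_inputs if include_autos else cross


if __name__ == "__main__":
    # The product count grows roughly as N^2, which is why the slide
    # asks for as much logic as possible.
    for n in (64, 256, 512):
        print(n, correlation_products(n))
```

Doubling the number of inputs roughly quadruples the product count, so correlator size is far more sensitive to input count than to channel count.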
15
QDR Memory bandwidth ● Bandwidth scale 2^N, like FFT. ● Big matrix transpose operations possible, and fast, simple VACC implementations. ● ROACH2's four 36bit QDRs appear in application space as four 72bit SDR interfaces with simultaneous read/write. ● Flexibility: can be used for low-bandwidth deep depth, or in parallel for high bandwidth. ● F engine can transpose an 8GHz (16Gsps) stream!
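As a rough check on the transpose claim, the sketch below estimates the interface clock needed to stream samples through the four 72-bit interfaces. The slides give neither the sample width nor the QDR clock, so the 8-bit sample width and the number of usable bits per interface are assumptions exposed as parameters. Because QDR has separate read and write ports, the write side alone has to keep up with the incoming stream.

```python
def required_clock_mhz(sample_rate_sps: float,
                       bits_per_sample: int = 8,      # assumed, not from the slides
                       n_interfaces: int = 4,         # four QDR parts
                       usable_bits: int = 72) -> float:  # 72-bit SDR view per part
    """Clock (MHz) at which the combined write ports absorb the stream."""
    stream_bits_per_s = sample_rate_sps * bits_per_sample
    bits_per_clock = n_interfaces * usable_bits
    return stream_bits_per_s / bits_per_clock / 1e6


if __name__ == "__main__":
    # 16 Gsps stream, all 72 bits used for data.
    print(round(required_clock_mhz(16e9), 1))
    # Same stream if only 64 of the 72 bits carry data (hypothetical case).
    print(round(required_clock_mhz(16e9, usable_bits=64), 1))
```

Changing `bits_per_sample` or `usable_bits` shifts the answer considerably, so this only bounds the order of magnitude rather than reproducing the authors' own budget.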
16
QDR Memory capacity ● ROACH2 can be populated with different size QDRs. ● With maximum current capacity of 4x144Mib QDR memories: ● Packetise up to 32Ki FFT channels for 512-input correlator (or two streams of 64Ki FFT channels for 256-input correlator). ● VACC of 256-input correlator and 64Ki FFT chans. ● Need more? Sure: populate with 288Mib QDR, or move onto DRAM!
17
DDR ● Used where large memory capacities required (circular buffering and VACC for high-res spectrometers and large correlators) ● Variable latency; prefer QDR where possible. ● DRAM DIMMs expensive (FPGA pin resources). ● Single DIMM sufficient.
18
10GbE output ports ● ROACH's mezzanine connector is 6.25Gbps capable. With -2 speed devices, RXAUI enables up to 16x 10GbE ports per device. ● Two mezzanine cards. Can mix one with four CX4s (to mate to IBOB/BEE2/ROACH1) and one with four or eight SFP+ ports (to mate to new switches or driving optical fibre). ● Matched for dual-pol 4GHz stream using std XAUI and slower speed-grade devices. ● Can digitise and output streams up to 16GHz (quantised) bandwidth.
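The port counts above follow from lane arithmetic. XAUI carries 10GbE over four lanes at 3.125 Gbps and RXAUI over two lanes at 6.25 Gbps, both with 8b/10b line coding. The total of 32 transceiver lanes used below is an inference (two slots x four XAUI ports x four lanes), not a figure stated on the slide.

```python
def payload_gbps(lanes: int, lane_rate_gbps: float) -> float:
    """Payload rate after 8b/10b coding (8 data bits per 10 line bits)."""
    return lanes * lane_rate_gbps * 8 / 10


def ports_available(total_lanes: int, lanes_per_port: int) -> int:
    """How many 10GbE ports fit on a given number of transceiver lanes."""
    return total_lanes // lanes_per_port


XAUI = {"lanes": 4, "lane_rate_gbps": 3.125}
RXAUI = {"lanes": 2, "lane_rate_gbps": 6.25}
TOTAL_LANES = 32  # assumption inferred from the slides, see lead-in

if __name__ == "__main__":
    for name, cfg in (("XAUI", XAUI), ("RXAUI", RXAUI)):
        print(name,
              payload_gbps(cfg["lanes"], cfg["lane_rate_gbps"]), "Gbps per port,",
              ports_available(TOTAL_LANES, cfg["lanes"]), "ports")
```

Under that assumption, standard XAUI gives 8 ports and RXAUI gives 16, which matches the "up to 16x 10GbE" figure for the faster speed-grade devices.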
20
Board Management Hardware Monitoring Revamp ● Actel Fusion replaced with various COTS ICs to perform same function ● No remote power management interface FTDI Debug Interface ● USB to Multi-purpose IO Chip ● 2xUARTs, JTAG, IIC, GPIO access from host PC
21
PowerPC ● V6 bus interface now 32 bits (330 MB/s) ● SelectMAP interface directly driven by processor bus ● DMA signals added to CPLD/MMC interface ● Marvell 88E1111 1GbE PHY replaces problematic National PHY ● Up to 1GB NOR flash ● No more programmable clocks
22
FPGA ● Designed for Xilinx Virtex-6 SX475T or LX550T ● LX240T, LX365T, SX315T can be used with some loss in functionality
23
FPGA Memory ● From 2x18-bit to 4x36-bit QDR II+ ● From DDR2 → DDR3 (Registered DIMM only)
24
FPGA External I/Fs ● 2 x ZDOKs – up to 1.25 Gbps per pair (from 1 Gbps) ● 2 x MGT Breakout Slots ● 4 x Ten Gigabit Ethernet per Slot ● Media independent – could support CX4/SFP+/??? ● Supports up to 6.25 Gbps (RXAUI) ● 6 single-ended GPIO for MDIO, etc. ● 12 single-ended GPIO (LVCMOS15), the 8 LSBs of which are connected to LEDs ● Single Gigabit Ethernet (Marvell 88E1111) ● 2 x Auxiliary SMA inputs, 1 SMA output
25
ROACH II Mezzanine Cards/Boards ● Dimensions L x W x H (mm) ● 103 x 90 x 92 (total incl connector pair) ● Connector by SAMTEC ● QMS/QFS High Speed ● Rated at 8GHz/16Gbps ● 3V3/1V5, 12GPIO, 4 x 32 transceiver pairs ● Issues ● SFP+ mounting ● CX4 mounting ● Combinations ● RFI ● Cooling [Figures: Altium schematic of SAMTEC connector; 3D model of SAMTEC connector]
27
3D Impressions of board possibilities, CX4 and SFP+
28
ROACH II Mechanical and Thermal ● ROACH II enclosure to be 1U ● Same format as ROACH I with telescopic slides and customizable front face ● Thermal and mechanical issues: ● Suitable enclosure manufacturer ● RFI with regard to SFP+ and cooling requirements (shielding effectiveness) ● Current internal air flow on ROACH I is not sufficient to cool the PPC. The PPC on ROACH II may require a fan-cooled heatsink. ● Air flow channeling will be challenging with the mezzanine cards. Rear panel vent space will be limited. ● Moving to CFdesign 2010 with review by Blue Ridge Numerics. ● Water cooling
29
Thermal Vector plot of ROACH I
30
[Figure labels: PowerPC temperature; input ambient temperature; QDR temperature; Xilinx V5 FPGA temperature]
31
Exit Ambient Temperature
32
ROACH2 Toolflow ● Maintain compatibility with ROACH1. ● Hardware cores identical. ● Transparent port of existing designs. ● No complications like BEE2/iBOB → ROACH because Simulink primitives identical.
33
Pricing ● Xilinx V6 costs (1759-pin package): ● Avnet Express: V6 LX240T - $2,884.44 ● V6 LX550T - $8K-10K (likely 3x more than LX240T) ● SX parts tend to be more expensive than LX ● V6 SX475T - $9K-12K (likely 4x more than LX240T) ● Additional: 2 x QDRs ($120 each), SFP+ ($100 per port) ● Board costs: $4000 (including SFP+) or $3200 (CX4) ● Total cost: likely $13000-16000 with SX475T
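The quoted total can be reproduced by adding the board cost to the $9K-12K FPGA estimate. The sketch below does only that; leaving the "additional" QDR and per-port SFP+ items out of the sum is an assumption about how the slide's total was formed, and the names are illustrative.

```python
# Figures taken from the pricing slide (USD).
BOARD_COST = {"SFP+": 4000, "CX4": 3200}
FPGA_475T_RANGE = (9000, 12000)  # low and high estimates


def total_cost_range(board: str, fpga_range=FPGA_475T_RANGE):
    """Return (low, high) total for one board: board cost plus FPGA estimate."""
    low, high = fpga_range
    base = BOARD_COST[board]
    return base + low, base + high


if __name__ == "__main__":
    for board in BOARD_COST:
        print(board, total_cost_range(board))
```

With the SFP+ board this gives the slide's $13000-16000 range; the CX4 variant comes out $800 lower at both ends.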
34
Downgrading ● ROACH II designed for SX475T or LX550T ● LX240T, LX365T or SX315T can also be used, at the cost of losing the following: ● 1 full QDR memory, 18 bits of a QDR ● 3 GTX tiles: lose 2 x 10GbE ports and SGMII port ● 4 x single-ended GPIO
35
Risks ● Cost ● XC6VSX475T is likely to be 4x the cost of XC5VSX95T ● Availability ● Large devices not available yet... ● Technical ● Amongst others: – Mezzanine card adds connector to high speed IO – 45A on 1V – current density requires careful attention – General power and signal integrity
36
Schedule ● Schematics: Complete ● Layout: late June 2010 ● First prototype manufactured: mid July 2010 ● Initial bring-up and testing: late August 2010 ● First production run: September 2010