Rad-tolerance Tests of FPGAs and SerDeses for SuperB
A. Aloisio (1), V. Bocci (2), R. Giordano (1)
(1) Physics Dept., University of Napoli “Federico II” and INFN Sezione di Napoli, Italy
(2) INFN Sezione di Roma 1, Rome, Italy
Raffaele Giordano 3rd SuperB Collaboration Meeting, Frascati Mar. 2012
Outline
- Testing of the Xilinx Virtex-5 and Virtex-6 FPGAs
- Testing of the Texas Instruments DS92LV2421 serializer
- Conclusions
FPGA Devices
- Xilinx Virtex-5 LX50T
  - Includes high-speed SerDeses (GTP)
  - Tested by Xilinx, NASA, Lawrence Berkeley Lab. and Los Alamos Lab.
  - Two versions: rad-hard (70 k$) and standard (400 $); we focus on the latter
- Xilinx Virtex-6 LX240T
  - Includes high-speed SerDeses (GTX)
- Both families embed configuration CRC, ECC and partial reconfiguration capabilities
Benchmark Design
- Serial link running at 2 Gbps (compatible with the TLK2711-A)
  - 16-bit parallel words (18-bit including control bits)
  - 100 MHz parallel clock
  - 8b10b encoding
  - Explicit lock flag
- Dummy logic on the Tx and Rx parallel data paths to observe realistic SEU effects
  - 2 x 16 levels of 18-bit registers and combinational logic
- Implemented with the Precision Hi-Rel synthesizer
- Two firmware versions: one with and one without Triple Module Redundancy (TMR) mitigation techniques
Resource occupation
  GTPs:          1 (10%)
  Slices:        376 (5%)
  FFs:           926 (3%)
  LUTs:          980 (3%)
  BRAM:          1 (1%)
  PLLs:          1 (16%)
  Clock buffers: 6 (18%)
  IOBs:          47 (10%)
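As a quick cross-check of the link parameters on this slide, the sketch below derives the serial line rate from the word width, the parallel clock and the 8b10b overhead. The function name and its arguments are illustrative and do not come from the original design.

```python
def line_rate_bps(word_bits: int, parallel_clock_hz: float,
                  encoded_bits: int = 10, payload_bits: int = 8) -> float:
    """Serial line rate of an 8b10b-style link.

    Every `payload_bits` of parallel data become `encoded_bits` on the wire.
    """
    payload_rate = word_bits * parallel_clock_hz  # bits/s before encoding
    return payload_rate * encoded_bits / payload_bits


# 16-bit words at 100 MHz with 8b10b encoding, as quoted on the slide.
rate = line_rate_bps(word_bits=16, parallel_clock_hz=100e6)
print(f"{rate / 1e9:.1f} Gbps")  # prints "2.0 Gbps"
```

The 16-bit payload word is used here because 8b10b encodes each byte into 10 line bits; the control bits select special code groups rather than adding line bits.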
Global Triple Module Redundancy
- We adopted the GTMR option (recommended for S-RAM FPGAs)
  - Combinatorial logic, registers and the pertaining clock and reset networks are tripled and voted
- In our design, however, some resources are not tripled:
  - IOs: triplication was unfeasible
  - Embedded SerDes (GTP): replication would have required triplication of the serial IO, which is impractical
  - PLL: triplication is not supported by the tool
- We also used the V5 dedicated power posts (to avoid half-latch structures)
Figure labels: tripled reset and clock networks, majority voters
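The voting step mentioned above can be illustrated with a bitwise 2-out-of-3 majority function. This is a software sketch of the idea only, not the logic generated by the synthesizer; the 18-bit example word is made up.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three replica words."""
    return (a & b) | (b & c) | (a & c)


golden = 0b10_1100_1010_0101_0011   # made-up 18-bit data word
upset = golden ^ (1 << 7)           # one replica with a single bit flipped

# A single upset replica is out-voted by the other two.
assert vote(golden, upset, golden) == golden

# If two replicas are hit on the same bit, the voter returns the wrong value.
assert vote(upset, upset, golden) == upset
```

The second assertion shows why the placement of the three domains matters: a single configuration upset that affects two domains at once defeats the voter.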
Placement Hardening
- TMR firmware hardened with a dedicated placement algorithm developed at Politecnico di Torino (thanks to M. Violante and L. Sterpone)
- Prepared 6 different layouts => 6 different firmwares
  - Efforts: low, medium, high, VP, Compact (Dec. test), VP-B (Mar. 2012)
- Each layout uses a different number of PIPs, interconnects and multi-domain switch matrices
Figure labels: "Flat layout, no separation between TMR domains", "Separate TMR domains", "VP-B", "VP"
On-Beam Setup
- Test at LNS, Catania
- Beam facts (these also apply to the other tests presented in this talk)
  - 62-MeV protons
  - Spot diameter 30 mm
  - Intensity uniform over the sample
- Test conditions for the Virtex-5 and Virtex-6 FPGAs
  - All Vcc set to the maximum allowed values (1.05 V, 1.32 V, 2.63 V, 3.45 V)
  - fclock = 100 MHz
  - Line rate during test = 2 Gbps
- Configuration cross-section measurement and link failure test
- Tested several link firmwares: with TMR (several layouts) and without TMR
Photos: closer front view and wider back view, with labels for the beam spot, FPGA test board, parallel and serial data cables, and tester board
SEU Tolerance Results (1)
- Total 23 runs
- Dose rate from 2 to 2.7 Gy/min (Si), average 2.21 Gy/min (Si)
Test flow
1. Start a new run
2. Program the FPGA
3. Wait for the scrub interval (120 s)
4. Read back the configuration and log the number of SEUs
5. Reprogram the FPGA
6. Go back to 3, until the link fails
7. End of the run
Firmware version         N of runs   Average scrub cycles/run
No TMR                   5           2.4
TMR (no optimization)                5.4
TMR low                              9.6
TMR medium               4           4.75
TMR vp                               7.75
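The test flow above can be sketched as a small control loop. Everything here is hypothetical: the four callbacks stand in for the real programming, timing, readback and link-monitoring tools, and the point at which the link is checked (right after each readback) is an assumption, since the slide does not say. The SEU counts in the demo are made up.

```python
from typing import Callable, List


def run_until_link_failure(
    program_fpga: Callable[[], None],
    wait: Callable[[float], None],
    read_back_seu_count: Callable[[], int],
    link_is_up: Callable[[], bool],
    scrub_interval_s: float = 120.0,
    max_cycles: int = 1000,
) -> List[int]:
    """Return the SEU count logged at each scrub cycle of one run."""
    seu_log: List[int] = []
    program_fpga()                             # step 2
    while len(seu_log) < max_cycles:
        wait(scrub_interval_s)                 # step 3
        seu_log.append(read_back_seu_count())  # step 4
        if not link_is_up():                   # exit condition of step 6
            break
        program_fpga()                         # step 5
    return seu_log                             # step 7


# Demo with stand-in callbacks and made-up numbers.
counts = iter([410, 395, 430])
link_states = iter([True, True, False])
programmed = []
log = run_until_link_failure(
    program_fpga=lambda: programmed.append(1),
    wait=lambda seconds: None,                 # no real waiting in the demo
    read_back_seu_count=lambda: next(counts),
    link_is_up=lambda: next(link_states),
)
print(len(log), "scrub cycles before link failure")
```

The number of entries in the returned log is the "scrub cycles per run" figure that the table averages per firmware version.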
SEU Tolerance Results (2)
- Virtex-5 (Mar. 2012)
  - 1 run with TMR firmware (VP-B layout) and scrubbing, dose rate 2 Gy/min (Si)
  - Survived 7633 SEUs over 15 scrub cycles
  - Did not fail; the test was interrupted because the beam time allocation ended
  - Just 1 run, but encouraging
- Virtex-6
  - 3 runs for link failure testing at 2.4 Gy/min (Si)
  - In ALL runs, failure measured before 100 s of operation (dose < 4 Gy (Si))
  - Also, 5 runs for configuration cross-section measurement with an integrated dose of 80 Gy (Si) each (see next slide)
Configuration Memory Results
- Total 8 runs
- V5 dose rates from 1.1 to 2.6 Gy/min (Si), average 1.8 Gy/min (Si)
- Configuration cross-section results:
  - Virtex-5: σ_bit = 1.74 × 10^-14 cm^2 bit^-1
    - A factor 2 lower than previous results: different batch, different foundry, perhaps improved process
  - Virtex-6: σ_bit = 1.05 × 10^-14 cm^2 bit^-1
    - Published value for neutrons: σ_bit = 1.02 × 10^-14 cm^2 bit^-1
Test methodology
1. Start a new run
2. Program the FPGA
3. Read back the configuration via JTAG and compare it to the initial one
4. Log the number of differences (errored bits)
5. Go back to 3, until the total desired dose is reached
- The test loop (points 3, 4, 5) was executed at the maximum speed permitted by JTAG
  - 1 readback every 15 s for V5, 1 every 45 s for V6
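Steps 3 and 4 of the methodology amount to counting the bits that differ between the initial configuration and each readback. A minimal sketch, assuming both images are available as equal-length byte strings (the real readback format, including any masked bits, is not described on the slide):

```python
def count_bit_errors(golden: bytes, readback: bytes) -> int:
    """Number of bits that differ between two equal-length bitstreams."""
    if len(golden) != len(readback):
        raise ValueError("bitstreams must have the same length")
    # XOR leaves a 1 wherever the two images disagree; count those ones.
    return sum(bin(g ^ r).count("1") for g, r in zip(golden, readback))


golden = bytes([0x00, 0xFF, 0xA5])
readback = bytes([0x01, 0xFF, 0xA4])   # two single-bit upsets
print(count_bit_errors(golden, readback))  # prints 2
```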
S-RAM FPGAs in Radiation Environments
- S-RAM FPGAs are traditionally excluded from radiation areas
- The configuration (stored in static RAM) is sensitive to single event upsets (SEUs) => bit-flips
- A bit-flip in the configuration memory can change the design functionality until the device is reconfigured
- It is important to test BOTH the device SEU cross-section and the design SEU tolerance
- In addition, there are single event effects (SEEs) common to all digital devices, e.g. SEUs in registers and SETs in combinatorial logic
Raffaele Giordano, NSS, Valencia, Oct. 2011
Current During SEU Tolerance Tests
- Trends of the core static and dynamic currents, runs #20-21 (Jul. ’11), TMR project
- Accumulation of SEUs in configuration memory leads to a current increase (‘switching on’ of unused elements, clashes)
- Dynamic and static core currents exhibit very similar trends
- Other currents (IO, AUX, MGT) remain constant
- The initial current drawn is restored after FPGA reconfiguration, i.e. no persistent effects
- No significant change in the current drawn by the device after the total integrated dose of all the tests (1.2 kGy (Si)) => no TID effects at this dose
Plot: dynamic (Id) and static (Is) core currents in A vs. time in s; annotations: irradiation start, SEU accumulation, irradiation stop, FPGA reconfiguration, 25 Gy (Si)
- Dose rate 3 Gy/min (Si)
- 5174 SEUs in 484 s (time to failure)
- 641 SEU/min => 209 SEU/Gy (Si)
- Total current increase 15 mA => 2.8 uA/SEU
Core Current Close-up
- Trend measured without running the functional test
- FPGA programmed with the no-TMR firmware
- Trend is locally non-monotonic, globally increasing
- Steps due to SEUs
Plot: Icore close-up during run #48, Dec. ’11
Texas Instruments (formerly National Semiconductor) DS92LV2421
- Serializer designed for video applications
- 24-bit data, 3-bit control, 10 to 75 MHz clock
- Data rate: 240 Mbps to … Gbps
- Separate power supplies for core (1.8 V) and IO (3.3 V LVCMOS)
- DC-balanced data scrambling (proprietary encoding «Channel Link II») => to be used with the companion deserializer (DS92LV2422)
- Optional I2C-compatible Serial Control Bus
- Built-in self-test with reporting pin, and adjustable de-emphasis options
- Huge and variable latency: … clock cycles!
DS92LV2421 Test Bench
Block diagram (labels):
- PC logging the BER test results on the hard drive
- Ethernet links, bridge, RS-232 link
- Xilinx ML505 board (FPGA)
- Tektronix DTG5334 clock generator (FPGA Clk)
- DUT (serializer): TX data in and clock, data to the serializer (24+1)
- Deserializer: RX data out and recovered clock, data from the deserializer (24+1)
- Serial connection (USB cable)
- Agilent N6705A power analyzer and logger (1.8 V, 3.3 V)
Speaker note: say that this is the same setup as last time.
On-Beam Setup
- Test conditions
  - fclock = 50 MHz
  - Payload data rate during test = 1.2 Gbps
- 19 runs in total, 16 Gy (Si) each
  - Typical dose rate 0.8 Gy/min (Si) (very low compared to previous rad-tests)
  - 14 runs with maximum Vcore, VIO (1.89 V, 3.6 V) for enhanced SEU susceptibility
  - 2 runs with typical Vcore, VIO (1.8 V, 3.3 V)
  - 3 runs with minimum Vcore, VIO (1.71 V, 3.0 V)
- Performed BER tests without power-cycling the device at each time unit (a TU lasts 10 s)
Photos: close-up on the DUT (DS92LV2421), wider view and overall view, with labels for the beam spot, deserializer, FPGA test board, parallel and serial data cables, and tester board
Bit Error Ratio Test Results
- Total dose 210 Gy (Si), very low due to the low dose rate
- Typical failure: bus stuck at a value (loss of lock?), recovered within 10 s (25 in total)
  - σ_lock = 2.2 × 10^-10 cm^2
- In addition, isolated errors in the payload (10 in total)
  - σ_err = 8.8 × 10^-11 cm^2
- No difference between runs at maximum, typical and minimum supply voltages
- The DS92LV24 performed 2 orders of magnitude worse than the LV18
- The DS92LV2421 is really sensitive to radiation
- Currents steady during the whole test
- Combined failure cross-section: σ_tx,LV24 = 2.4 × 10^-10 cm^2
- Results for the DS92LV18: σ_tx,LV18 = 2.5 × 10^-12 cm^2, σ_rx = … cm^2, σ_link = 6 … cm^2
- Comparison of the cross-sections, corrected for the payload data rates (on the LV24 it is 6% higher)
Expected Failure Rates at SuperB
For a dose rate of 5 kGy/year (Si), 62-MeV proton equivalent
Device                  Failures/year   MTBF (days)   Notes
FPGA V5 – TMR           133             3             Current trend, removed by scrubbing
FPGA V5 – No TMR        407             1
FPGA V6                 Too many        Too low
TLK2711-A               33              11            TID effects, unrecoverable failure at … Gy
DS92LV18 tx             9               41            TID effects, current variation
DS92LV18 rx             0.4             1023
DS92LV18 loss-of-lock   2.1             170
DS92LV2421 (tx)         856             0.43          No current variations vs TID
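The MTBF column appears to follow from the failures-per-year column as a simple reciprocal. The sketch below assumes a 365-day year and a constant failure rate; the function name is illustrative. Because the table values are rounded, not every row is reproduced exactly.

```python
def mtbf_days(failures_per_year: float, days_per_year: float = 365.0) -> float:
    """Mean time between failures in days for a constant failure rate."""
    if failures_per_year <= 0:
        raise ValueError("failures_per_year must be positive")
    return days_per_year / failures_per_year


for device, rate in [("TLK2711-A", 33), ("DS92LV18 tx", 9), ("DS92LV2421 (tx)", 856)]:
    print(f"{device}: {mtbf_days(rate):.2f} days")
```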
Conclusions
- Measured the Xilinx V5 and V6 configuration cross-section per bit and the design SEU tolerance
- Virtex-5
  - New layouts with placement optimized for SEU tolerance were tested (test beams in Dec. and Mar. 2012)
  - Improvement thanks to TMR + scrubbing + placement hardening
  - One of the layouts reached 7600 SEUs without failure, which is very encouraging; unfortunately we had the beam time to execute just one run
- Virtex-6
  - Configuration cross-section per bit lower (better) than for the V5 family
  - However, the benchmark link design is more sensitive (worse) than the V5 version; this needs to be investigated (the failure is probably not due to configuration errors)
- DS92LV2421
  - Very sensitive to radiation (weaker than the designs in V5 FPGAs)
  - Failures lead to loss of lock, recovered within a TU
- Next proton test beam (62-MeV protons) scheduled for July 2012
  - Test of the Xilinx Kintex-7
  - Further testing of opto-electronics (will be presented in Elba together with the results from the Mar. test beam, under study)
  - Improve the FPGA link, by design and by placement? => new tests?
Acknowledgements
We wish to thank:
- the LNS staff (G.A.P. Cirrone and many others) for their help and support during the beam test
- G. Chiodi and M. Capodiferro (INFN Roma 1) for their help with the test-beam preparation and shifts
- L. Sterpone and M. Violante (Politecnico di Torino) for the placement hardening of the TMR firmwares
Back-up Slides
Virtex-5 and 6 Test Bench
Block diagram (labels):
- PC logging all the data on the hard drive (errors, static and dynamic currents…) and driving the configuration read-back
- Ethernet links, bridge, RS-232 link, USB link
- Xilinx ML505 board
- Tektronix DTG5334 clock generator (FPGA Clk, TX Clk)
- JTAG programmer (configuration and readback)
- V5 / V6 FPGA with TX SerDes and RX SerDes: TX data in, RX data out, recovered Clk, controls (bus widths shown: 18, 2, 18, 8)
- Agilent N6705A power analyzer and logger, 4 power inputs: 1.0 V, 1.2 V, 2.5 V, 3.3 V
Speaker note: say that this is the same setup as last time.
Details on Placement Hardening
- Original TMR: the topology is obtained with the Xilinx proprietary place and route tools, without any modification.
- TMR low: the topology is modified by applying the V-Place algorithm to the FPGA’s Configurable Logic Blocks (CLBs), using LUTs, FFs and routing so as to avoid the presence of two different TMR domains in the same resource.
- TMR medium: the TMR low topology is further modified by applying the V-Place algorithm and avoiding CLBs that have logic resources and routing of three different TMR domains.
- TMR VP: the TMR medium topology is finally modified by acting on the placement of the CLBs whose LUTs or routing resources belong to global logic interconnections, thereby isolating the global logic resources in independent CLB locations.
DS92LV2421 Currents Trend
- Close-up: oscillation due to the time-unit structure of the test
- No variation of the currents vs dose (time)
Configuration Bit-flip Cross-section
- V5LX50T device, die size = 1.4 cm^2
- 62 MeV
- Measured σ = 3.5 × 10^-14 cm^2/bit
- Published σ = 6.4 × 10^-14 cm^2/bit, see [1]
- Good agreement between measured and published σ (different device, facility, test conditions…)
Figure: X-ray image with labels: package, 12 mm, die
Reference: [1] Quinn et al., Proceedings of the 2007 IEEE Radiation Effects Data Workshop, page(s): …
LNS Facility
- Superconducting cyclotron at LNS, Catania
- 62-MeV proton beam, CATANA beam line
- Current tunable up to 300 pA
- Uniform beam intensity on our sample
Figures: experimental setup, LNS accelerator plan, 30 mm beam spot, 2D beam profile, Gafchromic film for the beam spot
Link Failure Rates
- Effective cross-section to failure is design-dependent (critical bits)
- 0.5 kGy (Si)/year (a) => 4.1 × 10^4 configuration bit-flips on our design per year (measured)
- Number of link failures/year = 20 (on average, without ANY recovery strategy)
- Expected one link failure every 18 days (includes both configuration and SEUs in configured logic)
Note: (a) Estimated by R. Cenci, INFN Pisa
Courtesy of Riccardo Cenci, Elba Meeting, May-Jun. 2011
Virtex-5: Device Facts
Configuration size:              … Mbit
  CLBs and routing:              9 Mbit
  IOB, DSP, BRAM-interconnect:   2.37 Mbit
Clock cycles per CRC readback:   355,190
Readback time:                   7.1 ms (at 50 MHz)
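The readback time quoted above is consistent with the cycle count divided by the readback clock frequency. A one-function check, with an illustrative function name:

```python
def readback_time_s(clock_cycles: int, clock_hz: float) -> float:
    """Time needed for a readback that takes `clock_cycles` at `clock_hz`."""
    return clock_cycles / clock_hz


t = readback_time_s(355_190, 50e6)
print(f"{t * 1e3:.1f} ms")  # prints "7.1 ms"
```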
Configuration Memory Error Results
- Accumulation of bit-flips in configuration memory, FPGA working like a «dosimeter»
- Experienced a few SEFIs: a SEU causes a failure of the readback circuitry => 1 SEU generates many (thousands of) configuration errors
Plots: number of configuration errors vs. time (s) during irradiation, 71 Gy (Si); a SEFI causes an abrupt jump in the number of configuration errors (9225 errors)
Mentor Precision Hi-Rel Synthesizer
- Aimed at applications in aerospace, medical, high-reliability and safety-critical FPGAs
- Several grades/options for Triple Module Redundancy
  - Local TMR: registers are tripled and voters are inserted
  - Distributed TMR: registers, combinatorial logic and even voters are tripled
  - Global TMR: for highest reliability, see next slide
- Can use the power posts of the V5 instead of half-latches as a source for logic constants (logic ‘0’ and ‘1’)
Figures: Local TMR, Distributed TMR
Tester Board
- Same FPGA as the DUT board
- Data pattern stored in the FPGA firmware
- TX and RX sections of the benchmark link design are tested independently and simultaneously
- Status and errors are logged on a console handled by an embedded microprocessor
Photo: Xilinx ML505 (Virtex-5, XC5VLX50T) with labels for console I/O, parallel I/O, clock input and GTP differential I/O
Can We Use FPGAs On-Detector?
- S-RAM FPGAs are traditionally excluded from radiation areas
- The configuration (stored in static RAM) is sensitive to single event upsets (SEUs) => bit-flips
- A bit-flip in the configuration memory can change the design functionality
- Mid-range Xilinx FPGAs include tools for configuration error detection and correction
Quick Facts
- … in Si = 8.39 keV/(mg/cm^2)
- σ = n_errors / Φ (where Φ is the fluence)
- N_SEU = Φ_beam · σ_bit · N_bit(eff.)
- N_sysfail = N_SEU / N_SEU->failure
- 71 Gy (Si) => 5891 configuration errors
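The relations on this slide can be chained into a small calculation. The sketch below rests on two assumptions that are not stated on the slide: that the 8.39 keV/(mg/cm^2) figure is the proton stopping power in silicon (numerically equal to 8.39 MeV cm^2/g), and that dose and fluence are related by the standard conversion D [Gy] = 1.602 × 10^-10 · (dE/dx) [MeV cm^2/g] · Φ [cm^-2]. Function names are illustrative.

```python
MEV_PER_G_IN_GY = 1.602e-10  # 1 MeV deposited per gram, expressed in Gy (J/kg)


def fluence_from_dose(dose_gy: float, stopping_power_mev_cm2_per_g: float) -> float:
    """Particle fluence (cm^-2) that delivers `dose_gy` (assumed conversion)."""
    return dose_gy / (MEV_PER_G_IN_GY * stopping_power_mev_cm2_per_g)


def cross_section(n_errors: float, fluence: float) -> float:
    """sigma = n_errors / fluence, in cm^2."""
    return n_errors / fluence


def expected_seus(fluence: float, sigma_bit: float, n_bits_eff: float) -> float:
    """N_SEU = fluence * sigma_bit * N_bit(eff.)."""
    return fluence * sigma_bit * n_bits_eff


def expected_system_failures(n_seu: float, seus_per_failure: float) -> float:
    """N_sysfail = N_SEU / N_SEU->failure."""
    return n_seu / seus_per_failure


phi = fluence_from_dose(71.0, 8.39)        # fluence for the 71 Gy (Si) run
sigma_device = cross_section(5891, phi)    # whole-device cross-section
print(f"fluence = {phi:.3g} cm^-2, device cross-section = {sigma_device:.3g} cm^2")
```

Dividing the device cross-section by the effective number of configuration bits gives a per-bit value; that bit count is not given on this slide, so the sketch stops at the device level.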
Single Event Functional Interrupts
- Experienced a few BRAM SEFIs
- Run 30: this is a known SEFI, which we call "BRAM SEFI", where one configuration bit gets flipped and the entire BRAM gets its data inverted
- Another BRAM SEFI is the replacement of a column with the spare column (256 bits in error)
Recovery Strategies
- Scheduled maintenance: reconfigure the FPGA at regular intervals, e.g. once a day, no matter what
- Emergency maintenance: reconfigure as soon as the service can be interrupted, e.g. exploit any reset or power-cycle of the link to reconfigure
- Repair-while-running: partially reconfigure as soon as the error is detected, even if the link is working => interruption of service, i.e. dead time
Emulating Configuration SEUs
- Investigate the impact of configuration SEUs on design functionality
- Flip configuration bits by means of the internal configuration access port (ICAP)
- Optional error correction thanks to the integrated CRC calculator and ECC (FRAME_ECC)
- Programmable integrated controller:
  - SEU generation without correction (error accumulation)
  - SEU generation and on-the-fly correction
- Custom design derived from a Xilinx core (the so-called SEU Controller)
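The two controller modes listed above can be mimicked in software on a plain byte buffer. This is only a toy model of the idea: it does not use the ICAP, FRAME_ECC or the Xilinx SEU Controller, and the function names, frame size and random bit selection are all assumptions made for illustration.

```python
import random
from typing import List


def flip_bit(frame: bytearray, bit_index: int) -> None:
    """Invert one bit of a (simulated) configuration frame in place."""
    frame[bit_index // 8] ^= 1 << (bit_index % 8)


def inject_seus(frame: bytearray, n: int, rng: random.Random,
                correct: bool = False) -> List[int]:
    """Flip `n` randomly chosen bits; return the indices that were hit.

    With correct=False the upsets accumulate in the frame.
    With correct=True each upset is undone right after injection,
    mimicking on-the-fly correction.
    """
    hit = []
    for _ in range(n):
        idx = rng.randrange(len(frame) * 8)
        flip_bit(frame, idx)
        hit.append(idx)
        if correct:
            flip_bit(frame, idx)
    return hit


rng = random.Random(2012)
frame = bytearray(16)                      # made-up 128-bit frame, all zeros
inject_seus(frame, n=1, rng=rng)           # accumulation mode
print(sum(bin(b).count("1") for b in frame), "bit in error")
```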
Need to Estimate Rad-Tolerance
- Two kinds of issues:
  - Total Ionizing Dose (common to all ICs)
  - SEUs
- However, configuration SEUs have been largely over-estimated in the past: they rarely impact the design functionality
- In this talk we will not cover logic errors, which are common to every digital device
Diagram labels: logic errors, lab test (SEU injector), configuration errors, beam test
What if FPGAs Could be Used
- The fast link sub-system would be dramatically simplified: we would only have symmetrical links (FPGA<->FPGA)!
  - No constraints imposed by the line encodings of stand-alone SerDeses (i.e. start/stop bits of the DS92LV18 and 8b10b coding of the TLK2711-A)
  - Just one type of link, with protocol and line coding customized to the ETD requirements
  - Fixed-latency proof and thoroughly tested
- Artix-7: cheap Xilinx FPGAs (40 $) with 8 embedded SerDeses (GTPs)
  - More than one link per chip
  - Data rates up to 3 Gbps on all links (including FCTS); the SerDes could actually go even faster: 6.6 Gbps
  - The number of links could be halved or better (when not driven by topology)