FTK AM boards Design Review

Presentation on theme: "FTK AM boards Design Review"— Presentation transcript:

1 FTK AM boards Design Review
S. Citraro, N. Biesuz, P. Giannetti, P. Luciano, D. Magalotti, M. Piendibene, A. Sakellariou, C.-L. Sotiropoulou Saverio Citraro PhD Student University of Pisa & I.N.F.N. Pisa

2 Outline
Architecture of the AMBSLP (Associative Memory Board Serial Link Processor); FPGA design; detailed description of the power distribution on the board and PCB design; LAMBSLP PCB design; measurements and tests; next tests and fixes

3 The Goal of this review
The AMBSLP_V3 board is the last prototype before production, so we need to discuss the last changes to be made for AMBSLP_V4. This is the right moment to collect your comments and include solutions in the next board version. The design we present is “final” for us, apart from a few marginal issues to be fixed, but some tests are still to be done.

4 AMBSLP
VME 9U board; serial links @ 2 Gb/s; clock @ 100 MHz; power consumption: ~250 W.
Let's start with a short description of the board: the slide shows a picture of the AMBSLP.

5 AMBSLP
The AMB distributes SSs to 64 AMchips. Serial links @ 2 Gb/s; VME; clock @ 100 MHz; supply voltages: 2.5 V, 1.8 V, 1.2 V, 1 V; power consumption: ~250 W. Figure labels: 8 links, 12 links.

6 AMBSLP
The AMB collects roads and sends them to the AUX. Serial links @ 2 Gb/s; VME; clock @ 100 MHz; supply voltages: 2.5 V, 1.8 V, 1.2 V, 1 V; power consumption: ~250 W. Figure labels: 4 links, 16 links.

7 AMBoard Main logic
[Block diagram of the two data paths between the AUX and the LAMBs.]
HIT path: HITs from AUX, HITs to LAMBs. Blocks: GTP RX module, dual-clock FIFO, data processing & monitoring, GTP TX module, spy buffer (SpyB) with parallelized data to VME. Clock domains: GTP clock domain 50 MHz, SYSTEM_CLOCK.
ROAD path (ROAD FPGA): ROADs from LAMBs, ROADs to AUX. Blocks: GTP RX module, dual-clock FIFO, data processing & monitoring, GTP TX module, spy buffer (SpyB) with parallelized data to VME. Clock domains: GTP clock domain 50 MHz, system clock domain 100 MHz.

8 Processing Rate Requirements
AMB specs: the expected number of hits for a 70 pile-up WH event is < 650 per layer, or 1 hit per 15 ns (the 8 layers are handled in parallel). The AMB specification is 1 hit per 10 ns. The expected rate of roads from the AMB is 8 roads per 10 ns, which is satisfied.
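The relation between "650 hits per layer" and "1 hit per 15 ns" can be checked with a short calculation. The sketch below assumes a 100 kHz input event rate, which the slide does not state; with a different rate the numbers change accordingly. The function and constant names are illustrative only.

```python
# Sketch: headroom of the AMB hit-rate specification.
# ASSUMPTION: a 100 kHz input event rate (not stated on the slide).
EVENT_RATE_HZ = 100e3      # assumed event rate
HITS_PER_LAYER = 650       # upper bound per layer quoted on the slide
SPEC_NS_PER_HIT = 10.0     # AMB specification quoted on the slide


def ns_per_hit(hits_per_event: float, event_rate_hz: float) -> float:
    """Average time available per hit on one layer, in nanoseconds."""
    return 1e9 / (hits_per_event * event_rate_hz)


expected_ns = ns_per_hit(HITS_PER_LAYER, EVENT_RATE_HZ)
headroom = expected_ns / SPEC_NS_PER_HIT

print(f"time per hit at the expected occupancy: {expected_ns:.1f} ns")
print(f"headroom with respect to the 10 ns spec: x{headroom:.2f}")
```

Under that assumption the board has roughly 50% more hit bandwidth than the expected occupancy requires.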

9 DATA FORMAT
From AUX to HIT
Type of word                                              Value      K character
IDLE WORD                                                 BCBC1C1C   1111
HITS DATA, 2 words (XXXX & YYYY, 16 bits each)            XXXXYYYY   0000
End Event WORD (RR = error bits, NNNN = L1ID)             F7RRNNNN   1000
From HIT to AMchip
Type of word                                              Value      K character
IDLE WORD                                                 BCBC1C1C   1111
HITS DATA                                                 XXXXYYYY   0000
End Event WORD                                            F7RRNNNN   1000
From AMchip to ROAD
Type of word                                              Value      K character
IDLE WORD                                                 BCBC1C1C   1111
ROAD DATA (BB = bitmap, AAAAAA = road address)            BBAAAAAA   0000
From ROAD to AUX
Type of word                                              Value      K character
IDLE WORD                                                 0000BC50   0010
ROAD DATA                                                 BBAAAAAA   0000
END EVENT WORD                                            F7RRMMMM   1000
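To illustrate how a receiver could tell these word types apart, here is a minimal decoding sketch. It assumes that each letter in the "Value" column stands for one hexadecimal digit of a 32-bit word (most significant digit first) and that the K character is a 4-bit flag delivered alongside the word; the enum and function names are hypothetical and do not come from the firmware.

```python
from enum import Enum


class WordType(Enum):
    IDLE = "idle"
    DATA = "data"            # hit or road data, depending on the link
    END_EVENT = "end_event"
    UNKNOWN = "unknown"


# (value, K character) pairs listed as idle words in the tables.
IDLE_WORDS = {(0xBCBC1C1C, 0b1111), (0x0000BC50, 0b0010)}


def classify(word: int, kchar: int) -> WordType:
    """Classify a 32-bit word from its value and 4-bit K flag."""
    if (word, kchar) in IDLE_WORDS:
        return WordType.IDLE
    if kchar == 0b0000:
        return WordType.DATA
    if kchar == 0b1000 and (word >> 24) == 0xF7:
        return WordType.END_EVENT
    return WordType.UNKNOWN


def decode_hits(word: int) -> tuple:
    """Split XXXXYYYY into its two 16-bit hit words."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def decode_road(word: int) -> dict:
    """Split BBAAAAAA into an 8-bit bitmap and a 24-bit road address."""
    return {"bitmap": (word >> 24) & 0xFF, "address": word & 0xFFFFFF}


def decode_end_event(word: int) -> dict:
    """Split F7RRNNNN into the error bits (RR) and the L1ID (NNNN)."""
    return {"error_bits": (word >> 16) & 0xFF, "l1id": word & 0xFFFF}


print(classify(0xF7000042, 0b1000), decode_end_event(0xF7000042))
```

Keying the classification on the K flag first keeps data words with arbitrary payloads from being mistaken for control words.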

10 FPGAs Usage
Columns: HIT (%), ROAD (%), VME (%), CTRL (%). The column alignment of rows with fewer than four values was lost in extraction; values are listed in their original order.
Number of Slice Registers: 4, 1
Slice LUTs: 6, 9
Occupied Slices: 14, 20, 3
IOBs: 24, 27, 70, 87
RAMB36bits/FIFO36bits: 56, 74, -
RAMB18bits/FIFO18bits: 2
RAMB16bits: (no value)
BUFG/BUFGCTRLs: 50, 31
BUFIO2/BUFIO2_2CLKs: (no value)
BUFIO2FB/BUFIO2FB_2CLKs: (no value)
GTPE2_CHANNELs: 100
DCM/DCM_CLKGENs: 12, 25
MMCME2_ADVs: (no value)
ILOGIC2/ISERDES2s: 7

11 AMBSLP
VME 9U board; serial links @ 2 Gb/s; clock @ 100 MHz; power consumption: ~250 W. LVDS system clock, used for synchronous logic.
On the board we have a 100 MHz system clock that is distributed to all devices; we also have dedicated clocks (in red squares) for the serial interfaces of the FPGAs.

12 AMBSLP
VME 9U board; serial links @ 2 Gb/s; clock @ 100 MHz; power consumption: ~250 W. Dedicated clocks (red squares in the figure) serve the serial interfaces of the FPGAs.

13 New AM Power Distribution Conf.
[Diagram comparing two configurations: a crate supplying 48 V and 5 V, and a crate with modified J1 supplying 48 V, 12 V and 5 V. Rails and loads shown: 5 V and 12 V (VME I/O), 1.2 V (SerDes), 1.8 V and 1 V (FPGAs), 2.5 V (I/O), 1 V (AMchips). Goal: increase the margin on the backplane pin current.]

14 Power Consumption: Motherboard AMB + 4 Daughterboard LAMB
Net           FPGAs*   Tot. power   Usage    Current @ 12 V   Power @ 12 V   Power @ 48 V   Current @ 48 V
5 V (VME)     -        5 W
2.5 V (I/O)   1.2 W    49.2 W       49.2 %   4.56 A           54.7 W         57 W           1.19 A
1.8 V         2.7 W    2.7 W        12.5 %   260 mA           3.1 W          3.2 W          70 mA
1.2 V         2.5 W    8.03 W       55.8 %   760 mA           9.1 W          9.5 W          200 mA
1 V (FPGA)    6 W                   30 %     570 mA           6.8 W          7.1 W          150 mA
1 V (CORE)             137 W        63.4 %   12.7 A           152 W          159 W          3.30 A
Total                               54 %     18.8 A                          235 W          4.90 A
Sum of 2.5 V, 1.8 V, 1.2 V and 1 V (FPGA): measured** as 66 W, estimated 76 W.
* Xilinx Power Estimate
** AMBSLP + 4 LAMBSLPs with V core OFF
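The current columns of this table follow from the power columns by dividing by the bus voltage, and the totals are plain sums. The sketch below redoes that arithmetic from the quoted "Power @ 12 V" and "Power @ 48 V" values; the dictionary layout and variable names are illustrative, and the 5 V (VME) rail is left out because the table gives no 12 V or 48 V figures for it.

```python
# Sketch: re-derive the current columns of the power table from the
# quoted power values (all numbers are taken from the slide).
POWER_AT_12V_W = {
    "2.5 V (I/O)": 54.7,
    "1.8 V": 3.1,
    "1.2 V": 9.1,
    "1 V (FPGA)": 6.8,
    "1 V (CORE)": 152.0,
}
POWER_AT_48V_W = {
    "2.5 V (I/O)": 57.0,
    "1.8 V": 3.2,
    "1.2 V": 9.5,
    "1 V (FPGA)": 7.1,
    "1 V (CORE)": 159.0,
}

current_12v = {net: p / 12.0 for net, p in POWER_AT_12V_W.items()}
current_48v = {net: p / 48.0 for net, p in POWER_AT_48V_W.items()}

total_12v_a = sum(current_12v.values())
total_48v_a = sum(current_48v.values())
total_48v_w = sum(POWER_AT_48V_W.values())

for net in POWER_AT_12V_W:
    print(f"{net:12s} {current_12v[net]:6.2f} A @ 12 V  {current_48v[net]:5.2f} A @ 48 V")
print(f"total: {total_12v_a:.1f} A @ 12 V, {total_48v_a:.2f} A and {total_48v_w:.0f} W @ 48 V")
```

The sums land within rounding of the totals quoted on the slide (18.8 A, 4.90 A, 235 W).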

15 Power Consumption Crate
Supply   Board   Current per board   N. power pins   Current per pin   Current per crate, 16x(AM+AUX)   Max. current available   Power supply use
48 V     AM      4.90 A              10 (J1)         0.49 A            103 A                            150 A                    70 %
48 V     AUX     1.6 A               8 (J0)          0.2 A
5 V              1 A                 5 (J1)                            87 A                             120 A                    72.5 %
5 V      SSB     16 A                16 (all)
5 V      CPU     7 A                 -
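The per-pin and per-crate figures can be reproduced with a few divisions and sums. The sketch below uses only the currents, pin counts and supply limits quoted on the slide; the variable names are illustrative. A straight sum of 16 × (AM + AUX) gives 104 A, slightly above the 103 A quoted, so the slide value presumably uses unrounded inputs.

```python
# Sketch: backplane pin current and crate supply usage,
# using only numbers quoted on the slide.
BOARDS_PER_CRATE = 16

# board -> (current per board in A, number of 48 V power pins)
SUPPLY_48V = {"AM": (4.90, 10), "AUX": (1.6, 8)}
MAX_48V_A = 150.0

per_pin_a = {board: amps / pins for board, (amps, pins) in SUPPLY_48V.items()}

crate_48v_a = BOARDS_PER_CRATE * sum(amps for amps, _ in SUPPLY_48V.values())
usage_48v = crate_48v_a / MAX_48V_A

# 5 V supply: crate total and limit as quoted on the slide.
usage_5v = 87.0 / 120.0

print(per_pin_a)
print(f"48 V: {crate_48v_a:.0f} A per crate, {usage_48v:.0%} of the supply")
print(f"5 V: {usage_5v:.1%} of the supply")
```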

16 DC-DC converters placed on the board
Placed so as not to block the air flow to the AMchips. Converter labels in the figure: 12V, 2.5V, 1V, 1V, 1V, 1.2V, 1.8V, 1V, 1V, 1V, 1V

17 New Power Distribution Conf.
GE Critical Power QBDW033A0B41-HZ DC-DC From 36-75V to 12V, 400W output

18 New Power Distribution Conf.
USED TO GENERATE 1 V for the AMchip core. GE Critical Power MVT040A0X3-SRPHZ DC-DC: from 4.5V–14.4V to 0.6V–2V, 40A output. Max. cap.: ESR ≥ 0.15 mΩ, 7000 µF. DC-DC feature: 3 modules connected in parallel to raise the output current (40A*3*0.9)

19 New Power Distribution Conf.
GE Critical Power MVT040A0X3-SRPHZ DC-DC: from 4.5V–14.4V to 0.6V–2V, 40A output. 100 Watt load: ALL TEMPERATURES BELOW 70 degrees without air flow!

20 New Power Distribution Conf.
Artesyn Embedded Technologies SIL40C2-00SADJ-HJ From V to V, 40A output USED TO GENERATE 2,5 V

21 New Power Distribution Conf.
GE Critical Power UDXS1212A0X3-SRZ 4.5 –14.4V input; 0.51V to 5.5V output 2 × 12A Output Current USED TO GENERATE 1,2 and 1,8 V

22 New Power Distribution Conf.
GE Critical Power UDXS1212A0X3-SRZ 3 –14.4V input; 0.45V to 5.5V output 20 A Output Current USED TO GENERATE 1 V for FPGA

23 PCB Design Detail
Technical data: (1) base material FR4 Tg180°; (2) surface finishing ENIG; 14 layers (8 GND & VCC, 6 signal layers); board size: 416 mm × 367 mm × 2.3 mm; 100 Ω differential impedance for the serial link distribution. Vias: (1) 0.25 mm under BGAs and for signals; (2) 0.6 mm for high current; (3) 0.4 mm for general purpose. Power and ground planes in the stack-up figure: Gnd, Gnd, Gnd, 1V Core / 12V / 48V, 2.5V / 1V FPGA, 1.2V / 1.8V, Gnd, Gnd

24 PCB Power design 48V 48V Unfused 12V DC DC Converter 48V to 12V
1V Core Up Daughter Board Connector 1V Core Down DC DC Converter 12V to 1V

25 PCB Power design DC DC Converter 12V to 2.5V 2.5V FPGA 1V

26 PCB Power design 5V 5V BIAS 5V Unfused DC DC Converter 12V to 1.2V
FPGA DC DC Converter 12V to 1.8V

27 Power Distribution Stress Test
Used passive Loads 0.02 Ω , 50 Watt each 34 Watt Exp. Power (AMchip06) Additional Cooling system for test

28 Power Distribution Stress Test
Total Power 317 Watt: PRBS Test (few minutes) Limited by cooling Simple functional Test

29 Voltage Drop, Voltage measure
Max voltage drop measured: 1V Core net: ~45 mV (calculated resistance ~1 mΩ); 1.2V Core net: ~5 mV; 2.5V Core net: ~19 mV. We will recover these drops by fixing the sense connections. Our specification is a maximum drop of 50 mV. Voltage difference between the two down-side LAMBs: 13 mV
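As a quick consistency check, the measured drops can be compared with the 50 mV specification, and Ohm's law gives the current implied by the quoted 45 mV drop and ~1 mΩ resistance. This is a minimal sketch; the dictionary keys and names are illustrative, and the implied current is only what those two rounded numbers yield together.

```python
# Sketch: compare the measured voltage drops with the 50 mV specification.
SPEC_MAX_DROP_MV = 50.0
MEASURED_DROP_MV = {"1 V core": 45.0, "1.2 V": 5.0, "2.5 V": 19.0}

margin_mv = {net: SPEC_MAX_DROP_MV - drop for net, drop in MEASURED_DROP_MV.items()}
within_spec = all(m >= 0 for m in margin_mv.values())


def implied_current_a(drop_mv: float, resistance_mohm: float) -> float:
    """Ohm's law, I = V / R; mV divided by mOhm gives amperes."""
    return drop_mv / resistance_mohm


core_current_a = implied_current_a(45.0, 1.0)

for net, m in margin_mv.items():
    print(f"{net:9s} drop {MEASURED_DROP_MV[net]:4.0f} mV, margin {m:4.0f} mV")
print(f"all nets within spec: {within_spec}")
print(f"current implied by 45 mV across 1 mOhm: {core_current_a:.0f} A")
```

The 1 V core net is the only one close to the limit, which is why the sense connection matters most there.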

30 PCB Power design Bottleneck 48V 48V Unfused 12V DC DC Converter
48V to 12V 1V Core Up Daughter Board Connector Bottleneck 1V Core Down DC DC Converter 12V to 1V

31 PCB Power design: proposed change
48V 48V Unfused 12V DC DC Converter 48V to 12V 1V Core Up Daughter Board Connector 1V Core Down DC DC Converter 12V to 1V

32 Power Distribution Stress Test (2) To be done
Next test (before final prototype): Switching Active Loads ~100KHz With N-MOS Check the DC-DC response

33 LAMBSLP_V2 16 AMchips 1 High pin Count connector 1 Spartan6 Lx4 FPGA
FR4 Tg150°; 12 layers; Base material FR4 tg180°; Surface finishing ENIG

34 LAMBSLP_V2 Architecture
Input data; two fan-out stages; four daisy chains; output data

35 LAMBSLP_V2 Configuration
VMEDATA[31:24]: LAMB 3; VMEDATA[23:16]: LAMB 2; VMEDATA[15:8]: LAMB 1; VMEDATA[7:0]: LAMB 0.
Within LAMB 3: VMEDATA[31]: AM_chain 7; VMEDATA[30]: AM_chain 6; VMEDATA[29]: AM_chain 5; VMEDATA[28]: AM_chain 4; VMEDATA[27]: AM_chain 3; VMEDATA[26]: AM_chain 2; VMEDATA[25]: AM_chain 1; VMEDATA[24]: AM_chain 0.
A VME operation accesses 32 AMchip chains in parallel. Each bit transmits TMS and TDI and receives TDO.
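The bit assignment above amounts to "LAMB index in the upper part of the bit number, chain index in the lower part". The sketch below encodes that mapping, assuming that the per-chain layout shown for LAMB 3 repeats identically for LAMBs 0 to 2 (the slide only spells it out for LAMB 3). The function names are hypothetical.

```python
# Sketch: map VMEDATA bit positions to (LAMB, AM chain) and back.
# ASSUMPTION: the chain layout shown for LAMB 3 is the same on every LAMB.
CHAINS_PER_LAMB = 8
N_LAMBS = 4


def locate(bit: int) -> tuple:
    """Return (lamb, chain) for a VMEDATA bit index in 0..31."""
    if not 0 <= bit < N_LAMBS * CHAINS_PER_LAMB:
        raise ValueError(f"bit index out of range: {bit}")
    return divmod(bit, CHAINS_PER_LAMB)


def bit_index(lamb: int, chain: int) -> int:
    """Inverse of locate()."""
    if not (0 <= lamb < N_LAMBS and 0 <= chain < CHAINS_PER_LAMB):
        raise ValueError(f"invalid LAMB/chain: {lamb}/{chain}")
    return lamb * CHAINS_PER_LAMB + chain


def pack(values: dict) -> int:
    """Build a 32-bit VMEDATA word from {(lamb, chain): bit value}."""
    word = 0
    for (lamb, chain), value in values.items():
        word |= (value & 1) << bit_index(lamb, chain)
    return word


print(locate(31), locate(24), hex(pack({(3, 7): 1, (0, 0): 1})))
```

With one bit per chain, a single 32-bit VME write advances all 32 JTAG chains by one step.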

36 The High Frequency connector
SAMTEC ASP high-speed connector: ~20 Gb/s. Pin groups in the figure: 1 V, 2.5 V, 1.2 V, GND, signals. The slide also lists the maximum current per pin at 20 °C and at 95 °C and the current per pin for 2.5 V, 1.2 V, 1 V and GND; these values were lost in extraction.

37 AMchip layout
390 uF on 1V core (6x100nF, 4x4.7uF, 3x100uF); 130 uF on 1.2V core (3x100nF, 4x4.7uF, 1x100uF); 260 uF on 2.5V (5x100nF, 4x4.7uF, 2x100uF). Colour code: 1V - YELLOW; 1.2V - PURPLE; 2.5V - PINK; GND - BLUE; SIGNALS - GREEN

38 LAMB cross section
Layers: Gnd, 2.5V, 1.2V, 1V Core, Gnd, 1V Core, Gnd, Gnd

39 Tests Power test (already shown) Signal integrity PRBS tests
Block Transfer timing and pattern bank downloading Configuring FPGAs through VME Mechanical tests

40 Signal Integrity
We focused on two main issues: long lines (max 55 cm), which we split into two segments, refreshing the signal with a buffer repeater; and the crowding of differential pairs on the LAMBs (200 differential pairs in a PCB area of 14 cm²).

41 Signal Integrity
Configurations compared: without guard lines; guard lines floating; guard lines connected to GND; guard lines terminated to 50 Ω. We built all of these configurations on a PCB and measured the quality of the signals.

42 Signal Integrity
As expected, the best configuration is guard lines connected to GND: BER = 20 × 10^-18

43 Signal Integrity: problem on the AMchip05 package
Inside the package substrate there is an impedance mismatch on the input lines: BER = 20 × 10^-6

44 Signal Integrity
With a TDR measurement we found the mistake. In AMchip06 the problem will be fixed.

45 PRBS Test on Input
Path in the figure: PRBS-7 from the AUX, input FPGA, 64 AMchips (PRBS-7 check). Result: ~60 hours without any errors.
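For readers unfamiliar with PRBS-7, the sketch below generates the sequence in software and counts mismatches against a reference. It uses the common PRBS-7 polynomial x^7 + x^6 + 1 as a 7-bit linear-feedback shift register; the slides do not say which polynomial or seed the boards use, so both are assumptions, and the function names are illustrative.

```python
from itertools import islice


def prbs7(seed: int = 0x7F):
    """Yield PRBS-7 bits from a 7-bit LFSR (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("the all-zero state never leaves zero")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of taps 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def count_bit_errors(received, reference) -> int:
    """Count the positions where two bit sequences differ."""
    return sum(1 for r, e in zip(received, reference) if r != e)


reference = list(islice(prbs7(), 127))   # one full period
received = list(reference)
received[10] ^= 1                        # inject a single bit error

print("period length:", len(reference), "ones:", sum(reference))
print("errors detected:", count_bit_errors(received, reference))
```

A hardware checker does the same comparison continuously, re-synchronising its own LFSR to the incoming stream.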

46 PRBS Test on Output
Path in the figure: PRBS-7 from the AMchips, output FPGA, AUX card Rx (PRBS-7 check). Result: ~60 hours without any errors.

47 PRBS between AMchips
Path in the figure: PRBS-7 generators and PRBS-7 check along the input daisy chain. Result: ~60 hours without any errors.
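An error-free run only bounds the bit error rate from above. The sketch below estimates that bound for a 60-hour test, assuming a single link running continuously at the 2 Gb/s quoted earlier in the talk and using the standard zero-error limit -ln(1 - CL) / N. The per-link assumption and the 95% confidence level are choices made for this illustration, not figures from the slides.

```python
import math

# Sketch: upper limit on the BER after an error-free PRBS run.
# ASSUMPTIONS: one link, continuous traffic at 2 Gb/s, 95% confidence level.
LINK_RATE_BPS = 2e9
TEST_HOURS = 60
CONFIDENCE = 0.95

bits_tested = LINK_RATE_BPS * TEST_HOURS * 3600


def ber_upper_limit(n_bits: float, confidence: float) -> float:
    """Upper limit on the BER when zero errors are seen in n_bits."""
    return -math.log(1.0 - confidence) / n_bits


limit = ber_upper_limit(bits_tested, CONFIDENCE)

print(f"bits tested per link: {bits_tested:.2e}")
print(f"BER upper limit at {CONFIDENCE:.0%} CL: {limit:.1e}")
```

If the test covers many links in parallel, the combined bound is tighter by the number of links.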

48 System Configuration AM Bank
The whole AM system will be installed in 8 VME crates, and each crate can be configured independently. The configuration time is dominated by the AM bank. On the AMboard we can write 16 JTAG chains, each made of 2 AMchips, in parallel.
N. of patterns to be written sequentially = 2 × 128 k patterns
N. of bits to write for each pattern (TMS, TDI) = 300 ops/pattern
Time per BT operation = 350 ns (VME block transfer implemented, see next slide)
Total time per board = 2 × 128 k × 300 × 350 ns ≈ 2.7 × 10^10 ns ≈ 27 s/board
Total time per crate = 27 s/board × 16 AMboards ≈ 430 s ≈ 7 minutes/crate
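The configuration-time estimate is a product of four numbers, which makes it easy to re-evaluate when one of them changes. The sketch below reproduces it, assuming "128 k" means 128 000 patterns (this choice reproduces the 430 s per crate quoted on the slide; with 1024-based k the result is about 2% larger). The names are illustrative.

```python
# Sketch: AM bank download time over VME block transfers.
# ASSUMPTION: "128 k" = 128 000 patterns per AMchip.
CHIPS_PER_CHAIN = 2
PATTERNS_PER_CHIP = 128_000
OPS_PER_PATTERN = 300          # TMS/TDI operations per pattern
TIME_PER_OP_S = 350e-9         # one block-transfer cycle
BOARDS_PER_CRATE = 16

time_per_board_s = (
    CHIPS_PER_CHAIN * PATTERNS_PER_CHIP * OPS_PER_PATTERN * TIME_PER_OP_S
)
time_per_crate_s = time_per_board_s * BOARDS_PER_CRATE

print(f"per board: {time_per_board_s:.2f} s")
print(f"per crate: {time_per_crate_s:.0f} s = {time_per_crate_s / 60:.1f} min")
```

Because the boards of a crate share one VME bus, the per-crate time scales linearly with the number of boards.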

49 Block Transfer timing
[Oscilloscope capture of the VME DS_ and AS_ signals; time labels in the figure: 100 ms, 350 ms.]
Number of block transfer cycles = 16 × 64 = 1024; time per BT cycle ~350 ns

50 System Configuration AM Bank
Time needed by the AMBoard: 7 min/crate. Time needed by the AUX card: 16 min/crate. If we use a Flash RAM (already added to AMBSLP_V3), we can configure both AUX and AMBoard in parallel. The time to configure an AMBSLP could then be about equal to the time needed to configure the AUX (the AM bank dominates), once JTAG is driven by the VME chip firmware. Flash RAM S34MS01G2_04G2: 4 Gb for a 1 Gb bank (144 bits × 128 k patterns × 64 chips).
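The bank size quoted in parentheses can be checked against the flash capacity. The sketch below assumes "128 k" means 128 000 patterns and "4 Gb" means 4 × 2^30 bits; both readings are assumptions, and the conclusion (the bank fits with ample margin) does not depend on them.

```python
# Sketch: size of one board's pattern bank versus the flash capacity.
# ASSUMPTIONS: 128 k = 128 000 patterns; 4 Gb = 4 * 2**30 bits.
BITS_PER_PATTERN = 144
PATTERNS_PER_CHIP = 128_000
CHIPS_PER_BOARD = 64
FLASH_BITS = 4 * 2**30

bank_bits = BITS_PER_PATTERN * PATTERNS_PER_CHIP * CHIPS_PER_BOARD
fill_fraction = bank_bits / FLASH_BITS

print(f"bank size: {bank_bits / 1e9:.2f} Gb")
print(f"flash usage: {fill_fraction:.0%}")
```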

51 System Configuration FPGAs
Time needed by the AUX card for FPGA configuration through VME: 5 min/board. On the AMboard, the firmware to configure the FPGAs through VME is not implemented yet, but the JTAG chain of the FPGAs has been connected to the VME chip on the AMBoard_V3. Note: a full-board configuration is rarely needed (e.g. power cycle or change of patterns; a few times per year).

52 Mechanical test
The Wiener backplane is thicker than the CDF one. We changed the AMBSLP shape and moved the P connector by 2 mm to improve the contact in the Wiener bin. We still have to run the PRBS test with the AUX card in the Wiener bin. Figure label: almost 3 mm.

53 To do AMBSLP - Summary
Fix a few simple things (e.g. sense wires, 1 V area, 1 FPGA pin); fix the 2.5 V DC-DC footprint; test FPGA programmability from VME; implement a fast procedure for downloading AM patterns (use of Flash RAM?); stress test with switching active loads; stress test with 3 AMBSLPs next to each other; long tests with random banks and random hits.

54 To do for the final LAMB version LAMBSLP_V3
Change the footprint according to the new AMchip06 pinout; add two temperature sensors in hot spots; read the “hold” signal among the AMchips; connect the sense signal at the farthest position; add some control signals (NoRoad).

55 WHAT WE HAVE today
MOTHERBOARD: 1 AMBSLP_v1; 3 AMBSLP_v2 (Pisa, Thessaloniki, CERN); 1 AMBSLP_v3, 2 more next week.
MEZZANINE: 2 MiniLAMBs; 3 LAMBs_V1: 2 with 4 AM05s (Melbourne, Thessaloniki), 1 with 16 AM05s (Pisa); 4 LAMBs_V2 with 16 chips each (Pisa).

56 Conclusions: NEXT STEPS
The system is working, but no extensive tests have been done so far. We will complete the tests before submitting a new PCB for both LAMBSLP and AMBSLP. AMBSLP: (1) new cooling tests compatible with the T-sim suggestions, (2) complete tests with MC events at full power for the AM core, (3) improve the pattern downloading speed (flash RAM added on board), (4) add the capability to download firmware from VME, plus other details. We ordered the FPGAs; what about the other components?

57 Thanks!

58 1/3 - AMchip info
AMchip power, 2.5 V: typ 90 mA, max 200 mA. Typical power from 2.5 V and 1.2 V: ~0.32 W. Typical values were measured at 2.0 Gb/s under “normal conditions”; max values are the worst combination of datasheet typical values at 2.4 Gb/s. The 1.0 V supply is highly data dependent: core power can go from very low to full power when data arrives (and back again when it stops).

59 2/3 - AMchip core 1.0 V power: possible peaks
Highly data dependent. Nominal: 2.14 A (extrapolated from the AMchip05 measurement), assuming input data at the full speed of 100 MHz and 50% bit swapping; 100% bit swapping would double this value. Add the consumption of the glue logic (top level): + 6 mA. Add the worst case (temperature, voltage, process): × 1.2. Expected total: ~3 A. A 100% bit swap would be huge (double). Can we cover 50% bit swap in worst-case conditions? Max available current: 3.3 A/chip (105 A from 3 DC-DC for 32 AMchips), but the current per connector pin would be > 1.6 A, which is too high, so the limit is 3 A/chip. 3.3 A is feasible if the PS voltage goes from 12 V to 14 V.
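The itemized steps of this estimate and the available current per chip can be written out as a short calculation. The sketch applies only the steps listed on the slide (nominal current, glue logic, worst-case factor) and divides the quoted 105 A by 32 chips; it is not a claim about how the slide's rounded "~3 A" figure was obtained, and the names are illustrative.

```python
# Sketch: AMchip core current budget from the numbers on the slide.
NOMINAL_A = 2.14            # 100 MHz input, 50% bit swapping
GLUE_LOGIC_A = 0.006        # top-level glue logic
WORST_CASE_FACTOR = 1.2     # temperature, voltage, process

AVAILABLE_TOTAL_A = 105.0   # from 3 DC-DC converters
CHIPS_SUPPLIED = 32
PIN_LIMITED_A_PER_CHIP = 3.0

itemized_estimate_a = (NOMINAL_A + GLUE_LOGIC_A) * WORST_CASE_FACTOR
available_per_chip_a = AVAILABLE_TOTAL_A / CHIPS_SUPPLIED
full_swap_nominal_a = 2 * NOMINAL_A   # 100% bit swapping doubles the nominal value

print(f"itemized estimate: {itemized_estimate_a:.2f} A")
print(f"available from the converters: {available_per_chip_a:.2f} A/chip")
print(f"nominal at 100% bit swap: {full_swap_nominal_a:.2f} A")
print("50% swap covered by the pin limit:", itemized_estimate_a <= PIN_LIMITED_A_PER_CHIP)
```

The 100% bit-swap case exceeds both the converter budget and the connector-pin limit, which matches the slide's conclusion that only the 50% case is covered.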

60 3/3 - More realistic AMchip power
We cover 50% bit swap at 100 MHz in worst-case conditions for the power distribution. What is a realistic upper limit for power dissipation? 2 W typ (2.6 W realistic max). See next slide.

61 About AM06 core power
Power is proportional to the number of hits and also to the number of switching bits. Assume a 50% switch probability for all 16 input bits; we will probably use only a subset of the bits, depending on the layer.
Average hits per event: 2900 (FTK TDR table 5, page 77, endcap, extrapolated to mu=80); 3200 (last slide here, rounded up); 4300 (last slide here, worst case with IBL, rounded up).
AMchip estimated “worst case” core power: ~3 W [full input rate (8000 hits/ev), 50% swap probability for all 16 bits, typical consumption (2.5 W) + 20% to cover the worst case].
Expected power at the simulated input rate (80 PU): 3 W × 3200/8000 = 1.2 W.
Goal, 80% of maximum speed: 3 W × 2 × 3200/8000 = 2.4 W core (FTK system requirement: 70%).
Design the power distribution for 2.4 W core max usage (80% of worst case).
Design the cooling for 2.4 W = core (80% typ) + IO: Power/chip = 2.5W typ core * IO = 2.4W total
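The scaling used on this slide is linear in the number of hits per event, so the quoted power figures can be reproduced in a few lines. The sketch below uses only numbers from the slide; the function name and the idea of expressing the design goal as a factor of 2 on the expected rate simply mirror the slide's own arithmetic.

```python
# Sketch: linear scaling of the AMchip core power with the hit rate.
TYPICAL_FULL_RATE_W = 2.5      # typical consumption at full input rate
WORST_CASE_MARGIN = 0.20       # +20% to cover the worst case
FULL_RATE_HITS = 8000          # hits/event at full input rate
EXPECTED_HITS = 3200           # simulated input rate at 80 pile-up

worst_case_w = TYPICAL_FULL_RATE_W * (1 + WORST_CASE_MARGIN)


def core_power_w(hits_per_event: float) -> float:
    """Core power, assumed proportional to the number of hits per event."""
    return worst_case_w * hits_per_event / FULL_RATE_HITS


expected_w = core_power_w(EXPECTED_HITS)
design_w = core_power_w(2 * EXPECTED_HITS)   # design goal: twice the expected rate
design_fraction = design_w / worst_case_w

print(f"worst case: {worst_case_w:.1f} W")
print(f"expected at 80 PU: {expected_w:.1f} W")
print(f"design value: {design_w:.1f} W ({design_fraction:.0%} of worst case)")
```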

62

63 Realistic upper limit for power dissipation
ttbar, mu=80. FTK used layers: Pix0, 1, 2; SCT0, 2, 4, 5, 6. [Plot of Nhits versus layers and towers, with the AMchip limit marked.] Less than 50% usage; expected power ~< 2W.
Worst-case usage, channels 0-7: ~960 IBL, Pixel 0; IBL words - Pix words; Pix1 ~500 words; Pix2 ~400 words; Lay * < 400 words. Total 4300/8000 ~ only 54 MHz instead of the 100 MHz available, so about half of the clock cycles have data. If needed, an interlock on the data rate can be implemented in the AM board.

64 Worse case Board Temperature Simul.
12 V AMBV2 AMBV3

65 FTK Rear Transition Module -VME
Block Diagram for a typical VME Slave Module Interface. The Rear Transition Module (RTM) receives power from the crate, but is not part of the VME data transfer bus. Generally, the RTMs can not be accessed via the Crate CPU, and are used only to bring IO to the front processing module. The AUX board has instead important processing capability To Lambs Data[31:0] Addr[26:0] Data[31:0] Addr[26:0] Ctrl lines Mircea Bogdan, November 11, 2014

