1
ALICE Data Challenges: on the way to recording @ 1 GB/s
2
What is ALICE
9
ALICE Data Acquisition architecture
[Architecture diagram: data from the detectors (Inner Tracking System, Time Projection Chamber, Particle Identification, Photon Spectrometer, Trigger Detectors, Muon) is read out by the Front-End Electronics and sent over the Detector Data Link to the Local Data Concentrators (via Readout Receiver Cards); the Event Building switch (3 GB/s) feeds the Global Data Collectors, steered by the Event Destination Manager, and the Storage switch (1.25 GB/s) feeds Permanent Data Storage; Trigger Levels 0, 1 and 2 deliver trigger decisions to the Front-End Electronics and DAQ.]
10
ALICE running parameters
Two different running modes:
- Heavy Ion (HI): 10^6 seconds/year
- Proton: 10^7 seconds/year
One Data Acquisition system (DAQ): the Data Acquisition and Test Environment (DATE)
Many trigger classes, each providing events at different rates, sizes and sources
HI data rates: 3 GB/s at event building, 1.25 GB/s and ~1 PB/year to mass storage
Proton run: ~0.5 PB/year to mass storage
Staged DAQ installation plan (20% -> 30% -> 100%): 85 -> 300 LDCs, 10 -> 40 Global Data Collectors (GDCs)
Different recording options:
- Local/remote disks
- Permanent Data Storage (PDS): CERN Advanced Storage Manager (CASTOR)
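A rough consistency check of these figures (my own arithmetic, assuming the full bandwidth is sustained over the quoted beam time): 1.25 GB/s × 10^6 s ≈ 1.25 × 10^6 GB ≈ 1.25 PB, of the same order as the quoted ~1 PB/year for heavy ions; the proton figure of ~0.5 PB spread over 10^7 s corresponds to an average of only ~50 MB/s to mass storage.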
11
History of ALICE Data Challenges
Started in 1998 to put together a high-bandwidth DAQ/recording chain
Continued as a periodic activity to:
- Validate the interoperability of all existing components
- Assess and validate developments, trends and options (commercial products, in-house developments)
- Provide guidelines for ALICE & IT development and installation
- Continuously expand up to the ALICE requirements at LHC startup
12
Performance goals [chart: MBytes/s]
13
Data volume goals [chart: TBytes to Mass Storage]
14
The ALICE Data Challenge IV
15
Components & modes
[Diagram: an LDC emulator feeds raw events to the ALICE DAQ; raw data goes to the Objectifier and, as objects, through the CASTOR front-end to CASTOR and the PDS; traffic runs over the private network and the CERN backbone, monitored by AFFAIR and the CASTOR monitor.]
16
Targets
- DAQ system scalability tests
- Single peer-to-peer tests: evaluate the behavior of the DAQ system components with the available hardware; preliminary tuning
- Multiple LDC/GDC tests: add the full Data Acquisition (DAQ) functionality, verify the objectification process, validate & benchmark the CASTOR interface
- Evaluate the performance of new hardware components: new generation of tapes, 10 Gb Ethernet
- Achieve a stable production period: minimum 200 MB/s sustained for 7 days non-stop, 200 TB of data to PDS
17
Software components
- Configurable LDC Emulator (COLE)
- Data Acquisition and Test Environment (DATE) 4.2
- A Fine Fabric and Application Information Recorder (AFFAIR) v1
- ALICE Mock Data Challenge objectifier (ALIMDC)
- ROOT (Object-Oriented Data Analysis Framework) v3.03
- Permanent Data Storage (PDS): CASTOR V
- Linux RedHat 7.2, kernel 2.2 and 2.4
- Physical pinned memory driver (PHYSMEM)
- Standard TCP/IP library
18
Hardware setup
ALICE DAQ: infrastructure & benchmarking
- NFS & DAQ servers
- SMP HP Netserver (4 CPUs): setup & benchmarking
LCG testbed (lxshare): setup & production
- 78 CPU servers on GE: dual ~1 GHz Pentium III, 512 MB RAM, Linux kernel 2.2 and 2.4
- NFS (installation, distribution) and AFS (unused)
- [ ] DISK servers (IDE-based) on GE
Network: mixed FE/GE/trunked GE, private & CERN backbone
- 2 * Extreme Networks Summit 7i switches (32 GE ports)
- 12 * 3COM 4900 switches (16 GE ports)
- CERN backbone: Enterasys SSR8600 routers (28 GE ports)
PDS: 16 * 9940B tape drives in two different buildings
- STK linear tapes, 30 MB/s, 200 GB/cartridge
19
Networking
[Diagram: LDCs & GDCs, disk servers and CPU servers on FE, connected through GE trunks to the backbone (4 Gbps); 16 tape servers (distributed).]
20
Scalability test
Put together as many hosts as possible to verify the scalability of:
- run control state machines
- control and data channels
- DAQ services
- system services
- hardware infrastructure
Connect/control/disconnect cycles plus simple data transfers (a toy sketch follows below)
Data patterns, payloads and throughputs uninteresting
Keywords: usable, reliable, scalable, responsive
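As an illustration of what one connect/control/disconnect cycle over many roles looks like, here is a minimal toy state machine in Python. This is not DATE code; the states, commands and host counts are invented for the sketch.

    from enum import Enum, auto

    class State(Enum):
        DISCONNECTED = auto()
        CONNECTED = auto()
        CONFIGURED = auto()
        RUNNING = auto()

    class EmulatedNode:
        """Toy stand-in for an LDC/GDC role driven by a central run control."""
        def __init__(self, name):
            self.name = name
            self.state = State.DISCONNECTED

        def command(self, cmd):
            # Allowed transitions of the toy state machine.
            transitions = {
                ("connect", State.DISCONNECTED): State.CONNECTED,
                ("configure", State.CONNECTED): State.CONFIGURED,
                ("start", State.CONFIGURED): State.RUNNING,
                ("stop", State.RUNNING): State.CONFIGURED,
                ("disconnect", State.CONFIGURED): State.DISCONNECTED,
            }
            key = (cmd, self.state)
            if key not in transitions:
                raise RuntimeError(f"{self.name}: illegal '{cmd}' in {self.state.name}")
            self.state = transitions[key]

    nodes = [EmulatedNode(f"ldc{i:03d}") for i in range(85)] + \
            [EmulatedNode(f"gdc{i:03d}") for i in range(40)]

    # One connect/control/disconnect cycle over all roles, as in the scalability test.
    for cmd in ("connect", "configure", "start", "stop", "disconnect"):
        for node in nodes:
            node.command(cmd)
    print("all", len(nodes), "roles cycled cleanly")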
21
Scalability test
22
Single peer-to-peer
Compare:
- architectures
- network configurations
- system and DAQ parameters
Exercise:
- DAQ system network modules
- DAQ system clients and daemons
- Linux system calls, system libraries and network drivers
Benchmark and tune (a minimal sketch follows below):
- Linux parameters
- DAQ processes, libraries and network components
- DAQ data flow
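A minimal sketch of this kind of point-to-point measurement, using plain TCP sockets rather than the DATE network modules (the 1 MB event size, port number and host argument are placeholders):

    import socket, sys, time

    EVENT_SIZE = 1024 * 1024      # 1 MB "events", similar to the flat-traffic tests
    N_EVENTS = 2000
    PORT = 5001

    def receiver():
        # Accept one connection and drain it, counting the bytes received.
        srv = socket.socket()
        srv.bind(("", PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        total = 0
        while True:
            chunk = conn.recv(1 << 20)
            if not chunk:
                break
            total += len(chunk)
        print(f"received {total / 1e6:.0f} MB")

    def sender(host):
        # Ship N_EVENTS fixed-size payloads and report the achieved throughput.
        payload = b"\0" * EVENT_SIZE
        sock = socket.socket()
        sock.connect((host, PORT))
        t0 = time.time()
        for _ in range(N_EVENTS):
            sock.sendall(payload)
        sock.close()
        print(f"{N_EVENTS * EVENT_SIZE / (time.time() - t0) / 1e6:.1f} MB/s")

    if __name__ == "__main__":
        # Usage: "python p2p.py recv" on one node, "python p2p.py <recv-host>" on the other.
        receiver() if sys.argv[1] == "recv" else sender(sys.argv[1])

Tuning then amounts to varying the event size, socket buffers and kernel parameters while recording MB/s and CPU cost per MB, which is what the curves on the next slide show.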
23
Single peer-to-peer
[Chart: transfer speed (MB/s) and GDC/LDC CPU usage (% CPU/MB) versus event size (200-2000 KB). Test node: HP Netserver LH6000, 4 * Xeon 700 MHz, 1.5 GB RAM, 3COM BaseT, tg3 driver, RedHat 7.3.1, kernel x.cernsmp.]
24
Full test runtime options
- Different trigger classes for different traffic patterns
- Several recording options: NULL, GDC disk, CASTOR disks, CASTOR tapes
- Raw data vs. ROOT objects
We concentrated on two major traffic patterns:
- Flat traffic: all LDCs send the same event
- ALICE-like traffic: periodic sequence of different events, distributed according to the forecasted ALICE raw data (a toy sketch follows below)
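A toy illustration of the two patterns (the detector list, fragment sizes and trigger mix below are invented for the example, not the forecasted ALICE figures):

    # Hypothetical per-detector fragment sizes in MB for two trigger classes.
    CENTRAL = {"TPC": 3.8, "ITS": 0.4, "TRD": 0.9, "TOF": 0.2}
    MINBIAS = {"TPC": 0.5, "ITS": 0.1, "TRD": 0.1, "TOF": 0.05}

    def flat_traffic(n_events, size_mb=1.0):
        """Flat traffic: every LDC ships an identical fixed-size event."""
        for _ in range(n_events):
            yield {det: size_mb for det in CENTRAL}

    def alice_like_traffic(n_events, period=10):
        """ALICE-like traffic: a periodic sequence mixing trigger classes,
        one central event followed by (period - 1) minimum-bias ones."""
        for i in range(n_events):
            yield dict(CENTRAL if i % period == 0 else MINBIAS)

    if __name__ == "__main__":
        total = sum(sum(ev.values()) for ev in alice_like_traffic(10000))
        print(f"average ALICE-like event size: {total / 10000:.2f} MB")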
25
Performance goals [chart: MBytes/s, with 650 MB/s marked]
26
Flat data traffic
40 LDCs * 38 GDCs, 1 MB/event/LDC, NULL recording
[Chart: occupancies]
Critical item: load balancing over the GE trunks (2/3 of nominal)
27
Load distribution on trunks
[Chart: aggregate throughput (MB/s, up to ~500) versus number of LDCs (1-7), comparing LDCs distributed across switches with LDCs on the same switch.]
3 GE trunks should give ~330 MB/s: why only ~220 MB/s?
During the challenge (200 MB/s sustained) each switch carried both DATE and CASTOR traffic -> contention; separate switches would have avoided it -> insufficient number of switches
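One plausible reading of the ~2/3 figure, offered purely as an illustration and not as the analysis behind the slide: if flows are pinned to the three trunk links by a static hash, the busiest link saturates first and, with synchronized event building, throttles every sender. A toy simulation of that effect:

    import random

    N_LINKS = 3        # GE links in the trunk
    LINK_MBS = 110     # assumed ~110 MB/s usable per GE link, ~330 MB/s aggregate

    def trunk_throughput(n_flows, trials=10000):
        """Average aggregate rate when flows are pinned to links by a static hash
        and event building keeps all senders at the pace of the slowest."""
        total = 0.0
        for _ in range(trials):
            loads = [0] * N_LINKS
            for _ in range(n_flows):
                loads[random.randrange(N_LINKS)] += 1   # hash ~ random pinning
            per_flow = LINK_MBS / max(loads)            # busiest link sets the pace
            total += per_flow * n_flows
        return total / trials

    for flows in (3, 6, 12, 24):
        print(f"{flows:3d} flows -> ~{trunk_throughput(flows):.0f} MB/s aggregate")

With perfect balancing the same flows would reach the full ~330 MB/s, which is roughly the gap the slide is pointing at.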
28
ALICE-like traffic
LDCs: rather realistic simulation
- partitioned into detectors
- no hardware trigger
- simulated readout, no "real" input channels
GDCs acting as:
- event builder
- CASTOR front-end
Data traffic:
- realistic event sizes and trigger classes
- partial detector readout
- networking & node distribution scaled down & adapted
29
Challenge setup & outcomes
~25 LDCs (TPC: 10, other detectors: [ ] LDCs)
~50 GDCs
Each satellite switch: 12 LDCs/GDCs (distributed)
[ ] tape servers on the CERN backbone, [ ] tape drives attached to a tape server
No objectification:
- named pipes too slow and too heavy
- upgraded to avoid named pipes, but ALIMDC/CASTOR still not performing well
LDCs per detector: ITS Pixels: 2, ITS Drift: 3, ITS Strips: 1, TPC: 10, TRD: 2, TOF: 1, PHOS: 1, HMPID: 1, MUON: 2, PMD: 2, TRIGGER: 1
30
Impact of traffic pattern
[Chart comparing throughput for the FLAT/CASTOR, ALICE/NULL and ALICE/CASTOR configurations.]
31
Performance goals [chart: MBytes/s, with 200 MB/s marked]
32
Production run: 8 LDCs * 16 GDCs, 1 MB/event/LDC (FLAT traffic)
[ ] tape server and tape units
7 days at ~300 MB/s sustained, >350 MB/s peak, ~180 TB to tape
9 Dec: too much input data
10 Dec: HW failures on tape drives & reconfiguration
Despite the failures, the performance goals were always exceeded
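As a quick consistency check (my arithmetic): 300 MB/s sustained for 7 days is 300 MB/s × 7 × 86,400 s ≈ 1.8 × 10^8 MB ≈ 180 TB, matching the quoted volume written to tape.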
33
System reliability
Hosts:
- ~10% Dead On Installation (DOI): power supply, hard disk, boot failure, bad BIOS, failed installation
- ~25% Failed On Installation (FOI): no AFS, not all products installed, NIC running at FE or half duplex, single CPU; usually solved by a complete re-installation, sometimes repeated twice
Long period of short runs (tuning):
- occasional problems (recovered) with the name server, the network & the O.S.
- on average [ ] O.S. failures per week (on 77 hosts)
- unrecoverable occasional failures on GE interfaces
Production run: one tape unit failed and had to be excluded
34
Outcomes
DATE: 80 hosts / 160+ roles with one run control
- Excellent reliability and performance
- Scalable and efficient architecture
Linux:
- A few hiccups here and there, but rather stable and fast
- Excellent network performance / CPU usage
- Some components are too slow (e.g. named pipes)
- More reliability needed from the GE interfaces
GDCs: 70% of each CPU free for extra activities
- CPU user: 1%; CPU system (interrupts, libraries): [ ]%
- Input rate: [ ] MB/s (average: 7 MB/s)
LDCs: [ ]% CPU busy
- CPU user: [3..6]%; CPU system: [10..25]%
- In reality: higher CPU user, same CPU system (pRORC: no interrupts)
MEMORY:
- LDC: the TPC can buffer 300/3.79 ~ 80 events: no problems
- GDC: every central event needs ~43 MB; the event builder has to allocate maxEventSize to each LDC (to ensure forward progress). In reality events are allocated on the fly according to their real size, but the minimum has to be ensured statically (a worked reading follows below)
NETWORK: configured as a mesh with the same resources on each node; the DAQ needs a tree with an asymmetric assignment of branches (to absorb all the traffic)
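A worked reading of the memory numbers (my arithmetic, based on the figures quoted on the slides, with ~300 MB presumably being the LDC buffer size and ~3.79 MB the TPC fragment size): 300 / 3.79 ≈ 79, i.e. roughly 80 events buffered per TPC LDC. The same fragment size is consistent with the ~43 MB central event at the GDC: 10 TPC LDCs × 3.79 MB ≈ 38 MB, plus a few MB from the other detectors. The GDC, however, must statically reserve the maximum event size for every contributing LDC to guarantee forward progress, so its memory reservation is driven by the worst case rather than by the ~43 MB a typical central event actually occupies.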
35
Outcomes
FARM installation and operation: not to be underestimated!
CASTOR:
- Reliable and effective
- Improvements needed on: overloading, parallelizing tape resources
Tapes: one DOA and one DOO
Network: a silent but very effective partner
- Layout made for a farm, not optimized for the ALICE DAQ
- 10 Gb Ethernet tests: failure at the first problem; "fixed" too late for the Data Challenge
- Reconfiguration: transparent to DAQ and CASTOR
36
Future ALICE Data Challenges
Continue the planned progression:
- ALICE-like pattern
- Record ROOT objects
- New technologies: CPUs, servers, network (NICs, infrastructure, beyond 1 GbE)
- Insert online algorithms
- Provide some "real" input channels
- Get ready to record at 1.25 GB/s
[Chart: MBytes/s]