Two running modes
- Dr Jekyll…: Pb-Pb collisions, general-purpose heavy ion experiment
- … and Mr Hyde: pp beam, large cross-section pp processes
ALICE data rates
Run period: Pb-Pb run, 1 month (10^6 s); pp run, 10 months
Columns: Trigger type, Event rate (Hz), Event size (MB), Data in DAQ (GB/s), Data in EB (GB/s), Data on tape (GB/s), Total on tape (PB)
Values: 1 Minimum Bias 20 1 - 87 Central 67 - 87 Dielectrons 200 Dimuon 670 0.7 - 2.4 24.5 2.5 1.25 NA 0.5 0.1 500 2
The original architecture
Detector Data Link
Functions:
- main interface with the detectors
- handle detector-to-LDC data flow
- handle LDC-to-detector commands & data
Keywords: cheap, small, functional, rad-hard, long distance, optical, used everywhere
Local Data Concentrator
Functions:
- handle and control the local DDL(s)
- format the data
- perform local event building
- allow monitoring functions
- ship events to the event builders (GDCs)
Keywords:
- distributed
- good data moving capabilities from the DDL to the Event Building Link
- CPU power not indispensable
- not a farm
Global Data Collector
Functions:
- accept the data sent from the LDCs
- perform final event building
- ship the events to the Permanent Data Storage (PDS)
Keywords:
- distributed
- good data moving capabilities from the LDCs to the PDS
- CPU power not indispensable
- farm
Event Destination Manager
Functions:
- collect availability information from the GDCs
- distribute event distribution policies to the data sources
Keywords: optimized network usage, look-ahead capabilities
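The two functions above can be pictured with a small sketch. This is only an illustration under stated assumptions: the class `EventDestinationManager`, the helper `choose_gdc`, the notion of "free slots" and the modulo rule are all hypothetical and are not taken from the actual DAQ software.

```python
from dataclasses import dataclass, field


@dataclass
class EventDestinationManager:
    """Hypothetical EDM: tracks free GDC capacity and publishes a policy."""

    # GDC name -> number of events it can still accept (assumed metric)
    free_slots: dict = field(default_factory=dict)

    def report(self, gdc: str, slots: int) -> None:
        # A GDC announces how many more events it can accept.
        self.free_slots[gdc] = slots

    def policy(self) -> list:
        # The policy is the list of GDCs that can still accept events,
        # ordered by free capacity (largest first, name as tie-breaker).
        ready = [g for g, s in self.free_slots.items() if s > 0]
        return sorted(ready, key=lambda g: (-self.free_slots[g], g))


def choose_gdc(event_id: int, policy: list) -> str:
    # Every data source applies the same deterministic rule, so all
    # sub-events of one event reach the same GDC without the sources
    # having to talk to each other.
    if not policy:
        raise RuntimeError("no GDC available: sources must hold the data")
    return policy[event_id % len(policy)]
```

Distributing a policy rather than a per-event decision keeps the manager off the data path, which is one way to read the "optimized network usage" keyword.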
Event Building Link
Functions:
- move data from the LDCs to the GDCs
Keywords:
- big events (1-3, 67-87 MB)
- low rates (20, 500, 670 Hz)
- many-to-many
- mainly unidirectional
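As a rough check of what "big events, low rates" means for the link, rate and size can be multiplied out. The sketch below is illustrative only: pairing the 20 Hz rate with the 67-87 MB event size is an assumption, as is the decimal convention 1 GB = 1000 MB, and `throughput_gb_per_s` is a hypothetical helper.

```python
def throughput_gb_per_s(rate_hz: float, size_mb: float) -> float:
    """Aggregate throughput in GB/s for a given event rate and event size."""
    # Assumes decimal units: 1 GB = 1000 MB.
    return rate_hz * size_mb / 1000.0


# Assumed pairing: 20 Hz with the 67-87 MB event size range.
low = throughput_gb_per_s(20, 67)
high = throughput_gb_per_s(20, 87)
```

Under that assumption the link has to carry roughly 1.3 to 1.7 GB/s, spread over many LDC-to-GDC connections.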
Overall key concepts
- Keep forward flow of data
- Allow back-pressure at all levels (DDL, EBL, STL)
- Standard Hw and Sw solutions sought: ALICE collaboration, CERN computing infrastructure
- Whenever possible go COTS
- During the pp run, keep any unused hardware busy
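Back-pressure can be illustrated with bounded buffers: a stage that is full refuses new data, and the refusal propagates upstream instead of data being dropped. The following is a minimal sketch, not the actual DAQ mechanism; `Stage`, `offer` and `forward` are hypothetical names, and the capacities are arbitrary.

```python
from collections import deque


class Stage:
    """Hypothetical pipeline stage (e.g. an LDC or a GDC) with a bounded buffer."""

    def __init__(self, name: str, capacity: int):
        self.name = name
        self.capacity = capacity
        self.buffer = deque()

    def offer(self, event) -> bool:
        # Refuse the event when the buffer is full: this refusal is the
        # back-pressure signal seen by the upstream stage.
        if len(self.buffer) >= self.capacity:
            return False
        self.buffer.append(event)
        return True


def forward(upstream: Stage, downstream: Stage) -> int:
    # Move events forward, oldest first, until the upstream buffer is
    # empty or the downstream stage refuses. Refused events stay upstream.
    moved = 0
    while upstream.buffer and downstream.offer(upstream.buffer[0]):
        upstream.buffer.popleft()
        moved += 1
    return moved
```

With an LDC of capacity 4 feeding a GDC of capacity 2, `forward` moves two events and leaves the other two in the LDC, which in turn starts refusing data from its own source once it fills up.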
Mismatch of rates
- Recent introduction of: Transition Radiation Detector (TRD), Dielectron trigger
- Change in Pixel event size
- Increase in estimated TPC average occupancy
Required throughput an order of magnitude too high!
New scenarios:
- region-of-interest readout
- online compression
- online reconstruction
- introduction of a level 3 trigger
The new architecture
The Event Building process
- Events flow asynchronously into the LDCs
- Each LDC performs, if needed, local event building
- The Level 3 farm, if present, is notified
- The Level 3 decision, if any, is sent to the LDCs and GDC
- All data sources decide where to send the data according to:
  - directives from the Event Destination Manager
  - the content of the event
- The chosen GDC receives:
  - sub-events
  - optional reconstructed and compressed data
  - optional Level 3 decision
- The Event Building Link does the rest
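The GDC side of this process, collecting sub-events that arrive asynchronously and releasing an event once every source has contributed, can be sketched as follows. This is an assumption-laden illustration: `EventBuilder`, its `add` method and the source names are hypothetical, and the optional Level 3 decision and compressed data are not modelled.

```python
from collections import defaultdict


class EventBuilder:
    """Hypothetical GDC-side builder keyed by event identifier."""

    def __init__(self, expected_sources):
        # Names of the LDCs that must contribute to every event (assumed fixed).
        self.expected = set(expected_sources)
        # event_id -> {source name: sub-event payload}
        self.pending = defaultdict(dict)

    def add(self, event_id: int, source: str, payload: bytes):
        # Store the sub-event; sub-events of different events may interleave.
        self.pending[event_id][source] = payload
        # Release the full event once all expected sources have reported.
        if set(self.pending[event_id]) >= self.expected:
            return self.pending.pop(event_id)
        return None
```

A call returns `None` while the event is incomplete and the assembled dictionary of sub-events once the last expected LDC has delivered its part.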
Software environment
DATE: Data acquisition environment for ALICE and test beams
- Supports DDLs, LDCs, GDCs and liaison to the PDS
- Standalone and complex DAQ systems
- Integrated with HPSS and CASTOR (via CDR)
Keywords: C, TCP/IP, Tcl/Tk, Java, ROOT
Data challenges
Use state-of-the-art equipment for real-life exercise
1998-1999: Challenge I
- 7 days @ 14 MB/s, 7 TB
1999-2000: Challenge II
- 2 * 7 days @ max 100 MB/s, > 20 TB
- transfer simulated TPC data
- 23 LDCs * 20 GDCs (AIX/Solaris/Linux)
- with offline filtering algorithms and online objectifier (ROOT)
- two different MSS (HPSS and CASTOR)
- several problems
- limited stability
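The quoted rates and durations can be turned into an upper bound on the recorded volume. The sketch below assumes uninterrupted running at the quoted rate and decimal units (1 TB = 10^6 MB); `volume_tb` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400


def volume_tb(rate_mb_per_s: float, days: float) -> float:
    """Data volume in TB when running continuously at the given rate."""
    # Assumes decimal units: 1 TB = 1_000_000 MB.
    return rate_mb_per_s * days * SECONDS_PER_DAY / 1_000_000


challenge_1_bound = volume_tb(14, 7)        # 7 days at 14 MB/s
challenge_2_bound = volume_tb(100, 2 * 7)   # 2 * 7 days at the 100 MB/s maximum
```

These bounds come out at about 8.5 TB and about 121 TB. The recorded totals of 7 TB and more than 20 TB are lower, which would be consistent with the rates not being held for the whole period; for Challenge II the 100 MB/s figure is explicitly a maximum.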
Data Challenge II
Event building network
- Pure Linux setup
- 20 data sources
- FastEthernet local connection
- GigaBit Ethernet backbone
Run log
Data Challenge III
- Will run during the winter 2000-2001 shutdown
- Target: 100 MB/s (or more) sustained over [7..10] days
- Improved stability
- More “ALICE like” setup: abandon older architectures still in use at the test beams
- Implement 10% of the planned ALICE EB throughput
- Integrate new modules & prototypes: improved event building, Level 3, Regional Centers
- Will use the LHC computing testbed
- Better status reporting tools: use PEM if available