
ALICE Data Challenge V
P. VANDE VYVRE – CERN/PH
LCG PEB – CERN, March 2004

LCG PEB March 2004 – 2 – P. VANDE VYVRE CERN-EP
Data Logical Model and Requirements Tested during ADC
[Architecture diagram: data flows from the detector digitizers through front-end pipelines/buffers and the Detector Data Link (DDL) into readout and sub-event buffers on the Local Data Concentrators (LDC), across the event-building network into event buffers on the Global Data Collectors (GDC), and over the storage network to Transient Data Storage (TDS) and Permanent Data Storage (PDS); Trigger Levels 0/1 and 2 and the High-Level Trigger feed decisions into the chain. Quoted bandwidths at successive stages: 25 GB/s, 2.50 GB/s, and 1.25 GB/s (the sketch below works out the implied reduction factors).]
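The three bandwidth figures describe successive reductions of the data rate along the chain. A minimal back-of-envelope sketch in Python, assuming (this association is ours, not stated explicitly on the slide) that 25 GB/s is the detector readout, 2.50 GB/s the event-building traffic, and 1.25 GB/s the rate to permanent storage:

```python
# Back-of-envelope model of the staged bandwidth reductions quoted on the slide.
# The association of each rate with a stage is an assumption; the ratios are
# simple arithmetic on the quoted numbers, not extra ALICE design parameters.
stages = [
    ("Detector readout (DDL -> LDC)", 25.00),   # GB/s
    ("Event building (LDC -> GDC)",    2.50),   # GB/s
    ("Permanent data storage (PDS)",   1.25),   # GB/s
]

for (stage_in, rate_in), (stage_out, rate_out) in zip(stages, stages[1:]):
    factor = rate_in / rate_out
    print(f"{stage_in}: {rate_in:5.2f} GB/s -> "
          f"{stage_out}: {rate_out:5.2f} GB/s (reduction x{factor:.0f})")
```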

LCG PEB March 2004 – 3 – P. VANDE VYVRE CERN-EP
Architecture & Performance Goals (1)
DAQ project:
- System size and scalability:
  - Scale similar to ALICE year 1 (2007 pp and PbPb runs)
  - 30% of the final performance; scalability up to 150 nodes
- System performance:
  - ALICE data traffic: verify optimal usage of the computing resources
  - Verify load balancing
- From DAQ to MSS (see the volume check below):
  - Tape: 300 MB/s sustained over a week
  - Disk: 450 MB/s peak needed
- Performance monitoring
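The tape milestone translates directly into a data volume and a cartridge count (using the ~200 GB STK 9940B volumes quoted on the Technology Goals slide); a quick sanity check:

```python
# Volume implied by the "300 MB/s sustained over a week" tape milestone.
rate_mb_s = 300                        # sustained rate to tape, MB/s
seconds = 7 * 24 * 3600                # one week = 604800 s
volume_tb = rate_mb_s * seconds / 1e6  # MB -> TB (decimal units)
cartridges = volume_tb * 1000 / 200    # STK 9940B: ~200 GB per volume
print(f"~{volume_tb:.0f} TB in a week, ~{cartridges:.0f} cartridges")  # ~181 TB, ~907
```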

LCG PEB March 2004 – 4 – P. VANDE VYVRE CERN-EP
Architecture & Performance Goals (2)
Offline project:
- Simulated raw data from several detectors (large and small data fragments)
  - Used during ADC V: TPC, ITS
  - Other detectors: dummy data of realistic size
- Different trigger classes and detector sets with realistic multiplicity
- Read data back
- Improve alimdc (ROOT formatting program)/CASTOR performance
- Algorithms from the HLT project used for data-monitoring purposes
- Automatic registration of files in the AliEn catalogue for world-wide availability

LCG PEB March 2004 – 5 – P. VANDE VYVRE CERN-EP
Technology Goals
- CPU servers:
  - Mostly dual-CPU machines (LXSHARE)
  - SMP machines (HP Netservers) for DAQ services (ALICE)
  - IA-64 technology: test the DATE code on Itanium
- Network:
  - New generation of NIC cards (Intel Pro 1000)
  - Trunking
  - 10 Gbit Ethernet backbone, including NICs
- Storage:
  - Disk servers: 23 new IDE-based disk servers (nominal performance: 90 MB/s; see the aggregate check below)
  - Tapes: STK 9940B: ~30 MB/s, ~200 GB/volume
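As a sanity check, the nominal aggregate of the new disk-server farm can be compared with the 450 MB/s peak disk goal from the previous slide; a trivial sketch:

```python
# Nominal aggregate disk bandwidth vs. the ADC V peak disk goal.
n_servers, nominal_mb_s = 23, 90       # 23 IDE disk servers at 90 MB/s nominal
peak_goal_mb_s = 450                   # peak disk bandwidth needed (slide 3)
aggregate = n_servers * nominal_mb_s   # 2070 MB/s if every server streams flat out
print(f"nominal aggregate: {aggregate} MB/s, "
      f"~{aggregate / peak_goal_mb_s:.1f}x the {peak_goal_mb_s} MB/s peak goal")
```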

LCG PEB March 2004 – 6 – P. VANDE VYVRE CERN-EP
HW Architecture
[Architecture diagram; the recoverable specifications:]
- ~80 CPU servers (LDCs and GDCs): 2 x 2.4 GHz Xeon, 1 GB RAM, Intel 8254EM Gigabit NIC in PCI-X 133 (Intel PRO/1000), CERN Linux
- Disk servers: 2 x 2.0 GHz Xeon, 1 GB RAM, Intel 82544GC NIC
- 32 x GE (link bundle shown on the diagram)
- 32 IA-64 HP rx2600 servers: 2 x 1 GHz Itanium-2, 2 GB RAM, Broadcom NetXtreme BCM5701 (tg3), RedHat Advanced Workstation, … GB/s to memory, 4.0 GB/s to I/O
- Network: 3COM 4900 edge switches (Gigabit), Enterasys E1 OAS (12 x Gbit, 1 x 10 Gbit), Enterasys ER16 core router (16 slots, 4/8 x Gbit or 1 x 10 Gbit per slot)

LCG PEB March 2004 – 7 – P. VANDE VYVRE CERN-EP
System Setup: CPU servers

Period            Requested (Cocotime)   Allocated & used: LCG   Openlab    Comments
Mar. 2003         …                      –                       –          Not used by ALICE due to an internal review
Apr.–Jul. 2003    …                      ~80                     5          DAQ + network tests, addition of IA-64 nodes, setup of performance monitoring
Aug. 2003         …                      ~80                     20         New CPU SEIL servers; network problems
Sep. 2003         …                      ~80                     20         Broadcom NICs replaced by Intel; Enterasys ER16 replaced by N7
Oct. 2003         …                      ~80                     20
Nov. 2003         …                      ~80                     20
Dec. 2003         …                      (70)                    20
Jan. 2004         …                      (70)                    20         Production
Feb. 2004         …                      (70)                    20 (+15)   Production

LCG PEB March 2004 – 8 – P. VANDE VYVRE CERN-EP
System Setup: Storage
[Table, per month (March, April, Oct., Nov., Dec., Jan., Feb.): number of disk servers, requested vs. measured bandwidth to disk (MB/s), and requested vs. measured bandwidth to tape (MB/s); the numeric values did not survive the transcript.]

LCG PEB March 2004 – 9 – P. VANDE VYVRE CERN-EP
ALICE DC: Scalability
[Plot not preserved in the transcript.]

LCG PEB March 2004 – 10 – P. VANDE VYVRE CERN-EP
ALICE DC – DAQ Bw
[Plot: DAQ bandwidth, MBytes/s.]

LCG PEB March 2004 – 11 – P. VANDE VYVRE CERN-EP
Trunking ADC IV
[Plot: bandwidth (MB/s) vs. number of LDCs, comparing distributed LDCs with LDCs on the same switch, over a trunk of 3 x Gb Eth.]

LCG PEB March 2004 – 12 – P. VANDE VYVRE CERN-EP
Trunking ADC V
[Plot: trunk of 4 x Gb Eth; a rough throughput ceiling for such trunks is sketched below.]
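For scale when reading the two trunking plots: the ceiling of an N-way Gigabit Ethernet trunk is the sum of its member links less protocol overhead. A rough estimate (the ~94% payload efficiency for standard 1500-byte frames is a textbook figure, not a number from this talk):

```python
# Rough throughput ceiling of an N-way Gigabit Ethernet trunk.
# The ~94% payload efficiency (Ethernet/IP/TCP headers, preamble, inter-frame
# gap at 1500-byte MTU) is a generic estimate, not an ADC measurement.
def trunk_ceiling_mb_s(n_links, efficiency=0.94):
    per_link_mb_s = 1e9 * efficiency / 8 / 1e6   # one Gb link in MB/s
    return n_links * per_link_mb_s

for n in (3, 4):   # ADC IV trunked 3 x Gb Eth, ADC V 4 x Gb Eth
    print(f"{n} x Gb Eth trunk: ~{trunk_ceiling_mb_s(n):.0f} MB/s ceiling")
```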

LCG PEB March 2004 – 13 – P. VANDE VYVRE CERN-EP
ALICE DC – MSS Bw (1)
[Plot: MBytes/s.]
- alimdc/rootd/CASTOR bandwidth between 2 nodes: 30 MB/s (see the stream-count check below)
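With ~30 MB/s per node pair, the 300 MB/s sustained-tape milestone can only be met by running streams in parallel; a trivial check of the minimum stream count:

```python
# Minimum number of parallel alimdc/rootd/CASTOR streams implied by the goals.
import math
per_stream_mb_s = 30   # measured between two nodes (this slide)
goal_mb_s = 300        # sustained-to-tape milestone (slide 3)
print(math.ceil(goal_mb_s / per_stream_mb_s), "parallel streams at minimum")  # 10
```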

LCG PEB March 2004 – 14 – P. VANDE VYVRE CERN-EP
ALICE DC – MSS Bw (2)
[Plot not preserved in the transcript.]

LCG PEB March 2004 – 15 – P. VANDE VYVRE CERN-EP
ALICE DC – MSS Bw (3)
[Plot not preserved in the transcript.]

LCG PEB March 2004 – 16 – P. VANDE VYVRE CERN-EP
Achievements (1)
- System size
- System scalability (HW and DATE SW)
- Performance test with ALICE data traffic:
  - ALICE-like traffic, LDCs working in ALICE conditions: realistic ratio of event rates and sub-event sizes from one LDC to another
  - ALICE-like events using simulated data: realistic (sub-)event size on tape (ALICE year 1)
  - DATE load balancing demonstrated and used
- Sustained bandwidth to tape not achieved:
  - Peak: 350 MB/s
  - Production-quality level reached only in the last week of the test
  - 280 MB/s sustained over 1 day, but with interventions
- IA-64 from Openlab successfully integrated in the ADC V

LCG PEB March 2004 – 17 – P. VANDE VYVRE CERN-EP
Achievements (2)
- Simulated raw data used for the performance tests:
  - Several detectors
  - Several triggers
- Data read back from CASTOR:
  - Data read back and verified
  - Data fully reconstructed
- alimdc/CASTOR bandwidth: from 3 to 10 MB/s per data stream
- Algorithms from HLT successfully integrated

LCG PEB March 2004 – 18 – P. VANDE VYVRE CERN-EP
Hardware components
- Network:
  - LDCs and GDCs: stable and scalable, including trunking
  - Between GDCs and disk servers: unreliable
    - Trunking not scaling as expected
    - Module broken and replaced twice in the Enterasys router
    - Network either seriously degraded or completely unusable
  - 10 Gbit Eth. backbone
  - New generation of NIC cards (Intel Pro 1000)
  - NICs from Broadcom unreliable; replaced by Intel Pro 1000
- Several CPU servers unusable (~3 out of 70)
- Storage:
  - Hardware problems on the disk servers (unrecovered hard-disk failures)
  - Unfortunate reaction from CASTOR, concentrating requests on the faulty machine
  - Several last-minute workarounds needed (scripts for monitoring and reconfiguring; a sketch of the idea follows below)
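A minimal sketch of the kind of monitoring-and-reconfiguration script alluded to above, assuming a pool of disk servers that recording streams can be steered away from; the host names, port, and health test are illustrative, not the actual ADC V scripts:

```python
# Hypothetical watchdog in the spirit of the ADC V workaround scripts: poll
# each disk server and drop unresponsive ones from the pool that recording
# streams are directed to. All names and the port are placeholders.
import socket

DISK_SERVERS = ["diskserver01", "diskserver02", "diskserver03"]  # placeholders
SERVICE_PORT = 5001                                              # placeholder

def responsive(host, port, timeout=5.0):
    """True if the data-mover service on `host` accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_pool(servers):
    """Return only the servers that pass the health check, warning on the rest."""
    good = [h for h in servers if responsive(h, SERVICE_PORT)]
    for h in set(servers) - set(good):
        print(f"WARNING: {h} unresponsive, removing from recording pool")
    return good

if __name__ == "__main__":
    print("active disk servers:", healthy_pool(DISK_SERVERS))
```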

LCG PEB March 2004 – 19 – P. VANDE VYVRE CERN-EP
Open issues and future goals
- CASTOR:
  - Unsupervised recovery from a malfunctioning disk server
  - New stager
  - Special daemon (used instead of the standard RFIO daemon to achieve adequate performance) should be put back into the main development line
  - New xrootd daemon from BaBar
- DAQ:
  - Increase performance
  - Improve the performance-monitoring package (AFFAIR)
- Offline:
  - Realistic data for more detectors
  - More remote sites accessing the raw data (monitoring and prompt reconstruction)
  - Data streaming per trigger or detector
  - Run HLT inline in alimdc, no longer semi-realtime
- Network:
  - First generation of 10 Gbit cards from Enterasys unreliable, with no indication of hardware failure
  - Enterasys support took a long time to resolve the problem

LCG PEB March 2004 – 20 – P. VANDE VYVRE CERN-EP
ALICE DC – DAQ Bw revised
[Plot: MBytes/s.]

LCG PEB March 2004 – 21 – P. VANDE VYVRE CERN-EP
Conclusions
- The Computing Data Challenge is still the best tool for exercising the fabric, demonstrating the software, and verifying the interfaces
- ADC V: lots of achievements, but...
  - 1 major performance milestone missed
  - Trouble with the network due to the Enterasys equipment under beta test
- A lot of work and milestones ahead of us
- Next Computing ADC:
  - 50% more on the performance milestones
  - Simulated raw data from all major detectors
  - Preparatory work needed to test each component independently before integration

LCG PEB March 2004 – 22 – P. VANDE VYVRE CERN-EP
Postscript
- A lot of people from IT and ALICE have put considerable time and hard work into this DC
- DCs are and will remain manpower-intensive exercises, as will LHC computing
- Excellent collaboration between all the groups and projects involved
- Regular meetings: very constructive attitude, informal but extremely efficient atmosphere
- Thanks to all for their enthusiastic and effective contributions!