Michal Turala, Warszawa, 25 February. Computing development projects: GRIDS. M. Turala, The Henryk Niewodniczanski Institute of Nuclear Physics PAN and the Academic Computing Center Cyfronet AGH, Kraków
Outline:
- computing requirements of the future HEP experiments
- HEP world-wide computing models and related grid projects
- Polish computing projects: PIONIER and GRIDS
- Polish participation in the LHC Computing Grid (LCG) project
Data preselection in real time:
- many different physics processes
- several levels of filtering
- high efficiency for events of interest
- total reduction factor of about 10^7
LHC data rate and filtering: Level 1 (special hardware) reduces the 40 MHz collision rate (~1000 TB/s equivalent) to 75 kHz (75 GB/s, fully digitised); Level 2 (embedded processors/farm) reduces it further to 5 kHz (5 GB/s); Level 3 (farm of commodity CPUs) outputs 100 Hz (100 MB/s) to data recording & offline analysis.
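As a cross-check, the stage-by-stage reduction implied by the figures above can be worked out in a few lines (a sketch in Python; the rates and bandwidths are taken straight from this slide):

```python
# Trigger-chain figures from the slide: (stage, event rate in Hz, data rate in B/s).
levels = [
    ("collisions", 40e6, 1000e12),  # 40 MHz, ~1000 TB/s equivalent
    ("Level 1",    75e3,   75e9),   # 75 kHz, 75 GB/s, fully digitised
    ("Level 2",     5e3,    5e9),   # 5 kHz, 5 GB/s
    ("Level 3",     1e2,    1e8),   # 100 Hz, 100 MB/s to recording
]
rate_reduction = levels[0][1] / levels[-1][1]   # reduction in event rate
data_reduction = levels[0][2] / levels[-1][2]   # reduction in data volume
print(f"event-rate reduction: {rate_reduction:.0e}")  # 4e+05
print(f"data-rate reduction:  {data_reduction:.0e}")  # 1e+07
```

The ~10^7 quoted on the slide is the data-volume reduction, from ~1000 TB/s down to 100 MB/s.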
Data rate for LHC p-p events. Typical parameters:
- nominal rate: ~10^9 events/s (luminosity 10^34/cm^2/s, collision rate 40 MHz)
- registration rate: ~100 events/s (270 events/s)
- event size: ~1 MByte/event (2 MByte/event)
- running time: ~10^7 s/year
- raw data volume: ~2 PetaByte/year/experiment
- Monte Carlo: ~1 PetaByte/year/experiment
The rate and volume of HEP data doubles every 12 months! Already today the BaBar, Belle, CDF and D0 experiments produce 1 TB/day.
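A back-of-envelope check of the quoted annual raw-data volume, using only the parameters above (the parenthesised values give an upper variant):

```python
seconds_per_year = 1e7                    # effective running time, s/year
nominal = 100 * 1e6 * seconds_per_year    # 100 ev/s x ~1 MB/ev -> 1e15 B
upper   = 270 * 2e6 * seconds_per_year    # 270 ev/s x ~2 MB/ev -> 5.4e15 B
print(f"raw data: {nominal/1e15:.1f} - {upper/1e15:.1f} PB/year/experiment")
# raw data: 1.0 - 5.4 PB/year/experiment
```

The quoted ~2 PB/year/experiment sits inside this range.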
Data analysis scheme for one experiment (diagram from M. Delfino): detector raw data (~100 MB/s, ~1 PB/year) pass through the event filter (selection & reconstruction, ~35K SI95, ~200 MB/s) into event reconstruction (~250K SI95) and event simulation (~350K SI95); event summary data (~500 TB) feed batch physics analysis (~350K SI95, 64 GB/s), producing processed data / analysis objects (~200 TB/year) for interactive data analysis (0.1 to 1 GB/s) by thousands of scientists.
Multi-tier model of data analysis
LHC computing model (Cloud): CERN as the LHC Computing Centre, surrounded by Tier1 centres (Germany, USA FermiLab, USA Brookhaven, UK, France, Italy, NL, …), Tier2 centres (Lab a, Uni a, Lab b, Uni b, …), down to physics-department desktops.
ICFA Network Task Force (1998): required network bandwidth (Mbps); a 100–1000× bandwidth increase foreseen. See the ICFA-NTF Requirements Report.
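The 100–1000× figure is consistent with the annual doubling of HEP data quoted elsewhere in this talk; a short illustrative sketch makes the connection explicit:

```python
# If required bandwidth doubles every 12 months (the growth rate quoted
# for HEP data in this talk), n years of doubling give a factor 2**n.
for years in range(7, 11):
    print(f"{years} years of annual doubling -> {2 ** years}x")
# 7 -> 128x ... 10 -> 1024x, i.e. the 100-1000x range
```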
LHC computing – specifications for Tier0 and Tier1, per experiment (ALICE, ATLAS, CMS, LHCb): CERN Tier0 – CPU (kSI95), disk pool (TB), automated tape (TB), shelf tape (TB), tape I/O (MB/s), cost (MCHF); Tier1 – the same categories, plus the number of Tier1 centres and the average cost per Tier1 (MCHF).
Development of Grid projects
EU FP5 Grid projects (EU funding: 58 M), from M. Lemke at CGW04:
- Infrastructure: DataTag
- Computing: EuroGrid, DataGrid, Damien
- Tools and middleware: GridLab, GRIP
- Applications: EGSO, CrossGrid, BioGrid, FlowGrid, Moses, COG, GEMSS, Grace, Mammogrid, OpenMolGrid, Selene
- P2P / ASP / Web services: P2People, ASP-BP, GRIA, MMAPS, GRASP, GRIP, WEBSI
- Clustering: GridStart
Strong Polish participation in FP5 Grid research projects: 2 Polish-led projects (out of 12) – CrossGrid (CYFRONET Cracow, ICM Warsaw, PSNC Poznan, INP Cracow, INS Warsaw) and GridLab (PSNC Poznan). A significant share of funding went to Poland relative to the EU25: FP5 IST Grid research funding 9.96% and FP5 wider IST Grid project funding 5%, against a GDP share of 3.8% and a population share of 8.8%. (from M. Lemke at CGW04)
CrossGrid testbeds: 16 sites in 10 countries, about 200 processors and 4 TB of disk storage. Testbeds for development, production, testing, tutorials and external users. Middleware: from EDG 1.2 to LCG. Last week CrossGrid successfully concluded its final review.
CrossGrid applications:
- Medical: blood-flow simulation, supporting vascular surgeons in the treatment of arteriosclerosis
- Flood: flood prediction and simulation based on weather forecasts and geographical data
- Physics: distributed data mining in high-energy physics, supporting the LHC collider experiments at CERN
- Meteo/Pollution: large-scale weather forecasting combined with air-pollution modelling (for various pollutants)
Grid for real-time data filtering: studies on a possible use of remote computing farms for event filtering; in 2004 beam-test data were shipped to Cracow, and back to CERN, in real time.
LHC Computing Grid project (LCG).
Objectives: design, prototyping and implementation of the computing environment for the LHC experiments (Monte Carlo production, reconstruction and data analysis): infrastructure, middleware, operations (VO).
Schedule: phase 1 (2002–2005; ~50 MCHF) – R&D and prototyping (up to 30% of the final size); phase 2 (2006–2008) – preparation of a Technical Design Report, Memoranda of Understanding, deployment (2007).
Coordination: the Grid Deployment Board – representatives of the world HEP community, supervising LCG grid deployment and testing.
Computing resources – Dec. 2004 (from F. Gagliardi at CGW04). In EGEE-0 (LCG-2): 91 sites, >9000 CPUs, ~5 PB storage. Three Polish institutions are involved – ACC Cyfronet Cracow, ICM Warsaw and PSNC Poznan – with Polish investment in the local infrastructure and EGEE supporting the operations.
Polish participation in the LCG project. Polish Tier2:
INP/ACC Cyfronet Cracow – resources (plans for 2004): 128 processors (50%); storage: disk ~10 TB, tape (UniTree) ~10 TB (?); manpower: engineers/physicists ~1 FTE + 2 FTE (EGEE); qualified for the ATLAS data challenges in 2002.
INS/ICM Warsaw – resources (plans for 2004): 128 processors (50%); storage: disk ~10 TB, tape ~10 TB; manpower: engineers/physicists ~1 FTE + 2 FTE (EGEE).
Connected to the LCG-1 world-wide testbed in September 2003.
Polish networking – PIONIER (from the report of PSNC to ICFA ICIC, Feb., by M. Przybylski): 5200 km of fibre installed, connecting 21 MAN centres; multi-lambda connections planned; international links towards Stockholm, GEANT and Prague. Good connectivity of HEP centres to MANs: IFJ PAN to MAN Cracow – 100 Mb/s → 1 Gb/s; INS to MAN Warsaw – 155 Mb/s.
PC Linux cluster at ACC Cyfronet (CrossGrid – LCG-1):
- 4 nodes, 1U, 2× PIII 1 GHz, 512 MB RAM, 40 GB HDD, 2× Fast Ethernet 100 Mb/s
- 23 nodes, 1U, 2× Xeon 2.4 GHz, 1 GB RAM, 40 GB HDD, Ethernet 100 Mb/s + 1 Gb/s
- HP ProCurve switch, 40 ports 100 Mb/s, 1 port 1 Gb/s (uplink)
- monitoring: 1U unit with KVM (keyboard, touch pad, LCD), Ethernet 100 Mb/s
Last year 40 nodes with IA64 processors were added; for 2005, investments of 140 Linux 32-bit processors and further TB of disk storage are planned.
ACC Cyfronet in LCG-1. Sept. 2003: sites taking part in the initial LCG service (red dots), among them Kraków (Poland) and Karlsruhe (Germany). This was the very first really running global computing and data grid, covering participants on three continents: small test clusters at 14 institutions, a grid middleware package (mainly parts of EDG and VDT), a global grid testbed. (from K-P. Mickel at CGW03)
Linux cluster at INS/ICM (CrossGrid – EGEE – LCG), from K. Nawrocki.
Present state: cluster at Warsaw University (Physics Department); worker nodes: 10 CPUs (Athlon 1.7 GHz); storage element: ~0.5 TB; network: 155 Mb/s; LCG 2.3.0, registered in the LCG Test Zone.
Near future (to be ready in June 2005): cluster at Warsaw University (ICM); worker nodes: 64-bit CPUs; storage element: ~9 TB; network: 1 Gb/s (PIONIER).
PC Linux cluster at ACC Cyfronet (CrossGrid – EGEE – LCG-1): LCG cluster statistics for 2004 – CPU time and walltime (in hours and in seconds) broken down by experiment: ATLAS, ALICE and LHCb.
ATLAS DC2 status (from L. Robertson at C-RRB, Oct. 2004): ~1350 kSI2k·months, ~120,000 jobs, ~10 million events fully simulated (Geant4), ~27 TB of data. DC2 Phase I started at the beginning of July and is finishing now. Three grids were used: LCG (~70 sites, up to 7600 CPUs), NorduGrid (22 sites, ~3280 CPUs (800), ~14 TB) and Grid3 (28 sites, ~2000 CPUs), with shares LCG 41%, NorduGrid 30%, Grid3 29%. All three grids have been proven usable for a real production.
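The headline DC2 numbers can be combined into rough per-job figures (an illustrative sketch; only the totals and shares quoted above are used):

```python
total_cpu_ksi2k_months = 1350        # total CPU consumed
total_jobs = 120_000
events = 10_000_000                  # fully simulated (Geant4)
shares = {"LCG": 0.41, "NorduGrid": 0.30, "Grid3": 0.29}

assert abs(sum(shares.values()) - 1.0) < 1e-9   # shares cover 100%
cpu_per_job = total_cpu_ksi2k_months * 1000 / total_jobs  # SI2k-months/job
events_per_job = events / total_jobs
jobs_by_grid = {g: round(total_jobs * s) for g, s in shares.items()}
print(f"{cpu_per_job:.2f} SI2k-months/job, ~{events_per_job:.0f} events/job")
print(jobs_by_grid)
```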
Polish LHC Tier2 – future (from the report to the LCG GDB, 2004). In response to the LCG MoU draft document, and using data from the PASTA report, plans for the Polish Tier2 infrastructure have been prepared; they are summarized in a table covering CPU (kSI2000), disk for LHC (TB), tape for LHC (TB), WAN (Mb/s) and manpower (FTE). It is planned that over the next few years the LCG resources will grow incrementally, mainly through local investments; a step is expected around 2007, when the matter of LHC computing funding should finally be resolved.
Thank you for your attention