Grid and Data handling. Gonzalo Merino, Port d’Informació Científica / CIEMAT. Primeras Jornadas del CPAN, El Escorial, 25/11/2009.

1 Grid and Data handling Gonzalo Merino, merino@pic.es, Port d’Informació Científica / CIEMAT. Primeras Jornadas del CPAN, El Escorial, 25/11/2009

2 Disclaimer Though the title of this talk is very generic, I will focus on describing the LHC Grid and data handling as an example: this is the community with the largest and most imminent computing needs, as well as my own area of work. I will also try to address the Grid-related activities in other CPAN areas. The information presented does not aim to be a complete catalogue of Grid activities, but to give the general picture and provide a handful of URL pointers to further information.

3 LHC computing needs The LHC is one of the world's largest scientific machines: a proton-proton collider, 27 km in perimeter, 100 m underground, with superconducting magnets at 1.9 K. Four detectors will record the outcome of collisions: 1 GHz of collisions → 200-300 Hz trigger rate → near 1 GB/s → 10-15 PB/yr. Adding up processed data, simulation and replicas: 50-100 PB/year over a 10-15 year lifetime → the LHC is in the Exabyte scale. Managing this huge amount of data and enabling its analysis by thousands of scientists worldwide is a technological challenge. There is no way to concentrate such computing power and storage capacity at CERN alone, so the Grid paradigm was adopted for LHC computing.
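As a rough back-of-the-envelope check of the figures above, the sketch below multiplies a trigger rate in the quoted range by an event size and a yearly live time; the ~4 MB event size and ~10^7 live seconds per year are illustrative assumptions, not numbers taken from the talk.

```python
# Rough consistency check of the data-volume figures on the slide.
# The event size and live time are illustrative assumptions.

trigger_rate_hz = 250            # within the quoted 200-300 Hz range
event_size_bytes = 4e6           # assumed ~4 MB per recorded event
live_seconds_per_year = 1e7      # assumed ~10^7 s of data taking per year

rate_bytes_per_s = trigger_rate_hz * event_size_bytes
raw_per_year_pb = rate_bytes_per_s * live_seconds_per_year / 1e15

print(f"recording rate ~ {rate_bytes_per_s / 1e9:.1f} GB/s")   # ~1 GB/s
print(f"RAW data per year ~ {raw_per_year_pb:.0f} PB")         # ~10 PB
```

With these assumptions the result lands at roughly 1 GB/s and 10 PB per year, in line with the range quoted on the slide.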

4 LHC Grid: layered structure The tiered structure comes from the early days (1999, MONARC), when it was mainly motivated by the limited network connectivity among sites. Today the network is no longer the issue, but the tiered model is still used to organise work and data flows. Tier-0 at CERN: DAQ and prompt reconstruction, long-term data curation. Tier-1 (11 centres): online to the DAQ (24x7), long-term storage of a RAW data copy, massive data reconstruction; connected to CERN with dedicated 10 Gbps links. Tier-2 (>150 centres): end-user analysis and simulation; connected to the Tier-1s over general-purpose research networks.
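The tier roles above can be summarised in a small data structure; this is purely an illustrative restatement of the slide, not part of any WLCG software.

```python
# Illustrative summary of the WLCG tier roles described above.
WLCG_TIERS = {
    "Tier-0": {
        "sites": 1,   # CERN
        "roles": ["DAQ and prompt reconstruction", "long-term data curation"],
        "network": "origin of the dedicated 10 Gbps links to the Tier-1s",
    },
    "Tier-1": {
        "sites": 11,
        "roles": ["long-term storage of a RAW data copy",
                  "massive data reconstruction", "online to DAQ (24x7)"],
        "network": "dedicated 10 Gbps links to CERN",
    },
    "Tier-2": {
        "sites": 150,  # ">150 centres"
        "roles": ["end-user analysis", "simulation"],
        "network": "general-purpose research networks to the Tier-1s",
    },
}

for tier, info in WLCG_TIERS.items():
    print(f"{tier}: {', '.join(info['roles'])}")
```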

5 Worldwide LHC Computing Grid More than 170 centres in 34 countries: ~86k CPUs, 68 PB disk, 65 PB tape. Spain contributes one Tier-1 and seven Tier-2 sites (target ~5% of total T1/T2 capacity): Tier-1 (ATLAS, CMS, LHCb): PIC; ATLAS Tier-2: IFAE, IFIC, UAM; CMS Tier-2: CIEMAT, IFCA; LHCb Tier-2: UB, USC.

6 Distribution of resources Experiment computing requirements for the 2009-2010 run at the different WLCG Tiers. More than 80% of the resources are outside CERN: the Grid MUST work from day 1!

7 LHC computing requirements The computing and storage capacity needs for WLCG are enormous (~100,000 cores today). Capacity planning is managed through the WLCG MoU: a yearly process in which requirements and pledges are updated and agreed.

8 LHC Experiments Computing Models

9 Experiments Computing Models Every LHC experiment develops and maintains a Computing Model that describes the organisation of the data and the computing infrastructure needed to process and analyse them. Example: input parameters to the ATLAS Computing Model.

10 ATLAS Computing Model [Diagram] Tier-0/CAF: prompt reconstruction, calibration & alignment, express-stream analysis. Tier-1: RAW re-processing, HITS reconstruction. Tier-2: simulation and analysis. Indicated data rates: 650 MB/s, 50-500 MB/s, 10-20 MB/s.

11 CMS Computing Model [Diagram] Tier-0/CAF: prompt reconstruction, calibration, express-stream analysis. Tier-1: re-reconstruction, skimming & selection. Tier-2: simulation and analysis. Indicated data rates: 500 MB/s, 50-500 MB/s, 10-20 MB/s.

12 LHCb Computing Model [Diagram] Tier-0/CAF: reconstruction, stripping, analysis, calibration, express-stream analysis. Tier-1: reconstruction, stripping, analysis. Tier-2: simulation. Indicated data rates: 100 MB/s, 10 MB/s, a few MB/s.
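To give a feel for the scale of the per-link rates quoted in the three computing-model diagrams above, a small conversion sketch (my own illustrative helper, not part of any experiment's tools; the labels only say which diagram each number appears in, not which specific link it belongs to):

```python
# Convert sustained rates from the computing-model diagrams into
# daily data volumes.

def mb_per_s_to_tb_per_day(rate_mb_s: float) -> float:
    """Sustained rate in MB/s -> volume in TB moved over a 24 h day."""
    return rate_mb_s * 86400 / 1e6

for label, rate in [("650 MB/s (ATLAS diagram)", 650),
                    ("500 MB/s (CMS diagram)", 500),
                    ("100 MB/s (LHCb diagram)", 100),
                    ("20 MB/s (upper end of a 10-20 MB/s link)", 20)]:
    print(f"{label}: ~{mb_per_s_to_tb_per_day(rate):.0f} TB/day")
```

A sustained 650 MB/s, for instance, corresponds to roughly 56 TB per day.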

13 Data Analysis on the Grid The original vision: a thin application layer interacting with a powerful middleware layer, a "super-WMS" to which the user submits a dataset query plus their algorithms and which returns the output. [Diagram: user → dataset query + algorithms → Workload Management System and other services → output]

14 Data Analysis Using the Grid at such a large scale is not an easy business. The reality today: LHC experiments have built increasingly sophisticated software stacks to interact with the Grid, on top of the basic middleware services (CE, SE, FTS, LFC). User analysis: a single interface for the whole analysis cycle that hides the complexity of the Grid (Ganga, CRAB, DIRAC, AliEn ...). Workload management: pilot jobs, late scheduling, VO-steered prioritisation (DIRAC, AliEn, PanDA ...). Data management: topology-aware higher-level tools, capable of managing complex data flows (PhEDEx, DDM ...). [Diagram of the stack: VO-specific user interface; VO-specific WMS and DMS; Grid middleware basic services (FTS, LFC ...); computing and storage resources]
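The pilot-job and late-scheduling idea mentioned above can be sketched as follows. This is a simplified illustration of the concept, not the actual DIRAC, AliEn or PanDA code; the function and task names are made up for the example.

```python
import queue

# Central, VO-managed task queue: the experiment ranks its own work here
# (this is where late scheduling / VO-steered prioritisation happens).
task_queue = queue.PriorityQueue()
task_queue.put((1, "reprocess run 142193"))      # priority 1 = most urgent
task_queue.put((3, "MC simulation batch 77"))
task_queue.put((5, "user analysis job #8842"))

def worker_node_is_healthy(site: str) -> bool:
    # In reality the pilot validates the environment (software area,
    # disk space, connectivity) before pulling work; here we just say yes.
    return True

def pilot_job(site: str) -> None:
    """A pilot: a placeholder job submitted to a Grid site through the
    normal middleware. Only once it is running on a worker node does it
    pull real work from the VO queue ("late scheduling")."""
    if not worker_node_is_healthy(site):
        return   # a broken node wastes a pilot, not a real task
    while not task_queue.empty():
        priority, payload = task_queue.get()
        print(f"[{site}] running (prio {priority}): {payload}")

pilot_job("pic.es")
```

The key design point is that the binding of a concrete task to a concrete worker node is deferred until the pilot is already running, which lets the VO reorder its queue and shields users from misconfigured nodes.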

15 Testing the LHC Grid

16 WLCG Service Challenges Large-scale test campaigns used to verify that the overall LHC computing service is ready to meet the requirements of the experiments.
– 2005, SC3: the first one in which all Tier-1 centres participated. Transferred "dummy" data to try to reach high transfer throughput between sites.
– 2006, SC4: the target transfer rate of 1.6 GB/s out of CERN was reached for one day, and 80% of this rate was sustained over long periods. More realistic data.
– 2008, CCRC08: focus on having all 4 experiments testing all workflows simultaneously and keeping the service stable over a long period.
– 2009, STEP09: last chance to stress-test the system before LHC start-up. Focus on multi-experiment workloads never before tested at large scale (e.g. massive data re-reconstruction recalling from tape).

17 Testing data export from CERN Example of data export CERN → Tier-1s as tested by ATLAS: June 2008 (CCRC08 test): 2 days at 1 GB/s; June 2009 (STEP09 test): 2 weeks at 4 GB/s. [Plots: CCRC08 test (June 2008) and STEP09 test (June 2009), transfer rates in MB/s]

18 Performance: data volumes CMS has been transferring 100-200 TB per day (~1 PB/week) on the Grid for more than 2 years. Last June ATLAS added 4 PB in 11 days to their total of 12 PB on the Grid. [Plots: CMS transfer volume 2008-2009, ~150 TB/day; ATLAS cumulative data volume, +4 PB]

19 WLCG CPU Workload The CPU accounting of all Grid sites is centrally stored and available from: http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.php [Plot: monthly CPU walltime in millions of kSI2K·hrs; data up to 22-Nov-2009] 100,000 kSI2K·months ≈ 50,000 simultaneously busy cores.
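The note on the slide ("100,000 kSI2K·months ≈ 50,000 busy cores") implies an average core power of about 2 kSI2K. A quick sketch of that conversion; the per-core figure is inferred from the slide itself, not an official benchmark value.

```python
# Relate accounted CPU walltime (kSI2K.months) to a number of
# continuously busy cores, using the ~2 kSI2K per core implied above.

KSI2K_PER_CORE = 2.0   # inferred from "100,000 kSI2K.months ~ 50,000 cores"

def busy_cores(ksi2k_months: float) -> float:
    """Monthly walltime in kSI2K.months -> equivalent cores busy 24/7."""
    return ksi2k_months / KSI2K_PER_CORE

print(busy_cores(100_000))   # ~50,000 simultaneously busy cores
```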

20 Availability Setting up and deploying robust operational tools is crucial for building reliable services on the Grid. One of the key tools for WLCG is the Service Availability Monitor.

22 Improving Reliability An increasing number of more realistic sensors, plus a powerful monitoring framework that applies peer pressure, ensures that the reliability of the WLCG service will keep improving.
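For context, availability and reliability figures of this kind are derived from the periodic Service Availability Monitor test results. Below is a minimal sketch of the commonly used definitions (reliability does not count announced, scheduled downtime against the site); it is a simplification, not the actual SAM computation.

```python
# Simplified availability/reliability figures from periodic test results.
# Real SAM reports also handle test criticality and "unknown" states.

def availability(ok_hours: float, total_hours: float) -> float:
    return ok_hours / total_hours

def reliability(ok_hours: float, total_hours: float,
                scheduled_downtime_hours: float) -> float:
    # Scheduled (announced) downtime is not held against the site.
    return ok_hours / (total_hours - scheduled_downtime_hours)

# Example month: 720 h total, 684 h of passing tests, 12 h of maintenance.
print(f"availability: {availability(684, 720):.1%}")     # 95.0%
print(f"reliability:  {reliability(684, 720, 12):.1%}")  # 96.6%
```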

23 CMS ↔ PIC transfers since Jan 2007: 4.8 PB into PIC, 4 PB out of PIC. This gives an idea of the level of testing of the system: moving almost 10 TB per day on average for 3 years.
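A quick check that the totals above are in the same ballpark as the quoted daily average; the day count is only approximate.

```python
# Do ~8.8 PB moved since Jan 2007 correspond to roughly 10 TB per day?
total_pb = 4.8 + 4.0        # into PIC + out of PIC
days = 365 * 3              # Jan 2007 to late 2009, roughly
print(f"{total_pb * 1000 / days:.1f} TB/day on average")   # ~8 TB/day
```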

24 Data Transfers to Tier-2s Reconstructed data are sent to the Tier-2s for analysis. This dataflow is bursty in nature, and the experiment requirements for it are very fuzzy ("as fast as possible").
– Links to all Spanish/Portuguese Tier-2s certified with 30-100 MB/s sustained.
– CMS Computing Model: sustained transfers to >40 Tier-2s worldwide.
[Plots: ATLAS transfers PIC → T2s, daily average 200 MB/s; CMS transfers PIC → T2s, daily average 100 MB/s]

25 Multi-discipline Grids for scientific research

26 Enabling Grids for E-sciencE (http://www.eu-egee.org) EU-funded project to build a production-quality Grid infrastructure for scientific research in Europe, in three phases: 2004-2010. Outcome: the largest, most widely used multi-disciplinary Grid infrastructure in the world.
– WLCG is built on top of EGEE (and OSG in the USA).
Many VOs and applications are registered as EGEE users. Look for yours in the application database: http://appdb.eu-egee.org/ or http://grid.ct.infn.it/egee_applications/ (VO list: http://cic.gridops.org/index.php?section=home&page=volist)

27 Enabling Grids for E-sciencE (http://www.eu-egee.org) The EGEE project contained all of the Grid stakeholders: infrastructure, middleware and applications. Vision beyond EGEE-III: migrate the existing production European Grid from a project-based model to a sustainable infrastructure.
– Infrastructure: European Grid Initiative (http://www.eu-egi.eu/), a federated infrastructure based on National Grid Initiatives for multi-disciplinary use. The Spanish Ministry of Science and Innovation signed the EGI MoU and designated CSIC as coordinator of the Spanish NGI.
– Applications: user communities organized in Specialized Support Centers (SSCs).
– Middleware: development in a separate project; infrastructure and applications can become "customers".

28 EGI-related projects submitted to the EU Presented by C. Loomis at the EGEE09 workshop, Sep-09 (link). Astrophysics: MAGIC, et al. HEP: LHC, FAIR, et al.

29 Spanish Network for e-Science (http://www.e-ciencia.es) A network initiative funded by the Spanish Ministry of Science and Education, officially approved in Dec-2007. UPV is the coordinating institution. More than 900 researchers, 89 research groups. Organised in four areas: Grid infrastructure, Supercomputing infrastructure, Applications and Middleware. The Applications area coordinates the activities of the different user communities (see the active groups and applications in the Area wiki).

30 Astroparticles MAGIC (IFAE, PIC, UCM, INSA):
– Data centre at PIC: data storage, reduction and access for the collaboration; resources and tools for users' analysis (in prep.); publishing data to the Virtual Observatory (in prep.).
– Monte Carlo production "on demand".
AUGER (UAH, CETA-CIEMAT):
– Runs simulations on the Grid: CORSIKA, ESAF, AIRES ...
Two presentations from astroparticles at the last meeting of the "Red Española de e-ciencia" (Valencia, Oct-09, see slides).

31 Facility for Antiproton and Ion Research One of the largest projects of the ESFRI Roadmap: it will provide high-energy, high-intensity ion and antiproton beams for basic research. The computing and storage requirements for FAIR are expected to be of the order of those of the LHC or above; a detailed evaluation is under way. Two of the experiments (PANDA and CBM) have already started using the Grid for detector simulations. FAIR Baseline Technical Report: 2500 scientists, 250 institutions, 44 countries. Spain is one of the 14 countries that signed the agreement for the construction of FAIR, contributing 2% of the cost. Civil construction is expected to start in 2010, with first beam expected in 2015/16.

32 Summary In recent years we have been witnessing an explosion of scientific data:
– more precise and complex experiments,
– large international collaborations, with geographically dispersed users needing access to the data.
The LHC has largely driven the activity in recent years, with the pressure of Petabytes of data (this time for real) around the corner.
– WLCG, the largest Grid infrastructure in the world, has been deployed and is ready for storing, processing and analysing the LHC data.
Since the early 2000s, a series of EU-funded projects (EGEE) have been at the core of the deployment of a Grid for scientific research in Europe.
– The next round of EU projects is focused on consolidating this into a sustainable infrastructure: a federated model based on NGIs.
– The projects call closed yesterday. Stay tuned for the activity in the "Grid users/applications" arena (SSCs).

33 Thank you Gonzalo Merino (merino@pic.es), Port d’Informació Científica (www.pic.es)

34 Backup Slides

35 PIC Tier-1 Reliability Tier-1 reliability targets have been met in most months.

36 T0/T1 ↔ PIC data transfers Data import from CERN and transfers with the other Tier-1s were successfully tested above targets. [Plots: ATLAS daily rate CERN → PIC, June 2009, target 76 MB/s; CMS daily rate CERN → PIC, June 2009, target 60 MB/s; CMS data imported from / exported to the Tier-1s, combined ATLAS+CMS+LHCb targets ~210 MB/s and ~100 MB/s]

37 Networking Tier-1 ↔ Tier-2 in Spain

38 EGI-User interaction User communities are organized into a series of Specialized Support Centers (SSCs). Goals of an SSC:
– Increase the number of active users in the community
– Promote the use of Grid technologies within the community
– Encourage cooperation within the community
– Safeguard the Grid knowledge and expertise of the community
– Build scientific collaboration within and between communities
An SSC will be a central, long-lived hub for Grid activities within a given scientific community. (Presented by Cal Loomis at the EGEE09 conference)
