1
CERN and the LHC Computing Grid
Ian Bird, IT Department, CERN, Geneva, Switzerland
HP Puerto Rico, 9 February 2004
Ian.Bird@cern.ch
2
What is CERN?
CERN is the world's largest particle physics centre, funded by 20 European member states.
Particle physics is about:
- the elementary particles of which all matter in the universe is made
- the fundamental forces which hold matter together
Particle physics requires:
- special tools to create and study new particles
CERN is:
- 2500 staff scientists (physicists, engineers, …)
- some 6500 visiting scientists (half of the world's particle physicists), coming from 500 universities and representing 80 nationalities
3
… is located in Geneva, Switzerland.
(Photo captions: Downtown Geneva; Mont Blanc, 4810 m)
4
What is CERN?
The special tools for particle physics are:
- ACCELERATORS: huge machines able to speed up particles to very high energies before colliding them into other particles
- DETECTORS: massive instruments which register the particles produced when the accelerated particles collide
5
What is LHC?
- LHC will collide beams of protons at an energy of 14 TeV.
- Using the latest superconducting technologies, it will operate at about −270 °C, just above absolute zero.
- With its 27 km circumference, the accelerator will be the largest superconducting installation in the world.
- LHC is due to switch on in 2007.
- Four experiments, with detectors as 'big as cathedrals': ALICE, ATLAS, CMS, LHCb.
6
The LHC Data Challenge
- A particle collision = an event.
- Events are independent, which provides trivial parallelism, hence the use of simple farms.
- The physicist's goal is to count, trace and characterise all the particles produced and fully reconstruct the process.
- Among all tracks, the presence of "special shapes" is the sign that an interesting interaction has occurred.
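Because each event is independent of every other, a farm of ordinary PCs can process them in parallel with no communication between workers. The following is a minimal illustrative sketch in Python, not LCG software; the event records and the reconstruct() function are invented stand-ins.

```python
# Illustrative only: events are independent, so a simple farm can map
# over them in parallel. Event format and "reconstruction" are hypothetical.
from multiprocessing import Pool

def reconstruct(event):
    """Pretend reconstruction: count the tracks in one collision event."""
    return len(event["tracks"])

if __name__ == "__main__":
    events = [{"id": i, "tracks": list(range(i % 5))} for i in range(1000)]
    with Pool(processes=8) as farm:                     # stands in for a farm of worker nodes
        track_counts = farm.map(reconstruct, events)    # no event depends on any other
    print(sum(track_counts), "tracks reconstructed")
```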
7
The LHC Data Challenge
Starting from this event… you are looking for this "signature".
- Selectivity: 1 in 10^13
- Like looking for one person in a thousand world populations!
- Or for a needle in 20 million haystacks!
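A rough check of the analogies, assuming a 2004 world population of about 6 × 10^9 and roughly 5 × 10^5 straws per haystack (neither figure appears on the slide):

```latex
% Order-of-magnitude check of the "1 in 10^13" selectivity claim;
% the population and straw counts are assumptions, not from the slide.
\[
  1000 \times 6\times10^{9} \approx 6\times10^{12} \sim 10^{13},
  \qquad
  2\times10^{7}\ \text{haystacks} \times 5\times10^{5}\ \text{straws} = 10^{13}.
\]
```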
8
LHC data (simplified)
- 40 million collisions per second
- After filtering, 100 collisions of interest per second
- A Megabyte of digitised information for each collision = a recording rate of 0.1 Gigabytes/sec
- 10^10 collisions recorded each year = 10 Petabytes/year of data
- (Experiments: ALICE, ATLAS, CMS, LHCb)
For scale:
- 1 Megabyte (1 MB): a digital photo
- 1 Gigabyte (1 GB) = 1000 MB: a DVD movie
- 1 Terabyte (1 TB) = 1000 GB: world annual book production
- 1 Petabyte (1 PB) = 1000 TB: 10% of the annual production by the LHC experiments
- 1 Exabyte (1 EB) = 1000 PB: world annual information production
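The two equalities on the slide follow directly from the filtered rate, the event size and the number of recorded collisions. A small sketch, using the decimal units of the slide's scale list (1 GB = 10^9 bytes):

```python
# Back-of-the-envelope check of the slide's data-rate figures (decimal units).
MB = 1e6
collisions_per_second = 100        # collisions of interest kept after filtering
event_size = 1 * MB                # digitised information per collision
collisions_per_year = 1e10         # recorded collisions per year

recording_rate = collisions_per_second * event_size   # bytes per second
yearly_volume = collisions_per_year * event_size      # bytes per year

print(f"recording rate: {recording_rate / 1e9} GB/s")      # 0.1 GB/s
print(f"yearly volume : {yearly_volume / 1e15} PB/year")   # 10.0 PB/year
```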
9
Expected LHC computing needs
- Data: ~15 Petabytes a year
- Processing: ~100,000 of today's PCs
- Networking: 10 – 40 Gb/s to all big centres today
(Chart on slide: projected needs compared with Moore's law, based on 2000 data.)
10
Computing at CERN today
- High-throughput computing based on reliable "commodity" technology
- More than 1500 dual-processor PCs
- More than 3 Petabytes of data on disk (10%) and tape (90%)
- Nowhere near enough!
11
Computing at CERN today
The new computer room is being populated: CPU servers, disk servers, tape silos and servers.
12
Computing at CERN today
…while the existing computer centre (CPU servers, disk servers) is being cleared for renovation, and an upgrade of the power supply from 0.5 MW to 2.5 MW is underway.
13
Computing for LHC
Problem: even with the computer centre upgrade, CERN can only provide a fraction of the necessary resources.
Solution: computing centres, which were isolated in the past, will now be connected, uniting the computing resources of particle physicists around the world using Grid technologies!
- Europe: ~270 institutes, ~4500 users
- Elsewhere: ~200 institutes, ~1600 users
14
LHC Computing Grid Project
The LCG Project is a collaboration of:
- the LHC experiments
- the Regional Computing Centres
- physics institutes
…working together to prepare and deploy the computing environment that will be used by the experiments to analyse the LHC data.
This includes:
- support for applications: provision of common tools, frameworks, environment, data persistency, …
- the development and operation of a computing service, exploiting the resources available to LHC experiments in computing centres, physics institutes and universities around the world, and presenting this as a reliable, coherent environment for the experiments
15
LCG Project
- Applications Area (Torre Wenaus): development environment; joint projects; data management; distributed analysis
- Middleware Area (Frédéric Hemmer): provision of a base set of grid middleware (acquisition, development, integration); testing, maintenance, support
- CERN Fabric Area (Bernd Panzer): large cluster management; data recording; cluster technology; networking; computing service at CERN
- Grid Deployment Area (Ian Bird): establishing and managing the Grid Service; middleware certification, security, operations, registration, authorisation, accounting
- Technology Office (David Foster): overall coherence of the project; pro-active technology watch; long-term grid technology strategy; computing models
16
Project Management Board
- Project Management: management team; SC2 and GDB chairs; experiment delegates; external projects (EDG, GridPP, INFN Grid, VDT, Trillium); other resource suppliers (IN2P3, Germany, CERN-IT)
- Architects' Forum: Applications Area manager, experiment architects, computing coordinators
- Grid Deployment Board: experiment delegates, national/regional centre delegates
- The PEB deals directly with the Fabric and Middleware areas.
- The GDB negotiates and agrees operational and security policy, resource allocation, etc.
17
LCG-1 components (schematic)
A layered stack, from hardware up to high-level services, with example implementations:
- Hardware: computing cluster, network resources, data storage
- System software: operating system (RedHat Linux), local scheduler (PBS, Condor, LSF, …), file system (NFS, …), mass storage (HPSS, CASTOR, …); largely a closed system (?)
- "Passive" services: user access, security, data transfer, information schema
- "Active" services: global scheduler, data management, information system; provided by VDT (Globus, GLUE) and EU DataGrid
- High-level services: user interfaces, applications; provided by LCG and the experiments
18
Elements of a Production Grid Service
Middleware: the systems software that interconnects the computing clusters at regional centres to provide the illusion of a single computing facility; information publishing and finding, distributed data catalogue, data management tools, work scheduler, performance monitors, etc.
Operations:
- Grid infrastructure services: registration, accounting, security
- Regional centre and network operations
- Grid operations centre(s): trouble and performance monitoring, problem resolution, 24x7 around the world
Support:
- Middleware and systems support for computing centres
- Applications integration, production
- User support: call centres/helpdesk with global coverage; documentation; training
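To make the combination of "work scheduler" and "information publishing" concrete, here is a deliberately tiny, hypothetical sketch in Python. It is not LCG middleware, and every name in it (Site, broker, the dataset strings) is invented for illustration: a broker reads the information that sites publish and matches a job to a site that advertises enough free CPUs and holds the required dataset.

```python
# Toy illustration of grid-style brokering against published site information.
# All names are hypothetical; this is not the LCG middleware.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    free_cpus: int
    datasets: set          # logical dataset names held at the site

def broker(job, sites):
    """Return the first site with spare CPUs and a local copy of the input data."""
    for site in sites:
        if site.free_cpus >= job["cpus"] and job["input"] in site.datasets:
            return site.name
    return None            # no match: the job stays queued

sites = [
    Site("CERN", free_cpus=40, datasets={"cms.dc04.sim"}),
    Site("RAL",  free_cpus=200, datasets={"atlas.dc1.reco"}),
]
print(broker({"cpus": 100, "input": "atlas.dc1.reco"}, sites))   # -> RAL
```

A production broker would of course also fold in the other services the slide lists, such as performance monitoring and accounting, rather than matching on two static attributes.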
19
LCG Service
- Certification and distribution process established
- Middleware package: components from the European DataGrid (EDG), the US (Globus, Condor, PPDG, GriPhyN), and the Virtual Data Toolkit
- Agreement reached on principles for registration and security
- Rutherford Lab (UK) to provide the initial Grid Operations Centre
- FZK (Karlsruhe) to operate the Call Centre
- The first "certified" release was made available to 14 centres on 1 September: Academia Sinica Taipei, BNL, CERN, CNAF, Cyfronet Cracow, FNAL, FZK, IN2P3 Lyon, KFKI Budapest, Moscow State Univ., Prague, PIC Barcelona, RAL, Univ. Tokyo
20
LCG Service – Next Steps
Deployment status:
- 12 sites active when the service opened on 15 September
- ~30 sites now active
- Pakistan, China, Korea, HP, … preparing to join
Preparing now to add new functionality in November, to be ready for 2004:
- VO management system
- integration of mass storage systems
Experiments are now starting their tests on LCG-1:
- CMS target is to have 80% of their production on the grid before the end of the PCP of DC04
- It is essential that experiments use all features (including/especially data management) and exercise the grid model, even if not needed for short-term challenges
- Capacity will follow the readiness of the experiments
21
LCG Service – Next Steps
Deployment status:
- 12 sites active when the service opened on 15 September
- 28 sites now active
- HP, Pakistan, Australia, Korea, China, … preparing to join
Starting to deploy LCG-2, the upgrade for 2004:
- VO management system
- integration of mass storage systems
Experiments are now starting their tests on LCG-2 in preparation for the Data Challenges:
- CMS target is to have 80% of their production on the grid before the end of the PCP of DC04
- It is essential that experiments use all features (including/especially data management) and exercise the grid model, even if not needed for short-term challenges
- Capacity will follow the readiness of the experiments
22
Resources committed for 1Q04
Resources in Regional Centres, planned for the period of the data challenges in 2004. CERN provides ~12% of the total capacity.
- Numbers have to be refined: different standards used by different countries
- Efficiency of use is a major question mark: reliability, efficient scheduling, sharing between Virtual Organisations (user groups)

                 CPU (kSI2K)   Disk (TB)   Support (FTE)   Tape (TB)
CERN                  700         160          10.0           1000
Czech Republic         60           5           2.5              5
France                420          81          10.2            540
Germany               207          40           9.0             62
Holland               124           3           4.0             12
Italy                 507          60          16.0            100
Japan                 220          45           5.0            100
Poland                 86           9           5.0             28
Russia                120          30          10.0             40
Taiwan                220          30           4.0            120
Spain                 150          30           4.0            100
Sweden                179          40           2.0             40
Switzerland            26           5           2.0             40
UK                   1656         226          17.3            295
USA                   801         176          15.5           1741
Total                5600        1169         120.0           4223
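A quick check of the "~12%" figure against the CPU column of the table:

```latex
% CERN's share of the committed CPU capacity
\[
  \frac{700\ \text{kSI2K (CERN)}}{5600\ \text{kSI2K (total)}} = 0.125 \approx 12\%.
\]
```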
23
LCG Service Time-line
(Timeline spans 2003 – 2007, ending with first data and the physics computing service: agree spec. of initial service (LCG-1); open LCG-1, scheduled for 1 July, achieved 1 September; used for simulated event productions.)
Level 1 Milestone: opening of the LCG-1 service
- 2-month delay, lower functionality than planned
- use by experiments will only start now (planned for end August)
- the decision on the final set of middleware for the 1H04 data challenges will be taken without experience of production running
- reduced time for integrating and testing the service with the experiments' systems before the data challenges start next spring
- additional functionality will have to be integrated later
24
LCG Service Time-line
(Timeline spans 2003 – 2007, ending with first data and the physics computing service.)
- Agree spec. of initial service (LCG-1); open LCG-1 (achieved 1 September); used for simulated event productions
- LCG-2: upgraded middleware, management and operations tools; principal service for the LHC data challenges
- Computing model TDRs (TDR: technical design report)
- LCG-3: second-generation middleware; validation of computing models; TDR for the Phase 2 grid
- Phase 2 service acquisition, installation, commissioning; experiment setup and preparation
- Phase 2 service in production
25
LCG and EGEE
- EU project approved to provide partial funding for operation of a general e-Science grid in Europe, including the supply of suitable middleware: Enabling Grids for e-Science in Europe (EGEE)
- EGEE provides funding for 70 partners, the large majority of which have strong HEP ties
- Similar funding is being sought in the US
LCG and EGEE work closely together, sharing the management and responsibility for:
- Middleware: share out the work to implement the recommendations of HEPCAL II and ARDA
- Infrastructure operation: LCG will be the core from which the EGEE grid develops; this ensures compatibility and provides useful funding at many Tier 1, Tier 2 and Tier 3 centres
- Deployment of HEP applications: a small amount of funding provided for testing and integration with the LHC experiments
26
Middleware - Next 15 months
- Work closely with the experiments on developing experience with early distributed analysis models using the grid: multi-tier model; data management, localisation, migration; resource matching and scheduling; performance, scalability
- Evolutionary introduction of new software: rapid testing and integration into mainline services, while maintaining a stable service for the data challenges!
- Establish a realistic assessment of the grid functionality that we will be able to depend on at LHC startup: a fundamental input for the Computing Model TDRs due at the end of 2004
27
Grids - Maturity is some way off
- Research still needs to be done in all key areas of importance to LHC, e.g. data management, resource matching/provisioning, security, etc.
- Our life would be easier if standards were agreed and solid implementations were available, but they are not.
- We are only now entering the second phase of development:
  - everyone agrees on the overall direction, based on Web services
  - but these are not simple developments
  - and we are still learning how best to approach many of the problems of a grid
- There will be multiple and competing implementations, some for sound technical reasons.
- We must try to follow these developments and influence the standardisation activities of the Global Grid Forum (GGF).
- It has become clear that LCG will have to live in a world of multiple grids, but there is no agreement on how grids should inter-operate:
  - common protocols?
  - federations of grids inter-connected by gateways?
  - Regional Centres connecting to multiple grids?
- Running a service in this environment will be a challenge!