1
LHC Computing Grid Project Status
Frédérique Chollet (LAPP, LCG-France T2-T3 coordination), Fabio Hernandez (CC-IN2P3 Deputy Director). IHEP Beijing, December 12th, 2006
2
Contents: LCG, the Worldwide LHC Computing Grid; the LCG-France Project
4
Computing Grid for LHC experiments
Needs from experiments:
Storage capacity: collisions produce ~10 PB/year; recording rate up to 1 GB/s; simulation of large sets of Monte Carlo data.
CPU: processing and reprocessing, Monte Carlo simulation and reconstruction, analysis.
Large inter-site bandwidth: distribution of large volumes of scientific data worldwide, from CERN to the computing sites and from site to site for the simulated data.
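As a rough consistency check, assuming about 10^7 seconds of effective data taking per year (a commonly used figure, not stated on the slide): 1 GB/s × 10^7 s/year ≈ 10 PB/year, which matches the quoted storage need.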
5
The Worldwide LHC Computing Grid
Purpose: develop, build and maintain a distributed computing environment for the storage and analysis of data from the four LHC experiments; ensure the computing service, as well as common application libraries and tools. This is the last year of preparation for LHC grid computing to be ready.
Goal: set up a global infrastructure for the entire High Energy Physics community.
Funding principle: LHC computing resources will not be centralised at CERN.
A data handling challenge: a data distribution model running over a hierarchy of distributed resources and services, with a recording rate of up to 1 GB/s.
The WLCG project is setting up the global computing infrastructure for the High Energy Physics community. It is clear by now that not all the resources will be centralised at CERN, and to deal with this challenging data handling problem we need a data distribution model running over a hierarchy of resources and services. The LCG project includes the computing grid AND application support, which I will not detail in this presentation: the persistency framework (including POOL), the simulation framework, and core services and libraries.
6
WLCG Collaboration: still growing
~130 computing centres: 12 large centres (Tier-0, Tier-1) and 40-50 federations of smaller "Tier-2" centres (~120 centres), in 29 countries.
Memorandum of Understanding: agreed in October 2005, now being signed.
Qualification criteria for inclusion: a long-term commitment in terms of service level and response times, not just funding and threshold size.
7
LCG Architecture Four-tiered model
Tier-0 (1): the accelerator centre (CERN). Data acquisition and initial processing, long-term data curation, distribution of data to the Tier-1 centres (near online).
Tier-1 (11): viewed as online for the data acquisition process. High availability required (24x7), grid-enabled data service (backed by a mass storage system), data-heavy analysis, regional support.
Tier-2 (~120): simulation, interactive and batch end-user data analysis.
Tier-3: used by the experiments as available, without any specific MoU agreement. End-user analysis, code development, testing, ntupling; Tier-3s stand close to the end physicist's needs, and these are typical Tier-3 use cases.
Tier-1s and Tier-2s are optimized for computing model efficiency, Tier-3s for physicist efficiency: we like (and are building) a system where the heavy computing is done at the Tier-1s and Tier-2s, while code development, testing and ntupling are done at the Tier-3s.
8
Computing Resource Requirements
More than today's dual-core AMD Opteron: around 355 TFLOPS of CPU, plus disk and tape. A large aggregate computing resource is required and pledged; more than 40% of the required CPU is in the Tier-2s, outside CERN and the national computing centres.
9
The LCG hierarchy relies on an excellent Wide Area Network
In my understanding of the experiments' data models, traffic between Tier-2s will remain at a reasonable level, but the connection of the Tier-1s to the national NRENs may become a concern: several Gbit/s are required, with 10 Gb/s links.
10
LCG Tier-2s: an extended role for Tier-1s and Tier-2s
The number of Tier-2s and even Tier-3s is still growing: LHC computing will be more dynamic and network-oriented.
11
LCG depends on two major science grid infrastructures
The LCG infrastructure depends on two major science grid infrastructures: EGEE (Enabling Grids for E-sciencE) and OSG (the US Open Science Grid).
12
Interoperability. Both EGEE and OSG are still heavily HEP-oriented, but both are evolving into multi-science grids.
Support for executing LHC applications across multiple grid infrastructures:
Core middleware packages based on VDT.
Transparent use of resources: deploy multiple sets of interfaces to the same physical resource (submission using GRAM/Condor-G, see the sketch below), and deploy compatible interfaces between grid projects (SRM, GLUE schema, ...).
Cross-grid job submission is used by CMS.
LCG operations depend on EGEE and OSG for the grid infrastructure, and on EGEE, VDT and Condor for the grid middleware. CMS has been working on federating EGEE and OSG grid resources.
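To make the GRAM/Condor-G submission path mentioned above concrete, here is a minimal sketch in Python that writes a Condor-G grid-universe submit description and hands it to condor_submit. The CE contact string, jobmanager and executable name are hypothetical placeholders, and keyword details may vary between middleware versions.

    # Minimal sketch of cross-grid job submission via Condor-G/GRAM.
    # The CE contact string and executable are hypothetical placeholders.
    import subprocess
    import tempfile
    import textwrap

    SUBMIT_DESCRIPTION = textwrap.dedent("""\
        universe      = grid
        grid_resource = gt2 ce.example-t2.org/jobmanager-pbs
        executable    = run_simulation.sh
        output        = job.out
        error         = job.err
        log           = job.log
        queue
        """)

    def submit(description: str) -> None:
        """Write a Condor submit file and pass it to condor_submit."""
        with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
            f.write(description)
            path = f.name
        # condor_submit exits non-zero on failure; check=True raises in that case.
        subprocess.run(["condor_submit", path], check=True)

    if __name__ == "__main__":
        submit(SUBMIT_DESCRIPTION)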
13
OSG - EGEE Interoperation for LCG Resource Selection
[Diagram: LCG resource selection across EGEE and OSG. VO user interfaces and resource brokers query top-level BDIIs, which aggregate information published by the site-level BDII/CEMon services at the Tier-2 sites; jobs reach the sites through GRAM gatekeepers.]
14
Steady increase in Grid Usage
More than 35K jobs/day on the EGEE grid, of which the LHC VOs account for 30K jobs/day, spread across the full infrastructure.
Between EGEE and OSG, ~60K jobs/day during the data challenges this summer, and ~15K simultaneous jobs during prolonged periods.
A 3x increase over the past twelve months with no effect on operations, but still some way to go: only ~20% of the 2008 target.
15
Delivered CPU Time: CPU usage at CERN and the Tier-1s, EGEE + OSG
160K processor-days/month, 66% of it at the Tier-1s. BUT the installed capacity is still considerably lower than the usable capacity planned at CERN and the Tier-1s for the first full year of LHC running. A challenging ramp-up!
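For scale, 160,000 processor-days per month corresponds to roughly 160,000 / 30 ≈ 5,300 processors kept busy around the clock.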
16
Data Distribution Tests: CERN → Tier-1s
1.6 GB/s from CERN to the Tier-1s during the April 2006 test period. The "nominal" rate required when the LHC is operating, 1.6 GB/s, was achieved, but only for one day; the design target should be even higher to enable catch-up after problems. Sustained operation reached 80% of the target.
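In round numbers, the nominal rate corresponds to 1.6 GB/s × 86,400 s/day ≈ 138 TB per day, and the 80% sustained level to about 1.3 GB/s.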
17
Inter-site relationships
Experiment computing models define specific data flows between Tier-1s and Tier-2s, and the Tier-2 sites are getting involved in the data transfer exercises. Work is ongoing to define inter-site relationships: quantifying the Tier-1 storage and network services required to support each Tier-2, as well as the inter-Tier-1 network services, and enabling each site to identify the other sites it has to communicate with and to check storage and network capabilities.
18
Measuring Response times and Availability
The Site Availability Monitor (SAM) monitors services by running regular tests:
basic services: SRM, LFC, FTS, CE, RB, top-level BDII, site BDII, MyProxy, VOMS, R-GMA, ...
VO environment: tests supplied by the experiments.
Results are stored in a database, with displays and alarms for sites, grid operations and experiments, high-level metrics for management, and integration with the EGEE operations portal, the main tool for daily operations.
Site reliability for CERN + Tier-1s: average 83% of the 2006 target; the best 8 sites average 91% of the target (a sketch of this style of calculation follows below).
More on grid operations in Hélène Cordier's talk.
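The availability and reliability numbers quoted here are essentially the fraction of regular test runs that a site passes. The Python sketch below illustrates that style of calculation on made-up, SAM-like test records; the record layout and the site and service names are assumptions for illustration, not the actual SAM schema.

    # Illustrative availability calculation from periodic service-test results,
    # in the spirit of SAM. The record format is a simplifying assumption,
    # not the real SAM database schema.
    from collections import defaultdict

    # (site, service, passed) -- one entry per run of a critical-service test.
    TEST_RESULTS = [
        ("IN2P3-CC", "SRM", True),
        ("IN2P3-CC", "CE", True),
        ("IN2P3-CC", "Site-BDII", False),
        ("GRIF", "SRM", True),
        ("GRIF", "CE", True),
        ("GRIF", "Site-BDII", True),
    ]

    def availability(results):
        """Return the fraction of passed test runs per site, over all critical services."""
        passed = defaultdict(int)
        total = defaultdict(int)
        for site, _service, ok in results:
            total[site] += 1
            passed[site] += ok
        return {site: passed[site] / total[site] for site in total}

    for site, fraction in sorted(availability(TEST_RESULTS).items()):
        print(f"{site}: {fraction:.0%}")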
19
Service Challenges. Purpose: understand what it takes to operate a real grid service, running for weeks or months at a time (not just limited to experiment data challenges); verify Tier-1 and large Tier-2 deployment, tested with realistic usage patterns; get the essential grid services ramped up to target levels of reliability, availability, scalability and end-to-end performance.
Four progressive steps from October 2004 through September 2006:
End 2004 – SC1 – data transfer to a subset of Tier-1s.
Spring 2005 – SC2 – includes mass storage, all Tier-1s, some Tier-2s.
2nd half 2005 – SC3 – Tier-1s, >20 Tier-2s, first set of baseline services.
Jun-Sep 2006 – SC4 – the pilot LHC service: a stable service on which the experiments can make a full demonstration of the experiment offline chain, with extension of the service to most Tier-2 sites.
See the results of experiment-driven data transfers in the later talks by Ghita Rahal, Stéphane Jézéquel, Artem Trunov…
20
Summary Grids are now operational and heavily used
~200 sites between EGEE and OSG; grid operations are mature, with long periods of ~15K simultaneous jobs with the right load and job mix.
Baseline services are in operation, but a few key services are still to be introduced. Tier-0 and Tier-1 core services are progressing well, and many Tier-2s have been involved in the test programme in 2006.
A steep road ahead to ramp up the capacity next year: substantial challenges to achieve the targets in terms of performance and reliability.
21
The LCG-France project
22
LCG-France Project: a French initiative dedicated to LHC computing
Scientific project leader: Fairouz Malek, IN2P3/LPSC. Technical project leader: Fabio Hernandez, CC-IN2P3. Technical coordination of Tier-2s and Tier-3s: F. Chollet, IN2P3/LAPP.
Partners: the High Energy Physics actors.
Aims: provide a funded Tier-1 centre and an Analysis Facility at CC-IN2P3, Lyon, supporting the 4 LHC experiments; promote the emergence of Tier-2 and even Tier-3 centres, with the help of regional funding agencies, universities, ...; establish collaborations with other Tier-1 and Tier-2 sites; make agreements on international MoU commitments with WLCG.
23
LCG-France sites
WLCG MoU: the Tier-1 (see Dominique Boutigny's talk) and the Analysis Facility in Lyon, plus 3 Tier-2s listed: GRIF, LPC Clermont, Subatech.
In total a set of 6 centres: 3 Tier-2s and 3 Tier-3s. GRIF is a joint project of 5 laboratories acting as a federation, a single resource distributed over several sites in the Paris region.
10 HEP laboratories are involved, with 2 more Tier-3 candidates (LPSC Grenoble, IPNL Lyon) in 2007.
24
LCG-France sites (map):
Tier-1 and Analysis Facility: CC-IN2P3 (Lyon)
Tier-2: GRIF (CEA/DAPNIA, LAL, LLR, LPNHE, IPNO; Île-de-France), Subatech (Nantes), LPC (Clermont-Ferrand)
Tier-3: CPPM (Marseille), LAPP (Annecy), IPHC (Strasbourg)
25
LCG-France sites Supported LHC experiments Alice ATLAS CMS LHCb
[Table: LHC experiments (Alice, ATLAS, CMS, LHCb) supported by each LCG-France site: Tier-1 CC-IN2P3 Lyon, Analysis Facility Lyon, Tier-2s GRIF, LPC Clermont and Subatech Nantes, Tier-3s CPPM Marseille, IPHC Strasbourg and LAPP Annecy.]
26
LCG-France Tier-2 centres
Computing resources in 2008. WLCG: ~40% of the total CPU resources are required in the Tier-2s. Analysis Facility and Tier-2s: 33% of the total CPU resources pledged by LCG-France.

Experiment | CPU required per average T2 [kSI2000] | CPU of LCG-France T2s in 2008 [kSI2000] | Storage required per average T2 [TB] | Storage of LCG-France T2s in 2008 [TB] | Number of average Tier-2s
Alice | 640 | 740 | 160 | 187 | 1.2
ATLAS | 729 | 1550 | 320 | 410 | 2.1
CMS | 608 | 1060 | 168 | 330 | 1.9
LHCb | 409 | 252 | 60 | – | 0.6
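The last column appears to correspond roughly to the ratio of the LCG-France 2008 pledge to the capacity required of an average Tier-2; for ATLAS, for example, 1550 / 729 ≈ 2.1 average Tier-2 equivalents.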
27
LCG-France Tier-2 contribution
Percentage of resources offered by all Tier-2 sites in 2008. Estimated number of Tier-2 sites: Alice 16, ATLAS 24, CMS 27, LHCb 11.
28
LCG-France Tier-2 contribution
Planned capacity in 2008, compared with the actual total CPU capacity of CC-IN2P3. 1000 kSI2000 ≈ 320 CPUs (AMD Opteron 275 dual-core)
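The quoted conversion corresponds to 1000 kSI2000 / 320 ≈ 3.1 kSI2000 per dual-core Opteron 275, consistent with the 500 kSI2000 ≈ 160 CPUs figure used for the Tier-3s on a later slide.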
29
Tier-2 planned capacity from 2006 to 2010
30
Tier-3s Planned Capacity
500 kSI2000 ≈ 160 CPUs (AMD Opteron 275 dual-core)
31
LCG-France T2-T3 Technical group
Set up in April 2006: mailing list, regular video conferences, wiki pages.
Objectives:
Enhance the efficiency of LCG deployment and commissioning.
Support the LCG-France scientific programme: include substantial resources in Tier-2 and Tier-3 centres for LHC simulation and analysis, and establish a strong collaboration with the experiment representatives.
Technical coordination between Tier-2 and Tier-3 centres: federate the technical teams and establish a strong collaboration with CC-IN2P3.
32
LCG-France T2-T3 Technical group
Resources and services integration into LCG:
Storage services: work currently being done on SE/SRM set-up at the Tier-2 and Tier-3 sites.
Central services deployed at the Lyon Tier-1: FTS.
Networking: connectivity bandwidth estimates, contact with the NREN RENATER, test validation.
Infrastructure support and monitoring within EGEE: is there a need for LCG national-specific support?
LHC computing tracking: collaboration with the experiment representatives, taking part in the Service Challenges and experiment-driven tests, exercising the reliability of the models of data distribution and processing.
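As an illustration of the kind of connectivity estimate involved (the transfer volume is a hypothetical example, not a figure from the slide): importing 30 TB of data per week requires a sustained rate of about 30 × 10^12 × 8 bits / (7 × 86,400 s) ≈ 0.4 Gb/s, before any allowance for catch-up after downtime.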
33
Other Activities and Concerns
LCG Quattor working group (QWG): an operational GDB working group with an official mandate for automatic installation and configuration of the LCG software and node configuration management. Chairperson: C. Loomis (LAL); contributions from GRIF, largely used in LCG-France.
Storage space provision is a major concern for all tiers: data access patterns required by the experiments; managed disk-enabled storage (SRM v2.2 implementation); file systems (GPFS, Lustre under evaluation) and GSI-enabled protocols.
34
Collaborative work
Within EGEE SA1: resources and services provision and reliable operation.
Within LCG-France: Tier-1, Tier-2 and Tier-3 integration; enhanced collaboration between site experts and experiment representatives.
Working together with the experiments: the experiment computing models define specific data flows between Tier-1s and Tier-2s.
35
Foreign collaborations
Driven by scientific motivation and the experiments' needs, LCG-France has established foreign collaborations:
in Europe: Belgium (CMS Tier-2), the Romanian federation (ATLAS Tier-2);
in Asia: IHEP, China (ATLAS and CMS Tier-2), ICEPP, Japan (ATLAS Tier-2).
Collaboration on setting up the grid computing infrastructure as well.
36
Conclusions: LCG-France is not a national grid initiative
LCG-France is working as a national project driven by the LHC computing needs, committed to the worldwide success of WLCG and EGEE.
News from the Tier-1: see D. Boutigny's talk.
Additional resources are coming from the Tier-2s and even from Tier-3 initiatives.
This is a perfect context for enhanced collaboration between France and China.
37
Thanks Thank you for your attention
Many thanks to Prof. Hesheng CHEN, and the Institute of High Energy Physics