The Grid Project
Fabrizio Gagliardi, CERN Information Technology Division
CERN - TERENA, Lisbon, May 2000
Summary
- The High Energy Physics and CERN computing problem
- An excellent computing model: the GRID
- The Data Grid Initiative
CERN Organization
- Largest particle physics laboratory in the world
- European/international centre for particle physics research
- Budget: 1020 MCHF
- 2700 staff
- 7000 physicist users
The LHC Detectors
- CMS, ATLAS, LHCb
- 3.5 PetaBytes / year
- ~10^8 events / year
The HEP Problem - Part I
The scale...
[Chart] Capacity that can be purchased for the value of the equipment present in 2000 (~10K SI processors): LHC versus the non-LHC technology-price curve (40% annual price improvement)
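As a rough illustration of what a 40% annual price improvement implies for a fixed budget, the sketch below compounds the purchasable capacity year by year; the 2000 baseline of 1.0 and the five-year horizon are assumptions chosen purely for illustration.

```python
# Rough illustration only: compounding effect of a 40% annual price
# improvement on the capacity a fixed budget can buy. The baseline of 1.0
# in 2000 and the 2005 horizon are assumed for illustration, not quoted.
def capacity_multiplier(years: int, annual_price_drop: float = 0.40) -> float:
    """Relative capacity purchasable for the same money after `years` years."""
    return (1.0 / (1.0 - annual_price_drop)) ** years

if __name__ == "__main__":
    for year in range(0, 6):
        print(f"2000 + {year} yr: x{capacity_multiplier(year):.1f}")
    # roughly x1.7 per year, about x13 over five years
```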
Long Term Tape Storage Estimates
[Chart] TeraBytes of tape storage per year for current experiments, COMPASS, and the LHC
HPC or HTC
High Throughput Computing:
- a mass of modest, independent problems
- computing in parallel - not parallel computing
- throughput rather than single-program performance
- resilience rather than total system reliability
- we have learned to exploit inexpensive mass-market components, but we need to marry these with inexpensive, highly scalable management tools
- much in common with other sciences (see the EU-US Annapolis Workshop): Astronomy, Earth Observation, Bioinformatics, and commercial/industrial applications such as data mining, Internet computing, e-commerce facilities, ...
- contrast with supercomputing
(A minimal throughput sketch follows this slide.)
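To make the "throughput rather than single-program performance" point concrete, here is a minimal sketch of a high-throughput workload in Python: many small, independent jobs dispatched to a pool of workers. The `analyse` function and the event range are hypothetical stand-ins, not part of any Grid middleware.

```python
# Illustrative only: a toy high-throughput workload.
# Each "event" is an independent job; aggregate throughput, not the speed
# of any single job, is what matters.
from concurrent.futures import ProcessPoolExecutor

def analyse(event_id: int) -> int:
    """Pretend per-event analysis: small, independent, embarrassingly parallel."""
    return sum(i * i for i in range(10_000)) % (event_id + 1)

def main() -> None:
    events = range(1_000)                          # a mass of modest, independent problems
    with ProcessPoolExecutor() as pool:            # computing in parallel,
        results = list(pool.map(analyse, events))  # not parallel computing
    print(f"processed {len(results)} events")

if __name__ == "__main__":
    main()
```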
Generic component model of a computing farm
- network servers
- tape servers
- disk servers
- application servers
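A minimal sketch of the same component model as a data structure; the counts below are made-up placeholders, not CERN's actual farm sizing.

```python
# Toy sketch only: the generic farm component model as a simple record.
# All numbers are placeholders for illustration.
from dataclasses import dataclass

@dataclass
class Farm:
    application_servers: int   # run the physics jobs
    disk_servers: int          # staging / cache layer
    tape_servers: int          # interface to the mass-storage (tape) layer
    network_servers: int       # wide-area and inter-component connectivity

farm = Farm(application_servers=1000, disk_servers=100,
            tape_servers=20, network_servers=10)
print(farm)
```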
The HEP Problem - Part II
Geography, sociology, funding and politics...
World Wide Collaboration: distributed computing & storage capacity
- CMS: 1800 physicists, 150 institutes, 32 countries
Regional Centres - a Multi-Tier Model (MONARC report)
[Diagram] CERN (Tier 0), Tier 1 regional centres (FNAL, RAL, IN2P3), Tier 2 sites (Lab a, Uni b, Lab c, ..., Uni n), then departments and desktops, connected by links of 2.5 Gbps, 622 Mbps and 155 Mbps
(A toy transfer-time sketch over such links follows.)
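The sketch below models the tiered topology and gives a back-of-the-envelope transfer time per hop. The mapping of bandwidths to tiers is an assumption for illustration, not the MONARC assignment, and protocol overheads are ignored.

```python
# Illustrative sketch only: a toy multi-tier topology with an ideal
# transfer-time estimate. The tier-to-bandwidth mapping is assumed.
LINK_MBPS = {
    ("Tier0", "Tier1"): 2500,   # e.g. CERN to a regional centre (assumed)
    ("Tier1", "Tier2"): 622,    # regional centre to a lab/university (assumed)
    ("Tier2", "Dept"): 155,     # down to a department network (assumed)
}

def transfer_hours(terabytes: float, hop: tuple[str, str]) -> float:
    """Ideal (no-overhead) time to move `terabytes` over one hop, in hours."""
    bits = terabytes * 8e12
    return bits / (LINK_MBPS[hop] * 1e6) / 3600

if __name__ == "__main__":
    for hop in LINK_MBPS:
        print(f"1 TB over {hop}: {transfer_hours(1.0, hop):.1f} h")
```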
Are Grids a solution?
- Change of orientation of US meta-computing activity: from inter-connected supercomputers towards a more general concept of a computational Grid ("The Grid" - Ian Foster, Carl Kesselman)
- This has initiated a flurry of activity in HEP:
  - US: Particle Physics Data Grid (PPDG)
  - GriPhyN: data grid proposal submitted to NSF
  - Grid technology evaluation project in INFN
  - UK proposal for funding for a prototype grid
  - NASA Information Processing Grid
The Grid
"Dependable, consistent, pervasive access to [high-end] resources"
- Dependable: provides performance and functionality guarantees
- Consistent: uniform interfaces to a wide variety of resources
- Pervasive: ability to "plug in" from anywhere
R&D required
- Local fabric
  - Management of giant computing fabrics: auto-installation, configuration management, resilience, self-healing
  - Mass storage management: multi-PetaByte data storage, "real-time" data recording requirement, active tape layer, 1000s of users
- Wide area - building on an existing framework & research network (e.g. Globus, Geant)
  - Workload management: no central status, local access policies
  - Data management: caching, replication, synchronisation, object database model (see the sketch after this slide)
  - Application monitoring
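To give the data-management bullet (caching, replication, synchronisation) a concrete shape, here is a toy replica catalogue. The class name, the logical file names and the caching behaviour are assumptions for illustration only, not the DataGrid middleware design.

```python
# Illustrative sketch only: a toy replica catalogue with naive caching.
# Nothing here reflects the actual DataGrid data-management design.
from dataclasses import dataclass, field

@dataclass
class ReplicaCatalogue:
    # logical file name -> set of sites holding a physical copy
    replicas: dict[str, set[str]] = field(default_factory=dict)

    def register(self, lfn: str, site: str) -> None:
        """Record that `site` holds a copy of logical file `lfn`."""
        self.replicas.setdefault(lfn, set()).add(site)

    def locate(self, lfn: str, preferred_site: str) -> str:
        """Prefer a local copy; otherwise pick any site and cache a replica locally."""
        sites = self.replicas.get(lfn, set())
        if not sites:
            raise KeyError(f"no replica registered for {lfn}")
        if preferred_site in sites:
            return preferred_site
        source = sorted(sites)[0]           # naive choice; a real system would rank by cost
        self.register(lfn, preferred_site)  # "replication": cache a copy at the requesting site
        return source

if __name__ == "__main__":
    cat = ReplicaCatalogue()
    cat.register("lfn:/cms/run42/events.root", "CERN")
    print(cat.locate("lfn:/cms/run42/events.root", "RAL"))  # served from CERN, cached at RAL
    print(cat.locate("lfn:/cms/run42/events.root", "RAL"))  # now served locally
```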
HEP Data Grid Initiative
- European-level coordination of national initiatives & projects
- Principal goals:
  - Middleware for fabric & Grid management
  - Large-scale testbed: a major fraction of one LHC experiment
  - Production-quality HEP demonstrations: "mock data", simulation analysis, current experiments
  - Other science demonstrations
  - Three-year phased developments & demos
- Complementary to other GRID projects
  - EuroGrid: uniform access to parallel supercomputing resources
  - Synergy to be developed (GRID Forum, Industry and Research Forum)
Participants
- Main partners: CERN, INFN (I), CNRS (F), PPARC (UK), NIKHEF (NL), ESA-Earth Observation
- Other sciences: KNMI (NL), biology, medicine
- Industrial participation: CS SI (F), DataMat (I), IBM (UK)
- Associated partners: Czech Republic, Finland, Germany, Hungary, Spain, Sweden (mostly computer scientists)
- Formal collaboration with USA
- Industry and Research Project Forum with representatives from: Denmark, Greece, Israel, Japan, Norway, Poland, Portugal, Russia
Status
- Prototype work already started at CERN and in most of the collaborating institutes
- Proposal to RN2 submitted
- Network requirements discussed with Dante/Geant
WAN Requirements
- High bandwidth from CERN to the Tier 1 centres (5-6)
- VPN, Quality of Service
- Guaranteed performance during limited test periods and, at the end of the project, for production-quality services
- Target requirements (2003): 2.5 Gb/s, ... Mb/s, ... Mb/s
- Could saturate 2.5 Gb/s for a limited amount of test time: 100 MB/s out from a 100-PC farm, and we plan for farms of 1000s of PCs (see the estimate after this slide)
- Reliability is an important factor: from the Web client-server model to the Grid peer distributed computing model
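A back-of-the-envelope conversion of the farm output figures quoted on the slide into link bandwidth: the 1 MB/s-per-PC rate is implied by "100 MB/s out from a 100 PC farm", the 1000-PC scaling is the slide's own plan, and the rest is unit conversion.

```python
# Back-of-the-envelope only: farm output converted to link bandwidth.
def farm_output_gbps(n_pcs: int, mb_per_s_per_pc: float = 1.0) -> float:
    """Aggregate outbound rate of the farm in gigabits per second."""
    return n_pcs * mb_per_s_per_pc * 8 / 1000   # MB/s -> Mb/s -> Gb/s

if __name__ == "__main__":
    print(f"100-PC farm:  {farm_output_gbps(100):.1f} Gb/s")   # ~0.8 Gb/s
    print(f"1000-PC farm: {farm_output_gbps(1000):.1f} Gb/s")  # ~8 Gb/s, well beyond a 2.5 Gb/s link
```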
Conclusions
- This project, motivated by HEP and other data- and compute-intensive sciences, will contribute to developing and implementing a new world-wide distributed computing model: the GRID
- An ideal computing model for the next-generation Internet
- An excellent test case for the next generation of high-performance research networks