1 LCG LHC Computing Grid Project – From the Web to the Grid
23 September 2003
Jamie Shiers, Database Group, IT Division, CERN, Geneva, Switzerland
Jamie.Shiers@cern.ch | http://cern.ch/jamie/

2 LCG Overview
- Very brief overview of CERN
- Use of Oracle at CERN – a partnership lasting two decades
- From the Large Electron Positron collider (LEP) to the Large Hadron Collider (LHC)
- The LHC Computing Grid (LCG) and Oracle’s role

3 LCG The European Organisation for Nuclear Research / The European Laboratory for Particle Physics
- Fundamental research in particle physics
- Designs, builds & operates large accelerators
- Financed by 20 European countries (member states) + others (US, Canada, Russia, India, …)
- ~1000 MCHF budget – operation + new accelerators
- 2000 staff + 6000 users (researchers) from all over the world
- LHC (starts ~2007) experiment: 2000 physicists, 150 universities, apparatus costing ~€300M, computing ~€250M to set up, ~€60M/year to run
- 10+ year lifetime

4 LCG [Aerial view: Geneva, the airport, the CERN Computer Centre and the 27 km ring]

5 LCG LEP: 1989 – 2000 (RIP)
- 27 km ring with counter-circulating electrons and positrons
- Oracle Database selected to help with LEP construction
- Originally ran on PDP-11, later VAX, IBM, Sun, now Linux
- Oracle now used during LEP dismantling phase
  - Data on LEP components must be kept forever
- Oracle is now used across the entire spectrum of the lab’s activities
  - Several Sun-based clusters (8i OPS, 9i RAC)
  - Many stand-alone Linux-based systems
  - Both database and, increasingly, Application Server

6 LCG Highlights of the LEP Era
- LEP computing started with the MAINFRAME: initially IBM running VM/CMS, a large VAXcluster, also Cray
- In 1989, the first proposal of what led to the Web was made
  - Somewhat heretical at the time: strongly based on e.g. use of Internet protocols, whereas the official line was OSI…
  - Goal was to simplify the task of sharing information amongst physicists: by definition distributed across the world
- Technology convergence: explosion of the Internet – explosion of the Web
- In the early 1990s, first steps towards fully distributed computing with farms of RISC processors running Unix
  - The “SHIFT” project, winner of a ComputerWorld Honors Award

7 LCG The Large Hadron Collider (LHC) A New World-Class Machine in the LEP Tunnel (First proposed in 1979!)

8 LCG [image-only slide]

9 The LHC machine
- Two counter-circulating proton beams
- Collision energy 7 + 7 TeV
- 27 km of magnets with a field of 8.4 Tesla
- Superfluid helium cooled to 1.9 K
- The world’s largest superconducting structure

10 LCG The ATLAS detector – the size of a six-floor building!

11 LCG The ATLAS Cavern – January 2003

12 LCG Data Acquisition (ATLAS detector)
- Multi-level trigger: filters out background, reduces data volume
- Level 1 – Special Hardware: 40 MHz interaction rate, equivalent to 2 PetaBytes/sec
- Level 2 – Embedded Processors
- Level 3 – Giant PC Cluster: 160 Hz (320 MB/sec) to Data Recording & Offline Analysis
- Record data 24 hours a day, 7 days a week – equivalent to writing a CD every 2 seconds
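A quick cross-check of the numbers quoted above, done here rather than on the original slide: at 160 Hz and 320 MB/sec the average recorded event is about 2 MB, two seconds of recording fill roughly one CD, and a day of continuous running amounts to a few tens of TB. The CD capacity used below is an assumed nominal 650 MB.

```python
# Back-of-the-envelope check of the ATLAS data-acquisition numbers quoted on the slide.
# The rate and bandwidth come from the slide; the CD capacity (650 MB) is an assumption.

RECORD_RATE_HZ = 160          # events written to storage per second (Level 3 output)
RECORD_BANDWIDTH_MB_S = 320   # sustained recording bandwidth in MB/sec
CD_CAPACITY_MB = 650          # nominal capacity of a data CD (assumed)

event_size_mb = RECORD_BANDWIDTH_MB_S / RECORD_RATE_HZ
seconds_per_cd = CD_CAPACITY_MB / RECORD_BANDWIDTH_MB_S
tb_per_day = RECORD_BANDWIDTH_MB_S * 86_400 / 1_000_000   # MB/s -> TB/day (decimal units)

print(f"Average recorded event size : {event_size_mb:.1f} MB")
print(f"One CD filled every         : {seconds_per_cd:.1f} s")
print(f"Raw recording rate          : {tb_per_day:.1f} TB/day")
```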

13 LCG Oracle for Physics Data
- Work on LHC computing started ~1992 (some would say earlier…)
- Numerous projects kicked off in 1994/5 to look at handling multi-PB of data; move from Fortran to OO (C++), etc.
- Led to production solutions from ~1997
- Always said that ‘disruptive technology’, like the Web, would have to be taken into account
- In 2002, a major project started to move 350 TB of data out of an ODBMS solution; >100 MB/s sustained over 24-hour periods
- Now ~2 TB of physics data stored in Oracle on Linux servers
  - A few % of total data volume; expected to double in 2004
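For a sense of scale (my own back-of-the-envelope estimate, not a figure from the talk): streaming 350 TB at the quoted sustained rate of 100 MB/s takes on the order of 40 days of continuous transfer, before any validation or re-runs.

```python
# Rough feasibility check of the ODBMS migration quoted on the slide.
# Uses the slide's own figures (350 TB, >100 MB/s); the "pure transfer time" framing is mine.

DATA_VOLUME_TB = 350
SUSTAINED_RATE_MB_S = 100    # the slide quotes ">100 MB/s for 24 hour periods"

seconds = DATA_VOLUME_TB * 1_000_000 / SUSTAINED_RATE_MB_S   # TB -> MB (decimal units)
print(f"Pure transfer time: {seconds / 86_400:.0f} days of continuous streaming")
# -> roughly 40 days, i.e. a multi-month campaign once validation and re-runs are included
```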

14 LCG Linux for Physics Computing
- First steps with Linux started ~1993: port of Physics Application Software to Linux on PCs
- 1996: proposal to set up Windows-based batch farms for Physics Data Processing
- Overtaken by developments in Linux:
  - Windows a poor match for the batch environment
  - Linux essentially trivial to port to from Solaris, HP/UX, etc.
- Convergence of technologies: PC h/w offers unbeatable price-performance; Linux becomes robust
- ~All physics computing at CERN now based on Linux / Intel
- Strategic platform for the LHC

15 LCG The Grid – The Solution to LHC Computing? LHC Computing Project = LHC Computing Grid (LCG)

16 LCG LHC Computing Grid (LCG)
- Global requirement: handle the processing and data-handling needs of the 4 main LHC collaborations
- Total of 12-14 PB of data per year (>20 million CDs); lifetime 10+ years
- Analysis will require the equivalent of 70,000 of today’s fastest PCs
- LCG project established to meet these unprecedented requirements
- Builds on work of the European DataGrid (EDG) and the Virtual Data Toolkit (US)
- Physicists access world-wide distributed data & resources as if local
- The system determines where a job runs, based on resources required/available (a minimal matchmaking sketch follows below)
- Initial partners include sites in CH, F, D, I, UK, US, Japan, Taiwan & Russia
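To make the “system decides where a job runs” point concrete, here is a deliberately minimal matchmaking sketch. It is an illustration only: the site names, the Job/Site fields and the selection rule are invented, and the real EDG/LCG Resource Broker used its own job-description language and information system.

```python
# Minimal, illustrative matchmaking sketch: a job declares its requirements and a broker
# picks a site that can satisfy them. All names and numbers below are hypothetical.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    free_cpus: int
    free_disk_tb: float
    has_input_data: bool   # whether a replica of the needed dataset is already local

@dataclass
class Job:
    cpus: int
    disk_tb: float

def choose_site(job: Job, sites: list[Site]) -> Site | None:
    """Pick a site meeting the job's requirements, preferring sites that already hold
    the input data (avoiding wide-area transfers), then the least-loaded one."""
    candidates = [s for s in sites
                  if s.free_cpus >= job.cpus and s.free_disk_tb >= job.disk_tb]
    if not candidates:
        return None
    candidates.sort(key=lambda s: (not s.has_input_data, -s.free_cpus))
    return candidates[0]

sites = [Site("CERN", 400, 20.0, True),
         Site("RAL", 120, 5.0, False),
         Site("FNAL", 250, 8.0, True)]
print(choose_site(Job(cpus=50, disk_tb=1.0), sites).name)   # -> CERN
```

Preferring sites that already hold the input data mirrors the general Grid principle of moving jobs to the data rather than data to the jobs.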

17 LCG Centres taking part in the initial LCG service (2003-05): around the world – around the clock

18 LCG LCG and Oracle
- Current thinking is that bulk data will be streamed to files
  - An RDBMS backend is also being studied for ‘analysis data’
- File catalog (~10⁹ files) and file-level metadata will be stored in Oracle in a Grid-aware catalog (an illustrative schema is sketched below)
- In the longer term, event-level metadata may also be stored in the database, leading to much larger data volumes
  - A few PB, assuming a total data volume of 100-200 PB
- Current storage management system – CASTOR at CERN – also uses a database to manage the naming / location of files
  - Bulk data stored in tape silos and faulted in to huge disk caches
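As an illustration of what a file catalog with file-level metadata in a relational database might look like, here is a hypothetical, heavily simplified schema, sketched with sqlite3 standing in for Oracle. All table and column names are invented and do not reflect the actual LCG or CASTOR schemas.

```python
# Hypothetical, heavily simplified file-catalog schema; sqlite3 is used as a stand-in
# for Oracle. Table and column names are invented for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_catalog (
    guid        TEXT PRIMARY KEY,   -- globally unique file identifier
    lfn         TEXT UNIQUE,        -- logical file name seen by physicists
    size_bytes  INTEGER,
    checksum    TEXT,
    created_at  TEXT
);
CREATE TABLE file_metadata (        -- free-form file-level metadata (key/value pairs)
    guid  TEXT REFERENCES file_catalog(guid),
    key   TEXT,
    value TEXT,
    PRIMARY KEY (guid, key)
);
""")

conn.execute("INSERT INTO file_catalog VALUES (?, ?, ?, ?, ?)",
             ("guid-0001", "/grid/atlas/run1234/events.root",
              2_000_000_000, "ad0c93f1", "2003-09-23"))
conn.execute("INSERT INTO file_metadata VALUES (?, ?, ?)",
             ("guid-0001", "run_number", "1234"))
print(conn.execute("SELECT lfn, size_bytes FROM file_catalog").fetchall())
```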

19 LCG Replica Location Service
(Diagram components: Replica Manager, Replica Location Service, Local Replica Catalog, Replica Metadata Catalog, Storage Elements)
Files have replicas stored at many Grid sites on Storage Elements. Each file has a unique GUID. Locations corresponding to the GUID are kept in the Replica Location Service. Users may assign aliases to the GUIDs; these are kept in the Replica Metadata Catalog. The Replica Manager provides atomicity for file operations, assuring consistency of SE and catalog contents. (Slide: james.casey@cern.ch)
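A minimal sketch of the two lookups described above, using in-memory dictionaries purely for illustration: an alias resolves to a GUID in the Replica Metadata Catalog, and the GUID resolves to one or more Storage Element locations in the Local Replica Catalog. The example values and the locate() helper are invented; the real RLS exposed these mappings through grid service interfaces.

```python
# Illustration of the RLS lookup path described on the slide. All values are invented;
# the real Replica Location Service was a set of grid services, not in-process dicts.

# Replica Metadata Catalog: user-visible alias -> GUID
alias_to_guid = {
    "lfn:/grid/atlas/run1234/events.root": "guid-0001",
}

# Local Replica Catalog: GUID -> physical replicas on Storage Elements
guid_to_replicas = {
    "guid-0001": [
        "srm://castor.cern.ch/castor/cern.ch/atlas/run1234/events.root",
        "srm://se.ral.ac.uk/atlas/run1234/events.root",
    ],
}

def locate(alias: str) -> list[str]:
    """Resolve a user alias to the list of physical replica locations."""
    guid = alias_to_guid[alias]       # Replica Metadata Catalog lookup
    return guid_to_replicas[guid]     # Local Replica Catalog lookup

for replica in locate("lfn:/grid/atlas/run1234/events.root"):
    print(replica)
```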

20 LCG Today’s Deployment at CERN
Hosts shown in the diagram: rlsatlas, rlsalice, rlscms, rlslhcb, rlsdteam, rlscert02, rlscert01, rlstest, lxshare071d, lxshare169d, lxshare183d, lxshare069d
- Oracle Application Server hosting Grid middleware, one per VO
- Shared Oracle Database for the LHC experiments
- Based on ‘standard parts’ out of CERN stores: disk server (1 TB mirrored disk); farm node (dual processor)
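Read as a mapping, the host names above suggest one RLS front end per VO in front of shared database nodes. The configuration sketch below is a guess from the diagram labels only: which lxshare node backs which service is not stated on the slide, and the endpoint_for() helper is purely illustrative.

```python
# Hypothetical reading of the deployment diagram: per-VO RLS front ends (Oracle
# Application Server) in front of shared Oracle database nodes. Role assignments for
# the lxshare nodes are not given on the slide, so they are listed without roles.

RLS_FRONTENDS = {
    "atlas": "rlsatlas",
    "alice": "rlsalice",
    "cms":   "rlscms",
    "lhcb":  "rlslhcb",
    "dteam": "rlsdteam",
    "cert":  ["rlscert01", "rlscert02"],   # certification / test instances
}

SHARED_DB_NODES = ["lxshare071d", "lxshare169d", "lxshare183d", "lxshare069d"]

def endpoint_for(vo: str) -> str:
    """Return the RLS front-end host a VO's middleware would contact (illustrative only)."""
    host = RLS_FRONTENDS[vo]
    return host[0] if isinstance(host, list) else host

print(endpoint_for("atlas"))   # -> rlsatlas
```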

21 LCG Future Deployment
- Currently studying 9i RAC on supported h/w configurations
- Expect Grid infrastructure to move to an AS cluster + RAC in Q1/Q2 2004
- Expect CASTOR databases to move to RAC, also in 2004
- May also move a few TB of event-level metadata (COMPASS) to a single RAC
- All based on Linux / Intel

22 LCG Summary
- During the past decade, we have moved from the era of the Web to that of the Grid
- Rise of Internet computing; move from mainframes to RISC to farms of dual-processor Intel boxes running Linux
- Use of Oracle has expanded from a small, dedicated service for LEP construction to all areas of the lab’s work, including handling physics data
  - Both Oracle DB and AS, including for the Grid infrastructure
- The Grid is viewed as a ‘disruptive technology’ in that it will change the way we think about computing, much like the Web

