Middleware Development and Deployment Status


Middleware Development and Deployment Status
Tony Doyle, PPE & PPT Lunchtime Talk, 9 November 2004

Contents
What are the challenges? What is the scale? How does the Grid work? What is the status of (EGEE) middleware development? What is the deployment status? What is GridPP doing as part of the international effort? What was GridPP1? Is GridPP a Grid? What is planned for GridPP2? What lies ahead? Summary.
(Why? What? How? When?)

Many sciences generate data and might require a Grid: Earth Observation, Bioinformatics, Astronomy, Digital Curation, Healthcare, Collaborative Engineering.

What are the challenges? The Grid must:
- share data between thousands of scientists with multiple interests
- link major (Tier-0 [Tier-1]) and minor (Tier-1 [Tier-2]) computer centres
- ensure all data are accessible anywhere, anytime
- grow rapidly, yet remain reliable for more than a decade
- cope with the different management policies of different centres
- ensure data security
- be up and running routinely by 2007

What are the challenges? Data Management, Security and Sharing:
1. Software process
2. Software efficiency
3. Deployment planning
4. Link centres
5. Share data
6. Manage data
7. Install software
8. Analyse data
9. Accounting
10. Policies

Tier-1 Scale
Step 1: financial planning. Step 2: compare to (e.g. Tier-1) experiment requirements. Step 3: conclude that more than one centre is needed. Step 4: a Grid?
Ian Foster / Carl Kesselman: "A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities."
Currently network performance doubles every year (or so) for unit cost.

What is the Grid? The "hour glass" layered model:
I. Experiment Layer (e.g. portals)
II. Application Middleware (e.g. metadata)
III. Grid Middleware (e.g. information services)
IV. Facilities and Fabrics (e.g. storage services)

How do I start? http://www.gridpp.ac.uk/start/
Getting started as a Grid user: a quick-start guide for LCG2, the GridPP guide to starting as a user of the Large Hadron Collider Computing Grid.
Getting an e-Science certificate: in order to use the Grid you need a Grid certificate. This page introduces the UK e-Science Certification Authority, which issues certificates to users; you can get a certificate from here.
Using the LHC Computing Grid (LCG): CERN's guide on the steps you need to take in order to become a user of the LCG, including contact details for support.
LCG user scenario: describes in a practical way the steps a user has to follow to send and run jobs on LCG and to retrieve and process the output successfully. Currently being improved.
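As a rough illustration of the very first step once a certificate is installed under ~/.globus, the sketch below (an assumption-laden example, not part of the original talk) creates a short-lived proxy and prints its details. It assumes the Globus client tools grid-proxy-init and grid-proxy-info are on the PATH of an LCG-2 User Interface machine.

```python
# Minimal sketch: create and inspect a Grid proxy on an LCG-2 User Interface node.
# Assumes a valid e-Science certificate is already installed under ~/.globus and
# that the Globus client tools (grid-proxy-init, grid-proxy-info) are on the PATH.
import subprocess

def create_proxy(hours: int = 12) -> None:
    """Create a short-lived proxy; grid-proxy-init prompts for the certificate passphrase."""
    subprocess.run(["grid-proxy-init", "-hours", str(hours)], check=True)

def show_proxy() -> None:
    """Print the subject, issuer and remaining lifetime of the current proxy."""
    subprocess.run(["grid-proxy-info"], check=True)

if __name__ == "__main__":
    create_proxy()
    show_proxy()
```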

Job Submission (behind the scenes)
[Diagram: User Interface, Resource Broker, Information Service, Replica Catalogue, Logging & Book-keeping, Job Submission Service, Compute Element, Storage Element.] After grid-proxy-init and authentication/authorisation, the UI sends the JDL job description and the input sandbox to the Resource Broker. The broker matches the job against SE and CE information published to the Information Service and dataset locations held in the Replica Catalogue, expands the JDL, and passes the job (with the input sandbox and broker info, as Globus RSL) to the Job Submission Service on the chosen Compute Element. Job submit events and job status are recorded by the Logging & Book-keeping service, which the user can query; the output sandbox is returned to the UI.
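To make the user-visible side of this chain concrete, here is a minimal, hedged sketch. It assumes an LCG-2 User Interface with the EDG WMS client tools (edg-job-submit, edg-job-status) installed and a valid proxy already created; the JDL contents and file names are illustrative only.

```python
# Hedged sketch of submitting a trivial job through the EDG/LCG-2 Resource Broker.
# Assumes the edg-job-submit and edg-job-status commands are available and a valid
# proxy exists; JDL contents and file names below are illustrative.
import subprocess
from pathlib import Path

JDL = """\
Executable    = "/bin/hostname";
Arguments     = "-f";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
"""

def submit(jdl_text: str, jdl_path: str = "hostname.jdl", ids_file: str = "jobids.txt") -> None:
    """Write the JDL job description and hand it to the Resource Broker."""
    Path(jdl_path).write_text(jdl_text)
    subprocess.run(["edg-job-submit", "-o", ids_file, jdl_path], check=True)

def status(ids_file: str = "jobids.txt") -> None:
    """Query the Logging & Book-keeping service for the job's current state."""
    subprocess.run(["edg-job-status", "-i", ids_file], check=True)

if __name__ == "__main__":
    submit(JDL)
    status()
```

Once the job has finished, the output sandbox would be retrieved with edg-job-get-output -i jobids.txt.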

Enabling Grids for E-sciencE (EGEE): deliver a 24/7 Grid service to European science.
- build a consistent, robust and secure Grid network that will attract additional computing resources
- continuously improve and maintain the middleware in order to deliver a reliable service to users
- attract new users from industry as well as science and ensure they receive the high standard of training and support they need
100 million euros over 4 years, funded by the EU; >400 software engineers plus service support; 70 European partners.

Prototype Middleware Status & Plans (I)
Workload Management: AliEn TaskQueue; EDG WMS (plus new TaskQueue and Information Supermarket); EDG L&B.
Computing Element: Globus Gatekeeper + LCAS/LCMAPS; dynamic accounts (from Globus); CondorC; interfaces to LSF/PBS (blahp); "pull components" AliEn CE and gLite CEmon (being configured).
(On the original slide, blue items were deployed on the development testbed; red items were proposed.)

Prototype Middleware Status & Plans (II)
Storage Element: existing SRM implementations (dCache, Castor, … FNAL & LCG DPM); gLite-I/O (re-factored AliEn-I/O).
Catalogs: AliEn FileCatalog (global catalog); gLite Replica Catalog (local catalog); catalog update (messaging); FiReMan interface; RLS (Globus).
Data Scheduling: File Transfer Service (Stork + GridFTP); File Placement Service; Data Scheduler.
Metadata Catalog: simple interface defined (AliEn + BioMed).
Information & Monitoring: R-GMA web service version; multi-VO support.
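For a feel of how a user exercises the storage and catalogue layer, the sketch below uses the LCG-2 lcg-utils commands (lcg-cr to copy and register a file, lcg-lr to list its replicas) rather than the gLite prototype tools listed above; the VO name, Storage Element hostname and logical file name are placeholders, not real services.

```python
# Illustrative sketch of user-level replica management with LCG-2 lcg-utils.
# Assumes a valid proxy and the lcg-cr / lcg-lr commands on the PATH; the VO name,
# SE hostname and LFN below are placeholders.
import os
import subprocess

VO = "dteam"                 # placeholder VO name
SE = "se.example.ac.uk"      # placeholder Storage Element hostname

def copy_and_register(local_file: str, lfn: str) -> None:
    """Copy a local file to the SE and register it in the replica catalogue under an LFN."""
    subprocess.run(
        ["lcg-cr", "--vo", VO, "-d", SE,
         "-l", f"lfn:{lfn}", f"file:{os.path.abspath(local_file)}"],
        check=True,
    )

def list_replicas(lfn: str) -> None:
    """Ask the replica catalogue where physical copies of this logical file are stored."""
    subprocess.run(["lcg-lr", "--vo", VO, f"lfn:{lfn}"], check=True)

if __name__ == "__main__":
    copy_and_register("results.root", "my-test-file")
    list_replicas("my-test-file")
```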

Prototype Middleware Status & Plans (III)
Security: VOMS as Attribute Authority and for VO management; myProxy as proxy store; GSI security and VOMS attributes for enforcement; fine-grained authorization (e.g. ACLs); Globus to provide a set-uid service on the CE.
Accounting: EDG DGAS (not used yet).
User Interface: AliEn shell; CLIs and APIs; GAS (catalogs, with remaining services to be integrated).
Package manager: prototype based on the AliEn backend, to evolve to the final architecture agreed with the ARDA team.
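The two user-facing security steps mentioned above can be sketched as follows. This assumes the voms-proxy-init and myproxy-init clients are installed; the VO name and myProxy server hostname are illustrative placeholders.

```python
# Sketch of the user-facing security steps: a VOMS-extended proxy for authorisation
# and a longer-lived credential stored in a myProxy server.
# Assumes voms-proxy-init and myproxy-init are installed; VO and server names are
# placeholders.
import subprocess

def voms_proxy(vo: str = "atlas") -> None:
    """Create a proxy carrying VOMS attributes (VO membership, groups, roles)."""
    subprocess.run(["voms-proxy-init", "-voms", vo], check=True)

def store_in_myproxy(server: str = "myproxy.example.org") -> None:
    """Delegate a longer-lived credential to a myProxy store, e.g. for proxy renewal."""
    subprocess.run(["myproxy-init", "-s", server, "-d"], check=True)

if __name__ == "__main__":
    voms_proxy()
    store_in_myproxy()
```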

[Organisation diagram: CB and PMB; Deployment Board (Tier-1/Tier-2, testbeds, rollout; service specification & provision) and User Board (requirements, application development, user feedback); middleware areas: Metadata, Storage, Workload, Network, Security, Info. Mon.; external links: ARDA, Experiments, EGEE, LCG.]

Middleware Development areas: Network Monitoring, Configuration Management, Grid Data Management, Storage Interfaces, Information Services, Security.

Application Development: ATLAS, LHCb, CMS, BaBar (SLAC), SAMGrid (FermiLab), QCDGrid, PhenoGrid.

GridPP Deployment Status
GridPP deployment is part of LCG (currently the largest Grid in the world); the future Grid in the UK is dependent upon LCG releases. Three Grids operate on a global scale in HEP, with similar functionality:
- LCG (GridPP): 90 (15) sites, 8,700 (1,500) CPUs
- Grid3 [USA]: 29 sites, 2,800 CPUs
- NorduGrid: 30 sites, 3,200 CPUs

LCG Overview
By 2007: 100,000 CPUs at more than 100 institutes worldwide, building on complex middleware being developed in advanced Grid technology projects in both Europe (gLite) and the USA (VDT). The prototype went live in September 2003 in 12 countries and was extensively tested by the LHC experiments during this summer.

Deployment Status (26/10/04)
Incremental releases have brought significant improvements in reliability, performance and scalability within the limits of the current architecture: scalability is much better than expected a year ago; there are many more nodes and processors than anticipated; the installation problems of last year have been overcome; many small sites have contributed to MC productions; and full-scale testing has taken place as part of this year's data challenges.
GridPP: "The Grid becomes a reality" was widely reported, e.g. on the British Embassy (USA) and British Embassy (Russia) technology sites.

Data Challenges
Ongoing Grid and non-Grid production; the Grid share is now significant.
ALICE: 35 CPU years; Phase 1 done, Phase 2 ongoing on LCG.
CMS: 75 M events and 150 TB, the first of this year's Grid data challenges.
Entering the Grid production phase.

ATLAS Data Challenge
7.7 M GEANT4 events and 22 TB; the UK provided ~20% of the LCG share. Ongoing on the three Grids, with ~150 CPU years of Grid production so far. This is the largest total computing requirement, yet still a small fraction of what ATLAS need. Entering the Grid production phase.

LHCb Data Challenge
424 CPU years (4,000 kSI2k months), 186 M events; the UK's input was significant (>1/4 of the total).
LCG (UK) resource: Tier-1 7.7%; Tier-2 sites: London 3.9%, South 2.3%, North 1.4%. DIRAC: Imperial 2.0%, Liverpool 3.1%, Oxford 0.1%, ScotGrid 5.1%.
[Production-rate plot: DIRAC alone, then LCG in action at 1.8 x 10^6 events/day; LCG paused at the end of Phase 1, then restarted at 3-5 x 10^6 events/day; 186 M events produced in total.]
Entering the Grid production phase.

Paradigm Shift: transition to the Grid (424 CPU years). Monthly split (non-Grid : Grid):
- May: 89% : 11% (11% of DC'04)
- Jun: 80% : 20% (25% of DC'04)
- Jul: 77% : 23% (22% of DC'04)
- Aug: 27% : 73% (42% of DC'04)

More Applications
ZEUS uses LCG: it needs the Grid to respond to increasing demand for MC production, and has produced 5 million GEANT events on the Grid since August 2004.
QCDGrid, for UKQCD: currently a 4-site data grid managing a few hundred gigabytes of data. Key technologies used: Globus Toolkit 2.4, European DataGrid, the eXist XML database.

Issues
Problems from the first large-scale Grid production are being addressed at all levels; see "LCG-2 Middleware Problems and Requirements for LHC Experiment Data Challenges", https://edms.cern.ch/file/495809/2.2/LCG2-Limitations_and_Requirements.pdf

Is GridPP a Grid? (Foster's three-point checklist, http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf)
1. Coordinates resources that are not subject to centralized control ...
2. ... using standard, open, general-purpose protocols and interfaces ...
3. ... to deliver nontrivial qualities of service.
YES. This is why development and maintenance of LCG is important: VDT (Globus/Condor-G) + EDG/EGEE (gLite) approximately meet this requirement, as demonstrated by the LHC experiments' data challenges over the summer of 2004 (http://agenda.cern.ch/fullAgenda.php?ida=a042133).

What was GridPP1?
A team that built a working prototype grid of significant scale: >1,500 (7,300) CPUs, >500 (6,500) TB of storage, >1,000 (6,000) simultaneous jobs. A complex project in which 82% of the 190 tasks for the first three years were completed. A success: "the achievement of something desired, planned, or attempted".

Aims for GridPP2? From Prototype to Production.
[Evolution diagram, 2001 -> 2004 -> 2007: from separate experiments, resources and multiple accounts (BaBar/BaBarGrid, CDF/D0/SAMGrid, ATLAS/LHCb/GANGA, ALICE/CMS, EDG, ARDA, EGEE, LCG), through prototype Grids, to 'one' production Grid; CERN Computer Centre -> CERN Prototype Tier-0 Centre -> CERN Tier-0 Centre; RAL Computer Centre -> UK Prototype Tier-1/A Centre -> UK Tier-1/A Centre; 19 UK Institutes -> 4 UK Prototype Tier-2 Centres -> 4 UK Tier-2 Centres.]

Planning: GridPP2 ProjectMap. Structures agreed and in place (except LCG phase-2).

What lies ahead? Some mountain climbing.
Annual data storage: 12-14 PetaBytes per year; a CD stack holding one year of LHC data would be ~20 km tall (Concorde cruises at 15 km; "we are here" at 1 km). Required capacity: 100 million SPECint2000, roughly 100,000 PCs (3 GHz Pentium 4).
The importance of step-by-step planning: pre-plan your trip, carry an ice axe and crampons and arrange for a guide. In production terms, we've made base camp. Quantitatively, we're ~9% of the way there in terms of CPU (9,000 of 100,000) and disk (3 PB of 12-14 PB/year x 3 years).
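Taking the slide's own numbers at face value, the "base camp" figure is simple arithmetic (a rough estimate, not an official projection):

```latex
\[
\frac{\text{CPU now}}{\text{CPU needed}} \approx \frac{9\,000}{100\,000} = 9\%,
\qquad
\frac{\text{disk now}}{\text{disk needed}} \approx \frac{3\ \text{PB}}{(12\text{--}14)\ \text{PB/yr} \times 3\ \text{yr}} \approx 7\text{--}8\%.
\]
```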

Summary: 1. Why? 2. What? 3. How? 4. When?
From the particle physics perspective the Grid is:
1. (Why?) needed to utilise large-scale computing resources efficiently and securely
2. (What?) a) a working prototype running today on large testbed(s); b) about seamless discovery of computing resources; c) using evolving standards for interoperation; d) the basis for computing in the 21st century; e) not (yet) as transparent or robust as end-users need
3. (How?) see the GridPP getting started pages (two-day EGEE training courses available)
4. (When?) a) now, at prototype level, for simple(r) applications (e.g. experiment Monte Carlo production); b) September 2007 for more complex applications (e.g. data analysis), ready for the LHC