The EGEE Project Status – Ian Bird, EGEE Operations Manager, CERN, Geneva, Switzerland – ISGC, Taipei (INFSO-RI-508833, Enabling Grids for E-sciencE, www.eu-egee.org)

Presentation transcript:

The EGEE Project Status
Ian Bird, EGEE Operations Manager, CERN, Geneva, Switzerland
ISGC, Taipei, 27th April 2005

Contents
The EGEE Project:
– Overview and Structure
– Grid Operations
– Middleware
– Networking Activities
– Applications: HEP, …, Biomedical
Summary

EGEE goals
Goal of EGEE: develop a service grid infrastructure which is available to scientists 24 hours a day.
The project concentrates on:
– building a consistent, robust and secure grid network that will attract additional computing resources
– continuously improving and maintaining the middleware in order to deliver a reliable service to users
– attracting new users from industry as well as science, and ensuring they receive the high standard of training and support they need

EGEE
EGEE is the largest grid infrastructure project in Europe:
– 70 leading institutions in 27 countries, federated in regional grids
– leveraging national and regional grid activities
– ~32 M euros of EU funding, initially for 2 years starting 1st April 2004
– successful EU review in February 2005
– preparing the 2nd phase of the project – proposal to the EU grid call in September 2005
– promoting scientific partnership outside the EU

EGEE Activities
– 48% service activities (Grid Operations, Support and Management; Network Resource Provision)
– 24% middleware re-engineering (Quality Assurance, Security, Network Services Development)
– 28% networking (Management; Dissemination and Outreach; User Training and Education; Application Identification and Support; Policy and International Cooperation)
The emphasis in EGEE is on operating a production grid and supporting the end users.


Computing Resources – April 2005
In EGEE-0 (LCG-2):
– >130 sites
– >14,000 CPUs
– >5 PB storage
(Map legend: country providing resources; country anticipating joining EGEE/LCG.)
This greatly exceeds the project expectations for the number of sites, and shows that the main issue of complexity is the number of sites.

SA1 – Operations Structure
Operations Management Centre (OMC):
– at CERN – coordination etc.
Core Infrastructure Centres (CIC):
– manage daily grid operations – oversight, troubleshooting
– run essential infrastructure services
– provide 2nd-level support to ROCs
– UK/I, France, Italy, CERN, plus Russia (M12); Taipei will also run a CIC
Regional Operations Centres (ROC):
– act as front-line support for user and operations issues
– provide local knowledge and adaptations
– one in each region – many distributed
User Support Centre (GGUS):
– at FZK – manages the problem tracking system – provides a single point of contact (service desk)
– not foreseen as such in the TA, but the need is clear

Grid Operations
The grid is flat, but there is a hierarchy of responsibility:
– essential to scale the operation
CICs act as a single operations centre:
– operational oversight (grid operator) responsibility rotates weekly between CICs
– problems are reported to the ROC/RC
– the ROC is responsible for ensuring the problem is resolved
– the ROC oversees the regional RCs
ROCs are responsible for organising operations in a region:
– coordinate deployment of middleware, etc.
CERN coordinates sites not associated with a ROC.
(Diagram: OMC at the top, CICs below it, ROCs below the CICs, RCs below the ROCs. RC – Resource Centre; ROC – Regional Operations Centre; CIC – Core Infrastructure Centre.)
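The weekly rotation of operator duty between CICs can be sketched as a tiny scheduling routine. The list of CICs is taken from the slide; the scheduling logic itself is only an illustrative model, not an EGEE tool:

```python
# Illustrative model of the weekly grid-operator rotation between the CICs
# named on the slide. The modulo scheduling is a hypothetical sketch.
CICS = ["UK/I", "France", "Italy", "CERN", "Russia"]

def operator_on_duty(week_number: int) -> str:
    """Return the CIC holding grid-operator duty in the given week."""
    return CICS[week_number % len(CICS)]

# Duty rotates weekly and wraps around the list of CICs.
print(operator_on_duty(0))  # UK/I
print(operator_on_duty(5))  # UK/I again after a full cycle
```

The point of the rotation is that at any moment exactly one CIC acts as the single operations centre, while the others remain available as backups.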

Grid Monitoring
Operation of the production service: real-time display of grid operations, accounting information.
Selection of monitoring tools:
– GIIS Monitor + Monitor Graphs
– Site Functional Tests
– GOC Data Base
– Scheduled Downtimes
– Live Job Monitor
– GridIce – VO + fabric view
– Certificate Lifetime Monitor

Operations focus
Main focus of activities now:
– Improving operational reliability and application efficiency:
 automating monitoring → alarms
 ensuring a 24x7 service
 removing sites that fail functional tests
 operations interoperability with OSG and others
– Improving user support:
 demonstrate to users a reliable and trusted support infrastructure
– Deployment of gLite components:
 testing, certification → pre-production service
 migration planning and deployment – while maintaining/growing interoperability
 further developments now have to be driven by experience in real use
(Diagram: LCG-2 (= EGEE-0) evolves from prototyping to product; LCG-3 (= EGEE-x?) is the future product.)
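The "removing sites that fail functional tests" policy above can be sketched as a simple check over recent test results. Site names, the result format and the three-strikes threshold are invented for illustration; the real Site Functional Tests pipeline worked differently in detail:

```python
# Hypothetical sketch of the site-functional-test policy: a site whose last
# N functional tests all failed is flagged for removal from production.
def sites_to_remove(test_history, max_failures=3):
    """test_history maps site name -> list of booleans (True = test passed)."""
    flagged = []
    for site, results in test_history.items():
        recent = results[-max_failures:]
        if len(recent) == max_failures and not any(recent):
            flagged.append(site)
    return sorted(flagged)

history = {
    "site-a.example.org": [True, True, False],
    "site-b.example.org": [False, False, False],
    "site-c.example.org": [True, False, False, False],
}
print(sites_to_remove(history))  # ['site-b.example.org', 'site-c.example.org']
```

A threshold over recent results rather than a single failure avoids evicting a site for one transient network glitch, which matches the reliability concerns discussed on the HEP slide below.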


gLite middleware
The 1st release of gLite (v1.0) was made at the end of March 2005:
– lightweight services
– interoperability and co-existence with the deployed infrastructure
– performance and fault tolerance
– portable
– service-oriented approach
– site autonomy
– open source licence

gLite Release 1.0
Job management services:
– Workload Management
– Computing Element
– Logging and Bookkeeping
Data management services:
– File and Replica Catalog
– File Transfer and Placement Services
– gLite I/O
Information services:
– R-GMA
– Service Discovery
Security
Deployment modules:
– distribution available as RPMs, binary tarballs, source tarballs and an APT cache
(Architecture diagram: Access Services – Grid Access Service, API; Job Management Services – Job Provenance, Package Manager, Computing Element, Workload Management; Data Services – Metadata Catalog, Storage Element, Data Management, File & Replica Catalog; Security Services – Authorization, Authentication, Auditing; Information & Monitoring Services – Information & Monitoring, Application Monitoring; Site Proxy; Accounting. Contributions from JRA3, UK, CERN, IT/CZ.)
Serious testing and certification is just starting.

gLite Services for Release 1.0 – Components Summary and Origin
Computing Element:
– Gatekeeper, WSS (Globus)
– Condor-C (Condor)
– CE Monitor (EGEE)
– local batch system (PBS, LSF, Condor)
Workload Management:
– WMS (EDG)
– Logging and Bookkeeping (EDG)
– Condor-C (Condor)
Storage Element:
– File Transfer/Placement (EGEE)
– gLite I/O (AliEn)
– GridFTP (Globus)
– SRM: Castor (CERN), dCache (FNAL, DESY), other SRMs
Catalog:
– File and Replica Catalog (EGEE)
– Metadata Catalog (EGEE)
Information and Monitoring:
– R-GMA (EDG)
– Service Discovery (EGEE)
Security:
– VOMS (DataTAG, EDG)
– GSI (Globus)
– authentication for C and Java based (web) services (EDG)
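Jobs on this stack are described in the Job Description Language (JDL) inherited from EDG and passed to the WMS. A minimal example is sketched below; the script and sandbox file names are invented for illustration, and real submissions would typically add VO and resource requirements:

```
Executable    = "/bin/sh";
Arguments     = "analysis.sh";
StdOutput     = "std.out";
StdError      = "std.err";
InputSandbox  = {"analysis.sh"};
OutputSandbox = {"std.out", "std.err"};
```

The WMS matches such a description against the information system and dispatches the job to a suitable Computing Element, returning the sandbox files to the user on completion.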

Main Differences to LCG-2
– the Workload Management System works in both push and pull mode
– the Computing Element is moving towards a VO-based scheduler guarding the jobs of the VO (reduces load on GRAM)
– re-factored file and replica catalogs
– secure catalogs (based on user DN; VOMS certificates being integrated)
– scheduled data transfers
– SRM-based storage
– information services: R-GMA with improved API, Service Discovery and registry replication
– move towards web services


Outreach & Training
Public and technical websites are constantly evolving to expand the information available and keep it up to date.
Project conferences:
– 2 conferences organised: Cork, Den Haag
– Athens: 3rd project conference, April 2005
– Pisa: 4th project conference, October 2005
More than 70 training events (including the GGF grid school) across many countries:
– ~1000 people trained: induction; application developer; advanced; retreats
– material archive with more than 100 presentations
Strong links with the GILDA testbed and the GENIUS portal developed in EU DataGrid.

Deployment of applications
Pilot applications:
– High Energy Physics
– biomedical applications
Generic applications – deployment under way:
– Computational Chemistry
– Earth science research
– EGEODE: first industrial application
– Astrophysics
With interest from:
– hydrology, seismology, grid search engines, stock market simulators, digital video, etc.
– industry (provider, user, supplier)
Many users:
– broad range of needs
– different communities with different backgrounds and internal organization

High Energy Physics
Very experienced and large international user community:
– involvement in many projects worldwide, and users of several grids (e.g. all LHC experiments use multiple grids at the same time for their data challenges)
– LHC experiments; ZEUS, D0, CDF, H1, BaBar
Production infrastructure (LCG/EGEE):
– intensive usage during the 2004 data challenges
– LHCb – 3500 concurrent jobs for long periods
– many issues of functionality and performance were exposed
– the data challenges were also the first real use of LCG-2 – only limited testing had been done in advance
– the major issue was reliability – badly configured and unstable sites
– nevertheless significant work was done:
 >1 M SI2K-years of CPU time (~1000 CPU-years)
 400 TB of data generated, moved and stored
 simultaneous jobs (~4 times CERN grid capacity)
ARDA role in application development and middleware testing:
– helping the evolution of the experiments' specific middleware towards analysis usage
 large effort on the 4 LHC experiments' prototypes
 the CMS prototype was migrated to gLite version 1 and exposed to several users
– early feedback on the utilisation of the gLite prototype right from the start of EGEE
– contribution to the common testing effort together with JRA1, SA1 and NA4-testing
Improved reliability has been achieved by selecting well-maintained sites; efficiencies of better than 90% have been possible (D0, CMS, ATLAS, in well controlled conditions). This remains the main area of focus for improvement – due in large part to the number of sites in the infrastructure.
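The CPU-time figure above can be cross-checked with a one-line conversion. The ~1 kSI2K-per-CPU rating used here is an assumption, roughly typical of the Pentium 4-class worker nodes of the period, not a number from the slide:

```python
# Sanity check of ">1 M SI2K-years of CPU time (~1000 CPU-years)".
# Assumption: one 2004-era worker-node CPU delivers roughly 1000 SI2K.
total_si2k_years = 1_000_000   # >1 M SI2K-years, from the slide
si2k_per_cpu = 1000            # assumed SPECint2000 rating of one CPU

cpu_years = total_si2k_years / si2k_per_cpu
print(cpu_years)  # 1000.0, consistent with the ~1000 CPU-years quoted
```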

Recent ATLAS work
– ATLAS jobs in EGEE/LCG-2 in 2005
– in the latest period, up to 8K jobs/day
– used a combination of RB and Condor-G submissions
– ~10,000 concurrent jobs in the system
(Plot: number of jobs per day.)

ZEUS on LCG-2
(Plot of ZEUS jobs on LCG-2.)

LCG Deployment Schedule
– LHC starts in 2007
– ramp-up with a series of service challenges to ensure key services and infrastructure are in place
– extremely aggressive timescale

Introduction: The MAGIC Telescope
– ground-based Air Cherenkov Telescope; gamma rays: 30 GeV – TeV
– La Palma, Canary Islands (28° North, 18° West)
– 17 m diameter; in operation since autumn 2003 (still in commissioning)
Collaborators: IFAE Barcelona, UAB Barcelona, Humboldt U. Berlin, UC Davis, U. Lodz, UC Madrid, MPI München, INFN / U. Padova, U. Potchefstroom, INFN / U. Siena, Tuorla Observatory, INFN / U. Udine, U. Würzburg, Yerevan Physics Inst., ETH Zürich
Physics goals:
– origin of VHE gamma rays
– active galactic nuclei
– supernova remnants
– unidentified EGRET sources
– gamma-ray bursts

Introduction – ground-based γ-ray astronomy
(Diagram: a gamma ray entering the atmosphere initiates a particle shower at ~10 km altitude; the Cherenkov light forms a ~1° cone illuminating a pool of ~120 m on the ground; for comparison, the satellite detector GLAST has an area of ~1 m².)
The image of the particle shower in the telescope camera is used to:
– reconstruct arrival direction and energy
– reject the hadron background

MAGIC – Hadron Rejection
Based on extensive Monte Carlo simulation:
– air-shower simulation program CORSIKA
– simulation of the hadronic background is very CPU-consuming:
 to simulate the background of one night, 70 CPUs (P4, 2 GHz) need to run for days
 to simulate the gamma events of one night for a Crab-like source takes 288 days
– at higher energies (>70 GeV) observations are already possible with the On-Off method (this reduces the on-time by a factor of two)
– lowering the threshold of the MAGIC telescope requires new methods based on Monte Carlo simulations

Experiences
Data challenge:
– Grid-1: 12M hadron events; jobs needed; started March 2005; up to now ~4000 jobs
– first tests with manual GUI submission
Reasons for failure:
– network problems
– RB problems
– queue problems
A job is successful when its output file is registered at PIC.
Diagnostics: no tools found; complex and time-consuming → use the metadata base, log the failure, resubmit and don't care.
170/3780 jobs failed → 4.5% failure rate.
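The quoted failure rate follows directly from the job counts on the slide:

```python
# Failure rate for the MAGIC data challenge: 170 failed jobs out of 3780.
failed, total = 170, 3780
rate = 100 * failed / total
print(f"{rate:.1f}% failure")  # 4.5% failure
```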

Biomed Applications
A loosely coupled community that had to go the long way of getting up to speed:
– VO creation and core services installation
– setting up a task force of experts
– recently joined user support at the application level
Applications:
– see list and descriptions on the web site
– 12 applications running today
New applications emerging:
– medical imaging, bioinformatics, phylogenetics, molecule structures and drug discovery…
Grown to significant infrastructure usage:
– 29k CPU-hours and 24k jobs reported in January

Bioinformatics
Grid Protein Sequence Analysis:
– NPSA is a web portal offering protein databases and sequence-analysis algorithms to bioinformaticians (3000 hits per day); its gridified version offers increased computing power
– need for large databases and a large number of short jobs
xmipp_MLrefine:
– 3D structure analysis of macromolecules from (very noisy) electron microscopy images
– maximum-likelihood approach for finding the optimal model
– very compute-intensive
Drug discovery:
– health-related area with high-performance computation needs
– an application currently being ported in Germany (Fraunhofer Institute)

Medical imaging
GATE:
– radiotherapy planning
– improvement of precision by Monte Carlo simulation
– processing of DICOM medical images
– objective: very short computation time, compatible with clinical practice
– status: development and performance testing
CDSS:
– Clinical Decision Support System
– assembling knowledge databases
– spreading the use of image classification engines
– objective: access to knowledge databases from hospitals
– status: from development to deployment; some medical end users

Medical imaging
SiMRI3D:
– 3D Magnetic Resonance Image simulator
– MRI physics simulation, parallel implementation
– very compute-intensive
– objective: offering an image-simulator service to the research community
– status: parallelized and now running on LCG-2 resources
gPTM3D:
– interactive tool for medical image segmentation and analysis
– a non-gridified version is distributed in several hospitals
– need for very fast scheduling of interactive tasks
– objective: shorten computation time using the grid
– status: development of the gridified version being finalized

Evolution of Biomedical Applications
Growing interest of the biomedical community:
– partners involved proposing new applications
– new application proposals (in various health-related areas)
– enlargement of the biomedical community (drug discovery)
Growing scale of the applications:
– progressive migration from prototypes to pre-production services for some applications
– increase in scale (volume of data and number of CPU hours)

EGEE Geographical Extensions
EGEE is a truly international undertaking.
Collaborations with other existing European projects, in particular:
– GÉANT, DEISA, SEE-GRID
Relations to other projects/proposals:
– OSG: Open Science Grid (USA)
– Asia: Korea, Taiwan, EU-ChinaGrid
– BalticGrid: Lithuania, Latvia, Estonia
– EELA: Latin America
– EUMedGrid: Mediterranean area
– …
Expansion of the EGEE infrastructure in these regions is a key element for the future of the project and for international science.

Summary
– EGEE is a first attempt to build a worldwide grid infrastructure for data-intensive applications from many scientific domains
– a large-scale production grid service is already deployed and being used for HEP and biomedical applications, with new applications being ported
– resources and user groups are expanding
– a process is in place for migrating new applications to the EGEE infrastructure
– a training programme has started, with many events already held
– the "next generation" middleware (gLite) is being tested
– the first project review by the EU was successfully passed in February 2005
– plans for a follow-on project are being prepared