Technical Status
Steven Newhouse
Technical Director
CERN
EGEE-III First Review, June, 2009
EGEE-III INFSO-RI
Enabling Grids for E-sciencE
EGEE and gLite are registered trademarks

Project Overview
  users
  139,000 LCPUs (cores)
  25 PB disk
  39 PB tape
  12 million jobs/month (+45% in a year)
  268 sites (+5% in a year)
  48 countries (+10% in a year)
  162 VOs (+29% in a year)
Technical Status - Steven Newhouse - EGEE-III First Review June

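The growth figures on the overview slide can be turned into rough values for a year earlier. The sketch below is only an illustration: it assumes each percentage is growth relative to the value one year before, and the function name `value_one_year_earlier` is hypothetical. The results are estimates derived from the rounded slide figures, not reported numbers.

```python
def value_one_year_earlier(current, growth_percent):
    """Estimate last year's value, assuming current = previous * (1 + growth/100)."""
    return current / (1 + growth_percent / 100)


# Current figures and yearly growth as quoted on the slide.
overview = {
    "jobs/month": (12_000_000, 45),
    "sites": (268, 5),
    "countries": (48, 10),
    "VOs": (162, 29),
}

for metric, (current, growth) in overview.items():
    estimate = value_one_year_earlier(current, growth)
    print(f"{metric}: about {estimate:,.0f} a year earlier")
```

Because the slide values are rounded, the estimates should be read as orders of magnitude only.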
So what does EGEE actually do?
  Builds and supports user communities on the grid
  Integrates and provides a worldwide infrastructure
  Collaboration and Technical Leadership worldwide
  Activities shown: Training; Application Porting; User Support; Software Development; Integration, Test & Certification; Deployment; Operations; Collaborating Projects; Standards; Policy

Supporting New Communities
  Winter and Summer Grid Schools for the community
    –gLite, UNICORE, Globus, GridSAM, Condor, OGSA-DAI,...
  Regionally Driven Training Events
    –101 training events at 56 locations in 29 countries
    –1424 unique participants attending 4431 training days
    –High satisfaction: 5.1/6.0
  Application Porting Support
    –15 applications ported, 10 currently underway
  Recommended External Software for EGEE CommuniTies
    –Public criteria and assessment process for entry into RESPECT
      Software that builds on gLite and is supported by the community
    –11 programs covering: Simplified access, Workload management, New Resources, Infrastructure Services

Supporting Science
  Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
  Changes in Resource Utilisation
    Number of jobs x2 over the period
    Proportion of HEP usage ~77%
  End-user activity
    13,000 end-users in 112 VOs
    +44% users in a year
    23 core VOs
      A core VO has >10% of usage within its science cluster

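The core VO rule (more than 10% of usage within its science cluster) can be sketched in a few lines of Python. This is a minimal illustration, not the project's accounting code: the record layout, the `core_vos` function, the VO names and the usage figures are all hypothetical, and "usage" is assumed to be a single number per VO, such as CPU hours.

```python
from collections import defaultdict


def core_vos(usage_records, threshold=0.10):
    """Return VOs whose share of their science cluster's usage exceeds threshold.

    usage_records: iterable of (vo_name, science_cluster, usage) tuples.
    """
    records = list(usage_records)

    # Total usage per science cluster.
    cluster_totals = defaultdict(float)
    for _, cluster, usage in records:
        cluster_totals[cluster] += usage

    # A VO is "core" if its share of the cluster total is strictly above 10%.
    return sorted(
        vo
        for vo, cluster, usage in records
        if cluster_totals[cluster] > 0
        and usage / cluster_totals[cluster] > threshold
    )


# Hypothetical usage figures, for illustration only.
example = [
    ("vo_a", "Life Sciences", 900),
    ("vo_b", "Life Sciences", 95),
    ("vo_c", "Life Sciences", 5),
    ("vo_d", "Fusion", 50),
    ("vo_e", "Fusion", 50),
]
print(core_vos(example))
```

With these made-up figures, `vo_b` holds 9.5% of the Life Sciences usage and so falls just below the threshold, while both Fusion VOs qualify.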
Connecting Users to Resources
  Applications
  Middleware
  Physical Resources: Computers, Disks, Tape

gLite Middleware
  Legend: EGEE Maintained Components, External Components
  User Access: User Interface
  General Services: LHC File Catalogue, Hydra, Workload Management Service, File Transfer Service, Logging & Bookkeeping Service, AMGA
  Security Services: Virtual Organisation Membership Service, Authz. Service, SCAS, Proxy Server, LCAS & LCMAPS
  Information Services: BDII, MON
  Storage Element: Disk Pool Manager, dCache
  Compute Element: CREAM, LCG-CE, gLExec, BLAH, Worker Node
  Physical Resources

New gLite Releases
  Incremental delivery of functionality
    –gLite 3.1: 22 updates across all node types
    –gLite 3.2: Releases for the Worker Node and User Interface
    –Ability to roll back when an issue is found with a release
  Focus on maintenance to improve reliability & stability
    –Improvement of multi-platform support
    –Incremental introduction of IPv6 support
  Introduction of CREAM to replace the LCG-CE
    –Provides ‘next generation’ CE with increased capability
  Implementation of an Authorization Service (Argus)
    –Consistent framework for site, region, VO & grid authorization
    –Initial rollout planned during EGEE-III for site level functionality

Infrastructure Operations
  Improved Reliability and Availability
    –Introduction of local site monitoring
    –Now a larger infrastructure with fewer staff
    –Figures reflect software & hardware issues
      Weighted by site size within a region
      Summed across all regions
      Fire at ASGC took whole data centre out!
  Deployment of seed resources
    –Bootstrapping new user communities
    –Distributed across 4 sites
      257 cores and 27 TB of disk space

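One way to read "weighted by site size within a region, summed across all regions" is as a size-weighted mean. The sketch below is an assumption about that calculation, not the project's actual method: site size is taken to be a core count, availability a fraction between 0 and 1, and the function names, region names and figures are hypothetical.

```python
def regional_availability(sites):
    """Size-weighted mean availability for one region.

    sites: list of (size_in_cores, availability_fraction) tuples.
    """
    total_size = sum(size for size, _ in sites)
    if total_size == 0:
        raise ValueError("region has no capacity")
    return sum(size * avail for size, avail in sites) / total_size


def overall_availability(regions):
    """Sum the size-weighted contributions of all regions, then normalise.

    regions: dict mapping region name to a list of (size, availability).
    """
    weighted_sum = 0.0
    total_size = 0
    for sites in regions.values():
        for size, avail in sites:
            weighted_sum += size * avail
            total_size += size
    if total_size == 0:
        raise ValueError("no capacity in any region")
    return weighted_sum / total_size


# Hypothetical figures, for illustration only.
regions = {
    "region_a": [(100, 0.5), (300, 0.75)],
    "region_b": [(600, 1.0)],
}
print(regional_availability(regions["region_a"]))
print(overall_availability(regions))
```

Under this reading, an outage at one large site lowers the figure much more than an outage at a small one, which fits the slide's remark about a whole data centre being lost.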
Network Support
  End to end support for networking issues
  Integrating network monitoring tools into support portal
  Progress continues on porting to IPv6 through testbed
  Design and implementation of the LHC Optical Private Network operational model
  Diagram: ENOC ensuring E2E connectivity for Grid sites on the whole path
    Grid site 1
    RC 1 (operated by NOC of RC1)
    NREN A (operated by NOC of NREN A)
    GÉANT2 (operated by DANTE)
    NREN B (operated by NOC of NREN B)
    RC 2 (operated by NOC of RC2)
    Grid site 2

Interoperation & Interoperability
  End-user driven relationships
    –Federation of Open Science Grid & Nordic Data Grid Facility
    –Workload Management System: Submit to ARC in NDGF
      Actively used by the CMS experiment
  Production Grid Infrastructures
    –Build on experience of ARC, UNICORE and gLite
    –Work within the Open Grid Forum for next generation specification for job submission
  Nationally
    –Interaction with collaborating e-Infrastructures
    –Interaction with national software deployments

Community Engagement
  EGEE’08, Istanbul
    –529 participants from 47 countries
  EGEE 4th User Forum, Catania
    –Joint event with OGF 25 & OGF-Europe
    –18 demos
    –37 posters
    –101 oral presentations

Collaboration
  Collaborating Projects
    –17 active projects with 6 completed during the first period
    –15 letters of support have been signed
    –Memorandum of Understanding to formalise collaboration
      Infrastructure: EDGeS, BalticGrid-II, SEE-GRID-SCI, Kazakh-British Technical University (Kazakhstan Grid)
      General: OGF-Europe & GENESI-DR
      Drafts: EELA-2 & RESERVOIR
  Bridging between e-Infrastructures
    –Application level use of EGEE & DEISA resources
    –Demonstrated with the EUFORIA project using Kepler (workflow)
    –9 applications ported by Fusion cluster to EGEE

Policy
  European e-Infrastructure Forum (EEF)
    –Purpose: “… discussion of principles and practices to create synergies for distributed Infrastructures”
    –Membership: EGEE, DEISA, GEANT, PRACE, EGI, Terena
    –Meeting quarterly for 2-3 hours
  Infrastructure Policy Group (IPG)
    –Purpose: “meeting of the major worldwide e-infrastructure projects”
    –Membership: EGEE, DEISA, TeraGrid, OSG, NAREGI
    –Meeting at OGF for 2-3 hours
    –Recent topics: Alignment of security, accounting & resource allocation policies
    –Further details:

Standards: Open Grid Forum
  Organizational roles
    –Strategic Management: Board
    –Area Directors: Applications, Data & Security
  Key Technical Leadership
    –GLUE Working Group
      GLUE 2.0 specification complete
    –Production Grid Infrastructure Working Group
      Evolution of BES 1.0 and JSDL 1.0 specifications
    –Grid Storage Resource Management Working Group
      Revisions of the SRM specification to track production usage
  Strong relationship with OGF-Europe
    –EGEE UF4, CloudScape Workshop, Business Outreach

Grids and Clouds
  Analysis of the Cloud in the context of EGEE Grids
    –“An EGEE Comparative study: Grids and Clouds – evolution or revolution?”
  Long term can envisage several scenarios:
    –Provision of VO specific virtualised Worker Nodes
    –Virtualise Worker Nodes for scale out to the cloud
    –Completely virtual EGEE site
  RESERVOIR collaboration to explore issues
    –Draft MoU
  Diagram labels: National Infrastructure Services, Site Services, Worker Nodes, Virtual Machine Infrastructure, Public/Private Cloud Provider

Technical Coordination
  Technical Management Board meets regularly
    –Representation from all the stakeholders within the project
    –Working groups
      MPI: Investigates deployment issues relating to MPI uptake
      CREAM: Development of certification and deployment plans
  Security Coordination Group
    –Integrates various security functions
    –OSCT: Security service challenge
    –JSPG: Policy for VOs, Portals,...
    –MWSG: Meetings with UNICORE, OSG

European Grid Initiative
  EGEE has been engaging with EGI Design Study
    –DNA1.4 collected responses from EGEE and related projects
    –All Activity meeting in January 2009 highlighted several issues
  Most of the engagement to date has been managerial
    –Project office and activity/task leaders
    –Experts participating in EGI_DS Task Forces
    –Migrating to the EGI model is the main objective for Year II
      Specific presentation on Thursday morning
    –Technical understanding of EGI model continues
      What does it mean for middleware integration & deployment?
      How does the operational model need to change with ~40 NGIs?
  There are many open questions... some of them critical!

Risks Avoided
  Partner(s) fail to complete their task
  Mis-alignment of strategy and implementation with collaborating infrastructure projects
  Dissemination of incorrect information
  Failure to attract suitable trainers
  Resource congestion due to LHC startup
  Inadequate support for third party components
  Grid operations remains a labour intensive task
  Malicious attacks on the grid infrastructure or tools
  Unannounced network availability
  Slow standardisation and industry uptake
  Delays in the development roadmap
  * From EGEE-III DoW, Section 3.2.3, Table 11-13, Pages 216+

Risks Encountered
  Failure to provide required functionality to application community
    –MPI support continues to be an issue and is being followed up through the TMB
  Low business uptake of gLite
    –Standalone adoption by business is slow, but plenty of engagement by companies in the support of research projects
  Failure to implement EGI transition while maintaining production service
    –EGI structures represent mostly an evolution from EGEE
    –EGI risks and timeline addressed in Year II plans presentation

Summary
  User community and usage continue to grow
    –Diversity of supported application communities increases
    –Training and technical support for new and existing users
  Incremental middleware releases through gLite
    –Primary focus in EGEE-III is on support & maintenance
    –Stabilisation provides a platform for other groups
  Delivery of leading world class e-infrastructure
    –Incremental growth of the physical infrastructure
    –Availability and reliability continue to improve
  Leadership & Collaboration in Europe and Worldwide
    –Technical within the OGF and collaborating projects
    –Policy interactions through EEF, IPG, and other bodies
  Transition to EGI provides many challenges for Year II