Published by Dale Kennedy. Modified over 9 years ago.
1 EGI-InSPIRE RI-261323
SA1 and JRA1: Operations and Operational Tools
D. Cesini, JRA1 Activity Manager - INFN
T. Ferrari, Chief Operations Officer
SA1 & JRA1 - EGI-InSPIRE Review 2011, 30/05/2011
2 Outline
PART I: Objectives, tasks, effort, partners
PART II: Resource Infrastructure
PART III: Service infrastructure: status and achievements
PART IV: Issues, use of resources, impact and plans
3 SA1 Overview
43 countries, 45 beneficiaries, 5238 PMs, 109.14 FTEs
Total PMs per beneficiary:
WP4-E: EGI.eu 36, CERN 59, CNRS 12, CSC 23, CSIC 29, CYFRONET 23, GRNET 70, INFN 48, KIT-G 70, LIP 17, NCF 40, SRCE 11, STFC 75, VR-SNIC 23
WP4-N: ARNES 94, CESNET 128, CNRS 316, CSC 67, CSIC 372, CYFRONET 156.1, E-ARENA 71, GRENA 19, GRNET 180, ICI 58, ICT-BAS 124, IIAP NAS RA 19, IMCS-UL 52, INFN 378, IPB 118, IUCC 25, KIT-G 278, LIP 107, MTA KFKI 118, NCF 159, RENAM 20, SIGMA 82, SRCE 72, STFC 277, SWITCH 86, TCD 94, TUBITAK 130, UCPH 81, UCY 48, UI SAV 96, UIIP NASB 30, UKIM 71, UOBL ETF 75, UOM 71, UPT 32, VR-SNIC 84, VU 22, ASGC 193, ASTI 156, KEK 1, KISTI 92, UNIMELB 36, NUS 14
Countries: France, Finland, Spain, Poland, Greece, Italy, Germany, Portugal, Netherlands, Croatia, UK, Sweden, Slovenia, Czech Republic, Russia, Georgia, Romania, Bulgaria, Armenia, Latvia, Serbia, Israel, Hungary, Moldova, Norway, Switzerland, Ireland, Turkey, Denmark, Cyprus, Slovakia, Belarus, FYR Macedonia, Bosnia & Herzegovina, Montenegro, Albania, Lithuania, Taiwan, Philippines, Japan, Korea, Australia, Singapore
4 Objectives
Operate a secure, reliable European-wide federated production grid infrastructure that is integrated and interoperates with other grids worldwide.
Objective | Tasks | Task objective
O1 | TSA1.2 | Maintain a secure infrastructure
O2 | TSA1.3 | Validate new technology releases (tools and middleware)
O3 | TSA1.7 | Support end-users and Resource Centre administrators
O4 | TSA1.8 | Service Level Management, grid oversight, documentation and procedures
O5 | TSA1.4, TSA1.5, TSA1.6 | Operate tools, the accounting infrastructure and the EGI Helpdesk
O6 | JRA1.2, JRA1.3, JRA1.4, JRA1.5 | Evolve the operational tools used by the production infrastructure
- Maintenance, development and support of national deployment
- Accounting for the use of new resources (desktop, virtualization, storage, data, application and billing)
5 SA1 tasks and resource distribution
Task | Name | Leader/Partner | Task effort distribution
TSA1.1 | Activity Management | T. Ferrari/EGI.eu | 0.70%
TSA1.2 | Secure Infrastructure | M. Ma/STFC | 8.60%
TSA1.3 | Service Deployment Validation | M. David/LIP | 11.00%
TSA1.4 | Infrastructure for Grid Management | E. Imamagic/SRCE | 20.66%
TSA1.5 | Accounting | J. Gordon/STFC | 5.81%
TSA1.6 | Helpdesk Infrastructure | T. Antoni/KIT | 8.76%
TSA1.7 | Support Teams | R. Trompert/SARA | 28.16%
TSA1.8 | Providing a Reliable Grid Infrastructure and core services | C. Kanellopoulos/AUTH | 16.31%
6 JRA1 Overview
7 countries, 8 beneficiaries, 315 PMs, 8.67 FTE
WP | Task | Total PMs per beneficiary
WP7-E | TJRA1.1 | INFN 24
WP7-E | TJRA1.2 | KIT-G 47, CSIC 12, CNRS 12, GRNET 12, SRCE 12, STFC 24, CERN 12
WP7-G | TJRA1.3 | CSIC 3, CNRS 3, SRCE 3, STFC 3, CERN 6
WP7-G | TJRA1.4 | KIT-G 18, CSIC 18, INFN 26, STFC 27
WP7-G | TJRA1.5 | CNRS 53
Partners: Italy, Germany, Spain, Greece, Croatia, CERN, France, UK
7 JRA1 tasks and resource distribution
Task | Name | Leader/Partner | Task effort distribution
TJRA1.1 | Activity Management | D. Cesini/INFN | 7.6%
TJRA1.2 | Maintenance and development of the deployed operational tools | T. Antoni/KIT | 41.6%
TJRA1.3 | Supporting National Deployment models | P. Solagna/EGI.eu | 5.7% (PY1 only)
TJRA1.4 | Accounting for usage of different resource types (Cloud, HPC, Desktop Grid, Storage/Data Usage, Application Usage, Billing system) | J. Gordon/STFC | 28.5% (PY2-PY4 only)
TJRA1.5 | Integrated Operations Portal (Service Oriented model, Harmonization with GOCDB, Porting to Symfony, New DCI integration) | C. L’Orphelin/CNRS | 16.6% (PY1-PY3 only)
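The slides give 315 PMs and 8.67 FTE for JRA1 without stating how the FTE figure was derived. The sketch below shows one reading that reproduces it: divide each task's person-months by the number of months the task is active, then sum. The task durations are assumptions taken from the "PY1 only", "PY2-PY4 only" and "PY1-PY3 only" notes, with 12 months per project year and 48 months for the tasks with no note.

```python
# Person-months (PM) per task, summed from the JRA1 beneficiary table.
TASK_PMS = {
    "TJRA1.1": 24,
    "TJRA1.2": 47 + 12 + 12 + 12 + 12 + 24 + 12,
    "TJRA1.3": 3 + 3 + 3 + 3 + 6,
    "TJRA1.4": 18 + 18 + 26 + 27,
    "TJRA1.5": 53,
}

# Months in which each task is active (assumption: 12 months per project year).
TASK_MONTHS = {
    "TJRA1.1": 48,  # assumed to run for the whole 4-year project
    "TJRA1.2": 48,  # assumed to run for the whole 4-year project
    "TJRA1.3": 12,  # PY1 only
    "TJRA1.4": 36,  # PY2-PY4 only
    "TJRA1.5": 36,  # PY1-PY3 only
}


def task_fte(task: str) -> float:
    """Full-time equivalents for one task: person-months per active month."""
    return TASK_PMS[task] / TASK_MONTHS[task]


total_pms = sum(TASK_PMS.values())
total_fte = sum(task_fte(task) for task in TASK_PMS)

print(total_pms)            # 315
print(round(total_fte, 2))  # 8.67
```

Dividing the 315 PMs by a flat 48 months would give about 6.6 FTE instead, which is why the per-task durations matter under this reading.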
8 PART II
PART I: Objectives, tasks, effort, partners
PART II: Resource Infrastructure
- Architecture
- Resource capacity and utilization
PART III: Service infrastructure: status and achievements
PART IV: Issues, use of resources, impact and plans
9 EGI Resource Infrastructure
Layer I. Resource Centre (RC)
- A localised or geographically distributed administration domain, where EGI resources (CPUs, data storage, instruments and digital libraries) are managed and operated to be accessed by end-users
Layer II. Resource Infrastructure
- The federation of Resource Centres, which are interconnected by the National Research and Education Networks (NRENs) and GÉANT
- Integrated infrastructures: operated by a non-EGI-InSPIRE partner but relying on EGI operational services, e.g. Latin America and the Caribbean
- Peer infrastructures: accessible to EGI users, but relying on their own operational services, e.g. Open Science Grid (USA)
- Resource infrastructure Provider (RP): the legal organisation responsible for any matter that concerns the respective Resource Infrastructure
- EGI Participants: National Grid Initiatives (NGIs), European Intergovernmental Research Organisations (EIROs)
Layer III. EGI Resource Infrastructure
[Diagram labels: Resource Centres, Resource Infrastructure, Network, Resource Provider (NGI/EIRO), Resource Provider (MoUs)]
10 Status and yearly increase
Resource Centres
- 338 (+6.8%)
- 96 supporting MPI (+31.5%)
- Europe, Asia Pacific, North and South America
Countries
- 51 (57 with integrated RPs) (+18.75%)
Capacity
- 240,000 CPU cores (339,000 with integrated and peer RPs) (+24.9%)
- 1.89 Million HEP-SPEC 06*
- 102 PB disk, 89 PB tape
* HEP-SPEC 06: computing benchmark based on SPEC CPU2006; 10 HEP-SPEC 06 = 4 kSI2k
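The footnote's conversion (10 HEP-SPEC 06 = 4 kSI2k) can be applied to the capacity figures on this slide. The snippet below is a small worked example; the function and variable names are illustrative, and the per-core figure is simply the quoted total divided by the quoted number of cores, not a number stated in the slides.

```python
# Conversion factor from the slide footnote: 10 HEP-SPEC 06 = 4 kSI2k.
HS06_PER_KSI2K = 10 / 4


def hs06_to_ksi2k(hs06: float) -> float:
    """Convert a HEP-SPEC 06 value to kSI2k."""
    return hs06 / HS06_PER_KSI2K


def ksi2k_to_hs06(ksi2k: float) -> float:
    """Convert a kSI2k value to HEP-SPEC 06."""
    return ksi2k * HS06_PER_KSI2K


total_hs06 = 1.89e6  # installed capacity quoted on the slide
cpu_cores = 240_000  # CPU cores quoted on the slide (project partners only)

print(hs06_to_ksi2k(total_hs06))  # 756000.0 kSI2k
print(total_hs06 / cpu_cores)     # 7.875 HEP-SPEC 06 per core on average
```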
11 From EGEE federations to NGIs
April 2010: 12 EGEE federated regional infrastructures
April 2011: 40 European NGIs and 1 EIRO (CERN)
Integrated resource infrastructures
- Asia Pacific federation
- Canada federation
- Latin American and Caribbean Grid Initiative
- Latin American federation
Transition completed in January 2011
12 Service/Resource Centre (RC) Availability and Reliability
Availability
- the percentage of time that the service/RC was up and running: (uptime / total time) x 100
- Minimum RC availability: 70%
Reliability
- the percentage of time that the service/RC was up and running, excluding periods of scheduled interventions: (uptime / (total time - scheduled time)) x 100
- Minimum RC reliability: 75%
Suspension policy
- RC availability < 50% for 3 consecutive months
- 6 RCs suspended
- Stricter policy from PY2: threshold raised from 50% to 70%
Monthly performance reports per RC
New ticket-based procedure for monitoring of underperforming RCs
Overall PY1 EGI availability: 92.73%
Overall PY1 EGI reliability: 93.85%
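The two definitions and the suspension rule on this slide translate directly into a few lines of code. This is a minimal sketch: the function names, the example month (720 hours, 648 up, 36 of scheduled downtime) and the sample availability series are made up for illustration, and the slides do not say how EGI actually implemented the calculation.

```python
def availability(uptime_h: float, total_h: float) -> float:
    """Availability in percent: (uptime / total time) x 100."""
    return 100.0 * uptime_h / total_h


def reliability(uptime_h: float, total_h: float, scheduled_h: float) -> float:
    """Reliability in percent: scheduled interventions are excluded from the total."""
    return 100.0 * uptime_h / (total_h - scheduled_h)


def should_suspend(monthly_availability, threshold=50.0, months=3):
    """True if availability stayed below `threshold` for `months` consecutive months."""
    run = 0
    for value in monthly_availability:
        run = run + 1 if value < threshold else 0
        if run >= months:
            return True
    return False


# Hypothetical month: 720 hours, 648 hours up, 36 hours of scheduled downtime.
print(availability(648, 720))                    # 90.0
print(round(reliability(648, 720, 36), 2))       # 94.74
print(should_suspend([62.0, 48.0, 45.0, 49.5]))  # True: three months in a row below 50%
print(should_suspend([48.0, 45.0, 71.0, 49.0]))  # False: the run is broken by 71%
```

Reliability is never lower than availability for the same period, because scheduled downtime only shrinks the denominator. The stricter PY2 policy corresponds to calling `should_suspend(..., threshold=70.0)`.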
13 Usage statistics
Metric | Unit | Per month | Per day (yearly increase)
Average number of jobs (all VOs) | number | 27.8 Million | 914,000 (+82%)
Average number of jobs (non-HEP VOs) | number | 2.8 Million (10% of total) | 100,000 (+47%)
CPU wall clock (all VOs) | hours | 74.8 Million | 2.5 Million
Normalized CPU wall clock (all VOs) | HEP-SPEC 06 hours | 563.2 Million | 18.5 Million
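The per-month and per-day columns of the usage table are related by simple arithmetic, which the following check reproduces. It assumes an average month of 365/12 days; the slides do not state which month length was used, and the variable names are illustrative.

```python
DAYS_PER_MONTH = 365 / 12  # assumption: average month length

# Per-month figures quoted in the usage table.
jobs_per_month_all = 27.8e6
jobs_per_month_non_hep = 2.8e6
wallclock_hours_per_month = 74.8e6
hs06_hours_per_month = 563.2e6

jobs_per_day = jobs_per_month_all / DAYS_PER_MONTH
non_hep_share = 100 * jobs_per_month_non_hep / jobs_per_month_all
hs06_per_core = hs06_hours_per_month / wallclock_hours_per_month

print(round(jobs_per_day, -3))   # 914000.0 jobs per day
print(round(non_hep_share, 1))   # 10.1 percent of all jobs
print(round(hs06_per_core, 2))   # 7.53 HEP-SPEC 06 per wall clock hour
```

The last ratio is the average normalisation factor implied by the two CPU rows: each wall clock hour counts as roughly 7.5 HEP-SPEC 06 hours.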
14 PART III
PART I: Objectives, tasks, effort, partners
PART II: Resource Infrastructure
PART III: Service infrastructure: status and achievements
- Infrastructure Services
- Technical Services
- Support Services
- Human Services
PART IV: Issues, use of resources, impact and plans
15 EGI Service Infrastructure
The service infrastructure enables secure, interoperable and reliable access to distributed resources. EGI services are provided locally by Operations Centres (Local Services) and globally (Global Services).
Service categories:
I. Infrastructure Services: tools
II. Technical Services: Grid middleware
III. Support Services: Helpdesk
IV. Human Services: Service Level Management, security, documentation, coordination
[Diagram labels: Resource Infrastructure and partners, Operations Centres, Local Services, Global Services]
16 I. Infrastructure Services
Infrastructure Service | SA1/JRA1 tasks | Central components | Local components
Broker network | TSA1.4, JRA1.2 | ActiveMQ brokers | -
Service Availability Monitoring | TSA1.4, JRA1.2, JRA1.3 | MyEGI portal, Aggregated Topology Provider, Metrics Description Database (rw), Metrics Results Store | MyEGI portal, Aggregated Topology Provider, Metrics Description Database (r), Metrics Results Store, Nagios Configuration Generator
Operations portal and dashboard | TSA1.4, JRA1.2, JRA1.5 | Central instance | Local instance
Accounting | TSA1.5, JRA1.4 | APEL central databases, portal | Sensors, national/regional repositories and portal; APEL local database under development
Helpdesk | TSA1.6, JRA1.2 | Global Grid User Support | Support Unit/Local helpdesk
Central Tools | TSA1.4, JRA1.2 | GOCDB | Under development
17 I. Infrastructure Services: Service Availability Monitoring (SAM) 1/2
SAM (CERN, AUTH, SRCE)
- monitoring framework for RCs and services
- one of the main data sources for the Operations Dashboard
- data source to create Availability/Reliability statistics
- composed of various components:
1. The test submission framework: based on the NAGIOS system, set up and customized by the NAGIOS Configurator (NCG)
2. The database components for storage of information about topology, metrics and results
3. A message bus to publish the monitoring results (load-balanced ActiveMQ broker network)
4. A visualization tool (GUI): MyEGI
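The four components listed for SAM form a pipeline: a test is run, its result is published on a message bus, stored, and then summarised for display. The toy sketch below only illustrates that data flow. It is not SAM code: an in-process `Queue` stands in for the ActiveMQ broker network, a Python list stands in for the results database, and the site, service and metric names are invented.

```python
from collections import Counter
from dataclasses import dataclass
from queue import Queue
from typing import Callable


@dataclass(frozen=True)
class MetricResult:
    site: str
    service: str
    metric: str
    status: str  # "OK" or "CRITICAL" in this sketch


def run_probe(site: str, service: str, metric: str,
              check: Callable[[], bool]) -> MetricResult:
    """Test submission: run one check and wrap its outcome."""
    return MetricResult(site, service, metric, "OK" if check() else "CRITICAL")


def drain(bus: Queue, store: list) -> None:
    """Consume every published result from the bus and persist it in the store."""
    while not bus.empty():
        store.append(bus.get())


def failures_per_site(store: list) -> Counter:
    """The kind of per-site summary a visualization portal could display."""
    return Counter(result.site for result in store if result.status != "OK")


bus = Queue()  # stand-in for the message bus
store = []     # stand-in for the results database

bus.put(run_probe("SITE-A", "CE", "job-submit", lambda: True))
bus.put(run_probe("SITE-B", "SRM", "put-file", lambda: False))
drain(bus, store)

print(failures_per_site(store))  # Counter({'SITE-B': 1})
```

Decoupling the probes from the store through a bus is what lets results be consumed by both central and local instances without the probes knowing about either.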
18 I. Infrastructure Services: Service Availability Monitoring (SAM) 2/2
Achievements
- 8 releases (new EGI release procedure as distributed software)
- MyEGI visualization portal in production (central and local instances)
  - New look and feel
  - MyEGI Web Service available
  - GridMap style plots added
- Database components re-engineering
  - ATP as new topology provider (replacing the old SAM database)
- Probes
  - Integration of ARC and GLOBUS5 probes (UNICORE in progress)
  - New probe for testing of the Certification Authority certificate distribution, with automatic discovery of the latest version
- Support for
  - robot certificates
  - monitoring of uncertified sites
  - authorization plugin (messaging infrastructure) for denial of all broker-to-broker communications (for accounting)
- Other: creation of the 2nd level support and handover of probes development to EMI and IGE (in progress)
19 I. Infrastructure Services: Operations Portal and Dashboard
Operations Portal (CNRS)
- Broadcast tool
- Operational Dashboard
- VO Identity Cards
Achievements
- 8 releases
- Package for local deployment released and updated (deployed in 4 NGIs)
- Porting to a new web framework almost completed
- Improvements to all the modules
  - VO ID Cards module implementation driven by NA3 requirements
- Integration with security dashboard (in progress)
- New "Central Operator on Duty" view released
20 I. Infrastructure Services: Accounting Repository and Portal
Accounting Repository (STFC)
- records usage of compute resources within the production infrastructure
- based on gLite-APEL
Accounting Portal (FCTSG)
- GUI for access to data from the Accounting Repository
Achievements
- New: complete integration of the APEL accounting system with the message broker network
- Porting of APEL tests to Nagios
- Design and implementation of a distributable Regional Accounting Server (in progress)
- Portal modified to support the new GOCDB4 PI and Ops Portal XML feeds
- NGI View added in the portal
- Decommissioning of central R-GMA accounting services (Feb 2011)
21 I. Infrastructure Services: EGI Helpdesk
EGI Helpdesk (KIT)
- distributed helpdesk with central coordination: Global Grid User Support (GGUS)
Achievements
- 9 releases
- Update of support teams and units
- Integration of the new NGIs (31 NGIs interfaced, 22 as support units, 6 with a local helpdesk)
- Definition and implementation of new workflows for technology support (1st line, 2nd line, and 3rd line provided by the Technology Providers: EMI, IGE etc.) and of the respective access privileges, in support of the software provisioning and bug reporting processes that involve EGI and its external technology providers
- Local view (xGUS) available and deployed by various NGIs/projects
22 I. Infrastructure Services: Central configuration repository
GOCDB (STFC)
EGI relies on a central configuration database to record static information contributed by the resource providers: the service instances that they are running, and the individual contact, role and status information for those responsible for particular services.
Achievements
- Decommissioning of GOCDB3, release and deployment of the new GOCDB4
- Prototype for local deployment available, but without a synchronization system
- Naming schema modification to integrate UNICORE services
- GLUE2.0 compatibility for service names (ongoing)
23 Project Management Tool: Metrics Portal
Metrics Portal (FCTSG)
- prototype tool being developed for manual/automatic collection of EGI-InSPIRE metrics from different information sources, to track project and partner performance
24 II. Technical Services
Purpose: to improve the usage of the production infrastructure and generally of the technology that makes up the production infrastructure
Technical Service | SA1 tasks | Central components | Local components
Requirements | TSA1.1 | Gathering and prioritization at the Operations Management Board | Gathering from Resource Centres and prioritization
Technology Staged Rollout | TSA1.3 | Coordination | Deployment validation by a restricted list of Resource Centres
Interoperability | TSA1.4 | Coordination | Collection of local requirements, GLOBUS and UNICORE integration task forces
Core services | TSA1.8 | Authentication services for infrastructure VOs (DTEAM), WMS and top-BDII for monitoring of uncertified sites, core services for small user communities, catch-all CA | File catalogues, workload managers, authentication and authorization services, data transfer schedulers
25 II. Technical Services: Requirements gathering
New process for requirements gathering (tools and deployed software), run every 3 months
Stages: requirements gathering, prioritisation, discussion with Technology Providers
[Diagram labels: Virtual Organisation, Virtual research communities, User Community Board, EGI Request Tracker, Resource Centres, Resource infrastructure Provider, Operations Tools Advisory Group, Operations Management Board, Technology Coordination Board and EGI-JRA1]
26 II. Technical Services: Staged Rollout
New software updates (grid middleware and tools) are deployed into the production infrastructure incrementally through a staged rollout to ensure that they are reliable in actual use, following successful verification of the software component against published criteria.
Early Adopters (EAs) are the production Resource Centres willing to deploy one or more new releases
- Automation of the process based on RT
- Process tested with the validation of gLite 3.1/3.2 releases and SAM
Achievements | Value
Max number of components tested/rejected in staged rollout per PQ | 29/3
Max number of staged rollout tests undertaken | 40 (PQ4)
Number of EA teams | 45
Middleware stacks/components | ARC, gLite, UNICORE, SAM, CA trust chain, GLOBUS (in progress)
27 II. Technical Services: Interoperability
Deployed middleware
- ARC (2.38%), gLite (97.62%), UNICORE
- More ARC and UNICORE installations expected in 2011
- Croatia, Germany, Poland, Romania, The Netherlands, UK integrating GLOBUS and/or UNICORE
GLOBUS and UNICORE task forces
Accomplishments
- ARC fully integrated into GOCDB, accounting and SAM
- Integration of UNICORE and GLOBUS in progress
Open Grid Forum: Production Grid Infrastructure WG, Grid Interoperability Now WG
Infrastructure Policy Group
28 II. Technical Services: Core services
Achievements
- Core Grid services for new/small VOs
- New infrastructure for the DTEAM VO membership management (troubleshooting)
- Membership management for OPS VO (monitoring)
- New infrastructure for monitoring of uncertified sites
- Catch-all CA
Local core Grid service instances
- 135 workload management services (WMS)
- 45 file catalogues (LFC)
- 118 information discovery services (top-BDII)
- 41 VO membership services (VOMS)
29 III. Support Services 1/2
Support Service | SA1 tasks | Central components | Local components
1st line support | TSA1.7 | Triage of tickets in GGUS | 1st line support for tickets opened locally
Grid oversight | TSA1.7 | Central operations support and escalation of tickets not managed locally | Local operations support
Network Support | TSA1.7 | Support to connectivity and performance problems (contact point to the NREN PERT teams) | -
2nd line support: Deployment Middleware Support Unit (SA2)
3rd line support: Technology providers
30 III. Support Services 2/2
Accomplishments
- New training and dissemination channels for new NGI support teams, monthly newsletter
- Most of the new NGIs successfully established their own local support structures
- Support for network performance issues in place (relying on tools for monitoring and troubleshooting), contact point with NREN PERT teams
But
- Grid oversight workload affected by new Operations Centres starting operations, now progressively reducing
- Support problems faced in some NGIs, now under resolution
Metric | Value
Average number of EGI tickets created per month | 965 tickets (~constant)
Average monthly response time | 2.7 operating hours
Average median of monthly solution time | 5.8 operating hours
31 IV. Human Services
Human Service | SA1 tasks | Central components | Local components
Service Level Management | TSA1.8 | Coordination | Monitoring of local performance and support to Resource Centre administrators
Operational security | TSA1.2 | Coordination | Incident response (EGI CSIRT), security monitoring, security drills, software vulnerability assessment
Documentation | TSA1.8 | Coordination | Contribution to manuals, procedures and best practices
Operations Management | TSA1.1 | Coordination of the Operations Management Board | Local operations management
32 IV. Human Services: Service Level Management
Purpose
- provide the metrics for conformance of the achieved level of service to the agreed one
- ensure that the agreed level of service is provided (monitoring and reporting on Service Levels)
Achievements
- Definition of the EGI Resource Centre Operational Level Agreement (OLA) [ITIL v3]: duties, services and the related quality parameters
  - an agreement between an IT Service Provider (EGI) and another part of the same Organization (Resource Centre)
  - an OLA supports the IT Service Provider's (EGI) delivery of IT Services (Grid) to Customers (end-users)
- Resource Provider OLA in progress
- Definition of a new GGUS-based process for Service Level Management (involving the central operators on duty, COD)
- New suspension policy
33 IV. Human Services: Operational security
Security Coordination Group: coordinate overall EGI security activities
- Security Policy Group: develop and maintain security policies
- EUGridPMA (European Policy Management Authority for Grid Authentication): coordinate the trust fabric for e-Science authentication in Europe
- Software Vulnerability Group (SVG): handling of reported potential vulnerabilities, vulnerability assessment, secure coding education
- Computer Security Incident Response Team (CSIRT): security incident response, security monitoring of the EGI infrastructure, security drills, security training and dissemination
Achievements
EGI CSIRT
- Security Service Challenge 4: 13 RCs tested (including WLCG Tier1 sites)
- 9 security incidents handled
- 12 advisories issued (3 critical)
- 3 critical vulnerabilities mitigated within 7 days
- 1 security training session (EGI TF)
SVG
- 29 software vulnerabilities reported
- 15 concerning Grid middleware
- 4 fixed (others have not passed their Target Date yet)
Procedures
- 3 new procedures: software vulnerability handling, critical vulnerability handling, security incident (exploited vulnerability) handling
Resource Centres suspended: 0
34 IV. Human Services: Documentation
- Documentation collected at the EGI wiki (160 operations pages)
- 9 new procedures defined and approved
- 3 new manuals and 4 how-tos (in progress)
- Migration and update of existing legacy technical documentation (in progress)
- Mirroring of the EGI wiki at ASGC
35 PART IV
PART I: Objectives, tasks, effort, partners
PART II: Resource Infrastructure
PART III: Service infrastructure: status and achievements
PART IV: Issues, use of resources, impact and plans
36 Issues
SA1
- Pending integration of two NGIs
- Establishment of the NGI as reference provider in the country
JRA1
- Development for local deployment tools delayed
- No funded effort for 2nd level support of distributed tools
  - SAM
  - Operations Portal
37 Use of Resources 1/2
SA1
- 98% PMs achieved (aggregated)
- Global Services
  - Some marginal cases of overspending due to transition
  - TSA1.8E: 59% achieved due to issues in claiming effort within the JRU (but activities successfully delivered)
- NGI Local Services
  - Few cases of under/overspending that will be compensated over the duration of the project
JRA1
- 80% PMs achieved (aggregated on all tasks)
38 Use of Resources 2/2
TJRA1.2: Maintenance
- Total spent 86%
- Unspent effort can be compensated during the coming years (4-year task)
TJRA1.3: Development for the Regionalisation of Ops tools
- Total spent 63%
- Underspending by almost all the partners and development not completed
  - Hiring issues for some partners
  - Consolidation of use cases
  - Dependencies among tool development roadmaps
- Propose extension of TJRA1.3 into PY2
TJRA1.5 (CNRS)
- Total spent 76%
- Harmonization of operations portal with GOCDB postponed
39 Plans for next year
SA1
- Extending participation in Staged Rollout activities
- Integration
  - New NGIs and MoUs with new integrated RPs
  - Finish UNICORE and GLOBUS integration
  - Desktop grids and PRACE (pilots)
- Operational tools availability reports (Global and Local)
- Automation of service level management processes
- Day-by-day operations (security, support, oversight)
JRA1
- Accounting
  - New APEL Publisher: September 2011
  - Regional Accounting Server packaged and released to NGIs: December 2011
  - New resources and billing (roadmap under discussion), careful prioritization needed
- Local deployment models to be completed (synchronisation system for regional GOCDB)
- Operations Portal: integration of security dashboard, creation of VO dashboards under discussion
40 Project targets
Project Objective | Metric | Target Y1 | Achieved by PQ4
PO1: Expansion of a nationally based production infrastructure | Number of production Resource Centres in EGI (M.SA1.Size.1) | 300 | 347
PO1 | Number of CPU cores available in EGI (M.SA1.Size.2), integrated | 300,000 | 338,895
PO1 | Number of CPU cores available in EGI (M.SA1.Size.2), project | 200,000 | 239,840
PO1 | EGI Reliability (M.SA1.Operation.5) | 90% | 94.6%
PO2: Support of European researchers and international collaborators through VRCs | Number of jobs done a day (M.SA1.Usage.1) | 500,000 | 914,000
PO3: Sustainable support for Heavy User Communities | Number of Resource Centres with MPI (M.SA1.Integration.2) | 50 | 96
PO6: Integration of new technologies and resources | Number of HPC clusters (M.SA1.Integration.1) | 1 | 49
PO6 | Number of virtualised resources (M.SA1.Integration.4) | 0 | 16,108
PO6 | Number of desktop resources (M.SA1.Integration.3) | 0 | 1,562
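The target table can be checked mechanically: each metric is met when the achieved value is at least the Year 1 target. The sketch below does this for the rows whose target is non-zero and whose two numbers are unambiguous in the transcript; the short metric labels and the margin calculation are illustrative additions, not part of the original slide.

```python
# (metric, Year 1 target, achieved by PQ4), copied from the project targets table.
TARGETS = [
    ("Production Resource Centres", 300, 347),
    ("CPU cores (integrated)", 300_000, 338_895),
    ("CPU cores (project)", 200_000, 239_840),
    ("EGI reliability (%)", 90.0, 94.6),
    ("Jobs per day", 500_000, 914_000),
    ("Resource Centres with MPI", 50, 96),
]


def margin_percent(target: float, achieved: float) -> float:
    """Relative distance from the target, in percent (positive means exceeded)."""
    return 100.0 * (achieved - target) / target


for name, target, achieved in TARGETS:
    status = "met" if achieved >= target else "missed"
    print(f"{name}: {status} ({margin_percent(target, achieved):+.1f}%)")

all_met = all(achieved >= target for _, target, achieved in TARGETS)
print(all_met)  # True
```

Rows with a target of 0 (virtualised and desktop resources) are left out because a relative margin is undefined for them; any non-negative achieved value meets such a target.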
41 Activity impact and value
Project objective | SA1/JRA1 achievements
O1: The continued operation and expansion of today's production Infrastructure
- SA1 and JRA1 provided continued, open and available services to all disciplines
- Radical transition to an NGI-based model (>20 NGIs)
- NGIs at different levels of maturity but active, increasingly sustainable and improving their performance
- OMB and OTAG established (>40 members)
- Installed capacity and Resource Centres integrated continued to grow (+25% CPU cores, +85% jobs run)
- 28 operational tool releases
- 6 task forces
O4: Interfaces that expand access to new user communities
- Support of MPI expanding (+31.5%)
- Integration of UNICORE (HPC)
O5: Mechanisms to integrate existing infrastructure providers in Europe and around the world
- New procedures and processes (+9)
- Collaboration with integrated RPs through MoUs
O6: Establish processes and procedures to allow the integration of new DCI technologies
- Accounting infrastructure migrated to messaging
- ARC fully integrated, GLOBUS and UNICORE in progress
- Integration of virtual Grid sites (StratusLab)
42 Summary
All project metric targets met
SA1 and JRA1 effectively contributed to the accomplishment of the project objectives
- continued operation with improving performance and increasing integration
- new operational structure, from 12 federations to NGIs, and a framework for collaboration with integrated infrastructures
- expansion of the resource infrastructure and utilization: +25% sites, +84% jobs run
43 Backup slides
44 Technology Helpdesk
[Diagram: "Workflow for bugs found in production" and "Technology release workflow". Labels: GGUS, TPM, DMSU (EGI-SA2), Technology Provider (EMI / IGE), RT, Technology Helpdesk, announce, accept/reject]
45 GGUS architecture
[Diagram]
46 GGUS high availability
- Active-active high availability concept
- Data layer done
- Logic and presentation layer ready by the end of the first half of 2011
47 SAM (backup)
[Diagram]
48 Ops Portal (backup)
[Diagram]
49 Ops Portal (backup): envisaged solution for GOCDB/OPS_PORTAL harmonization
[Diagram: "Current situation" compared with "Harmonized tools"]
50 GOCDB (backup)
GOCDB4 database
- The GOCDB4 data schema is designed in an object fashion. This object model is implemented at database level using a methodology known as Pseudo-Relational Object Model (PROM)
GOCDB PI
- The GOCDB Programmatic Interface is a REST (Representational State Transfer) based interface over https
- REST URLs are properly secured when transiting. Some of the methods are nonetheless public and don't require client-side authentication
- Output format is XML
- There are currently 3 protection levels for all methods of the interface (public, private and protected)
[Diagram: a central GOCDB4 (web service, GUI, GOCDB module) used by central users and EGI tools, and a regional/NGI GOCDB4 (web service, GUI, GOCDB module) fed by local users; access is read/write or read only, with GOCDBPI_v4 as the programmatic interface]
OCDB/Release4/Architecture
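Since the GOCDB PI is described as a REST interface over https that returns XML, a client essentially builds a query URL and parses the XML reply. The sketch below shows that shape only. Everything specific in it is a placeholder: the host `gocdb.example.org`, the `method` and `roc` query parameters, and the `SITE`/`NAME`/`COUNTRY` element and attribute names are hypothetical and are not taken from the slides. No network call is made; a canned XML string stands in for a server reply.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_pi_url(base_url: str, method: str, **params: str) -> str:
    """Build a query URL for a REST-style programmatic interface (hypothetical layout)."""
    query = urlencode({"method": method, **params})
    return f"{base_url}?{query}"


def site_names(xml_text: str) -> list:
    """Extract the NAME attribute of every SITE element in an XML reply."""
    root = ET.fromstring(xml_text)
    return [site.get("NAME") for site in root.iter("SITE")]


# Canned reply in an invented schema, standing in for the XML output of a public method.
SAMPLE_XML = """
<results>
  <SITE NAME="EXAMPLE-SITE-1" COUNTRY="Italy"/>
  <SITE NAME="EXAMPLE-SITE-2" COUNTRY="Croatia"/>
</results>
"""

url = build_pi_url("https://gocdb.example.org/gocdbpi/public/",
                   "get_site_list", roc="NGI_EXAMPLE")
print(url)
# https://gocdb.example.org/gocdbpi/public/?method=get_site_list&roc=NGI_EXAMPLE
print(site_names(SAMPLE_XML))
# ['EXAMPLE-SITE-1', 'EXAMPLE-SITE-2']
```

For the private and protected levels mentioned on the slide, a real client would additionally need to present credentials over https; how that is done is not described here and is left out of the sketch.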
51 Accounting (backup)
[Diagram]
52 Staged rollout workflow (backup)
[Diagram labels: EGI RT staged-rollout queue (ticket states New, Open, Resolved), EGI Repository (repository URL), EGI Mail manager (notification), EGI Document server DB (report ID, URL reference), GGUS ticket, Staged Rollout Manager, EA teams (EA 1 ... EA n), Technology Providers, DMSU; decision points: verification (accept), staged rollout test (accept/reject), report outcome (accept/reject); end state: staged rollout done]
53 ROD - COD (backup)
[Diagram]
54 Security monitoring (backup)
[Diagram labels: site Security Nagios and Pakiti probes (patch monitoring), messaging system, pulling results, Site View, NGI View, Global View, EGI CSIRT, Security Dashboard with Sites View and NGIs View (under development); activities: Manage, Monitor, Develop]