
INFSO-RI-223782 SA1 Status Report Status and Progress of the ETICS Services ETICS2 First Review Alberto AIMAR CERN Brussels 3 April 2009.


1 INFSO-RI-223782 SA1 Status Report Status and Progress of the ETICS Services ETICS2 First Review Alberto AIMAR CERN Brussels 3 April 2009

2 Outline
  Tasks and Deliverables
  Features and Achievements
  Outlook on Year 2
  Conclusions

3 Tasks and Deliverables

4 SA1 Tasks
  SA1.1 – Work Package Coordination
    Regular coordination of the Work Package, reporting and review of milestones and deliverables
  SA1.2 – Core Service Maintenance and Extensions
    Maintenance of ETICS core services, fixing reported bugs and implementing new requirements for core services
    Design and implementation of integrated release management tools, federated repository and APIs
    Integration into core services of contributions from SA2 and JRA activities
  SA1.3 – Core Service Documentation
    Maintenance and updating of the core service documentation
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
    Extending current deployment strategies of ETICS services and infrastructure management, so that the service maintains a high level of quality and availability
    Load-balancing deployment for solicited and/or critical services
    Improving coverage of the self-monitoring system of deployed services and underlying infrastructures
  SA1.5 – Core Service Certification
    Applying the proposed ETICS Certification Process to the ETICS software itself
    Demonstrating its applicability and that the ETICS service provides the information required to determine the level of certification of the software projects

5 SA1 Deliverables
  DSA1.1 – Execution plan for first 12 months of infrastructure operation (M03)
    This deliverable describes the execution plan for the first half of the ETICS 2 project, including the core service roadmap and the infrastructure deployment plan
  DSA1.2 – ETICS Core Services Design Specification (M06)
    This deliverable describes the overall architecture of the ETICS 2 core services
  DSA1.3 – ETICS Site Service Level Agreement (M09)
    This deliverable describes the Service Level Agreements upon which the ETICS service will be provided. The SLAs will define the service level the users can expect from the service in terms of availability, accessibility and support
  DSA1.4 – Execution plan for second 12 months of infrastructure operation (M12)
    This deliverable describes the execution plan for the second half of the ETICS 2 project, including the core service roadmap and the infrastructure deployment plan
  DSA1.5 – Infrastructure and core services certification and usage report (M21)
    This deliverable reports on the release management cycles and certification of the ETICS 2 infrastructure and core services, including lessons learned and corrective actions to apply

6 Features and Achievements

7 ETICS SA1 Services

8 Release and Development Infrastructure
  SA1.2 – Core Service Maintenance and Extensions
  Production Installation (prod)
    The officially released and supported ETICS
  Release Candidate Installation (rc)
    The “next” production, available for final certification and testing by selected users
  Integration Testing Installation (test)
    All the release candidates of the packages are tagged at project level and installed for integration tests
  Development Installation (dev)
    A shared installation where developers can test their packages with the release candidates of other packages
  Individual Development Installations (dev-...)
    Developers or teams can instantiate their own infrastructure, often at reduced scale, for individual development and testing
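The installations on this slide can be read as stages that a package moves through on its way to production. The slide does not describe a promotion mechanism, so the following is only a minimal sketch under that assumption: the ordering dev, test, rc, prod is inferred from the descriptions, and the names `Installation` and `next_installation` are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Installation(IntEnum):
    """Shared installations named on the slide, in an assumed promotion order."""
    DEV = 1   # shared development installation
    TEST = 2  # integration testing installation
    RC = 3    # release candidate installation
    PROD = 4  # production installation


def next_installation(current: Installation) -> Optional[Installation]:
    """Return the stage that follows `current`, or None once in production."""
    if current is Installation.PROD:
        return None
    return Installation(current + 1)


if __name__ == "__main__":
    stage: Optional[Installation] = Installation.DEV
    while stage is not None:
        print(stage.name.lower())
        stage = next_installation(stage)
```

Individual dev-... installations are left out of the sketch because they are private to a developer or team and are not part of the shared sequence.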

9 Infrastructure Monitoring
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  Monitoring and Alarms System
    Integrated in the CERN Monitoring System (web, SMS, messaging, etc.)
    More in Year 2

10 Functional Regression Testing
  SA1.2 – Core Service Maintenance and Extensions
  Integration and Release Candidate
    Defined jobs, testing different functionalities and platforms

11 ETICS Resource Pool(s)
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  Installation and Maintenance of Resource Pools
    Worker nodes for build and test jobs on all supported platforms (>80 WNs)

12 ETICS Production Platforms
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  Production Platform                      Hardware
  SLC3-32: Scientific Linux (CERN) 3       Intel (x86)
  SLC4-32: Scientific Linux (CERN) 4       Intel (x86)
  SLC4-64: Scientific Linux (CERN) 4       Intel (x86_64)
  SL5-32: Scientific Linux 5               Intel (x86)
  SL5-64: Scientific Linux 5               Intel (x86_64)
  Debian 4-32: Debian Linux 4.0            Intel (x86)
  Debian 4-64: Debian Linux 4.0            Intel (x86_64)
  RH4-32: Red Hat AS 4                     Intel (x86)
  RH4-64: Red Hat Linux 4                  AMD64
  MAC OS X                                 Intel (x86)
  Installation, Maintenance, Security Upgrades of Prod Platforms
    Install worker nodes for build and test jobs in production
    Make all external software packages available on each platform
    SL5 platforms for gLite
    Several other platforms for porting and testing

13 Usage of the Resources
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  Build/test type        Q2 Q3 Q4
  Production             137031712122035
  Test                   ~600 ~3000 ~700
  Other                  ~300 ~650 ~1200
  Project                Q2 Q3 Q4
  org.glite              746434233415
  org.etics              367274647722
  org.glite.testsuites   215422212255
  org.gcube              135521485
  torquemaui             3513242
  externals              346879
  unicore                3313187
  ARC                    - - 86

14 ETICS Client Performance
  SA1.2 – Core Service Maintenance and Extensions
  Client 1.4 Released
    Performance improved by 200% to 900%, depending on the task to be executed and the available hardware
    Very important for developers but also for remote execution
    The original XML-based implementation did not scale; the new implementation is based on SQLite, the de facto standard among multi-platform embedded database engines
                     Old client    New client   Speed-up   Modules
  gLite              ~35h          ~4h          875%       384
  WMS                1h 43m 41s    14m 16s      735%       110
  Data Management    1h 12m 18s    10m 34s      720%       104
  Security           29m 38s       5m 45s       483%       65
  LB                 14m 32s       2m 51s       460%       42
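The slide does not say how the speed-up percentages were derived. As a rough sketch, assuming the figure is simply old time divided by new time, the helper below parses the durations as written in the table and computes that ratio. The function names are hypothetical. Under this assumption the gLite row reproduces the reported 875%, while the other rows come out a few tens of percentage points away from the slide's figures, so those were presumably measured or rounded differently.

```python
import re

SECONDS_PER_UNIT = {"h": 3600, "m": 60, "s": 1}


def to_seconds(duration: str) -> int:
    """Convert a duration such as '1h 43m 41s' or '~35h' to seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", duration)
    if not parts:
        raise ValueError(f"no duration found in {duration!r}")
    return sum(int(value) * SECONDS_PER_UNIT[unit] for value, unit in parts)


def speedup_percent(old: str, new: str) -> float:
    """Assumed definition: old build time over new build time, as a percentage."""
    return 100.0 * to_seconds(old) / to_seconds(new)


if __name__ == "__main__":
    for name, old, new in [
        ("gLite", "~35h", "~4h"),
        ("WMS", "1h 43m 41s", "14m 16s"),
    ]:
        print(f"{name}: {speedup_percent(old, new):.0f}%")
```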

15 Worker Nodes Virtualization
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  All ETICS Worker Nodes are Virtual Machines
    CERN moved to dual 4-core nodes (8 cores per machine) in summer 2008
    ETICS had to move to virtual images because 8-core WNs are not useful for builds and tests
    WNs in the ETICS pool are 2-core VMs
    Static creation of virtual machines
  Prepared a Library of Virtual Images
    Provide all maintenance, security updates, etc.
  Very Flexible Infrastructure
    Instantiate new machines or change platforms with a few commands
  Further Improvement
    The ETICS bootstrapper will download and start a virtual machine directly on the WN (for using other hardware infrastructures). Possibly in Year 2.

16 ETICS Repository Improvements
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  The ETICS Repository has been reorganized
  Major improvements
    Scalable and faster statistics
    New versions of the tools used (Java, etc.)
    New browser interface and addressing based on REST
    Presented to the user as a more intuitive tree of directories and files with icons
    Reports and packages are now stored on a highly available (HA) file system (AFS)

17 Integration with External Repositories
  SA1.2 – Core Service Maintenance and Extensions
  Generation of RPM and tar packages already available
    Debian users and gLite needed other distribution formats
  Dynamic YUM repositories were requested
    gLite uses YUM repositories as its distribution mechanism
    Permanent YUM repository for registered repository
    Using the standard YUM client, all binaries can be deployed
  Further improvements in Y2 of the project
    Feasibility and prototypes of integration
    Driven by the need for a sustainable future for ETICS
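Deploying binaries with the standard YUM client means pointing it at the repository through a `.repo` stanza. The sketch below generates such a stanza with Python's `configparser`. The repository id, the name and the `repo.example.org` URL are hypothetical placeholders; the slides do not give the real ETICS repository addresses. Only standard YUM repository options (`name`, `baseurl`, `enabled`, `gpgcheck`) are used.

```python
import configparser
import io


def yum_repo_stanza(repo_id: str, name: str, baseurl: str) -> str:
    """Render a minimal YUM .repo stanza as text."""
    config = configparser.ConfigParser(interpolation=None)
    config[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1",
        # Signature checking is disabled only to keep the sketch short.
        "gpgcheck": "0",
    }
    buffer = io.StringIO()
    # Write key=value without spaces, the usual style of .repo files.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(yum_repo_stanza(
        "etics-example",                              # hypothetical repository id
        "Example project built with ETICS",           # hypothetical name
        "https://repo.example.org/etics/sl5_x86_64/", # hypothetical URL
    ))
```

Saved under `/etc/yum.repos.d/`, a stanza of this shape is what lets a plain `yum install` resolve packages from the repository.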

18 ETICS Web Client
  SA1.2 – Core Service Maintenance and Extensions
  ETICS Build and Test Portal (restarted Sept 2008)
    Improved the External Requests and Submission web interface
    Y2: Streamline the interface for repetitive non-expert tasks (re-run build, test, etc.) vs. expert tasks (new package, configuration, etc.)
  Web Build and Test Application (restarted Oct 2008)
    Porting to Firefox 3 was the major improvement
    Fixing bugs in the Web Apps
    Changes required by others (multi-packaging, etc.)
  Disseminator (restarted Oct 2008)
    Deployed on an internal INFN machine to be tried and tested
    Needs to be completed, as the metrics are a cornerstone of many ETICS activities (Plug-ins, QA, A-QCM, gLite)
    Not many resources for this fundamental component until Oct 2008

19 Service Level Agreement (I)
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  The ETICS SLAs describe the terms of Quality and Availability of Services
  Quality of Services
    Installation - Installation procedures are present and used regularly to instantiate development services.
      Complete installation and testing takes less than 24 hours; this includes OS installation on servers, restore from backups, security certificates, firewall settings, etc.
    Backup - Backup sets are generated using the standard CERN procedures.
      Server backups are performed every night.
      All permanent files are stored on AFS, a mirrored and archived central file system at CERN.
    Restore - Restoration of the full ETICS Services should take up to one day,
      provided that the hardware and the above-mentioned services (AFS, TSM, networking, certificates, etc.) used by the ETICS Services are available.
    Redundancy - The Service is not redundant, but the servers can be restored in a few hours.
      All the hardware is standard commodity material or is virtualized, so we can easily find hardware and prepare new sets of ETICS Services.
    Virtualization - All Worker Nodes are VM-based and are fully redundant for the platforms needed by the current ETICS users (SLC4/5, RH and Debian on 32- and 64-bit platforms).
    Supply - All ETICS hardware is CERN standard, but if needed it can be purchased at any IT store, provided that the hardware supports SL4, which is the standard Linux OS used at CERN.
      No foreseeable supply problems in case of urgently needed hardware.
    Software Dependencies - The software used by the Services is all widely used open source, so there is no danger of lacking supply or license availability.

20 Service Level Agreement (II)
  SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
  Availability and Reliability Targets
    For accessing different artefacts of the Build and Test processes
    In Year 2 we will complete 2 SLA documents (gLite and D4Science)
  Service                                                    Yearly Availability   Yearly Reliability
  Access to Project Binary packages                          98%                   99%
  Access to Build Reports and Metrics Repository             97%                   99%
  Build and Configuration Portal                             95%                   97%
  Support requests (creation of projects, new users, etc.)   98%                   100%
  Note: Availability and reliability values take into account issues due to the ETICS Services functions, but not those caused by the services used by ETICS. E.g. if there is no network connectivity at CERN for 24h, these 24h will not be considered ETICS downtime, although the ETICS team will take measures to minimize these effects as well.
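To make the targets concrete, a yearly availability percentage can be turned into a downtime budget. The slide does not define how availability is measured, so this is only a sketch under two assumptions: availability is the fraction of a 365-day year in which the service is up, and downtime caused by services ETICS depends on is subtracted before computing it, as the note describes. Function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def measured_availability(downtime_hours: float, excluded_hours: float = 0.0) -> float:
    """Availability in percent over one year.

    excluded_hours is downtime caused by services that ETICS uses
    (for example a site-wide network outage), which the note on the
    slide says is not counted as ETICS downtime.
    """
    etics_downtime = downtime_hours - excluded_hours
    return 100 * (1 - etics_downtime / HOURS_PER_YEAR)


if __name__ == "__main__":
    for service, target in [
        ("Access to Project Binary packages", 98),
        ("Access to Build Reports and Metrics Repository", 97),
        ("Build and Configuration Portal", 95),
    ]:
        print(f"{service}: {allowed_downtime_hours(target):.1f} h/year")
```

Under these assumptions a 98% target allows roughly 175 hours of ETICS-caused downtime per year, and a 95% target about 438 hours.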

21 Outlook on Year 2

22 Improvements of ETICS Infrastructure
  DSA1.4: Execution Plan for the Second 12 Months
  Milestones for Y2 of the ETICS2 project are oriented towards a sustainable ETICS
    Disentangle the ETICS Services from the current solutions based on the current partners
    Services that can be based on, or interfaced to, external resources and organizations
    Adding features that are typically needed by commercial User Projects
  M14 - Definition of Monitoring Parameters for the ETICS Infrastructure and Resources
    Parameters to be collected and monitored in the infrastructure and resources of the ETICS Services.
    These metrics will be used for monitoring and reporting about the ETICS Services.
  M18 – Extended Monitoring for Other Infrastructures
    A general monitoring interface should be implemented and should support some of the common monitoring and messaging systems used at the Grid sites.
  M22 - Reports on the Usage of the ETICS Infrastructure
    Automated reports with detailed usage of the ETICS Infrastructure.
    Load on the servers and WNs by User Projects and platforms should be available.
    Initial step to define the costs associated with User Projects and with the support of different platforms.
  M15 - Security Assessment of the ETICS Services
    Aspects regarding the security of the ETICS Services should be assessed and certified following the current standards in place at major Sites.
    Describe the security status of the ETICS Services and, if necessary, the changes to undertake.
    This report may introduce an additional milestone to implement the required changes.

23 New Features of the ETICS Services
  DSA1.4: Execution Plan for the Second 12 Months
  M16 - Metrics Disseminator for Trend Analysis (delayed from Y1)
    The ETICS Services will be able to collect the metrics and display their results for any User Project.
    The disseminator should provide a web interface to select metrics data and customize its visualization for trend analysis plots.
  M18 - Support of Distributed Multi-Node Testing (delayed from Y1)
    The ETICS Services will be extended to map test definitions generated by the distributed testing design tools implemented by the ETICS 2 JRA2 work package (Test Management Tools).
  M17 - Feasibility Study of ETICS integration with External Resources (delayed from Y1)
    The feasibility study will investigate the possibility of connecting the ETICS Services to established external code repositories (e.g. SourceForge, Google Code, etc.) and to computing resources to use as submission engines (e.g. Amazon EC2, Google App Engine, etc.).
  M20 – Test of Integration of ETICS Services with External Resources
    For the sustainability of the ETICS Services beyond the ETICS 2 project, it is important that external computing and storage resources can be used for all components of the Services.
    An advanced prototype should validate whether it is possible to run the ETICS Services on completely external commercial resources.
  M14 - SLA document defined with 2 major User Projects (delayed from Y1)
    The SLA framework that was delayed at M12 should be implemented for 2 major ETICS User Projects and clearly specify the level of support and quality of the services provided.
    The two User Projects are most likely going to be gLite and D4Science.

24 New Features of the ETICS Services
  DSA1.4: Execution Plan for the Second 12 Months
  M17 - Interfaces to Existing Build Systems
    Currently the structure of a User Project and its entire configuration must be described in the ETICS Services in order to profit from all the functionalities of the ETICS System.
    Existing large projects find it difficult to integrate ETICS into their existing processes, as they already have their own configuration and build tools. They need to be able to use ETICS with as little effort and as few changes as possible.
    Therefore interfaces and import tools will facilitate the usage of the ETICS System by established projects. For example, interfaces should be provided for popular existing configuration tools (e.g. Maven for the Java community, CMT for the Physics community, etc.).
    The structure and the tags of the modules of a project may be imported from the source code structure (e.g. in CVS, SVN, etc.).
  M18 - Implementation of New Privacy, Authentication and Authorization Policies
    The current ETICS Services have been developed focusing on the needs of open source and research projects, where authentication and authorization were considered necessary but privacy was not.
    Stricter policies concerning the privacy of source code, reports and binaries must be implemented by the ETICS Services.
    In addition, authentication, currently performed via certificates, must be possible via other commonly used methods such as “username/password” identification.
  M21 - Web Task-Oriented Interfaces
    Currently ETICS has two user interfaces: a command line interface and a Web-based interface. Both interfaces provide access to all functionalities and all options of the ETICS operations in a very complete, but also quite complex, manner.
    A simpler task-oriented interface must be provided for the main common operations of the ETICS users (e.g. register, add, build, test, etc.) where, for instance, the most typical options are already set or selected by the project administrators.

25 Integration with Other Activities
  DSA1.4: Execution Plan for the Second 12 Months
  M17 - A-QCM Certification of the ETICS Software (delayed from Y1)
    The ETICS Services should be certified at level 2 for each of the four quality aspects in the A-QCM quality standard, in view of reaching level 3 at the end of the ETICS 2 project as defined in the ETICS 2 Description of Work.
  M18 - Integration of ETICS Services with gLite and UNICORE (part 2)
    Currently the ETICS Services submit the build and test jobs to Worker Nodes managed by the Condor Metronome submission system.
    It is necessary to extend the possibility to use other resources in addition to the ones currently available, and to provide a general interface allowing new submission engines to be plugged in.
    In the previous year a working, but limited, implementation was developed. In the second year, a more complete integration of job submission systems into the ETICS Services will be implemented using the EGEE/gLite and DEISA/UNICORE middleware. This will allow the submission of ETICS build and test jobs on these grid infrastructures.
    This milestone depends on the availability of the submission plugins that will be provided by the SA2 activity.
  M16 - Integration of Testing and Metrics Collectors Plugins
    Several testing and metrics plugins are under development at the end of the first year of the ETICS 2 project.
    These plugins should be integrated into the ETICS Systems so that User Projects can select which tests to execute and which metrics should be collected during the build and test processes of the modules that are part of the given User Project.
  M18 - Integration of Reporting Plugins
    Reporting facilities are under development at the end of the first year of the ETICS 2 project.
    They should be integrated into the ETICS Systems so that User Projects can select which reports should be generated for the modules that are part of the given User Project.
    An export facility (e.g. CSV, XML formats) will allow the metrics collected with ETICS to be presented in external applications (e.g. Excel).
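The M18 gLite/UNICORE milestone on this slide calls for a general interface through which new submission engines can be plugged in. The slides do not show the actual ETICS plugin API, so the following is only an illustrative sketch of that pattern: every class and method name here (`Job`, `SubmissionEngine`, `InMemoryEngine`, `EngineRegistry`) is hypothetical, and the in-memory engine merely stands in for real back ends such as Condor Metronome, gLite or UNICORE.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Job:
    """A build or test job to be run on some platform."""
    project: str
    platform: str
    kind: str  # "build" or "test"


class SubmissionEngine(ABC):
    """Interface every pluggable submission engine would implement."""

    @abstractmethod
    def submit(self, job: Job) -> str:
        """Submit the job and return an engine-specific job identifier."""


class InMemoryEngine(SubmissionEngine):
    """Stand-in engine that only queues jobs; a real plugin would call a grid middleware."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: List[Job] = []

    def submit(self, job: Job) -> str:
        self.queue.append(job)
        return f"{self.name}-{len(self.queue)}"


class EngineRegistry:
    """Maps engine names to plugins so the service can choose one per job."""

    def __init__(self) -> None:
        self._engines: Dict[str, SubmissionEngine] = {}

    def register(self, name: str, engine: SubmissionEngine) -> None:
        self._engines[name] = engine

    def submit(self, name: str, job: Job) -> str:
        if name not in self._engines:
            raise KeyError(f"no submission engine registered as {name!r}")
        return self._engines[name].submit(job)


if __name__ == "__main__":
    registry = EngineRegistry()
    registry.register("condor", InMemoryEngine("condor"))
    registry.register("unicore", InMemoryEngine("unicore"))
    print(registry.submit("unicore", Job("org.etics", "sl5_x86_64", "build")))
```

Keeping the service code dependent only on the abstract interface is what would let SA2 deliver engines independently of the core services.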

26 Risks
  DSA1.4: Execution Plan for the Second 12 Months
  Risk 1 - Lack of human or material resources needed at the sites
    This risk can be mitigated by looking for new sources of resources, for example contributions from the User Projects to the HR or HW resources of ETICS.
    If unavoidable, one could define SLAs that control the usage of the ETICS Services by each project (e.g. number of builds per day).
    Effort must be put into ensuring the commitment of the partners to provide hardware resources through specific contractual Consortium Agreements.
  Risk 2 - Difficulty in Integrating the Work of Other ETICS 2 Activities
    Issues incurred by SA2, JRA1 and JRA2 could affect their ability to deliver their results to SA1, thus limiting the availability of these features in production on the ETICS Services.
    The risk can be mitigated by constant training, common work and frequent checkpoints of the SA1 work and that of the other ETICS 2 activities.
  Risk 3 – Still gLite-centric ETICS Services and Infrastructure after ETICS 2
    The needs of a major project such as gLite can sometimes conflict or compete with the features needed to provide a set of ETICS Services that is sustainable after the end of the ETICS 2 project.
    The input from other user communities highlights the need for a higher degree of privacy, data protection and extensibility than currently required by the gLite project.
    If the external needs are not taken more into account in the second year of the ETICS 2 project, it is unlikely that the ETICS Services will have a sustainable future.

27 Conclusions and Summary
  Main Objectives (and Additional Achievements) of the First Year
    Automation, Performance, Virtualization, High Availability
    Lack of resources for the first 6 months caused delays on some tasks, but these will be recovered in Year 2
  Maintenance and Upgrade of the Services
    Platforms, Updates, Virtual Images, External Software
    Urgent Requests for Main User Projects
  Year 2: Focus on Sustainability
    Usage of External Resources for CPU (WN) and Storage
    Add Features needed: Privacy, Authorization, etc.
    Interface to/Import from popular configuration systems
    Integrate SA2 Submission Engines
    Testing and Metrics Plugins from JRA2
    Cross Site submission with JRA1
    A-QCM Certification and Reports with NA2

