
ReSS Phase II: Project Closing Meeting
August 30, 2010
Parag Mhashilkar, Computing Division, Fermilab

Outline
• Status
• Effort Spent
• Operations & Support
• Phase II: Reasons for Closing the Project
• Phase II: Features Delivered
• Outstanding Risks
• Next Steps
• Lessons Learned
• Conclusion

Status
• Phase II started in Sep 2008
• Currently supported VOs
  ▪ CMS
  ▪ DES
  ▪ DZero
  ▪ Engagement
  ▪ FermiGrid
• Operations
  ▪ Stable operations
    - Issues related to HA have been addressed
  ▪ Ongoing support

Effort Spent

Current Status:
• Parag Mhashilkar: 20% FTE. Project Lead / Core Developer / Support.
• Gabriele Garzoglio, Tanya Levshina: as needed. Backup.

After Project Closure:
• Parag Mhashilkar: as needed. Consulting and Support.
• Gabriele Garzoglio, Tanya Levshina: as needed. Backup.

Operations & Support
• Operations: the ReSS services are operated and well supported by the FermiGrid group.
• Support: tickets are triaged through the CD Service Desk.
  ▪ FermiGrid Support Group handles:
    - Disruption of ReSS service(s)
    - The service cannot be contacted
    - The machine hosting the service is not reachable
  ▪ ReSS Developers' Support Group handles:
    - Site(s) not reporting to ReSS
    - How to extract required information from ReSS (see the sketch after this list)
    - Possible bugs preventing sites from advertising to ReSS, or causing sites to advertise incorrect information
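For the two most common developer-support questions, whether a site is reporting to ReSS and how to extract information from it, the check amounts to querying the ReSS Condor collector for the ClassAds the site advertises. Below is a minimal sketch of such a query; the collector host name, the site name, and the GlueSiteName attribute are illustrative assumptions and should be replaced with the values used by the actual deployment.

    # Hypothetical helper: ask the ReSS Condor collector which ClassAds a site advertises.
    # Collector host, site name, and attribute name below are illustrative, not authoritative.
    import subprocess

    RESS_COLLECTOR = "osg-ress-1.fnal.gov"   # example collector host

    def site_classads(site_name):
        """Return the full ClassAds advertised for one site, as printed by condor_status."""
        cmd = [
            "condor_status", "-any",              # query all ad types in the collector
            "-pool", RESS_COLLECTOR,
            "-constraint", 'GlueSiteName == "%s"' % site_name,
            "-long",                              # print complete ClassAds
        ]
        return subprocess.check_output(cmd).decode()

    if __name__ == "__main__":
        ads = site_classads("FNAL_FERMIGRID")     # illustrative site name
        if not ads.strip():
            print("Site is not reporting to ReSS")
        else:
            print(ads)

A query like this also covers the "site advertises incorrect information" case: the attributes in the printed ClassAds can be compared against what the site administrator expects to publish.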

Phase II: Reasons for Closing the Project
• Phase II was started to add new features to the ReSS project:
  ▪ Support for MPI
  ▪ Support for advertising SEs (Storage Elements)
  ▪ Improved robustness through HA, etc.
  ▪ […]
• The project has achieved the initial goals stated in the charter.
• The project has provided additional features per change requests made by the stakeholders during the lifetime of the project.
• As of now there are no outstanding user requests known to the project.

Phase II: Features Delivered

• Support for MPI users. Stakeholder: OSG. Planned: 12/31/2008.
• Improved support for Storage Element registration with ReSS. Stakeholder: OSG. Planned: 12/31/2008.
• Test suite to identify installation/deployment issues. Stakeholder: ReSS. Planned: 03/31/2009. Completed: 05/07/2009.
• Compliance with the OSG 1.2 Generic Information Services. Stakeholder: OSG. Planned: 02/28/2009. Completed: 07/27/2009.
• Compliance with the Generic Information Provider to support GLUE Schema V2. Stakeholder: OSG. Not completed.
• Improved security for resource registration with ReSS. Stakeholders: ReSS, OSG, Engagement. Planned: 11/30/2009. Completed: 08/12/2010.
• Support to run ReSS services in High Availability deployment mode. Stakeholder: FermiGrid. Planned: 03/31/2009. Completed: 06/17/2009.
• Compliance of ReSS with the FermiGrid Software Acceptance Process. Stakeholder: FermiGrid. Planned: 09/31/2009. Completed: 09/15/2009.
• ReSS Security Review *. Stakeholder: Comp Div. Planned: 10/31/2009. Completed: 11/06/2009.
• RSV Probes for ReSS *. Stakeholder: OSG. Planned: 02/28/2010.
• Monitoring of resources in the condor collector in HA *. Stakeholder: ReSS. Planned: 03/31/2010. (A sketch of this kind of check follows the list.)
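The last milestone above, monitoring of the resources in the Condor collector under HA, essentially means checking that both collectors of the HA pair advertise a consistent set of resources. The sketch below shows one way such a check could look; the two host names and the tolerance are illustrative placeholders, not the actual ReSS deployment values.

    # Hypothetical consistency check for an HA collector pair: compare the number of
    # ClassAds each collector advertises. Host names and tolerance are illustrative only.
    import subprocess

    HA_COLLECTORS = ["ress-ha1.example.fnal.gov", "ress-ha2.example.fnal.gov"]

    def count_ads(collector):
        """Count the ClassAds known to one collector, using condor_status."""
        out = subprocess.check_output(
            ["condor_status", "-any", "-pool", collector, "-format", "%s\n", "Name"]
        ).decode()
        return sum(1 for line in out.splitlines() if line.strip())

    counts = {host: count_ads(host) for host in HA_COLLECTORS}
    print(counts)
    # Flag the pair as inconsistent if the counts diverge by more than a small tolerance.
    if abs(counts[HA_COLLECTORS[0]] - counts[HA_COLLECTORS[1]]) > 5:
        print("WARNING: HA collectors disagree on the number of advertised resources")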

Outstanding Risks

• Risk: Support for CEMon dropped by gLite. Impact level: High.
  Risk plan/actions: This would require working closely with the GIP group to find an alternative means of providing the functionality currently provided by CEMon, should it happen. OSG could also evaluate and adopt the advertising tool developed by Brian Bockelman, which is currently deployed on Compute Elements at the University of Nebraska-Lincoln. The chance of CEMon support being withdrawn by the CEMon group is minimal, but the impact on OSG Information Services could be significant.
• Risk: Adaptation to GLUE Schema V2. Impact level: Medium.
  Risk plan/actions: At the time of closing this project, OSG has yet to adopt GLUE Schema V2. The changes needed to adopt GLUE Schema V2 could be complex and may not integrate with the existing ReSS services.

Next Steps
• No current plan to open a new phase of the project.
• Starting September 2010, Parag Mhashilkar will ramp down his effort to consulting and emergency maintenance only, as required.
• If needed, a new phase of the project would be opened in the future only after evaluating:
  ▪ any new user requests
  ▪ the need to add features to the existing ReSS services

Lessons Learned
• Managing code (the OSG plug-in) that depended on third-party software was not easy.
  ▪ Good collaboration between gLite and the ReSS project.
  ▪ Lesson: write access to the code repositories could have made the turnaround time faster.
• Troubleshooting was often complicated and time consuming.
  ▪ Failures were often in dependent packages: GIP, CE security configuration, or Tomcat configuration.
  ▪ RSV probes: the number of such problems reported to the ReSS team has gone down.
  ▪ Lesson: automation of the troubleshooting tasks can be extended to other software in the OSG stack.
• Sites were slow in deploying patches to the OSG software stack.
  ▪ Sites miss some crucial patches and run into already-addressed problems.
  ▪ Lesson: sites should be encouraged to deploy patches more rapidly, and the upgrade process should be made more transparent to the sites.
• The ReSS team worked with several OSG VOs. The project group learned several valuable lessons and developed proficiency in the OSG Information Services.
  ▪ Recommendation: the ReSS project recommends that OSG and the Computing Division involve members of the ReSS team in investigations of the next generation of OSG Information Services.

Conclusion
• Stable operations.
• Features in the original charter delivered.
  ▪ Some tasks were delayed, due to:
    - reduction in % FTE
    - accepted change requests not accounted for in the charter
• Successful collaboration with the CEMon group.
• The ReSS project recommends that OSG and the Computing Division involve members of the ReSS team in investigations of the next generation of OSG Information Services.
• The project has been successful, thanks to OSG, CD Fermilab, the ReSS stakeholders, the OSG GOC, and the CEMon-gLite group.