OSG Operations – Lessons Learned CHEP 2010, 18 October 15:10 (Asia/Taipei) – Room 2, BHSS

Rob Quick, OSG Operations Coordinator, Indiana University

OSG Council Aug 18th 2010 OSG

“The Open Science Grid (OSG) advances science through open distributed computing. The OSG is a multi-disciplinary partnership to federate local, regional, community and national cyberinfrastructures to meet the needs of research and academic communities at all scales.”

A Few More Notes
Open Science Grid
- ~44 Registered VOs
- Physics, Biology, Chemistry, Nanotechnology, etc.
- 37 Active
- Current OSG Grant - October 1, 2006 to September 30,

OSG Production
Production
- Operations
- Infrastructure
- Support
- Security Operations
- Site and VO Coordination
- Integration

OSG Operations (Infrastructure)
Infrastructure Services
- Administrative Services
  - OIM Registrations Database
- Information and Accounting Services
  - BDII, Gratia and ReSS
- Monitoring Services
  - RSV, SAM Reporting, Ops Monitoring (Munin)
- Software Caches
  - OSG Middleware packages, CA Distribution, OSG Configuration
- Communication Tools
  - MyOSG, Twiki, Ticketing Interface, OSG Status Display, Notification Tools

OSG Operations (Support)
Support Services
- 24x7 Ticketing
- 24x7 Security Incident Response
- 24x7 Critical Service Response (BDII, MyOSG)
- User and Admin Support
- Troubleshooting
- Community Notification
- Documentation

Brainstorming - Lessons Learned
- Technology
- Visibility
- Local Support
- Relationships
- Communication
- Flexibility
- Reliability
- Experience

My Over 30 Soccer Team

Technology
- Changes Quickly
- Beware of “Shiny Objects”
- Define Service Levels (SLAs)
- Sometimes Looking Good is as Important as Being Good

Visibility
- Transparency is your friend
- Know when Operations Visibility is Good
  - But also know when it is Bad
- Tell everyone the story…
  - But not until the story is over
“The vision of a champion is someone bent over, drenched in sweat, to a point of exhaustion, when no one else is watching.” Anson Dorrance

Local Support
- Financial Support
  - Staff
  - Equipment
- Moral Support

Relationships
- With Customers
- With Stakeholders
- With Peering Organizations
- Trust
“I know it sounds awful, but it just hit me half-way through my stag night that I'd rather be going to the match with the lads than marrying Nicola.” - Hereford fan, cancelling his wedding to watch FA Cup game v Aylesbury.

Communication
- Let the Community know what is happening
- Back to Transparency
- Set Expectations Up Front (SLAs Again)
- Be part of the rumor mill
- Be available

Flexibility
- Be able to adapt quickly
  - Physically and Programmatically
- Find a way to watch the real usage
  - Not just what you think is happening
- Build flexibility (depth) into the environment

A Flexibility Story

Reliability
- Of Services
  - Hardware
  - Software
  - Raw Uptimes Over Past Year ~99.78%
  - Redundant BDII and MyOSG
- Of People
  - See Communication and Experience Slides
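
The raw uptime figure on the Reliability slide can be read as a yearly downtime budget. The following is a minimal sketch, assuming a 365-day year and that the ~99.78% figure is a plain fraction of wall-clock time; the slide does not say how the number was measured, and the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8760 hours


def downtime_hours(uptime_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime implied by an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * period_hours


if __name__ == "__main__":
    # 99.78 is the approximate raw uptime quoted on the slide.
    print(f"{downtime_hours(99.78):.1f} hours of downtime per year")
```

Under these assumptions, ~99.78% uptime corresponds to roughly 19 hours of downtime over the year.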

Experience
- No Substitute
  - 20+ Years on Staff
  - Over 9000 Tickets Resolved
- Let the Experience Show
- Enjoy the Ride
“Some people believe football is a matter of life and death. I'm very disappointed with that attitude. I can assure you it is much, much more important than that.” Bill Shankly

GOOOOOOAAAAAAAAALLLLL!