OSG Operations – Lessons Learned
CHEP 2010, 18 October 15:10 (Asia/Taipei) – Room 2, BHSS
Rob Quick, OSG Operations Coordinator, Indiana University
OSG Council Aug 18th 2010
OSG
“The Open Science Grid (OSG) advances science through open distributed computing. The OSG is a multi-disciplinary partnership to federate local, regional, community and national cyberinfrastructures to meet the needs of research and academic communities at all scales.”
A Few More Notes
- Open Science Grid
- ~44 Registered VOs
  - Physics, Biology, Chemistry, Nanotechnology, Etc.
  - 37 Active
- Current OSG Grant - October 1, 2006 to September 30,
OSG Production
- Production
- Operations
  - Infrastructure
  - Support
- Security Operations
- Site and VO Coordination
- Integration
OSG Operations (Infrastructure)
Infrastructure Services
- Administrative Services
  - OIM Registrations Database
- Information and Accounting Services
  - BDII, Gratia and ReSS
- Monitoring Services
  - RSV, SAM Reporting, Ops Monitoring (Munin)
- Software Caches
  - OSG Middleware packages, CA Distribution, OSG Configuration
- Communication Tools
  - MyOSG, Twiki, Ticketing Interface, OSG Status Display, Notification Tools
OSG Operations (Support)
Support Services
- 24x7 Ticketing
- 24x7 Security Incident Response
- 24x7 Critical Service Response (BDII, MyOSG)
- User and Admin Support
- Troubleshooting
- Community Notification
- Documentation
Brainstorming - Lessons Learned
- Technology
- Visibility
- Local Support
- Relationships
- Communication
- Flexibility
- Reliability
- Experience
My Over 30 Soccer Team
Technology
- Changes Quickly
- Beware of “Shiny Objects”
- Define Service Levels (SLAs)
- Sometimes Looking Good is as Important as Being Good
Visibility
- Transparency is your friend
- Know when Operations Visibility is Good
  - But also know when it is Bad
- Tell everyone the story…
  - But not until the story is over
“The vision of a champion is someone bent over, drenched in sweat, to a point of exhaustion, when no one else is watching.” - Anson Dorrance
Local Support
- Financial Support
  - Staff
  - Equipment
- Moral Support
Relationships
- With Customers
- With Stakeholders
- With Peering Organizations
- Trust
“I know it sounds awful, but it just hit me half-way through my stag night that I'd rather be going to the match with the lads than marrying Nicola.” - Hereford fan, cancelling his wedding to watch FA Cup game v Aylesbury.
Communication
- Let the Community know what is happening
  - Back to Transparency
- Set Expectations Up Front (SLAs Again)
- Be part of the rumor mill
- Be available
Flexibility
- Be able to adapt quickly
  - Physically and Programmatically
- Find a way to watch the real usage
  - Not just what you think is happening
- Build flexibility (depth) into the environment
A Flexibility Story
Reliability
- Of Services
  - Hardware
  - Software
  - Raw Uptimes Over Past Year ~99.78%
  - Redundant BDII and MyOSG
- Of People
  - See Communication and Experience Slides
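As a rough check on what the ~99.78% raw uptime figure means in practice, the sketch below converts an uptime percentage into hours of downtime. It assumes a 365-day year and treats the figure as one aggregate number; the function name `downtime_hours` is illustrative and not part of any OSG tool.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def downtime_hours(uptime_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime implied by an uptime percentage.

    Hypothetical helper for illustration; the period defaults to one year.
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    # The ~99.78% value is the raw uptime quoted on the Reliability slide.
    print(f"{downtime_hours(99.78):.1f} hours of downtime per year")
```

Under those assumptions, 99.78% uptime corresponds to roughly 19 hours of downtime over the year.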
Experience
- No Substitute
- 20+ Years on Staff
- Over 9000 Tickets Resolved
- Let the Experience Show
- Enjoy the Ride
“Some people believe football is a matter of life and death. I'm very disappointed with that attitude. I can assure you it is much, much more important than that.” - Bill Shankly
GOOOOOOAAAAAAAAALLLLL!