OSG Operations All Hands Meeting Rob Quick (Ops Coordinator) Slides by: Scott Teige and Kyle Gross.


March 2011

Support Overview
- Communications Hub
- Coordinate Ticketing & Exchanges
- End-user Support
- OSG RA
- Documentation

Communications Hub
- 24x7 Telephone – x7 – 24x7 Ticket Creation
  - Leverage the 24 hour coverage of the GRNOC at IU
- Community Notification Tools
  - Blogspot postings, Twitter and RSS feed
  - Twitter: OSGGOC (test)
- Weekly Operations Meeting
  - Mondays

Ticketing & Ticket Exchange
- Central OSG Ticket System
- GOCTicket interface
- Ticket Exchange – SC, GGUS, GOC-TX
- 10,000 ticket milestone – 2/22/2011

End User Support
- OIM Registration
- VOMS (MIS, OSGEDU, CSIU)
- Certificate Requests
- Twiki Support

OSG RA
- Alain Deximo as new OSG RA
- Updating Procedures/Docs for effective backup
- Other than new POC (Alain), transparent to users

Documentation
- Work with OSG Documentation Team
  - Help them with Twiki setup
  - Cleaning up Operations Docs

Service Overview
- Information Services
  - Information to people
  - Information to machines
- Accounting Services
- Monitoring Services
- Collaborative Services

MyOSG

Display

OIM
- Open Science Grid Information Management
- Semi-static information to people and machines
- Find contacts, VO information, resources, much more

BDII
- Berkeley Database Information Index
- Mostly provides information to machines
  - Some information to people
- Most critical service for GOC
- Dynamic information, ~2 minute period
- Many services depend on BDII

Ticket
- Don’t get stuck, cut a ticket
- Ticket Exchange
  - The GOC ticketing system interacts with other support organizations’ ticket systems via the ticket exchange.
  - Allows seamless interaction among multiple ticket systems, so that they appear to behave as one system.

RSV
- Resource and Service Validation

WLCG Comparison
- An accounting service
- Some OSG resources are also WLCG resources
- Separate accounting systems

Software Cache
- Pointers to VDT software
- Certificate Authority Distribution
- VO package
- Certificate requests

xxx-ITB
- Ditto above but for testing
- 1st and 3rd Tuesdays: updates to ITB
  - You are encouraged to test services, particularly those of interest to you
- 2nd and 4th Tuesdays: updates to Prod.
- 5th Tuesday: the GOC rests.
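The Tuesday rotation above is simple enough to compute. What follows is a minimal sketch, not GOC tooling: the function name `update_target` and the returned labels are hypothetical, and the only thing taken from the slide is the rule itself (1st/3rd Tuesday to ITB, 2nd/4th to production, nothing on a 5th Tuesday).

```python
import datetime


def update_target(day):
    """Return which environment is updated on `day` under the slide's rule.

    Returns "ITB", "Production", "rest" (5th Tuesday) or "none" (not a Tuesday).
    The labels are illustrative, not official names.
    """
    if day.weekday() != 1:  # datetime: Monday is 0, so Tuesday is 1
        return "none"
    nth = (day.day - 1) // 7 + 1  # which Tuesday of the month this is (1..5)
    if nth in (1, 3):
        return "ITB"
    if nth in (2, 4):
        return "Production"
    return "rest"


if __name__ == "__main__":
    # March 2011 happens to have five Tuesdays: 1, 8, 15, 22 and 29.
    for d in (1, 8, 15, 22, 29):
        date = datetime.date(2011, 3, d)
        print(date.isoformat(), update_target(date))
```

Counting the Tuesday from the day of the month (`(day - 1) // 7 + 1`) avoids walking the calendar and works for any month length.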

Change Management and Ops Meetings
- Change Management Review
  - Tuesdays
  - ngeMgmtMeetingMinutes

Recap from the Ops Coordinator
- 15 Minutes
- Sustainability

“Yet, in spite of these spectacular strides in science and technology, and still unlimited ones to come, something basic is missing… We have learned to fly the air like birds and swim the sea like fish, but we have not learned the simple art of living together as brothers.” -MLK

Three things you’ve just gotta know about the VDT (And Frank) Alain Roy Open Science Grid Software Coordinator

But first a poem

I have a flower on my head
By Andrea Roy

I have a Flower on my head
What should I do?
Should I water it?
I think so.

The three things you just gotta know about the VDT
1. RSV is way cooler
2. RPMs for the VDT are on the way
3. CREAM is coming to the VDT soon

1. RSV is way cooler

As of the February 7th OSG release, RSV is just so much cooler for three main reasons:
1. Common RSV tasks are made simple with the new rsv-control command.
2. It is really easy to extend RSV with new probes.
   - If you can write a script to test something, you can put it into RSV.
   - Is there something else you’d like to test?
3. Standalone installations are much easier (with config.ini).
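To give a feel for what "a script to test something" might look like, here is a minimal sketch of a probe-style script. It is not the actual RSV probe interface. The record layout only mirrors the fields that `rsv-control --run` prints on the later slides (`metricName` through `detailsData`, terminated by `EOT`). The metric name `org.example.local.scratch-writable`, the `CRITICAL` status label, the ISO timestamp format, and both function names are assumptions made for illustration.

```python
import datetime
import os
import socket


def check_writable(path):
    """Hypothetical check: report whether `path` is writable by this user."""
    if os.access(path, os.W_OK):
        return "OK", path + " is writable"
    # "CRITICAL" is an assumed status label; the slides only show "OK".
    return "CRITICAL", path + " is not writable"


def format_record(metric_name, status, details, host=None, now=None):
    """Format one result using the field names seen in rsv-control output."""
    host = host or socket.getfqdn()
    now = now or datetime.datetime.now(datetime.timezone.utc)
    lines = [
        "metricName: " + metric_name,
        "metricType: status",
        "timestamp: " + now.isoformat(),  # timestamp format is an assumption
        "metricStatus: " + status,
        "serviceType: OSG-CE",
        "serviceURI: " + host,
        "gatheredAt: " + host,
        "summaryData: " + status,
        "detailsData: " + details,
        "EOT",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    status, details = check_writable("/tmp")
    print(format_record("org.example.local.scratch-writable", status, details))
```

The test and the formatting are kept in separate functions so the check can be swapped for anything else worth testing without touching the output code. How such a script is registered with RSV is not covered on the slides and is not shown here.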

Easy to list your RSV probes!

    % rsv-control --list
    Metrics enabled for host: osg-edu.cs.wisc.edu:10443 | Service
    org.osg.srm.srmcp-readwrite | OSG-SRM
    org.osg.srm.srmping | OSG-SRM

    Metrics enabled for host: osg-edu.cs.wisc.edu | Service
    org.osg.batch.jobmanager-default-status | OSG-CE
    org.osg.batch.jobmanagers-available | OSG-CE
    org.osg.certificates.cacert-expiry | OSG-CE
    ...
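If you want to feed a listing like this into your own tooling, a small parser is enough. The sketch below assumes the layout is what the slide suggests: a `Metrics enabled for host:` header per host, followed by `metric | service` rows. The function name `parse_metric_list` is hypothetical, and real output may carry extra decoration (such as separator lines) that this code simply skips.

```python
HEADER = "Metrics enabled for host:"

# Sample text copied from the slide; the trailing "..." is the slide's own elision.
SAMPLE = """\
Metrics enabled for host: osg-edu.cs.wisc.edu:10443 | Service
org.osg.srm.srmcp-readwrite | OSG-SRM
org.osg.srm.srmping | OSG-SRM
Metrics enabled for host: osg-edu.cs.wisc.edu | Service
org.osg.batch.jobmanager-default-status | OSG-CE
org.osg.batch.jobmanagers-available | OSG-CE
org.osg.certificates.cacert-expiry | OSG-CE
...
"""


def parse_metric_list(text):
    """Map each host to a list of (metric, service) pairs."""
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(HEADER):
            # The host sits between the header prefix and the column bar.
            current = line[len(HEADER):].split("|")[0].strip()
            hosts[current] = []
        elif current is not None and "|" in line:
            metric, service = (part.strip() for part in line.split("|", 1))
            if metric:
                hosts[current].append((metric, service))
        # Lines without a column bar (blank lines, "...") are ignored.
    return hosts


if __name__ == "__main__":
    for host, metrics in parse_metric_list(SAMPLE).items():
        print(host, len(metrics), "metrics")
```

Slicing the header by its known prefix, rather than splitting on `:`, keeps host names that carry a port (`osg-edu.cs.wisc.edu:10443`) intact.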

Easy to see the RSV jobs!

    % rsv-control --job-list
    Hostname: osg-edu.cs.wisc.edu
    ID OWNER ST NEXT RUN TIME METRIC
    rsv I :08 org.osg.globus.gridftp-simple
    rsv I :32 org.osg.gip.lastrun
    rsv R :47 org.osg.general.vdt-version
    ...

    Hostname: osg-edu.cs.wisc.edu:10443
    ID OWNER ST NEXT RUN TIME METRIC
    rsv I :33 org.osg.srm.srmping
    rsv R :28 org.osg.srm.srmcp-readwrite

    ID OWNER ST CONSUMER
    rsv R html-consumer
    rsv R gratia-consumer

Easy to enable/disable RSV probes!

    % rsv-control --enable --host osg-edu.cs.wisc.edu \
        org.osg.ress.ress-classad-exists
    Enabling metric 'classad-exists' for host 'osg-edu.cs.wisc.edu'

    One or more metrics have been enabled and will be started the next time RSV is started. To turn them on immediately run 'rsv-control --on'.

Easy to run a probe right now!

    % rsv-control --run --host osg-edu.cs.wisc.edu org.osg.general.osg-version
    Running metric org.osg.general.osg-version:
    metricName: org.osg.general.osg-version
    metricType: status
    timestamp: :24:42 CST
    metricStatus: OK
    serviceType: OSG-CE
    serviceURI: osg-edu.cs.wisc.edu
    gatheredAt: osg-edu.cs.wisc.edu
    summaryData: OK
    detailsData: OSG
    EOT

Easy to run all probes to refresh

    % rsv-control --run --all-enabled
    Running metric org.osg.certificates.cacert-expiry (1 of 24)
    metricName: org.osg.certificates.cacert-expiry
    metricType: status
    timestamp: :40:40 CST
    metricStatus: OK
    serviceType: OSG-CE
    serviceURI: osg-edu.cs.wisc.edu
    gatheredAt: osg-edu.cs.wisc.edu
    summaryData: OK
    detailsData: Security Probe Version: 1.1
    OK: CAs are in sync with OSG distribution
    EOT
    Running metric org.osg.general.osg-directories-CE-permissions (2 of 24)
    ...
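When all probes run at once, the output is a stream of `key: value` records, each closed by `EOT`. The sketch below shows one way to split such a stream into records and pick out anything that is not `OK`. The field names come from the slide. The function names `parse_records` and `failing`, and the handling of multi-line `detailsData`, are assumptions about output that is only partly visible in the transcript. The sample omits the `timestamp` line because its value is incomplete in the source.

```python
KNOWN_KEYS = (
    "metricName", "metricType", "timestamp", "metricStatus", "serviceType",
    "serviceURI", "gatheredAt", "summaryData", "detailsData",
)

SAMPLE = """\
Running metric org.osg.certificates.cacert-expiry (1 of 24)
metricName: org.osg.certificates.cacert-expiry
metricType: status
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: osg-edu.cs.wisc.edu
summaryData: OK
detailsData: Security Probe Version: 1.1
OK: CAs are in sync with OSG distribution
EOT
Running metric org.osg.general.osg-directories-CE-permissions (2 of 24)...
"""


def parse_records(text):
    """Split rsv-control --run style output into a list of dicts."""
    records, current, in_details = [], {}, False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "EOT":  # end-of-record marker
            if current:
                records.append(current)
            current, in_details = {}, False
            continue
        key, sep, value = line.partition(":")
        if not in_details and sep and key in KNOWN_KEYS:
            current[key] = value.strip()
            in_details = key == "detailsData"
        elif in_details:
            # detailsData may continue over several lines until EOT.
            current["detailsData"] = (current["detailsData"] + "\n" + line).strip()
        # Anything else (for example "Running metric ..." banners) is ignored.
    if current:  # tolerate a final record with no closing EOT
        records.append(current)
    return records


def failing(records):
    """Return the records whose metricStatus is anything other than OK."""
    return [r for r in records if r.get("metricStatus") != "OK"]


if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        print(rec["metricName"], rec["metricStatus"])
```

Once `detailsData` has been seen, every following line is treated as free text until `EOT`, so detail lines that themselves contain a colon (such as `OK: CAs are in sync ...`) are not mistaken for new fields.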

Straightforward to get debugging info

    % rsv-control --verify
    Testing if Condor-Cron is running... OK
    Testing if metrics are running... OK (24 running metrics)
    Testing if consumers are running... OK (2 running consumers)
    Checking which consumers are configured...
    The following consumers are enabled: html-consumer gratia-consumer

    % rsv-control --profile
    Running the rsv-profiler...
    OSG-RSV Profiler
    Analyzing...
    Making tarball (rsv-profiler.tar.gz)

And now a slight detour: Frank
- Frank [ last-name removed ]
- Wrote some code for Condor that “worked”.
- But he meant: Works == Compiles
- A common mistake for beginners, so we won’t hold it against him.
- But it’s a useful indication of progress: a lot has been done, but it requires more before you can test it.

2. RPMs for the VDT are on the way

We have franked binary RPMs without configuration for:
- gLExec
  - (Actually, they’ve been tested pretty well)
- Xrootd
- 95% of the worker node (56/59 RPMs)
  - Currently missing: FTS client

They are in a yum repo and will be available for testing soon.

3. CREAM is coming to the VDT soon
- Basic CREAM install via Pacman
  - Currently franks, but known problems
  - End of March
- CREAM install via RPMs
  - End of April
- And then a period of testing/finalizing
- Ready for production by September
- Timeline driven by ATLAS needs

I’m happy if you leave with those three things
1. RSV is way cooler
2. RPMs for the VDT are on the way
3. CREAM is coming to the VDT soon

But I’ll say two more things:

Two More Things
- Plan for next round of OSG:
  - Do RPMs right: source packages, intermix with external dependencies neatly…
  - Community-oriented distributions
- We are getting better about collecting accurate requirements and reporting work plans/time lines

But wait! There’s more!
- The Second Annual OSG Summer School!
  - June 26-30, 2011
  - Learn about high-throughput computing, OSG, and more!
  - Tell anyone that would be interested, spread the word!
  - ucation/OSGSummerSchool

Any Questions?
- I’m here until Thursday; please come and talk to me.
- Or me: