OSG Production Report OSG Area Coordinator’s Meeting Nov 17, 2010 Dan Fraser.


Overall Production

OSG Display
OSG delivered across 80 sites (status at 11:50 AM)

In the last 24 hours:
  732,000 jobs
  1,114,000 CPU hours
  121,000 transfers
  805 TB transferred

In the last 30 days:
  13,088,000 jobs
  36,099,000 CPU hours
  50,860,000 transfers
  29,726 TB transferred

In the last year:
  152,016,000 jobs
  332,227,000 CPU hours
  493,663,000 transfers
  204,653 TB transferred
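As a rough way to compare the three reporting windows, the figures on this slide can be turned into per-job and per-day averages. The sketch below only restates the slide's numbers; the dictionary layout and the function names are illustrative assumptions, and treating "the last year" as 365 days is also an assumption.

```python
# Figures copied from the OSG Display slide (status at 11:50 AM).
# The structure and names below are illustrative, not part of any OSG tool.
USAGE = {
    "24 hours": {"days": 1, "jobs": 732_000, "cpu_hours": 1_114_000, "tb": 805},
    "30 days": {"days": 30, "jobs": 13_088_000, "cpu_hours": 36_099_000, "tb": 29_726},
    # Assumption: "the last year" is treated as 365 days.
    "year": {"days": 365, "jobs": 152_016_000, "cpu_hours": 332_227_000, "tb": 204_653},
}


def cpu_hours_per_job(period):
    """Average CPU hours consumed per job in the given window."""
    row = USAGE[period]
    return row["cpu_hours"] / row["jobs"]


def daily_average(period, key):
    """Average value per day of `key` (jobs, cpu_hours or tb) in the window."""
    row = USAGE[period]
    return row[key] / row["days"]


if __name__ == "__main__":
    for period in USAGE:
        print(
            f"{period}: {cpu_hours_per_job(period):.2f} CPU hours/job, "
            f"{daily_average(period, 'jobs'):,.0f} jobs/day, "
            f"{daily_average(period, 'tb'):,.0f} TB/day"
        )
```

Read this way, the 24-hour snapshot can be compared directly against the 30-day and yearly daily averages to see whether the day shown was above or below the longer-term rate.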

Some Production Examples…
Effort from the entire team
  Tracking and solving BDII Issues
  One BDII can no longer handle the entire load
  Plan to add more BDIIs into round-robin
  Plan to upgrade to BDII v5
  Stress testing; consistency verification; VM testing (problems)
  Patched by GOC
  General CERN BDII Instabilities (often detected by OSG)
  BNL dropping out of CERN BDII
  Turned out to be a GigaPOP issue at Indiana
  SL5 Kernel vulnerability patch
  GOC now using Puppet for s/w management
  Atlas critical alarm against T1 dCache system
  Verification of alarm process
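The plan to add more BDIIs into round-robin amounts to spreading queries evenly over several servers instead of sending them all to one. The sketch below illustrates that idea in general terms only. The host names are hypothetical placeholders, and the slide does not say how the rotation is actually implemented (for example in DNS or in a client), so this is not a description of the OSG setup.

```python
from itertools import cycle

# Hypothetical endpoint names for illustration only; not real OSG hosts.
BDII_ENDPOINTS = ["bdii1.example.org", "bdii2.example.org", "bdii3.example.org"]


class RoundRobin:
    """Hand out endpoints in a fixed rotating order."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._rotation = cycle(endpoints)

    def next_endpoint(self):
        """Return the next endpoint in the rotation."""
        return next(self._rotation)


def distribute(query_count, endpoints):
    """Count how many of `query_count` queries each endpoint would receive."""
    selector = RoundRobin(endpoints)
    counts = {endpoint: 0 for endpoint in endpoints}
    for _ in range(query_count):
        counts[selector.next_endpoint()] += 1
    return counts


if __name__ == "__main__":
    for endpoint, count in distribute(10, BDII_ENDPOINTS).items():
        print(endpoint, count)
```

With three endpoints, each one sees roughly a third of the queries, which is the load relief the slide is after when a single BDII can no longer keep up.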

Updated View from Production
New VOs can quickly come up to speed (with handholding)
  LSST capable of getting >60k hours/day
Pilot factory running at SDSC now supporting multiple VOs
  CMS, HCC, SBGrid/NEBioGrid
  GlueX, IceCube, Glow (setup but not really used yet)
Opportunistic storage is the #1 problem
  A very difficult problem
  No OSG solution on the horizon.
  CMS and Atlas experimenting with an Xrootd based data access strategy using “transparent remote streaming and data caching” to create a more “global” system (Brian)
  It could eventually have some implications for OSG …
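The phrase “transparent remote streaming and data caching” describes a read-through pattern: a job asks for data by name, a local copy is used if one exists, and otherwise the data is fetched from a remote site and kept for later reads. The sketch below shows only that general pattern. It does not use the Xrootd API, and the class, the function and the file names in it are all hypothetical.

```python
class ReadThroughCache:
    """Serve reads from a local cache, fetching from a remote source on a miss."""

    def __init__(self, fetch_remote):
        # fetch_remote is any callable that returns the bytes for a given name.
        self._fetch_remote = fetch_remote
        self._local = {}
        self.remote_reads = 0

    def read(self, name):
        """Return the data for `name`, contacting the remote source only once."""
        if name not in self._local:
            self._local[name] = self._fetch_remote(name)
            self.remote_reads += 1
        return self._local[name]


# Hypothetical stand-in for a remote site's storage.
REMOTE_STORE = {"/store/run1/events.dat": b"event data"}


def fetch_from_remote(name):
    """Pretend to stream a file from a remote site."""
    return REMOTE_STORE[name]


if __name__ == "__main__":
    cache = ReadThroughCache(fetch_from_remote)
    cache.read("/store/run1/events.dat")  # miss: fetched remotely
    cache.read("/store/run1/events.dat")  # hit: served locally
    print("remote reads:", cache.remote_reads)
```

The caller never needs to know where the data lives, which is the sense in which such a scheme makes storage look more “global” to jobs.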