Feb 23, 2011 1/22 LSST Simulations on OSG
Gabriele Garzoglio, for the OSG Task Force on LSST; Computing Division, Fermilab
Overview: The Open Science Grid (OSG); The Large Synoptic Survey Telescope (LSST); OSG User Support and LSST; LSST Image Simulations on OSG; Operational Experience

Feb 23, 2011 2/22 LSST Simulations on OSG
Acknowledgments
Bo Xin (Purdue) – LSST simulations on OSG and application integration
Parag Mhashilkar (Fermilab) – LSST application integration on OSG
John Peterson (Purdue) – responsible for LSST production simulation
Brian Bockelman, Derek Weitzel (UNL) – job management infrastructure administrators
Ian Shipsey (Purdue) – LSST management / LSST slides
Ruth Pordes (Fermilab) – OSG management / OSG slides

Feb 23, 2011 3/22 LSST Simulations on OSG
The Components of the Open Science Grid (OSG): Consortium, Project, Collaboration (image: OSG All Hands Meeting)
Mission: The Open Science Grid aims to promote discovery and collaboration in data-intensive research by providing a computing facility and services that integrate distributed, reliable and shared resources to support computation at all scales.

Feb 23, 2011 4/22 LSST Simulations on OSG
The Architecture of the Enterprise: Virtual Organizations; Common Services & Software; Resources, Sites, & Campuses

Feb 23, 2011 5/22 LSST Simulations on OSG
Resources, Sites & Campuses: interfaces to clusters (small to large), 55 interfaces to data caches, 8 campus "grids".
Every site has an OSG proponent and collaborators in place. All sites support the owner VO, the OSG-management VOs, and at least one other VO.

Feb 23, 2011 6/22 LSST Simulations on OSG
Regional Communities: mainly particle physics, many non-physics, several multi-disciplinary.

Feb 23, 2011 7–10/22 LSST Simulations on OSG (slides 7–10: graphics only, no transcript text)

Feb 23, 2011 11/22 LSST Simulations on OSG
Full LSST end-to-end photon simulation. Collaborators: Berkeley, Purdue, U. Washington, SLAC.
The simulation models: cosmological models; galaxy spatial models & spectra; atmosphere; optics; detector.
In 1 simulated image: 189 chip images, 3 billion pixels, 12 million objects, billions of ray-traced objects.

Feb 23, 2011 12/22 LSST Simulations on OSG
Simulation Stages (diagram; no transcript text)

Feb 23, 2011 13/22 LSST Simulations on OSG
OSG User Support for LSST
OSG and LSST are working together to enable LSST applications to run on OSG; the initial focus is on simulations.
OSG bootstraps communities with a "phased approach" based on well-scoped, focused task forces.
The final goal of the activity is to empower the community to manage its computing activity using OSG services and platforms.
Phased approach (Phase: Preparation / Jobs / Data / Job Management Effort / Docs / Support / VO Services Responsibility):
Proof-of-Principle: 1 month / O(100) jobs / O(GB) data / OSG staff for 1 day / basic docs / dedicated support / led by OSG with VO contributions (job submission, user registration, monitoring, ...)
Production Demo: a few months / O(100,000) jobs / O(TB) data / VO staff for 1 month / production-like docs / dedicated support / led by OSG with more VO contributions
Production: several months / production goals (jobs and data) / VO staff for production goals / production-quality docs / towards standard methods of support / led by the VO with support from OSG

Feb 23, 2011 14/22 LSST Simulations on OSG
History
Feb 2010 – Proof of principle: 1 image simulated on OSG
Jun 2010 – The OSG Executive Board forms a task force for production scale: 500 image pairs, i.e. 1 night of observation
Jul 2010 – Commissioning of the LSST submission system on OSG: 1 person produced the same image 183 times in 1 day
Sep 2010 – The LSST operator (Bo Xin) ran operations to produce 529 pairs
Oct 2010 – LSST validated the results
Nov 2010 – Feb 2011 – Preparation for the production phase

Feb 23, 2011 15/22 LSST Simulations on OSG
Workflow Requirements
The LSST simulation of 1 image consists of 189 trivially parallel jobs, one per chip.
Input to the workflow:
– SED catalog and focal-plane configuration files: 15 GB uncompressed, pre-installed at all sites
– Instance catalog (SED files + wind speed, etc.): 500 MB compressed per image pair
Workflow (a sketch of the corresponding DAG follows below):
– Trim the catalog file into 189 chip-specific files
– Submit 2 x 189 jobs: 1 image pair (the same image with 2 exposures)
Output: 2 x 189 FITS files, 25 MB each compressed
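Since the submission system uses a DAG submitter feeding a Condor scheduler (see the architecture slide), the per-pair workflow can be pictured as a small HTCondor DAG: one trim node followed by 2 x 189 independent chip-simulation nodes. The following is only a minimal sketch of that structure, not the actual LSST submission tooling; the file and node names (trim.sub, simchip.sub, pair_0001.dag) are hypothetical.

```python
#!/usr/bin/env python
# Minimal sketch: express one image pair as an HTCondor DAGMan file with a
# single "trim" node (splits the instance catalog into 189 chip-specific files)
# and 2 x 189 simulation nodes (2 exposures x 189 chips) that depend on it.
# All file names below are illustrative, not the real LSST scripts.

N_CHIPS = 189
N_EXPOSURES = 2

def write_pair_dag(pair_id, path_template="pair_%04d.dag"):
    lines = ["JOB trim trim.sub",
             'VARS trim pair="%04d"' % pair_id]
    for exp in range(N_EXPOSURES):
        for chip in range(N_CHIPS):
            node = "sim_e%d_c%03d" % (exp, chip)
            lines.append("JOB %s simchip.sub" % node)
            lines.append('VARS %s pair="%04d" exposure="%d" chip="%03d"'
                         % (node, pair_id, exp, chip))
            # every simulation node waits for the single trim node
            lines.append("PARENT trim CHILD %s" % node)
    with open(path_template % pair_id, "w") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    write_pair_dag(1)   # writes pair_0001.dag with 1 + 378 nodes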

Feb 23, 2011 16/22 LSST Simulations on OSG
Production by Numbers
Goal: simulate 1 night of LSST data collection: 500 image pairs.
~200k simulation jobs (1 chip at a time), plus the trim jobs.
Assume 4 hours / job for trim and simulation (an over-estimate): ~800,000 CPU hours.
Assume 2000 jobs running continuously: ~50,000 CPU hours / day, so ~17 days to complete (not counting failures).
12,000 jobs / day, i.e. 31 image pairs / day.
50 GB / day of input files moved (different for every job); 300 GB / day of output.
Total number of files = 400,000 (50% input, 50% output).
Total compressed output = 5.0 TB (25 MB per job).
(A back-of-the-envelope check of these numbers follows below.)
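The throughput estimate on this slide follows from a few lines of arithmetic. The snippet below just reproduces that arithmetic from the quoted assumptions (4 h/job, 2000 concurrently running jobs, 25 MB per output file); it is a sanity check, not part of the production system.

```python
# Back-of-the-envelope check of the "Production by Numbers" slide.
pairs             = 500                               # image pairs for 1 night
chips             = 189                               # chips per image
exposures         = 2                                 # exposures per pair
sim_jobs          = pairs * exposures * chips         # ~189,000 ("200k") simulation jobs
hours_per_job     = 4                                 # conservative over-estimate
cpu_hours         = sim_jobs * hours_per_job          # ~756,000 (~800,000 with trim jobs)
running_slots     = 2000                              # jobs running continuously
cpu_hours_per_day = running_slots * 24                # ~48,000 (~50,000) CPU hours / day
days_to_complete  = cpu_hours / cpu_hours_per_day     # ~16-17 days, ignoring failures
jobs_per_day      = cpu_hours_per_day / hours_per_job # ~12,000 jobs / day
pairs_per_day     = jobs_per_day / (exposures * chips)  # ~31 image pairs / day
output_tb         = sim_jobs * 25 / 1e6               # 25 MB per job -> ~5 TB compressed

print(f"{sim_jobs=} {cpu_hours=} {days_to_complete=:.1f} "
      f"{pairs_per_day=:.0f} {output_tb=:.1f}")
```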

Feb 23, 2011 17/22 LSST Simulations on OSG
Architecture (diagram). Labels shown: Condor Collector; OSG Glidein Factory; glideins; Grid Site CE; VO Frontend; worker nodes (WN); info and job & data flows; local disk; Condor Scheduler; DAG Submitter; user interface (create / submit / monitor); Hadoop storage. Binaries and SED files are pre-installed at sites via OSG MM.

Feb 23, 2011 18/22 LSST Simulations on OSG
Resource Utilization
By Sep 3, 150 pairs had been produced in 5 days using 13 sites.
(Plots: frontend status, jobs & glideins; Gratia resource-utilization; 150 pairs produced.)

Feb 23, 2011 19/22 LSST Simulations on OSG
Typical Workflow Statistics (chart; no transcript text)

Feb 23, 2011 20/22 LSST Simulations on OSG
Operational Challenges
The LSST binaries were not Grid-ready:
– the application assumed a writable software distribution;
– the application assumed path lengths too short for those encountered on the Grid;
– the orchestration script did not exit with a failure code upon error (this required manual recovery until fixed; a generic wrapper sketch follows below).
Typical failures at sites:
– jobs required more memory than the batch system allotted;
– storage was unavailable due to maintenance at some of the most productive sites.
Limited disk quota on the submission machine.
After the production of 150 pairs, the operator was mostly traveling and had limited time to dedicate to operations.
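The "script did not exit with failure upon error" issue is a generic pitfall for grid job wrappers: if the wrapper swallows a child's exit code, the batch system reports success and failed chips must be found by hand. The sketch below is a minimal, hypothetical wrapper showing exit-code propagation; the step scripts (trim_catalog.sh, simulate_chip.sh) are placeholder names, not the real LSST orchestration script.

```python
#!/usr/bin/env python
# Hypothetical wrapper sketch: run each step of a chip-simulation job and make
# sure any failure is propagated to the batch system via a non-zero exit code,
# so failed jobs are detected automatically instead of recovered manually.
import subprocess
import sys

def run_step(cmd):
    result = subprocess.run(cmd)
    if result.returncode != 0:
        # surface the failure instead of swallowing it
        print(f"step failed ({result.returncode}): {' '.join(cmd)}", file=sys.stderr)
        sys.exit(result.returncode)

if __name__ == "__main__":
    run_step(["./trim_catalog.sh"] + sys.argv[1:])    # placeholder step
    run_step(["./simulate_chip.sh"] + sys.argv[1:])   # placeholder step
```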

Feb 23, 2011 21/22 LSST Simulations on OSG
Validation: comparison with PT1
We compared 6 image pairs (2268 chips) with the "official" reference images (PT1).
99% of the chip images are identical (the pixel-by-pixel subtraction is consistently 0).
14 chips are identical except for a few pixels: a negligible difference.
Images produced on OSG are reproducible. (A sketch of the pixel-subtraction check follows below.)
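The pixel-subtraction check described here can be expressed in a few lines. This is only a sketch under the assumption that each chip is a single-HDU FITS image readable with astropy; the file names are placeholders, not the LSST naming scheme used in the actual validation.

```python
# Sketch of the validation check: subtract an OSG-produced chip image from the
# corresponding PT1 reference chip and verify the difference is zero everywhere.
import numpy as np
from astropy.io import fits

def chips_identical(osg_file, reference_file):
    osg = fits.getdata(osg_file)
    ref = fits.getdata(reference_file)
    diff = osg.astype(np.int64) - ref.astype(np.int64)
    return int(np.count_nonzero(diff)) == 0   # True if pixel subtraction is all zeros

if __name__ == "__main__":
    # placeholder file names
    print(chips_identical("osg_chip_0001.fits", "pt1_chip_0001.fits"))
```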

Feb 23, 2011 22/22 LSST Simulations on OSG
Conclusions
The Open Science Grid promotes data-intensive research by providing a computing facility and services that integrate distributed, reliable and shared resources.
The Large Synoptic Survey Telescope (LSST) will capture 1000 panoramic sky images each night with a rapid-fire 3.2-gigapixel camera, covering the sky twice per week.
OSG User Support and LSST are working together to run Image Simulation and Data Management applications on OSG.
One LSST person, Bo Xin, with expert support, simulated 1 night of observations (529 image pairs) in 4 weeks, using 50,000 CPU hours / day.
This project has demonstrated that OSG is a valuable platform for simulating LSST images.