HTCondor in Astronomy at NCSA

HTCondor in Astronomy at NCSA
Michael Johnson, Greg Daues, and Hsin-Fang Chiang
HTCondor Week 2019

The Dark Energy Survey
“The Dark Energy Survey (DES) is designed to probe the origin of the accelerating universe and help uncover the nature of dark energy by measuring the 14-billion-year history of cosmic expansion with high precision.”
Reference: http://map.gsfc.nasa.gov/media/060915/index.html

DES Collaboration
400+ scientists from 25 institutions in 7 countries

DES: Mapping the Sky
- DES is a 5.5-year optical survey covering 5,000 deg² of the southern sky
- Each section of the “footprint” is observed twice a year in 5 different filters
- Certain fields are revisited every ~10 nights to search for supernovae
The project is very similar to a high-energy physics experiment: the goal is to construct a large, uniform statistical dataset of observations and derived measurements that can be analyzed for physical phenomena. The way we process and analyze data draws on the HEP/open science community, including HTC.

DES: Instrumentation
- Blanco 4m Telescope at Cerro Tololo Inter-American Observatory, La Serena, Chile
- 570-Mpix, 62-CCD camera observes in 5-6 filters
- Sees 20x the area of the full Moon!
Reference (left): http://www.nasa.gov/centers/goddard/news/topstory/2007/noao_glast.html
Reference (right): https://www.noao.edu/image_gallery/html/im1047.html
Reference (bottom): Dark Energy Survey Collaboration

DES: Data Management
- Raw images streamed from Chile → Tucson → NCSA
- Images are cleaned and millions of stars and galaxies cataloged
- Over 18,000 images/night
- 1 TB raw data/night → 5 TB processed data/night
- Data are archived at NCSA and served to the collaboration for scientific analysis
Reference (globe): http://en.wikipedia.org/wiki/Americas#/media/File:Americas_(orthographic_projection).svg

Data Processing Operational Modes
DES runs in several processing modes:
- Nightly (within 24 hrs): initial processing to assess data quality, with feedback to the mountaintop
- Annually: latest-and-greatest calibrated processing over all prior data; the basis for internal and public data releases
- Difference imaging
Some level of processing (multiple pipelines) is always occurring at any given time. As we near the survey’s end, we are running new value-added processing pipelines.

DESDM Data Management System
The DESDM system combines a processing framework and a data management framework:
- Centralized job configuration and management
- Data movement orchestration
- Provenance and metadata collection (Open Provenance Model)
- Continual data annotation
- Data lifecycle management
The DESDM system allows the various simultaneous processing “campaigns” to be configured and managed. For a given campaign, we specify which data to process (via metadata query), which pipelines and configurations to use, where to archive the data, where to process the data, and what provenance to collect (a conceptual sketch follows below). We also manage the relative prioritization of campaigns and annotate outputs to identify the data used by downstream processes (e.g., QA, release preparation, data management activities). In short: centralized data management, distributed computing.
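As a rough, hypothetical illustration only (this is not DESDM’s actual configuration syntax; every key and value below is invented for clarity), a campaign specification conceptually captures these pieces:

    # Hypothetical campaign specification (illustrative, not DESDM's real format)
    campaign        = y6_finalcut_example
    data_query      = "exposures with nite between 20180901 and 20190228"   # which data (metadata query)
    pipeline        = finalcut                                              # which pipeline
    pipeline_config = finalcut_y6.config                                    # which configuration
    target_site     = illinois_campus_cluster                               # where to process
    archive_site    = ncsa_archive                                          # where to archive outputs
    provenance      = opm                                                   # what provenance to collect
    priority        = 50                                                    # relative campaign priority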

DES and Beyond
DES is scheduled to end around 2021, but DECam is still a world-class instrument and will continue to be used for many more years. We want to leverage our data management system for future needs:
- Processing public DECam data sets to complement and expand DES (DECADE)
- On-sky DECam follow-up for optical multi-messenger astronomy (MMA)
- As future surveys come online, can we use DECam as a follow-up instrument?
- Are there other programs/initiatives that can make use of our system and take advantage of the knowledge we have gained processing for DES?

DESDM Workflow & HTCondor Infrastructure
- Python wrapping HTCondor DAGMan submits
  - Nested DAG workflow for each unit (exposure, tile, etc.); see the sketch below
  - Numerous DAGs, no overarching workflow
  - Throttling issues for PRE/POST scripts
- Submit-side infrastructure
  - Separate central manager (collector, negotiator)
  - Two largish submit nodes (schedds)
  - Multi-schedd process configuration (similar to an OSG login node)
- File staging/transfer
  - No-shared-filesystem processing
  - Data staged in and out via curl/WebDAV
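A minimal sketch of what such a nested, per-unit DAG can look like (all node, file, and exposure names here are hypothetical, not the actual DESDM files). The PRE/POST load can be throttled with condor_submit_dag’s -maxpre/-maxpost options or the DAGMAN_MAX_PRE_SCRIPTS / DAGMAN_MAX_POST_SCRIPTS configuration knobs.

    # top-level DAG: one sub-DAG per processing unit (names illustrative)
    SUBDAG EXTERNAL unit_D00791113 unit_D00791113.dag
    SUBDAG EXTERNAL unit_D00791114 unit_D00791114.dag

    # unit_D00791113.dag: pipeline steps for a single exposure
    JOB crosstalk  crosstalk.sub
    JOB detrend    detrend.sub
    JOB catalog    catalog.sub
    PARENT crosstalk CHILD detrend
    PARENT detrend   CHILD catalog
    # stage data in and out around the unit (curl/WebDAV), per the no-shared-filesystem model
    SCRIPT PRE  crosstalk stage_in.sh  D00791113
    SCRIPT POST catalog   stage_out.sh D00791113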

DESDM HTCondor Infrastructure & Platforms
- Illinois Campus Cluster (https://campuscluster.illinois.edu); usage models: investor, RCaaS, etc.
  - DESDM as an investor: provisions ~32 nodes, ~900 cores, CentOS 7
  - The main ICCP uses a PBS scheduler; the DESDM nodes are managed separately
- DESDM Condor pool with partitionable slots
  - Compute jobs run on local scratch disk
  - Machine ads carry the processing type/campaign, so jobs of a species are sent to targeted nodes (e.g., to avoid defragmentation issues); see the configuration sketch below
  - Best for “realtime”, quick-turnaround processing
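A rough sketch of that targeting (the DES_CAMPAIGN attribute and its value are invented for illustration; the partitionable-slot settings are standard HTCondor configuration):

    # execute-node configuration: one partitionable slot spanning the whole machine
    SLOT_TYPE_1 = 100%
    SLOT_TYPE_1_PARTITIONABLE = TRUE
    NUM_SLOTS_TYPE_1 = 1

    # advertise a custom attribute in the machine ad (name and value illustrative)
    DES_CAMPAIGN = "nightly_realtime"
    STARTD_ATTRS = $(STARTD_ATTRS) DES_CAMPAIGN

    # job submit description: steer jobs of one species to matching nodes
    requirements = (TARGET.DES_CAMPAIGN == "nightly_realtime")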

DESDM HTCondor Infrastructure & Platforms
- Blue Waters computing system at NPCF
  - DESDM works with an Innovation and Exploration allocation
  - HTCondor glide-ins submitted through the PBS scheduler (see the sketch below)
  - The glide-in setup was a driver for the RSIP solution (general workflows)
  - HTCondor execute directories on the shared Lustre file system; scale constrained by the metadata server
- FermiGrid
  - HTCondor-CE: JobRouter to DES nodes; DES Virtual Organization
  - Software stacks in CVMFS: /cvmfs/des.opensciencegrid.org
  - DESDM software services: FHNW-Zurich
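A minimal sketch of a PBS-submitted glide-in of the kind described above (resource requests, paths, and the central-manager hostname are all hypothetical; _CONDOR_* environment variables are the standard way to override HTCondor configuration):

    #!/bin/bash
    #PBS -l nodes=1:ppn=32
    #PBS -l walltime=24:00:00
    # point the glide-in at a pre-staged HTCondor install and at the DESDM pool
    export CONDOR_CONFIG=$HOME/glidein/etc/condor_config
    export _CONDOR_CONDOR_HOST=desdm-cm.example.edu                 # central manager (hypothetical)
    export _CONDOR_DAEMON_LIST="MASTER, STARTD"                     # worker-only glide-in
    export _CONDOR_LOCAL_DIR=/scratch/$USER/glidein_$PBS_JOBID      # execute dirs on shared Lustre
    mkdir -p "$_CONDOR_LOCAL_DIR/execute" "$_CONDOR_LOCAL_DIR/log" "$_CONDOR_LOCAL_DIR/spool"
    exec $HOME/glidein/sbin/condor_master -f                        # run until the PBS walltime expires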

DESDM HTCondor Infrastructure & Platforms
- DESDM setup for use of the Open Science Grid
  - DESDM as an OSG project
  - Submit node with a flocking setup: FLOCK_TO = flock.opensciencegrid.org (see the sketch below)
  - Data Origin for utilizing the StashCache infrastructure
  - K8s worker node: OSG pods on the PRP Kubernetes cluster
  - Registered /cvmfs/desdm.osgstorage.org in the DES VO
- The DESDM setup serves as an OSG prototype for other efforts at NCSA
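A sketch of the submit-side flocking setup implied above (the FLOCK_TO value is from the slide; the project name is illustrative):

    # submit-node configuration: let idle jobs flock to the OSG pool
    FLOCK_TO = flock.opensciencegrid.org

    # job submit description: tag the OSG project for accounting (name illustrative)
    +ProjectName = "DESDM"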

DESDM HTCondor Infrastructure & Platforms
- Testing DESDM with HTCondor and condor_annex on AWS
  - Single-exposure test with the DESDM framework on an AWS EC2 instance; used a Singularity glide-in to the “production pool”
  - Testing condor_annex in a “personal condor” (see the sketch below)
    - Default HTCondor 8.6.x, Amazon Linux
    - Customized AMI with HTCondor 8.8.x, Amazon Linux 2 (eCryptfs issue)
  - Still need to examine attaching an annex to the “production pool” / running HTCondor as root
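For reference, a minimal condor_annex exercise of the sort being tested might look like this (annex name, instance count, and lease length are illustrative):

    condor_annex -setup                                        # one-time AWS account/credential setup
    condor_annex -count 4 -annex-name desdm-test -duration 8   # request 4 instances for roughly 8 hours
    condor_status -annex desdm-test                            # verify the annex slots joined the pool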

Large Synoptic Survey Telescope
- A dedicated 10-year Deep-Wide-Fast survey
- LSST vs. DES: 2x the mirror size, 5x the pixels, 4x larger area; LSST can obtain “DES” in 1.5 months
- Covers the full sky every 3-4 nights
- Open data, open source
- Science operations start in 2023
- ~200,000 images per night; ~20 TB of raw data per night
- 60 PB of raw image data; 500 PB of final image data

Large Synoptic Survey Telescope: Data Processing & Workflow Management
- 11 public data releases
- Proof-of-concept with the DESDM system; customization of the DESDM system would be needed for LSST
- Proof-of-concept with HTCondor + Pegasus on AWS (see the sketch below)
  - Exploration just started this month
  - Plan to use HTCondor Annex
  - Plan to use S3 storage
  - Decisions not finalized
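As a rough illustration of that proof-of-concept direction (the workflow file, site name, and annex parameters below are hypothetical, and the exact Pegasus options may differ by version), planning and submitting a Pegasus workflow onto a pool grown with condor_annex could look like:

    # grow the pool with AWS instances via condor_annex (parameters illustrative)
    condor_annex -count 10 -annex-name lsst-poc -duration 12

    # plan and submit a Pegasus workflow onto that pool (Pegasus 4.x-style options)
    pegasus-plan --conf pegasus.properties \
                 --dax lsst_poc_workflow.dax \
                 --sites condorpool \
                 --dir runs \
                 --submit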