The Dark Energy Survey Middleware
LSST Workflow Workshop, 09/2010
Michelle Gower, NCSA, on behalf of the DES Data Management Team and the DES Collaboration
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

DES Data Management
Over the survey's observing run: 200 TB of raw data, 4 PB of data products, a 100 TB object catalog, and 120+ CPU years per year of processing.
Processing includes:
- Removing the instrument signature
- Removing artifacts (e.g. airplanes)
- Calibrating/registering
- Feature detection
- Feature analysis
- Monitoring/quality assessment

Requirements
- Level of parallelism changes through processing
- Easily modified by scientists
- Local and remote clusters
- Work with the project's Archive system
- Monitoring of jobs while they're running
- Less research, more production

Processing Framework Overview
[Architecture diagram: Archive Nodes, Target (HPC) Machines, Notification Service, Database, Pipelines of AppModules, Event Monitor, Archive Portal, Orchestration, DAF]

Middleware
- Workflow: Condor DAGMan
- Job submission: Condor-G to pre-WS GRAM for TeraGrid resources; Condor (vanilla jobs) for local machines
- File transfer: GridFTP, using the uberftp and globus-url-copy clients
- Runtime monitoring: Elf/OgreScript
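
As a purely illustrative sketch of this submission layer (not the DES code), the Python snippet below writes the kind of submit description DAGMan hands to Condor: grid universe with a pre-WS GRAM (gt2) resource string for a TeraGrid site, or vanilla universe for a local cluster. The host name, jobmanager, and file names are hypothetical placeholders.

# Minimal sketch: write a Condor submit description for one module job.
# The gt2 endpoint, jobmanager, and file names are hypothetical placeholders.
def write_submit(path, executable, arguments, remote=True):
    lines = []
    if remote:
        lines.append("universe      = grid")
        # pre-WS GRAM (gt2) endpoint on a hypothetical TeraGrid login node
        lines.append("grid_resource = gt2 tg-login.example.org/jobmanager-pbs")
    else:
        lines.append("universe      = vanilla")  # local cluster jobs
    lines += [
        f"executable    = {executable}",
        f"arguments     = {arguments}",
        f"output        = {executable}.out",
        f"error         = {executable}.err",
        f"log           = {executable}.log",
        "queue",
    ]
    with open(path, "w") as out:
        out.write("\n".join(lines) + "\n")

write_submit("crosstalk.submit", "crosstalk.pl", "-verbose 1")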

Example: Module Definition
output_dir = ${archive_root}/${run_dir}/raw
exec = ${des_home}/bin/crosstalk.pl
args = -archiveroot ${archive_root} -list ${crosstalk_list} -binpath ${des_home}/bin -outputpath ${output_dir} -detector ${detector} -crosstalk ${crosstalk_file} -photflag ${photflag} -verbose ${verbose}
wall_mod = 30
basename = crosstalk
fileclass = src
filetype = src
query_fields = fileclass,filetype,project,nite
min_num_per_job = 5
fileclass = cal
filetype = xtalk
query_fields = fileclass,filetype,project,detector
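
To make the format concrete, here is a toy reader (not the DESDM framework's actual parser) that loads such key = value definitions into a Python dict and expands ${...} references. The repeated fileclass/filetype/query_fields keys above presumably belong to separate file-list sections in the real format; this sketch does not model that.

# Minimal sketch (not the DESDM parser): read "key = value" lines from a
# module definition into a dict, then expand ${var} references.
import re

def parse_module_definition(text, defaults=None):
    values = dict(defaults or {})
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()

    # expand ${name} using the collected keys (single pass, good enough here)
    def expand(v):
        return re.sub(r"\$\{(\w+)\}",
                      lambda m: values.get(m.group(1), m.group(0)), v)

    return {k: expand(v) for k, v in values.items()}

example = """
output_dir = ${archive_root}/${run_dir}/raw
exec = ${des_home}/bin/crosstalk.pl
wall_mod = 30
"""
print(parse_module_definition(example,
                              {"archive_root": "/archive", "run_dir": "r001",
                               "des_home": "/des"}))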

Workflow
[Pipeline diagram of modules: Crosstalk, CreateCor, ImCorrect, Masking, AstroRefine, Remap, PSFModel, WeakLensing, Make Bkgd]
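
Expressed as a DAGMan input file, the module ordering might be generated as in the sketch below; the strictly linear chain and the submit-file names are assumptions for illustration, since the real DAG also fans out many parallel jobs within each module (e.g. one per CCD or exposure).

# Minimal sketch: emit a DAGMan input file chaining the pipeline modules.
modules = ["crosstalk", "createcor", "imcorrect", "masking",
           "astrorefine", "remap", "psfmodel", "weaklensing"]

with open("pipeline.dag", "w") as dag:
    for name in modules:
        dag.write(f"JOB {name} {name}.submit\n")       # one submit description per module
    for parent, child in zip(modules, modules[1:]):
        dag.write(f"PARENT {parent} CHILD {child}\n")  # enforce execution order

The generated file would then be handed to condor_submit_dag, which releases each node's jobs only after its parents complete.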

Job Generation
- Run queries to get master input file lists
- Stage input images to the target machine
- Generate jobs: input lists, job descriptions, DAG
- Set up the target machine
- Stage generated lists and files to the target machine
- Make a timestamp on the target machine
- Target jobs run
- Ingest new or modified files
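
A minimal end-to-end sketch of these steps follows, with hypothetical hosts and paths and a stubbed-out archive query; the timestamping and post-run ingestion steps are omitted.

# Minimal sketch of the job-generation steps above; the GridFTP endpoint,
# paths, and the file-list query are hypothetical placeholders.
import os
import subprocess

TARGET = "gsiftp://tg-gridftp.example.org"   # hypothetical GridFTP endpoint

def query_input_files(fileclass, filetype):
    # Placeholder for the archive-database query that builds master input lists.
    return [f"/archive/raw/{filetype}_{i:04d}.fits" for i in range(5)]

def stage(paths, remote_dir):
    # Copy each file to the target machine with globus-url-copy over GridFTP.
    for p in paths:
        subprocess.run(["globus-url-copy", "file://" + os.path.abspath(p),
                        f"{TARGET}{remote_dir}/"], check=True)

def main():
    inputs = query_input_files("src", "src")
    stage(inputs, "/scratch/des/run001/raw")             # stage input images
    with open("crosstalk.list", "w") as lst:             # generated input list
        lst.write("\n".join(inputs) + "\n")
    stage(["crosstalk.list", "pipeline.dag"], "/scratch/des/run001")
    subprocess.run(["condor_submit_dag", "pipeline.dag"], check=True)  # target jobs run

if __name__ == "__main__":
    main()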

Upcoming Orchestration Work
- Skipping modules
- Repeatedly using a set of blocks (campaign processing)
- Letting Condor manage vanilla jobs per machine (threaded vs. serial)
- Finer-grained restarts

Notification Events System
[Diagram: AppModules (Elf/OgreScript) running the science code publish events to an ActiveMQ message bus, which feeds the Event Store and the Portal Monitor]
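
For illustration only (the actual AppModules use Elf/OgreScript), a status event could be published to the ActiveMQ bus over STOMP roughly as follows, assuming the stomp.py client and a hypothetical broker host and topic name.

# Illustration only, not the Elf/OgreScript tooling: publish a status event
# to an ActiveMQ broker over STOMP. Broker host, port, and topic are hypothetical.
import json
import time
import stomp  # pip install stomp.py

def publish_event(module, status, broker=("mq.example.org", 61613)):
    conn = stomp.Connection([broker])
    conn.connect(wait=True)
    event = {"module": module, "status": status, "time": time.time()}
    conn.send(destination="/topic/des.events", body=json.dumps(event))
    conn.disconnect()

publish_event("crosstalk", "finished")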

DES Monitor Portal
- Current technology: Drupal CMS / PHP, JavaScript/Ajax
- Tools for viewing:
  - Active processing status
  - High-level alert monitoring
  - Quick navigation to event logs
- Quality assurance profiles: histograms of QA metrics, outlier detection
- Processing timing summary report: middleware-level timing profile

Acknowledgements
DESDM Team: Jim Myers, Terry McLaren, Joe Mohr, Bob Armstrong, Dora Cai, Ankit Chandra, Greg Daues, Shantanu Desai, Michelle Gower, Wayne Hoyenga, Chit Khin, Kailash Kotwani
Past DESDM team members, the DES Project Team, and Collaboration members
National Science Foundation