Data Management for the Dark Energy Survey
OSG Consortium All Hands Meeting, SDSC, San Diego, CA, March 2007

Greg Daues
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL, USA

The Dark Energy Survey: An International Collaboration
- Fermilab: camera construction, survey planning, and simulations
- U Illinois: data management, data acquisition, SPT
- U Chicago: SPT, simulations, optics
- U Michigan: optics
- LBNL: red-sensitive CCD detectors
- CTIO: telescope and camera operations
- Spanish Consortium: front-end electronics
  - Institut de Fisica d'Altes Energies, Barcelona
  - Institut de Ciencies de l'Espai, Barcelona
  - CIEMAT, Madrid
- UK Consortium: optics
  - UCL, Cambridge, Edinburgh, Portsmouth, and Sussex
- U Penn: science software, electronics
- Brazilian Consortium: simulations, CCDs, data management
  - Observatorio Nacional
  - Centro Brasileiro de Pesquisas Fisicas
  - Universidade Federal do Rio de Janeiro
  - Universidade Federal do Rio Grande do Sul
- Ohio State University: electronics
- Argonne National Laboratory: electronics, CCDs

DESDM Team
Tanweer Alam (3), Emmanuel Bertin (4), Joe Mohr (1,3), Jim Annis (2), Dora Cai (3), Choong Ngeow (1), Wayne Barkhouse (1), Greg Daues (3), Ray Plante (3), Cristina Beldica (3), Patrick Duda (3), Douglas Tucker (2), Huan Lin (2)
(1) U Illinois Astronomy; (2) Fermilab; (3) U Illinois NCSA; (4) Terapix, IAP

The Dark Energy Survey
- A study of dark energy using four independent and complementary techniques:
  - Galaxy cluster surveys
  - Weak lensing
  - Galaxy angular power spectrum
  - SN Ia distances
- Two linked, multiband optical surveys:
  - 5000 deg² in g, r, i, and z (or Y+Z?)
  - Repeated observations of 40 deg²
- Construction and science:
  - Build a new 3 deg² camera for the Blanco 4m in Chile
  - Invest in upgrades for the Blanco 4m
  - Build a DM system to archive and process the data at NCSA
  - Survey and science
[Image: the Blanco 4m on Cerro Tololo. Credit: Roger Smith/NOAO/AURA/NSF]

DES Instrument (DECam)
[Diagram: DECam cross-section (3556 mm and 1575 mm dimensions) labeling the camera, filters, optical lenses, and scroll shutter]
- Elements: mosaic camera, 5-element optical corrector, 4 filters (g, r, i, z)
- Focal plane: 62 2K x 4K CCD modules (0.27''/pixel)

DES Data Management Overview
The data management system must:
- Reliably transfer ~310 GB/night, produced on 525 nights over 5 years, from CTIO to NCSA
- Process the data at NCSA and provide it to the DES team and community
- Maintain the DES archive over the long term: ~2 PB total, including a 100 TB database
- Provide community access:
  - Raw and reduced data 1 year after acquisition
  - Co-adds and catalogs at the midpoint and end of the survey
  - Reduction pipelines released when the camera is deployed
The DESDM design is guided by requirements for:
- Data
- Processing
- Storage and distribution
- Automation and reliability
- Hardware
A single DECam exposure (500 Mpix) consists of 62 individual CCD images.
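As a rough sanity check on these numbers, here is a back-of-the-envelope calculation in Python; the 16-bit pixel depth and ~250 exposures per night are assumptions, not figures from the slides.

    # Back-of-the-envelope check of the DECam data volumes quoted above.
    ccds_per_exposure = 62
    pixels_per_ccd = 2048 * 4096
    bytes_per_pixel = 2                      # assumed 16-bit raw pixels

    mpix = ccds_per_exposure * pixels_per_ccd / 1e6
    gb_per_exposure = ccds_per_exposure * pixels_per_ccd * bytes_per_pixel / 1e9

    exposures_per_night = 250                # assumed, from the per-night job count later
    gb_per_night = exposures_per_night * gb_per_exposure
    raw_tb_survey = gb_per_night * 525 / 1e3

    print(f"{mpix:.0f} Mpix per exposure")            # ~520 Mpix, i.e. the ~500 Mpix quoted
    print(f"{gb_per_exposure:.1f} GB per exposure")   # ~1 GB of raw pixels
    print(f"{gb_per_night:.0f} GB raw per night")     # same order as the ~310 GB/night quoted
    print(f"{raw_tb_survey:.0f} TB raw over 525 nights")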

DESDM Data Flow [diagram]

Pipeline Architecture
- Java middleware layer with XML-described workflows
- NCSA workflow scripting packages:
  - OGRE [Apache Ant]
  - OgreScript [ELF container]
  - Developed within LEAD, OGCE, and other projects
  - Orchestration, control, and management of tasks
  - Modular components with dependency rules
- Grid functionality:
  - Integrated with the Trebuchet (jglobus) GridFTP library
  - Workflows publish/subscribe to an event channel (and interact with web services)
- Application layer: astronomy modules
  - Many authored by the DES team in C
  - Terapix software: SCAMP, SExtractor, SWarp
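The OGRE/OgreScript XML itself is not reproduced in these slides; the Python sketch below only illustrates the idea of modular components with dependency rules by running hypothetical pipeline tasks in dependency order. Task names are made up for illustration.

    # Minimal sketch of modular tasks with dependency rules, in the spirit of
    # the XML-described workflows (task names are hypothetical).
    from graphlib import TopologicalSorter

    workflow = {
        "stage_raw":  [],               # fetch raw CCD images into the workspace
        "crosstalk":  ["stage_raw"],    # instrumental corrections
        "reduce":     ["crosstalk"],    # bias/flat-field correction
        "astrometry": ["reduce"],       # e.g. SCAMP
        "catalog":    ["astrometry"],   # e.g. SExtractor
        "ingest":     ["catalog"],      # load objects into the archive database
    }

    # Run each component only after all of its declared dependencies.
    for task in TopologicalSorter(workflow).static_order():
        print(f"running {task}")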

Orchestration Pipeline
- Operates on a single DES server: control and management
- Database interaction; setup of workspaces; staging via GridFTP
- Condor-G for data-parallel job description and submission to Globus gatekeepers
[Diagram: pipeline server running the orchestration pipeline, catalog and image archive database, TeraGrid compute pipelines, and event channel]
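As a hedged illustration of the Condor-G submission step, the sketch below generates one grid-universe submit description per CCD of an exposure. The gatekeeper host, executable, and paths are placeholders and this is not the actual DESDM submission code; only the submit-file keywords are standard Condor-G attributes.

    # Sketch: one Condor-G submit description per CCD, so the 62 CCD images of
    # an exposure can be reduced as independent grid jobs.
    SUBMIT_TEMPLATE = """\
    universe      = grid
    grid_resource = gt2 gatekeeper.example.teragrid.org/jobmanager-pbs
    executable    = reduce_ccd.sh
    arguments     = --exposure {exposure} --ccd {ccd}
    output        = logs/{exposure}_{ccd:02d}.out
    error         = logs/{exposure}_{ccd:02d}.err
    log           = logs/{exposure}.log
    queue
    """

    def write_submit_files(exposure: str, n_ccds: int = 62) -> None:
        """Write one submit file per CCD of the given exposure."""
        for ccd in range(1, n_ccds + 1):
            with open(f"{exposure}_ccd{ccd:02d}.sub", "w") as f:
                f.write(SUBMIT_TEMPLATE.format(exposure=exposure, ccd=ccd))

    write_submit_files("DECam_20070301_0042")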

Image Processing Pipelines
- Executed on TeraGrid/HPC systems (NCSA Mercury)
- Data parallelism by CCD: 62 independent jobs
- Data parallelism by exposure: 250 jobs per night
- Monitoring and quality assurance through the events system
[Diagram: same pipeline-processing architecture as on the previous slide]
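A minimal sketch of the publish/subscribe monitoring idea: per-CCD jobs publish status events and a QA listener watches for failures. This is an in-process stand-in, not the real event-channel API.

    # Toy event channel: workflows publish status events, QA code subscribes.
    from collections import defaultdict

    class EventChannel:
        def __init__(self):
            self.subscribers = defaultdict(list)

        def subscribe(self, topic, callback):
            self.subscribers[topic].append(callback)

        def publish(self, topic, event):
            for callback in self.subscribers[topic]:
                callback(event)

    channel = EventChannel()
    # QA listener: report any per-CCD reduction that did not finish cleanly.
    channel.subscribe("ccd.reduced",
                      lambda e: print("QA alert:", e) if e["status"] != "ok" else None)

    for ccd in (1, 2, 3):
        channel.publish("ccd.reduced",
                        {"exposure": "DECam_20070301_0042", "ccd": ccd,
                         "status": "ok" if ccd != 2 else "failed"})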

DES Distributed Archive
- Database tracks image location and provenance
- Sites/nodes within the replica archive:
  - FNAL Enstore mass storage
  - NCSA MSS
  - DES development platforms/workstations
  - NCSA Mercury
  - NCSA Cobalt & Tungsten, SDSC IA64, TACC Lonestar
  - GPFS-WAN space
- A fixed filesystem structure is replicated at each node, holding data from all stages of processing:
    Archive/
      raw/*
      red/*
      coadd/*
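Because every node replicates the same Archive/{raw,red,coadd} layout, locating a file at a site reduces to prefixing a node-specific root. A small sketch follows, with hypothetical node names and roots; the real system keeps this mapping in the database alongside provenance.

    # Sketch: resolve a logical archive file to a physical path at one node.
    ARCHIVE_ROOTS = {
        "NCSA_Mercury": "/gpfs/des/Archive",                       # placeholder
        "FNAL_Enstore": "/pnfs/des/Archive",                       # placeholder
        "NCSA_MSS":     "mssftp://mss.example.edu/des/Archive",    # placeholder
    }

    def physical_path(node: str, stage: str, filename: str) -> str:
        """Compose the physical location of a file at one archive node."""
        if stage not in ("raw", "red", "coadd"):
            raise ValueError(f"unknown processing stage: {stage}")
        return f"{ARCHIVE_ROOTS[node]}/{stage}/{filename}"

    print(physical_path("NCSA_Mercury", "red", "DECam_20070301_0042_07.fits"))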

DES Distributed Archive: Data Access and Replication Software
- Abstracts the file transfer API to hide the underlying middleware
- Uniform interface to:
  - heterogeneous resources
  - different transfer tools and strategies
- Between TeraGrid sites, the NCSA Trebuchet library is used:
  - Dataset optimizations (client_pool_size, TCP buffer): ~300 MB/s for 3 clients
  - NCSA MSS optimizations (stream mode, "target active"): ~60 MB/s
- For retrieval from Enstore/dCache, the Storage Resource Manager (SRM) developed at FNAL is used
  - Optimized tape-to-cache staging at FNAL Enstore
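A minimal sketch of the transfer-abstraction idea, choosing a backend from the URL scheme (gsiftp vs. srm). The class and method names are illustrative, not the actual Trebuchet or SRM client APIs.

    # Toy transfer abstraction: callers ask for a copy, the backend is hidden.
    from abc import ABC, abstractmethod
    from urllib.parse import urlparse

    class TransferBackend(ABC):
        @abstractmethod
        def copy(self, src: str, dst: str) -> None: ...

    class GridFtpBackend(TransferBackend):
        def copy(self, src, dst):
            print(f"[gridftp] {src} -> {dst}")            # would drive a GridFTP client

    class SrmBackend(TransferBackend):
        def copy(self, src, dst):
            print(f"[srm] stage and copy {src} -> {dst}")  # would call an SRM client

    BACKENDS = {"gsiftp": GridFtpBackend(), "srm": SrmBackend()}

    def transfer(src: str, dst: str) -> None:
        """Pick a transfer tool from the source URL scheme and run the copy."""
        BACKENDS[urlparse(src).scheme].copy(src, dst)

    transfer("gsiftp://gridftp.example.org/Archive/raw/exp0042_07.fits",
             "file:///scratch/des/raw/exp0042_07.fits")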

DES Data Challenges
- DESDM employs a spiral development model, with testing through yearly Data Challenges
- Large-scale DECam image simulation effort at Fermilab:
  - Produce sky images over hundreds of deg² with a realistic distribution of galaxies and stars (weak lensing shear from large-scale structure still to come)
  - Realistic effects: cosmic rays, dead columns, gain variations, seeing variations, saturation, optical distortion, nightly extinction variations, etc.
- Data Challenges: the DESDM team uses the current system to process simulated images to the catalog level and compares the catalogs to truth tables
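A toy version of the catalog-versus-truth comparison: match detected objects to truth-table entries by sky position within a small tolerance and report the completeness. This brute-force match is for illustration only; the real data-challenge validation is more involved.

    # Sketch: fraction of truth objects recovered in a detected catalog.
    import math

    def match_fraction(truth, detected, tol_arcsec=1.0):
        """Fraction of (ra, dec) truth entries with a detection within tol_arcsec."""
        tol_deg = tol_arcsec / 3600.0
        matched = 0
        for ra_t, dec_t in truth:
            for ra_d, dec_d in detected:
                d_ra = (ra_t - ra_d) * math.cos(math.radians(dec_t))
                d_dec = dec_t - dec_d
                if math.hypot(d_ra, d_dec) < tol_deg:
                    matched += 1
                    break
        return matched / len(truth)

    truth    = [(10.0000, -5.0000), (10.0010, -5.0005), (10.0020, -5.0020)]
    detected = [(10.0000, -5.0001), (10.0010, -5.0005)]
    print(f"completeness: {match_fraction(truth, detected):.2f}")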

DES Data Challenges (2)
DES DC1 (Oct 05 - Jan 06)
- 5 nights of simulated data (ImSim1), 700 GB raw
- Basic orchestration, Condor-G job submission
- First astronomy modules: image reduction and cataloguing; ingestion of objects into the database
- TeraGrid usage: data-parallel computing on NCSA Mercury
- Catalogued, calibrated, and ingested 50 million objects
DES DC2 (Oct 06 - Jan 07)
- 10 nights of simulated data (5 TB, equivalent to 20% of the SDSS imaging dataset)
- Astronomy algorithms: astrometry, co-addition, cataloguing
- Distributed archive; high-performance GridFTP transfers
- Archive portal
- TeraGrid usage: production runs on NCSA Mercury; pipelines successfully tested on NCSA Cobalt and SDSC IA64
- Catalogued, calibrated, and ingested 250 million objects

DESDM Partnership with OSG
- DESDM reflects the underlying DOE-NSF partnership in DES:
  - Camera construction supported by DOE (Fermilab)
  - Telescope built and operated by NSF (NOAO)
  - Data management development supported by NSF (NCSA)
- DESDM is designed to support production, delivery, and analysis of science-ready data products to all collaboration sites
- Fermilab and NCSA both host massive HPC resources:
  - Strong incentive for developing a DM system that seamlessly integrates TG and OSG resources

OSG Partnership Components
Archiving
- The distributed archive already includes a node on the Fermilab dCache system
  - Simplifies transfer of simulated and reduced images among DES collaborators at NCSA and Fermilab
- In the operations phase, Fermilab and NCSA will both host comprehensive DES data repositories
Processing
- Large-scale reprocessing of DES data would benefit from access to more HPC resources
- The orchestration layer already supports processing on multiple (similar) TG nodes
- Working to add support for processing on Fermigrid platforms