eScience, May 2007
From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories
R. Chris Smith, NOAO/CTIO, LSST

Challenges for the Operational VO
- Providing Content: capturing and archiving data from diverse instruments, AND capturing the metadata (system & science) needed to make those data useful
- Providing Access: implementing the VO standards and services, plus the network infrastructure, needed for wide access to the content; ensuring not only access, but long-term support and documentation of datasets & metadata (curation)
- Providing User Interfaces and Tools: developing and operating user interfaces that enable effective scientific use of ALL of the distributed resources of the VO

A Case Study: NOAO Data Management
- Management of data from all NOAO and some affiliated facilities = CONTENT
  - 3 mountaintops (Cerro Tololo, Cerro Pachon, Kitt Peak), 11 telescopes, more than 30 instruments
- Virtual Observatory "back end" = ACCESS
  - Provide effective access to large volumes (TBs to PBs) of archived ground-based optical & infrared data and data products through VO standard interfaces and networks
- Virtual Observatory "front end" = UI and TOOLS
  - Enable science by developing VO user interfaces, tools, and services that work with distributed data sources and large volumes of data
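
The VO "back end" above exposes archived data through VO standard interfaces. As a concrete illustration, one of the simplest such interfaces is the IVOA Simple Cone Search (SCS) protocol, where a client asks for all catalog sources within a radius of a sky position via three HTTP parameters (RA, DEC, SR, all in decimal degrees) and receives a VOTable in response. The sketch below builds such a request URL; the endpoint shown is hypothetical, not an actual NOAO service address.

```python
from urllib.parse import urlencode

def cone_search_url(base_url: str, ra_deg: float, dec_deg: float,
                    radius_deg: float) -> str:
    """Build an IVOA Simple Cone Search (SCS) request URL.

    SCS services take RA and DEC (ICRS, decimal degrees) and SR
    (search radius, degrees) and return matching sources as a VOTable.
    """
    params = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return f"{base_url}?{params}"

# Hypothetical archive endpoint, for illustration only;
# the position is the Orion Nebula region.
url = cone_search_url("https://archive.example.org/scs", 83.63, -5.39, 0.1)
```

Because the protocol is just HTTP plus a standard response format, any VO-aware tool can query any compliant archive the same way, which is what makes federated access across observatories practical.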

BIG Question: How does this model SCALE?
- Capturing, moving, & processing the data
- Making the data AVAILABLE through VO interfaces
- Making the data USEFUL for scientific analysis
Why do we worry about scaling?

Turning Photons into Petabytes: Today
- MOSAIC, WFI, IMACS: 64 Mpix cameras, ~10 to 20 GB/night
- It builds up quickly! In only 3 years, two MOSAIC cameras produced ~20 TB of raw data and ~40-60 TB processed
[Image: IMACS image, Las Campanas Observatory (Danny Steeghs, Jan '04)]

Coming Soon: Dark Energy Camera
- Focal plane: 64 2K x 4K detectors, plus guiding and wavefront sensing (WFS)
- 530 Mpix camera

The Data: Dark Energy Survey
- Each image = 1 GB; 350 GB of raw data per night
- Data must be moved to a supercomputer center (NCSA) before the next night begins (<24 hours): need >36 Mbps internationally
- Data must be processed within ~24 hours, to inform the next night's observing
- Total raw data after 5 years: ~0.2 PB; TOTAL dataset: 1 to 5 PB
- Reprocessing planned using TeraGrid resources
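
The slide's bandwidth requirement follows directly from the nightly volume and the transfer window. A back-of-the-envelope sketch (decimal units assumed, GB = 1e9 bytes):

```python
def required_mbps(gb_per_night: float, transfer_hours: float) -> float:
    """Minimum sustained link rate (megabits/s) needed to move one
    night's raw data within the available transfer window."""
    bits = gb_per_night * 1e9 * 8              # decimal GB -> bits
    return bits / (transfer_hours * 3600) / 1e6

# DES: 350 GB/night must reach NCSA within 24 hours.
rate = required_mbps(350, 24)                  # ~32 Mbps sustained minimum
```

The bare minimum comes out near 32 Mbps; the slide's ">36 Mbps" figure presumably builds in headroom for protocol overhead and link outages, since the data must move internationally, from Chile to Illinois.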

LSST: The Large Synoptic Survey Telescope
Survey the entire sky every 3 to 5 nights, to simultaneously detect and study:
- Dark matter, via weak gravitational lensing
- Dark energy, via thousands of SNe per year
- Potentially hazardous near-Earth asteroids
- Tracers of the formation of the solar system
- Fireworks in the heavens: GRBs, quasars...
- Periodic and transient phenomena... the unknown
Massively PARALLEL astronomy

LSST: The Instrument
- 8.2m telescope, optimized for a WIDE field of view: 3.5 degree FOV
- 3.5 GIGApixel camera
- Deep images in 15 s
- Able to scan the whole sky every 3 to 5 nights

LSST: Deep, Wide, Fast
[Figure: field-of-view comparison - Keck Telescope (10 m): 0.2 degrees; LSST: 3.5 degrees]

LSST Site: Cerro Pachon, Chile
[Figure: LSST site plan for El Penon on Cerro Pachon, showing LSST, the ~1.5m calibration telescope, and the neighboring SOAR and Gemini (South) telescopes]

LSST: Distributed Data Mgmt
- Mountain site: data acquisition, temporary storage
- Base facility: real-time processing
- Long-haul communications: data transport & distribution
- Archive/Data Access Centers: data processing, long-term storage, & public access

LSST: The Data Flow
- Each image is roughly 6.5 GB; cadence: ~1 image every 15 s; 15 to 18 TB per night
- ALL of it must be transferred to a U.S. "data center": mountain-to-base within the image timescale (15 s), ~10-20 Gbps; internationally within 2-10 Gbps
- REAL-TIME reduction, analysis, & alerts: send out alerts of transient sources within minutes; provide automatic data quality evaluation, with alerts to problems
- Processed data grows to >100 TB per night!
- Just the catalogs = Petabytes per year!
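
The link rate and nightly volume both fall out of the per-image size and the 15 s cadence. A quick sketch, assuming decimal units and, purely for illustration, a 10-hour observing night:

```python
def image_link_gbps(image_gb: float, cadence_s: float) -> float:
    """Gigabits/s needed to ship each image off the mountain
    before the next exposure arrives."""
    return image_gb * 8 / cadence_s

def nightly_tb(image_gb: float, cadence_s: float,
               hours_observing: float) -> float:
    """Raw terabytes accumulated over one observing night."""
    n_images = hours_observing * 3600 / cadence_s
    return n_images * image_gb / 1000

gbps = image_link_gbps(6.5, 15)     # ~3.5 Gbps sustained, per the 15 s timescale
tb = nightly_tb(6.5, 15, 10)        # ~15.6 TB, matching the 15-18 TB/night range
```

The sustained mountain-to-base minimum (~3.5 Gbps) sits well below the ~10-20 Gbps quoted on the slide, which leaves room for bursts, retransmission, and other traffic on the same link.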

LSST Needs
[Diagram: Base facility, Archive Center, and Data Access Centers]

Turning Photons into Petabytes: Summary
- Today, ~10 to 20 GB/night: MOSAIC, WFI, IMACS (64 Mpix cameras)
- Soon, ~300 to 500 GB/night: VISTA (67 Mpix camera), VST (256 Mpix camera), DECam/DES (520 Mpix camera)
- On the horizon, ~15 TB/night: LSST Project (3 Gpix camera)
And these are just survey instruments in Chile!
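
One way to feel the three orders of magnitude in this summary is to ask how many observing nights each class of instrument needs to accumulate a single petabyte. A rough sketch using representative rates from the slide (decimal units, 1 PB = 1e6 GB):

```python
def nights_to_petabyte(gb_per_night: float) -> float:
    """Observing nights needed to accumulate 1 PB of raw data."""
    return 1e6 / gb_per_night

# Representative nightly raw-data rates from the summary above.
mosaic_nights = nights_to_petabyte(15)       # ~67,000 nights: centuries
decam_nights = nights_to_petabyte(400)       # 2,500 nights: several years
lsst_nights = nights_to_petabyte(15_000)     # ~67 nights: a couple of months
```

A MOSAIC-class camera would take roughly two centuries of nightly observing to produce a petabyte; DECam gets there within its survey lifetime; LSST does it in about two months, which is why the rest of the talk treats petabyte-scale infrastructure as the baseline rather than the exception.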

DES, LSST, ... the REST of the Science?
- Ongoing (MOSAIC, WFI, IMACS) and future (DES, LSST, etc.) projects will provide PETABYTES of archived data
- Only a small fraction of the science potential will be realized by the planned investigations
- How do we maximize the investment in these datasets and provide for their future scientific use?

VO Challenges: Provider Perspective
- How do we effectively capture, transport, and manage petabytes of data? Need advanced IT infrastructure
- How do we provide effective access to petabytes of data? Need advanced data-mining interfaces
These are fundamentally IT challenges, in support of the astronomical community.

VO Challenges: Scientific Perspective
- Data Discovery: from those petabytes, what data exist that might help address my scientific query?
- Data Understanding: which data are best suited for my analysis?
- Data Movement: how do I get the data from where they are to where they are most useful?
- Data Analysis: how do I extract the information I need from the data?

NVO at NOAO: Focus on the Scientific USER
- 4 keys: Data Discovery, Data Understanding, Data Access, Data Analysis
- First focus on supporting data DISCOVERY
  - Discovery in spatial coordinates: NOAO Sky
  - Discovery in temporal coordinates: Timeline
- NOAO NVO portals:
- And for South America… a foundation for exploring partnerships with S.A. communities

Summary: VO Challenges
In Infrastructure:
- Collect and maintain petabytes of content
- Provide for effective access, including networks, hardware, and software
In User Interaction:
- Provide effective user interfaces
- Support distributed analysis: large queries across distributed DBs; statistical analysis and processing across distributed resources (Grid processing & storage)
TOOLS & SERVICES to enable SCIENCE

How? Strategic Partnerships
- In local systems: vendors for local storage, processing, and servers
- In remote systems: distributed computer centers to provide bulk storage and large-scale processing, linked together for Grid processing and Grid storage
- In connectivity: high-speed national and international bandwidth
- In science: VO partners to develop standards and provide tools (IVOA); developing tools and services optimized for scientific analysis over large datasets (e.g., statistical methods)