Virtual Observatory: A Quick Overview, and Some Lessons Learned
S. George Djorgovski, Caltech
ESIP Workshop, UCSB, July 2009


Astronomy Has Become Very Data-Rich
A typical digital sky survey now generates ~ TB, plus a comparable amount of derived data products
  – PB-scale data sets are on the horizon
Astronomy today has ~ PB of archived data, and generates a few TB/day
  – Both data volumes and data rates grow exponentially, with a doubling time ~ 1.5 years
  – Even more important is the growth of data complexity
For comparison:
  Human memory ~ a few hundred MB
  Human Genome < 1 GB
  1 TB ~ 2 million books
  Library of Congress (print only) ~ 30 TB
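To make the growth figures concrete, here is a minimal Python sketch of the arithmetic implied by a 1.5-year doubling time and by the "1 TB ~ 2 million books" rule of thumb. The function and constant names are illustrative assumptions, not part of the talk, and the model assumes pure exponential growth at a constant rate.

```python
import math

# Doubling time quoted on the slide (years); assumed constant over time.
DOUBLING_TIME_YEARS = 1.5

# Slide's rule of thumb: 1 TB is roughly 2 million books.
BOOKS_PER_TB = 2_000_000


def growth_factor(years, doubling_time=DOUBLING_TIME_YEARS):
    """Factor by which an exponentially growing volume increases in `years`."""
    return 2.0 ** (years / doubling_time)


def years_to_grow(factor, doubling_time=DOUBLING_TIME_YEARS):
    """Years needed for the volume to grow by `factor`."""
    return doubling_time * math.log2(factor)


def books_equivalent(terabytes):
    """Convert a data volume in TB to an approximate number of books."""
    return terabytes * BOOKS_PER_TB


if __name__ == "__main__":
    for years in (1.5, 3.0, 6.0, 15.0):
        print(f"after {years:4.1f} yr: x{growth_factor(years):,.0f}")
    print(f"TB -> PB (x1000): {years_to_grow(1000):.1f} yr")
    print(f"30 TB ~ {books_equivalent(30):,} books")
```

Under these assumptions, a factor of 1000 (the step from TB to PB scale) takes roughly 15 years, which is why PB-scale data sets are described as being on the horizon.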

Exponential Growth in Data Volumes and Complexity
[Figure panels: Crab, visible + X-ray; star-forming complex, radio + IR]
[Plot annotation: doubling t ≈ 1.5 yrs]
TB’s to PB’s of data, sources, param./source
Understanding of complex phenomena requires complex data!
Multi-wavelength data fusion leads to a more complete, less biased picture (also: multi-scale, multi-epoch, …)
Numerical simulations are also producing many TB’s of very complex “data”
Data + Theory = Understanding

The Archive Archipelago
As the data sets kept increasing, a number of archives, data depositories, and digital library services were created
All of them are mission-, domain-, or observatory-specific, distinct and independent scientifically, technologically, institutionally, heterogeneous in look-feel, usage, etc.
  – There was a considerable replication of effort
  – There was some functional redundancy
  – There was almost no interoperability
  – Some standards have been generally adopted (e.g., FITS)
All of them were primarily designed for single-object (or single-pointing) queries, and thus inherently unsuitable for the science enabled by the massive and complex data sets
The next step was clearly to connect them in a functional manner, and develop interoperability standards, formats, etc.

The Virtual Observatory Concept
A complete, dynamical, distributed, open research environment for the new astronomy with massive and complex data sets
  – Provide and federate content (data, metadata) services, standards, and analysis/compute services
  – Develop and provide data exploration and discovery tools
  – Not just the archives!
  – A part of a broader Cyber-Infrastructure and e-Science movement

From Traditional to Survey to VO Science
[Diagram labels, Traditional: Telescope; Data Analysis; Results]
[Diagram labels, Survey-Based: Survey Telescope; Archive; Data Mining; Target Selection; Follow-Up Telescopes; Results; Another Survey/Archive?]
Highly successful, but inherently limited by the information content of individual sky surveys …
What comes next, beyond survey science, is the VO science

A Systemic View of the VO-Based Science
[Diagram labels: Surveys, Observatories, Missions; Survey and Mission Archives; Follow-Up Telescopes and Missions; Results; Data Services; Data Mining and Analysis, Target Selection; Digital libraries; Primary Data Providers; VO; Secondary Data Providers]
VO connects the whole system of astronomical research

A Brief History of the VO Concept
Early (pre-web!) ideas already in the “Astrophysics Data System” (only the digital library part survives)
Concept developed through 1990’s, mainly from large digital sky surveys (DPOSS, SDSS…), discussions at conferences and workshops in the late 1990’s
Top recommendation in the “small projects” category in the NAS Decadal Astronomy & Astrophysics survey (the McKee-Taylor report), 2001
The first major VO conference at Caltech in 2000; the NVO White paper
National Virtual Observatory Science Definition Team, ESO conferences
Vigorous international efforts, coordinated via International VO Alliance (IVOA)

VO Development and Status
NSF-funded framework development project ( ): the U.S. National Virtual Observatory (NVO)
Now into a facility regime: Virtual Astro. Obs. (VAO)
Joint funding by the NSF and NASA
Work largely done in the existing data archives, and thus very data-centric
Vigorous international efforts (IVOA): ivoa.net

Scientific Roles and Benefits of a VO
Facilitate science with massive data sets (observations and theory/simulations): an efficiency amplifier
Provide an added value from federated data sets (e.g., multi-wavelength, multi-scale, multi-epoch …)
  – Discover the knowledge which is present in the data, but can be uncovered only through data fusion
Enable and stimulate some qualitatively new science with massive data sets (not just old-but-bigger)
Optimize the use of expensive resources (e.g., space missions, large ground-based telescopes, computing …)
Provide R&D drivers, application testbeds, and stimulus to the partnering disciplines (CS/IT, statistics …)
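As an illustration of what federating data sets means in practice, the following is a minimal Python sketch of a positional cross-match between two catalogs from different wavelength regimes. It is not a VO service or any library's API: the catalog layout (dictionaries with `id`, `ra`, `dec` in degrees), the function names, and the 2-arcsecond default match radius are all assumptions chosen for the example, and the brute-force loop stands in for the spatial indexing a real archive would use.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_arcsec=2.0):
    """Pair each source in catalog_a with its nearest catalog_b source
    within radius_arcsec. Returns a list of (a, b, separation_arcsec)."""
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for a in catalog_a:
        best, best_sep = None, radius_deg
        for b in catalog_b:  # brute force: O(N*M), fine only for small lists
            sep = angular_separation_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= best_sep:
                best, best_sep = b, sep
        if best is not None:
            matches.append((a, best, best_sep * 3600.0))
    return matches


if __name__ == "__main__":
    # Hypothetical toy catalogs, invented for this example.
    optical = [
        {"id": "opt1", "ra": 10.0, "dec": 20.0, "r_mag": 18.2},
        {"id": "opt2", "ra": 150.0, "dec": -5.0, "r_mag": 20.1},
    ]
    radio = [
        {"id": "rad1", "ra": 10.0002, "dec": 20.0001, "flux_mjy": 3.4},
        {"id": "rad2", "ra": 200.0, "dec": 45.0, "flux_mjy": 1.1},
    ]
    for a, b, sep in cross_match(optical, radio):
        print(f"{a['id']} <-> {b['id']} at {sep:.2f} arcsec")
```

Each matched pair carries measurements from both catalogs, which is the kind of combined record that single-archive, single-object queries cannot provide on their own.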

VO Represents a New Type of a Scientific Organization for the era of information abundance
It is not yet another data center, archive, mission, or a traditional project
It does not fit into any of the usual organizational structures
  – It is inherently distributed, and web-centric
  – It is fundamentally based on a rapidly developing technology (IT/CS)
  – It transcends the traditional boundaries between different wavelength regimes, agency domains
  – It has an unusually broad range of constituents and interfaces
  – It is inherently multidisciplinary

Broader and Societal Benefits of a VO
Professional Empowerment:
  – Scientists and students anywhere with an internet connection would be able to do first-rate science
  – A broadening of the talent pool in astronomy, democratization of the field
Interdisciplinary Exchanges:
  – The challenges facing the VO are common to most sciences and other fields of the modern human endeavor
  – Intellectual cross-fertilization, feedback to IT/CS
Education and Public Outreach:
  – Unprecedented opportunities in terms of the content, broad geographical and societal range, at all levels
  – Astronomy as a magnet for the CS/IT education
“Weapons of Mass Instruction”

VO Education and Public Outreach
Microsoft’s World Wide Telescope, and Google Sky: use DSS, SDSS, HST data, etc., for easy sky browsing

VO Functionality Today
What we did so far:
  – Lots of progress on interoperability, standards, etc.
  – An incipient data grid of astronomy
  – Some useful web services
  – Community training, EPO
What we did not do (yet):
  – Significant data exploration and mining tools
  – That is where the science will come from!
Thus, little VO-enabled science so far
Thus, a slow community buy-in
→ Development of powerful, usable knowledge discovery tools should be a key priority

An Evolving Sociology
We have transitioned from the data poverty regime into an era of exponential data abundance
  – Most astronomers do not seem to fully realize this
  – Proprietary periods should be re-thought; there are other modes of data access rights currencies, different scenarios?
  – Data are cheap, but the expertise is expensive (and creativity is priceless)
Telescopes are just the hardware needed to generate the data; and data are just incidental to our real mission, which is knowledge creation
  – When the data and the exploration tools are on the web, the value of large facilities ownership should be rethought
  – Computers are (relatively) cheap, but software is expensive, especially if you are not approaching it in a smart way

Information Technology → New Science
The information volume grows exponentially
  – Most data will never be seen by humans!
  – The need for data storage, network, database-related technologies, standards, etc.
Information complexity is also increasing greatly
  – Most data (and data constructs) cannot be comprehended by humans directly!
  – The need for data mining, KDD, data understanding technologies, hyperdimensional visualization, AI/Machine-assisted discovery …
We need to create a new scientific methodology on the basis of applied CS and IT
VO is the framework to effect this for astronomy

Some Readings:
A quick summary:
  – “Virtual Observatory: From Concept to Implementation”, Djorgovski, S.G., & Williams, R. 2005, A.S.P. Conf. Ser. 345, 517, available as
The original VO White Paper:
  – “Toward a National Virtual Observatory: Science Goals, Technical Challenges, and Implementation Plan”, in Virtual Observatories of the Future, A.S.P. Conf. Ser. 225, 353, available as
The NVO SDT report, from
Many other good documents available at (especially the summer school presentations)
Technical documents at