Hanisch / National Data Service, Boulder, CO 13 June 2014 Building on Existing Communities: the Virtual Astronomical Observatory (and NIST) Robert Hanisch.

Slides:



Advertisements
Similar presentations
The SDMX Registry Model April 2, 2009 Arofan Gregory Open Data Foundation.
Advertisements

SCAR Data Management SSG Plenary 30 th July 2010 Kim Finney (Manager, Australian Antarctic Data Centre & Chief Officer, SCAR Standing Committee on Antarctic.
September 13, 2004NVO Summer School1 VO Protocols Overview Tom McGlynn NASA/GSFC T HE US N ATIONAL V IRTUAL O BSERVATORY.
9/14/2006NVOSS III1 The Future of the Virtual Observatory Robert Hanisch US NVO Project Manager Space Telescope Science Institute T HE US N ATIONAL V IRTUAL.
Abstraction Layers Why do we need them? –Protection against change Where in the hourglass do we put them? –Computer Scientist perspective Expose low-level.
CASDA Virtual Observatory CSIRO ASTRONOMY AND SPACE SCIENCE Arkadi Kosmynin 11 March 2014.
DSpace: the MIT Libraries Institutional Repository MacKenzie Smith, MIT EDUCAUSE 2003, November 5 th Copyright MacKenzie Smith, This work is the.
ASCR Data Science Centers Infrastructure Demonstration S. Canon, N. Desai, M. Ernst, K. Kleese-Van Dam, G. Shipman, B. Tierney.
The Documentum Team Lance Callaway, Brooke Durbin, Perry Koob, Lorie McMillin, Jennifer Song Missouri University of Science and Technology Rolla, Missouri.
Using Sakai to Support eScience Sakai Conference June 12-14, 2007 Sayeed Choudhury Tim DiLauro, Jim Martino, Elliot Metsger, Mark Patton and David Reynolds.
14 October 2003ADASS 2003 – Strasbourg1 Resource Registries for the Virtual Observatory R.Plante (NCSA), G. Greene (STScI), R. Hanisch (STScI), T. McGlynn.
ELPUB 2006 June Bansko Bulgaria1 Automated Building of OAI Compliant Repository from Legacy Collection Kurt Maly Department of Computer.
Solar and STP Physics with AstroGrid 1. Mullard Space Science Laboratory, University College London. 2. School of Physics and Astronomy, University of.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Long-Term Preservation of Astronomical Research Results Robert Hanisch US National Virtual Observatory Space Telescope Science Institute Baltimore, MD.
Virtual Observatory Single Sign-on U.S. National Virtual Observatory National Center for Supercomputing Applications Ray Plante, Bill Baker.
System Design/Implementation and Support for Build 2 PDS Management Council Face-to-Face Mountain View, CA Nov 30 - Dec 1, 2011 Sean Hardman.
Data-PASS Shared Catalog Micah Altman & Jonathan Crabtree 1 Micah Altman Harvard University Archival Director, Henry A. Murray Research Archive Associate.
Long-Term Preservation of Astronomical Research Results Robert Hanisch US National Virtual Observatory Space Telescope Science Institute Baltimore, MD.
Scientific Data Infrastructure in CAS Dr. Jianhui Scientific Data Center Computer Network Information Center Chinese Academy of Sciences.
Virtual Observatory --Architecture and Specifications Chenzhou Cui Chinese Virtual Observatory (China-VO) National Astronomical Observatory of China.
E. Solano Centro de Astrobiología (INTA-CSIC) I.P. Observatorio Virtual Español El Observatorio Virtual: una infraestructura básica para la investigación.
Using the Open Metadata Registry (openMDR) to create Data Sharing Interfaces October 14 th, 2010 David Ervin & Rakesh Dhaval, Center for IT Innovations.
Introduction to Apache OODT Yang Li Mar 9, What is OODT Object Oriented Data Technology Science data management Archiving Systems that span scientific.
Virtual Observatory & LIGO Roy Williams California Institute of Technology.
Astronomical data curation and the Wide-Field Astronomy Unit Bob Mann Wide-Field Astronomy Unit Institute for Astronomy School of Physics University of.
Science with the Virtual Observatory Brian R. Kent NRAO.
Page 1 Informatics Pilot Project EDRN Knowledge System Working Group San Antonio, Texas January 21, 2001 Steve Hughes Thuy Tran Dan Crichton Jet Propulsion.
Markus Dolensky, ESO Technical Lead The AVO Project Overview & Context ASTRO-WISE ((G)A)VO Meeting, Groningen, 06-May-2004 A number of slides are based.
Summary of distributed tools of potential use for JRA3 Dugan Witherick HPC Programmer for the Miracle Consortium University College.
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
Federation and Fusion of astronomical information Daniel Egret & Françoise Genova, CDS, Strasbourg Standards and tools for the Virtual Observatories.
Federated Discovery and Access in Astronomy Robert Hanisch (NIST), Ray Plante (NCSA)
CONTENT DISCOVERY, SERVICES, AND SUSTAINED ACCESS Timothy Cole, William Mischo, Beth Sandore, Sarah Shreeves ~ University of Illinois Library
The VAO is operated by the VAO, LLC. LSST VAO Meeting Robert Hanisch Space Telescope Science Institute Director, Virtual Astronomical Observatory.
1 GRID Based Federated Digital Library K. Maly, M. Zubair, V. Chilukamarri, and P. Kothari Department of Computer Science Old Dominion University February,
The International Virtual Observatory Alliance (IVOA) interoperability in action.
Data Archives: Migration and Maintenance Douglas J. Mink Telescope Data Center Smithsonian Astrophysical Observatory NSF
Tuesday, April 5th, 2005 N. Craig, B. J. Méndez (SEGway,UC Berkeley) R. J. Hanisch, C. A. Christian, F. Summers (NVO,StScI) B. Haisch, J. Lindblom (ManyOne.
Sharing scientific data: astronomy as a case study for a change in paradigm Présenté par Françoise Genova.
German Astrophysical Virtual Observatory Overview and Results So Far W. Voges, G. Lemson, H.-M. Adorf.
21-jun-2009 IVOA Standards Pedro Osuna ESA-VO Project Science Archives and Computer Support Engineering Unit (SRE-OE) Science Operations Department (SRE-O)
IVOA RM, VOResources, Identifiers, Interfaces Chenzhou CUI.
F. Genova, VO as a Data Grid, 2003/06/301 Interoperability of astronomy data bases Françoise Genova, CDS.
April 14, 2005MIT Libraries Visiting Committee Libraries Strategic Plan Theme III Work to shape the future MacKenzie Smith Associate Director for Technology.
The Large Synoptic Survey Telescope Project Bob Mann Wide-Field Astronomy Unit University of Edinburgh.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Introduction to the VO ESAVO ESA/ESAC – Madrid, Spain.
7 Dec 2009R. J. Hanisch: Astronomy Data Standards CERN 1 Data Standards in Astronomy Dr. Robert J. Hanisch Director, US Virtual Astronomical Observatory.
NIH: DATA SCIENCE & BD2K Jennie Larkin, PhD Senior Advisor, Extramural Programs and Strategic Planning Office of the Associate Director for Data Science,
VO Data Access Layer IVOA Cambridge, UK 12 May 2003 Doug Tody, NRAO.
Open Science (publishing) as-a-Service Paolo Manghi (OpenAIRE infrastructure) Institute of Information Science and Technologies Italian Research Council.
End of the Beginning for IVOA is now Roy Williams IVOA Technical Lead.
International Planetary Data Alliance Registry Project Update September 16, 2011.
IPDA Registry Definitions Project Dan Crichton Pedro Osuna Alain Sarkissian.
Introduction: AstroGrid increases scientific research possibilities by enabling access to distributed astronomical data and information resources. AstroGrid.
Developing our Metadata: Technical Considerations & Approach Ray Plante NIST 4/14/16 NMI Registry Workshop BIPM, Paris 1 …don’t worry ;-) or How we concentrate.
Robert Hanisch, Radio Astronomy / LSST, Charlottesville8 May 2013 The VAO is operated by the VAO, LLC, a not-for-profit 501(c)(3) organization. 8 May 2013.
JOINT SESSION IG Domain Repositories, IG Agriculture Data, WG BioSharing Registry, IG Materials Data, WG Wheat Data RDA P6, Paris,
Enhancements to Galaxy for delivering on NIH Commons
NIST Office of Data and Informatics (ODI) of the Material Measurement Laboratory Robert Hanisch, director Ray Plante, interoperability expert ODI has responsibility.
Policies NIST policies on research data reflect OSTP and OMB directives of 2013 NIST P , eff. 6/26/2015: “To the extent feasible and consistent.
Accessing the VI-SEEM infrastructure
RDA US Science workshop Arlington VA, Aug 2014 Cees de Laat with many slides from Ed Seidel/Rob Pennington.
Tools and Services Workshop
Building a worldwide interoperable data infrastructure for astronomy: the International Virtual Observatory Alliance SciDataCon 2016, Denver, 13/Sept/2016.
An Overview of Data-PASS Shared Catalog
Long-Term Preservation of Astronomical Research Results
Google Sky.
Bird of Feather Session
Presentation transcript:

Hanisch / National Data Service, Boulder, CO 13 June 2014 Building on Existing Communities: the Virtual Astronomical Observatory (and NIST) Robert Hanisch Space Telescope Science Institute Director, Virtual Astronomical Observatory

Hanisch / National Data Service, Boulder, CO 13 June Data in astronomy ~70 major data centers and observatories with substantial on-line data holdings ~10,000 data “resources” (catalogs, surveys, archives) Data centers host from a few to ~100s TB each, currently at least 2 PB total Current growth rate ~0.5 PB/yr, increasing Current request rate ~1 PB/yr Future surveys will increase data rates to PB/day  “For LSST, the telescope is a peripheral to the data system” (T. Tyson) How do astronomers navigate through all of this data?

Hanisch / National Data Service, Boulder, CO 13 June The Virtual Observatory Images, spectra, time series Catalogs, databases Transient event notices Software and services Application inter-communication Distributed computing  authentication, authorization, process management International coordination and collaboration through IVOA (a la W3C) The VO is a data discovery, access, and integration facility 3

Hanisch / National Data Service, Boulder, CO 13 June 2014 Virtual Observatory capabilities Data exchange / interoperability / multi-λ (co-observing) Data Access Layer (SIAP, SSAP / time series) Query and cross-match across distributed databases Cone Search, Table Access Protocol Remote (but managed) access to centralized computing and data storage resources VOSpace, Single-Sign-On (OpenID), SciDrive Transient event notification, scalable to 10 6 messages/night VOEvent Data mining, characterization, classification, statistical analysis VOStat, Data Mining and Exploration toolkit 4

Hanisch / National Data Service, Boulder, CO 13 June VO architecture

Hanisch / National Data Service, Boulder, CO 13 June VO architecture

Hanisch / National Data Service, Boulder, CO 13 June VO architecture

Hanisch / National Data Service, Boulder, CO 13 June 2014 Key to discovery: Registry Used to discover and locate resources—data and services—that can be used in a VO application Resource: anything that is describable and identifiable.  Besides data and services: organizations, projects, software, standards Registry: a list of resource descriptions  Expressed as structured metadata in XML to enable automated processing and searching Metadata based on Dublin Core 8

Hanisch / National Data Service, Boulder, CO 13 June 2014 Registry framework 9 Data Centers Local Publishing Registry Full Searchable Registry Local Publishing Registry Full Searchable Registry (pull) harvest replicate search queries Users, applications OAI/PMH

Hanisch / National Data Service, Boulder, CO 13 June Data discovery

Hanisch / National Data Service, Boulder, CO 13 June 2014 Data discovery 11

Hanisch / National Data Service, Boulder, CO 13 June 2014 SciDrive: astro-centric cloud storage 12 Automatic Metadata Extraction Extract tabular data from: CSV FITS TIFF Excel Extract metadata from: FITS Image files (TIFF, JPG) Automatically upload tables into relational databases: CasJobs/MyDB SQLShare Controlled data sharing Single sign-on Deployable as virtual machine

Hanisch / National Data Service, Boulder, CO 13 June 2014 The VO concept elsewhere Space Science  Virtual Heliophysics Observatory (HELIO)  Virtual Radiation Belt Observatory (ViRBO)  Virtual Space Physics Observatory (VSPO)  Virtual Magnetospheric Observatory (VMO)  Virtual Ionosphere Thermosphere Mesosphere Observatory (VITMO)  Virtual Solar-Terrestrial Observatory (VSTO)  Virtual Sun/Earth Observatory (VSEO) Virtual Solar Observatory Planetary Science Virtual Observatory Deep Carbon Virtual Observatory Virtual Brain Observatory 13

Hanisch / National Data Service, Boulder, CO 13 June 2014 Data management at I move to NIST 7/28/2014 as Director, Office of Data and Informatics, Material Measurement Laboratory  Materials science, chemistry, biology  Materials Genome Initiative Foster a culture of data management, curation, re-use in a bench- scientist / PI-dominated organization having a strong record of providing “gold standard” data Inward-looking challenges  Tools, support, advice, common platforms, solution broker  Big data, lots of small/medium data Outward-looking challenges  Service directory  Modern web interfaces, APIs, better service integration  Get better sense of what communities want from NIST Define standards, standard practices Collaboration: other government agencies, universities, domain repositories 14

Hanisch / National Data Service, Boulder, CO 13 June

Hanisch / National Data Service, Boulder, CO 13 June 2014 NDS and domain repositories Domain repositories are discipline-specific Various business models in use; long-term sustainability is a major challenge* Potential NDS roles  Customizable data management and curation tools built on a common substrate  Access to cloud-like storage but at non-commercial rates  A directory of ontology-building and metadata management tools  A directory of domain repositories  Accreditation services  Advice, referral services, “genius bar” * “Sustaining Domain Repositories for Digital Data: A White Paper,” C. Ember & R. Hanisch, eds. DRDD_ pdf 16

Hanisch / National Data Service, Boulder, CO 13 June 2014 Technologies/standards to build on Just use the VO standards!  OK, seriously… NIH syndrome  Much could be re-used in terms of architecture  Generic, collection-level metadata Cross-talk with Research Data Alliance (ANDS, EUDAT)  Data Citation WG  Data Description Registry Interoperability WG  Data Type Registries WG  Domain Repositories IG  Long Tail of Research Data IG  Metadata IG  Metadata Standards Directory WG  Preservation e-Infrastructure WG  and others… 17 Dataverse, Dryad, iRODS, DSpace, etc.

Hanisch / National Data Service, Boulder, CO 13 June 2014 Lessons learned re/ federation It takes more time than you think  Community consensus requires buy-in early and throughout Top-down imposition of standards likely to fail Balance requirements coming from a research- oriented community with innovation in IT Marketing is very important  Managing expectations  Build it, and they might come Coordination at the international level is essential  But takes time and effort Data models – sometimes seem obvious, more often not Metadata collection and curation are eternal but essential tasks 18

Hanisch / National Data Service, Boulder, CO 13 June 2014 Lessons learned re/ federation For example, the Cancer Biomedical Informatics Grid (caBIG) [$350M]  “…goal was to provide shared computing for biomedical research and to develop software tools and standard formats for information exchange.”  “The program grew too rapidly without careful prioritization or a cost-effective business model.”  “…software is overdesigned, difficult to use, or lacking support and documentation.”  "The failure to link the mission objectives to the technology shows how important user acceptance and buy-in can be.” (M. Biddick, Fusion PPT) J. Foley, InformationWeek, 4/8/ cancer-research-grid/d/d-id/ ? 19