20 January 2004 ESS Technical Colloquium
NVO Infrastructure
Gretchen Greene
THE US NATIONAL VIRTUAL OBSERVATORY

Collaborators
Bob Hanisch (STScI), William O’Mullane (JHU), Alex Szalay (JHU), Tamas Budavari (JHU), Maria A. Nieto-Santisteban (JHU), Ani Thakar, Jeongin Lee (GSFC), Tom McGlynn (GSFC), Ray Plante (NCSA), Tony Linde (AstroGrid/Univ. of Leicester), Kevin Benson (AstroGrid/Mullard Space Science Laboratory), Niall Gaffney (STScI), Antonio Volpicelli (STScI/OATo), NVO service providers (including MAST, Randy Thompson)

NVO?
If only I could take HST images and 2MASS data and SDSS spectra and figure out the wonders of the universe!

What is NVO?
An NSF-funded collaboration between astronomical researchers and information technologists to build a global network of astronomical resources that facilitates scientific discovery and operations in the space science community.

Who works on the NVO Project?
Distributed US Projects
–PI (Alex Szalay/JHU), Project Manager (Bob Hanisch/STScI), Astronomical Institutions…
International Collaboration
–IVOA is the “parent” consortium; partners: GAVO, JVO, NVO, AstroGrid, others
–Global network requires this
–Astronomical community is international

NVO 2003 Milestones
Demonstrate science with VO concepts
Establish standards…
–Services: spatial catalog query (“cone search”), Simple Image Access Protocol (SIAP), SkyNode (*new* web service)
–Protocols: data exchange format => XML schemas (VOTable, VOResource)
Form collaborations to accomplish goals
–Working Groups: Metadata, Data Models, Web Service and Grid, DAL (Data Access Layer), Applications
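
To illustrate the cone search and VOTable milestones above, here is a minimal sketch rather than the project's implementation. It assumes the Cone Search convention of passing the position and search radius as `RA`, `DEC`, and `SR` query parameters in decimal degrees. The base URL, the sample VOTable document, and the function names are hypothetical, and real VOTable responses usually carry an XML namespace that this sketch omits for brevity.

```python
from typing import Dict, List
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def cone_search_url(base_url: str, ra: float, dec: float, sr: float) -> str:
    """Build a cone search request URL.

    ra, dec: centre of the cone; sr: search radius. All in decimal degrees.
    """
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode({"RA": ra, "DEC": dec, "SR": sr})


# Hypothetical, heavily simplified VOTable response (no namespace, three columns).
SAMPLE_VOTABLE = """<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="id" datatype="char" arraysize="*"/>
      <FIELD name="ra" datatype="double"/>
      <FIELD name="dec" datatype="double"/>
      <DATA><TABLEDATA>
        <TR><TD>obj-1</TD><TD>180.01</TD><TD>-0.52</TD></TR>
        <TR><TD>obj-2</TD><TD>179.98</TD><TD>-0.49</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def parse_votable(xml_text: str) -> List[Dict[str, str]]:
    """Return one dict per table row, keyed by the FIELD names."""
    root = ET.fromstring(xml_text)
    names = [field.get("name") for field in root.iter("FIELD")]
    rows = []
    for tr in root.iter("TR"):
        cells = [td.text for td in tr.findall("TD")]
        rows.append(dict(zip(names, cells)))
    return rows
```

A client would fetch the URL returned by `cone_search_url` and pass the response body to `parse_votable`; cell values stay as strings here, so numeric columns would still need converting.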

NVO 2004 Milestones
Complete Simple Spectral Access Protocol (SSAP)
Incorporate Data Models (DM)
–For mining more complex data structures
Tune standards and schemas based on 2003 lessons learned
Network-related issues
–Security, authentication
Continue to build science tools and applications

VO Framework/Architecture
[Architecture diagram. Labels: Portals, User Interfaces, Tools (VOPlot, Mirage, Topcat, Treeview, DIS, MAST); Information Discovery; Registries; Computational Services; Virtual Data; Data Access Layer (DAL); ConeSearch; SIAP, SSAP, ADQL; VOTable, FITS, GIF; Archives, Collections, Catalogs; HTTP, Web, & Grid Services]

Where is NVO locally?
STScI
–Project management for NVO
–VO services, VO science app and registry prototypes, MAST integration
JHU
–Science leadership
–SkyServer, SkyQuery, and prototypes as drivers for NVO
STScI and JHU
–Technical collaboration
–Established local tech exchange meetings
–Academic + operational science gives you the best of both worlds

STScI VO Resources
Searchable NVO prototype registry –
Science Demo Portal (Galaxy Morphology) – vo.org/prototypes/galaxymorphology.html
GSC, DSS, Hip, Tycho, HST pointings –
MAST VO services (including GALEX) –

Galaxy Morphology Demo
Goal:
–Analyze the morphology of a cluster of galaxies to study the formation properties…
Technology:
–NVO standards + Grid + .NET + Java +++
What resources are available?
–Define required parameters, manually identify resources, form integration plan
Collaboration… a ‘must’ to obtain all components
–STScI, JHU, NCSA, Fermi, USC/ISI
HAVE THIS WORKING in a couple of months using the NVO…

Galmorph Demo
Glimpse into the potential applications…
Show Under the Hood…
–Go to US-VO site
–Go to Prototype
–Run Under the Hood

What was missing?
A Resource Registry
–Find the resources and services available to perform this scientific task…
–How do I connect these together?
–What tools are available to visualize and analyze my results?

The Role of Resource Registries
Used to discover and locate resources (data and services) that can be used in a VO application
–May also include tools, e.g. ETC
Registry: a list of resource descriptions
–Expressed as structured metadata to enable automated processing and searching
Registries are themselves VO Resources

Registry Requirements
Allow users to select resources that are likely to pertain to a scientific question
Select resources based on characteristics…
–Type of resource: catalogs, image archives, EPO, services
–Coverage in space, time, and frequency
–Where data comes from, who curates it
Dynamic: resources will come and go
Distributed: should not depend on a single point of failure or a single view of the VO
Preserve the data providers’ control over their data
–Curators control what gets registered, content, updates
–Allow integration with existing resource management
Allow extension to new types of resources *customized
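
The selection requirements above (by resource type, coverage, and curation) can be sketched as a filter over structured records. This is only an illustration under stated assumptions: the `ResourceRecord` fields, the rectangular sky-region model, and the sample entries are hypothetical and are not the VOResource schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ResourceRecord:
    """Hypothetical, simplified resource description."""
    identifier: str
    resource_type: str          # e.g. "catalog", "image archive", "service"
    curator: str                # who curates the data
    waveband: str               # coarse frequency coverage, e.g. "optical"
    # (ra_min, ra_max, dec_min, dec_max) in degrees; None means all-sky.
    sky_region: Optional[Tuple[float, float, float, float]] = None


def covers(record: ResourceRecord, ra: float, dec: float) -> bool:
    """True if the position falls inside the record's declared sky region.

    Simplification: a plain RA/Dec box that does not handle RA wrap-around.
    """
    if record.sky_region is None:
        return True
    ra_min, ra_max, dec_min, dec_max = record.sky_region
    return ra_min <= ra <= ra_max and dec_min <= dec <= dec_max


def select_resources(
    registry: List[ResourceRecord],
    resource_type: Optional[str] = None,
    waveband: Optional[str] = None,
    position: Optional[Tuple[float, float]] = None,
) -> List[ResourceRecord]:
    """Keep the records matching every criterion that was supplied."""
    selected = []
    for record in registry:
        if resource_type is not None and record.resource_type != resource_type:
            continue
        if waveband is not None and record.waveband != waveband:
            continue
        if position is not None and not covers(record, *position):
            continue
        selected.append(record)
    return selected


# Hypothetical registry contents for illustration only.
SAMPLE_REGISTRY = [
    ResourceRecord("ivo://example/cat", "catalog", "Example Data Center",
                   "optical", (170.0, 190.0, -5.0, 5.0)),
    ResourceRecord("ivo://example/img", "image archive", "Example Archive",
                   "x-ray"),
]
```

For example, `select_resources(SAMPLE_REGISTRY, resource_type="catalog", position=(180.0, 0.0))` returns only the catalog record, while a position outside its box returns only the all-sky archive.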

IVOA Registry Working Group (RWG)
Common approach to registries
Work packages
–Science requirements and use cases
–Resource metadata
–Registry interfaces
–Prototyping
Distributed model for registries

Registry Model
[Diagram, built up over four slides. Labels: Data Centers, VO Projects, Specialized Portals & Services; Local Publishing Registry, Local Searchable Registry, Full Searchable Registry. Successive slides add the connections "harvest (pull)", "replicate", and "selective harvesting".]

NVO Prototype Registry
To support a Data Inventory Service (DIS)
–What is known about a position in the sky?
–Use a registry to locate and query standard services: Cone Search Services (querying catalogs) and Simple Image Access Services (querying image archives and cutout services)
Components
–Publishing Registries
–Searchable Registry
–Resource Metadata
–Harvesting Protocol
–Populated with service descriptions

Resource Metadata
KEY means for exchange between registries
Under development within the IVOA RWG
The standard comes in two parts:
–Prose document that defines concepts independent of an encoding scheme: Resource Metadata Document (RSM) by Hanisch et al.
–XML Schemas
Draws on Dublin Core metadata
–An interdisciplinary standard for core resource metadata

Resource Metadata: XML Schema
Classes of resources: Organization, DataCollection, Service, Registry
–Specific classes inherit from generic
Organized into separate schemas:
–Core resource metadata: VOResource
–Various extension schemas containing specific types
Capable of describing…
–Data centers, research organizations, missions, observatories
–Data collections, archives
–VO standard services: Cone Search, Simple Image Access
–Existing browser/CGI-based services

Publishing Registries: getting information into registries
Multiple publishing registries established
Motivation:
–Register VO services
–Develop techniques for easy registration
Varied storage solutions for resource descriptions
–XML documents
–Custom file systems
–Relational databases

Harvesting Interface
Adopted the Open Archives Initiative (OAI) Protocol for Metadata Harvesting
–HTTP/CGI-based protocol for exposing metadata to harvesters (e.g. searchable registries)
Advantages:
–Existing, field-tested design we didn’t have to re-invent
–Fairly easy to implement
–Existing tools for emitting and harvesting metadata
–Exposes our metadata to the larger digital library community
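
As a rough sketch of what a harvester sends over this HTTP interface, the following builds OAI-PMH `ListRecords` request URLs. The `verb`, `metadataPrefix`, `from`, `set`, and `resumptionToken` arguments come from the OAI-PMH specification; the base URL, the set name, and the helper names are hypothetical, and `oai_dc` (the Dublin Core format every OAI-PMH repository must support) stands in for whatever metadata format the NVO registries actually exchanged.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request.

    from_date: only records changed on or after this date (incremental harvest).
    set_spec:  restrict to one set of records (selective harvest).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def resume_url(base_url: str, resumption_token: str) -> str:
    """Request the next chunk of a long ListRecords response.

    In OAI-PMH the resumption token is an exclusive argument: it replaces
    metadataPrefix, from, and set on the follow-up request.
    """
    params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    return base_url + "?" + urlencode(params)
```

For example, `list_records_url("http://example.org/oai", from_date="2004-01-01")` asks a (hypothetical) publishing registry for everything changed since the start of 2004.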

Searchable Registry
A Searchable Registry was set up at JHU/STScI
OAI harvester collects resource descriptions
–From Publishing Registries
–Use of modification date for …
–Parses XML and loads data into a relational database
SOAP Web Service interface
–Searching
–Currently provides specialized SQL querying useful for DIS
Web form access for conventional practices
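
The harvest, parse, and load pipeline above can be sketched with an in-memory SQLite database. This is an illustration, not the JHU/STScI implementation: the XML layout, the table schema, and the function names are hypothetical, and using the modification date to skip unchanged records is an assumption about how that date was used.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical harvested records; real registries exchange VOResource XML.
HARVESTED = """<records>
  <resource updated="2004-01-10">
    <identifier>ivo://example/cone1</identifier>
    <title>Example catalog cone search</title>
    <type>ConeSearch</type>
  </resource>
  <resource updated="2004-01-12">
    <identifier>ivo://example/sia1</identifier>
    <title>Example image cutout service</title>
    <type>SimpleImageAccess</type>
  </resource>
</records>"""


def create_registry(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resources ("
        "identifier TEXT PRIMARY KEY, title TEXT, type TEXT, updated TEXT)"
    )


def load_records(conn: sqlite3.Connection, xml_text: str) -> int:
    """Insert new records and replace stale ones; return how many changed."""
    changed = 0
    for res in ET.fromstring(xml_text).iter("resource"):
        identifier = res.findtext("identifier")
        updated = res.get("updated")
        row = conn.execute(
            "SELECT updated FROM resources WHERE identifier = ?", (identifier,)
        ).fetchone()
        # ISO dates compare correctly as strings; skip records we already hold.
        if row is not None and row[0] >= updated:
            continue
        conn.execute(
            "INSERT OR REPLACE INTO resources (identifier, title, type, updated) "
            "VALUES (?, ?, ?, ?)",
            (identifier, res.findtext("title"), res.findtext("type"), updated),
        )
        changed += 1
    conn.commit()
    return changed


def find_services(conn: sqlite3.Connection, service_type: str) -> List[str]:
    """The kind of SQL lookup a DIS-style client might run."""
    rows = conn.execute(
        "SELECT identifier FROM resources WHERE type = ? ORDER BY identifier",
        (service_type,),
    )
    return [row[0] for row in rows]
```

Loading the same harvest twice changes nothing the second time, which is the point of keying on the modification date.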

Registry Model
[Diagram, built up over two slides: Local Publishing Registries (HEASARC, NCSA, VizieR, Caltech, AstroGrid) are harvested (pull) by the Full Searchable Registry (STScI); the Data Inventory Service (DIS) searches it for services. The second slide adds the Data Providers' Cone Search and Simple Image Access services.]

Registry Page

NVO: Data Inventory Service (DIS)
Rapid retrieval of all registered images, catalogs, and pointed observations for a selected position on the sky

NVO: DIS Registry Findings
–Images
–Pointed observations
–Source catalogs

NVO: Visualization of DIS Findings
Easy comparison of multiwavelength data
[Image panels: Radio, Optical, X-ray. Aladin image viewer, CDS]

Lessons Learned
XML schema needs simplification
–Hierarchical layering makes parsing complex for very heterogeneous resources
Data Integrity
–Transition from 100 => many thousands of resources needs an efficient means of validating metadata
Synchronization Between Repositories
–Rating integrity of resources
–Stamps
Large Scale Harvesting
–Network instabilities need to be accounted for
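
One common way to account for network instability during harvesting is to retry failed requests with an increasing delay. The sketch below shows that general technique; it is not something the slides say the NVO harvester did, and the function name and default values are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    fetch: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on OSError with exponential backoff.

    Network failures raised by urllib and sockets are OSError subclasses.
    The delay doubles after each failure: base_delay, 2x, 4x, ...
    The last failure is re-raised so the caller can log or skip the registry.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.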

VO Goals at STScI for 2004
Quality Data Provider
–Standard VO services to the archives
Build Science Discovery mechanisms
–Efficient user interfaces to services, registry, analysis tools
Scientific Leadership
–Build next-generation applications using the VO technology (planning)
Coordinated Development
–Internal: cross-division (ESS, ODM, CMO, OPO)
–Continued collaboration with JHU and other VO partners

NVO Goals for 2004
Standardized service provider functions
–Heartbeat (is service alive)
–Footprint (spatial field)
Cross correlation, HTM, HEALPix indexing
Large Scale Data Correlations
–Large archives
–Mining large numbers of resources
Automated discovery and analysis
Authorization/Authentication/Security
–Grid and web service technology
Seamless integration
–Make client application building simple
Underlying model may be complex, yet modular for maintainability
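
The "heartbeat" goal above names a capability, not a protocol, so the following is only one plausible reading: a client probes each registered service URL and records whether it answered. The function names, the timeout, and the "any non-error HTTP response counts as alive" rule are all assumptions.

```python
import urllib.request
from typing import Callable, Dict


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers malformed URLs.
        return False


def heartbeat(
    services: Dict[str, str],
    probe: Callable[[str], bool] = http_probe,
) -> Dict[str, bool]:
    """Map each service name to whether its URL responded."""
    return {name: probe(url) for name, url in services.items()}
```

Passing a different `probe` makes it possible to exercise the bookkeeping without touching the network, or to swap in a service-specific liveness check later.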