8 September 2006, NVO Summer School, Aspen
Publishing and Resource Discovery with Registries
Ray Plante
Gretchen Greene
THE US NATIONAL VIRTUAL OBSERVATORY

All about Registries
Overview of the Registry Framework
Publishing to the NVO
VOResource: Resource Metadata in XML
IVOA Standard Registry Interface
Exercise: query registry using standard interface
Exercise: register resources in a registry

The role of Resource Registries
Used to discover and locate resources (data and services) that can be used in a VO application
Resource: anything that is describable and identifiable.
–Besides data and services: organizations, projects, software, …
–Presently concerned with a simple set of resource types
Registry: a list of resource descriptions
–Expressed as structured metadata to enable automated processing and searching

An Overview of Data Discovery
You can search the main NVO registry to find resources based on descriptive criteria
NVO Registries are coarse-grained
–You can find organizations, archives, catalogs
–You won't find images, celestial objects, table records
Registry framework contains multiple registries:
–searchable registries
–publishing registries

Registry Framework
[Diagram, built up over six slides. Boxes: Local Publishing Registry, Local Searchable Registry, Full Searchable Registry. Groupings: Data Centers, VO Projects, Specialized Portals & Services. Arrow labels between registries: harvest (pull), cross-harvest, selective harvesting. Client Applications: search queries.]

NVO Public Registries
Columns: Registry, URL, Searchable?, Publishing?
–STScI/JHU NVO Registry
–Caltech Carnivore, http://nvo.caltech.edu:8080/carnivore/, Yes
–NCSA Registration Portal
Private Publishing Registries
–HEASARC
–CDS
Only support harvesting protocol

Overview of Publishing
Resources are published if one can use NVO facilities to find them.
How to Publish to the NVO
–Multiple layers of publishing
    Starts with registry description of resource
    Data Access Services
    Incremental exposure for incremental effort
–Who are you? How you publish depends on what you want to publish.
    An individual with a small data collection
    An archive center
    Someone with a cool service

Small collections: VO-ready Repositories
Repositories that allow users to deposit data to share with community
–Guarantee long-term storage, availability
Automatic support for VO publishing mechanisms
–Entries into NVO Registry
–Support for standard services: Cone Search, SIA, SSA, SkyNode
Currently available Repositories
–Images: NCSA Astronomy Digital Image Library
–Spectra: Spectrum Service for the VO
Part of an emerging data-preservation effort
–Focusing on processed products associated with published results
–Collaboration between NVO, journal publishers, and the library community
–Goals:
    data publishing integrated into the journal publishing process
    data stored in distributed repositories run by academic libraries
    fully VO compliant

Persistent Archives: Tools for Federation
Registering your resources with a VO publishing registry
–Enter description into registration form at one of the available NVO registries:
    STScI/JHU Registry
    NCSA Registration Portal
    Caltech Carnivore
–If you have a large number of resources to register, you can run your own registry on your own site
    Caltech Carnivore

Caution: Construction Ahead
IVOA Standard Registry Interface
–Will come on-line this fall world-wide
–As part of this upgrade, NVO will unify publishing interfaces
    It won't matter which NVO registry you register with
    Improved support for all types of resources
–Will affect how users express advanced, constraint-based searches.
–In general, this presentation describes the registries and publishing in terms of the new standards
Your feedback is valuable!
–Publishing GUI
–The publishing process
–Client interfaces

Persistent Archives: Tools for Federation
What can/should you register?
–Should: your Organization
    Declares yourself as a publisher with an ID
–Should: your Data Collection
    Users at least know how to access it via a Browser
–Can: your existing services
    Browser-based services: e.g. search page
    Traditional CGI services
    Web Services
The next level…
Implement and register one or more standard services
–Cone Search
–Simple Image Access
–Simple Spectral Access*
–SkyNode
*newest service standard

Cool Services: Integrating with the VO
1. Register your service at a registry
    Currently…
        Can register a generic Web Service
        If the service doesn't fall into supported categories, register it as a generic Service
    Improved support for non-standard services coming
    Feel free to let us know where the forms are inadequate
2. Integrate support for standard VO formats, schemas
    FITS and VOTable
    Standard Data Model schemas (emerging)
        VOResource, Space-time Coordinates, Spectra
3. Implement Standard Support Interface
    a standard in development for:
        Self-description, tracking health and usage

A word about Identifiers…
IVOA Identifier: a globally-unique URI identifying a resource
    Ex: ivo://adil.ncsa/targeted/SIA
Required as part of a registered resource description
As publisher, you control what it looks like
Two components:
–Authority ID: e.g. adil.ncsa
    Defines a namespace for identifiers
    Owned by a single publishing organization
–Resource Key: e.g. targeted/SIA
    Name for the resource unique within the namespace
    Encourage re-use of local identifiers
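
The two-component structure of an identifier can be illustrated with a short sketch. This is only a minimal illustration in Python, not part of any NVO library: the names `IvoaId` and `parse_ivoa_id` are hypothetical, and the function checks nothing beyond the `ivo://` prefix and a non-empty authority ID.

```python
from typing import NamedTuple


class IvoaId(NamedTuple):
    """Hypothetical container for the two components of an IVOA identifier."""
    authority: str      # namespace owned by a single publishing organization
    resource_key: str   # name unique within that namespace (may be empty)


def parse_ivoa_id(identifier: str) -> IvoaId:
    """Split 'ivo://<authority>/<resource key>' into its two components."""
    prefix = "ivo://"
    if not identifier.startswith(prefix):
        raise ValueError(f"not an IVOA identifier: {identifier!r}")
    # Everything up to the first '/' is the authority ID; the rest is the key.
    authority, _, resource_key = identifier[len(prefix):].partition("/")
    if not authority:
        raise ValueError(f"missing authority ID: {identifier!r}")
    return IvoaId(authority, resource_key)


adil = parse_ivoa_id("ivo://adil.ncsa/targeted/SIA")
```

Here `adil.authority` is `adil.ncsa` and `adil.resource_key` is `targeted/SIA`, matching the example on the slide. Keeping the resource key as an opaque string reflects the point that the publisher controls what it looks like.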

Resource Metadata: XML Schema
Classes of Resources
–Generic Resource
–Extensions: e.g. Organisation, DataCollection, Service, DataService, CatalogService, Registry, …
Organized into separate schemas:
–Core resource metadata: VOResource
–Various extension schemas containing specific types
Capable of describing…
–Data centers, research organizations, missions, observatories
–Data collections, archives
–VO standard services: Cone Search, Simple Image Access, Simple Spectral Access, SkyNode
–Existing Browser/CGI-based services
–Web Services

Resource Metadata: Services
Service resource records extend the core by adding capability metadata
–capability = the interfaces/protocols and behavior supported by the service
–Each standard protocol is considered a different capability
    A service can support several capabilities (e.g. ConeSearch and SkyNode)
–There are associated standard capability metadata extensions for standard protocols
    For Simple Image Access, can state…
    –Maximum number of records returned
    –Maximum query region
    –Whether returned images are cutouts or static images…
Capability metadata includes a description of the service interface
–All interface descriptions include a service or access URL
–For Web Services, access URL is usually sufficient
–For REST-like interfaces, the inputs can be described in more detail.
Capability model allows description of support for different versions of protocol standards
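
The relationship between a service, its capabilities, and their interfaces can be pictured as a small data model. The sketch below is an illustration only and does not reproduce the VOResource schema: the class names, field names, and `example.org` access URLs are hypothetical, and the ConeSearch standard ID is assumed by analogy with the SIA one shown in the sample record.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Interface:
    access_url: str  # every interface description carries a service or access URL


@dataclass
class Capability:
    standard_id: str  # identifies the protocol standard this capability supports
    interfaces: List[Interface] = field(default_factory=list)


@dataclass
class Service:
    identifier: str
    capabilities: List[Capability] = field(default_factory=list)

    def capability_for(self, standard_id: str) -> Optional[Capability]:
        """Return the first capability supporting the given standard, if any."""
        for cap in self.capabilities:
            if cap.standard_id == standard_id:
                return cap
        return None


# One service record supporting two standard protocols (URLs are placeholders).
service = Service(
    identifier="ivo://adil.ncsa/targeted/SIA",
    capabilities=[
        Capability("ivo://ivoa.net/std/SIA",
                   [Interface("http://example.org/sia?")]),
        Capability("ivo://ivoa.net/std/ConeSearch",
                   [Interface("http://example.org/cone?")]),
    ],
)
sia = service.capability_for("ivo://ivoa.net/std/SIA")
```

A client that only understands Simple Image Access can pick out `sia` and ignore the other capabilities, which is the practical benefit of treating each protocol as a separate capability.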

Sample Resource Description
adilsia.xml
<Resource xsi:type="vs:CatalogService" created=" T19:02:32" updated=" T17:07:22" … namespace definitions >
2 NCSA Astronomy Digital Image Library Simple Image Access ADIL ivo://adil.ncsa/targeted/SIA NCSA Radio Astronomy Imaging contributing authors Dr. Raymond Plante This allows searching for ADIL images via the SIA protocol. Archive University Research
Callouts:
–The specific class of resource
–Metadata quality rating
–Identity Metadata: what we call it
–Curation Metadata: who is responsible
–Content Metadata: what it contains

Sample Resource Description
adilsia.xml
<capability xsi:type="sia:SimpleImageAccess" standardID="ivo://ivoa.net/std/SIA">
2 GET application/xml+votable Pointed
<stc:AstroCoordSystem id="UTC-ICRS-TOPO" xlink:href="ivo://STClib/CoordSys#UTC-ICRS-TOPO" xlink:type="simple"/>
Radio Millimeter Infrared Optical UV
Callouts:
–Capability Metadata: what it can do
–The specific class of capability
–Coverage Metadata: how the data covers the sky, time, & frequency

IVOA Standard Registry Interface
Harvesting Interface
–Used by registries to exchange resource descriptions.
–Defined as a profile on the Open Archives Initiative (OAI) harvesting standard
Search Interface
–How client applications discover resources
–5 operations:
    getIdentity: returns VOResource description of registry
    getResource: returns the VOResource description for a given identifier
    keywordSearch: returns all descriptions that contain words from a given set
    search: returns all descriptions that match a set of specific constraints
    xquerySearch: (optional) XQuery-based searching
–For end users, many of the details of these operations may be hidden behind user-oriented tools and libraries
    Possible exception: expressing constraint-based searches
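
Because the harvesting interface is a profile on OAI, a harvest amounts to an ordinary HTTP GET with OAI-PMH query parameters. The sketch below only builds such a request URL and makes no network call. The base URL is a placeholder, the function name is hypothetical, and the `ivo_vor` metadata prefix and `ivo_managed` set name are taken from the IVOA Registry Interfaces specification rather than from this presentation, so treat them as assumptions to check against the registry you harvest from.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     from_date: Optional[str] = None,
                     managed_only: bool = True) -> str:
    """Build an OAI-PMH ListRecords request URL for harvesting a registry."""
    params = {
        "verb": "ListRecords",        # standard OAI-PMH verb
        "metadataPrefix": "ivo_vor",  # assumed prefix for VOResource records
    }
    if managed_only:
        # Assumed set name: only records the registry itself publishes.
        params["set"] = "ivo_managed"
    if from_date:
        # Incremental harvest: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


url = list_records_url("http://registry.example.org/oai", from_date="2006-09-01")
```

Passing a `from` date is what makes repeated harvesting cheap: a searchable registry only pulls the records that changed since its last visit.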

Advanced, constraint-based searching
Placing constraints on values of specific metadata
–Expressed as an ADQL where clause
    e.g. title like '%Deep Field%' or shortName='HDF'
–A field or column name is expressed as a simple XPath to the element being constrained
    Relative to the root Resource element
    Composed of /s and element names only; [ ] predicates and special characters (*, ., .., //) are not allowed
    Must point to a primitive value, e.g. one that contains a string
    Can point to an attribute by preceding the name with @
    e.g. … or curation/publisher like '%IPAC%') and content/contentLevel='Research' and capability/validationLevel >= 3
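
The restrictions on field names are simple enough to check mechanically. The following sketch is one possible reading of those rules, not an official validator: the function name is hypothetical, the allowed name characters are a simplification of XML naming, and it assumes attributes are marked with XPath's `@` and may carry a namespace prefix (as in `@xsi:type`).

```python
import re

# Simplified XML name: letters, digits, '_' and '-', not starting with a digit.
_NAME = r"[A-Za-z_][A-Za-z0-9_-]*"
# An attribute name may carry a namespace prefix, e.g. xsi:type.
_QNAME = rf"(?:{_NAME}:)?{_NAME}"
# Zero or more 'element/' steps, then a final element name or '@attribute'.
_FIELD_PATH = re.compile(rf"(?:{_NAME}/)*(?:{_NAME}|@{_QNAME})")


def is_valid_field_path(path: str) -> bool:
    """True if path uses only '/'-separated element names (optional final @attr)."""
    return _FIELD_PATH.fullmatch(path) is not None


accepted = [p for p in ("title", "curation/publisher", "@xsi:type",
                        "capability/interface[1]", "//title", "../title", "*")
            if is_valid_field_path(p)]
```

With these inputs, `accepted` keeps only `title`, `curation/publisher`, and `@xsi:type`; the paths using a predicate, `//`, `..`, or `*` are rejected.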

Advanced, constraint-based searching
Searching on xsi:type
–Avoid including the prefix label in xsi:type
    @xsi:type like '%:CatalogService'
    @xsi:type like '%Service'
    –Matches Service, DataService, & CatalogService
Selecting based on coverage
–It is generally not useful to apply ADQL constraints to Space-Time Coordinate (STC) metadata
    e.g. anything under stc:STCResourceProfile
    STC descriptions are complex and not sufficiently unique
–Emerging footprint services will facilitate selection based on coverage
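
The effect of the wildcard patterns on xsi:type values can be tried out with a small stand-in for the `like` operator. This is only a sketch under stated assumptions: `sql_like` is a hypothetical helper that treats `%` as any run of characters and `_` as a single character, matching is assumed to be case-sensitive, and the prefixed type names in the list are illustrative values rather than records from a real registry.

```python
import re


def sql_like(value: str, pattern: str) -> bool:
    """Emulate SQL LIKE: '%' matches any run of characters, '_' exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Illustrative xsi:type values; the prefix label can differ between records.
types = ["vr:Service", "vs:DataService", "vs:CatalogService", "vr:Organisation"]

any_service = [t for t in types if sql_like(t, "%Service")]
catalog_only = [t for t in types if sql_like(t, "%:CatalogService")]
```

`any_service` picks up all three service types, while `catalog_only` contains just the CatalogService entry. A pattern that hard-codes the prefix, such as `vs:CatalogService`, would miss a record that declares the same type under a different prefix label, which is why the leading `%` is recommended.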