February 12, 2002, Tom McGlynn
ADEC Interoperability Technical Working Group Report

Membership
  IRSA: Bruce Berriman, John Good
  ADC: Kirk Borne, Brian Thomas
  CXC: Arnold Rots
  NED: Joe Mazzarella
  ADS: Guenther Eichhorn
  MAST: Tim Kimball
  HEASARC: Tom McGlynn

Charter Goal
    – The top-level goal is to provide seamless (one-stop shopping) access to all the NASA astrophysics catalog and data services, irrespective of where the information is located.
Charter Requirements
    – All services will be accessible via standard protocols
    – Access will be via the existing user interfaces
    – Results will appear consistent with the interface that initiated the request
    – Attribution of data to original site will be clearly stated

Current Interoperating Services
  Many interoperating services are already in place at ADEC sites, e.g.,
    – ADC links to HEASARC, IRSA and MAST archives
    – ADS links to all datasets associated with a given bibcode
    – HEASARC links to CXC and MAST data and provides access to all VizieR tables
    – IRSA suite of remotely accessible catalog and data services
    – MAST provides transparent access to HEASARC ROSAT and EUVE archives
    – NED links to multiple ADEC/non-ADEC data sources for many objects; the NED name resolver is used in many places
    – Astrobrowse services at CXC, MAST and HEASARC
  No general framework for interoperations
    – Limited capabilities
    – Incompleteness/inconsistency
    – Resource discovery is difficult
    – Fragile links

Summary of Activities
  Telecons
    – January 18, 2002
    – February 8, 2002
  Strawman design implementations
  Temporary Web site
    – heasarc.gsfc.nasa.gov/itwg

Areas of Agreement
  No generic mandates on user interfaces
    – Each site uses resources developed in this effort to present information appropriate to its users.
  Layered approach to implementation
    – Stage 1: Links to data products and services
    – Stage 2: ‘Seamless’ services
    – Stage 3: ??
  Standardized publication of services
  Monitoring of VO activities

Technical Issues
  What kinds of links should be made?
    – What do we link to?
        Home page, service forms, availability summaries, queries, data products
    – How do we mediate links?
        Metadata or tables

Technical Issues
  How are links/services published?
    – Machine versus human readable
    – Dynamic database or static pages
    – Syntax/standard to be used
    – How frequently are they updated?
    – How is metadata about a link included?

Technical Issues
  What is the vocabulary of linking?
    – Linking by characteristic
        Links by position, time, object name
        Other fields (exposure, observer, ...)
    – Table links, e.g., bibcode joins
    – Metadata vocabulary
        Image/spectra resolution, …
    – Where do we get metadata information from?

Technical Issues
  Formats for metadata and queries
    – Usefulness of existing formats
    – Usefulness of emerging standards
  Relationship to non-ADEC institutions and efforts (CDS, VO)

Phase 1 Plan
  Develop vocabulary for description of data center services (including access to data products).
  Establish desired ancillary metadata for services.
  Agree on protocol for publication of description of services.
  Publish descriptions of services in agreed protocol.
  Include links to described services/data in user interfaces.

Develop Linkage Vocabulary
  How do we link?
  Develop use cases
  Describe current services
  Name and provenance of the resource
  How is the resource parametrized?
    – Position, time, name, bibcode, …
  How is the resource accessed?
  What is the format of the result?
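
The "how do we link?" questions above could be captured in a small record per service. The following is a minimal sketch only, not a format agreed by the working group: the `ServiceDescription` class, its field names, and the example.org URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlencode


@dataclass
class ServiceDescription:
    """Hypothetical record; field names are illustrative assumptions."""
    name: str               # name of the resource
    provenance: str         # originating site, kept for attribution
    parameters: List[str]   # how the resource is parametrized
    access_url: str         # how the resource is accessed (base URL)
    result_format: str      # format of the result

    def link(self, **values: str) -> str:
        """Build a query link, rejecting parameters the service does not declare."""
        unknown = set(values) - set(self.parameters)
        if unknown:
            raise ValueError(f"unsupported parameters: {sorted(unknown)}")
        return f"{self.access_url}?{urlencode(values)}"


# Placeholder service; not a real ADEC endpoint.
example = ServiceDescription(
    name="Example X-ray source catalog",
    provenance="Example Data Center",
    parameters=["position", "name"],
    access_url="https://archive.example.org/catalog",
    result_format="text/html",
)
print(example.link(name="TY Pyx"))
```

Declaring the accepted parameters in the record lets a linking site check a link before presenting it, which addresses the "fragile links" concern only in part, since it does not verify that the remote service is still available.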

Establish Ancillary Metadata
  Why do we link?
  Type of information returned
    – Product summary, catalog, image, spectra, home page, …
  Wavelength regime
  Optional metadata
    – Resolution, epoch range, volume estimators, ‘desirability’, ...
  Dependencies/linkages to other services
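
Ancillary metadata of this kind would let a site decide which remote services are worth linking to for a given request. A minimal sketch, assuming hypothetical service entries and key names (`returns`, `regime`) that are not part of any agreed vocabulary:

```python
from typing import Dict, List

# Hypothetical service entries; names and values are illustrative only.
SERVICES: List[Dict[str, str]] = [
    {"name": "survey-images", "returns": "image", "regime": "x-ray"},
    {"name": "source-catalog", "returns": "catalog", "regime": "x-ray"},
    {"name": "ir-atlas", "returns": "image", "regime": "infrared"},
]


def select_services(services: List[Dict[str, str]], **wanted: str) -> List[str]:
    """Return names of services whose metadata matches every requested key."""
    return [
        s["name"]
        for s in services
        if all(s.get(key) == value for key, value in wanted.items())
    ]


# Which services return images in the X-ray regime?
print(select_services(SERVICES, returns="image", regime="x-ray"))
```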

Publication Protocol
  Answer how and why questions
  May use static product descriptor web pages at each site
    – Fixed location for each site
    – Format TBD (XML?)
  Role of GLU, if any
  Transition to UDDI/WSDL later?
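
If XML were chosen for the static descriptor pages, a site could generate them with standard tools. This is a sketch under that assumption only: the element and attribute names (`service`, `parameter`, `returns`, `site`) are invented for illustration and do not represent an agreed schema.

```python
import xml.etree.ElementTree as ET
from typing import List


def describe_service(name: str, site: str, parameters: List[str], returns: str) -> str:
    """Serialize one service description to an XML string (hypothetical schema)."""
    root = ET.Element("service", {"name": name, "site": site})
    for param in parameters:
        ET.SubElement(root, "parameter").text = param
    ET.SubElement(root, "returns").text = returns
    return ET.tostring(root, encoding="unicode")


xml_text = describe_service(
    name="catalog-search",
    site="Example Data Center",
    parameters=["position", "name"],
    returns="catalog",
)
print(xml_text)

# A remote site can read the descriptor back without scraping a human-readable page.
parsed = ET.fromstring(xml_text)
print([p.text for p in parsed.findall("parameter")])
```

A static file at a fixed location per site keeps the publishing cost low; a later move to UDDI/WSDL would replace the file with a registry entry while the described fields could stay the same.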

Publish Service Descriptions
  Multiple levels of description
    – Site
    – Data availability summaries
    – Service data entry
    – Service results
    – Data products
  Most popular services first

Integrate into User Interfaces
  Site-based requirements
  Not all services may be appropriate even when they are in principle queryable.
  Review by ADEC

Schedule Issues
  Dependent on resources allocated
  Effort uncertain but anticipated at a few person-months per institution
  Risk of impact on other activities
  Coordination of diffuse resources is difficult

Schedule for Phase 1
  Six months to one year
  Development of metadata
    – Initial vocabulary and ancillary metadata design
    – Agreement on publication format
  Publication of services
    – Publication of popular services
    – Updates to vocabulary and ancillary metadata
  Integration within existing interfaces
    – Integration within user interfaces
    – Additional service descriptions

Phase 2 Ideas
  More ‘seamless’ integration of systems.
  Use of remote services as ‘database servers’.
  Adoption of common machine-readable format for results.
    – Coordination with VO activities
  Use of Web as distributed file system for data products

ClassX: Classifying the High Energy Sky
  [Figure: A SkyView ROSAT field around TY Pyx showing the numerous serendipitous, and largely unclassified, WGACAT sources in that field.]
  Two-year NASA AISR program to develop science prototype for Virtual Observatory.
  Collaboration with STScI and CDS.
  Classify high energy sources:
        Discover and collect data on sources from existing services.
        Collate information into coherent dossiers.
        Use information to classify sources.
  Explore discovery, integration and metadata issues for VO.
  Science goals: Samples of AGNs, early evolution of stars, analysis of XMM fields, ...