VO Standards and Protocols: XML, VOTable, UCD, ConeSearch. Roy Williams, California Institute of Technology, NVO co-director.

Presentation transcript:

XML: Structured Information
Example memo: from Antonio Stradivarius to Domenico Scarlatti, with the body "Io bisogno una appartamento acoglienti a Cremona …" (roughly: "I need a cozy apartment in Cremona …") and the date presented as 4/13/23, April 13, iv.1723
Separation of structure from presentation
The computer can read the document and answer queries like this: find all memos from April 1723

XML
Documents and data
– Human readable, editable, mailable
– Schema constrains structure; can encode data models
– Can be transformed (XSLT) into other XML, HTML, PDF, Excel, etc.
Tools
– Parsers in Java, C, C++, Perl, Python, ...
– Browsers and editors
– XML databases
– Binding to make API
For serialization, mediation, brokers

XML for science
XML is a comfortable vehicle for our metadata and data models
But the real challenge is:
– To define NVO-specific data objects
– And how they are used
We need consensus more than either software or hardware
– VOTable
– VOResource
– services (WSDL)

XML example (no schema)
Book catalogue with two entries:
– The Cambridge Star Atlas; Wil Tirion; Cambridge UP
– Parallel Computing Works!; Geoffrey C. Fox, Roy D. Williams, Paul C. Messina; Morgan Kaufmann

XML Parsing
SAX: event-based
Handler functions for StartElement, Text, EndElement, etc.
Found element BookCatalogue
Found element Book
Found element Title
Found text The Cambridge Star Atlas
Found end element Title
…
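The event stream on this slide can be reproduced with Python's standard `xml.sax` module. This is a minimal sketch, not the parser used in the talk: the catalogue markup is reconstructed from the element names mentioned elsewhere in the slides, and the assignment of "Cambridge UP" to a Publisher element is an assumption.

```python
import xml.sax

# Assumed markup: element names come from the slides, the exact nesting is a guess.
CATALOGUE = """<BookCatalogue>
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
    <Publisher>Cambridge UP</Publisher>
  </Book>
</BookCatalogue>"""


class EventLogger(xml.sax.ContentHandler):
    """Record one message per SAX event instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("Found element " + name)

    def characters(self, content):
        text = content.strip()
        if text:  # ignore the whitespace between elements
            self.events.append("Found text " + text)

    def endElement(self, name):
        self.events.append("Found end element " + name)


handler = EventLogger()
xml.sax.parseString(CATALOGUE.encode("utf-8"), handler)
for message in handler.events[:5]:
    print(message)
```

Because the handler sees one event at a time, the whole document never has to be held in memory, which is the usual reason to prefer SAX for large files.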

Parsing
DOM: Document Object Model
Returns a tree-like Document object with data attached
Tree diagram: BookCatalogue at the root; Book nodes below it, with children such as Title (The Cambridge Star Atlas; Parallel Computing Works!), Author (Wil Tirion) and ISBN
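For contrast with event-based parsing, a DOM parser returns the whole tree at once. The following sketch uses Python's standard `xml.dom.minidom`; the catalogue markup is an assumed reconstruction using the element names from the slides.

```python
from xml.dom import minidom

# Assumed markup; only the element names and values appear in the slides.
CATALOGUE = """<BookCatalogue>
  <Book><Title>The Cambridge Star Atlas</Title><Author>Wil Tirion</Author></Book>
  <Book><Title>Parallel Computing Works!</Title><Author>Geoffrey C. Fox</Author></Book>
</BookCatalogue>"""

doc = minidom.parseString(CATALOGUE)
root = doc.documentElement  # the BookCatalogue node

# Walk the tree: each Book node carries its Title as a child element,
# and the title string is the data of that element's text node.
titles = []
for book in root.getElementsByTagName("Book"):
    title = book.getElementsByTagName("Title")[0]
    titles.append(title.firstChild.data)

print(root.tagName, titles)
```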

XML Schema
Book.xsd = XML Schema Definition
<schema xmlns=" xmlns:cat="uri://BookCatalogue">

XSchema
Book.xsd = XML Schema Definition
<schema xmlns=" xmlns:cat="uri://BookCatalogue">
All XML schemas have schema as the root element

XSchema
<schema xmlns=" xmlns:cat="uri://BookCatalogue">
Catalog is a sequence of books
Default namespace declaration: all these come from this standard namespace

XSchema
Book.xsd = XML Schema Definition
<schema xmlns=" xmlns:cat="uri://BookCatalogue">
This namespace is defined here and abbreviated as "cat"
This element comes from the namespace called cat
Book element defined here

Namespace Content
uri://BookCatalogue can be abbreviated as "cat"
Here, the cat namespace contains: BookCatalogue, Book, Title, Author, ISBN, Date, Publisher

XML example (with schema)
<BookCatalogue xmlns= "uri://BookCatalogue" xmlns:xsi=" xsi:schemaLocation= "uri://BookCatalogue >
Book entries:
– The Cambridge Star Atlas; Wil Tirion; Cambridge UP
– Parallel Computing Works!; Geoffrey C. Fox, Roy D. Williams, Paul C. Messina; Morgan Kaufmann
Here is the namespace that we are using in this document
Here is the URL of its schema
Document is an instance of a W3C schema
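The practical effect of the default namespace declaration shows up as soon as a program looks for elements by name. The sketch below uses Python's standard `xml.etree.ElementTree`; the document body is an assumed reconstruction, and only the namespace `uri://BookCatalogue` and its abbreviation "cat" come from the slides.

```python
import xml.etree.ElementTree as ET

# Assumed instance document declaring the default namespace from the slides.
DOC = """<BookCatalogue xmlns="uri://BookCatalogue">
  <Book><Title>The Cambridge Star Atlas</Title></Book>
</BookCatalogue>"""

root = ET.fromstring(DOC)

# The prefix is local to this program; what matters is the namespace URI.
ns = {"cat": "uri://BookCatalogue"}
titles = [t.text for t in root.findall("cat:Book/cat:Title", ns)]

# A lookup without the namespace finds nothing: "Book" and
# "{uri://BookCatalogue}Book" are different names.
unqualified = root.findall("Book")

print(root.tag, titles, unqualified)
```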

VOTable
Full metadata representation
Hierarchy of RESOURCEs containing PARAMs and TABLEs
UCD (unified content descriptor)
– a has unit meter
– a has UCD ORBIT_SIZE_SMAJ (semi-major axis of the orbit)
Can reference remote and/or binary streams
Table can be
– Pure XML
– "Simple Binary"
– FITS Binary Table

Sample VOTable
This parameter is designed to store the observer's name
Some bright stars
<FIELD name="RA" ucd="POS_EQ_RA" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
<FIELD name="Dec" ucd="POS_EQ_DEC" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
Rows: Procyon, Vega
<STREAM href="ftp://server.com/mydata.fits" expires=" " actuate="onRequest"/>
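A reader can use the FIELD metadata to decide how to interpret each cell of the table data. The following is a rough sketch with Python's standard `xml.etree.ElementTree`. The surrounding markup is an assumption (a pared-down, namespace-free VOTable with a hypothetical "Star" name column), and the coordinates are illustrative values for Procyon and Vega rather than data from the slide.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified VOTable: only the RA/Dec FIELD attributes come from the slide.
VOTABLE = """<VOTABLE>
 <RESOURCE>
  <TABLE name="Stars">
   <DESCRIPTION>Some bright stars</DESCRIPTION>
   <FIELD name="Star" datatype="char" arraysize="*"/>
   <FIELD name="RA" ucd="POS_EQ_RA" unit="deg" datatype="float"/>
   <FIELD name="Dec" ucd="POS_EQ_DEC" unit="deg" datatype="float"/>
   <DATA><TABLEDATA>
    <TR><TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD></TR>
    <TR><TD>Vega</TD><TD>279.234</TD><TD>38.782</TD></TR>
   </TABLEDATA></DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>"""

table = ET.fromstring(VOTABLE).find("RESOURCE/TABLE")

# Metadata first: one FIELD per column gives the name and datatype.
fields = table.findall("FIELD")
names = [f.get("name") for f in fields]
converters = [float if f.get("datatype") in ("float", "double") else str
              for f in fields]

# Then data: every TR has one TD per FIELD, in the same order.
rows = []
for tr in table.iter("TR"):
    cells = [conv(td.text) for conv, td in zip(converters, tr.findall("TD"))]
    rows.append(dict(zip(names, cells)))

print(rows)
```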

Table Cell
A cell can hold a scalar, an array, a variable-length array, etc.
Primitives: boolean, bit, unsignedByte, short, int, long, char, unicodeChar, float, double, floatComplex, doubleComplex
The set of primitives follows the FITS binary table; it does NOT follow XML Schema

VOTable is Flexy
eg Table of images: UCD="meta.code.mime; image.jpeg" datatype="unsignedByte" arraysize="*"
eg Table of URL links: UCD="meta.ref.url" datatype="char" arraysize="*"

VOTable Schema (xsd)

Table Data Model
Metadata: class definition for Row
– FIELD
– data type
– semantic type
Data
– Each Row is a list of Cells
– Each Cell is an array of Primitives
– may be variable length

Table Data Layout
All metadata first
– small, complex, XML
– Class definition for table record
– plus params, description, etc.
Then data
– (may be) large, remote
– XML | binary | FITS
– Instantiations of table record
All records MUST have same format
Binary data allows streaming, parallelism

Param Data Model
Param is a Table with one cell
Like a FIELD, but with a value attribute

Primitives
All have fixed binary length
Same as FITS primitives, except Unicode

datatype          Meaning             FITS   Bytes
"boolean"         Logical             "L"    1
"bit"             Bit                 "X"    *
"unsignedByte"    Byte (0 to 255)     "B"    1
"short"           Short Integer       "I"    2
"int"             Integer             "J"    4
"long"            Long integer        "K"    8
"char"            ASCII Character     "A"    1
"unicodeChar"     Unicode Character          2
"float"           Floating point      "E"    4
"double"          Double              "D"    8
"floatComplex"    Float Complex       "C"    8
"doubleComplex"   Double Complex      "M"    16
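Fixed-length primitives mean that a record's size can be computed from the column datatypes alone, which is what makes a binary stream easy to cut into rows. The sketch below uses Python's standard `struct` module. The mapping from VOTable datatypes to `struct` codes and the big-endian byte order are assumptions made for illustration, and "bit" is left out because the table gives it no fixed byte count.

```python
import struct

# Byte lengths as listed in the table of primitives.
SIZES = {"boolean": 1, "unsignedByte": 1, "short": 2, "int": 4, "long": 8,
         "char": 1, "unicodeChar": 2, "float": 4, "double": 8,
         "floatComplex": 8, "doubleComplex": 16}

# Assumed struct codes for a few numeric types (illustrative subset).
CODES = {"unsignedByte": "B", "short": "h", "int": "i", "long": "q",
         "float": "f", "double": "d"}


def record_format(datatypes):
    """Build a big-endian struct format string for one table record."""
    return ">" + "".join(CODES[d] for d in datatypes)


columns = ["int", "double", "double"]      # hypothetical ID, RA, Dec columns
fmt = record_format(columns)
record_size = struct.calcsize(fmt)

# Two records packed back to back, then read again as a stream of rows.
stream = struct.pack(fmt, 1, 300.0, 25.0) + struct.pack(fmt, 2, 300.1, 25.1)
rows = list(struct.iter_unpack(fmt, stream))

print(record_size, rows)
```

Since every record has the same length, a reader can seek directly to row n at offset `n * record_size`, or hand disjoint byte ranges to parallel workers.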

Multidimensional Array Cell
A table cell can have lots of Primitives
Example: WCS parameters are arrays
Example: up to 10 images, each 64x64
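The markup for these examples did not survive in the transcript. As an assumption based on the VOTable `arraysize` attribute used elsewhere in these slides, a cell holding up to 10 images of 64x64 would be declared with a string of the form "64x64x10*", where "x" separates dimensions and a trailing "*" marks the last one as variable. A small parser for that notation might look like this:

```python
def parse_arraysize(text):
    """Split an arraysize string into (dims, variable).

    dims is the list of declared dimension lengths; variable is True when
    the string ends in "*", meaning the last dimension may vary (a bare "*"
    gives no declared lengths at all).
    """
    variable = text.endswith("*")
    body = text[:-1] if variable else text
    dims = [int(part) for part in body.split("x") if part]
    return dims, variable


print(parse_arraysize("64x64x10*"))  # up to 10 images, each 64x64
print(parse_arraysize("*"))          # variable-length 1-D array
print(parse_arraysize("2x2"))        # fixed 2x2 array
```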

Hierarchy
A VOTable contains RESOURCEs
– RESOURCE can contain: TABLE, RESOURCE, etc.
Usage example
Many observations in the file,
– each is a RESOURCE
Each observation is
– Parameters
– Calibration table
– Raw data table
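Because a RESOURCE can contain further RESOURCEs, code that reads a VOTable usually walks the hierarchy recursively. A minimal sketch with Python's standard `xml.etree.ElementTree` follows; the document, including all the `name` values, is hypothetical and only mirrors the observation layout described on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical file: one observation with parameters, two tables and a nested resource.
DOC = """<VOTABLE>
 <RESOURCE name="observation-1">
  <PARAM name="observer" datatype="char" arraysize="*" value="unknown"/>
  <TABLE name="calibration"/>
  <TABLE name="raw"/>
  <RESOURCE name="nested"><TABLE name="extra"/></RESOURCE>
 </RESOURCE>
</VOTABLE>"""


def outline(resource, depth=0):
    """Return an indented outline of a RESOURCE and everything below it."""
    lines = ["  " * depth + "RESOURCE " + resource.get("name", "")]
    for child in resource:
        if child.tag == "RESOURCE":
            lines.extend(outline(child, depth + 1))  # recurse into nested resources
        elif child.tag in ("TABLE", "PARAM"):
            lines.append("  " * (depth + 1) + child.tag + " " + child.get("name", ""))
    return lines


lines = []
for res in ET.fromstring(DOC).findall("RESOURCE"):
    lines.extend(outline(res))
print("\n".join(lines))
```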

Hierarchy
New feature: GROUP

Astronomical Data
Image
– Standard file format: FITS
– Standardized c.1980
– Keyword-value dictionary + binary block
Catalog
– Derived from image
– Connected set of bright pixels
– Table of stars
– Standard format: VOTable
– Standardized 2002
– XML with remote binary
Spectrum

XSLT Example
Output from the Messier catalog at VirtualSky.org
Output from Messier Catalog Server
Column headings include Messier Number and Right Ascension; the sample record is a globular cluster in Canes Venatici:
M3 is one of the more heavily studied globular clusters due to its position in the galaxy, putting it far from interstellar absorption. More than 200 variable stars have been observed out of a total of nearly 50,000. Being one of the brightest clusters, M3 is …

XSLT Result
This table is the result of a cone search

XSLT Program Data

Binding to make a Parser
From the Schema, an API and library are generated
Binding tools: JAXB, Breeze, Castor
This is JAVOT (Caltech):

for (int i = 0; i < table.getFieldCount(); i++) {
    Field field = (Field) table.getFieldAt(i);
    String u = field.getUcd();
    // Report the column whose UCD marks it as the main right ascension
    if (u != null && u.equals("POS_EQ_RA_MAIN"))
        System.out.println("Field " + i + " is for RA");
}

Unified Content Descriptor
UCD is a semantic type
phot.mag;em.opt.B           Integrated total blue magnitude
src.orbital.eccentricity    Orbital eccentricity
stat.median                 Statistics Median Value
Base + Specifiers
eg error in default right ascension
– stat.error; pos.eq.ra; meta.main
First word is "type": "what kind of thing is this?"
How do we add a stat.error to another?

Unified Content Descriptor
UCD has services
Natural Language Description
Find best UCD
– Search in NLD
Matching functions
– if I want pos.eq.ra, is stat.error;pos.eq.ra correct?
What about Ontology???
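The matching question on this slide can be made concrete with a small function. This is only one possible rule, sketched under the assumption stated in the slides that the first word of a UCD is its "type" and the remaining words are specifiers; it is not a definition of how UCD matching services actually behave.

```python
def parse_ucd(ucd):
    """Split a UCD into (primary word, list of specifier words)."""
    words = [w.strip() for w in ucd.split(";") if w.strip()]
    return words[0], words[1:]


def matches(wanted, ucd):
    """True if the primary word of ucd is wanted or a refinement of it.

    Specifier words are ignored, so an error on a right ascension does
    not count as a right ascension.
    """
    primary, _ = parse_ucd(ucd)
    return primary == wanted or primary.startswith(wanted + ".")


print(matches("pos.eq.ra", "pos.eq.ra;meta.main"))             # a right ascension
print(matches("pos.eq.ra", "stat.error;pos.eq.ra;meta.main"))  # an error, not a position
print(matches("stat", "stat.error.sys"))                       # refinement of "stat"
```

Under this rule the answer to the slide's question is "no": `stat.error;pos.eq.ra` is a statistical error first, and only secondarily about right ascension.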

Some UCD
S  stat                     Statistical parameters
Q  stat.Fourier             Fourier coefficient
Q  stat.Fourier.amplitude   Amplitude Fourier coefficient
P  stat.covariance          Covariance between two parameters
P  stat.error               Statistical error
P  stat.error.sys           Systematic error
Q  stat.fit                 Fit
Q  stat.fit.chi2            Chi2
Q  stat.fit.dof             Degrees of freedom
Q  stat.fit.goodness        Goodness or significance of fit
Q  stat.fit.omc             Observed minus computed
Q  stat.fit.param           Parameter of fit
Q  stat.fit.residual        Residual fit
Q  stat.likelihood          Likelihood
S  stat.max                 Maximum or upper limit
S  stat.mean                Mean, average value
S  stat.median              Median value
S  stat.min                 Minimum or lowest limit

Some UCD
S  phot               Photometry
Q  phot.calib         Photometric calibration
Q  phot.color         Color index or magnitude difference
Q  phot.color.Cous    Color index in Cousins system
Q  phot.color.Gen     Color index in Geneva system
Q  phot.color.Gunn    Color index in Gunn system
Q  phot.color.JHN     Color index in Johnson 65+ system
S  meta               Metadata
P  meta.bib           Bibliographic reference
P  meta.bib.author    Author name
P  meta.bib.bibcode   Bibcode
P  meta.bib.ivo       IVOA identifier ivo://
P  meta.bib.fig       Figure in a paper
P  meta.bib.journal   Journal name
P  meta.bib.page      Page number
P  meta.bib.volume    Volume number
P  meta.code          Code or flag
P  meta.code.class    Classification code

Cone Search
First VO standard service
Input: RA, DEC, SR must be present
– decimal degrees J2000
Output: VOTable of sky-located data records
– must have columns with UCDs: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID_MAIN
Request: RA=300 DEC=25 SR=0.1
Response: table with columns ID, RA, DEC, x, y, z
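A client for this protocol needs to do two things: append the three required parameters to the service's base URL, and check that the returned table carries the three required UCDs. The sketch below covers both with Python's standard library. The base URL `http://example.org/cone` is a placeholder, and passing the parameters as an HTTP GET query string is an assumption about how the request is sent.

```python
from urllib.parse import urlencode

# UCDs that the response table must provide, as listed on the slide.
REQUIRED_UCDS = {"POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN", "ID_MAIN"}


def cone_search_url(base_url, ra, dec, sr):
    """Append RA, DEC and SR (decimal degrees, J2000) to a service base URL."""
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode({"RA": ra, "DEC": dec, "SR": sr})


def missing_ucds(field_ucds):
    """Return the required UCDs that a response's FIELDs do not provide."""
    return REQUIRED_UCDS - set(field_ucds)


# Request from the slide: a cone of radius 0.1 degrees around RA=300, DEC=25.
url = cone_search_url("http://example.org/cone", 300, 25, 0.1)
print(url)

# A response whose columns lack a declination would be rejected.
print(missing_ucds(["ID_MAIN", "POS_EQ_RA_MAIN"]))
```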

Cone Searches in a VO Registry

Result of Cone Search
Columns: RA, Dec, ID

Cone Search + Density Probe
Diagram labels: Cone Search; Density Probe; baseURL, Spacing, Search radius
Interoperating NVO-compliant services!
Federation of Multiple Services