The state of VOEvent semantics
  Matthew J. Graham (Caltech, NVO)
  THE US NATIONAL VIRTUAL OBSERVATORY
  IVOA Strasbourg: Semantics I
  26 May 2009
Why is semantics important?
  Transient astronomy suffers from:
    heterogeneous data
    distributed data
    x-scale data
  Humans are:
    slow
    expensive
    non-algorithmic
    untrustworthy
  Semantics makes data machine-processable, so computers can do the grunt work
Semantic technologies
  RDF (Resource Description Framework):
    a family of W3C specifications for conceptual modelling
    a fact is a subject-predicate-object triple
    various serializations: RDF/XML, RDFa, N3
  Concept schemes:
    controlled vocabularies
    taxonomies
    ontologies
  Knowledge bases:
    triple/quad store
    relational db with a suitable front end (e.g. d2r)
    SPARQL
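The triple model above can be illustrated with a minimal sketch in plain Python. This is not an RDF library: the `ex:` prefix, the event identifiers, and the `match` helper are all hypothetical names chosen for illustration, and a real knowledge base would use a triple store queried with SPARQL.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts; the "ex:" names are invented for this sketch.
TRIPLES = [
    ("ex:event42", "rdf:type", "ex:Supernova"),
    ("ex:event42", "ex:observedBy", "ex:telescopeA"),
    ("ex:event43", "rdf:type", "ex:CataclysmicVariable"),
]


def match(store: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in store:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# Roughly the question "which subjects are typed as ex:Supernova?"
supernovae = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="ex:Supernova")]
```

The wildcard pattern loosely mirrors a single SPARQL triple pattern; joins across several patterns are left out of the sketch.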
De Vorerum Natura
  The VOEvent specification states:
    “The <What> and <Why> elements work together to characterize the nature of a VOEvent”
  Specific semantic components:
    <What>:
      <Param name="" value="" ucd="" unit=""/>
      <Group name="">
    <Why>:
      <Concept>…</Concept>
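As a rough sketch of how a machine might pull these semantic components out of an event packet, the following uses Python's standard `xml.etree.ElementTree`. The fragment is hypothetical and simplified: it omits namespaces and the other VOEvent elements, and the parameter names, values, and UCDs are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment (no namespaces, invented values).
FRAGMENT = """
<VOEvent>
  <What>
    <Group name="photometry">
      <Param name="magnitude" value="18.3" ucd="phot.mag" unit="mag"/>
    </Group>
    <Param name="exposure" value="60" ucd="time.duration" unit="s"/>
  </What>
  <Why>
    <Concept>supernova</Concept>
  </Why>
</VOEvent>
"""


def semantic_components(xml_text):
    """Return ([(group_name_or_None, param_attributes)], [concept_text])."""
    root = ET.fromstring(xml_text)
    what = root.find("What")
    params = []
    # Params that sit directly under <What> have no group.
    for param in what.findall("Param"):
        params.append((None, dict(param.attrib)))
    # Params nested inside a <Group> carry the group name with them.
    for group in what.findall("Group"):
        for param in group.findall("Param"):
            params.append((group.get("name"), dict(param.attrib)))
    concepts = [c.text.strip() for c in root.findall("Why/Concept")]
    return params, concepts


params, concepts = semantic_components(FRAGMENT)
```

Each entry in `params` pairs the enclosing group name (or `None`) with the `name`, `value`, `ucd`, and `unit` attributes, which is the information a data dictionary would describe.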
Control your language
  What:
    Register data dictionaries used by event authors:
      SkyAlert
      VO registry (part of VOEventStream registry extension)
    Group name + Param name must be unique
  Why:
    Transients classification scheme from dotastro.org
    Astronomical object type ontology from CDS/EuroVOTech
    Transients ontology from Caltech
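The uniqueness rule for Group name + Param name could be checked along the following lines. This is a sketch under assumptions: the dictionary entries are hypothetical, ungrouped parameters are represented with an empty group name, and `duplicate_keys` is an illustrative helper, not part of any registry interface.

```python
from collections import Counter

# Hypothetical data dictionary entries as (group name, param name) pairs.
# An empty string stands for a Param that is not inside any Group.
ENTRIES = [
    ("photometry", "magnitude"),
    ("photometry", "filter"),
    ("astrometry", "magnitude"),   # same param name, different group: allowed
    ("", "exposure"),
    ("photometry", "magnitude"),   # repeated combination: not allowed
]


def duplicate_keys(entries):
    """Return the (group, param) combinations that occur more than once."""
    counts = Counter(entries)
    return sorted(key for key, n in counts.items() if n > 1)


duplicates = duplicate_keys(ENTRIES)
```

The check treats the pair as the key, so the same parameter name may appear in different groups without conflict.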
dotastro.org
  128 classes
  Based on GCVS
ObjectType ontology
  ~80 variable classes
  Multiple inheritance
  Properties
Transients ontology
  231 classes
Looking for domain experts
  If interested in contributing scientific domain knowledge, please contact: mjg@caltech.edu
Managing followup information
Contextual technologies
  Linked Data – connects two URIs with an owl:sameAs: same individual with same identity
  UMBEL – a lightweight subject concept reference structure; an infocline
  OAI ORE – describes aggregations of Web resources
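Because owl:sameAs asserts that two URIs name the same individual, a set of such links partitions URIs into identity clusters. The sketch below groups them with a small union-find; the URIs are hypothetical, and `same_as_clusters` is an illustrative helper rather than anything defined by Linked Data tooling.

```python
def same_as_clusters(links):
    """Group URIs into identity clusters, treating each (a, b) pair as an
    owl:sameAs assertion (symmetric and transitive)."""
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root,
        # shortening the path along the way.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sameAs links between records held by different services.
LINKS = [
    ("http://example.org/a/sn1", "http://example.org/b/objX"),
    ("http://example.org/b/objX", "http://example.org/c/t9"),
    ("http://example.org/a/sn2", "http://example.org/b/objY"),
]

clusters = same_as_clusters(LINKS)
```

Follow-up information attached to any URI in a cluster can then be treated as describing the same object.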
Data portfolio
Summary
  In the era of data-intensive astronomy (Astronomy 2020), semantic technology (astroinformatics) provides the structural underpinnings for doing science
  The bulk of data activities will be performed by intelligent agent systems
  Continuing efforts to develop usable ontological structures
  www.practicalastroinformatics.org
  Looking for domain experts: mjg@caltech.edu