Data Format Description Language (DFDL) WG Martin Westhead EPCC, University of Edinburgh Alan Chappell PNNL


Agenda
Introduction and welcome - Martin Westhead, 10 mins
Binary Format Description Language (BFD) - Alan Chappell, 10 mins
Binary XML (BinX) - Stephen Rutherford, 10 mins
DFDL - Martin Westhead, 15 mins
  – Big picture
  – Structural Description Language
  – Charter
(20 mins discussion)
Examples repository - Alan Chappell, 10 mins
  – Bruce Barkstrom Examples at NASA
(15 mins discussion)

Motivation
There will never be a standard data format
  – E.g. XML: verbose, tree-based, explicit structure
  – Legacy formats
  – Application-specific formats
  – One size will never fit all
But could we provide a language for describing formats?
  – Transparency of physical representation
  – Automatic format conversion
  – Unambiguous description of data

There's more…
Explicit structure enables:
Standard transformation to/from XML representation
  – Could allow an application to read/write XML
  – But provide an underlying efficient binary representation
Data stream/file becomes a database
  – Point to parts of the structure
  – Extract parts of the structure
  – Modify parts of the structure
  – Integrate parts of different structures
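A minimal sketch of the "transformation to/from XML" idea, assuming a fixed-layout binary record. The `DESCRIPTION` list, the field names, and both function names are hypothetical and are not part of DFDL, BinX, or BFD; the point is only that one structural description can drive both directions.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description of one record: (label, struct format code).
DESCRIPTION = [("id", "i"), ("temperature", "f"), ("flag", "B")]


def binary_to_xml(buf, description=DESCRIPTION, byte_order="<"):
    """Unpack a binary record and expose it as an XML element tree."""
    fmt = byte_order + "".join(code for _, code in description)
    values = struct.unpack(fmt, buf)
    root = ET.Element("record")
    for (label, code), value in zip(description, values):
        child = ET.SubElement(root, label, type=code)
        child.text = repr(value)
    return root


def xml_to_binary(root, description=DESCRIPTION, byte_order="<"):
    """Pack the XML view back into the compact binary representation."""
    values = []
    for label, code in description:
        text = root.find(label).text
        # Float codes are parsed as floats, everything else as integers.
        values.append(float(text) if code in "fd" else int(text))
    fmt = byte_order + "".join(code for _, code in description)
    return struct.pack(fmt, *values)
```

An application could work on the XML view while the stored form stays a 9-byte record, for example `xml_to_binary(binary_to_xml(buf)) == buf` for a record packed with `struct.pack("<ifB", 7, 21.5, 1)`.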

And more…
Generic tools possible
  – Browsing
  – Conversion and transformation
Annotation of data
  – E.g. identify bits that depict a hurricane in an image
Enables general semantic labels; many ontologies could be developed, e.g.:
  – S.I. units, SQL types, Time
  – Community-specific labels, starClass = whiteDwarf
  – Application-specific labels, nodeColour = green
Could lead to a standard transformation language

Not fairy tales
Based on implemented work
  – BinX
  – BFD, part of the Scientific Annotation Middleware project
Generalized and extended a little
Formal semantics
Foundation for extensibility

Approach
Separate out structure and semantics
General structural language
  – Repetition
  – Pointers
  – References to data
  – New structures can be built (compositionality)
Semantics
  – Hard to express, so… we don't
  – General labeling
  – Label semantics defined elsewhere (ontologies)
  – Labels can be added (extensibility)

Structure – arbitrary labels
[Diagram: hierarchy of labelled structures: fooSet, fooPair, foo, bunchThings, thing]

Structure – example labels
[Diagram: hierarchy of labelled structures: complex, Array, float, byte, bit]

Structural language
Formal semantics
  – Structured binary sequence
  – Defines hierarchical structure over an underlying sequence of binary values
Language for describing hierarchical structure
  – Repetition: explicit number of repeats, termination characters
  – Data reference: conditionals, data size
  – Pointers: scope
  – As general as possible, but must be concise and implementable
Draft language definition on web page
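The two kinds of repetition named above can be sketched roughly as follows. This is an illustration under stated assumptions, not the draft language itself: the function names, the little-endian layout, and the choice of a 4-byte count are all hypothetical.

```python
import struct


def read_counted(buf, pos, item_fmt="<f", count_fmt="<I"):
    """Explicit number of repeats: the count is itself read from the data
    (a data reference), then that many items follow."""
    (count,) = struct.unpack_from(count_fmt, buf, pos)
    pos += struct.calcsize(count_fmt)
    size = struct.calcsize(item_fmt)
    items = []
    for _ in range(count):
        (value,) = struct.unpack_from(item_fmt, buf, pos)
        items.append(value)
        pos += size
    return items, pos


def read_terminated(buf, pos, terminator=0):
    """Termination character: consume bytes up to, and past, the terminator.
    Raises ValueError if the terminator never appears."""
    end = buf.index(bytes([terminator]), pos)
    return buf[pos:end], end + 1
```

Each reader returns the decoded value together with the next offset, so readers can be chained over one underlying byte sequence, which is one simple way to build larger structures from smaller ones.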

CSV file example
char:=byte
data:=[(char - [',']).*]
field:=[data; [',']]
finalField:=[data; [\n]]
row:=[field.*] :: [finalField]
table:=[row.*]
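As a rough reading of the grammar above, here is a hand-written parser with one function per rule. It is a sketch, not generated from the description, and it makes one assumption the slide does not state: `data` stops at a newline as well as at a comma, since otherwise a `finalField` could swallow the rest of the table.

```python
COMMA = 0x2C    # ','
NEWLINE = 0x0A  # '\n'


def parse_data(buf, pos):
    # data: zero or more bytes that are not ',' (and, by assumption, not '\n').
    start = pos
    while pos < len(buf) and buf[pos] not in (COMMA, NEWLINE):
        pos += 1
    return buf[start:pos], pos


def parse_row(buf, pos):
    # row: zero or more fields (data then ','), then a finalField (data then '\n').
    values = []
    while True:
        data, pos = parse_data(buf, pos)
        values.append(data.decode("ascii"))
        if pos < len(buf) and buf[pos] == COMMA:
            pos += 1
            continue
        if pos < len(buf) and buf[pos] == NEWLINE:
            return values, pos + 1
        raise ValueError(f"row not terminated by newline at offset {pos}")


def parse_table(buf):
    # table: zero or more rows.
    rows = []
    pos = 0
    while pos < len(buf):
        row, pos = parse_row(buf, pos)
        rows.append(row)
    return rows
```

For example, `parse_table(b"a,b,c\n1,2,3\n")` yields two rows of three values each, with the commas and newlines (the format) left behind.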

Semantic labels
Many ontologies possible
Initial scope probably:
  – Basic types (floating point, integer, character)
  – Simple structures (structs, arrays, tables)
Obvious extensions:
  – SQL types
  – XML Schema types
Key WG goal:
  – Define form and requirements of new ontologies

What is an Ontology?
XML Schema for new types
Structural description of new types
Definition of core API behaviour on new type
API extensions
Relationships to other types

WG goals
Formal language for DFDL data structure
Standard representation of this language in XML
Requirements for DFDL ontology
Basic types ontology
Basic structures ontology

Currently under discussion
Abstraction from the underlying binary
  – Compression, encoding, encryption
  – Physical vs. conceptual binary sequence
Abstraction of description
  – complex:=[foo; foo]
  – Instantiate foo:= float or foo:= double at use time
Filtering of results
  – Get to the data model and leave the format behind
  – CSV -> [[value; value; value]; [value; value; value]]
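The "abstraction of description" point can be illustrated with a small sketch in which `complex:=[foo; foo]` is written once and `foo` is bound only at use time. The factory name `make_complex_reader` and the use of `struct` format codes to stand for `float` and `double` are assumptions made for this example.

```python
import struct


def make_complex_reader(foo_fmt, byte_order="<"):
    """Build a reader for complex := [foo; foo], with foo chosen by the caller.

    foo_fmt is a struct format code: "f" for float, "d" for double.
    """
    fmt = byte_order + foo_fmt * 2
    size = struct.calcsize(fmt)

    def read(buf, pos=0):
        real, imag = struct.unpack_from(fmt, buf, pos)
        return complex(real, imag), pos + size

    return read


# Two instantiations of the same abstract description.
read_complex_float = make_complex_reader("f")
read_complex_double = make_complex_reader("d")
```

The structural description stays the same in both cases; only the binding of `foo`, and therefore the number of bytes consumed (8 or 16), changes.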

DFDL in the VO
Generic tools
Metadata possibilities
  – Ontologies can define relationships between types
  – E.g. polar to Cartesian
  – Standard classes over data objects
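One way to picture a "relationship between types" such as polar to Cartesian is as a registered pair of conversion functions. The `CONVERSIONS` table and the type labels below are hypothetical; the slides do not say how an ontology would record such a relationship.

```python
import math


def polar_to_cartesian(r, theta):
    """Convert (r, theta in radians) to (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


def cartesian_to_polar(x, y):
    """Convert (x, y) to (r, theta in radians)."""
    return math.hypot(x, y), math.atan2(y, x)


# Hypothetical registry keyed by (source label, target label).
CONVERSIONS = {
    ("polar", "cartesian"): polar_to_cartesian,
    ("cartesian", "polar"): cartesian_to_polar,
}


def convert(value, source, target):
    """Look up the relationship between two labelled types and apply it."""
    return CONVERSIONS[(source, target)](*value)
```

A generic tool could then call `convert((2.0, math.pi / 2), "polar", "cartesian")` without knowing anything about either representation beyond its label.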

Getting involved
Webpages:
Mailing list
My address: