Data Format Description Language (DFDL) WG Martin Westhead EPCC, University of Edinburgh


1 Data Format Description Language (DFDL) WG Martin Westhead EPCC, University of Edinburgh M.Westhead@epcc.ed.ac.uk

2 Overview
Background
Motivation
Approach
Current status

3 Motivation
There will never be a standard data format
– E.g. XML: verbose, tree-based, explicit structure
– Legacy formats
– Application-specific formats
– One size will never fit all
But could we provide a language for describing formats?
– Transparency of physical representation
– Automatic format conversion
– Unambiguous description of data

4 There’s more…
Explicit structure enables:
Standard transformation to/from XML representation
– Could allow application to read/write XML
– But provide underlying efficient binary representation
Data stream/file becomes database
– Point to parts of the structure
– Extract parts of the structure
– Modify parts of the structure
– Integrate parts of different structures
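The idea on this slide, an application that sees XML while the bytes on disk stay compact, can be sketched roughly as follows. This is a minimal illustration, not DFDL or BinX itself: the field list `FIELDS`, the element name `record`, and both helper functions are hypothetical names invented for the example.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical format description: (field name, struct format code).
# i = 32-bit int, d = 64-bit float, H = unsigned 16-bit int.
FIELDS = [("id", "i"), ("temperature", "d"), ("flags", "H")]
BYTE_ORDER = "<"  # little-endian, no padding


def binary_to_xml(data: bytes) -> ET.Element:
    """Unpack one binary record and expose it as an XML element tree."""
    fmt = BYTE_ORDER + "".join(code for _, code in FIELDS)
    values = struct.unpack(fmt, data)
    record = ET.Element("record")
    for (name, _), value in zip(FIELDS, values):
        ET.SubElement(record, name).text = repr(value)
    return record


def xml_to_binary(record: ET.Element) -> bytes:
    """Inverse transformation: rebuild the compact binary form."""
    fmt = BYTE_ORDER + "".join(code for _, code in FIELDS)
    values = []
    for name, code in FIELDS:
        text = record.find(name).text
        values.append(float(text) if code == "d" else int(text))
    return struct.pack(fmt, *values)


raw = struct.pack("<idH", 7, 21.5, 3)      # 14 bytes on disk
doc = binary_to_xml(raw)                   # application-facing XML view
doc.find("temperature").text = repr(22.0)  # point to and modify one part
updated = xml_to_binary(doc)               # back to binary
```

Because the structure is explicit, the same description drives both directions, and a single field can be addressed by name without the application knowing its byte offset.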

5 And more…
Generic tools possible
– Browsing
– Conversion and transformation
Annotation of data
– E.g. identify bits that depict hurricane in an image
Enables general semantic labels; many ontologies could be developed, e.g.:
– S.I. units, SQL types, Time
– Community-specific labels, “starClass = whiteDwarf”
– Application-specific labels, “nodeColour = green”
Could lead to a standard transformation language

6 Not fairy tales
Based on implemented work
– BinX http://www.edikt.org/binx/
– BFD, part of the Scientific Annotation Middleware project (http://www.scidac.org/SAM/)
– ESML http://esml.itsc.uah.edu/
Generalized and extended a little
Clear semantics
Foundation for extensibility

7 Layers
[Diagram. Labels: Data Model, Structure, Primitives; Fortran, C/C++, Java; Binary file, Text file, Data stream; API, Data Model, Transformations]

8 Approach
Data model
– XML infoset
– Obvious way to describe it: XSD
API
– DOM/SAX
– Extended to provide non-string value access
Transformations
– Ontology of predefined transformations (extensible)
– XML language for: composition, attaching to file contents, populating the model
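One way to picture an extensible set of predefined transformations that can be composed is a small named registry. The sketch below is an assumption-laden illustration: the registry `TRANSFORMS`, the `register` and `compose` helpers, and the transformation names are all hypothetical, and the slides propose an XML language for composition rather than Python.

```python
import struct
from typing import Callable, Dict

# Hypothetical registry standing in for the "ontology" of predefined
# transformations; the names below are illustrative, not taken from DFDL.
TRANSFORMS: Dict[str, Callable] = {}


def register(name: str):
    """Decorator that adds a transformation to the registry under a name."""
    def wrap(fn):
        TRANSFORMS[name] = fn
        return fn
    return wrap


@register("bytesToInt32BE")
def bytes_to_int32_be(data: bytes) -> int:
    # Interpret 4 bytes as a big-endian signed integer.
    return struct.unpack(">i", data)[0]


@register("scaleByTenth")
def scale_by_tenth(value: int) -> float:
    # Stored integers are assumed to be tenths of the real value.
    return value / 10


def compose(*names: str) -> Callable:
    """Chain registered transformations, applied left to right."""
    steps = [TRANSFORMS[n] for n in names]

    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


# Extensibility: a new transformation is added without touching the core.
@register("celsiusToKelvin")
def celsius_to_kelvin(value: float) -> float:
    return value + 273.15


decode = compose("bytesToInt32BE", "scaleByTenth", "celsiusToKelvin")
kelvin = decode(b"\x00\x00\x00\xd7")  # raw integer 215, i.e. 21.5 degrees C
```

Keeping each step small and named is what makes composition declarative: a description only needs to list the names in order.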

9 Or to put it another way…
XSD defines models for XML documents
DFDL extends XSD to define models for data in different formats
Efficient read/write access to binary and text data sources using DOM/SAX
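A rough sketch of what SAX-style access to a binary source with non-string values might look like is given below. It reuses Python's standard `xml.sax` `ContentHandler` interface for the element events; the `typed_value` callback, the `parse_binary` driver, and the `fields` description are hypothetical additions made up for this example and are not part of SAX or of any DFDL draft.

```python
import struct
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class TypedHandler(ContentHandler):
    """Standard SAX handler plus a hypothetical typed_value() callback."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None

    def startElement(self, name, attrs):
        self._current = name

    def typed_value(self, value):
        # Not part of SAX: delivers int/float directly, with no
        # conversion to and from strings.
        self.values[self._current] = value

    def endElement(self, name):
        self._current = None


def parse_binary(data, fields, handler):
    """Walk a binary record and fire SAX-like events for each field."""
    handler.startDocument()
    offset = 0
    for name, code in fields:
        (value,) = struct.unpack_from(code, data, offset)
        offset += struct.calcsize(code)
        handler.startElement(name, AttributesImpl({}))
        handler.typed_value(value)
        handler.endElement(name)
    handler.endDocument()


# Assumed description of the record: a little-endian int then a double.
fields = [("count", "<i"), ("mean", "<d")]
handler = TypedHandler()
parse_binary(struct.pack("<id", 4, 2.5), fields, handler)
```

After the call, `handler.values` holds the native Python numbers keyed by element name, which is the kind of efficiency gain the typed extension is meant to offer over passing everything through `characters()` as text.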

10 Current status
WG status
– Formed 1 year ago
– 6 months on a false start
– First draft expected GGF11
Key discussion:
– Mapping/transformation language
– Linking mechanisms
– XML representation
– Flexibility

11 Getting involved
Webpages: http://forge.gridforum.org/projects/dfdl-wg/
Mailing list: dfdl-wg@gridforum.org
My address: M.Westhead@epcc.ed.ac.uk

