E-Science Data Information and Knowledge Transformation BinX – A Tool for Binary File Access eDIKT project team Ted Wen

Presentation transcript:

BinX – A Tool for Binary File Access
eDIKT (e-Science Data Information and Knowledge Transformation) project team
Ted Wen, Robert Carroll

What is BinX?
• Binary in XML
  – Annotation language
    • Using XML
    • Descriptive
    • Low-level
  – Software components
    • BinX library
    • Generic utilities
    • API

How and Why BinX is used
[Diagram; labels: Special Application Program (one per data format, … …); BinX Library; Application Program (several, sharing the BinX Library)]

The BinX Language
Annotating a binary data stream
• Mark up data types
• Mark up sequences
• Mark up arrays
• Complex structures

Primitive Data Types
• Mark up data types
[Example listing lost in extraction; surviving byte values: FF 7F 7F FF FF FF C C]
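To illustrate why a raw byte stream needs type markup, the following sketch reads the same two bytes, FF 7F, as a signed 16-bit integer in both byte orders. It uses only Python's standard `struct` module and is not the BinX library or its API; the variable names are chosen here for illustration.

```python
import struct

# Two raw bytes with no annotation: their meaning depends on the
# declared type and byte order.
raw = bytes([0xFF, 0x7F])

# Read as a signed 16-bit integer, little-endian (value 0x7FFF).
little = struct.unpack("<h", raw)[0]

# Read as a signed 16-bit integer, big-endian (bit pattern 0xFF7F).
big = struct.unpack(">h", raw)[0]

print(little, big)
```

Without a description of the type and byte order, a reader of the file cannot tell which of the two values was intended.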

Abstract “struct” types
• Mark up a sequence
Screen descriptor in GIF:
  Screen width: unsigned short
  Screen height: unsigned short
  Packed field: byte
  Background colour index: byte
  Pixel aspect ratio: byte
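The GIF screen descriptor above is a fixed sequence of five fields, so it can be decoded in one step once the layout is known. A minimal sketch using Python's `struct` module follows; it is not BinX code, the field names and the sample bytes are made up for illustration, and it relies on GIF storing multi-byte values in little-endian order.

```python
import struct
from collections import namedtuple

# Field names are illustrative; they mirror the slide's description.
ScreenDescriptor = namedtuple(
    "ScreenDescriptor",
    "width height packed background_index pixel_aspect_ratio",
)

# Two unsigned shorts followed by three bytes, little-endian, no padding.
LAYOUT = struct.Struct("<HHBBB")

def read_screen_descriptor(raw: bytes) -> ScreenDescriptor:
    """Decode the 7-byte screen descriptor from the start of `raw`."""
    return ScreenDescriptor(*LAYOUT.unpack(raw[:LAYOUT.size]))

# Made-up sample: 320 x 200 screen, packed field 0xF7, remaining fields 0.
sample = bytes([0x40, 0x01, 0xC8, 0x00, 0xF7, 0x00, 0x00])
desc = read_screen_descriptor(sample)
print(desc)
```

The format string plays the role that a struct annotation plays in a descriptive language: it names the element types and their order, and nothing else about the application.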

Abstract “array” types
• Mark up an array
A 2-dimensional array of 10-by-100 32-bit integers
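The array described above occupies 10 × 100 × 4 = 4000 bytes. The sketch below shows how such a block could be read into rows with Python's `struct` module. It is not BinX code: the little-endian byte order and the row-major layout are assumptions made here, since the slide states neither, and the function name is hypothetical.

```python
import struct

ROWS, COLS = 10, 100  # dimensions taken from the slide

def read_int32_grid(data: bytes, rows: int = ROWS, cols: int = COLS):
    """Decode rows*cols 32-bit integers (assumed little-endian, row-major)."""
    expected = rows * cols * 4
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    flat = struct.unpack(f"<{rows * cols}i", data)
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]

# Made-up sample data: the integers 0..999 in storage order.
sample = struct.pack(f"<{ROWS * COLS}i", *range(ROWS * COLS))
grid = read_int32_grid(sample)
print(len(sample), grid[3][7])
```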

Embedded abstract types
• Complex structures

User-defined metadata
• Label the data types and structures

Reusable type definitions
• Define macros for reuse

Linking to binary data
• Reference the binary data file
… …

A BinX document
[XML listing lost in extraction; surviving callouts: Root element, Data class section, Data instance section, Abstract data type]

DataBinX
DataBinX = BinX with Data

The BinX Library
• Core library
• Utilities
• Applications

BinX Components
• The library has core functionality to support generic utilities and applications
Layers: Applications, Utilities, BinX Library Core
  BinX core functionality: parse/generate BinX documents; read/write binary data; parse/generate DataBinX
  Generic tools: DataBinX pack/unpack; extractor; viewer; BinX editor
  Applications: domain-specific

BinX application models
• Data catalogue model
• Data manipulation model
• Data query model
• Data service model
• Data transportation model

Data catalogue model
[Diagram; labels: Primary storage; Binary data files (BINARY); Metadata (METADATA), ranging from Detailed to Abstract: Syntactic annotation, Semantic annotation, Classification, Domain specific; Cross-reference (XLink); BinX documents labelled BinX 1, BinX 1.1, BinX 1.2]

Data manipulation model
• Extraction
  – Subset of a dataset
• Combination
  – Merge several datasets
• Transformation
  – Conversion of data types
  – Change of sequence order
  – Transposition of array dimensions
• Transparency
  – Automatic change of byte order
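Two of the manipulations listed above, change of byte order and transposition of array dimensions, can be sketched in a few lines. This is plain Python for illustration, not the BinX library; the function names `swap_byte_order` and `transpose` are hypothetical.

```python
import struct

def swap_byte_order(data: bytes, width: int = 4) -> bytes:
    """Reverse the bytes of every `width`-byte element in `data`."""
    if len(data) % width:
        raise ValueError("data length is not a multiple of the element width")
    return b"".join(
        data[i:i + width][::-1] for i in range(0, len(data), width)
    )

def transpose(rows):
    """Swap the two dimensions of a rectangular list of lists."""
    return [list(column) for column in zip(*rows)]

# Big-endian input converted so that it can be read as little-endian.
big_endian = struct.pack(">3i", 1, 2, 3)
values = struct.unpack("<3i", swap_byte_order(big_endian))

# A 2-by-3 array becomes a 3-by-2 array.
flipped = transpose([[1, 2, 3], [4, 5, 6]])
print(values, flipped)
```

A library that knows the declared byte order of a file can apply the first step without the application asking for it, which is what the "Transparency" bullet refers to.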

Data query model
• In-dataset query
  – XPath against virtual XML
• Cross-dataset query
  – Link into multiple datasets
• Defining result format
  – XQuery-based return fragment
• Output interface
  – SAX events
[Diagram; labels: applications (APP) reach BinX data sources through the Utility and the BinX library, using XPath, XLink and Custom XQuery; outputs shown are DataBinX, VOTable and SAX Events, with a Transform step]
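The idea of running XPath against an XML view of binary data can be sketched with Python's standard library. Everything named here is an assumption for illustration: the element names `dataset` and `integer`, the `index` attribute and the function `to_xml_view` are hypothetical and are not taken from the BinX schema or API. The sketch also builds the XML tree in memory, whereas a virtual XML view would avoid materialising it.

```python
import struct
import xml.etree.ElementTree as ET

def to_xml_view(data: bytes, count: int) -> ET.Element:
    """Expose `count` little-endian 32-bit integers as an XML tree.

    Element and attribute names are hypothetical, chosen for this sketch.
    """
    root = ET.Element("dataset")
    for i, value in enumerate(struct.unpack(f"<{count}i", data)):
        item = ET.SubElement(root, "integer", index=str(i))
        item.text = str(value)
    return root

# Made-up binary input: four 32-bit integers.
data = struct.pack("<4i", 10, 20, 30, 40)
view = to_xml_view(data, 4)

# ElementTree supports a limited XPath subset, enough for this lookup.
hit = view.find("./integer[@index='2']")
print(hit.text)
```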

Data service model
• Publishing logical datasets in BinX
  – Dataset from one binary file
  – Dataset from several binary files
  – Dataset from multiple data sources
[Diagram; labels: DB, Client, BinX, Grid]

Data transportation model
DataBinX as interlingua
[Diagram; the same chain of labels appears on the Send side and on the Receive side: XML document, XSLT, DataBinX, BinX Util, BinX schema + Binary, ZIP tool, ZIP (MIME)]

Application in Astronomy
Case Study
Data Conversion Between FITS and VOTable

Application in astronomy
• FITS and VOTable conversion
[Diagram; labels: FITS file (SIMPLE = T … END), VOTable document (<?xml version=. …), DataBinX Utility, BinX library Core]

FITS file
Primary HDU
  Header (card columns 0 to 79)
    SIMPLE = T / file does conform to FITS standard
    BITPIX = 8 / number of bits per data pixel
    NAXIS = 1 / number of data axes
    …
    END
  Data
    3D 4A 14 0F 1C FE … …
Extension
  Header
    XTENSION= 'BINTABLE' / binary table extension
    BITPIX = 8 / 8-bit bytes
    NAXIS = 2 / 2-dimensional binary table
    …
    END
  Data
    7B 3E 40 2C E7 6F … …
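A FITS header such as the one above is a run of 80-character ASCII cards, padded to a multiple of 2880 bytes and terminated by an END card. The sketch below reads the keyword and value from each card. It is a simplified illustration in plain Python, not the BinX FITS preprocessor: the function names are hypothetical, and splitting on "/" would mishandle string values that themselves contain a slash.

```python
CARD = 80      # length of one header card in bytes
BLOCK = 2880   # FITS headers are padded to a multiple of this size

def make_card(text: str) -> bytes:
    """Pad a header line to a full 80-byte card (helper for the sample)."""
    return text.ljust(CARD).encode("ascii")

def parse_header(raw: bytes) -> dict:
    """Collect keyword/value pairs up to the END card."""
    fields = {}
    for start in range(0, len(raw), CARD):
        card = raw[start:start + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # blank or commentary card, no value indicator
        # Simplification: drop everything after the first "/" as comment.
        fields[keyword] = card[10:].split("/", 1)[0].strip()
    return fields

# Sample header built from the cards shown on the slide.
header = b"".join(make_card(line) for line in [
    "SIMPLE  =                    T / file does conform to FITS standard",
    "BITPIX  =                    8 / number of bits per data pixel",
    "NAXIS   =                    1 / number of data axes",
    "END",
]).ljust(BLOCK, b" ")

fields = parse_header(header)
print(fields)
```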

VOTable
[XML listing lost in extraction; the only surviving text content is “Procyon”]

FITS → DataBinX → VOTable
• FITS to VOTable conversion
[Diagram; labels: FITS, Preprocessor, BinX schema, DataBinX Utility, DataBinX, XSLT transformer, VOTable]

VOTable → DataBinX → FITS
• VOTable to FITS conversion
[Diagram; labels: VOTable, XSLT, XSLT transformer, DataBinX, BinX schema, DataBinX Utility, Binary Data, Post processor, FITS Header, FITS]

BinX Software
• Software library in C++
• Documentation
• Utilities and Samples

Future releases
• XPath-based data query
• DFDL support
• Output through SAX events
• Output as XQuery return
• Database interfacing
• Java wrapper for utilities

Support
• Information and software download:
• Questions:
• Requirements and suggestions: