Published by Wesley Maxwell. Modified over 9 years ago.
1
e-Science Data Information and Knowledge Transformation
The BinX Language
2
www.edikt.org
What is BinX?
Binary in XML
– Use XML to mark up binary data
– Mark up data types
– Mark up sequences
– Mark up arrays
– Complex structures
3
Primitive Data Types
Mark up data types
1. 32767: FF 7F
2. 2147483647: 7F FF FF FF
3. 100.0: 00 00 C8 42
4. 100.0: 42 C8 00 00
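The four byte sequences on this slide decode to the four listed values once type and byte order are known. The sketch below checks that with Python's standard `struct` module, which stands in here for a BinX reader; the pairing of each byte sequence with a type and byte order is inferred from the values, not taken from BinX markup.

```python
import struct

# Byte sequences copied from the slide. The format codes are an assumption
# inferred from the values: "<" is little-endian, ">" is big-endian.
samples = [
    ("<h", bytes.fromhex("FF7F")),      # little-endian 16-bit integer
    (">i", bytes.fromhex("7FFFFFFF")),  # big-endian 32-bit integer
    ("<f", bytes.fromhex("0000C842")),  # little-endian 32-bit float
    (">f", bytes.fromhex("42C80000")),  # big-endian 32-bit float
]

decoded = [struct.unpack(fmt, raw)[0] for fmt, raw in samples]
print(decoded)  # [32767, 2147483647, 100.0, 100.0]
```

The same value (100.0) appears twice because the two float sequences differ only in byte order, which is exactly the information a type annotation has to carry.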
4
Abstract "struct" types
Mark up a sequence
Screen descriptor in GIF:
– Screen width: unsigned short
– Screen height: unsigned short
– Packed field: byte
– Background colour index: byte
– Pixel aspect ratio: byte
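The screen descriptor is a fixed sequence of two unsigned shorts and three bytes, so it maps directly onto a struct format string. A minimal sketch in Python follows, assuming the descriptor sits right after the 6-byte GIF signature and uses little-endian byte order, as the GIF format specifies; the class and function names are illustrative and not part of BinX.

```python
import struct
from typing import NamedTuple


class ScreenDescriptor(NamedTuple):
    """Field names are illustrative; they mirror the slide's list."""
    width: int
    height: int
    packed: int
    background_index: int
    aspect_ratio: int


def read_screen_descriptor(gif_bytes: bytes) -> ScreenDescriptor:
    # Skip the 6-byte signature ("GIF87a" or "GIF89a"), then read
    # two unsigned shorts (H) and three unsigned bytes (B), little-endian.
    return ScreenDescriptor(*struct.unpack_from("<HHBBB", gif_bytes, 6))


# Hand-made sample: signature followed by a 320 x 240 descriptor.
sample = b"GIF89a" + bytes([0x40, 0x01, 0xF0, 0x00, 0xF7, 0x00, 0x00])
descriptor = read_screen_descriptor(sample)
print(descriptor)
```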
5
Abstract "array" types
Mark up an array
A 2-dimensional, 10-by-100 array of 32-bit integers
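A 10-by-100 array of 32-bit integers occupies 4000 bytes, and reading it back requires the dimensions, the element type, the byte order and the storage order. The sketch below assumes little-endian, row-major storage, neither of which the slide states; `unpack_grid` is a hypothetical helper, not a BinX function.

```python
import struct

ROWS, COLS = 10, 100  # dimensions from the slide


def unpack_grid(raw: bytes, byte_order: str = "<"):
    """Decode ROWS x COLS signed 32-bit integers, assuming row-major order."""
    values = struct.unpack(f"{byte_order}{ROWS * COLS}i", raw)
    return [list(values[r * COLS:(r + 1) * COLS]) for r in range(ROWS)]


# Build sample data (0, 1, 2, ...) so the layout is easy to check.
raw = struct.pack(f"<{ROWS * COLS}i", *range(ROWS * COLS))
grid = unpack_grid(raw)
print(len(raw), len(grid), len(grid[0]), grid[9][99])
```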
6
Embedded abstract types
Complex structures
7
User-defined metadata
Label the data types and structures
8
Reusable type definitions
Define macros for reuse
9
Linking to binary data
Reference the binary data file … …
10
A BinX document
– Root element
– Data class section
– Data instance section
– Abstract data type
11
DataBinX
DataBinX = BinX with data
100 1000 5.257 1 2
12
The BinX Library
13
BinX Components
The library has core functionality to support generic utilities and applications.
BinX Library Core (BinX core functionality)
– Parse/generate BinX documents
– Read/write binary data
– Parse/generate DataBinX
Utilities (generic tools)
– DataBinX pack/unpack
– Extractor, viewer
– BinX editor
Applications
– Domain-specific
14
BinX application models
– Data catalogue model
– Data manipulation model
– Data query model
– Data service model
– Data transportation model
15
Data catalogue model
Primary storage
Binary data files
Metadata
Syntactic annotation
Semantic annotation
Classification
Domain specific
Cross-reference
XLink
[Diagram: binary files (BINARY) described by a hierarchy of BinX documents (METADATA), from abstract to detailed: BinX 1; BinX 1.1, BinX 1.2; BinX 1.2.1, BinX 1.2.2, BinX 1.2.3]
16
Data manipulation model
Extraction
– Subset of a dataset
Combination
– Merge several datasets
Transformation
– Conversion of data types
– Change of sequence order
– Transposition of array dimensions
Transparency
– Automatic change of byte order
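Two of the listed manipulations, byte-order change and transposition of array dimensions, are small enough to sketch directly. The Python below is only an illustration of the operations themselves; the function names are hypothetical and do not reflect the BinX library's interface.

```python
import struct


def swap_byte_order(raw: bytes, count: int) -> bytes:
    """Re-encode `count` big-endian 32-bit integers as little-endian."""
    return struct.pack(f"<{count}i", *struct.unpack(f">{count}i", raw))


def transpose(rows):
    """Swap the two dimensions of a rectangular list of lists."""
    return [list(column) for column in zip(*rows)]


big = struct.pack(">3i", 1, 2, 3)
little = swap_byte_order(big, 3)
print(big.hex())     # 000000010000000200000003
print(little.hex())  # 010000000200000003000000
print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
```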
17
Data query model
In-dataset query
– XPath against virtual XML
Cross-dataset query
– Link into multiple datasets
Defining result format
– XQuery-based return fragment
Output interface
– SAX events
[Diagram labels: Utility, BinX library, BinX data source, DataBinX, VOTable, Custom, APP, SAX events, XPath, XQuery, XLink, Transform]
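The idea of an in-dataset query is that binary values can be addressed as if they were XML elements. The sketch below illustrates this with Python's `xml.etree.ElementTree`, which supports a limited XPath subset. It builds the XML in memory for simplicity, whereas a "virtual" XML view would presumably avoid materialising it; the element and attribute names (`dataset`, `integer`, `index`) are hypothetical and are not DataBinX markup.

```python
import struct
import xml.etree.ElementTree as ET


def to_xml_view(raw: bytes) -> ET.Element:
    """Expose little-endian 32-bit integers as XML elements.

    Element and attribute names are invented for this illustration.
    """
    root = ET.Element("dataset")
    for index, value in enumerate(struct.unpack(f"<{len(raw) // 4}i", raw)):
        item = ET.SubElement(root, "integer", {"index": str(index)})
        item.text = str(value)
    return root


raw = struct.pack("<4i", 10, 20, 30, 40)
root = to_xml_view(raw)

# XPath-style predicate: select the element whose index attribute is 2.
hits = [element.text for element in root.findall("./integer[@index='2']")]
print(hits)  # ['30']
```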
18
Data service model
Publishing logical datasets in BinX
– Dataset from one binary file
– Dataset from several binary files
– Dataset from multiple data sources
[Diagram labels: DB, binary data, Client, BinX, Grid]
19
Data transportation model
DataBinX as interlingua
[Diagram: on both the Send and the Receive side, XML document, DataBinX, BinX + Binary and ZIP (MIME) are linked by XSLT, BinX Util (with Schema BinX) and a ZIP tool]
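The last step of the transport pipeline bundles a BinX document and its binary file into one ZIP package for sending, and the receiver unpacks it again. A minimal sketch of that packaging step with Python's standard `zipfile` module follows; the member names and the `<binx/>` payload are placeholders, not valid BinX, and the slide does not say which ZIP tool was actually used.

```python
import io
import zipfile


def pack(members: dict) -> bytes:
    """Bundle named byte payloads into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def unpack(blob: bytes) -> dict:
    """Restore the named payloads from a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Placeholder contents: a stand-in description file plus raw binary data.
sent = {"table.binx.xml": b"<binx/>", "table.bin": bytes(range(16))}
received = unpack(pack(sent))
print(received == sent)  # True
```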
20
Application in Astronomy
Case Study 1: Data Conversion Between FITS and VOTable
21
Application in astronomy
FITS and VOTable conversion
[Diagram: the DataBinX utility, built on the BinX library core, sits between FITS files (a "SIMPLE = T … END" header followed by binary data) and XML documents ("<?xml version=. …")]
22
FITS file
Header cards span columns 0 to 79.
Primary HDU
– Header:
  SIMPLE = T / file does conform to FITS standard
  BITPIX = 8 / number of bits per data pixel
  NAXIS = 1 / number of data axes
  …
  END
– Data: 3D 4A 14 0F 1C FE 25 04 … …
Extension
– Header:
  XTENSION= 'BINTABLE' / binary table extension
  BITPIX = 8 / 8-bit bytes
  NAXIS = 2 / 2-dimensional binary table
  …
  END
– Data: 7B 3E 40 2C 16 70 E7 6F … …
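A FITS header is plain ASCII made of fixed 80-character cards, padded to a multiple of 2880 bytes and terminated by an END card, which is why it can be separated from the binary data without extra metadata. The sketch below parses only the three cards shown on the slide. It is a simplification: it splits on the first "/" and would therefore mishandle quoted string values that contain a slash; `parse_header` and `make_card` are hypothetical helpers.

```python
CARD = 80     # bytes per header card
BLOCK = 2880  # FITS headers and data are padded to multiples of this size


def parse_header(raw: bytes) -> dict:
    """Collect keyword/value pairs up to the END card (simplified)."""
    cards = {}
    for offset in range(0, len(raw), CARD):
        card = raw[offset:offset + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            # Everything after the first "/" is treated as a comment.
            cards[keyword] = card[10:].split("/", 1)[0].strip()
    return cards


def make_card(text: str) -> bytes:
    """Pad a card to 80 characters with spaces."""
    return text.ljust(CARD).encode("ascii")


header = b"".join(make_card(card) for card in [
    "SIMPLE  =                    T / file does conform to FITS standard",
    "BITPIX  =                    8 / number of bits per data pixel",
    "NAXIS   =                    1 / number of data axes",
    "END",
]).ljust(BLOCK)

print(parse_header(header))
```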
23
VOTable
Procyon 114.827 5.227 4 5 3 4 3 2 1 2 3 3 5 6
24
FITS → DataBinX → VOTable
FITS to VOTable conversion
[Diagram labels: FITS, Preprocessor, Schema BinX, DataBinX Utility, DataBinX, XSLT transformer, VOTable]
25
VOTable → DataBinX → FITS
VOTable to FITS conversion
[Diagram labels: VOTable, XSLT Preprocessor, XSLT transformer, DataBinX, Schema BinX, DataBinX Utility, Binary Data, FITS Header, Post processor, FITS]
26
FITS-VOTable experiment
Sample FITS file
– A data table of 82 rows × 20 fields
– File size: 37 KB
DataBinX generated by the DataBinX utility
– Time spent: 268 ms
– DataBinX document size: 1.2 MB
VOTable transformed by MSXML
– Time spent: about 1 second
– VOTable document size: 51 KB
27
Application in Astronomy
Case Study 2: Data Transportation by Pipelining BinX and VOTable
28
The Problem
Three kinds of VOTable data sources
– Pure XML VOTable (large)
– VOTable + FITS (small)
– VOTable + Binary (smaller)
Difficulties
– Additional parser for VOTable + Binary
– Limited binary format
– Byte order and data types
29
The Solution: VOTable + BinX
– No coding necessary
– Smaller data files
– Easy to separate and restore
– Pipelined to work in the background
– Platform independent
30
Approaches
1. Embedded BinX
2. BinX document linking
Perhaps another method?
31
Embedded BinX
Example: http://www.edikt.org/binx/2003/06/binx
32
BinX Document Linking
Example:
33
Comparison of the two approaches
Embedded BinX
– Advantages:
  One annotation file
  Consistency with VOTable definitions
– Disadvantages:
  Spoils the VOTable document
  Difficult to parse
BinX document linking
– Advantages:
  Keeps the VOTable clean
  Easy to parse
– Disadvantages:
  Needs a separate BinX document
  Difficult to keep consistent
34
BinX Software Today and the Future
35
Future releases
– Utilities (GUI BinX editor)
– XPath-based data query
– DFDL support
– Text file support
– Output through SAX events
– Output as XQuery return
– Database interfacing
– Java wrapper for utilities
36
Support
Information and software download:
– http://www.edikt.org/binx (coming soon)
Questions:
– support@edikt.org
Requirements and suggestions:
– tedwen@edikt.org
– robertc@edikt.org