E-Science Data Information and Knowledge Transformation BinX – A Tool for Binary File Access eDIKT project team Ted Wen


1 e-Science Data Information and Knowledge Transformation BinX – A Tool for Binary File Access eDIKT project team Ted Wen tedwen@edikt.org Robert Carroll robert.carroll@edikt.org

2 www.edikt.org Agenda: About the BinX project; Introduction to the BinX language; Introduction to the BinX library; Example application; Overview of the BinX API; Discussion

3 The problem: Most scientific data are in binary files; binary data files are not all standardized; binary data files are platform-dependent; XML is useful to represent metadata; scientific datasets can be too large in XML

4 What is BinX?
Binary in XML
– Annotation language: using XML, descriptive, low-level
– Software components: BinX library, generic utilities, API

5 How and Why BinX is used
[Diagram labels: binary data; Special Application Program (×2); BinX Library; Application Program (×6)]

6 The BinX Language: annotating a binary data stream. Mark up data types; mark up sequences; mark up arrays; complex structures

7 Data elements
Primitive data elements
– Byte, character, integer, real
Complex data elements
– Arrays, struct, union
User-defined data elements

8 Primitive Data Types: Character (fixed length, variable length and delimited); Integer; Real

9 Primitive Data Types: mark up data types
   Bytes (hex)    Value
1. FF 7F          32767
2. 7F FF FF FF    2147483647
3. 00 00 C8 42    100.0
4. 42 C8 00 00    100.0
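The four byte sequences on this slide differ in width and in byte order: rows 1 and 3 read correctly as little-endian, rows 2 and 4 as big-endian. The following is a minimal C++ sketch of how such values could be decoded independently of the host platform. All names here are illustrative and are not part of the BinX library, and the float conversion assumes the host uses 32-bit IEEE 754 floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative names only; not the BinX API.
enum class ByteOrder { Little, Big };

// Assemble an unsigned integer of n bytes (n <= 8) from p in the given byte order.
std::uint64_t read_unsigned(const unsigned char* p, std::size_t n, ByteOrder order) {
    assert(n <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Visit bytes from most significant to least significant:
        // that is first-to-last for big-endian, last-to-first for little-endian.
        const std::size_t idx = (order == ByteOrder::Big) ? i : n - 1 - i;
        value = (value << 8) | p[idx];
    }
    return value;
}

std::int16_t read_int16(const unsigned char* p, ByteOrder order) {
    return static_cast<std::int16_t>(read_unsigned(p, 2, order));
}

std::int32_t read_int32(const unsigned char* p, ByteOrder order) {
    return static_cast<std::int32_t>(read_unsigned(p, 4, order));
}

// Assumption: float is a 32-bit IEEE 754 value on the host.
float read_float32(const unsigned char* p, ByteOrder order) {
    const std::uint32_t bits = static_cast<std::uint32_t>(read_unsigned(p, 4, order));
    float out;
    static_assert(sizeof(out) == sizeof(bits), "float must be 32 bits wide");
    std::memcpy(&out, &bits, sizeof(out));  // reinterpret the bit pattern
    return out;
}
```

Describing the byte order in the annotation, rather than hard-coding it in each reader, is what lets one routine like `read_unsigned` serve files written on different platforms.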

10 Abstract struct types: mark up a sequence
Screen descriptor in GIF:
– Screen width: unsigned short
– Screen height: unsigned short
– Packed field: byte
– Background colour index: byte
– Pixel aspect ratio: byte

11 Abstract array types: mark up an array. A 2-dimensional, 10-by-100 array of 32-bit integers

12 Embedded abstract types: complex structures

13 User-defined metadata: label the data types and structures

14 Reusable type definitions: define macros for reuse

15 Linking to binary data: reference the binary data file … …

16 The BinX document: http://www.edikt.org/binx

17 A BinX document: root element; data class section; data instance section; abstract data type

18 DataBinX: DataBinX = BinX with Data
[Example values from the slide's listing: 100, 1000, 5.257, 1, 2]

19 The BinX Library: core library; utilities; applications

20 Output from the library: DataBinX (combined data and BinX document); SchemaBinX; binary data stream. DataBinX = SchemaBinX + Binary data

21 BinX Components
The library has core functionality to support generic utilities and applications.
Applications: domain-specific
Utilities: generic tools
– DataBinX pack/unpack
– Extractor
BinX Library Core: BinX core functionality
– Parse/generate BinX document
– Read/write binary data
– Parse/generate DataBinX

22 BinX application models: data manipulation model; data transportation model; data service model; data query model; data catalogue model

23 Data manipulation model
Extraction
– Subset of a dataset
Combination
– Merge several datasets
Transformation
– Conversion of data types
– Change of sequence order
– Transposition of array dimensions
Transparency
– Automatic change of byte order
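Two of the operations listed above, change of byte order and transposition of array dimensions, can be sketched in a few lines of C++. This is an illustration of the operations themselves under the assumption of row-major arrays of 32-bit values. It is not how the BinX library implements them, and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reverse the byte order of a 32-bit value (big-endian <-> little-endian).
std::uint32_t swap_byte_order(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Transpose a rows-by-cols row-major array into a cols-by-rows row-major array.
std::vector<std::int32_t> transpose(const std::vector<std::int32_t>& in,
                                    std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<std::int32_t> out(in.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Element (r, c) of the input becomes element (c, r) of the output.
            out[c * rows + r] = in[r * cols + c];
        }
    }
    return out;
}
```

For example, transposing the 2-by-3 array `{1, 2, 3, 4, 5, 6}` yields the 3-by-2 array `{1, 4, 2, 5, 3, 6}`.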

24 Data transportation model: DataBinX as interlingua
[Diagram, shown for both Send and Receive: stages XML document, DataBinX, SchemaBinX + Binary, ZIP (MIME); tools XSLT, BinX Util, ZIP tool]

25 Data service model: publishing logical datasets in BinX. Dataset from one binary file; dataset from several binary files; dataset from multiple data sources
[Diagram labels: DB, binary data, Grid, BinX, Client]

26 Data query model
Create DataBinX
– From Binary and BinX
Query DataBinX
– Use XPath
Create New DataBinX
– Results from query
Parse DataBinX
– Create new Binary and BinX
[Diagram: BinX + Binary → DataBinX → XPath → New DataBinX → BinX + Binary]

27 Data catalogue model
Primary storage: binary data files
Metadata: syntactic annotation, semantic annotation
Classification: domain specific
Cross-reference: XLink
[Diagram: a hierarchy of BinX documents (BinX 1; BinX 1.1, BinX 1.2; BinX 1.2.1, BinX 1.2.2, BinX 1.2.3) above the BINARY data, with METADATA ranging from Detailed to Abstract]

28 Application in Astronomy. Case study: data conversion between FITS and VOTable

29 Application in astronomy: FITS and VOTable conversion
[Diagram labels: FITS file (SIMPLE = T … END, followed by binary data); DataBinX Utility; BinX library Core; VOTable (<?xml version=. …)]

30 FITS file
Primary HDU
  Header (columns 0 to 79):
    SIMPLE = T / file does conform to FITS standard
    BITPIX = 8 / number of bits per data pixel
    NAXIS = 1 / number of data axes
    …
    END
  Data: 3D 4A 14 0F 1C FE 25 04 … …
Extension
  Header:
    XTENSION= BINTABLE / binary table extension
    BITPIX = 8 / 8-bit bytes
    NAXIS = 2 / 2-dimensional binary table
    …
    END
  Data: 7B 3E 40 2C 16 70 E7 6F … …

31 VOTable
[Example values from the slide's listing: Procyon 114.827 5.227 4 5 3 4 3 2 1 2 3 3 5 6]

32 FITS to VOTable conversion: FITS → DataBinX → VOTable
[Diagram labels: FITS, Preprocessor, SchemaBinX, DataBinX Utility, DataBinX, XSLT transformer, VOTable]

33 VOTable to FITS conversion: VOTable → DataBinX → FITS
[Diagram labels: VOTable, XSLT, XSLT transformer, DataBinX, SchemaBinX, DataBinX Utility, Binary Data, FITS Header, Post processor, FITS]

34 Support
Information and software download:
– http://www.edikt.org/binx
Questions:
– support@edikt.org
Requirements and suggestions:
– tedwen@edikt.org
– robertc@edikt.org

35 BinX API

36 Parsing a BinX document
    // Parse the BinX document; on success, fetch its dataset.
    BxBinxFile* pReader = new BxBinxFile();
    if (pReader->parse("mybinx.xml")) {
        BxDataset* pDataset = pReader->getDataset();
    }

37 Reading a BinX document
    // Get an array object, either by index ...
    BxArrayFixed* pArray = pDataset->getArray(0);
    // ... or by name (an alternative to the line above)
    BxArrayFixed* pArray = pDataset->getArray("fixed");
    // Get a struct from the array
    BxDataset* pStruct = pArray->get(0, 0);

38 Reading a BinX document
    // Get the data value
    BxFloat32* pReal = pStruct->getFloat("Real");
    float real = pReal->getFloat();

39 Creating BinX document
    // Create an object to write out the document
    BxBinxFileWriter* pWriter = new BxBinxFileWriter();
    // Create a new dataset (in-memory BinX document)
    BxDataset* pData = new BxDataset();
    // Add a 16-bit integer with value 100 to the dataset
    BxShort16* i16 = new BxShort16(100);
    pData->addDataObject(i16);

40 Creating BinX document
    // Create a new binary file
    BxBinaryFile* pbf = new BxBinaryFile();
    // Create a link to the BinX document
    pbf->setDatasetPointer(pData);
    // Save the BinX document
    pWriter->setBinaryFilePtr(pbf);
    pWriter->save("TestDataset.xml");

41 Merge binary data
    // Open the two BinX documents and get their datasets
    BxBinxFileReader* pFile1 = new BxBinxFileReader("file1.xml");
    BxBinxFileReader* pFile2 = new BxBinxFileReader("file2.xml");
    BxDataset* pDataset1 = pFile1->getDataset();
    BxDataset* pDataset2 = pFile2->getDataset();
    // Take the first array from each dataset and its next data object
    BxArray* pArray1 = pDataset1->getArray(0);
    BxArray* pArray2 = pDataset2->getArray(0);
    BxDataObject* pData1 = pArray1->getNext();
    BxDataObject* pData2 = pArray2->getNext();
    // Write both objects, one after the other, to a single binary file
    FILE* fo = fopen("output.dat", "wb");
    pData1->toStreamBinary(fo);
    pData2->toStreamBinary(fo);
    fclose(fo);

42 Summary: one BinX document can describe many binary files; generate BinX document from code; easy-to-use interfaces; flexible

