BinX – A Tool for Binary File Access

BinX – A Tool for Binary File Access. eDIKT project team: Ted Wen (tedwen@edikt.org) and Robert Carroll (robert.carroll@edikt.org). Agenda: about the BinX project, introduction to the BinX language, introduction to the BinX library, example application, overview of the BinX API, discussion.


Presentation Transcript


  1. BinX – A Tool for Binary File Access • eDIKT project team • Ted Wen, tedwen@edikt.org • Robert Carroll, robert.carroll@edikt.org

  2. Agenda • About the BinX project • Introduction to the BinX language • Introduction to the BinX library • Example application • Overview of the BinX API • Discussion

  3. The problem • Most scientific data are in binary files • Binary data files are not all standardized • Binary data files are platform-dependent • XML is useful to represent metadata • Scientific datasets can be too large in XML

  4. What is BinX? • Binary in XML • Annotation language: uses XML, descriptive, low-level • Software components: BinX library, generic utilities, API

  5. How and Why BinX is used [Diagram labels: binary data (0101010101…), BinX document (<dataset> … … </dataset>), BinX Library, Application Program (several), Special Application Program]

  6. The BinX Language • Annotating a binary data stream • Mark up data types • Mark up sequences • Mark up arrays • Complex structures

  7. Data elements • Primitive data elements: byte, character, integer, real • Complex data elements: arrays, struct, union • User-defined data elements

  8. Primitive Data Types • Character: <character-8>, <string> (fixed length, variable length and delimited) • Integer: <byte-8>, <short-16>, <unsignedShort-16>, <integer-32>, <unsignedInteger-32>, <long-64>, <unsignedLong-64> • Real: <float-32>, <double-64>, <quadruple-128>

  9. Primitive Data Types • Mark up data types • (1) Bytes FF 7F: <short-16 byteOrder="littleEndian">32767</short-16> • (2) Bytes 7F FF FF FF: <integer-32 byteOrder="bigEndian">2147483647</integer-32> • (3) Bytes 00 00 C8 42: <float-32 byteOrder="littleEndian">100.0</float-32> • (4) Bytes 42 C8 00 00: <float-32 byteOrder="bigEndian">100.0</float-32>
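
The byte-order attribute on slide 9 can be checked by hand. The sketch below is not BinX library code: `ByteOrder`, `assembleBytes`, `readShort16`, `readInteger32` and `readFloat32` are hypothetical helper names, and the float conversion assumes the platform stores `float` as a 32-bit IEEE 754 value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these names come from BinX.
enum class ByteOrder { LittleEndian, BigEndian };

// Assemble an unsigned integer from `size` bytes stored in the given byte order.
std::uint64_t assembleBytes(const unsigned char* bytes, std::size_t size, ByteOrder order) {
    assert(bytes != nullptr && size >= 1 && size <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i) {
        // Visit the bytes from most significant to least significant.
        const std::size_t index = (order == ByteOrder::BigEndian) ? i : size - 1 - i;
        value = (value << 8) | bytes[index];
    }
    return value;
}

std::int16_t readShort16(const unsigned char* bytes, ByteOrder order) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(assembleBytes(bytes, 2, order)));
}

std::int32_t readInteger32(const unsigned char* bytes, ByteOrder order) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(assembleBytes(bytes, 4, order)));
}

float readFloat32(const unsigned char* bytes, ByteOrder order) {
    // Assumption: float is a 32-bit IEEE 754 value on this platform.
    static_assert(sizeof(float) == 4, "float must be 32 bits wide");
    const std::uint32_t bits = static_cast<std::uint32_t>(assembleBytes(bytes, 4, order));
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);  // reinterpret the bit pattern
    return value;
}
```

Feeding in the four byte sequences from the slide (FF 7F, 7F FF FF FF, 00 00 C8 42, 42 C8 00 00) with the byte order given in each element reproduces the annotated values.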

  10. Abstract "struct" types • Mark up a sequence • Screen descriptor in GIF: screen width (unsigned short), screen height (unsigned short), packed field (byte), background colour index (byte), pixel aspect ratio (byte) • <struct> <unsignedShort-16 /> <unsignedShort-16 /> <byte-8 /> <byte-8 /> <byte-8 /> </struct>
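
To make the struct on slide 10 concrete, here is a minimal sketch of reading the same five fields by hand. It does not use the BinX library; `ScreenDescriptor` and `parseScreenDescriptor` are hypothetical names, and the code relies on the fact that GIF stores multi-byte integers in little-endian order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative type; the field names are chosen here and are not BinX identifiers.
struct ScreenDescriptor {
    std::uint16_t width;                  // <unsignedShort-16 />
    std::uint16_t height;                 // <unsignedShort-16 />
    std::uint8_t packed;                  // <byte-8 />
    std::uint8_t backgroundColourIndex;   // <byte-8 />
    std::uint8_t pixelAspectRatio;        // <byte-8 />
};

// Two shorts plus three bytes: 7 bytes in the file.
constexpr std::size_t kScreenDescriptorSize = 7;

// Read the fields in declaration order from a little-endian byte buffer.
ScreenDescriptor parseScreenDescriptor(const unsigned char* bytes, std::size_t length) {
    assert(bytes != nullptr && length >= kScreenDescriptorSize);
    ScreenDescriptor d{};
    d.width = static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
    d.height = static_cast<std::uint16_t>(bytes[2] | (bytes[3] << 8));
    d.packed = bytes[4];
    d.backgroundColourIndex = bytes[5];
    d.pixelAspectRatio = bytes[6];
    return d;
}
```

Every such format needs its own hand-written reader in this style, which is the repetition a declarative description like the `<struct>` element is meant to remove.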

  11. Abstract "array" types • Mark up an array • A 2-dimensional array containing 10-by-100 32-bit integers: <arrayFixed> <integer-32 /> <dim indexTo="99"> <dim indexTo="9" /> </dim> </arrayFixed>
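
The slide describes a 10-by-100 array with `indexTo` values of 9 and 99, which suggests that `indexTo` is the last valid index of a zero-based dimension. Under that assumption, the element count and byte size of a fixed array follow directly. The function names below are hypothetical and not part of BinX.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements along one dimension whose last valid index is `indexTo`.
// Assumption: indices start at zero, so the count is indexTo + 1.
std::size_t dimCount(std::size_t indexTo) {
    return indexTo + 1;
}

// Total number of elements: the product of the counts of all dimensions.
std::size_t elementCount(const std::vector<std::size_t>& indexTos) {
    std::size_t total = 1;
    for (std::size_t indexTo : indexTos) {
        total *= dimCount(indexTo);
    }
    return total;
}

// Size in bytes of the array payload for elements of `elementBytes` bytes each.
std::size_t byteSize(const std::vector<std::size_t>& indexTos, std::size_t elementBytes) {
    assert(elementBytes > 0);
    return elementCount(indexTos) * elementBytes;
}
```

For the array on the slide, `byteSize({99, 9}, 4)` gives 4000 bytes: 1000 elements of 4 bytes each.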

  12. Embedded abstract types • Complex structures <struct> <short-16 /> <arrayFixed> <byte-8 /> <dim indexTo="7" /> </arrayFixed> <struct> <integer-32 /> <float-32 /> <double-64 /> </struct> </struct>

  13. User-defined metadata • Label the data types and structures <struct varName="Data Sample"> <short-16 varName="ID" /> <arrayFixed varName="List of 10 complex numbers"> <struct varName="Complex"> <float-32 varName="Real" /> <float-32 varName="Imaginary" /> </struct> <dim indexTo="9" /> </arrayFixed> </struct>

  14. Reusable type definitions • Define macros for reuse <definitions> <defineType typeName="FourCC"> <arrayFixed> <character-8 /> <dim count="4" /> </arrayFixed> </defineType> </definitions> <struct varName="Wave_Header"> <useType typeName="FourCC" varName="Keyword" /> <integer-32 varName="Chunk_Size" /> </struct>
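
As a rough illustration of what the `Wave_Header` struct on slide 14 describes, the sketch below reads a four-character keyword followed by a 32-bit chunk size. The names `WaveHeader`, `parseWaveHeader` and `keywordString` are hypothetical, and the little-endian byte order is an assumption based on the RIFF/WAV format; the slide itself does not set `byteOrder`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative in-memory form of the Wave_Header struct; not a BinX type.
struct WaveHeader {
    std::array<char, 4> keyword;  // FourCC: four <character-8 /> elements
    std::int32_t chunkSize;       // <integer-32 varName="Chunk_Size" />
};

// Read the 8-byte header. Assumption: the integer is stored little-endian.
WaveHeader parseWaveHeader(const unsigned char* bytes, std::size_t length) {
    assert(bytes != nullptr && length >= 8);
    WaveHeader header{};
    for (std::size_t i = 0; i < 4; ++i) {
        header.keyword[i] = static_cast<char>(bytes[i]);
    }
    // Widen each byte before shifting to avoid signed overflow.
    const std::uint32_t raw = static_cast<std::uint32_t>(bytes[4]) |
                              (static_cast<std::uint32_t>(bytes[5]) << 8) |
                              (static_cast<std::uint32_t>(bytes[6]) << 16) |
                              (static_cast<std::uint32_t>(bytes[7]) << 24);
    header.chunkSize = static_cast<std::int32_t>(raw);
    return header;
}

// Convenience: the keyword as a std::string of exactly four characters.
std::string keywordString(const WaveHeader& header) {
    return std::string(header.keyword.data(), header.keyword.size());
}
```

The same `FourCC` layout recurs in every chunk of such a file, which is why defining it once and referring to it with `<useType>` pays off.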

  15. Linking to binary data • Reference the binary data file <definitions> <defineType typeName="Header">… …</defineType> <defineType typeName="Format_Chunk">… …</defineType> <defineType typeName="Data_Chunk">… …</defineType> </definitions> <dataset src="myfile.wav"> <useType typeName="Header" /> <useType typeName="Format_Chunk" /> <useType typeName="Data_Chunk" /> </dataset>

  16. The BinX document <?xml version="1.0"?> <binx xmlns="http://www.edikt.org/binx"> <dataset src="binary.bin" byteOrder="littleEndian"> <short-16/> <integer-32/> <double-64/> </dataset> </binx>

  17. A BinX document <binx byteOrder="bigEndian"> <definitions> <defineType typeName="myTyp"> <arrayFixed> <character-8/> <dim indexTo="9"/> </arrayFixed> </defineType> </definitions> <dataset src="myfile.bin"> <useType typeName="myTyp"/> <integer-32 varName="X" /> </dataset> </binx> • Callouts: root element (<binx>), data class section (<definitions>), abstract data type (<defineType>), data instance section (<dataset>)

  18. DataBinX • DataBinX = BinX with Data • BinX: <dataset src="myfile.bin"> <struct> <short-16 /> <long-64 /> <double-64 /> </struct> <arrayFixed> <integer-32 /> <dim count="2" /> </arrayFixed> </dataset> • DataBinX: <dataset> <struct> <short-16>100</short-16> <long-64>1000</long-64> <double-64>5.257</double-64> </struct> <arrayFixed> <dim> <integer-32>1</integer-32> </dim> <dim> <integer-32>2</integer-32> </dim> </arrayFixed> </dataset>
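
The relationship between a BinX description and its DataBinX form can be sketched as simple text generation: each element keeps its tag and gains the decoded value as content. The code below only mirrors the element layout shown on slide 18; `element`, `structToDataBinX` and `arrayToDataBinX` are hypothetical names, and the BinX library may well produce its output differently.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Wrap one value in a BinX-style element, e.g. <short-16>100</short-16>.
template <typename T>
std::string element(const std::string& tag, const T& value) {
    assert(!tag.empty());
    std::ostringstream out;
    out << '<' << tag << '>' << value << "</" << tag << '>';
    return out.str();
}

// The struct from the slide: a short-16, a long-64 and a double-64.
std::string structToDataBinX(std::int16_t first, std::int64_t second, double third) {
    return "<struct>" + element("short-16", first) + element("long-64", second) +
           element("double-64", third) + "</struct>";
}

// The one-dimensional array from the slide: one <dim> wrapper per element.
std::string arrayToDataBinX(const std::vector<std::int32_t>& values) {
    std::string xml = "<arrayFixed>";
    for (std::int32_t value : values) {
        xml += "<dim>" + element("integer-32", value) + "</dim>";
    }
    return xml + "</arrayFixed>";
}
```

Calling `structToDataBinX(100, 1000, 5.257)` and `arrayToDataBinX({1, 2})` yields the two fragments inside the second `<dataset>` on the slide, without the whitespace.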

  19. The BinX Library • Core library • Utilities • Applications

  20. Output from the library • DataBinX: combined data and BinX document • SchemaBinX • Binary data stream • DataBinX = SchemaBinX + binary data

  21. BinX Components • The library has core functionality to support generic utilities and applications • BinX Library Core (BinX core functionality): parse/generate BinX document, read/write binary data, parse/generate DataBinX • Utilities (generic tools): DataBinX pack/unpack, extractor • Applications (domain-specific)

  22. BinX application models • Data manipulation model • Data transportation model • Data service model • Data query model • Data catalogue model

  23. Data manipulation model • Extraction: subset of a dataset • Combination: merge several datasets • Transformation: conversion of data types, change of sequence order, transposition of array dimensions • Transparency: automatic change of byte order
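
Two of the manipulations named on slide 23, change of byte order and transposition of array dimensions, can be sketched in a few lines. This is a generic illustration rather than the BinX implementation: `swapByteOrder` and `transpose` are hypothetical names, and the transposition assumes the input array is stored row by row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reverse the bytes of every element in place, converting a buffer of
// fixed-size elements between little-endian and big-endian storage.
void swapByteOrder(std::vector<unsigned char>& data, std::size_t elementBytes) {
    assert(elementBytes > 0 && data.size() % elementBytes == 0);
    for (std::size_t offset = 0; offset < data.size(); offset += elementBytes) {
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(offset);
        std::reverse(first, first + static_cast<std::ptrdiff_t>(elementBytes));
    }
}

// Transpose a rows-by-cols array. Assumption: elements are stored row by row,
// so element (r, c) sits at index r * cols + c.
std::vector<std::int32_t> transpose(const std::vector<std::int32_t>& in,
                                    std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<std::int32_t> out(in.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[c * rows + r] = in[r * cols + c];
        }
    }
    return out;
}
```

Swapping the four bytes 00 00 C8 42 with an element size of 4 gives 42 C8 00 00, the big-endian form of the same 32-bit value.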

  24. Data transportation model • DataBinX as interlingua [Diagram labels: BinX + Binary, Schema BinX, BinX Util, XSLT, ZIP tool, Send, Receive, XML document, DataBinX, ZIP (MIME)]

  25. Data service model • Publishing logical datasets in BinX [Diagram labels: Client, Grid, BinX, DB, binary data (0101010101), Dataset from one binary file, Dataset from several binary files, Dataset from multiple data sources]

  26. Data query model • Create DataBinX: from binary and BinX • Query DataBinX: use XPath • Create new DataBinX: results from query • Parse DataBinX: create new binary and BinX [Diagram labels: BinX + Binary, DataBinX, XPath, New DataBinX, BinX + Binary]

  27. Data catalogue model • Primary storage: binary data files • Metadata: syntactic annotation, semantic annotation • Classification: domain specific • Cross-reference: XLink [Diagram labels: METADATA layer ranging from Abstract (BinX 1) through BinX 1.1 and BinX 1.2 to Detailed (BinX 1.2.1, BinX 1.2.2, BinX 1.2.3); BINARY layer (0101010101)]

  28. Application in Astronomy • Case study: data conversion between FITS and VOTable

  29. Application in astronomy • FITS and VOTable conversion [Diagram labels: FITS file (SIMPLE = T … … END, then binary data 01010101), DataBinX Utility, BinX library Core, VOTable document (<?xml version=. <VOTABLE> … … </VOTABLE>)]

  30. FITS file [Diagram labels: columns 0 to 79; Primary HDU (Header, Data); Extension (Header, Data)]

  31. VOTable <VOTABLE> <RESOURCE> <PARAM name="Obs" value="Bob"/> <TABLE name="Stars"> <FIELD name="Star-name" datatype="char" arraysize="10" /> <FIELD name="RA" datatype="float" /> <FIELD name="Dec" datatype="float" /> <FIELD name="Counts" datatype="int" arraysize="2x3x*" /> <DATA> <TABLEDATA> <TR> <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD> <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD> </TR> </TABLEDATA> </DATA> </TABLE> </RESOURCE> </VOTABLE>

  32. FITS → DataBinX → VOTable • FITS to VOTable conversion [Diagram labels: FITS, Preprocessor, Schema BinX, DataBinX Utility, DataBinX, XSLT, XSLT transformer, VOTable]

  33. VOTable → DataBinX → FITS • VOTable to FITS conversion [Diagram labels: VOTable, XSLT, XSLT transformer, DataBinX, DataBinX Utility, Schema BinX, Binary Data, FITS Header, Post processor, FITS]

  34. Support • Information and software download: http://www.edikt.org/binx • Questions: support@edikt.org • Requirements and suggestions: tedwen@edikt.org, robertc@edikt.org

  35. BinX API

  36. Parsing a BinX document BxBinxFile* pReader = new BxBinxFile(); if (pReader->parse("mybinx.xml")) { BxDataset* pDataset = pReader->getDataset(); }

  37. Reading a BinX document • Get an array object, by index or by name: BxArrayFixed* pArray = pDataset->getArray(0); or BxArrayFixed* pArray = pDataset->getArray("fixed"); • Get a struct from the array: BxDataset* pStruct = pArray->get(0, 0);

  38. Reading a BinX document • Get the data value: BxFloat32* pReal = pStruct->getFloat("Real"); float real = pReal->getFloat();

  39. Creating BinX document • Create an object to write out the document: BxBinxFileWriter* pWriter = new BxBinxFileWriter(); • Create a new dataset (in-memory BinX document): BxDataset* pData = new BxDataset(); • Add a data object to the dataset: BxShort16* i16 = new BxShort16(100); pData->addDataObject(i16);

  40. Creating BinX document • Create a new binary file: BxBinaryFile* pbf = new BxBinaryFile(); • Create a link to the BinX document: pbf->setDatasetPointer(pData); • Save the BinX document: pWriter->setBinaryFilePtr(pbf); pWriter->save("TestDataset.xml");

  41. Merge binary data BxBinxFileReader* pFile1 = new BxBinxFileReader("file1.xml"); BxBinxFileReader* pFile2 = new BxBinxFileReader("file2.xml"); BxDataset* pDataset1 = pFile1->getDataset(); BxDataset* pDataset2 = pFile2->getDataset(); BxArray* pArray1 = pDataset1->getArray(0); BxArray* pArray2 = pDataset2->getArray(0); BxDataObject* pData1 = pArray1->getNext(); BxDataObject* pData2 = pArray2->getNext(); FILE* fo = fopen("output.dat", "wb"); pData1->toStreamBinary(fo); pData2->toStreamBinary(fo); fclose(fo);

  42. Summary • One BinX document can describe many binary files • Generate BinX document from code • Easy to use interfaces • Flexible
