Physical, Logical, Conceptual DSA Lecture
Abstraction Layers
Conceptual
– What data is held
– Examples: an image and its meta-data; an Entity-Relationship model (ERM)
Logical
– How data is organised in storage
– Examples: block and directory structure; tables, keys
Physical
– How data is stored in bits
– Examples: JPEG as a stream of bytes; a database as files and records stored in a DBMS-specific format
Abstraction (Reverse Engineering)
Realisation (Refinement, Reification) (Engineering, Model-Driven Development)
On Flickr Image Data API:
FireFox IE
Opera Browser properties
JPEG and Exif
This image uncompressed would be
– 1520 x 2032 x 3 bytes = 9,265,920 bytes (about 9.3 MB)
– File size is only 342kb
– Compression ratio is roughly 27:1
JPEG achieves high compression
– Lossy: cannot recover the full raw data
– Level of compression is variable
JPEG
– Joint Photographic Experts Group developed the standard for compression.
– Defines how to create a byte stream for the image alone.
– Wikipedia entry describes the algorithm.
– JPEG is a BSI standard BS ISO/IEC :2003 – we have a licence for these.
JPEG/JFIF
– Defines how these bytes are packaged up into a file for transmission.
– This combines a JPEG compressed image (the ‘data’) with ‘meta-data’ (data about the image) in TIFF format (Tagged Image File Format)
– Exif defines a set of tags and coded values for this meta-data
Exif
– Specification is 154 pages
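The size arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it assumes 3 bytes per pixel (one each for red, green and blue) and reads "342kb" as 342,000 bytes, which the slide does not state precisely.

```python
# Raw (uncompressed) size of the lecture image versus its JPEG file size.
width, height = 1520, 2032
bytes_per_pixel = 3                 # assumption: 8 bits each for R, G, B

raw_bytes = width * height * bytes_per_pixel
file_bytes = 342 * 1000             # assumption: "342kb" read as 342,000 bytes

ratio = raw_bytes / file_bytes
print(f"raw size: {raw_bytes:,} bytes")
print(f"compression ratio: about {ratio:.0f}:1")
```

Reading "kb" as 1024 bytes instead changes the ratio only slightly.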
Meta data
“Data about Data”
– The image is the data
– The make of the camera is metadata
The dangers
– Thumbnails not updated with the main image
– Date/time, location information
Understanding the Physical Layer
Description of the implementation
– Standard manual
– Informal explanation
Hex viewer or editor
– XVI32 (free)
– 010 editor (30 days free, $49.95 US license)
Issues
– How numbers are stored
– How characters are stored
– How strings are stored
– How data is identified
– How data is grouped
– ..
HEX dump of the file
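A hex viewer shows each row of the file as an offset, the bytes in hexadecimal, and the same bytes as characters. The function below is a minimal sketch of that layout, not a reproduction of XVI32's display; the name `hex_dump` and the sample bytes are made up for illustration.

```python
def hex_dump(data, width=16):
    """Return a hex-viewer style listing: offset, hex bytes, printable text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Hypothetical sample bytes; for a real file use open(path, "rb").read().
sample = b"\xff\xd8Exif\x00\x00II*\x00"
print(hex_dump(sample))
```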
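Hexadecimal
Hex – 6 in Greek – Hexagon
Decem – 10 in Latin – Decimal
Hexadecimal
– Base 16
– 4 bits per digit
– Digits 0 – F (0 – 15)
Binary Decimal Hex
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 5
0110 6 6
0111 7 7
1000 8 8
1001 9 9
1010 10 A
1011 11 B
1100 12 C
1101 13 D
1110 14 E
1111 15 F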
Hexadecimal - Decimal
Nibble
– 4 bits
– 1 hex character
Byte
– 8 bits
– 2 nibbles
– 2 hex characters
D8
– 13 x 16 + 8 = ??
3C FA
– 3 x 16^3 + 12 x 16^2 + 15 x 16 + 10
– ((3 x 16 + 12) x 16 + 15) x 16 + 10 = ??
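The nested form of the calculation (multiply the running total by 16, then add the next digit) translates directly into a loop. The sketch below uses a hypothetical helper name, `hex_to_int`; Python's built-in `int(text, 16)` does the same job and can be used to check your answers to the two questions on the slide.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_int(text):
    """Convert a hex string such as '3C FA' to an integer, digit by digit."""
    value = 0
    for ch in text.replace(" ", "").upper():
        # Shift the running total one hex place left, then add this digit.
        value = value * 16 + HEX_DIGITS.index(ch)
    return value


for example in ("D8", "3C FA"):
    print(example, "=", hex_to_int(example))
```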
All JPEGs start with these bytes
– Marker: Start of Block
– Length of Block
– ‘Exif’ header
– Intel byte order
– Tag No: 42
– Offset to IFD (Image File Directory): 8 bytes
– Start of Data
– Start of IFD
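The annotated fields can also be read programmatically. The sketch below assumes the layout of the lecture example: the Exif block (marker FF E1) follows the start-of-image marker FF D8 directly, so the data starts at position 0C. Real files may carry other blocks first. The function name `parse_exif_header`, the sample bytes and the block length in them are made up for illustration.

```python
import struct


def parse_exif_header(data):
    """Read the fixed fields at the start of a JPEG/Exif file."""
    if data[0:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start marker FF D8")
    if data[2:4] != b"\xff\xe1":
        raise ValueError("assumed layout not met: first block is not FF E1")
    # JPEG block lengths are always stored big-endian.
    (block_length,) = struct.unpack(">H", data[4:6])
    if data[6:12] != b"Exif\x00\x00":
        raise ValueError("block has no 'Exif' header")
    data_start = 12                      # 0C: where offsets are measured from
    byte_order = data[12:14]
    prefix = {b"II": "<", b"MM": ">"}.get(byte_order)
    if prefix is None:
        raise ValueError("unknown byte order code")
    tag_42, ifd_offset = struct.unpack(prefix + "HI", data[14:20])
    if tag_42 != 42:
        raise ValueError("expected the number 42 after the byte order code")
    return {
        "block_length": block_length,
        "byte_order": byte_order.decode("ascii"),
        "ifd_offset": ifd_offset,
        "ifd_position": data_start + ifd_offset,
    }


# Hypothetical header bytes; the block length 0010 is invented.
sample = bytes.fromhex("FFD8 FFE1 0010 457869660000 4949 2A00 08000000")
print(parse_exif_header(sample))
```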
Big-endian / Little-endian
Big-endian
– Bytes in the order from most significant to least significant
– 3C FA = 3C x 256 + FA = 60 x 256 + 250 = 15610
– Motorola
– coded MM in Exif
Little-endian
– Bytes in the order from least significant to most significant
– 2A 00 = 00 x 256 + 2A = 42
– Intel processors
– coded II in Exif
Endianness
– Affects addresses and dates
– UK addresses are little-endian, Japanese big-endian
– addresses little-endian, file paths big-endian
– IP addresses ?
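The same two bytes give different numbers depending on the byte order used to read them. A short sketch with Python's built-in `int.from_bytes` and the standard `struct` module, using the byte pairs from this slide:

```python
import struct

pair = bytes.fromhex("3CFA")
big = int.from_bytes(pair, "big")        # 3C is the most significant byte
little = int.from_bytes(pair, "little")  # FA is the most significant byte
print(big, little)                       # 15610 64060

# The number 42 as it appears after each Exif byte order code.
tag_ii = struct.unpack("<H", bytes.fromhex("2A00"))[0]  # II, little-endian
tag_mm = struct.unpack(">H", bytes.fromhex("002A"))[0]  # MM, big-endian
print(tag_ii, tag_mm)                    # 42 42
```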
First entry
– No of directory entries (11)
– Tag 010F ‘Make’
– Data Type (string)
– String Length (24)
– Relative Offset = 92
– Absolute Position = Offset + start (0C)
‘Make’ Value (24 bytes)
– CAMERA
– spaces
– Null (end of C string)
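A directory entry like this one can be unpacked in a single call. The sketch below relies on the TIFF entry layout (2-byte tag, 2-byte type, 4-byte count, 4-byte value or offset) and rebuilds the ‘Make’ entry from the values annotated on the slide, assuming Intel byte order; the entry bytes are reconstructed for illustration, not copied from the lecture file. The names `decode_entry`, `TYPE_NAMES` and `DATA_START` are made up, and only the first few type codes are listed.

```python
import struct

TYPE_NAMES = {1: "BYTE", 2: "ASCII", 3: "SHORT", 4: "LONG", 5: "RATIONAL"}
DATA_START = 0x0C   # assumption: start of data as in the lecture example


def decode_entry(entry, prefix="<"):
    """Decode one 12-byte directory entry ('<' = Intel, '>' = Motorola)."""
    tag, type_code, count, value_or_offset = struct.unpack(prefix + "HHII", entry)
    return {
        "tag": f"{tag:04X}",
        "type": TYPE_NAMES.get(type_code, f"type {type_code}"),
        "count": count,
        "value_or_offset": value_or_offset,
    }


# Reconstructed entry: tag 010F, type 2 (string), length 24, offset 92.
make_entry = bytes.fromhex("0F01 0200 18000000 5C000000")
decoded = decode_entry(make_entry)
print(decoded)

# A 24-byte string does not fit in the 4-byte field, so the field holds an offset.
absolute_position = DATA_START + decoded["value_or_offset"]
print(f"value starts at {absolute_position} (hex {absolute_position:02X})")
```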
Exercise
– Each directory entry is 12 bytes long
– Entry type is coded – see table
– Tags are coded
– Decode 3 entries, distributed around the class
Extract from the EXIF standard
Logical structure
Conceptual Model
Workshop
Flickr Review
– Review approach to research – see the blog for ideas
– Get into groups
Physical layer work
– The XVI32 editor can be downloaded from the author’s site or run from the J: drive in J:/xvi
– Find a JPEG and identify some items of metadata. The lecture example is linked here
– Find an MP3 file – it probably contains ID3 metadata – see the ID3 site for documentation – and identify some metadata.
– Why don’t ID3 and Exif use the same logical structure using directories?