Analysing Image Files Michael Jones
Overview
– Images and images
– Binary, octal, hexadecimal
– File headers and footers
– Example (image) files
– Looking for more information
Review: Locard’s (Exchange) Principle
– Dr Edmond Locard
– Quote (Paul Kirk):
  – Wherever he steps, whatever he touches, whatever he leaves, even unconsciously, will serve as a silent witness against him
Images and Images
– Common usage: ‘image’ = ‘picture’
– In digital forensics, an ‘image’ is a bit-by-bit copy of the data on a digital device
  – Note: it is not an exact copy of the device itself
  – The physical structure of the device is not replicated
Phases of a Digital Investigation
– Secure the scene *
– Capture the evidence *
  – E.g., computers, devices
– Transfer to a secure store
– Create and verify images (bit-by-bit copies)
– Analyse the copies
– Produce reports
* usually conducted by scenes of crime officers
Verifying Images
– After a copy has been made, it is important that the copy be verified against the original
– Technique: ‘hashing’
– Method:
  – An algorithm is applied to the original and then to the copy
  – The output from the algorithm is a ‘checksum’ or a ‘hash’
  – If the hashes match, then the copy is a true copy of the original
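The verification method above can be sketched in a few lines of Python. This is only an illustration, not a forensic tool: the function names `file_hash` and `verify_copy` are hypothetical, SHA-256 is an assumed default, and the file is read in chunks on the assumption that a device image will not fit in memory.

```python
import hashlib

def file_hash(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def verify_copy(original_path, copy_path, algorithm="sha256"):
    """True if the original and the copy produce the same hash."""
    return file_hash(original_path, algorithm) == file_hash(copy_path, algorithm)
```

A single changed byte anywhere in the copy should make `verify_copy` return False.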
Issues with Hashing Algorithms
– A hashing algorithm is no good if:
  – The output (hash) can be predicted
  – Why?
– Common hashing algorithms:
  – MD4, MD5 (Message Digest)
    – Are considered insecure against deliberately constructed collisions, though accidental matches remain extremely unlikely
  – SHA-1, SHA-256, etc. (Secure Hash Algorithm)
Security of Hashes
– Example: MD5
  – Output is a 32-character hex string
  – The chance of 2 given sources resulting in the same hash: 1 in 16^32 (= 2^128), which is around 1 in 3.4 × 10^38
    – An enormous number, though still far smaller than the roughly 10^80 atoms estimated to be in the observable universe
  – But… the output can be predicted to an extent
  – With a huge amount of computing power and some time
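The size of the MD5 output space can be checked directly: 32 hex characters with 16 possible symbols each gives 16^32 combinations. The sketch below is just that arithmetic; the variable names are illustrative.

```python
import hashlib

# An MD5 digest written in hex: 32 characters, each one of 16 symbols
digest = hashlib.md5(b"example").hexdigest()
digest_length = len(digest)

# Number of distinct possible digests: 16^32, the same as 2^128
possible_hashes = 16 ** digest_length

print(digest_length)
print(f"{possible_hashes:.2e}")
```

This counts possible outputs only. It says nothing about how hard it is to construct two inputs with the same hash on purpose, which is the weakness the slide hints at.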
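Solution: 2 hashes
– If 2 algorithms are applied to the original and the copy, then undetected manipulation becomes practically infeasible
  – An altered copy would have to match the original under both algorithms at once
– Note:
  – The hashes from the 2 algorithms will not match each other; compare each algorithm's hash of the original only with the same algorithm's hash of the copy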
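A minimal sketch of the two-hash idea, assuming MD5 and SHA-256 as the pair (the slides do not say which two to use). The names `dual_hash` and `copies_match` are hypothetical. Both digests are computed in a single pass over the file.

```python
import hashlib

def dual_hash(path, algorithms=("md5", "sha256"), chunk_size=1 << 20):
    """Return {algorithm name: hex digest} for one file, reading it once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            # Feed the same chunk to every algorithm
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}

def copies_match(original_path, copy_path):
    """True only if the files agree under every algorithm."""
    return dual_hash(original_path) == dual_hash(copy_path)
```

The dictionary comparison pairs MD5 with MD5 and SHA-256 with SHA-256; the two digests for one file are never compared with each other.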
Files and File Systems
– A file contains data
  – Office documents, image files, etc.
– Files are organised in a file system
  – Generally hierarchical in folders/directories
– How they are organised varies
– Example file systems
  – Windows: FAT, FAT32, NTFS
  – OS X: HFS, HFS+
  – Linux: ext2, ext3, ext4
Why more than one File System?
– Files are changed
  – May not be able to be restored to the same place
– Files are created and deleted
– Directories need to expand and contract
– Question: how to organise the file system:
  – To handle file creation, change, deletion
  – To enable fast access to files
  – To minimise the need for reorganisation
Binary, octal, hexadecimal
– Binary: base 2 – symbols 0 and 1
– Octal: base 8 – symbols 0 to 7
– Hexadecimal: base 16 – symbols 0 to 9 and A to F
– As 8 and 16 are powers of 2, binary can be directly converted to octal or hexadecimal and vice versa
  – 01101101 (8 bits of binary) > 155 (octal) > 6D (hexadecimal)
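The direct conversion works by grouping bits: 3 bits per octal digit, 4 bits per hexadecimal digit. The sketch below shows this for the bit pattern behind the 6D example. The helper names `group` and `describe` are illustrative only.

```python
def group(bits, size):
    """Split a bit string into groups of `size`, padding with zeros on the left."""
    padded_length = -(-len(bits) // size) * size  # round up to a multiple of size
    padded = bits.zfill(padded_length)
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def describe(bits):
    """Return the octal and hexadecimal spellings of a binary string."""
    value = int(bits, 2)
    return {"octal": format(value, "o"), "hex": format(value, "X")}

bits = "01101101"
print(group(bits, 3))   # each group of 3 bits is one octal digit
print(group(bits, 4))   # each group of 4 bits is one hex digit
print(describe(bits))
```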
Why is Hexadecimal important?
– Viewing binary is painful
  – Too many digits (bits)
  – Only 2 symbols
– Most computers use bytes (8 bits)
– By grouping these as 2 x 4 bits, each byte can be represented by 2 hexadecimal digits
  – Note: can use lowercase: 6d
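A short sketch of viewing bytes as pairs of hexadecimal digits, roughly what a hex editor displays. The function names `to_hex` and `from_hex` are hypothetical; `bytes.fromhex` is the standard-library call that accepts either case.

```python
def to_hex(data, uppercase=True):
    """Show each byte as exactly 2 hex digits, separated by spaces."""
    template = "{:02X}" if uppercase else "{:02x}"
    return " ".join(template.format(byte) for byte in data)

def from_hex(text):
    """Turn a hex string such as '6D FF D8' back into bytes (case-insensitive)."""
    return bytes.fromhex(text)

sample = bytes([0x6D, 0xFF, 0xD8])
print(to_hex(sample))
print(to_hex(sample, uppercase=False))
```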
File Headers and Footers
– Almost all file types (formats) have a defined header (plain text is the main exception)
  – Most have a defined footer
– So the extension is often unimportant
  – Except for Windows file associations
– The header contains
  – File type identifier
  – Metadata
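Because the header carries the file type identifier, a file can be identified from its first bytes whatever its extension says. The sketch below uses only the JPEG, GIF and PNG identifiers given elsewhere in these slides; the table `SIGNATURES` and the function `identify` are hypothetical names, and a real tool would know many more formats.

```python
# File type identifiers from these slides (leading bytes of the file)
SIGNATURES = {
    b"\xFF\xD8": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"\x89PNG": "PNG",
}

def identify(data):
    """Return the format name whose identifier starts `data`, or None."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None

def identify_file(path):
    """Read just the first few bytes of a file and identify it."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

A file named `holiday.txt` that begins with FF D8 would still be reported as JPEG.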
Why not have a Footer?
– The footer defines the end of the file
– If the header contains information about the length of a file, then no need for a footer
– Why does this matter?
  – No footer makes it more difficult to identify files
  – You need to decode the header
File Carving
– Process of extracting files from a larger file
– Why?
  – Suppose files have been deleted, and an image taken of the file system
  – Many files will be contiguous
  – Problems if they are not
– Process:
  – Find the first header
  – Find the footer, or the end of file or the next header
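The carving process above can be sketched as a simple search over the raw bytes of an image. This is a minimal illustration under strong assumptions: files are contiguous, the first footer found after a header is taken as the real one, and the whole image fits in memory. The function name `carve` is hypothetical.

```python
def carve(blob, header, footer):
    """Return a list of byte strings carved from `blob`.

    Each carved file starts at a header and ends at the next footer;
    if no footer follows, it ends at the next header or at the end of the data.
    """
    files = []
    position = 0
    while (start := blob.find(header, position)) != -1:
        body_start = start + len(header)
        end = blob.find(footer, body_start)
        if end != -1:
            end += len(footer)               # include the footer itself
        else:
            next_header = blob.find(header, body_start)
            end = next_header if next_header != -1 else len(blob)
        files.append(blob[start:end])
        position = end                       # continue searching after this file
    return files
```

For JPEG, the call would be `carve(blob, b"\xFF\xD8", b"\xFF\xD9")`. Fragmented files, or files that contain the footer bytes inside their metadata, would be carved incorrectly by this sketch.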
Example file format: JPEG
– Joint Photographic Experts Group
– Compression of digital images
  – Header: FF D8
  – Footer: FF D9
Consider this
– The rest of the file contains:
  – Metadata
  – Colour table
  – Compressed data
– What is the chance of FF D9 in 2 successive bytes?
  – 1 in 256 x 256 = 1 in 65,536 (if the bytes were random)
– Compression algorithm must ensure this sequence does not occur
  – Which it does, simply: any FF byte in the compressed data is followed by a 00 byte
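The simple mechanism JPEG uses here is byte stuffing: whenever the compressed data contains an FF byte, a 00 byte is written after it, so FF D9 can never appear inside the compressed stream. The sketch below shows only that idea on raw bytes; `stuff` and `unstuff` are hypothetical names and this is not a JPEG encoder.

```python
FOOTER = b"\xFF\xD9"

def stuff(data):
    """Follow every FF byte with 00 so that no marker can appear in the data."""
    return data.replace(b"\xFF", b"\xFF\x00")

def unstuff(data):
    """Reverse the stuffing: drop the 00 that follows each FF."""
    return data.replace(b"\xFF\x00", b"\xFF")

raw = b"\x12\xFF\xD9\x34"      # compressed data that happens to contain FF D9
safe = stuff(raw)
print(FOOTER in raw, FOOTER in safe)
```

This protects the compressed data only. Metadata segments, for example an embedded thumbnail, can still contain FF D9, which is one reason naive carving sometimes cuts a JPEG short.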
Hiding Data in a JPEG
– Problem: compression
  – If data is added to image before compression
    – Data might be corrupted when compression occurs
  – If data is added afterwards, how can we control side-effects?
– Possible solution: hiding data in metadata
  – EXIF information
  – Can use existing fields, or additional ones
– Possible solution: adding data after the footer
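The last option, adding data after the footer, is easy to demonstrate because most viewers stop reading at FF D9. The sketch below is a hedged illustration: the function names are hypothetical, and `data_after_footer` assumes the first FF D9 in the file is the real footer, which may not hold if the metadata contains an embedded thumbnail.

```python
FOOTER = b"\xFF\xD9"

def append_after_footer(jpeg_bytes, payload):
    """Return the JPEG bytes with `payload` appended after the footer."""
    if not jpeg_bytes.endswith(FOOTER):
        raise ValueError("data does not end with the JPEG footer FF D9")
    return jpeg_bytes + payload

def data_after_footer(data):
    """Return whatever follows the first FF D9 (assumed to be the real footer)."""
    end = data.find(FOOTER)
    if end == -1:
        return b""
    return data[end + len(FOOTER):]
```

From the investigator's side, a non-empty result from `data_after_footer` on a file that should be a plain JPEG is worth a closer look.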
Other Image File Types
– GIF (Graphics Interchange Format)
  – Created by CompuServe
  – Lossless format
  – Headers: the ASCII text GIF87a or GIF89a (47 49 46 38 37 61 or 47 49 46 38 39 61 in hex)
  – Has a footer – but care needed
– PNG (Portable Network Graphics)
  – Open (lossless) format
  – Header: 89 (hex) followed by the ASCII text PNG (50 4E 47 in hex)
  – Has a footer
Computer Crime: Ransomware
Debunking Myths: Surveillance
Carving and Blurring
– Suppose someone has had a problem with his camera
  – And photographs are corrupted
– On examination, the footer of a JPEG has been corrupted
  – 2 files might appear as 1
– Boundary between files is blurred
  – Can we ‘un-blur’ the files and present this to the court?
Example
– The byte sequence is:
  – FF D9 … FF D7 FF D8 … FF D9
– What could be seen in a viewer?
– If we change the ‘D7’ byte (the second byte of the corrupted footer) to ‘D9’, then 2 files could be ‘created’
  – What is the legal status of doing this?
Issues with File Carving
– Non-contiguous files
  – How likely is this?
– Why is this a problem?
– What solutions could be possible?
  – Think about the standard sector size (512 bytes)
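One way to read the sector-size hint: files are stored starting on sector boundaries, so a header found at an offset that is not a multiple of 512 is more likely to be data inside another file than the start of a new one. The sketch below applies that filter. It is an interpretation of the hint, not a method stated in the slides, and `aligned_header_offsets` is a hypothetical name.

```python
SECTOR_SIZE = 512  # standard sector size mentioned on the slide

def aligned_header_offsets(blob, header, sector_size=SECTOR_SIZE):
    """Return offsets where `header` appears exactly at the start of a sector."""
    return [
        offset
        for offset in range(0, len(blob), sector_size)
        if blob.startswith(header, offset)
    ]
```

The same boundaries also suggest where a fragmented file may break off and resume, since fragments occupy whole sectors (in practice, whole clusters of sectors).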
Summary
– File carving is at the heart of digital forensics
– It allows a physical analysis of the data
  – Compared with a logical view of the data
– The process involves finding headers and footers
  – Extracting (carving) files from a digital ‘image’
  – Image here means bit-by-bit copy of the data (e.g., disk or SD card)