1
The Anatomy of a Zip File
How to Carve/Rebuild Zip Files by Hand By Jonathan Glass
2
Reason for this Presentation
Recently, I was charged with creating a forensic challenge at work that focused on data loss. The scenario included several instances of data exfiltration, but one seemed more challenging than most. A portion of the challenge included a zip file that was uploaded directly from a mapped network drive to Google Docs. Participants were given only the memory dump and a dd image of the workstation hard drive to work with. The file was never logically written to the C:\ drive. No other information was provided. So I will attempt to share my disjointed process of recovering the contents of the zip file without using any prior knowledge of the file.
3
First strings first. I needed a file name.
strings -td mem.dmp | grep -i docs\.google\.com
({"id":"0","rt":"3","rd":[{"version":1,"type":"change","payload":"{\"action s\":[{\"actionCategory\":\"open\",\"minimumRole\":20,\"url\":\" .google.com/file/d/0B5oGkhb7v8qKSmh3S25MMDQxTHc/edit?usp\\u003ddrive_web\"} ],\"attributes\":{\"blob_versionable\":true,\"collaboratorsCanInvite\":true ,\"copyable\":true,\"downloadable\":true,\"mine\":true,\"shareable\":true,\ "subscribed\":true},\"cosmoType\":\"DoclistBlob\",\"fileExtension\":\"zip\" ,\"fileSize\":153847,\"fileSizeFormatted\":\"153,847 bytes\",\"filters\":[\"items\"],\"id\":\"0B5oGkhb7v8qKSmh3S25MMDQxTHc\",\"l 518\",\"me\":true,\"name\":\"Hacker Jacks\"},\"lastEditedText\":\"2:52 am\",\"lastEditedUtc\": ,\"lastModByMeText\":\"2:52 am\",\"lastModByMeUtc\": ,\"mav\":0,\"mimeType\":\"application/ zip\",\"myRole\":40,\"name\":\"DocumentsToRecover.zip\",\"nameKey\":[47,69, 45,81,65,49,67,79,77,79,69,75,49,45,69,83,49,75,9,91,57,71,1,26,1,26,0],\"o e\":true,\"name\":\"Hacker Jack"…
Among other interesting information, I found:
File Name: DocumentsToRecover.zip
File Size: 153,847 bytes
4
Looking for evidence of the file in memory…
strings -td mem.dmp | grep -i documentstorecover
DocumentsToRecover.zip
DocumentsToRecover/ThirdFileToRecover.txt
DocumentsToRecover/FirstFileToRecover.txt
DocumentsToRecover/ThirdFileToRecover.txt
DocumentsToRecover/SecondFileToRecover.txt
DocumentsToRecover/Seconq
tion/DocumentsToRecover/y
DocumentsToRecover
DocumentsToRecover
H:\Documentation\DocumentsToRecover.zip
DocumentsToRecover
DocumentsToRecover
DocumentsToRecover.zip.lnk
DocumentsToRecover.zip.lnk
DocumentsToRecover/EighthFileToRecover.txtPK
DocumentsToRecover/FifthFileToRecover.txtPK
DocumentsToRecover/FirstFileToRecover.txtPK
DocumentsToRecover/FourthFileToRecover.txtPK
DocumentsToRecover/SecondFileToRecover.txtPK
DocumentsToRecover/SeventhFileToRecover.txtPK
DocumentsToRecover/SixthFileToRecover.txtPK
DocumentsToRecover/ThirdFileToRecover.txt
Bingo! The file is still in memory. Now what?
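The same kind of search can be approximated in Python when the strings utility is not at hand. This is only a sketch, not the method used in the talk: find_offsets is a hypothetical helper, and the sample bytes are made up to stand in for mem.dmp. Like strings -td piped into grep -i, it reports the decimal offset of each case-insensitive match.

```python
import re

def find_offsets(data: bytes, needle: str):
    """Return the decimal offset of every case-insensitive match of needle."""
    pattern = re.compile(re.escape(needle.encode("ascii")), re.IGNORECASE)
    return [m.start() for m in pattern.finditer(data)]

# Made-up bytes standing in for a memory dump (not real evidence).
sample = (b"\x00\x00DocumentsToRecover.zip\x00junk\x00"
          b"documentstorecover/x.txtPK\x01\x02")
offsets = find_offsets(sample, "documentstorecover")
print(offsets)
```

For a real dump it would be more practical to memory-map the file than to read it whole; that detail is left out here to keep the sketch short.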
5
Let's take a step back and see what a zip file looks like…
[Hex dump of a zip archive holding one file, File.txt: a local file header starting with "PK", the compressed file data, a central directory entry that repeats the file name, and the end of central directory record.]
This is a single file inside of a zip.
6
Simple Summary of Zip File Parts
[Hex dump of the same single-file archive, with its three parts highlighted.]
Local File Header: Each file in the zip gets a local file header
File Data: The compressed/encrypted contents of the file
Central Directory: Summarizes the local file headers and contains additional info
7
[Hex dump of a zip archive holding File1.txt ("FileOneContents!!!"), File2.txt ("FileTwoContents!!!") and File3.txt: three local file headers, each followed by its file data, then three central directory entries and the end of central directory record.]
Zip File with 3 Files
8
Local File Header
[Hex dump of the local file header for File.txt.]
9
Reading the Local File Header
[Hex dump of the local file header for File.txt, decoded field by field:]
Signature: 0x04034b50 (read as a little-endian number)
Version: 0x14 = 20 decimal; 20 / 10 = version 2.0 (major 2, minor 0)
Flags: None
Compression method: 08: deflated
File modification time: 23:47:04 (SEE NEXT SLIDE)
File modification date: 11/2/2014 (SEE NEXT SLIDE)
CRC-32 checksum: 0x72B9F checksum
Compressed size: 0x16 = 22 bytes
Uncompressed size: 0x17 = 23 bytes
File name length: 8 chars (F i l e . t x t)
Extra field length: N/A
File name: "File.txt"
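The field-by-field reading above maps directly onto a fixed 30-byte structure followed by the file name. Here is a minimal sketch of that decoding with Python's struct module. parse_local_header is a hypothetical helper, and the sample header is rebuilt from the values on the slide, with the CRC set to 0 as a placeholder because the slide's value is incomplete.

```python
import struct

# Fixed 30-byte part of a local file header: little-endian, no padding.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")

def parse_local_header(buf, offset=0):
    """Decode the local file header that starts at offset in buf."""
    (sig, version, flags, method, mtime, mdate, crc,
     csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(buf, offset)
    if sig != b"PK\x03\x04":
        raise ValueError("no local file header signature at this offset")
    name_start = offset + LOCAL_HEADER.size
    return {
        "version_needed": version / 10,   # 0x14 = 20 -> 2.0
        "flags": flags,
        "method": method,                 # 8 = deflated, 0 = stored
        "mtime": mtime,
        "mdate": mdate,
        "crc32": crc,
        "compressed_size": csize,
        "uncompressed_size": usize,
        "name": buf[name_start:name_start + name_len].decode("cp437"),
        # File data begins right after the name and the extra field.
        "data_offset": name_start + name_len + extra_len,
    }

# Header rebuilt from the slide's values; the CRC (0) is a placeholder.
sample = LOCAL_HEADER.pack(b"PK\x03\x04", 0x14, 0, 8, 0xBDE2, 0x4562,
                           0, 22, 23, 8, 0) + b"File.txt"
header = parse_local_header(sample)
print(header["name"], header["compressed_size"], header["data_offset"])
```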
10
MSDOS Timestamp in 2 minutes
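Time bytes: E2 BD = BD E2 after the little-endian swap
BD E2 = 10111 (23) 101111 (47) 00010 (2) = 23:47:04 (the seconds field counts 2-second steps, so 2 means 04 seconds)
Date bytes: 62 45 = 45 62 after the little-endian swap
45 62 = 0100010 (34) 1011 (11) 00010 (2) = 11/2/2014
(34) represents the years since 1980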
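The bit slicing can be checked with a few shifts and masks. This is a sketch under two assumptions. First, decode_dos_datetime is a hypothetical helper name. Second, the date bytes 62 45 are inferred from the hex dump of the header ("bE" in its ASCII column) and agree with the 34 / 11 / 2 split shown here. Note that MS-DOS stores seconds in 2-second units, so a stored value of 2 decodes to 4 seconds.

```python
import struct
from datetime import datetime

def decode_dos_datetime(time_bytes: bytes, date_bytes: bytes) -> datetime:
    """Decode the two little-endian 16-bit MS-DOS time and date words."""
    (t,) = struct.unpack("<H", time_bytes)
    (d,) = struct.unpack("<H", date_bytes)
    hour = t >> 11                 # top 5 bits
    minute = (t >> 5) & 0x3F       # middle 6 bits
    second = (t & 0x1F) * 2        # low 5 bits, in 2-second units
    year = 1980 + (d >> 9)         # top 7 bits, years since 1980
    month = (d >> 5) & 0x0F        # middle 4 bits
    day = d & 0x1F                 # low 5 bits
    return datetime(year, month, day, hour, minute, second)

stamp = decode_dos_datetime(bytes.fromhex("E2BD"), bytes.fromhex("6245"))
print(stamp.isoformat())
```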
11
File headers & file data get stacked upon each other until we get to the Central Directory. (The standard supports additional fields depending on the options used to create the archive. I'm just hitting the highlights.)
Local File 1 Header
File 1 Data
Local File 2 Header
File 2 Data
Local File 3 Header
File 3 Data
. . .
Local File N Header
File N Data
Central Directory
12
Central Directory
The central directory contains more metadata about the files in the archive, along with encryption information and information about Zip64 (64-bit zip) archives. It also contains information about archives that span multiple files.
It sits at the end of the file! This can be problematic when exfiltrating large archives over sketchy connections (Tor). This is why attackers often use the RAR format, which includes the metadata at the beginning of the file. This allows some content to be recovered even if only part of the archive is received.
13
Central Directory
14
Central Directory
[Hex dump of the central directory of the three-file archive, annotated with these regions:]
File header 1 (File1.txt)
File header 2 (File2.txt)
File header 3 (File3.txt)
End of Central Directory Record
15
Central Directory File Header Format
Just as before, not all fields are required.
16
Central Dir Vs Local Headers
Fields in the Central Directory Header not present in Local Headers:
File comm. len: The length of the file comment
Disk # start: The number of the disk on which this file exists
Internal attr: Internal file attributes
External attr: External file attributes (operating system specific)
Offset of local header: Relative offset of the local header. This is the offset of where to find the corresponding local file header from the start of the first disk.
Extra field
File comment
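To see those extra fields in practice, the sketch below walks a run of central directory file headers. The 46-byte fixed layout is standard. read_central_directory is a hypothetical helper, and the demo archive is generated in memory with Python's zipfile module only so the result can be compared against a known-good parser.

```python
import io
import struct
import zipfile

# Fixed 46-byte part of a central directory file header (little-endian).
CD_HEADER = struct.Struct("<4s6H3I5H2I")

def read_central_directory(buf, offset):
    """Parse consecutive central directory headers starting at offset."""
    entries = []
    while buf[offset:offset + 4] == b"PK\x01\x02":
        fields = CD_HEADER.unpack_from(buf, offset)
        name_len, extra_len, comment_len = fields[10:13]
        name_start = offset + CD_HEADER.size
        entries.append({
            "name": buf[name_start:name_start + name_len].decode("cp437"),
            "disk_start": fields[13],
            "internal_attr": fields[14],
            "external_attr": fields[15],
            "local_header_offset": fields[16],
        })
        # The next header follows the name, extra field and file comment.
        offset = name_start + name_len + extra_len + comment_len
    return entries

# Demo archive built in memory (stored, so no compressed bytes to worry about).
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("File1.txt", "FileOneContents!!!\r\n")
    zf.writestr("File2.txt", "FileTwoContents!!!\r\n")
buf = archive.getvalue()

entries = read_central_directory(buf, buf.find(b"PK\x01\x02"))
with zipfile.ZipFile(io.BytesIO(buf)) as zf:
    expected = [(i.filename, i.header_offset) for i in zf.infolist()]
print([(e["name"], e["local_header_offset"]) for e in entries])
```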
17
End of central directory record
Signature: The signature of the end of central directory record. This is always '\x50\x4b\x05\x06'.
Disk Number: The number of this disk (containing the end of central directory record)
Disk # w/cd: Number of the disk on which the central directory starts
Disk entries: The number of central directory entries on this disk
Total entries: Total number of entries in the central directory
Central directory size: Size of the central directory in bytes
Offset of cd wrt to starting disk: Offset of the start of the central directory on the disk on which the central directory starts
Comment len: The length of the following comment field
ZIP file comment: Optional comment for the Zip file
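A reader normally starts from this record, since it points at everything else. The sketch below searches backwards for the signature and unpacks the 22 fixed bytes. find_eocd is a hypothetical helper and the archive is a throwaway one built in memory. A real tool would also have to guard against the signature appearing inside the comment.

```python
import io
import struct
import zipfile

# Fixed 22-byte end of central directory record (little-endian).
EOCD = struct.Struct("<4sHHHHIIH")

def find_eocd(buf):
    """Locate the last end of central directory record in buf and decode it."""
    pos = buf.rfind(b"PK\x05\x06")
    if pos < 0 or pos + EOCD.size > len(buf):
        return None
    (_, disk, cd_disk, disk_entries, total_entries,
     cd_size, cd_offset, comment_len) = EOCD.unpack_from(buf, pos)
    return {
        "position": pos,
        "disk_number": disk,
        "disk_with_cd": cd_disk,
        "disk_entries": disk_entries,
        "total_entries": total_entries,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
    }

# Throwaway three-file archive used only to exercise the helper.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    for n in (1, 2, 3):
        zf.writestr(f"File{n}.txt", f"File {n} contents!!!\r\n")
buf = archive.getvalue()
eocd = find_eocd(buf)
print(eocd)
```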
18
Now back to our case…
strings -td mem.dmp | grep -i documentstorecover
DocumentsToRecover.zip
DocumentsToRecover/ThirdFileToRecover.txt
DocumentsToRecover/FirstFileToRecover.txt
DocumentsToRecover/ThirdFileToRecover.txt
DocumentsToRecover/SecondFileToRecover.txt
DocumentsToRecover/Seconq
tion/DocumentsToRecover/y
DocumentsToRecover
DocumentsToRecover
H:\Documentation\DocumentsToRecover.zip
DocumentsToRecover
DocumentsToRecover
DocumentsToRecover.zip.lnk
DocumentsToRecover.zip.lnk
DocumentsToRecover/EighthFileToRecover.txtPK
DocumentsToRecover/FifthFileToRecover.txtPK
DocumentsToRecover/FirstFileToRecover.txtPK
DocumentsToRecover/FourthFileToRecover.txtPK
DocumentsToRecover/SecondFileToRecover.txtPK
DocumentsToRecover/SeventhFileToRecover.txtPK
DocumentsToRecover/SixthFileToRecover.txtPK
DocumentsToRecover/ThirdFileToRecover.txt
File names all in a row? Reminds me of Local File Headers.
File names followed by "PK" all in a row? Looks like Central Directory File Headers to me.
20
My Game Plan
As far as I can tell, I am looking at a zip file containing 8 files inside one directory:
DocumentsToRecover
DocumentsToRecover/EighthFileToRecover.txt
DocumentsToRecover/FifthFileToRecover.txt
DocumentsToRecover/FirstFileToRecover.txt
DocumentsToRecover/FourthFileToRecover.txt
DocumentsToRecover/SecondFileToRecover.txt
DocumentsToRecover/SeventhFileToRecover.txt
DocumentsToRecover/SixthFileToRecover.txt
DocumentsToRecover/ThirdFileToRecover.txt
Given the space between the Local File Headers and the Central Directory Headers, I am guessing this zip file is not in one contiguous chunk. Grabbing the entire zip file seems improbable.
21
My Game Plan
Grab each file individually by creating 8 zip files.
Carve each file's compressed File Data
Create/Carve the Local File Header
Create/Carve the Central Directory
Unzip
Repeat
Sounds easy enough, right?
**I have since discovered better & faster methods of accomplishing this, but this way is the most educational. Let's do it the long way first.
22
Looking for Headers
strings -td /mnt/hgfs/DEMO/zip/mem.dmp | grep -i FirstFileToRecover.txt
DocumentsToRecover/FirstFileToRecover.txtPK
No local header.
strings -td /mnt/hgfs/DEMO/zip/mem.dmp | grep -i SecondFileToRecover.txt
DocumentsToRecover/SecondFileToRecover.txtPK
DocumentsToRecover/SecondFileToRecover.txt}SKn
BOTH HEADERS!
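Rather than judging a hit by what trails the file name, the bytes in front of it can be checked: in a local file header the name begins 30 bytes after the "PK\x03\x04" signature, and in a central directory header it begins 46 bytes after "PK\x01\x02". The sketch below applies that test. classify_name_hit is a hypothetical helper, and the demo archive is generated in memory rather than taken from the dump.

```python
import io
import re
import zipfile

def classify_name_hit(buf, name_offset):
    """Say which kind of zip header, if any, owns the file name at name_offset."""
    if name_offset >= 30 and buf[name_offset - 30:name_offset - 26] == b"PK\x03\x04":
        return "local"
    if name_offset >= 46 and buf[name_offset - 46:name_offset - 42] == b"PK\x01\x02":
        return "central"
    return None

# Small in-memory archive: the name appears once per header type.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("a.txt", "hello")
buf = archive.getvalue()

hits = [m.start() for m in re.finditer(re.escape(b"a.txt"), buf)]
kinds = [classify_name_hit(buf, h) for h in hits]
print(list(zip(hits, kinds)))
```

Subtracting 30 or 46 from a matching hit gives the offset of the header itself, which is the offset to seek to in the next steps.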
23
SecondFileToRecover.txt
Seek to the offset of the local header.
Copy the local header to a new file.
24
SecondFileToRecover.txt
Seek to the offset of the local header.
Copy the local header to a new file.
Look at the compressed size field: 0x021D = 541 bytes.
Seek 541 bytes from the end of the local header…
25
Start of another local header
Copy 541 bytes (the File Data) from the end of the local header and paste them into the new file.
Now we need to build the footer.
[Hex view labels: Compressed File Data; Start of another local header]
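The seek-and-copy steps can be expressed as a single slice once the header offset is known. This is a sketch under assumptions: carve_entry is a hypothetical helper, and it trusts the compressed size stored in the local header, which is zero when the archiver used a trailing data descriptor (general purpose flag bit 3). The check at the end mirrors the slide: the bytes right after the carved data should be the start of another local header.

```python
import io
import struct
import zipfile

def carve_entry(buf, header_offset):
    """Return the local file header plus file data starting at header_offset."""
    if buf[header_offset:header_offset + 4] != b"PK\x03\x04":
        raise ValueError("no local file header at this offset")
    # Compressed size lives at byte 18; name and extra lengths at bytes 26-29.
    (csize,) = struct.unpack_from("<I", buf, header_offset + 18)
    name_len, extra_len = struct.unpack_from("<HH", buf, header_offset + 26)
    end = header_offset + 30 + name_len + extra_len + csize
    return buf[header_offset:end]

# In-memory stand-in for the carved region of the memory dump.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("one.txt", "first file " * 50)
    zf.writestr("two.txt", "second file " * 50)
buf = archive.getvalue()

carved = carve_entry(buf, 0)
next_signature = buf[len(carved):len(carved) + 4]
print(len(carved), next_signature)
```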
26
Seek to the offset of the file name + "PK" hit we found earlier. The Central Directory Header starts 46 bytes before that file name. Copy the Central Directory Header to the new file.
27
Everything should be good in the CD header except for…
We need to change the local header offset to 0x00000000 because this is the first/only file in the archive. The local header starts at the beginning of the file. Now we need an End of central directory record to make a complete zip file.
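In a central directory file header, the relative offset of the local header is the 4-byte little-endian field at byte 42, the last fixed field before the file name. A minimal sketch of the patch follows. reset_local_offset is a hypothetical helper and the sample header is a zero-filled dummy, not the carved one.

```python
import struct

def reset_local_offset(cd_header: bytes, new_offset: int = 0) -> bytes:
    """Return a copy of a central directory header with its local offset replaced."""
    if cd_header[:4] != b"PK\x01\x02":
        raise ValueError("not a central directory file header")
    patched = bytearray(cd_header)
    struct.pack_into("<I", patched, 42, new_offset)  # bytes 42-45
    return bytes(patched)

# Dummy header: signature, 38 zeroed bytes, an old offset, then a file name.
dummy = b"PK\x01\x02" + bytes(38) + struct.pack("<I", 0x1234) + b"name.txt"
patched = reset_local_offset(dummy)
print(patched[42:46].hex())
```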
28
Let's Just Build One
[Hex dump of the hand-built end of central directory record.]
Signature: The signature of the end of central directory record. This is always '\x50\x4b\x05\x06'.
Disk Number: Needs to be 00 because this is the first/only disk of this archive.
Disk # w/cd: 00 as well, because the central directory starts on this same disk.
Disk entries: Needs to be set to 01 because there is only one central directory entry on this disk.
Total entries: The total number of entries in the central directory is 01 because we only have one file.
Central directory size: The central directory header is 88 (0x58) bytes long.
Offset of cd wrt to starting disk: Local Header (72 bytes) + File Data (541 bytes) = 613 (0x265), stored little-endian as 65 02 00 00.
Comment len: No comment needed, so 00.
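The record can also be generated instead of typed into a hex editor. The sketch below packs the values from this slide (one entry, an 88-byte central directory, starting at offset 613) with Python's struct module; the variable names are illustrative only.

```python
import struct

local_header_len = 72   # 30 fixed bytes + 42-byte file name
file_data_len = 541     # compressed size from the local header
cd_header_len = 88      # 46 fixed bytes + 42-byte file name

eocd = struct.pack(
    "<4sHHHHIIH",
    b"PK\x05\x06",                     # signature
    0,                                 # number of this disk
    0,                                 # disk where the central directory starts
    1,                                 # central directory entries on this disk
    1,                                 # total central directory entries
    cd_header_len,                     # central directory size (0x58)
    local_header_len + file_data_len,  # central directory offset (0x265)
    0,                                 # comment length
)
print(eocd.hex(" "))
```

The little-endian packing is what turns 0x265 into the byte sequence 65 02 00 00 and 0x58 into 58 00 00 00.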
29
Add End of central directory record to the end of the new file and save!
Also cross fingers
30
Extraction works!
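For comparison, here is the whole manual procedure (carve the local header and data, copy the central directory header, zero its local header offset, append a fresh end of central directory record) as one sketch. Everything in it is an assumption for illustration: rebuild_single_entry is a hypothetical helper, and a three-file archive generated in memory stands in for the fragments found in the memory dump. Python's zipfile module then plays the role of the unzip tool.

```python
import io
import struct
import zipfile

def rebuild_single_entry(buf, local_off, cd_off):
    """Build a standalone one-file zip from headers and data found in buf."""
    # Local header + file data; the sizes come from the local header itself.
    (csize,) = struct.unpack_from("<I", buf, local_off + 18)
    name_len, extra_len = struct.unpack_from("<HH", buf, local_off + 26)
    local = buf[local_off:local_off + 30 + name_len + extra_len + csize]

    # Central directory header: 46 fixed bytes plus name, extra and comment.
    name_len, extra_len, comment_len = struct.unpack_from("<HHH", buf, cd_off + 28)
    cd = bytearray(buf[cd_off:cd_off + 46 + name_len + extra_len + comment_len])
    struct.pack_into("<I", cd, 42, 0)  # the local header now sits at offset 0

    # End of central directory record: one entry, directory right after the data.
    eocd = struct.pack("<4sHHHHIIH", b"PK\x05\x06", 0, 0, 1, 1,
                       len(cd), len(local), 0)
    return local + bytes(cd) + eocd

expected = b"File 2 contents!!!\r\n" * 20
source = io.BytesIO()
with zipfile.ZipFile(source, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("File1.txt", b"File 1 contents!!!\r\n" * 20)
    zf.writestr("File2.txt", expected)
    zf.writestr("File3.txt", b"File 3 contents!!!\r\n" * 20)
buf = source.getvalue()

local_off = buf.find(b"File2.txt") - 30   # first hit: name in the local header
cd_off = buf.rfind(b"File2.txt") - 46     # last hit: name in the central directory
rebuilt = rebuild_single_entry(buf, local_off, cd_off)

with zipfile.ZipFile(io.BytesIO(rebuilt)) as zf:
    recovered = zf.read("File2.txt")      # also verifies the CRC-32
print(recovered == expected)
```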
31
Don't feel like Looking For/Creating a Central Directory? Don't!
All of the information you need to decompress the zip is present in the local file header. If you have the local header and the file data, you don't need the Central Directory!
Although not every ZIP decompressor will use local file headers when the index is unavailable, the tar and cpio front ends to libarchive (bsdtar and bsdcpio) can do streaming decompression when reading through a pipe.
It will generate errors, but it does work.
Great for memdumps and pcaps.
To install on SIFT: "apt-get install bsdtar"
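The same idea can be sketched without libarchive: walk the local file headers in order and inflate each data block directly. This is not how bsdtar is implemented, only an illustration. extract_without_cd is a hypothetical helper, it handles only stored and deflated entries, and it relies on the compressed size being present in the local header. Zip file data is a raw deflate stream, which is why zlib is called with a negative window size.

```python
import io
import struct
import zipfile
import zlib

def extract_without_cd(buf, offset=0):
    """Decompress consecutive local-header entries; no central directory needed."""
    files = {}
    while buf[offset:offset + 4] == b"PK\x03\x04":
        (method,) = struct.unpack_from("<H", buf, offset + 8)
        (csize,) = struct.unpack_from("<I", buf, offset + 18)
        name_len, extra_len = struct.unpack_from("<HH", buf, offset + 26)
        name = buf[offset + 30:offset + 30 + name_len].decode("cp437")
        start = offset + 30 + name_len + extra_len
        raw = buf[start:start + csize]
        if method == 8:
            files[name] = zlib.decompress(raw, -15)   # raw deflate, no zlib header
        elif method == 0:
            files[name] = raw                          # stored, not compressed
        else:
            raise ValueError(f"unsupported compression method {method}")
        offset = start + csize
    return files

contents = {f"File{n}.txt": f"File{n}Contents!!!\r\n".encode() * 10 for n in (1, 2, 3)}
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, data in contents.items():
        zf.writestr(name, data)
buf = archive.getvalue()

# Chop off the central directory: its offset is stored 6 bytes from the end.
(cd_start,) = struct.unpack_from("<I", buf, len(buf) - 6)
headers_and_data = buf[:cd_start]
files = extract_without_cd(headers_and_data)
print(sorted(files))
```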
32
3 Local File Headers & File Data but NO CD
Still Worked
33
Why is it important to understand Zip files?
Many popular file types use the Zip standard:
Java JAR (EAR, RAR (Java), WAR)
Office Open XML (Microsoft) 2007 and greater (docx, xlsx, pptx)
Open Packaging Conventions
OpenDocument (ODF)
XPI (Mozilla extensions)
Zip is also the only native compression method included with Windows.
34
References http://en.wikipedia.org/wiki/Zip_(file_format)#Structure
35
Questions?