THE CATHOLIC UNIVERSITY OF AMERICA School of Engineering / Department of Electrical Engineering and Computer Science A Non-Algorithmic File-Type Independent Method for Hiding Persistent Data in Files Maha Sabir, M.Sc. Dr. Jim Jones, Ph.D. Dr. Hang Liu, Ph.D. April 20, 2017
Outline Background Motivation Statement of the problem Objectives Related Work Methodology Results & Discussion Conclusion & Future work References
Background: Anti-forensics Techniques
Anti-forensics techniques include data hiding, trail obfuscation, and data destruction.
Background: Data Hiding
Data hiding includes non-algorithmic file data hiding, steganography, and cryptography.
Definition of Terms Watermarking vs. Tagging Stealthy Non-Algorithmic File-Type Independent LSB
Motivation
Protect digital data and sensitive content through stealthy tagging (stealthy watermarking).
Understand where anti-forensic techniques may hide data, so that we know where to look.
Statement of the Problem Key Questions: Can we hide data in files that is persistent, benign, recoverable and stealthy? 1) Which tags will survive and under what conditions? 2) Are there specific locations in a file where you can store and preserve tags?
Objectives Develop a file-type independent methodology to: Identify unique file locations for hiding stealthy watermark tags that are persistent and benign. Test watermarks for persistence and document survivability. Applications: For data protectors, investigators, and forensic examiners to trace digital evidence. Foundation for techniques to find hidden data.
Related Work 1/2 1. Microsoft Word OOXML File Format The Structure of an OOXML format document The Directory Structure of a “sample.docx”
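Because an OOXML document is a ZIP package, its directory structure can be listed with a standard ZIP reader. The sketch below is illustrative only: the function names `list_parts` and `make_sample_package` are hypothetical, and the stand-in package built here is not a valid Word document; it only mimics the part names.

```python
import io
import zipfile


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the member names of an OOXML (ZIP) package in archive order."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.namelist()


def make_sample_package() -> bytes:
    """Build a tiny stand-in package (placeholder content, not a real document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


if __name__ == "__main__":
    for name in list_parts(make_sample_package()):
        print(name)
```

For a real file, the same call works on `open("sample.docx", "rb").read()`.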
Related Work 2/2
2. Hiding Data in Files
1) Cantrell & Dampier (2004)
   Hiding data in files, e.g., HTML files and early binary MS Office files.
   Any space with at least two hexadecimal zeroes.
   Some file dead spaces are not suitable for data hiding.
2) Garfinkel & Migletz (2009)
   Encrypting data in content parts of the ZIP archive.
   Hiding data in XML comments.
3) Castiglione et al. (2011)
   Hiding data in OOXML ZIP archives and evaluating steganographic methods such as altering the ZIP compression algorithm, Office macros, zero-dimension images, and revision identifier values.
Limitations of Prior Work Based on understanding and altering the internal structure of the files, hence each technique would only work on one file type. Based on algorithmic data hiding (e.g., LSB for steganography), which is detectable, changes the carrier file, and is file type dependent.
File Dead Space
Finding file dead space, which we define as a region of a raw file that may be changed without corrupting the file.
Searching for a string of 16 or more consecutive 0x00 bytes.
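The search for runs of zero bytes can be sketched as follows. This is a minimal illustration, not the authors' tool: the function name `find_dead_spaces` is hypothetical, and a run of zeros is only a candidate dead space; whether it can be changed without corrupting the file still has to be tested.

```python
import re

MIN_RUN = 16  # threshold from the slide: 16 or more consecutive 0x00 bytes


def find_dead_spaces(data: bytes, min_run: int = MIN_RUN) -> list[tuple[int, int]]:
    """Return (offset, length) for every maximal run of at least min_run zero bytes."""
    # The quantifier is greedy, so each match covers a whole run of zeros.
    pattern = re.compile(rb"\x00{%d,}" % min_run)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


if __name__ == "__main__":
    sample = b"AB" + b"\x00" * 20 + b"CD" + b"\x00" * 10 + b"EF" + b"\x00" * 16
    for offset, length in find_dead_spaces(sample):
        print(f"offset={offset} length={length}")
```

In this example the 10-byte run is skipped because it is shorter than the threshold.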
Dataset Description
Measure              File size (KB)   File dead spaces (count)
Maximum              9,495            153
Minimum              10               5
Average              210              8
Median               30.32            5.00
Standard Deviation   849.38           12.25
Correlation (R)      0.318
Tagging Files and Survivability Testing
A 16-byte string is written to every dead space of each file.
For dead spaces longer than 16 bytes, the tag is written at both the beginning and the middle.
The first three dead spaces, located at:
  [Content_Types].xml
  _rels/.rels
  word/_rels/document.xml.rels
are stable and do not change when different operations are performed.
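The tagging step might look like the sketch below. It is written under stated assumptions rather than taken from the authors' implementation: the names `find_dead_spaces` and `tag_dead_spaces` are hypothetical, and the middle copy is written only when it fits in the second half of the run without overlapping the first copy (runs of 32 bytes or more). The slides do not say how shorter runs are handled.

```python
import re

TAG_LEN = 16  # tag size from the slide


def find_dead_spaces(data: bytes, min_run: int = TAG_LEN) -> list[tuple[int, int]]:
    """Return (offset, length) for every maximal run of at least min_run zero bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_run)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


def tag_dead_spaces(data: bytes, tag: bytes) -> bytes:
    """Overwrite the start (and, when it fits, the middle) of each dead space with tag."""
    if len(tag) != TAG_LEN:
        raise ValueError(f"tag must be exactly {TAG_LEN} bytes")
    out = bytearray(data)
    for offset, length in find_dead_spaces(data):
        out[offset:offset + TAG_LEN] = tag  # copy at the beginning of the run
        half = length // 2
        if half >= TAG_LEN:  # assumption: skip the middle copy if it would overlap
            out[offset + half:offset + half + TAG_LEN] = tag
    return bytes(out)  # same length as the input; only zero bytes were replaced


if __name__ == "__main__":
    raw = b"X" + b"\x00" * 16 + b"Y" + b"\x00" * 40 + b"Z"
    tagged = tag_dead_spaces(raw, b"TAG-0123456789AB")
    print(len(raw) == len(tagged), tagged.count(b"TAG-0123456789AB"))
```

Overwriting in place keeps the file size and all offsets unchanged, which matches the idea of modifying only bytes that were already zero.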
Descriptions of the Tests
Test  Name
1     Copy on device
2     Copy off device
3     Open/Close/No Modify/No Save
4     Open/Save/Close/No Modify
5     Open/Modify/Close/No Save
6     Open/Modify/Save/Close
7     Open/Modify/Terminate
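After each test, survivability can be checked by comparing the tagged file with the file left behind by the operation. The sketch below assumes both versions are available as byte strings; `tag_offsets` and `survival_report` are hypothetical helper names, and the operations themselves (copying, opening in Word, and so on) are performed outside this code.

```python
def tag_offsets(data: bytes, tag: bytes) -> list[int]:
    """Return every offset at which tag occurs in data."""
    offsets = []
    pos = data.find(tag)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(tag, pos + 1)
    return offsets


def survival_report(before: bytes, after: bytes, tag: bytes) -> dict:
    """Count tag copies before and after an operation and flag full survival."""
    planted = len(tag_offsets(before, tag))
    found = len(tag_offsets(after, tag))
    return {
        "planted": planted,
        "found": found,
        "all_survived": planted > 0 and found >= planted,
    }


if __name__ == "__main__":
    tag = b"TAG-0123456789AB"
    tagged = b"aa" + tag + b"bb" + tag
    print(survival_report(tagged, tagged, tag))                # e.g. a plain copy
    print(survival_report(tagged, b"aa" + tag + b"bbbb", tag))  # one copy lost
```

Counting copies rather than comparing whole files lets the check tolerate operations that rewrite other parts of the file while leaving the tags intact.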
Results & Discussion 2/2
It is possible to hide persistent data in the file dead space of DOCX (OOXML) files, specifically in the first three file dead spaces of each document.
All data hidden in file dead space persists through operations such as opening, closing, terminating, copying, and saving.
Tags in the ZIP archive of an OOXML document do not survive when the document is modified and saved.
Files have several internal dead spaces.
Conclusion & Future Work
It is possible to empirically find locations suitable for storing data without knowledge of the file's internal structure.
These locations survive many, but not all, operations.
Editing a DOCX file proved destructive to inserted data.
Other operations have no effect on the tag or the document.
Future work: extend to other file types (video, image, PDF, ...).
References
Beer, R. de., Stander, A., & Belle, J. (2015). Anti-Forensics: A Practitioner Perspective. International Journal of Cyber-Security and Digital Forensics (IJCSDF), 4(2), 390–403.
Cantrell, G., & Dampier, D. D. (2004). Experiments in hiding data inside the file structure of common office documents: a stegonography application. In Proceedings of the 2004 International Symposium on Information and Communication Technologies (pp. 146–151). Trinity College Dublin.
Castiglione, A., D’Alessio, B., De Santis, A., & Palmieri, F. (2011). Hiding Information into OOXML Documents: New Steganographic Perspectives. Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications, 2(4), 59–83.
Fu, Z., Sun, X., & Xi, J. (2015). Digital forensics of Microsoft Office 2007-2013 documents to prevent covert communication. Journal of Communications and Networks, 17(5), 525–533. https://doi.org/10.1109/JCN.2015.000091
Garfinkel, S. L., & Migletz, J. J. (2009). New XML-Based Files: Implications for Forensics. IEEE Security & Privacy, 7(2), 38–44.
Jain, A., & Chhabra, G. S. (2014). Anti-forensics techniques: An analytical review. In 2014 Seventh International Conference on Contemporary Computing (IC3) (pp. 412–418). https://doi.org/10.1109/IC3.2014.6897209
Kessler, G. C. (2007). Anti-forensics and the digital investigator. In Proceedings of the 5th Australian Digital Forensics Conference (p. 1). Mt Lawley, Western Australia: Edith Cowan University.
Q & A