Three Novel Algorithms for Hiding Data in PDF Files Based on Incremental Updates
Li Lei
School of Information Science and Technology, Sun Yat-Sen University


Contents
1. Introduction
2. The Structure of PDF Files
3. Incremental Updates
4. Proposed Algorithms
5. Experimental Results
6. Future Work

Introduction
PDF (Portable Document Format):
- A widely used electronic document format
- High printing quality
- Cross-platform applicability
- Device independence

Uses of hiding information in PDF files:
- Secret message transmission
- Marking the source and transmission path of a document

Existing algorithms
- First category: slightly varying the line, word, or character spacing, or certain other display attributes [2,3,4,5,6,7]. The obvious defects are that the page display is disturbed and that information security is relatively low.
- Second category: adding to or changing the content of PDF file streams [8,9,10]. These methods have, to some degree, disadvantages in guaranteeing large capacity, high security, and robustness.

The structure of PDF files
File structure (physical structure)
The file structure comprises the header, the body (which contains the objects), the cross-reference table (which holds information about the indirect objects in the file), and the trailer. It determines how the objects are stored in a PDF file.
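The four parts of the file structure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete PDF writer: the helper name `build_minimal_pdf` and the three sample objects are hypothetical, every object is assumed to have generation number 0, and the resulting page is blank.

```python
def build_minimal_pdf(objects):
    """Assemble header, body, cross-reference table and trailer.

    `objects` is a list of bytes, one per indirect object body.
    Assumptions: object numbers run 1..n, generation 0, object 1 is the Catalog.
    """
    out = bytearray(b"%PDF-1.4\n")                 # header
    offsets = []
    for num, body in enumerate(objects, start=1):  # body: the indirect objects
        offsets.append(len(out))                   # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)    # cross-reference table
    out += b"0000000000 65535 f \n"                # object 0: head of the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off           # 20-byte entry, 10-digit offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos  # trailer points back to the table
    return bytes(out), offsets


objs = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
]
pdf, offsets = build_minimal_pdf(objs)
print(pdf.decode("ascii"))
```

Each cross-reference entry records where its object starts, which is why a reader can jump to any object without scanning the whole file.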

Document structure (logical structure)
A PDF document can be regarded as a hierarchy of objects contained in the body section of a PDF file. The document structure is organized as an object tree topped by the Catalog, which includes five subtrees: Page tree, Outline hierarchy, Article threads, Named destinations, and Interactive form.

Object
An object is the basic element of a PDF file. PDF supports eight basic types of objects: Boolean, Numeric, String, Name, Array, Dictionary, Stream, and Null. Objects may be labeled so that other objects can refer to them. A labeled object is called an indirect object.

Content stream
The content stream, which belongs to the Page tree, contains almost all of the information about the PDF text contents and display attributes. Each page's contents are cut into blocks and saved in dictionary objects named Contents objects. Each Contents object contains text objects and text state. A text object describes the text contents, and the text state is a collection of page display attributes.

Incremental updates
The contents of a PDF file can be updated incrementally without rewriting the entire file. Changes are appended to the end of the file, leaving its original contents intact. In an incremental update, any new or changed objects are appended to the file and constitute the updated body; a new cross-reference section and a new trailer are then appended after it.
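The append-only layout of an incremental update can be sketched as follows. This is a simplified illustration, assuming a classic cross-reference table (not a cross-reference stream) and generation number 0; the function name `append_incremental_update` and the toy `original` file are hypothetical, and the toy Catalog is too small to be a usable document.

```python
import re


def append_incremental_update(pdf, obj_num, new_body):
    """Append a new version of object `obj_num`; the bytes of `pdf` stay untouched."""
    # Read what the new trailer must carry over from the previous one.
    prev_xref = int(re.findall(rb"startxref\s+(\d+)", pdf)[-1])
    root = re.findall(rb"/Root\s+(\d+\s+\d+\s+R)", pdf)[-1]
    size = int(re.findall(rb"/Size\s+(\d+)", pdf)[-1])

    out = bytearray(pdf)
    if not out.endswith(b"\n"):
        out += b"\n"
    obj_offset = len(out)
    out += b"%d 0 obj\n" % obj_num + new_body + b"\nendobj\n"        # updated body
    xref_pos = len(out)
    out += b"xref\n%d 1\n%010d 00000 n \n" % (obj_num, obj_offset)   # new xref section
    out += b"trailer\n<< /Size %d /Root %s /Prev %d >>\n" % (
        max(size, obj_num + 1), root, prev_xref)                     # /Prev chains to the old table
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# Toy one-object file, built here only so the example is self-contained.
header = b"%PDF-1.4\n"
obj1 = b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
xref_at = len(header) + len(obj1)
original = (header + obj1
            + b"xref\n0 2\n0000000000 65535 f \n%010d 00000 n \n" % len(header)
            + b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % xref_at)

updated = append_incremental_update(original, 1, b"<< /Type /Catalog /Lang (en) >>")
print(updated[len(original):].decode("ascii"))
```

The original bytes remain a prefix of the updated file, and the `/Prev` entry lets a reader fall back to the earlier cross-reference table for objects that were not changed.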

When do incremental updates occur?
- Right-clicking the file and modifying its properties
- Using "Save" to store editing operations

Proposed algorithms
1. A compensated version of modifying display attributes
The text state in a Contents object indicates the attributes of the text display. Every attribute is marked by an operator keyword: character spacing (Tc), word spacing (Tw), scale (Tz), leading (TL), font size (Tf), render (Tr), rise (Ts), and so on. The operands of these operators in the content stream can be modified to hide information.

However, algorithms that modify these attributes affect the display of the PDF file.

We can compensate for the effect of data hiding by using incremental updates of PDF files: after the text states of the Contents objects are altered to embed information, the original Contents objects are written into the updated body. Because a reader displays the most recent version of each object, the page looks unchanged.
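The idea behind the first algorithm can be sketched in Python. The slides do not specify how bits are mapped onto the text-state operands, so the scheme below is an assumption made for illustration only: one bit is stored in the parity of each `Tc` operand expressed in thousandths of a unit. The names `embed_bits` and `extract_bits` and the sample content stream are hypothetical.

```python
import re

# Matches a numeric operand followed by the Tc (character spacing) operator.
TC = re.compile(rb"(-?\d*\.?\d+)\s+Tc")


def embed_bits(stream, bits):
    """Hide one bit per Tc operand (assumed scheme: parity of the thousandths)."""
    it = iter(bits)

    def repl(match):
        bit = next(it, None)
        if bit is None:                      # no bits left: leave the operand alone
            return match.group(0)
        milli = round(float(match.group(1)) * 1000)
        milli = (milli & ~1) | bit           # force the parity to equal the bit
        return b"%.3f Tc" % (milli / 1000)

    return TC.sub(repl, stream)


def extract_bits(stream, count):
    """Read the parity of each Tc operand back out."""
    bits = [round(float(m.group(1)) * 1000) & 1 for m in TC.finditer(stream)]
    return bits[:count]


original_stream = b"BT /F1 12 Tf 0.5 Tc 0 Tw (Hello) Tj 0.25 Tc (World) Tj ET"
stego_stream = embed_bits(original_stream, [1, 0])
print(stego_stream)
print(extract_bits(stego_stream, 2))
```

In the compensated version described above, `stego_stream` would replace the Contents object in the base file, and `original_stream` would then be appended as a new version of the same object in an incremental update, so the visible page is rendered from the unmodified stream.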

2. Algorithms based on the new body and cross-reference section
① In the updated body, the actual embedding carriers are indirect objects. Considering the complexity of inserting objects, content security, capacity, and other factors, we select stream objects as the embedding carrier.
② The new cross-reference section can also serve as the covert information carrier. Information is embedded by controlling the 10-byte offset field of each cross-reference entry: the difference between the offsets of adjacent entries represents the covert information.
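The cross-reference variant (②) can be illustrated with a short Python sketch. The slides do not state how many bits each offset difference carries or how object sizes are controlled, so `bits_per_entry=6`, `min_gap=64`, and the function names are assumptions. In a real file, each new object would have to be padded so that its length equals the encoded difference, because the offsets must still point at actual objects.

```python
def bits_to_offsets(bits, base, bits_per_entry=6, min_gap=64):
    """Return byte offsets whose successive differences encode `bits`.

    `base` is the offset at which the first new object's predecessor ends;
    `min_gap` is an assumed minimum object length.
    """
    offsets, prev = [], base
    for i in range(0, len(bits), bits_per_entry):
        chunk = bits[i:i + bits_per_entry]
        chunk = chunk + [0] * (bits_per_entry - len(chunk))   # zero-pad the last chunk
        value = int("".join(map(str, chunk)), 2)
        prev += min_gap + value                               # difference = min_gap + value
        offsets.append(prev)
    return offsets


def offsets_to_bits(offsets, base, count, bits_per_entry=6, min_gap=64):
    """Recover the bits from the differences between adjacent offsets."""
    bits, prev = [], base
    for off in offsets:
        value = off - prev - min_gap
        bits.extend(int(c) for c in format(value, "0%db" % bits_per_entry))
        prev = off
    return bits[:count]


def xref_entry(offset):
    """Format one in-use cross-reference entry with its 10-digit offset field."""
    return b"%010d 00000 n \n" % offset


message = [int(c) for c in format(0xDEADBEEF, "032b")] * 4    # 128 sample bits
offsets = bits_to_offsets(message, base=1000)
section = b"".join(xref_entry(off) for off in offsets)
print(len(offsets), "entries")
print(section.decode("ascii"))
```

With the assumed 6 bits per entry, 128 bits need 22 entries, which happens to match the entry count quoted in the experimental results; this does not confirm that the authors used the same encoding.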


The experimental results and analysis
Data embedding capacity
User interface: (screenshot not included in the transcript)

Perceptual transparency
As the effects chart shows, embedding data causes no change in the display of the cover file.

The robustness to reading and editing operations
1. Robustness to annotating and marking operations
Adobe Acrobat 9 Pro was used to annotate and mark the embedded PDF file in various ways, and the covert information was then extracted from it. The experiment shows that the extraction accuracy is 100%.

2. Robustness to interactive form editing
(a) is the stego file without any editing, and (b) is the file obtained by writing some contents into (a). The covert information was extracted from (b), and the experiment shows that the extraction accuracy is 100%.

Increase in the size of the carrier file
1. Algorithm 1 (embedding 128 bits)

File  Size    Pages  Embedded size  Size increase
1     149 KB  4      153 KB         2.7%
2     237 KB  4      245 KB         3.4%
3     271 KB  4      272 KB         0.4%
4     298 KB  4      306 KB         2.7%
5     303 KB  6      304 KB         0.3%
6     349 KB  7      350 KB         0.3%
7     413 KB  2      415 KB         0.5%
8     543 KB  5      544 KB         0.2%
9     663 KB  4      664 KB         0.15%
10    801 KB  10     803 KB         0.2%

Rewriting a Contents object by incremental update increases the size of the original file by 1 to 8 KB, depending on the size of the original Contents object. The experimental results show an average file size increase of around 1%.
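The percentages and the quoted average can be rechecked from the file sizes. The short Python calculation below uses only the ten size pairs from the table; the variable names are illustrative.

```python
# (original size, embedded size) in KB for the ten test files.
sizes_kb = [
    (149, 153), (237, 245), (271, 272), (298, 306), (303, 304),
    (349, 350), (413, 415), (543, 544), (663, 664), (801, 803),
]

# Relative size increase of each file, in percent.
increases = [100 * (after - before) / before for before, after in sizes_kb]
average = sum(increases) / len(increases)

for index, pct in enumerate(increases, start=1):
    print(f"file {index}: {pct:.2f}%")
print(f"average increase: {average:.2f}%")
```

The absolute growth stays between 1 and 8 KB in every case, so the relative increase shrinks as the cover file gets larger.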

Increase in the size of the carrier file
2. Algorithms 2 and 3 (embedding 128 bits)
The size increase caused by algorithm 2 is independent of the original file. Using 4 objects to embed 128 bits adds no more than 1 KB to the original PDF file; for a 200 KB file, this is about 0.5%.
The size increase caused by algorithm 3 is also independent of the original file. Using 22 cross-reference entries (which requires adding 22 new objects) to embed 128 bits, the maximum size increase is around 4 to 5 KB; for a 200 KB file, this is about 2.5%.

Performance comparison

Property                 Incremental updates  wbStego 4.3        Varying display attributes  Changing entries' order
Perceptual transparency  Not changed          Not changed        Slightly changed            Not changed
Embedding capacity       Large enough         Large enough       Small                       Based on file
Security                 High                 Relatively low     Relatively low              High
Robustness               Strong               Relatively strong  Relatively strong           Medium

Future work
Different versions of PDF files are in use at present. Some higher versions use cross-reference streams to store the information about indirect objects. Improving compatibility across PDF versions is the focus of our next step.

References
1. Adobe Systems Incorporated. PDF Reference, fifth edition, version
2. S. H. Low and N. F. Maxemchuk. Performance comparison of two text marking methods. IEEE Journal on Selected Areas in Communications, Vol. 16, No. 4, 1998, pp.
3. J. T. Brassil, et al. Electronic marking and identification techniques to discourage document copying. IEEE Journal on Selected Areas in Communications, Vol. 13, No. 8, 1995, pp.
4. Shangping Zhong, Tierui Chen. Information Steganography Algorithm Based on PDF Documents. Computer Engineering, Vol. 32, No. 3, Feb. 2006, pp.
5. S. H. Low, et al. Document marking and identification using both line and word shifting. In Proceedings INFOCOM'95, Boston, MA, Apr. 1995, pp.
6. N. F. Maxemchuk and S. H. Low. Marking text documents. In Proceedings, International Conference on Image Processing, Santa Barbara, CA, Oct. 1997, pp.
7. E. Franz and A. Pfitzmann. Steganography secure against Cover-Stego-Attacks. 3rd International Workshop, Information Hiding 1999, 2000, pp.
8. wbStego Studio. The steganography tool wbStego
9. Youji Liu, Xingming Sun, Gang Luo. A Novel Information Hiding Algorithm Based on Structure of PDF Document. Computer Engineering, Vol. 32, No. 17, Sep. 2006, pp.
10. Xingtong Liu, Quan Zhang, Chaojing Tang, Jingjing Zhao, and Jian Liu. A Steganographic Algorithm for Hiding Data in PDF Files Based on Equivalent Transformation. In Information Processing (ISIP), 2008 International Symposiums on, May 2008, pp.

That's all. Thanks.