An Introduction Steganography with A Case Study of Steganalysis


1 An Introduction Steganography with A Case Study of Steganalysis
Arunabha (Arun) Sen, Huan Liu. Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287. Joint work with Yanming Di and Avinash Ramineni.

2 Secure Communication Two parties, Alice and Bob, can exchange information over an insecure medium in such a way that even if an intruder (Willie) is able to intercept, read and perform computation on the intercepted information, Willie will not be able to decipher the content of the exchanged information.

3 Sometimes encryption may not be enough
Prisoners' Problem: Alice and Bob are in jail and wish to hatch an escape plan. All their communications pass through the warden, Willie, and if Willie detects any encrypted messages, he can simply stop the communication. So they must find some way of hiding their secret message in an innocuous-looking text.

4 Steganography
Steganography is the art of hiding information in ways that prevent the detection of hidden messages. Steganography in Greek means “covered writing”. Steganography and cryptography are cousins in the spy craft family. While the goal of a cryptography system is to conceal the content of the messages, the goal of information hiding or steganography is to conceal their existence.

5 Steganography
What to hide: texts, images, sound.
How to hide: embed text in image/sound files; embed image in image/sound files; embed sound in image/sound files.

6 Sometimes the distinction between Steganography and Cryptography is blurry
[Figure: the encryption model. Plain text p enters the encryption method with encryption key K and leaves as cipher text c = Ek(p); the decryption method, with the decryption key, recovers the plain text. A passive intruder just listens; an active intruder alters messages.]

7 Steganographic System

8 Comparison
Cryptography: C = Ek(P) and P = Dk(C), where P is the plain text, C is the cipher text and k is the key.
Steganography: a function f combines the secret message and the cover image into the stego message.

9 Encryption Example
Plain text: Pleasetransferonemilliondollarstomyswissbankaccountsixtwotwo
Cipher text: ASSASASAAASASASAAAAFDSGGSFSSQWEDSCVBNMDKL
Why does cipher text have to look like gibberish? Why can't it look like this: Mydaughtersbirthdayisseptemberthirdnineteensixtytwo
If the cipher text looks like the line above, is it cryptography or steganography?

10 Real Example
During WW2 the following cipher message was actually sent by a German spy: “Apparently neutral’s protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by-products, ejecting suets and vegetable oils.”
Hidden message: “Pershing sails from NY June 1” (obtained by extracting the second letter of each word of the message sent).
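
The extraction rule on this slide is simple enough to check mechanically. The following is a minimal sketch, not part of the original talk; the function name `second_letters` is hypothetical, and a hyphenated word such as "by-products" is treated as a single word.

```python
COVER_TEXT = (
    "Apparently neutral's protest is thoroughly discounted and ignored. "
    "Isman hard hit. Blockade issue affects pretext for embargo on "
    "by-products, ejecting suets and vegetable oils"
)


def second_letters(text):
    """Return the second character of every whitespace-separated word."""
    return "".join(word[1].lower() for word in text.split() if len(word) > 1)


hidden = second_letters(COVER_TEXT)
print(hidden)
```

The result is one unbroken string; the final letter "i" is read as the digit 1 in "June 1".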

11 Information Hiding in Images
Digital images are stored in 24-bit or 8-bit files. All color variations are derived from three primary colors: red, green and blue. Each primary color is represented by 1 byte, so 24-bit images use 3 bytes per pixel to represent a color value. For example, FFFFFF is white (100% red + 100% green + 100% blue). A 1024 x 768 pixel image with 24 bits/pixel will have a file size exceeding 2 Mbytes. In 8-bit color images such as GIF files, each pixel is represented by a single byte, and each pixel merely points to a color index table (palette) with 256 possible colors.
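
The arithmetic behind these figures can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the sizes are raw pixel data, ignoring file headers, the palette and any compression.

```python
def split_rgb(color):
    """Split a 24-bit color value into its red, green and blue bytes."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF


def raw_size_bytes(width, height, bits_per_pixel):
    """Size of the uncompressed pixel data in bytes."""
    return width * height * bits_per_pixel // 8


white = split_rgb(0xFFFFFF)
size_24bit = raw_size_bytes(1024, 768, 24)
size_8bit = raw_size_bytes(1024, 768, 8)
print(white, size_24bit, size_8bit)
```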

12 Digital Watermarking
Watermarking is used primarily for identification and entails embedding a unique piece of information within a medium without noticeably altering the medium. The difference between steganography and watermarking is primarily one of intent: steganography conceals information, while watermarks extend information and become an attribute of the cover image. Publishing and broadcasting industries are interested in techniques for hiding encrypted copyright marks and serial numbers in digital films, audio recordings, books and multimedia products.

13 Steganographic Techniques
Genome steganography: encoding a hidden message in a strand of human DNA. Hiding in text: information can be hidden in documents by manipulating the positions of lines and words, or by hiding the data in HTML files. Hiding in disk space: hiding the data in unused or reserved space. Hiding in network packets: hiding the data in packets that are transmitted through the Internet.

14 Steganographic Techniques
Hiding the data in software and circuitry: data can be hidden in the layout of the code distributed in a program or in the layout of electronic circuits on a board. Information hiding in images: ranges from least significant bit (LSB) insertion to masking and filtering to applying more sophisticated image processing algorithms. LSB insertion: a simple approach for embedding information in a cover image. It encodes the message bits in the least significant bits of the pixels of an image.

15 Some software tools for steganography
S-Tools: includes programs that process GIF and BMP images, process audio files, and will even hide information in the unused areas of floppy diskettes. StegoDos: also known as Black Wolf's Picture Encoder, version 0.90a. It works only for 320 x 200 images with 256 colors. Camouflage: a steganographic tool that allows hiding files by scrambling them and then attaching them to the file of your choice. MP3Stego: a steganographic tool that hides information in MP3 files during the compression process.

16 Information Hiding in Images
Three approaches: least significant bit insertion, masking and filtering, and algorithms and transformations.
Least significant bit insertion: a 1024 x 768 pixel image with 24 bits per pixel can hide 1024 x 768 x 3 = 2,359,296 bits = 294,912 bytes of information. On average, only about half of the least significant bits used for embedding actually need to be changed.
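
A minimal sketch of LSB insertion on a raw byte buffer follows. It is an illustration under stated assumptions, not any particular tool: the cover is treated as a flat sequence of color bytes, one message bit goes into each byte in order, and the function names are hypothetical.

```python
def embed_lsb(cover, message):
    """Write the message bits, most significant bit first, into the LSBs of cover."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("message does not fit into the cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(stego, n_bytes):
    """Rebuild n_bytes of message from the LSBs of the first 8 * n_bytes bytes."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for byte in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


capacity_bits = 1024 * 768 * 3          # one bit per color byte
cover = bytes(range(256))               # stand-in for real pixel data
message = b"escape at dawn"
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
changed = sum(a != b for a, b in zip(cover, stego))
print(capacity_bits, capacity_bits // 8, recovered, changed)
```

Each cover byte changes by at most 1, which is why the modification is hard to see in a 24-bit image.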

17 Algorithm and Transformation
Jpeg-Jsteg is a steganography tool that creates a stego-image from a message to be hidden and a lossless cover image. The software combines the message and the cover image and, using the JPEG algorithm, creates a lossy JPEG stego-image. JPEG images use the discrete cosine transform (DCT) to do the compression.

18 Discrete Cosine Transformation
A two-dimensional DCT is applied on blocks of 8x8 pixels. It transforms each 8x8 pixel block into 64 DCT coefficients. Modifying one coefficient affects all 64 image pixels. DCT-based image compression relies on two techniques to represent the images: quantization and entropy coding. The least significant bits of the quantized DCT coefficients are used as redundant data.

19 Discrete Cosine Transformation
One Dimensional DCT Two Dimensional DCT
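
The formulas from this slide are not in the transcript. As a hedged illustration, the sketch below uses the orthonormal DCT-II on 8x8 blocks, which is the form usually associated with JPEG; the normalisation and all function names are assumptions. It demonstrates the claim that changing a single coefficient alters all 64 pixels of the block.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def _scale(k):
    """Normalisation factor of the orthonormal DCT-II (assumed convention)."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(x, u):
    """One-dimensional DCT basis function u evaluated at sample x."""
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Two-dimensional DCT of an 8x8 block given as 8 rows of 8 numbers."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2: rebuild the 8x8 pixel block from 64 coefficients."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


block = [[(13 * x + 7 * y) % 256 for y in range(N)] for x in range(N)]
coeffs = dct2(block)
roundtrip_error = max_abs_diff(idct2(coeffs), block)

coeffs[1][2] += 1.0  # modify a single coefficient
modified = idct2(coeffs)
changed_pixels = sum(abs(p - q) > 1e-9
                     for ra, rb in zip(modified, block) for p, q in zip(ra, rb))
print(roundtrip_error, changed_pixels)
```

Every basis function of an 8x8 DCT is non-zero at every sample position, so the perturbation spreads over the whole block.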

20 TCP Header

21 Hiding Data in TCP/IP Header
Places to hide a secret message: the reserved bits and the sequence number field. The Initial Sequence Number (ISN) is a randomly generated number: ISN = M + F(localhost, localport, remotehost, remoteport).

22 Information Hiding Experiments in TCP Header
Take each character of the message to be hidden (8-bit ASCII). Scale it to a 32-bit number by multiplying it by an appropriate constant. Use the scaled number as the Initial Sequence Number. How good is this information hiding technique? Perform an entropy test.
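
The scaling step can be sketched as below. The slide does not give the constant, so the value 2^24 (which places the character in the top byte of the 32-bit field) is an assumption, as are the function names. Building and sending the actual TCP segment is left out.

```python
SCALE = 1 << 24  # assumed constant: shifts an 8-bit character into the top byte


def char_to_isn(ch):
    """Map one 8-bit character to a 32-bit Initial Sequence Number."""
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("only 8-bit characters are supported")
    return code * SCALE


def isn_to_char(isn):
    """Recover the character from a received ISN."""
    return chr(isn // SCALE)


isns = [char_to_isn(c) for c in "Hi"]
decoded = "".join(isn_to_char(n) for n in isns)
print(isns, decoded)
```

With this scheme at most 256 distinct ISN values can ever appear, so the entropy of the observed ISNs cannot exceed 8 bits, which is the kind of regularity an entropy test looks for.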

23 Information Theory What is information and how do you measure it?
The crux of information theory is the measure of information. Consider the following messages: “The Sun will rise”, “There will be scattered rainstorms”, “There will be a tornado”. The less likely the message, the more information it conveys.

24 Information Theory
If $x_i$ denotes an arbitrary message and $P(x_i) = P_i$ is the probability of the event that $x_i$ is selected for transmission, then the amount of information associated with $x_i$ should be some function of $P_i$. Shannon defined the information measure by the logarithmic function $I_i = \log_b (1/P_i)$. The quantity $I_i$ is called the self-information of message $x_i$.

25 Entropy
Consider an information source that emits a sequence of symbols selected from an alphabet of $M$ different symbols, and let $X$ denote the entire set of symbols $x_1, x_2, \ldots, x_M$. We can treat each symbol $x_i$ as a message that occurs with probability $P_i$ and conveys self-information $I_i$. The set of symbol probabilities must satisfy $\sum_{i=1}^{M} P_i = 1$. The amount of information produced by the source during an arbitrary symbol interval is a discrete random variable having possible values $I_1, I_2, \ldots, I_M$. The expected information per symbol is then given by the statistical average $H(X) = \sum_{i=1}^{M} P_i I_i = \sum_{i=1}^{M} P_i \log_b (1/P_i)$, which is called the source entropy.
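
The two definitions above translate directly into code. The sketch below estimates the probabilities from symbol frequencies in a sample, which is an assumption about how an entropy test would be run; the function names are hypothetical and the logarithm base defaults to 2 (bits).

```python
import math
from collections import Counter


def self_information(p, base=2.0):
    """Self-information log_b(1/p) of a message with probability p."""
    return math.log(1.0 / p, base)


def entropy(symbols, base=2.0):
    """Source entropy: the probability-weighted average of the self-information."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum((n / total) * self_information(n / total, base)
               for n in counts.values())


h_constant = entropy("aaaa")   # one certain symbol carries no information
h_two = entropy("ab")          # two equally likely symbols
h_four = entropy("abcd")       # four equally likely symbols
print(h_constant, h_two, h_four)
```

A rare message gets a large self-information and a certain one gets zero, matching the "Sun will rise" versus "tornado" example.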

26 Experimental Results
The input to the program is a text file.
Message size: 11, 26, 340, 1353, 9477, 39675
Entropy of text: 3.45, 3.6, 4.91, 4.84, 4.93, 4.55
Entropy of random sequence number: 4.7, 8.40, 10.4, 13.21, 15.27

27 Attacks on Steganographic systems
Statistical attacks: statistical tests can reveal if an image holds steganographic content. Examples are the chi-square attack and the entropy test. Visual attacks: the idea of visual attacks is to remove all parts of the image covering the message. The human eye can then distinguish whether there is a potential message or still image content.

28 Steganography and Communication Theory
Steganography can be formalized by communication theory. Parameters of information hiding, such as the number of bits that can be hidden, the invisibility of the message and its resistance to removal, can be related to the characteristics of a communication system, such as capacity, signal-to-noise ratio and jamming margin.

29 Steganography and Communication Theory
The notion of capacity in data hiding indicates the maximum number of bits that can be hidden and successfully recovered by the stegosystem. The S/N ratio serves as a measure of invisibility or detectability. Signal: the information-bearing signal (the message to be concealed). Noise: the cover image.

30 Steganography and Communication Theory
A high S/N ratio is desired in a typical communication system. In a steganographic system, a very low S/N ratio corresponds to lower perceptibility and a greater chance of success in concealing the embedded message. The measure of jamming resistance can be used to describe the level of resistance to removal or destruction of the embedded message.

31 A Case Study: LSB based steganography in JPEG images
JPEG images use the discrete cosine transform to do the compression. A two-dimensional DCT is applied on blocks of 8x8 pixels. It transforms each 8x8 pixel block into 64 DCT coefficients. Modifying one coefficient affects all 64 image pixels. DCT-based image compression relies on two techniques to represent the images: quantization and entropy coding. The least significant bits of the quantized DCT coefficients are used for hiding the message bits.

32 Steganalysis of JPEG based LSB steganography
Modifying the LSBs changes the statistical properties of the image. Pairs of values: when embedding message bits into the LSBs of quantized DCT coefficients, the frequency counts of the DCT coefficients change in pairs, e.g. (4, 5), (6, 7), … Within the pair (4, 5), embedding can only change a 4 into a 5 or a 5 into a 4. In the stego and cover image, the sum of the frequency counts of 4 and 5 therefore remains the same.
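
The pairs-of-values property, and the chi-square statistic that builds on it, can be sketched on a toy list of coefficient values. This is an illustration with made-up data, not the procedure used in the case study; the function names are hypothetical, and a full chi-square attack would also turn the statistic into a probability, which is omitted here.

```python
from collections import Counter


def embed_lsb_values(values, bits):
    """Replace the LSB of the first len(bits) values with the message bits."""
    out = list(values)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def pair_sums(values):
    """Frequency count of each pair (2k, 2k+1), keyed by k."""
    hist = Counter(values)
    return {k: hist[2 * k] + hist[2 * k + 1] for k in {v >> 1 for v in values}}


def chi_square_pov(values):
    """Distance of the even counts from the average count of their pair."""
    hist = Counter(values)
    stat = 0.0
    for k in sorted({v >> 1 for v in values}):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2.0
        stat += (even - expected) ** 2 / expected
    return stat


cover = [4] * 30 + [5] * 10 + [6] * 20 + [7] * 4   # toy coefficient values
bits = [i % 2 for i in range(len(cover))]           # balanced message bits
stego = embed_lsb_values(cover, bits)
chi_cover = chi_square_pov(cover)
chi_stego = chi_square_pov(stego)
print(pair_sums(cover), pair_sums(stego), chi_cover, chi_stego)
```

The pair sums are identical before and after embedding, while the counts inside each pair are pulled toward each other, which drives the statistic toward zero for the stego data.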

33 Turn into a Classification Problem
The objective of steganalysis is to distinguish normal images from stego images. Classification problem: classify images into two separate classes, stego and normal. Classification is a supervised learning technique. Use a set of images with hidden data in them as training data. Use classification algorithms to construct classifiers.

34 When a classification algorithm is run on a data set (stego and cover images), it needs to find some decision boundary between the two classes and create a model. The model so generated can be used to predict the class to which each image in the test data belongs. We evaluate three classification algorithms: C4.5 (a decision tree induction method), logistic regression, and neural networks.

35 Experiments
Used 180 JPEG images of size 768 x 512. Created stego images with different amounts of data hidden in them. Created image sets containing 5000, 3000, 2000, 1000 and 500 bytes of hidden messages. Tools used for hiding: JSTEG, JPHIDE, F5. The secret message hidden was taken from Project Gutenberg's e-text of Shakespeare's First Folio. The amount of data hidden in the images is measured in bits/pixel. 10-fold cross-validation is used to test the three methods.
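
The bits/pixel unit follows directly from the message size and the image dimensions given on this slide. The sketch below computes it for the five message sizes; the function name is hypothetical, and the values are derived here rather than taken from the presentation.

```python
WIDTH, HEIGHT = 768, 512
MESSAGE_SIZES_BYTES = [5000, 3000, 2000, 1000, 500]


def bits_per_pixel(message_bytes, width=WIDTH, height=HEIGHT):
    """Embedding rate: hidden message bits divided by the number of pixels."""
    return message_bytes * 8 / (width * height)


rates = {size: bits_per_pixel(size) for size in MESSAGE_SIZES_BYTES}
for size, rate in rates.items():
    print(f"{size:>5} bytes -> {rate:.4f} bits/pixel")
```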

36 Some common steganographic methods
JSTEG: a steganographic program by Derek Upham. It can be viewed as the prototype of all LSB-based methods. It hides the data in JPEG images by replacing the LSBs of the quantized DCT coefficients. It does not use encryption or random bit selection. It sequentially modifies all quantized DCT coefficients having values other than 0 and 1.
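
The embedding rule described above can be sketched on a plain list of quantized coefficients. This is a simplified illustration, not Upham's code: JPEG parsing and any message header are omitted, and the function names are hypothetical.

```python
def jsteg_embed(coeffs, bits):
    """Sequentially replace the LSB of every coefficient that is not 0 or 1."""
    out = list(coeffs)
    pending = iter(bits)
    bit = next(pending, None)
    for i, c in enumerate(out):
        if bit is None:
            break
        if c in (0, 1):
            continue                      # these two values are never used
        out[i] = (c & ~1) | bit           # also valid for negative integers
        bit = next(pending, None)
    if bit is not None:
        raise ValueError("not enough usable coefficients for the message")
    return out


def jsteg_extract(coeffs, n_bits):
    """Read the LSBs of the usable coefficients in the same sequential order."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]


coeffs = [12, 0, -3, 1, 5, 0, 0, -2, 7, 1, 2, 0]
bits = [1, 0, 1, 1, 0]
stego = jsteg_embed(coeffs, bits)
recovered = jsteg_extract(stego, len(bits))
print(stego, recovered)
```

Because 0 and 1 form a pair of their own, a modified coefficient can never turn into a skipped value, so the extractor visits exactly the same positions as the embedder.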

37 Results for JSTEG Plots and tables

38 JPHide
A steganographic program by Allan Latham. It uses random bit selection: message bits are hidden in randomly selected LSBs. The selection of random bits is controlled by a key. It also encrypts the message before embedding it. It modifies the DCT coefficients -1, 0 and 1 in a special manner.
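
The idea of key-controlled random selection can be illustrated as follows. This is explicitly not JPHide's algorithm: the sketch uses Python's general-purpose `random.Random` seeded with the key as a stand-in for a keyed generator, it skips the encryption step, and it ignores the special handling of -1, 0 and 1. All names are hypothetical.

```python
import random


def keyed_positions(key, n_coeffs, n_bits):
    """Derive a reproducible selection of distinct positions from the key."""
    return random.Random(key).sample(range(n_coeffs), n_bits)


def embed_keyed(coeffs, bits, key):
    """Hide each bit in the LSB of a key-selected coefficient."""
    out = list(coeffs)
    for pos, bit in zip(keyed_positions(key, len(out), len(bits)), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out


def extract_keyed(coeffs, n_bits, key):
    """Regenerate the same positions from the key and read the LSBs."""
    return [coeffs[pos] & 1 for pos in keyed_positions(key, len(coeffs), n_bits)]


coeffs = list(range(2, 42))
bits = [1, 0, 0, 1, 1, 0, 1, 0]
stego = embed_keyed(coeffs, bits, key="shared secret")
recovered = extract_keyed(stego, len(bits), key="shared secret")
touched = [i for i, (a, b) in enumerate(zip(coeffs, stego)) if a != b]
print(recovered, touched)
```

Scattering the changes makes a sequential test less effective than against JSTEG, since the modified LSBs are no longer concentrated at the start of the coefficient stream.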

39 Results for JPHIDE

40 F5 Program by Andreas Westfeld
Westfeld observed that replacing the LSBs of the DCT coefficients is vulnerable to statistical attack. He proposed a new method of hiding by decrementing the absolute value of the quantized DCT coefficients. F5 tries to minimize the number of bits that are modified while still allowing high capacity. It uses matrix encoding to minimize the number of bits that are modified.
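
Matrix encoding can be sketched in its usual (1, n, k) form, where k message bits are embedded into n = 2^k - 1 cover bits by changing at most one of them. The slide gives no parameters, so k = 2 in the example is an assumption; the sketch works on bare LSBs and leaves out F5's decrementing of coefficient magnitudes. Function names are hypothetical.

```python
from itertools import product


def syndrome(lsbs):
    """XOR of the 1-based positions whose bit is set."""
    s = 0
    for index, bit in enumerate(lsbs, start=1):
        if bit:
            s ^= index
    return s


def matrix_embed(lsbs, message, k):
    """Embed a k-bit integer into 2**k - 1 bits, flipping at most one bit."""
    if len(lsbs) != (1 << k) - 1:
        raise ValueError("need exactly 2**k - 1 cover bits")
    if not 0 <= message < (1 << k):
        raise ValueError("message must fit into k bits")
    out = list(lsbs)
    flip = syndrome(out) ^ message   # position to change, 0 means none
    if flip:
        out[flip - 1] ^= 1
    return out


def matrix_extract(lsbs):
    """The hidden message is simply the syndrome of the received bits."""
    return syndrome(lsbs)


stego = matrix_embed([1, 0, 1], message=0b01, k=2)
worst_case_changes = max(
    sum(a != b for a, b in zip(cover, matrix_embed(list(cover), m, 2)))
    for cover in product((0, 1), repeat=3) for m in range(4)
)
print(stego, matrix_extract(stego), worst_case_changes)
```

With plain LSB replacement, two message bits would change one cover bit on average; here two bits cost at most one change and none in a quarter of the cases.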

41 Results for F5

42 Summary

43 Conclusion and Future work
The present methods for steganalysis are method-specific. Our method is general and can be easily extended to other LSB-based steganographic methods. Existing LSB steganographic methods are easy to detect if the amount of information hidden is not too small. The maximum capacity of information that can be hidden in an image using a particular steganographic tool still has to be modeled.

