Published by Lindsey Bishop. Modified over 9 years ago.
1
An Introduction to Steganography with a Case Study of Steganalysis
Arunabha (Arun) Sen, Huan Liu
Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287
Joint work with Yanming Di and Avinash Ramineni
2
Secure Communication
Two parties, Alice and Bob, can exchange information over an insecure medium in such a way that even if an intruder (Willie) is able to intercept, read and perform computation on the intercepted information, Willie will not be able to decipher the content of the exchanged information.
3
Sometimes encryption may not be enough
Prisoners' Problem
Alice and Bob are in jail and wish to hatch an escape plan. All their communications pass through the warden, Willie, and if Willie detects any encrypted messages, he can simply stop the communication. So they must find some way of hiding their secret message in innocuous-looking text.
4
Steganography
Steganography is the art of hiding information in ways that prevent the detection of hidden messages.
Steganography in Greek means “covered writing”.
Steganography and cryptography are cousins in the spycraft family.
While the goal of a cryptographic system is to conceal the content of the messages, the goal of information hiding, or steganography, is to conceal their existence.
5
Steganography
What to hide: text, images, sound
How to hide:
Embed text in image/sound files
Embed an image in image/sound files
Embed sound in image/sound files
6
Sometimes the distinction between steganography and cryptography is blurry
[Figure: encryption model. Plain text p passes through the encryption method with encryption key K to give cipher text c = Ek(p); the decryption method with the decryption key recovers the plain text. An intruder on the channel is either passive (just listens) or active (alters messages).]
7
Steganographic System
8
Comparison
Cryptography: C = Ek(P) and P = Dk(C), where P is the plain text, C is the cipher text and k is the key.
Steganography: the secret message and the cover image are combined by a function f to produce the stego message.
9
Encryption Example
Plain text: Pleasetransferonemilliondollarstomyswissbankaccountsixtwotwo
Cipher text: ASSASASAAASASASAAAAFDSGGSFSSQWEDSCVBNMDKL
Why does cipher text have to look like gibberish? Why can't it look like this:
Mydaughtersbirthdayisseptemberthirdnineteensixtytwo
If the cipher text looks like the line above, is it cryptography or steganography?
10
Real Example
During WW2 the following cipher message was actually sent by a German spy:
Apparently neutral’s protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by-products, ejecting suets and vegetable oils
Hidden message: Pershing sails from NY June 1
(It can be obtained by extracting the second letter of each word of the message sent.)
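The extraction rule on this slide is simple enough to check mechanically. The sketch below is only an illustration: the function name `second_letters` is made up for this example, and words are assumed to be separated by whitespace. The final letter it recovers is an "i", which is read as the numeral 1.

```python
def second_letters(text: str) -> str:
    """Return the second character of every whitespace-separated word."""
    return "".join(word[1] for word in text.split() if len(word) > 1).lower()


# Cover text from the slide (typed with a plain ASCII apostrophe).
cover = (
    "Apparently neutral's protest is thoroughly discounted and ignored. "
    "Isman hard hit. Blockade issue affects pretext for embargo on "
    "by-products, ejecting suets and vegetable oils"
)

hidden = second_letters(cover)
print(hidden)  # pershingsailsfromnyjunei
```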
11
Information Hiding in Images
Digital images are stored in 24-bit or 8-bit files.
All color variations are derived from three primary colors: red, green and blue.
Each primary color is represented by 1 byte, so 24-bit images use 3 bytes per pixel to represent a color value. For example, FFFFFF is 100% red + 100% green + 100% blue (white).
A 1024 x 768 pixel image with 24 bits per pixel has 1024 x 768 x 3 = 2,359,296 bytes of pixel data, so the file size exceeds 2 Mbytes.
In 8-bit color images such as GIF files, each pixel is represented by a single byte, and each pixel merely points to a color index table (palette) with 256 possible colors.
12
Digital Watermarking
Watermarking is used primarily for identification and entails embedding a unique piece of information within a medium without noticeably altering the medium.
The difference between steganography and watermarking is primarily one of intent. Steganography conceals information; watermarks extend information and become an attribute of the cover image.
Publishing and broadcasting industries are interested in techniques for hiding encrypted copyright marks and serial numbers in digital films, audio recordings, books and multimedia products.
13
Steganographic Techniques
Genome steganography: Encoding a hidden message in a strand of human DNA.
Hiding in text: Information can be hidden in documents by manipulating the positions of lines and words, or by hiding the data in HTML files.
Hiding in disk space: Hiding the data in unused or reserved space.
Hiding in network packets: Hiding the data in packets that are transmitted through the Internet.
14
Steganographic Techniques
Hiding the data in software and circuitry: Data can be hidden in the layout of the code distributed in a program or the layout of electronic circuits on a board.
Information hiding in images: Ranges from least significant bit insertion to masking and filtering to applying more sophisticated image processing algorithms.
LSB insertion: A simple approach for embedding information in a cover image. It encodes the message in the least significant bit of every pixel of an image.
15
Some software tools for steganography
S-Tools: Includes programs that process GIF and BMP images, process audio files, and will even hide information in the unused areas of floppy diskettes.
StegoDos: Also known as Black Wolf's Picture Encoder, version 0.90a. It works only for 320 x 200 images with 256 colors.
Camouflage: A steganographic tool that allows hiding files by scrambling them and then attaching them to the file of your choice.
MP3Stego: A steganographic tool that hides information in MP3 files during the compression process.
16
Information Hiding in Images
Least significant bit insertion
Masking and filtering
Algorithms and transformations
Least significant bit insertion: A 1024 x 768 pixel image with 24 bits per pixel can hide 1024 x 768 x 3 = 2,359,296 bits = 294,912 bytes of information.
On average, LSB insertion changes only half of the least significant bits it uses, because the other half already match the message bits.
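The capacity figure and the embedding step can be sketched in a few lines. This is a minimal illustration, not any particular tool's method: the helper names `embed_lsb` and `extract_lsb` are hypothetical, the cover is treated as a flat sequence of 8-bit color samples, and message bits are written most significant bit first into consecutive samples.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Write each message bit into the LSB of one cover sample."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("message does not fit in cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the LSBs of the first 8 * n_bytes samples."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in stego[8 * i : 8 * i + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


# Capacity of a 1024 x 768 image with three 8-bit samples per pixel.
capacity_bits = 1024 * 768 * 3
capacity_bytes = capacity_bits // 8

# Tiny stand-in cover: 512 color samples (assumed values, not a real image).
cover = bytes(range(256)) * 2
secret = b"escape at dawn"
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
```

Each sample changes by at most 1, which is why the change is hard to see in a 24-bit image.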
17
Algorithm and Transformation
Jpeg-Jsteg: A steganography tool that creates a stego-image from a message to be hidden and a lossless cover image.
The software combines the message and the cover image and, using the JPEG algorithm, creates a lossy JPEG stego-image.
JPEG images use the Discrete Cosine Transform (DCT) to do the compression.
18
Discrete Cosine Transformation
The two-dimensional DCT is applied on blocks of 8x8 pixels.
It transforms each 8x8 pixel block into 64 DCT coefficients.
Modifying one coefficient affects all 64 image pixels.
DCT-based image compression relies on two techniques to represent the images: quantization and entropy coding.
The least significant bits of the quantized DCT coefficients are used as redundant data.
19
Discrete Cosine Transformation
One Dimensional DCT Two Dimensional DCT
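As an illustration of the two-dimensional transform on one 8x8 block, here is a minimal pure-Python sketch of the orthonormal DCT-II and its inverse. The function names and the test block are assumptions made for this example. It also checks the earlier claim that changing a single coefficient changes every pixel in the block.

```python
import math

N = 8  # block size used by JPEG


def alpha(k: int) -> float:
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def basis(x: int, u: int) -> float:
    """One-dimensional cosine basis function of frequency u at position x."""
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT: 64 pixels in, 64 coefficients out."""
    return [[alpha(u) * alpha(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse 2-D DCT: 64 coefficients in, 64 pixels out."""
    return [[sum(alpha(u) * alpha(v) * coeffs[u][v] * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


block = [[8 * x + y for y in range(N)] for x in range(N)]  # assumed test block
coeffs = dct_2d(block)
restored = idct_2d(coeffs)

coeffs[1][2] += 1.0  # change exactly one coefficient
perturbed = idct_2d(coeffs)
changed = sum(abs(perturbed[x][y] - restored[x][y]) > 1e-9
              for x in range(N) for y in range(N))
print(changed)  # 64
```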
20
TCP Header
21
Hiding Data in TCP/IP Header
Places to hide a secret message:
Reserved bits
Sequence number field
The Initial Sequence Number (ISN) is a randomly generated number:
ISN = M + F(localhost, localport, remotehost, remoteport)
22
Information Hiding Experiments in TCP Header
Take each character of the message to be hidden (8-bit ASCII).
Scale it to a 32-bit number by multiplying it by an appropriate constant.
Use the scaled number as the Initial Sequence Number.
How good is this information hiding technique? Perform an entropy test.
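The scaling step can be sketched as follows. The slide does not give the constant, so `SCALE = 2**24` is an assumption chosen here because it maps any 8-bit value into the 32-bit range; the function names are likewise hypothetical, and no packets are actually sent.

```python
SCALE = 1 << 24  # assumed constant; the slide only says "an appropriate constant"


def char_to_isn(ch: str) -> int:
    """Scale one 8-bit character code to a 32-bit sequence number."""
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("character does not fit in 8 bits")
    return code * SCALE


def isn_to_char(isn: int) -> str:
    """Recover the character by undoing the scaling."""
    return chr(isn // SCALE)


message = "Hi"
isns = [char_to_isn(ch) for ch in message]
decoded = "".join(isn_to_char(isn) for isn in isns)
```

With a fixed multiplier, the sequence numbers take at most as many distinct values as the message has distinct characters, which is the kind of regularity an entropy test can pick up.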
23
Information Theory
What is information and how do you measure it?
The crux of information theory is the measure of information.
Consider the following messages:
The Sun will rise
There will be scattered rainstorms
There will be a tornado
The less likely the message, the more information it conveys.
24
Information Theory
If $x_i$ denotes an arbitrary message and $P(x_i) = P_i$ is the probability of the event that $x_i$ is selected for transmission, then the amount of information associated with $x_i$ should be some function of $P_i$.
Shannon defined the information measure by the logarithmic function $I_i = \log_b (1/P_i)$.
The quantity $I_i$ is called the self-information of message $x_i$.
25
Entropy
Consider an information source that emits a sequence of symbols selected from an alphabet of $M$ different symbols. Let $X$ denote the entire set of symbols $x_1, x_2, \ldots, x_M$.
We can treat each symbol $x_i$ as a message that occurs with probability $P_i$ and conveys self-information $I_i$. The set of symbol probabilities must satisfy $\sum_{i=1}^{M} P_i = 1$.
The amount of information produced by the source during an arbitrary symbol interval is a discrete random variable having possible values $I_1, I_2, \ldots, I_M$. The expected information per symbol is then given by the statistical average
$H(X) = \sum_{i=1}^{M} P_i I_i = \sum_{i=1}^{M} P_i \log_b (1/P_i)$
which is called the source entropy.
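The entropy definition translates directly into code. In this minimal sketch the probabilities are estimated from symbol frequencies in the observed sequence, which is an assumption about how an entropy test would be run, and the function names are illustrative.

```python
import math
from collections import Counter


def self_information(p: float, base: float = 2) -> float:
    """I = log_b(1 / p) for a symbol of probability p."""
    return math.log(1 / p, base)


def entropy(symbols, base: float = 2) -> float:
    """H(X) = sum of P_i * I_i, with P_i estimated from symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum((n / total) * self_information(n / total, base)
               for n in counts.values())


print(entropy("aabb"))  # two equally likely symbols: 1 bit per symbol
print(entropy("abcd"))  # four equally likely symbols: 2 bits per symbol
```

A sequence of n values that are all different has entropy log2(n) under this estimate, the largest value possible for n observations.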
26
Experimental Results
The input to the program is a text file.
Message size: 11, 26, 340, 1353, 9477, 39675
Entropy of text: 3.45, 3.6, 4.91, 4.84, 4.93, 4.55
Entropy of random sequence number: 4.7, 8.40, 10.4, 13.21, 15.27
27
Attacks on Steganographic systems
Statistical attacks
Statistical tests can reveal whether an image holds steganographic content. Examples: chi-square attack, entropy test.
Visual attacks
The idea of visual attacks is to remove all parts of the image covering the message. The human eye can then distinguish whether there is a potential message or still image content.
28
Steganography and Communication Theory
Steganography can be formalized by communication theory.
Parameters of information hiding, such as the number of bits that can be hidden, the invisibility of the message and its resistance to removal, can be related to the characteristics of a communication system, such as capacity, signal-to-noise ratio and jamming margin.
29
Steganography and Communication Theory
The notion of capacity in data hiding indicates the maximum number of bits hidden and successfully recovered by the stegosystem.
The S/N ratio serves as a measure of invisibility or detectability:
Signal: the information-bearing signal (message to be concealed)
Noise: the cover image
30
Steganography and Communication Theory
A high S/N ratio is desired in a typical communication system.
In a steganographic system, a very low S/N ratio corresponds to lower perceptibility and a greater chance of success in concealing the embedded message.
The measure of jamming resistance can be used to describe the level of resistance to removal or destruction of the embedded message.
31
A Case Study: LSB based steganography in JPEG images
JPEG images use the discrete cosine transform to do the compression.
The two-dimensional DCT is applied on blocks of 8x8 pixels.
It transforms each 8x8 pixel block into 64 DCT coefficients.
Modifying one coefficient affects all 64 image pixels.
DCT-based image compression relies on two techniques to represent the images: quantization and entropy coding.
The least significant bits of the quantized DCT coefficients are used for hiding the message bits.
32
Steganalysis of JPEG based LSB steganography
Modifying the LSBs changes the statistical properties of the image.
Pairs of values: when message bits are embedded into the LSBs of quantized DCT coefficients, the frequency counts of the DCT coefficients change in pairs, e.g., (4, 5), (6, 7), ...
Within the pair (4, 5), embedding can only change a 4 into a 5 or a 5 into a 4.
Therefore the sum of the frequency counts of 4 and 5 is the same in the stego image and in the cover image.
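The pairs-of-values property can be checked on a toy histogram. The sketch below is a simplified illustration: the coefficient values are made up, plain LSB replacement stands in for a real embedding tool, and the statistic is the chi-square sum over pairs without converting it to a probability.

```python
import random
from collections import Counter


def embed_lsb(coeffs, bits):
    """Replace the LSB of the first len(bits) coefficients with message bits."""
    return [(c & ~1) | b for c, b in zip(coeffs, bits)] + coeffs[len(bits):]


def pair_chi_square(coeffs):
    """Sum over pairs (2k, 2k+1) of (observed - expected)^2 / expected,
    where expected is the mean of the two counts in the pair."""
    hist = Counter(coeffs)
    stat = 0.0
    for even in {c & ~1 for c in hist}:
        expected = (hist[even] + hist[even + 1]) / 2
        stat += (hist[even] - expected) ** 2 / expected
    return stat


rng = random.Random(0)
cover = [4] * 300 + [5] * 100 + [6] * 150 + [7] * 50  # assumed skewed histogram
bits = [rng.randint(0, 1) for _ in cover]
stego = embed_lsb(cover, bits)

cover_hist, stego_hist = Counter(cover), Counter(stego)
print(cover_hist[4] + cover_hist[5], stego_hist[4] + stego_hist[5])  # 400 400
print(pair_chi_square(cover), pair_chi_square(stego))
```

Embedding leaves each pair's total unchanged but pulls the two counts in a pair toward each other, so the statistic drops sharply for the stego data. That drop is what a chi-square attack looks for.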
33
Turn into a Classification Problem
The objective of steganalysis is to distinguish normal images from stego images.
Classification problem: classify images into two separate classes, stego and normal.
Classification is a supervised learning technique.
Use a set of images with hidden data in them as training data.
Use classification algorithms to construct classifiers.
34
When a classification algorithm is run on a data set (stego and cover images), it needs to find a decision boundary between the two classes and create a model.
The model so generated can be used to predict the class to which each image in the test data belongs.
We evaluate three classification algorithms:
C4.5 (a decision tree induction method)
Logistic regression
Neural networks
35
Experiments
Used 180 JPEG images of size 768 x 512.
Created stego images with different amounts of data hidden in them.
Created image sets containing 5000, 3000, 2000, 1000 and 500 bytes of hidden messages.
Tools used for hiding: JSTEG, JPHIDE, F5.
The secret message hidden was taken from Gutenberg's E-text of Shakespeare's First Folio.
Measured the amount of data hidden in the images in bits per pixel.
Used 10-fold cross-validation to test the three methods.
36
Some common steganographic methods
JSTEG
Steganographic program by Derek Upham.
It can be viewed as the prototype of all LSB-based methods.
Hides the data in JPEG images by replacing the LSBs of the quantized DCT coefficients.
Does not use encryption or random bit selection.
It sequentially modifies all quantized DCT coefficients having values other than 0 and 1.
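The sequential rule described above can be sketched on a plain list of quantized coefficients. This is an illustration of the rule only, not the JSTEG program: the function names are hypothetical, the tool's actual file handling is not modeled, and negative coefficients are assumed to use Python's two's-complement bit operations.

```python
def jsteg_embed(coeffs, bits):
    """Write bits, in order, into the LSBs of coefficients other than 0 and 1."""
    out = list(coeffs)
    slots = [i for i, c in enumerate(out) if c not in (0, 1)]
    if len(bits) > len(slots):
        raise ValueError("message too long for this cover")
    for i, bit in zip(slots, bits):
        out[i] = (out[i] & ~1) | bit  # replace the LSB
    return out


def jsteg_extract(coeffs, n_bits):
    """Read the LSBs of the first n_bits coefficients other than 0 and 1."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]


cover = [12, 0, -3, 1, 2, 0, 5, -1, 0, 7]  # assumed quantized coefficients
bits = [1, 0, 1, 1, 0, 1]
stego = jsteg_embed(cover, bits)
print(stego)  # [13, 0, -4, 1, 3, 0, 5, -2, 0, 7]
print(jsteg_extract(stego, len(bits)))  # [1, 0, 1, 1, 0, 1]
```

Skipping 0 and 1 keeps the extractor in step with the embedder: replacing the LSB of any other value never produces 0 or 1, so both sides skip exactly the same positions.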
37
Results for JSTEG Plots and tables
38
JPHide
Steganographic program by Allan Latham.
It uses random bit selection: message bits are hidden in randomly selected LSBs.
The selection of random bits is controlled by a key.
It also encrypts the message before embedding it.
It modifies the DCT coefficients -1, 0, 1 in a special manner.
39
Results for JPHIDE
40
F5
Program by Andreas Westfeld.
Westfeld observed that replacing the LSBs of the DCT coefficients is vulnerable to statistical attack.
He proposed a new method of hiding by decrementing the absolute value of the quantized DCT coefficients.
F5 tries to minimize the number of bits that are modified while still allowing high capacity.
It uses matrix encoding to minimize the number of bits that are modified.
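Matrix encoding can be illustrated on abstract bits. The sketch below embeds k message bits into n = 2^k - 1 cover bits while changing at most one of them. It shows only the encoding idea: a selected bit is simply flipped here, whereas F5 applies the change by decrementing a coefficient's absolute value, and the function names are made up for this example.

```python
from itertools import product


def syndrome(lsbs):
    """XOR of the 1-based positions of all bits that are set."""
    s = 0
    for position, bit in enumerate(lsbs, start=1):
        if bit:
            s ^= position
    return s


def matrix_embed(lsbs, message, k):
    """Embed a k-bit message (as an integer) by flipping at most one bit."""
    n = (1 << k) - 1
    if len(lsbs) != n or not 0 <= message <= n:
        raise ValueError("need 2**k - 1 cover bits and a k-bit message")
    out = list(lsbs)
    flip = syndrome(out) ^ message  # 0 means the cover already encodes it
    if flip:
        out[flip - 1] ^= 1
    return out


def matrix_extract(lsbs):
    """The receiver reads the message as the syndrome of the stego bits."""
    return syndrome(lsbs)


# Exhaustive check for k = 2: three cover bits carry two message bits.
max_changes = 0
for cover in product([0, 1], repeat=3):
    for message in range(4):
        stego = matrix_embed(cover, message, k=2)
        assert matrix_extract(stego) == message
        max_changes = max(max_changes, sum(a != b for a, b in zip(cover, stego)))
print(max_changes)  # 1
```

Plain LSB replacement of two bits changes one bit on average; this scheme for k = 2 changes a bit in three of every four cases, so 0.75 on average, and the saving grows with k.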
41
Results for F5
42
Summary
43
Conclusion and Future work
The present methods for steganalysis are method-specific.
Our method is general and can be easily extended to other LSB-based steganographic methods.
Existing LSB steganographic methods are easy to detect if the amount of information hidden is not too small.
The maximum amount of information that can be hidden in an image using a particular steganographic tool still has to be modeled.