Steganography in digital images
Data hiding applications
Copyright protection: “signature” or “watermark” of the creator/sender
- Invisible
- Hard to remove
- Robust to processing
- 64 bits are likely enough
Fingerprinting (traitor tracing): identifier of the recipient/customer
- Requirements as before
Authentication (verifying data integrity and origin)
- Invisible
- Fragile
- Hard to modify the image while preserving the watermark
Steganography
Covert (stealth) communication. The goal is to hide the real message in some other (cover) content. The cover has no value other than as a decoy.
- Statistical undetectability: no statistical test should exist that can distinguish clean objects from those containing a secret message.
- Large payload: it is a communication method!
- Robustness may or may not be an issue. If the channel is noise-free, no robustness is needed.
A successful attack on steganography detects the presence of the message but does not necessarily read it.
Steganography is not watermarking
What do they have in common?
- Both hide data.
- Often similar tools are used.
What is different about them?
- A digital watermark contains information about the cover object in which it is embedded, while the cover in stego is just a decoy.
- The presence of a digital watermark is often advertised, not concealed, while the presence of a steganographic message should be secret (an observer should not be able to tell that anything is hidden inside).
- A watermark is usually a few bits (typically 1–60 bits), while steganography strives for a large payload (it is used for communication, after all).
Steganography vs. cryptography
- Both are privacy tools involving keys that enable two or more parties to communicate privately.
- Crypto makes the message unintelligible to those not possessing the correct keys, but the existence of a secret message is obvious (overt).
- Stego conceals the very presence of the message (covert); the communicated object is just a decoy.
Schwarzenegger’s letter
Steganography is sometimes called
- Secret writing
- Concealed writing
- Covert communication
- Stealth communication
- Data hiding
- Electronic invisible ink
- The prisoners’ problem
Word origin: from Greek steganos (covered) and graphia (writing)
Steganography
~470 B.C.: first written evidence, by the Greek historian Herodotus.
Term coined by Johannes Trithemius.
Steganography in its modern form is only ~15 years old.
Data-hiding software
Number of data-hiding software packages released per year. Data provided courtesy of Neil Johnson.
Stego software by media type Data provided courtesy of Neil Johnson.
Three fundamental types of steganography
1. Steganography by cover selection: the sender selects a cover from a large set of available covers so that the required message is communicated.
2. Steganography by cover synthesis: the sender creates the cover that communicates the desired message.
3. Steganography by cover modification: the sender modifies an existing cover in order to convey the required message.
Steganography by cover modification
[Diagram: Alice compresses and encrypts the message (00101…1) and embeds it into a cover object taken from an image source, producing the stego-object. Bob extracts the bits (00101…1), then decrypts and decompresses them. Alice and Bob share an encryption key and a stego key.]
Communication is monitored by a warden looking for suspicious artifacts.
Main requirement: undetectability (no algorithm can distinguish stego objects from cover objects with success better than random guessing).
Warden: passive, active, or malicious.
Steganographic security
Cover source: random variable $x$ on $\mathcal{X}$, $x \sim p_x$
Stego source: random variable $y$ on $\mathcal{X}$, $y \sim p_y$
The warden observes objects drawn from $p_x$ or $p_y$.
Measure of security: $D_{KL}(p_x \| p_y) = \sum_i p_x(i) \log \frac{p_x(i)}{p_y(i)}$ (Kullback-Leibler divergence, or relative entropy)
Perfect security: $D_{KL}(p_x \| p_y) = 0$
$\epsilon$-security: $D_{KL}(p_x \| p_y) < \epsilon$
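As a rough illustration of the security measure above, the following sketch computes the Kullback-Leibler divergence between two empirical distributions. The toy histograms and the function names (`kl_divergence`, `normalize`) are made up for this example, and the base-2 logarithm is an assumption (the slide does not state a base).

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits for two distributions given as equal-length lists."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue            # the term 0 * log(0 / q) is taken as 0
        if q_i == 0:
            return math.inf     # p puts mass where q has none
        total += p_i * math.log2(p_i / q_i)
    return total

def normalize(histogram):
    """Turn raw counts into probabilities."""
    n = sum(histogram)
    return [h / n for h in histogram]

# Hypothetical toy histograms over four pixel values (not real image data).
cover = normalize([40, 10, 30, 20])
stego = normalize([25, 25, 25, 25])

print(kl_divergence(cover, cover))   # identical sources: perfect security
print(kl_divergence(cover, stego))   # positive: the sources are distinguishable
```

A divergence of zero means the warden's best test does no better than guessing; any positive value bounds how well cover and stego objects can be told apart.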
LSB embedding and its analysis
LSB embedding
Cover image is grayscale with M×N pixels. All pixels are 8-bit integers in {0, …, 255}.
To embed a message:
- Visit pixels pseudo-randomly.
- Embed one bit at every pixel: message bit → LSB of the pixel value.
- Continue until all bits are embedded.
To read the message:
- Follow the same path and read the LSBs of the visited pixels.
Example: pixel value is 11 = (00001011)_2
- We want to embed bit “0”: change 11 to 10.
- We want to embed bit “1”: no change is needed.
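The embedding and reading steps above can be sketched as follows. This is a minimal illustration, not any particular tool's implementation: the pixel list, the function names, and the use of Python's `random` module seeded with a shared stego key to generate the pseudo-random path are all assumptions made for the example.

```python
import random

def embedding_path(num_pixels, stego_key):
    """Pseudo-random visiting order, reproducible from the shared stego key.

    Note: random.Random is not cryptographically secure; it is used here
    only to illustrate a keyed path.
    """
    order = list(range(num_pixels))
    random.Random(stego_key).shuffle(order)
    return order

def lsb_embed(pixels, bits, stego_key):
    """Replace the LSB of each visited pixel with the next message bit."""
    if len(bits) > len(pixels):
        raise ValueError("payload exceeds one bit per pixel")
    stego = list(pixels)
    path = embedding_path(len(pixels), stego_key)
    for position, bit in zip(path, bits):
        stego[position] = (stego[position] & ~1) | bit
    return stego

def lsb_extract(stego, num_bits, stego_key):
    """Follow the same path and read the LSBs of the visited pixels."""
    path = embedding_path(len(stego), stego_key)
    return [stego[position] & 1 for position in path[:num_bits]]

cover = [11, 200, 37, 54, 255, 0, 128, 91]   # hypothetical 8-bit pixel values
message = [0, 1, 1, 0, 1]
stego = lsb_embed(cover, message, stego_key=2024)
recovered = lsb_extract(stego, len(message), stego_key=2024)
print(recovered == message)
```

The receiver needs only the stego key and the message length; the cover image itself is never required for extraction.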
LSB embedding (continued)
Maximal payload that can be embedded: $M \times N$ bits.
Assume we embed a payload of $m \le MN$ bits.
Relative payload: $\alpha = m/(MN)$ bits per pixel (bpp), $0 \le \alpha \le 1$.
When embedding $m$ bits, we make on average $m/2$ changes.
Change rate: $\beta = (m/2)/(MN) = \alpha/2$, $0 \le \beta \le 1/2$.
Change rate = probability of making an embedding change.
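The "m/2 changes on average" claim can be checked with a small simulation. The sketch below assumes the message bits are uniformly random (as they would be after encryption) and uses a synthetic random cover; the function name and the variables `alpha` (relative payload) and `beta` (change rate) are chosen for this example.

```python
import random

def simulate_change_rate(num_pixels, payload_bits, seed=1):
    """Embed random bits by LSB replacement and count the actual changes."""
    rng = random.Random(seed)
    cover = [rng.randrange(256) for _ in range(num_pixels)]   # synthetic cover
    path = rng.sample(range(num_pixels), payload_bits)        # pixels to visit
    changes = 0
    for position in path:
        bit = rng.getrandbits(1)
        if (cover[position] & 1) != bit:   # a change is needed only on mismatch
            changes += 1
    alpha = payload_bits / num_pixels      # relative payload in bpp
    beta = changes / num_pixels            # observed change rate
    return alpha, beta

alpha, beta = simulate_change_rate(num_pixels=100_000, payload_bits=50_000)
print(alpha, beta)   # beta should come out close to alpha / 2
```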
LSB embedding is very popular
- General (can be applied to any digital file consisting of numerical data)
- Extremely simple
- Fast
- High capacity (1 bit per pixel, embedding efficiency 2)
- Does not require any software present on the computer
- One command line in UNIX
Perl script (source: A. Ker, Oxford University):
perl -n0777e ’$_=unpack"b*",$_;split/(\s+)/,,5; output.pgm secrettextfile
LSB plane of images resembles random noise, so embedding is undetectable?
Example: LSB plane of Lenna LSB bitplane of a never-compressed image Black dot = odd pixel value White dot = even pixel value
Properties of LSB flipping
- LSBflip(x) is an involution (self-inverse), i.e., LSBflip(LSBflip(x)) = x for all x
- LSB flipping induces a permutation on {0, …, 255}: 0 ↔ 1, 2 ↔ 3, 4 ↔ 5, …, 254 ↔ 255
- LSB flipping is “asymmetrical” (e.g., 3 may change to 2 but never to 4)
- |LSBflip(x) − x| = 1 for all x (embedding distortion is 1 per changed pixel)
- LSBflip(x) = x + 1 − 2(x mod 2)
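The properties listed above are easy to verify exhaustively over all 8-bit values. The sketch below uses the closed-form expression from the slide; the function name `lsb_flip` is chosen for this example.

```python
def lsb_flip(x):
    """Flip the least significant bit using the closed form from the slide."""
    return x + 1 - 2 * (x % 2)

values = range(256)

# Applying the flip twice returns the original value.
assert all(lsb_flip(lsb_flip(x)) == x for x in values)

# The flip is a permutation of {0, ..., 255}.
assert sorted(lsb_flip(x) for x in values) == list(values)

# Every flip changes the value by exactly 1.
assert all(abs(lsb_flip(x) - x) == 1 for x in values)

# Values never leave their pair {2i, 2i+1}: 3 can become 2 but never 4.
assert all(lsb_flip(x) // 2 == x // 2 for x in values)

# The closed form agrees with a bitwise XOR of the lowest bit.
assert all(lsb_flip(x) == x ^ 1 for x in values)
```

The pair-preserving property is what makes LSB replacement detectable: pixel counts can only move between 2i and 2i+1, never across pair boundaries.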
Effect of LSB embedding on histogram
LSB flipping pair: $2i$, $2i+1$. Embedding moves pixels only within a pair; the parts of the histogram outside the pair are untouched.
$h_c[2i]$ = number of pixels with value $2i$ in the cover image
$h_c[2i+1]$ = number of pixels with value $2i+1$ in the cover image
For a fully embedded image (on average):
$h_s[2i] = (h_c[2i] + h_c[2i+1])/2$
$h_s[2i+1] = (h_c[2i] + h_c[2i+1])/2$
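The equalization of each histogram pair can be observed in a small simulation. The sketch below assumes random message bits and a deliberately lopsided, made-up cover containing only the values 100 and 101.

```python
import random
from collections import Counter

rng = random.Random(7)

# Hypothetical cover: the pair (100, 101) is populated very unevenly.
cover = [100] * 9000 + [101] * 3000

# Full embedding: every pixel's LSB is replaced by a random message bit.
stego = [(p & ~1) | rng.getrandbits(1) for p in cover]

h_c = Counter(cover)
h_s = Counter(stego)
pair_mean = (h_c[100] + h_c[101]) / 2

print(h_c[100], h_c[101])   # cover counts: 9000 3000
print(h_s[100], h_s[101])   # stego counts: both close to the pair mean
print(pair_mean)
```

The total count in the pair is preserved exactly; only its split between the even and odd value changes.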
“Twin peaks” in the histogram The peaks can be tested for using a chi-square test
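A minimal sketch of such a test is given below. The slide does not specify the exact form of the statistic, so this is one possible variant: it sums over both bins of every pair, which gives roughly one degree of freedom per pair under full embedding. The synthetic cover with an irregular histogram, the skip threshold of 5, and all names are assumptions for illustration.

```python
import random

def histogram256(pixels):
    """Count occurrences of each 8-bit value."""
    h = [0] * 256
    for p in pixels:
        h[p] += 1
    return h

def pairs_chi_square(histogram):
    """Chi-square statistic of each pair (2i, 2i+1) against its pair mean.

    Returns (statistic, number_of_pairs_used). For a fully embedded image
    the statistic is expected to be close to the number of pairs used.
    """
    statistic, pairs_used = 0.0, 0
    for i in range(len(histogram) // 2):
        even, odd = histogram[2 * i], histogram[2 * i + 1]
        expected = (even + odd) / 2
        if expected < 5:          # rule of thumb: skip sparsely populated pairs
            continue
        statistic += ((even - expected) ** 2 + (odd - expected) ** 2) / expected
        pairs_used += 1
    return statistic, pairs_used

rng = random.Random(3)
# Hypothetical cover with an irregular histogram (uneven pairs).
weights = [rng.uniform(0.5, 1.5) for _ in range(256)]
cover = rng.choices(range(256), weights=weights, k=50_000)
stego = [(p & ~1) | rng.getrandbits(1) for p in cover]   # full embedding

stat_cover, pairs = pairs_chi_square(histogram256(cover))
stat_stego, _ = pairs_chi_square(histogram256(stego))
print(pairs, stat_cover, stat_stego)
```

A statistic near the number of pairs is consistent with equalized pairs (suspected embedding), while a much larger value indicates the pairs are as uneven as in a typical cover. Turning the statistic into a p-value would require a chi-square distribution function, which is left out here.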
Spread-spectrum steganography
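SS steganography
Spread each bit $b \in \{-1, 1\}$ among $s$ pixels $x_1, \dots, x_s$:
$y_i = x_i + b e_i$, where $e_i$ is the spreading sequence, $e_i \sim N(0, \sigma^2)$.
Maximal payload = $MN/s$ (bits) or $1/s$ (bpp).
To read the message:
$\frac{1}{s}\sum_i y_i e_i = \frac{1}{s}\sum_i x_i e_i + \frac{1}{s} b \sum_i e_i^2$
The first term is $\sim N(0, E(x)\sigma^2/s)$ and will be small with high probability ($E(x)$ = energy per pixel). The second term is $\approx b\sigma^2$.
Decision: correlation $> 0 \Rightarrow b = 1$; correlation $< 0 \Rightarrow b = -1$.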
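The spreading and correlation detection described above can be sketched as follows. Several simplifications are assumed and are not from the slide: the cover is modeled as a zero-mean Gaussian signal (for example, pixels with the mean removed), stego values are kept as real numbers rather than rounded to 8-bit pixels, and the spreading sequence is regenerated from a shared key with Python's `random` module. All names and parameter values are illustrative.

```python
import random

def ss_embed(cover, bits, s, sigma, stego_key):
    """Add b * e_i to each block of s samples; each b is -1 or +1."""
    rng = random.Random(stego_key)
    stego = list(cover)
    for k, b in enumerate(bits):
        for i in range(k * s, (k + 1) * s):
            stego[i] = cover[i] + b * rng.gauss(0.0, sigma)
    return stego

def ss_detect(stego, num_bits, s, sigma, stego_key):
    """Correlate each block with the regenerated spreading sequence."""
    rng = random.Random(stego_key)   # same key gives the same sequence e_i
    bits = []
    for k in range(num_bits):
        corr = sum(stego[i] * rng.gauss(0.0, sigma)
                   for i in range(k * s, (k + 1) * s)) / s
        bits.append(1 if corr > 0 else -1)
    return bits

s, sigma = 2000, 2.0
bits = [1, -1, -1, 1, 1, -1, 1, -1]

# Assumed cover model: zero-mean Gaussian samples with standard deviation 10.
cover_rng = random.Random(0)
cover = [cover_rng.gauss(0.0, 10.0) for _ in range(len(bits) * s)]

stego = ss_embed(cover, bits, s, sigma, stego_key=99)
decoded = ss_detect(stego, len(bits), s, sigma, stego_key=99)
print(decoded == bits)
```

With these parameters the signal term is about sigma squared (4), while the cover interference has a standard deviation of roughly 0.45, so decoding errors are very unlikely.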
SS steganography
Spreading buys us robustness at the expense of $s$-times lower payload.
Consider a distorted signal: $z_i = y_i + n_i$
$\frac{1}{s}\sum_i z_i e_i = \frac{1}{s}\sum_i x_i e_i + \frac{1}{s}\sum_i n_i e_i + \frac{1}{s} b \sum_i e_i^2$
The first term is $\sim N(0, E(x)\sigma^2/s)$ and the second is $\sim N(0, E(n)\sigma^2/s)$; both will be small with high probability. The third term is $\approx b\sigma^2$.
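The trade-off between payload and robustness can be illustrated by measuring the bit error rate for different spreading factors under channel noise. As before, this is a sketch under assumed conditions: zero-mean Gaussian cover and noise, real-valued samples, and parameter values picked only for the demonstration.

```python
import random

def bit_error_rate(s, num_bits=200, sigma=2.0, cover_std=10.0,
                   noise_std=5.0, seed=5):
    """Fraction of wrongly decoded bits when each bit is spread over s samples."""
    rng = random.Random(seed)
    errors = 0
    for _bit in range(num_bits):
        b = rng.choice((-1, 1))
        corr = 0.0
        for _sample in range(s):
            e = rng.gauss(0.0, sigma)        # spreading sequence sample e_i
            x = rng.gauss(0.0, cover_std)    # assumed cover sample x_i
            n = rng.gauss(0.0, noise_std)    # channel noise n_i
            z = x + b * e + n                # distorted stego sample z_i
            corr += z * e
        decoded = 1 if corr / s > 0 else -1
        if decoded != b:
            errors += 1
    return errors / num_bits

for s in (4, 64, 1024):
    print(s, bit_error_rate(s))
```

Larger values of s shrink both interference terms (their variance falls as 1/s) while the signal term stays near sigma squared, so the error rate drops as the payload per pixel decreases.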
Steganalysis in the wide sense
Traditional steganalysis: a steganography system is considered broken when the mere presence of a hidden message is detected.
Forensic analysis: detection of the message may not be sufficient; often, other information would be useful:
- identification of the embedding algorithm (LSB, ±1, …)
- the stego software used (F5, OutGuess, Steganos, …)
- the stego key (StegoSuite © by Wetstones, Inc.)
- the hidden bit-stream
- the decrypted message