Steganalysis of Block-Structured Stegotext

Presentation transcript:

Steganalysis of Block-Structured Stegotext Ying Wang and Pierre Moulin Beckman Institute, CSL & ECE Department University of Illinois at Urbana-Champaign January 21st, 2004 Good morning, everyone. Today my topic is steganalysis of block-structured stegotext.

Outline What is steganography? Relative entropy as a measure of detectability Design perfectly undetectable steganography Spread spectrum Quantization index modulation (QIM) Block-based embedding Here is the outline of my talk. I will mainly focus on how to design perfectly undetectable steganography by modifying existing watermarking schemes such as spread spectrum and quantization index modulation. In particular, I am excited to show you how we can design QIM to achieve perfect undetectability. We will also look at the impact of block-based embedding on relative entropy, which is a measure of the detectability of steganography schemes.

What is steganography? Steganography is a branch of information hiding, aiming to achieve perfectly undetectable communication. A steganographer embeds a message m into the host signal or covertext S^N, possibly with the help of a secret key k shared by the encoder and decoder. The output of the encoder, X^N, is the stegotext. A steganalyzer observes the stegotext and makes a decision on whether the observed signal is an innocent covertext or an evil stegotext. If he decides that what he observes is not a legal covertext, he terminates the transmission and the decoder receives nothing. Even if the stegotext is not suspicious, an active steganalyzer can corrupt the stegotext with noise before he forwards it to the decoder. We will consider the active steganalyzer case throughout my talk. One question is: how well would the steganalyzer do if he knew the covertext and stegotext distributions exactly?

Relative entropy The steganalyzer performs a binary hypothesis test: is the observed text an innocent covertext or a stegotext? The relative entropies D(P_{S^N} || P_{X^N}) and D(P_{X^N} || P_{S^N}) are measures of the difficulty of discriminating between the hypotheses, relating to error-probability bounds. D(P_{X^N} || P_{S^N}) = 0 means perfect undetectability! In this best scenario for the steganalyzer, he basically performs a binary hypothesis test, deciding which event is more likely: that the observed text came from the covertext distribution or from the stegotext distribution. In 1998, Cachin was the first to point out that the security, or detectability, of a steganography scheme can be measured by the relative entropy between the stegotext and the covertext; it follows naturally from the use of information theory in statistics. So, if the covertext and stegotext have the same distribution, then the relative entropies are 0 and the hypothesis-testing error probability is 1/2. We say that the steganography scheme has perfect undetectability. From this point on, I will assume that the covertext is Gaussian. The reason is that for a zero-mean Gaussian random process things simplify and life is easier for me: the relative entropy only involves second-order statistics, that is, the covariance matrices of the covertext and stegotext.
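To make the Gaussian case concrete: for zero-mean Gaussian covertext and stegotext with covariance matrices $\Sigma_S$ and $\Sigma_X$ (my notation, not necessarily the slide's), the relative entropy has the standard closed form

$$
D\left(P_{X^N} \,\|\, P_{S^N}\right)
= \frac{1}{2}\left[\operatorname{tr}\!\left(\Sigma_S^{-1}\Sigma_X\right) - N - \ln\det\!\left(\Sigma_S^{-1}\Sigma_X\right)\right],
$$

which equals zero exactly when $\Sigma_X = \Sigma_S$, i.e., perfect undetectability.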

Design perfectly undetectable spread-spectrum steganography White covertext: proper scaling and embedding, X^N = γ S^N + Z^N with γ² σ_S² + σ_Z² = σ_S², lead to X^N ~ N(0, σ_S² I) too! Colored covertext: diagonalize, then scale and embed. So, how do we design perfectly undetectable steganography schemes? Can we modify existing watermarking schemes to achieve perfect undetectability? Well, the answer is yes. Let's first look at spread-spectrum steganography. If the covertext is white, that is, i.i.d. Gaussian, we can properly scale the covertext and add an information-bearing pattern to it. This additive information-bearing pattern is a carefully designed random function of the message and the secret key. When averaging over all messages and secret keys, the additive pattern can be made to asymptotically approach a white Gaussian distribution in the relative-entropy sense. If we are careful with the selection of the scaling factor and the variance of the pattern, we can actually make the stegotext have the same Gaussian distribution as the covertext. That is, we achieve perfect undetectability! Of course, the secret key here has to have infinite entropy to make the pattern Gaussian. If the covertext is colored, we simply need one more step: diagonalize the covertext, then do the scaling and embedding.
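As a rough illustration, here is a minimal Monte Carlo sketch of the white-covertext case; the explicit scaling rule and the variable names are my reconstruction of the idea rather than the slide's notation, and the information-bearing pattern Z is idealized as white Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000
sigma_S = 1.0                                     # covertext standard deviation
sigma_Z = 0.3                                     # standard deviation of the embedded pattern
gamma = np.sqrt(1.0 - (sigma_Z / sigma_S) ** 2)   # chosen so that Var(gamma*S + Z) = Var(S)

S = rng.normal(0.0, sigma_S, N)                   # white Gaussian covertext
Z = rng.normal(0.0, sigma_Z, N)                   # pattern, idealized as white Gaussian
X = gamma * S + Z                                 # scaled embedding

print("Var(S) =", S.var(), "  Var(X) =", X.var())  # both close to sigma_S**2
```

Since γS + Z is again zero-mean Gaussian with variance σ_S², the stegotext distribution matches the covertext distribution and the relative entropy is zero in this idealized setting.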

Design perfectly undetectable QIM steganography Scalar quantization, 1-bit message m ∈ {0, 1}, one-dimensional covertext sample s. Two quantizers Q_0 and Q_1, step size Δ. We know that one advantage quantization index modulation has over spread spectrum is that it can reject the covertext interference, hence the capacity is higher. So how about QIM steganography? Can we also design a QIM scheme to achieve perfect undetectability? For simplicity, we will only look at scalar quantization throughout my talk. All the following results can be generalized to the higher-dimensional case; please read our paper if you are interested. In scalar quantization, we have a one-dimensional covertext sample s and suppose we want to embed a 1-bit message m. So we need two quantizers corresponding to the two messages. As shown in this picture, ordinary QIM simply maps the covertext to either the blue crosses, to send message 1, or the red circles, to send message 0. The step size is Δ. Apparently, this cannot be a steganography scheme, since the output is clustered at these quantization points.
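For concreteness, here is a sketch of ordinary scalar QIM with two quantizers whose lattices are offset by Δ/2; the offset convention and the function names are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

DELTA = 1.0

def qim_embed(s: float, m: int) -> float:
    """Quantize s to the lattice for message m (lattice points k*DELTA + m*DELTA/2)."""
    offset = m * DELTA / 2.0
    return DELTA * np.round((s - offset) / DELTA) + offset

def qim_decode(x: float) -> int:
    """Decode by choosing the lattice whose nearest point is closer to x."""
    return 0 if abs(x - qim_embed(x, 0)) <= abs(x - qim_embed(x, 1)) else 1

s = 0.37
for m in (0, 1):
    x = qim_embed(s, m)
    print(m, x, qim_decode(x))   # the decoded message matches m
```

The embedded values sit exactly on the quantization points, which is why plain QIM is trivially detectable.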

Randomized QIM? Randomized by dither variable V ~ Uniform[0, Δ). Good, but not enough. General result: preprocessing does not help! How about randomized QIM? Its output is generated by dithered quantization of the covertext; here the dither variable V is uniformly distributed between 0 and the step size Δ. Without this dither variable, it is the distortion-compensated QIM. With the dither variable, the stegotext is the sum of the original covertext and a quantization error E. The quantization error is independent of the covertext and uniformly distributed between 0 and the step size Δ. Therefore, the distribution of the stegotext X is the convolution of the original Gaussian distribution and the uniform distribution. Although the stegotext distribution can be very close to Gaussian if the quantization step size is small, the two can never be the same. Naturally, people would ask: why don't we preprocess the covertext S so that we can make the stegotext X Gaussian? Well, the answer is: no, we can't! Preprocessing the covertext does not help. Simply think of the characteristic functions of the Gaussian distribution and the uniform distribution: the characteristic function of a Gaussian has no zeros, while the characteristic function of a uniform distribution has zeros. So no matter what you do to the covertext, the product of characteristic functions will always have zeros and can never be Gaussian.
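A small deterministic check of this argument: since the stegotext equals the covertext plus an independent uniform quantization error, its density is a Gaussian convolved with a uniform, which differs from the best-matching Gaussian, and the uniform factor of its characteristic function has zeros. The symmetric error convention E ~ Uniform[−Δ/2, Δ/2] used below is my choice; any uniform cell of width Δ behaves the same way.

```python
import numpy as np
from scipy.stats import norm

sigma_S, DELTA = 1.0, 1.0

# Density of X = S + E with S ~ N(0, sigma_S^2) and E ~ Uniform[-DELTA/2, DELTA/2]:
# f_X(x) = [Phi((x + DELTA/2)/sigma_S) - Phi((x - DELTA/2)/sigma_S)] / DELTA
x = np.linspace(-6, 6, 2001)
f_X = (norm.cdf((x + DELTA / 2) / sigma_S) - norm.cdf((x - DELTA / 2) / sigma_S)) / DELTA

# Best Gaussian candidate: same mean (0) and the same variance sigma_S^2 + DELTA^2/12.
f_G = norm.pdf(x, scale=np.sqrt(sigma_S**2 + DELTA**2 / 12))
print("max density gap:", np.abs(f_X - f_G).max())   # small but strictly positive

# Characteristic-function view: the uniform factor sin(DELTA*t/2)/(DELTA*t/2)
# vanishes at t = 2*pi*k/DELTA, so the product can never equal a Gaussian
# characteristic function, no matter how the covertext is preprocessed.
t = 2 * np.pi / DELTA
print("uniform CF at its first zero:", np.sinc(DELTA * t / (2 * np.pi)))  # ~ 0
```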

Stochastic QIM, no key Red tiles: m = 1; green tiles: m = 0. Stochastic encoder. One way to achieve perfect undetectability is to use this stochastic QIM scheme. We partition the real axis into two sets of tiles: red tiles for message 1 and green tiles for message 0. Suppose we make the red tiles carry the distribution shown by the red segments here, which follows the profile of twice the Gaussian covertext density, and the green tiles carry the distribution shown by the green segments, which also follows twice the Gaussian covertext density. Then the stegotext's distribution is a mixture of these two distributions and is exactly the same as the Gaussian covertext distribution. How do we make that happen? Well, if we want to send message 1 and the covertext happens to fall onto the red tiles, we can simply transmit the covertext; otherwise, we stochastically map the point on the green tiles to some point on the red tiles and transmit the mapped value. In this way, we make the stegotext distribution identical to the covertext distribution; that is, perfect undetectability! Here, the encoder is stochastic, but the decoder does not need to know how the encoder did the mapping, so no key needs to be shared between the encoder and decoder. This scheme is not optimal in the sense of distortion control, hence it may have low capacity.
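Here is a Monte Carlo sketch of the keyless stochastic QIM idea as I read the description above; the tile width, the alternating-tile convention, and the rejection-sampling step are my own modeling choices for illustration. With equiprobable messages, transmitting the covertext when it already lies on a tile of the desired message and otherwise resampling from the covertext density restricted to that message's tiles reproduces the Gaussian marginal.

```python
import numpy as np

rng = np.random.default_rng(2)
DELTA = 0.5                                     # tile width (assumed)

def tile_message(x):
    """Message carried by the tile containing x (tiles of width DELTA alternate 0, 1, 0, 1, ...)."""
    return np.floor(x / DELTA).astype(int) % 2

def sample_on_tiles(m, size):
    """Rejection-sample from the N(0, 1) covertext density restricted to message-m tiles."""
    out = np.empty(0)
    while out.size < size:
        cand = rng.normal(0.0, 1.0, max(4 * size, 1000))
        out = np.concatenate([out, cand[tile_message(cand) == m]])
    return out[:size]

def embed(s, m):
    """Keyless stochastic QIM: transmit s if it already carries m, otherwise resample."""
    x = s.copy()
    for msg in (0, 1):
        wrong = (m == msg) & (tile_message(s) != msg)
        x[wrong] = sample_on_tiles(msg, int(wrong.sum()))
    return x

N = 200_000
S = rng.normal(0.0, 1.0, N)                     # Gaussian covertext
M = rng.integers(0, 2, N)                       # equiprobable 1-bit messages
X = embed(S, M)

print("decoding error rate:", np.mean(tile_message(X) != M))            # 0.0
print("mean/var of S:", S.mean(), S.var(), " mean/var of X:", X.mean(), X.var())
```

The decoder only needs to read off which tile the stegotext lies on, so no shared key is required; the price, as noted above, is potentially large embedding distortion.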

Stochastic QIM with keys Postprocess the QIM output! Steganographic constraint: the final stegotext must have the same distribution as the covertext. Minimum distortion: minimize the expected embedding distortion over the stochastic mapping. A linear programming problem! We can actually design stochastic QIM with keys to improve the capacity. We stochastically postprocess the QIM output; that is, we find an optimal stochastic mapping from the QIM output X̃ to the final stegotext X. The mapping should satisfy the steganographic constraint for every stegotext value x; that is, the stegotext must have the same distribution as the covertext. At the same time, the distortion should be minimized, since the stochastic mapping is basically introducing noise. Stochastic mappings are a broader class than deterministic mappings. The advantage of the stochastic mapping is that we are solving a linear programming problem: both the constraint and the cost function are convex (in fact linear) functions of the stochastic mapping. Of course, we can also take some suboptimal solutions. For example, we can simply scale the stegotext produced by randomized QIM so that the new stegotext has the same variance as the covertext.
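Below is a toy discretized version of this linear program, sketched under several simplifications of my own (a finite grid, a made-up QIM-output distribution, and E[(X − X̃)²] as the distortion proxy): the variables are the transition probabilities P(X = x_j | X̃ = x̃_i), the equality constraints force the stegotext marginal to match the discretized covertext distribution and each row to sum to one, and the objective is the expected squared displacement.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import norm

# Discretize the real line into n bins.
n = 41
grid = np.linspace(-4, 4, n)

# Target covertext distribution: discretized N(0, 1).
p_S = norm.pdf(grid)
p_S /= p_S.sum()

# Toy "QIM output" distribution: N(0, 1) plus a uniform cell, so slightly wider than the covertext.
p_Xt = norm.cdf(grid + 0.5) - norm.cdf(grid - 0.5)
p_Xt /= p_Xt.sum()

# Decision variables T[i, j] = P(X = grid[j] | Xtilde = grid[i]), flattened row-major.
cost = (p_Xt[:, None] * (grid[None, :] - grid[:, None]) ** 2).ravel()  # E[(X - Xtilde)^2]

# Steganographic constraint: sum_i p_Xt[i] * T[i, j] = p_S[j] for every j.
A_marg = np.zeros((n, n * n))
for j in range(n):
    A_marg[j, j::n] = p_Xt
# Each row of T must be a probability distribution: sum_j T[i, j] = 1.
A_row = np.zeros((n, n * n))
for i in range(n):
    A_row[i, i * n:(i + 1) * n] = 1.0

res = linprog(cost,
              A_eq=np.vstack([A_marg, A_row]),
              b_eq=np.concatenate([p_S, np.ones(n)]),
              bounds=(0, None), method="highs")
T = res.x.reshape(n, n)
print("solved:", res.status == 0, "  expected squared displacement:", res.fun)
print("stegotext marginal matches covertext:", np.allclose(p_Xt @ T, p_S, atol=1e-6))
```

Both the constraint and the objective are linear in the transition probabilities, which is exactly why allowing stochastic rather than deterministic postprocessing turns the design into a linear program.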

This plot shows the capacities of the scaled randomized QIM and the stochastic QIM without a key. Indeed, the stochastic QIM with keys can improve the capacity.

Block-based embedding Partition the covertext into blocks of a fixed length. Embed in each block using the aforementioned steganographic methods. In practice, people often do block-based embedding: that is, we partition the covertext sequence into blocks and do the embedding using those steganographic schemes I just mentioned. For each block, the perfect-undetectability constraint is satisfied. But what about the whole sequence?

Detectability for Gaussian covertext For stationary Gaussian covertexts, perfect undetectability in each block doesn't mean undetectability for the whole sequence. Analysis for low distortions. For stationary Gaussian covertexts, we can write the covariance matrix of the stegotext as the covariance matrix of the covertext plus an additional matrix that has zero diagonal blocks and nonzero off-diagonal blocks. In the low-distortion case, we can approximate the relative entropy between the stegotext and the covertext in this form, which means that the detectability of block-based steganography increases almost linearly with the number of blocks.
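As a toy numerical check of the linear-growth claim, consider the simplifying assumption that per-block embedding preserves each block's Gaussian statistics but removes the correlation between blocks (one instance of the perturbation structure just described; the block length, the AR(1) covertext model, and the parameter values are mine):

```python
import numpy as np

def gauss_kl(cov_x, cov_s):
    """D( N(0, cov_x) || N(0, cov_s) ) for zero-mean Gaussian vectors."""
    n = cov_s.shape[0]
    inv_s = np.linalg.inv(cov_s)
    _, logdet_s = np.linalg.slogdet(cov_s)
    _, logdet_x = np.linalg.slogdet(cov_x)
    return 0.5 * (np.trace(inv_s @ cov_x) - n + logdet_s - logdet_x)

L, rho = 8, 0.7                      # block length and AR(1) correlation of the covertext

for k in (1, 2, 4, 8, 16):           # number of blocks
    idx = np.arange(k * L)
    cov_s = rho ** np.abs(idx[:, None] - idx[None, :])   # stationary AR(1) covertext covariance
    same_block = (idx[:, None] // L) == (idx[None, :] // L)
    # Per-block statistics preserved, cross-block correlation removed:
    cov_x = np.where(same_block, cov_s, 0.0)
    D = gauss_kl(cov_x, cov_s)
    print(f"blocks = {k:2d}   D = {D:8.4f}   D / blocks = {D / k:.4f}")
```

In this example D is proportional to the number of block boundaries, so D divided by the number of blocks approaches a constant, matching the almost-linear growth stated on the slide.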

Conclusions Perfect undetectability is achievable for Gaussian covertexts, using modified spread-spectrum or stochastic QIM schemes. Relative entropy increases almost linearly with the number of blocks. Finally, we reach our conclusions. I showed how to design perfectly undetectable steganographic schemes for Gaussian covertexts by modifying either spread-spectrum or QIM embedding methods. In particular, the stochastic QIM schemes can also be generalized to non-Gaussian covertexts. Then I talked about the detectability of block-based embedding. This concludes my talk. Thank you for your attention!