Information Theoretical Analysis of Digital Watermarking


Information Theoretical Analysis of Digital Watermarking Multimedia Security

Definitions:
X : the output of a source with alphabet 𝒳
W : a message in a discrete alphabet 𝒲 = {1, 2, …, M}
S : a random variable which indicates whether X will be watermarked
Assumption: 𝒳 is a discrete alphabet and X follows a discrete distribution.
The variable S is introduced in the model only to provide the possibility of expressing mathematically, in a simple way, the existence or non-existence of a watermark.

K : a secret key defined on a discrete alphabet 𝒦.
S = 1 (watermarked version): Y = f_1(X, W, K), the output of the watermarking function.
S = 0 (non-watermarked version): Y = X.
The output of the watermarking function depends on the value of K, a secret key which uniquely identifies the copyright owner.

General model of a watermarking system:
[Figure: the source output X, message W and key K enter the watermarking function f_S, producing Y; Y passes through the attack channel q, producing Z; Z is processed by the blocks ψ and g (detection/decoding, using K).]
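The slides do not fix a concrete embedding scheme, so the following is a minimal numerical sketch of the block diagram above under an assumed additive spread-spectrum model (a common choice, but not stated in the original): the key K seeds a pseudorandom ±1 pattern, the message W flips its sign, and detection/decoding correlate Z with the key pattern. All function names and parameters are illustrative.

```python
import random

def make_key_pattern(key, n):
    # Pseudorandom +/-1 spreading pattern derived from the secret key K.
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def f1(x, w, key, alpha=1.0):
    # Watermarking function (S = 1): additive spread-spectrum embedding.
    # The hidden message w in {0, 1} flips the sign of the key pattern.
    p = make_key_pattern(key, len(x))
    sign = 1.0 if w == 1 else -1.0
    return [xi + alpha * sign * pi for xi, pi in zip(x, p)]

def attack_channel(y, noise=0.5, seed=0):
    # Noisy channel q(z|y): models distortions and attacks, independent of K.
    rng = random.Random(seed)
    return [yi + rng.gauss(0.0, noise) for yi in y]

def correlate(z, key):
    # Detection/decoding statistic: normalized correlation with the key pattern.
    p = make_key_pattern(key, len(z))
    return sum(zi * pi for zi, pi in zip(z, p)) / len(z)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # source output X
y = f1(x, 1, key=42)                            # watermarked version Y
z = attack_channel(y)                           # received version Z
print(correlate(z, 42))  # large positive: watermarked with this key, w = 1
print(correlate(z, 7))   # near zero: wrong key
```

The correlation statistic concentrates near +alpha (or -alpha) for the correct key and near zero for any other key, which is what both the detection and decoding tests exploit below.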

The watermarked version Y then passes through a noisy channel and is transformed into Z. This channel models both unintentional distortions suffered by Y and attacks aimed at deleting or corrupting the watermark information. In both cases we assume that the secret key is not known, so the noisy channel can be defined by a distribution q(z|y) which is independent of K.

Finally, Z is processed to obtain a point which will be used by the recipient instead of X. There are two tests that can serve to verify the ownership of Z:
- the watermark detection test, used to obtain an estimate of S (i.e. to decide whether Z has been watermarked using k);
- the watermark decoding test, used to obtain an estimate of W.

Imperceptibility: Let d(·, ·) be a perceptually significant distortion measure. A watermarking system must guarantee that the functions f_1, ψ and g introduce imperceptible alterations with respect to X. With expectations taken w.r.t. X, W, K:

E[d(X, Y)] ≤ D_1 (mean distortion constraints)

or

d(X, Y) ≤ D_1 (maximum distortion constraints)

Hiding Information The performance of the watermark decoding process is measured by the probability of error, defined as P_e = Pr{Ŵ ≠ W}, where Ŵ is the decoded message.

For each value of K, the space 𝒴 is partitioned into M decision regions, where M is the number of possible hidden messages. Decoding errors are due to the uncertainty about the source output X from which the watermarked version was obtained.
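This can be illustrated numerically: in the toy binary spread-spectrum scheme below (M = 2; the scheme and all parameters are assumptions, not from the slides), the only randomness is the unknown source output X, matching the statement that decoding errors stem from uncertainty about X. The error rate P_e falls as the embedding strength grows.

```python
import random

def pattern(key, n):
    # Pseudorandom +/-1 pattern derived from the key.
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(x, w, key, alpha):
    # Additive embedding: the bit w selects the sign of the key pattern.
    p = pattern(key, len(x))
    s = 1.0 if w == 1 else -1.0
    return [xi + alpha * s * pi for xi, pi in zip(x, p)]

def decode(z, key):
    # Decision regions: the sign of the correlation splits the space
    # into M = 2 regions, one per possible hidden message.
    p = pattern(key, len(z))
    c = sum(zi * pi for zi, pi in zip(z, p))
    return 1 if c > 0 else 0

def error_rate(alpha, n=64, trials=2000, key=42, seed=0):
    # Monte Carlo estimate of P_e = Pr{decoded message != W}.  No channel
    # noise is added: errors come only from the unknown source output X.
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        w = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        errors += (decode(embed(x, w, key, alpha), key) != w)
    return errors / trials

print(error_rate(alpha=0.25))  # weak embedding: noticeable errors
print(error_rate(alpha=1.0))   # strong embedding: errors essentially vanish
```

Of course, raising alpha to reduce P_e is exactly what the imperceptibility constraint above forbids beyond D_1.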

Detecting the Watermark For each value of k, the watermark detection test can be mathematically defined as a binary hypothesis test in which we have to decide whether Z was generated by the distribution of the watermarked version f_1(X, W, k) or by the distribution of the non-watermarked version X, where W is modeled as a random variable.

Let Λ_k be the critical region for the watermark detection test performed with key k, i.e. the set of points in 𝒴 where S = 1 is decided for that key. The watermark detection test is completely defined by the sets {Λ_k : k ∈ 𝒦}.

The performance of the watermark detection test is measured by the probabilities of false alarm P_F and detection P_D, defined as P_F = Pr{decide S = 1 | S = 0} and P_D = Pr{decide S = 1 | S = 1}.

Suppose there is no distortion during distribution, so Z = Y. Optimizing the performance of the watermark detection test in terms of the probabilities of false alarm and detection is in a way equivalent to maximizing the Kullback-Leibler distance between the distributions of Z under S = 1 and under S = 0. The maximum achievable distance is limited by the perceptual distortion constraint and the entropy of the source.
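A tiny numerical illustration of this trade-off, using invented discrete distributions for Z (the alphabet and probabilities are hypothetical, chosen only for illustration): the more the watermarked distribution is allowed to move away from the unwatermarked one, the larger the Kullback-Leibler distance, and the better the achievable detection performance.

```python
import math

def kl(p, q):
    # Kullback-Leibler distance D(p || q) in bits between two pmfs
    # given as aligned lists of probabilities.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical pmf of Z for one key when no watermark is present (S = 0).
p_s0 = [0.25, 0.25, 0.25, 0.25]

# Watermarked pmfs (S = 1) with increasing embedding perturbation eps.
# Larger eps means more distortion, and a larger KL distance -- which is
# exactly what the perceptual distortion constraint caps.
for eps in (0.02, 0.05, 0.10):
    p_s1 = [0.25 - eps, 0.25 - eps, 0.25 + eps, 0.25 + eps]
    print(eps, kl(p_s1, p_s0))
```

By Stein's lemma, a larger KL distance lets the miss probability decay faster for a fixed false-alarm level, which is the sense in which maximizing this distance optimizes the detection test.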

The probability of collision between keys k_1 and k_2 is the probability of deciding S = 1 in the watermark detection test for a certain key k_1 when Z has been watermarked using a different key k_2. In the context of copyright protection, this probability should be constrained below a maximum allowed value for all pairs (k_1, k_2), since otherwise the author in possession of k_1 could claim authorship of information watermarked by the author who owns k_2.

This constraint imposes a limit on the cardinality of the key space, since the minimum achievable maximum probability of collision between keys increases with the number of keys for a fixed detection performance.
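Why more keys mean more collisions can be seen with a small simulation (the correlation detector, threshold, and all parameters below are invented for illustration): content is watermarked with one key, and we ask whether the detector fires for any of the other keys in circulation. Roughly by a union bound over pairs, the chance of at least one collision grows with the number of keys.

```python
import random

def pattern(key, n):
    # Pseudorandom +/-1 pattern derived from a key.
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def detect(z, key, thresh):
    # Fires (decides S = 1) when the normalized correlation exceeds thresh.
    p = pattern(key, len(z))
    return sum(zi * pi for zi, pi in zip(z, p)) / len(z) > thresh

def any_collision(n_keys, n=128, thresh=0.2, seed=0):
    # Content watermarked with key 0; does the detector fire for ANY of
    # the other n_keys keys?  That is a collision event.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    p0 = pattern(0, n)
    z = [xi + pi for xi, pi in zip(x, p0)]
    return any(detect(z, k, thresh) for k in range(1, n_keys + 1))

def collision_rate(n_keys, trials=100):
    # Monte Carlo estimate of the probability of at least one collision.
    return sum(any_collision(n_keys, seed=t) for t in range(trials)) / trials

print(collision_rate(5))    # few keys: collisions are occasional
print(collision_rate(500))  # many keys: a collision somewhere is near-certain
```

Tightening the threshold reduces collisions but also reduces P_D, which is the trade-off behind the limit on the key space.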

Attacks In the following discussion we will assume that the attacker has unlimited computational power and that the algorithms for watermarking, detection and decoding are public. The security of the watermarking system relies exclusively on the secret key K of the copyright owner.

The Elimination Attack The attacker alters a watermarked source output Y so as to obtain a negative result in the watermark detection test for the secret key used by the legitimate owner. The alteration made by the attacker should not be perceptible, since the resulting output Z will be used as a substitute for the watermarked source output Y.

This constraint can be expressed in mathematical form as an average distortion constraint E[d(Y, Z)] ≤ D_a, or as a maximum distortion constraint d(Y, Z) ≤ D_a, where d(·, ·) is a distortion function and D_a is the maximum distortion allowed by the attacker.

The elimination attack can be represented by a game-theoretic model: given a certain watermarked source output Y, the attacker will choose the point Z, subject to the distortion constraint, which maximizes his probability of success.

Under a maximum distortion constraint, this maximum probability of success for a given Y is obtained by maximizing over all points within the allowed distortion of Y; after averaging over Y, we obtain the average probability of success in the elimination attack.
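A blind attacker, who by assumption does not know K, cannot aim at the watermark pattern and can only spend a distortion budget and hope to push the detection statistic below threshold. The sketch below (detector, attack and all parameters are invented for illustration) estimates the attacker's success probability under a per-sample maximum distortion constraint, and shows it rising with the allowed distortion.

```python
import random

def pattern(key, n):
    # Pseudorandom +/-1 pattern derived from the secret key.
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def detect(z, key, thresh=0.5):
    # Decides S = 1 when the normalized correlation exceeds thresh.
    p = pattern(key, len(z))
    return sum(zi * pi for zi, pi in zip(z, p)) / len(z) > thresh

def attack(y, budget, seed):
    # Blind elimination attack under a maximum distortion constraint
    # |z_i - y_i| <= budget: without K the attacker can only spend the
    # budget on random perturbations.
    rng = random.Random(seed)
    return [yi + rng.choice((-budget, budget)) for yi in y]

def success_rate(budget, n=64, trials=500, key=42):
    # Fraction of trials where the detector misses after the attack.
    wins = 0
    for t in range(trials):
        rng = random.Random(t)
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        p = pattern(key, n)
        y = [xi + pi for xi, pi in zip(x, p)]
        wins += (not detect(attack(y, budget, seed=10_000 + t), key))
    return wins / trials

for budget in (0.5, 2.0, 4.0):
    print(budget, success_rate(budget))
```

This mirrors the statement below that the minimum achievable detection performance is non-increasing in the distortion the attacker may introduce.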

We can model the transformation made by the attacker as a channel with conditional pdf q(z|y). Then the optimal elimination strategy can be seen as a worst-case channel, in the sense that it minimizes the probability of detection for the given critical regions and watermarking function. Note that the attacker is limited to those channels which satisfy the average distortion constraint.

The minimum achievable probability of detection is a non-increasing function of the distortion allowed to the attacker. The optimum watermarking strategy consists in choosing the watermarking function and the critical regions that maximize the minimum probability of detection achievable by the attacker through the choice of a channel q(z|y). Hence, the design of the watermarking system is a robust hypothesis testing problem.

The Corruption Attack The attacker is not interested in eliminating the watermark, but in increasing the probability of error in the watermark decoding process.

Cryptographic Security The security level of the system can be measured by the uncertainty about the key given a watermarked source output Y. In information-theoretic terminology, this uncertainty is the conditional entropy H(K|Y), also called the equivocation.
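The equivocation can be computed directly for a toy joint distribution of key and watermarked output (the two joint pmfs below are entirely invented for illustration): a secure system keeps H(K|Y) at H(K), i.e. observing Y reveals nothing about the key, while a leaky one drives it to zero.

```python
import math
from collections import defaultdict

def equivocation(joint):
    # H(K|Y) = -sum_{k,y} p(k, y) log2 p(k|y): the attacker's remaining
    # uncertainty about the key after observing the watermarked output Y.
    # `joint` maps (k, y) pairs to probabilities p(k, y).
    p_y = defaultdict(float)
    for (k, y), p in joint.items():
        p_y[y] += p
    return -sum(p * math.log2(p / p_y[y])
                for (k, y), p in joint.items() if p > 0)

# Toy secure case: 2 equiprobable keys, binary source, Y = X XOR K.
# Both keys induce the same distribution of Y, so Y reveals nothing
# and H(K|Y) = H(K) = 1 bit.
joint_secure = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

# Toy leaky case: each key deterministically produces a distinct Y,
# so observing Y identifies the key and H(K|Y) = 0.
joint_leaky = {(0, 0): 0.5, (1, 1): 0.5}

print(equivocation(joint_secure))
print(equivocation(joint_leaky))
```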

Size of Key Space A minimum cardinality of the key space 𝒦 is a necessary condition for achieving a given equivocation H(K|Y). Increasing the equivocation helps in increasing the robustness against elimination attacks. However, increasing the number of available keys also increases the probability of collision among keys. Therefore, if we specify a maximum allowable probability of collision, this constraint will impose a limit on the maximum number of keys.

Summary
- Decoding of the hidden information is affected by uncertainty due to the source output (not available at the receiver), distortion and attacks.
- We can think of a channel between W and Z which can be characterized by a certain capacity.
- Watermarking and watermark detection under a constrained maximum probability of collision between keys can be seen as an application of identification via channels, with additional constraints derived from the limited admissible perceptual distortion in the watermarking process.
- The combination of watermark detection and data hiding can be related to the theory of identification plus transmission codes.