Accusation probabilities in Tardos codes Antonino Simone and Boris Škorić Eindhoven University of Technology CWG, Dec 2010.



Outline
Introduction to forensic watermarking
  ◦ Collusion attacks
  ◦ Aim
  ◦ Attack models
Tardos scheme
  ◦ Code length history
  ◦ q-ary version
  ◦ Properties
New parameterization
Majority voting effect
Performance of the Tardos scheme
  ◦ False accusation probability
Results & Summary

Forensic Watermarking
[Figure: block diagram with an Embedder and a Detector. Labels: original content, payload, WM secrets, content with hidden payload, ATTACK]
Payload = some secret code identifying the recipient

Collusion attacks
"Coalition of pirates"
[Figure: copies of pirates #1, #2, #3 and #4 next to the attacked content; marked positions = "detectable positions"]

Aim
Trace at least one pirate from the detected watermark
BUT
  ◦ Resist large coalitions → longer code
  ◦ Low probability of innocent accusation (FP) (critical!) → longer code
  ◦ Low probability of missing all pirates (FN) (not critical) → longer code
AND
  ◦ Limited bandwidth available for the watermarking code

Attack models
Once pirates detect watermark positions, what can they do?
1. Restricted digit model
  ◦ Choice from available symbols only
2. Unreadable digit model
  ◦ Erasure allowed
3. Arbitrary digit model
  ◦ Arbitrary symbol (but not erasure)
4. General digit model
Example: Alphabet = {A,B,C,D}, pirate copies AABD, BABB, AACA ("?" = erasure). Allowed outputs in positions 1 to 4:
  1. Restricted: AB, A, BC, ABD
  2. Unreadable: ?AB, A, ?BC, ?ABD
  3. Arbitrary: ABCD, A, ABCD, ABCD
  4. General: ?ABCD, A, ?ABCD, ?ABCD
Figure annotations: "More realistic scenario"; "Simpler to analyze"; "equivalent for binary symbols"
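
The four attack models can be illustrated with a short Python sketch that reproduces the example above. The function names and the "?" erasure marker are illustrative choices, and the rule that an undetectable position (all pirates hold the same symbol) must be output unchanged is the usual marking assumption, which is what position 2 of the example shows.

```python
ERASURE = "?"  # erasure marker, matching the "?" in the slide example
MODELS = ("restricted", "unreadable", "arbitrary", "general")


def allowed_outputs(segment_symbols, alphabet, model):
    """Return the set of symbols a coalition may output in one segment."""
    if model not in MODELS:
        raise ValueError(f"unknown attack model: {model}")
    seen = set(segment_symbols)
    if len(seen) == 1:
        # Undetectable position: the pirates cannot tell it carries a mark.
        return seen
    # Arbitrary/general models allow any alphabet symbol, the others only seen ones.
    choices = set(alphabet) if model in ("arbitrary", "general") else set(seen)
    # Unreadable/general models additionally allow an erasure.
    if model in ("unreadable", "general"):
        choices.add(ERASURE)
    return choices


def per_segment_choices(copies, alphabet, model):
    """Apply allowed_outputs to every segment (column) of the pirate copies."""
    return ["".join(sorted(allowed_outputs(seg, alphabet, model)))
            for seg in zip(*copies)]


if __name__ == "__main__":
    copies = ["AABD", "BABB", "AACA"]
    for model in MODELS:
        print(model, per_segment_choices(copies, "ABCD", model))
```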

Code length history
Entries (formulas not preserved): Construction; Boneh and Shaw 1998; Tardos 2003; Chor et al 2000; Staddon et al 2001; Huang + Moulin, Amiri + Tardos 2009; Lower bound
Notation:
  c_0 = #pirates
  n = #users
  m = code length in symbols
  q = alphabet size
  ε_1 = Prob[accuse specific innocent]
  η = Prob[not all accused are guilty]
  ε_2 = False Negative prob.

q-ary Tardos scheme (2008)
  ◦ Arbitrary alphabet size q
  ◦ Dirichlet distribution F
  ◦ Symbol-symmetric
[Figure: code matrix of embedded symbols with n users and m content segments; symbol biases p_iA, p_iB, p_iC (i = 1, ..., m) drawn from distribution F; c pirates; symbols allowed; watermark after attack = y]
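
A minimal sketch of how such a code matrix could be generated, using only the Python standard library. The concentration parameter `kappa` of the symmetric Dirichlet distribution and all function names are assumptions for illustration; the slide only states that the biases are drawn from a Dirichlet distribution F.

```python
import random


def draw_bias_vector(q, kappa, rng):
    """Draw (p_1, ..., p_q) from a symmetric Dirichlet(kappa, ..., kappa).

    A Dirichlet sample is obtained by normalizing independent Gamma draws.
    """
    gammas = [rng.gammavariate(kappa, 1.0) for _ in range(q)]
    total = sum(gammas)
    return [g / total for g in gammas]


def generate_code(n_users, m_segments, alphabet, kappa, seed=0):
    """Return (biases, code): one bias vector per segment, one codeword per user."""
    rng = random.Random(seed)
    symbols = list(alphabet)
    biases = [draw_bias_vector(len(symbols), kappa, rng)
              for _ in range(m_segments)]
    # Each user's symbol in segment i is drawn independently with bias vector i.
    code = [[rng.choices(symbols, weights=biases[i])[0]
             for i in range(m_segments)]
            for _ in range(n_users)]
    return biases, code


if __name__ == "__main__":
    biases, code = generate_code(n_users=7, m_segments=4, alphabet="ABC", kappa=0.5)
    for word in code:
        print("".join(word))
```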

Tardos scheme continued
Accusation:
  ◦ Every user gets a score
  ◦ User is accused if score > threshold
  ◦ Sum of scores per content segment
  ◦ Given that pirates have y in segment i: [score formula not preserved]
  ◦ Symbol-symmetric
[Figure: plots of g_0(p) and g_1(p) versus p]
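
The accusation rule can be sketched as follows. The slide's score formula is not preserved, so this assumes the functions commonly used in the symmetric q-ary Tardos literature, g_1(p) = sqrt((1-p)/p) and g_0(p) = -sqrt(p/(1-p)): a user scores g_1(p_y) in a segment where their symbol equals the pirate output y, and g_0(p_y) otherwise. All names below are illustrative.

```python
import math


def g1(p):
    """Score when the user's symbol matches the pirate output (assumed form)."""
    return math.sqrt((1.0 - p) / p)


def g0(p):
    """Score when the user's symbol differs from the pirate output (assumed form)."""
    return -math.sqrt(p / (1.0 - p))


def total_score(user_word, pirate_word, biases):
    """Sum the per-segment scores of one user.

    biases[i] maps each symbol to its bias in segment i; only the bias of
    the pirate output symbol y is used.
    """
    score = 0.0
    for x, y, p in zip(user_word, pirate_word, biases):
        score += g1(p[y]) if x == y else g0(p[y])
    return score


def is_accused(user_word, pirate_word, biases, threshold):
    """A user is accused when the total score exceeds the threshold."""
    return total_score(user_word, pirate_word, biases) > threshold
```

With these functions an innocent user, whose symbol equals y with probability p_y, has a per-segment score of mean 0 and variance 1, which is why thresholds are often quoted in units of √m.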

Properties of the Tardos scheme
Asymptotically optimal
Random code book
No framing
  ◦ No risk to accuse innocent users if coalition is larger than anticipated
F, g_0 and g_1 chosen 'ad hoc' (can still be improved)

Accusation probabilities
m = code length
c = #pirates
μ̃ = expected coalition score per segment
Pirates want to minimize μ̃ and lengthen the innocent tail
Curve shapes depend on:
  ◦ F, g_0, g_1 (fixed 'a priori')
  ◦ Code length
  ◦ # pirates
  ◦ Pirate strategy
Central Limit Theorem → asymptotically Gaussian shape (how fast?)
2003 → 2010: innocent accusation curve shape unknown… till now!
[Figure: innocent and guilty curves versus total score (scaled), with the threshold marked]

New parameterization
A new parameterization is necessary!
  ◦ K_b = quantity that depends on the pirate strategy
  ◦ K_b can be pre-computed
Which strategy minimizes μ̃?
Symbol-symmetric → only the symbol occurrence counts matter
  ◦ σ = pirate occurrences vector
  ◦ σ_α = # α in segment
  ◦ c pirates → Σ_α σ_α = c
[Figure: plot of W(b) versus b]

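Some attack definitions
Majority voting
  ◦ y_i = symbol that occurs most in segment i
Interleaving attack
  ◦ Prob[y_i = α] = σ_α/c
Example: pirate copies AABD, BABB, AACA; available symbols per position: AB, A, BC, ABD
  ◦ Position 1: majority voting → A; interleaving → P[A]=2/3, P[B]=1/3
  ◦ Position 2: majority voting → A; interleaving → A
  ◦ Position 3: majority voting → B; interleaving → P[B]=2/3, P[C]=1/3
  ◦ Position 4: majority voting → P[A]=1/3, P[B]=1/3, P[D]=1/3; interleaving → P[A]=1/3, P[B]=1/3, P[D]=1/3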
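
Both strategies depend only on how often each symbol occurs among the pirates in a segment, so they can be sketched as functions from the occurrence counts to an output distribution. The function names are illustrative, and breaking majority ties uniformly at random is an assumption, chosen because it matches position 4 of the example.

```python
from collections import Counter
from fractions import Fraction


def interleaving_distribution(segment_symbols):
    """Prob[y = alpha] = (number of pirates holding alpha) / c."""
    counts = Counter(segment_symbols)
    c = len(segment_symbols)
    return {symbol: Fraction(k, c) for symbol, k in counts.items()}


def majority_voting_distribution(segment_symbols):
    """Output the most frequent symbol; ties are split uniformly (assumption)."""
    counts = Counter(segment_symbols)
    top = max(counts.values())
    winners = [symbol for symbol, k in counts.items() if k == top]
    return {symbol: Fraction(1, len(winners)) for symbol in winners}


if __name__ == "__main__":
    copies = ["AABD", "BABB", "AACA"]
    for position, segment in enumerate(zip(*copies), start=1):
        print(position,
              majority_voting_distribution(segment),
              interleaving_distribution(segment))
```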

Majority voting
Theorem: Majority voting strategy minimizes μ̃
Proof (intuitive):
  ◦ Case 1: only 2 symbols detected
[Figure: W(b) versus b for c=19, with the best choice marked]

Majority voting
Theorem: Majority voting strategy minimizes μ̃
Proof (intuitive):
  ◦ Case 2: more than two symbols detected, one symbol occurs more than c/2 times
[Figure: W(b) versus b for c=19, with the best choice marked]

Majority voting
Theorem: Majority voting strategy minimizes μ̃
Proof (intuitive):
  ◦ Case 3: more than two symbols detected, all symbols occur fewer than c/2 times
[Figure: W(b) versus b for c=19, with the best choice marked]

Innocent curve behaviour
Motivations:
  ◦ Most critical part in the Tardos scheme (FP ≈ )
  ◦ Still unknown
  ◦ Unknown innocent curve → unknown real code length
  ◦ Is Gaussian approximation good?
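
For reference, the Gaussian approximation that the exact result is compared against can be written in a few lines. This sketch assumes the innocent per-segment score has mean 0 and variance 1 (which holds for the score functions usually chosen in Tardos schemes), so the total score over m segments is approximated by a normal distribution with variance m. The function name is illustrative.

```python
import math


def gaussian_fp(threshold, m):
    """Gaussian estimate of Prob[innocent total score > threshold].

    Assumes a total score with mean 0 and variance m, so the estimate
    depends only on threshold / sqrt(m).
    """
    return 0.5 * math.erfc(threshold / math.sqrt(2.0 * m))


if __name__ == "__main__":
    m = 1000
    for z_scaled in (1.0, 2.0, 4.0, 6.0):
        fp = gaussian_fp(z_scaled * math.sqrt(m), m)
        print(f"threshold/sqrt(m) = {z_scaled}: log10 FP = {math.log10(fp):.2f}")
```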

Approach
Fourier transform property: [equation not preserved]
Steps:
1. S = Σ_i S_i, where S_i is the score in segment i; pdf of total score S = InverseFourier[ ]
2. [not preserved]
3. Compute [expression not preserved]
  ◦ Depends on strategy
  ◦ New parameterization for attack strategy
4. Compute [expression not preserved]
5. Taylor
Trouble doing numerics (integral does not converge)
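
The idea behind step 1 is that the pdf of a sum of independent scores is the convolution of the per-segment pdfs, which becomes a plain product in the Fourier domain. The toy sketch below is not the method of the talk: it uses direct convolution instead of Fourier transforms, and it assumes a fixed bias of 1/2 so that the innocent per-segment score is +1 or -1 with equal probability. It only illustrates how an exact tail probability follows from the per-segment distribution.

```python
def convolve(pdf_a, pdf_b):
    """Distribution of the sum of two independent discrete scores."""
    out = {}
    for score_a, prob_a in pdf_a.items():
        for score_b, prob_b in pdf_b.items():
            key = score_a + score_b
            out[key] = out.get(key, 0.0) + prob_a * prob_b
    return out


def total_score_pdf(segment_pdf, m):
    """Convolve the per-segment distribution with itself m times."""
    total = {0: 1.0}  # distribution of an empty sum
    for _ in range(m):
        total = convolve(total, segment_pdf)
    return total


def tail_probability(pdf, threshold):
    """Prob[total score > threshold], i.e. the false accusation probability."""
    return sum(prob for score, prob in pdf.items() if score > threshold)


if __name__ == "__main__":
    toy_segment_pdf = {+1: 0.5, -1: 0.5}  # assumed toy distribution
    pdf = total_score_pdf(toy_segment_pdf, m=100)
    print(tail_probability(pdf, threshold=30))
```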

Main result: false accusation probability curve
Example: interleaving attack
[Figure: log_10 FP versus threshold/√m; curves: exact FP and result from Gaussian]

Main result: false accusation probability curve
Example: interleaving attack
Better than Gaussian!
Conclusion: Gaussian approximation is worse for larger q

Main result: false accusation probability curve
Example: majority voting attack
[Figure: log_10 FP versus threshold/√m; curves: exact FP and result from Gaussian]
FP is about 70 times less than the Gaussian approximation in this example
But → code 2-5% shorter than predicted by the Gaussian approximation

Summary
Results:
  ◦ introduced a new parameterization of the attack strategy
  ◦ majority voting minimizes μ̃
  ◦ first to compute the innocent score pdf
    ◦ quantified how close FP probability is to Gaussian
    ◦ sometimes better than Gaussian!
    ◦ safe to use Gaussian approx
    ◦ larger q → Gaussian approximation less accurate
Future work:
  ◦ study more general attacks
  ◦ different parameter choices
Thank you for your attention!