Multimedia Data Hiding: What, Why, and How? Kaushal Solanki Vision Research Laboratory University of California, Santa Barbara Contact info: solanki@ece.ucsb.edu http://vision.ece.ucsb.edu
Outline Preview Introduction: What is Data Hiding? Applications: Why Data Hiding? Techniques: How to Hide Data? Principles State of the art Challenges 20/11/2004
The “digital” age has come… For a moment, sit back and think about how much our lives have changed in the past decade alone.
Telecommunication: “Then”
Digital age telecommunication
Information access: “Then”
Digital age Information access
We are migrating from analog to digital… Digital cameras are replacing their analog counterparts. Audio CDs are replacing cassettes. DVDs are replacing video cassettes. Digital video cameras are replacing analog ones. Digital high-definition TVs have a very good future.
The endangered species…
The future…
The digital representation Many advantages: Lossless and convenient to store. Lossless and convenient transmission. Lossless and easy to make a copy. Extremely easy to share across many users. Easy to edit. Durable, inexpensive and easily searchable retrieval.
But … Some advantages have a dark side: Piracy is very easy. Sharing and peer-to-peer transfer of multimedia and software (Kazaa, Limewire, Morpheus, Napster, etc.). Unreliable: Cannot be used as evidence in court. Can be misused to create false impressions.
A potential Solution Multimedia Data Hiding Idea: Imperceptibly embed copyright or other information into the multimedia data. Challenge: Maintain imperceptibility while embedding sufficient information and remaining robust against manipulations.
Multimedia Data Hiding: An Emerging Field Very young and growing field. Well over 90% of publications appeared in the past 7 years. Why so? Digital pictures, video, and sound tracks are gaining popularity. Widespread illegal copying and piracy. Emergence of peer-to-peer file sharing networks.
Multimedia Data Hiding: A multidisciplinary field Image and signal processing. Cryptography. Communication theory. Error correction coding. Signal compression. Visual perception theory. Information theory.
Ever heard of… Steganography. Watermarking. Fingerprinting. Authentication. Information hiding, or information embedding. Data embedding.
What is Data Hiding? Data hiding is the process by which a message signal, or signature, is imperceptibly embedded within a host data set to form a composite signal.
A data hiding system [Block diagram: secret message + multimedia “host” → embedding algorithm → composite data → attack channel → decoding → secret message; a secret “key” is part of the system.]
Design Issues Transparency: No perceptual degradation. Robustness: Survive benign or malicious attacks. Volume: Embed as much information as possible. Blind/Non-Blind: Whether decoding requires the original host. Performance: The encoding/decoding complexity.
Other Design Issues Undetectability: Significant for steganographic applications. How much information can I transmit without it being detected? Graceful degradation: Less severe attack => More information recovered. Not considered widely in the literature!
Tradeoffs [Figure: tradeoff diagrams; labels: robustness, undetectability, capacity, performance.]
Applications: Why Data Hiding? Steganography. Copyright protection. Document authentication. Annotation of medical images. Tamper detection and localization. Seamless upgrade of multimedia. Secure communication in mobile environments. Embedding meta data into multimedia.
Applications(I): Steganography “Steganography is the art and science of communicating in a way that hides the existence of the communication itself.” Prisoners’ problem, Simmons [1983]: Alice and Bob are in jail and wish to hatch an escape plan. All their communications pass through warden Willie, who would frustrate their plan by throwing them into solitary confinement if he detects any hidden communication between them.
Applications(I): Steganography [Figure: tradeoff diagram with labels robustness, undetectability, and capacity; secure steganography and naïve steganography are marked on it.]
Applications(II): Digital Watermarking Copyright protection: Owner identification and proof of ownership. Digital fingerprinting: Hide unique signatures in different copies of the same content before distribution. Design issues: Robustness to a variety of malicious attacks. Transparency: preserve the ‘value’ of the content.
Applications(II): Digital Watermarking [Figure: tradeoff diagram with labels robustness, undetectability, and capacity; digital watermarking, secure steganography, and naïve steganography are marked on it.]
Applications (III): Document Authentication Embedding information robustly in documents to authenticate them. Can be used for authenticating passports, identification cards, etc. Design issues: Robustness to print-scan, with control over printing. Fast decoding.
Applications (IV): Tamper Detection and Localization Need protection against tampering of “evidence” within a forensic image. An appropriately designed hiding system can decode the hidden information robustly and localize the tampered area. Design issues: Robustness against tampering (of course!). Transparency.
Tamper Localization Example 6,301 bits hidden against image tampering. After decoding is performed, the tampered region can be automatically localized.
Applications (V): Other applications Seamless upgrade of multimedia. Secure communication in mobile environments. Embedding meta data in multimedia. Broadcast monitoring. We believe that many other applications will emerge, driven by the availability of the technology.
Broad Classification Least Significant Bit (LSB) modulation: Modify the LSB with the data to be embedded. Spread Spectrum (SS) modulation: Involves adding a spread version of the data. Quantization Index Modulation (QIM): Hide data in the choice of vector quantizer.
LSB Hiding Modify the least significant bits of the image samples. Can be done in either the spatial or the transform domain. Advantages: Very simple method – low complexity. Virtually no perceptual distortion (spatial LSB hiding). Disadvantages: Not robust against attacks. Can be easily “detected”.
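A minimal sketch of spatial-domain LSB hiding, assuming the host is a flat list of 8-bit pixel values and the message is a list of bits. The function names `embed_lsb` and `extract_lsb` and the sample values are illustrative, not taken from the talk.

```python
def embed_lsb(pixels, bits):
    """Return a copy of `pixels` whose first len(bits) LSBs carry `bits`."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the host")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read back the LSBs of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


host = [52, 55, 61, 66, 70, 61, 64, 73]      # hypothetical pixel values
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(host, message)
recovered = extract_lsb(stego, len(message))

# A brightness shift of +1 is invisible, yet it flips every hidden bit.
attacked = [p + 1 for p in stego]
damaged = extract_lsb(attacked, len(message))
```

Each pixel changes by at most one gray level, which is why the distortion is virtually imperceptible. The last two lines illustrate the fragility noted on the slide: a trivial modification of the image destroys the message.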
Spread Spectrum Hiding Probably the most widely used watermarking method. Add a spread version of the watermark to the host image, either in the spatial or the transform domain. Advantages: Robust against various attacks. Allows perceptual shaping. Disadvantages: Blind detection is possible, but difficult, because the host signal interferes with detection. Low capacity.
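A rough sketch of additive spread spectrum embedding of a single bit, assuming a key-seeded pseudo-random ±1 chip sequence and non-blind detection (the detector has the original host and subtracts it). The names `spreading_sequence`, `embed_ss`, `detect_ss_nonblind`, the `strength` parameter, and the noise model are all assumptions made for illustration.

```python
import random


def spreading_sequence(key, n):
    """Pseudo-random +/-1 chips derived from a shared secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed_ss(host, bit, key, strength=2.0):
    """Add (bit 1) or subtract (bit 0) the scaled chip sequence."""
    chips = spreading_sequence(key, len(host))
    sign = 1.0 if bit else -1.0
    return [h + strength * sign * c for h, c in zip(host, chips)]


def detect_ss_nonblind(received, host, key):
    """Remove the known host, then correlate the residual with the chips."""
    chips = spreading_sequence(key, len(host))
    correlation = sum((r - h) * c for r, h, c in zip(received, host, chips))
    return 1 if correlation > 0 else 0


demo_rng = random.Random(0)
host = [demo_rng.uniform(0, 255) for _ in range(64)]  # hypothetical host samples
for bit in (0, 1):
    marked = embed_ss(host, bit, key=1234)
    # Bounded additive noise stands in for a mild attack.
    attacked = [m + demo_rng.uniform(-1, 1) for m in marked]
    assert detect_ss_nonblind(attacked, host, key=1234) == bit
```

One bit is spread over all 64 samples, which is where both the robustness and the low capacity come from. A blind detector would correlate `received` directly with the chips, so the host term would enter the correlation as interference.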
QIM Hiding Hide data in the choice of quantizer. Provably “good”, under certain assumptions [Chen & Wornell 2001]. Advantages: Blind decoding comes “naturally” (no host signal interference). Robust against various attacks. High capacity constructions possible. Disadvantages: Perceptual adaptation is not straightforward. Steganalysis has been shown to be easier than for SS.
Embed data in choice of vector quantizers Red Quantizer: Embed 0. Blue Quantizer: Embed 1.
Scalar QIM
Scalar QIM Example Consider the case of odd-even quantizers. Quantization to an odd value encodes a ‘1’, and to an even value a ‘0’. Let us assume that the host coefficient value is 6.729. To embed ‘1’, we send the nearest odd number, i.e., 7.0. Likewise, to embed ‘0’, we send the nearest even number, 6.0.
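The odd-even example can be sketched in a few lines, assuming a unit step size so that the two quantizers are the even and the odd integers. The function names `embed_odd_even` and `decode_odd_even` are illustrative.

```python
def embed_odd_even(x, bit):
    """Quantize x to the nearest odd integer (bit 1) or even integer (bit 0)."""
    # Shift by the bit, quantize with step 2, then shift back.
    return 2 * round((x - bit) / 2) + bit


def decode_odd_even(y):
    """Blind decoding: the parity of the nearest integer is the hidden bit."""
    return round(y) % 2


host_coefficient = 6.729
sent_one = embed_odd_even(host_coefficient, 1)   # 7
sent_zero = embed_odd_even(host_coefficient, 0)  # 6

# Decoding needs no host, and tolerates perturbations smaller than 0.5.
bit_from_noisy_one = decode_odd_even(sent_one + 0.4)
bit_from_noisy_zero = decode_odd_even(sent_zero - 0.4)
```

The decoder never sees 6.729, which is the sense in which blind decoding comes naturally for QIM. Python's `round` resolves exact ties to the even neighbor; at a tie both candidates are equally close, so either choice is a valid quantization.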
“Well known” Principles Embed in transform domain: DCT, Wavelet, DFT, etc. Key-dependent transformation. Embed in high-variance bands [Moulin & Mihcak]. Embed in low and mid frequency bands. Use error correction codes for achieving robustness.
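To illustrate the last principle, here is a toy sketch of error correction using a rate-1/3 repetition code with majority-vote decoding. The talk does not say which codes are used in practice; the repetition code is chosen here only because it is the simplest possible example, and the names `repeat_encode` and `repeat_decode` are hypothetical.

```python
def repeat_encode(bits, n=3):
    """Repeat every message bit n times before embedding."""
    return [b for b in bits for _ in range(n)]


def repeat_decode(coded, n=3):
    """Majority vote over each group of n received bits."""
    return [1 if 2 * sum(coded[i:i + n]) > n else 0
            for i in range(0, len(coded), n)]


message = [1, 0, 1]
coded = repeat_encode(message)        # [1, 1, 1, 0, 0, 0, 1, 1, 1]

# Suppose an attack flips one embedded bit in each of the first two groups.
received = list(coded)
received[1] ^= 1
received[5] ^= 1
decoded = repeat_decode(received)
```

The message survives one flipped bit per group, at the cost of embedding three times as many bits, which is the robustness versus capacity tradeoff again.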
Information-theoretic results Data hiding can be considered as communication with side information at the encoder. Message = data to be transmitted. Channel = “attack”. Side information = host. Writing on dirty paper [Costa `82]: Showed that the capacity with the interference known at the encoder as side information is the same as the capacity with no interference at all. That is, ‘no host interference’ is possible.
Summary of Info. theoretic results Formulation: D1 denotes the mean squared embedding-induced distortion. D2 denotes the mean squared attack distortion. Data hiding capacity with an AWGN attack, termed Vector Capacity, is (in the regime of small D1 and D2): [Costa]
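The expression itself is missing from this slide. For reference, Costa's dirty-paper result for the Gaussian case is usually written as below, with the embedding distortion D1 playing the role of the transmit power and the attack distortion D2 the role of the noise power. This is an assumption about what the slide displayed, not a recovered formula.

```latex
C = \frac{1}{2}\,\log_2\!\left(1 + \frac{D_1}{D_2}\right) \quad \text{bits per host sample}
```

The host variance does not appear in the expression, which is the ‘no host interference’ property: only the ratio of embedding distortion to attack distortion matters.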
State of the art Steganography and Steganalysis: Techniques that are undetectable using histograms have been developed. Stochastic QIM [Moulin et al `04]. Steganalysis of block-DCT hiding [Fridrich et al]. Image Data Hiding: Robust image-adaptive hiding [Solanki et al `04]. Reversible data hiding. Image-in-image hiding [Solanki et al `03].
State of the art (II) Surviving Geometric Attacks: Rotation, Scale, and Translation (RST) invariant spread spectrum data hiding [Ruanaidh and Pun `98]. Use tessellation points in images [Bas et al `02]. Print-and-scan resilient data hiding: Modeling spatial and geometric distortions separately [Lin et al `99]. SELF: Selective embedding in low frequencies scheme [Solanki et al `04].
Challenges Ahead Steganography: Creating methods that use “memory”. Steganalysis: Use continuity in images to detect low rate hiding. Digital watermarking: Consistently surviving geometric attacks, such as StirMark random bending. Survive print-scan reliably. Tuning data hiding schemes for specific applications, such as tamper detection.
Thank You