Audio and Video Watermarking Joseph Huang & Weechoon Teo Mr. Pirate
What is watermarking? Permanent proof of originality for paper media. Permanent proof of ownership for digital media. Unlike encryption, watermarking preserves intellectual property. Watermarking is statistically and physically invisible (it uses pseudorandom noise, PRN). Watermarks can be detected even after distortions. Watermarking is done in the frequency, temporal, and/or spatial domains.
Audio Watermarking Robustness: Watermark has to be robust to signal manipulation. Impossible to remove without significant alteration of the signal. Statistically undetectable by others to prevent the efforts of unauthorized removal. Can be fulfilled if the potential number of keys that produce distinct watermarks is large. Detection scheme should be as statistically reliable as possible. False rejection or acceptance of watermark should be minimal.
Audio Watermarking: A Temporal Method, p.1 Does not require original signal for the detection of watermark. Requires only a seed or key. Watermark is embedded into the audio signal by changing the least significant bits of the 16-bit or 8-bit audio samples. Results only in slight amplitude modification in the time domain.
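A minimal sketch of the idea on this slide, assuming a simplified scheme in which each 16-bit sample is nudged by plus or minus one quantization step according to a seeded pseudorandom sequence. The helper names `make_watermark` and `embed` and the `strength` parameter are hypothetical, and the perceptual shaping described on the next slide is left out.

```python
import random


def make_watermark(seed, n):
    """Hypothetical helper: n pseudorandom chips in {-1, +1} derived from the seed."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(n)]


def embed(samples, seed, strength=1):
    """Add +/-strength to each 16-bit sample, clipping to the valid range.

    With strength=1 only the least significant bits change, so the
    amplitude modification in the time domain stays small.
    """
    w = make_watermark(seed, len(samples))
    return [max(-32768, min(32767, x + strength * wi))
            for x, wi in zip(samples, w)]


if __name__ == "__main__":
    original = [0, 1000, -1000, 32767, -32768]
    marked = embed(original, seed=444)
    print(list(zip(original, marked)))
```

Because the watermark is regenerated from the seed alone, a detector built on the same generator would not need the original signal, which is the property the slide emphasizes.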
Audio Watermarking: A Temporal Method, p.2 Watermarked signal is formed by the following equation: y(i) is the watermarked audio signal. x(i) is the original audio signal. w(i) is formed from a random number generator. f(x(i), w(i)) is a function that accounts for the basic audio masking properties. S is defined as follows
Audio Watermarking: A Temporal Method, p.3 The watermark detection value, r, is calculated by the equation below: Theoretically $r \in [0, 1]$, but due to estimation of x(i) by y(i), $r \in [0 - \epsilon, 1 + \epsilon]$. A detection threshold of 0.5 can be used to decide on the existence of an audio watermark. Figure on right shows the pdf for the value of r in a non-watermarked and a watermarked signal. Both distributions have been calculated using 1000 different watermarks with SNR = 26.
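The detection equation itself is not reproduced on this slide, so the following is only a sketch under stated assumptions: the embedding function is taken to be f(x, w) = ALPHA * |x| * w with chips w in {-1, +1}, and r is a correlation normalized by ALPHA * sum(|y|), where y stands in for the unavailable x. The names `prn`, `embed`, `detect`, and `ALPHA` are illustrative and not taken from the source.

```python
import math
import random

ALPHA = 0.05  # assumed embedding strength; 20*log10(1/ALPHA) is about 26 dB


def prn(seed, n):
    """Seeded pseudorandom chips in {-1, +1} (stand-in for the watermark w)."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(n)]


def embed(x, seed):
    """Assumed embedding: y(i) = x(i) + ALPHA * |x(i)| * w(i)."""
    w = prn(seed, len(x))
    return [xi + ALPHA * abs(xi) * wi for xi, wi in zip(x, w)]


def detect(y, seed):
    """Normalized correlation r; near 1 if seed's watermark is present, near 0 otherwise."""
    w = prn(seed, len(y))
    norm = ALPHA * sum(abs(yi) for yi in y)
    if norm == 0:
        return 0.0
    return sum(yi * wi for yi, wi in zip(y, w)) / norm


if __name__ == "__main__":
    # Synthetic test tone standing in for real audio (an assumption of this sketch).
    x = [10000 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(100000)]
    y = embed(x, seed=444)
    for label, signal, key in [("watermarked, correct key", y, 444),
                               ("watermarked, wrong key", y, 445),
                               ("not watermarked", x, 444)]:
        r = detect(signal, key)
        print(label, round(r, 3), "detected" if r > 0.5 else "not detected")
```

Normalizing by the watermarked signal rather than the original is what lets r drift slightly outside [0, 1], which matches the small tolerance noted on the slide.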
Audio Watermarking: Results, p.1 Detection values in a watermarked signal using various seeds (the correct key is 444). Only the correct key yields a value of r higher than the threshold. No significant shift in PDF after resampling from 44.1 kHz to kHz and back. 100% success in watermark detection after resampling. Requantization from 16-bit to 8-bit and back results in an increased deviation of the PDF. Still achieves 99.8% accuracy in watermark detection.
Audio Watermarking: Results, p.2 No significant shift in PDF after MPEG Audio Layer III (MP3) lossy compression at 80 kbps. Based on the 0.5 threshold, still achieves 100% watermark detection. Filtered by a moving average filter of length 20, which introduces noticeable audio distortion. Shift in mean and variance, but still results in 100% detection. kHz signal low-pass filtered by a 25th-order Hamming LPF with cut-off at 22.05 kHz. Shift in mean and variance, but still results in 100% detection.
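Two of the distortions in these results are simple enough to sketch directly. The code below assumes a truncating quantizer for the 16-bit to 8-bit round trip and a causal moving average; the slides do not say which variants were actually used, and the function names are hypothetical.

```python
def requantize_16_to_8_and_back(samples):
    """Discard the 8 least significant bits of each 16-bit integer sample,
    then scale back to the 16-bit range (truncating quantizer, an assumption)."""
    return [(s >> 8) << 8 for s in samples]


def moving_average(samples, length=20):
    """Causal moving-average filter.

    The first length-1 outputs average over the samples seen so far,
    so the output has the same number of samples as the input.
    """
    out = []
    acc = 0.0
    for i, s in enumerate(samples):
        acc += s
        if i >= length:
            acc -= samples[i - length]  # drop the sample leaving the window
        out.append(acc / min(i + 1, length))
    return out


if __name__ == "__main__":
    print(requantize_16_to_8_and_back([0, 255, 256, -1, 32767]))
    print(moving_average([0, 20, 40, 60], length=2))
```

Running a detector on the output of either function is one way to repeat this kind of robustness check on one's own test signals.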
Video Watermarking Issues on identical watermarks for each frame Problems in maintaining statistical invisibility. Issues on independent watermarks for each frame Problems in easy removal of watermarks. Robustness: Must survive frame averaging, frame dropping, frame swapping, cropping, temporal rescaling. Must be able to discern imposter watermarks (deadlock). Problems in use of the original video sequence. Problems when no video sequence is needed.
Video Watermarking: Deadlocking Detection and Generation of Pseudorandom Sequence Original sequence is present for comparisons, but what about imposters? Possible solution: Public/Private Key Pseudorandom Generator. Embedded watermark for added authorization. [Diagram labels: Real Original, Private Key, Public Key, PRN, Watermark, Watermarked, Embedded, Supplied by author]
Video Watermarking: A Method, p.1 [Block diagram labels: Video Frames, Temporal WT, Wavelet Frames, Extract Blocks, Spatial Masking, DCT, Frequency Masking, Author signature, DCT, IDCT, Watermark block] Temporal Wavelet Transform yields: 1) Low-pass frames (static, non-moving component) 2) High-pass frames (dynamic, moving component). Frequency and spatial masking are tuned to human visual perception.
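To illustrate how a temporal wavelet transform separates static from moving content, here is a one-level sketch using the Haar wavelet over consecutive frame pairs. The choice of Haar, the flat-list frame representation, and the function name `temporal_haar` are assumptions made for illustration; the slide does not specify which wavelet the method uses.

```python
def temporal_haar(frames):
    """One level of a temporal Haar transform.

    frames: an even-length list of frames, each a flat list of pixel values.
    Returns (low, high): pairwise averages (static component) and
    pairwise half-differences (moving component).
    """
    if len(frames) % 2:
        raise ValueError("expected an even number of frames")
    low, high = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        low.append([(p + q) / 2 for p, q in zip(a, b)])
        high.append([(p - q) / 2 for p, q in zip(a, b)])
    return low, high


if __name__ == "__main__":
    static_scene = [[10, 20], [10, 20]]   # nothing moves between frames
    moving_scene = [[0, 8], [8, 0]]       # content swaps position
    print(temporal_haar(static_scene))
    print(temporal_haar(moving_scene))
```

For the static scene the high-pass frame is all zeros, while the moving scene puts its energy there, which is the split the slide describes.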
Video Watermarking: A Method, p.2 Detection of Watermark. With knowledge of location in video sequence: $X$ = input, $R$ = received coeffs, $F$ = original coeffs, $N$ = noise, $W$ = watermark. $H_0: X_k = R_k - F_k = N_k$ (no watermark). $H_1: X_k = R_k - F_k = W_k + N_k$ (watermark). Without knowledge of location in video sequence (just one frame present): only look at the low-pass frames (static, non-moving component). Decision thresholds are determined by a scalar similarity. Typical results
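The hypothesis test above can be sketched as a correlation detector. This is an illustration under assumptions: the scalar similarity is taken to be the inner product of X and W normalized by the energy of W, and the threshold of 0.5 is a placeholder, since the slide gives neither the exact similarity measure nor a threshold for video.

```python
def similarity(x, w):
    """Assumed scalar similarity: <x, w> / <w, w>."""
    ww = sum(wi * wi for wi in w)
    if ww == 0:
        raise ValueError("watermark has zero energy")
    return sum(xi * wi for xi, wi in zip(x, w)) / ww


def watermark_present(received, original, watermark, threshold=0.5):
    """Decide between H0 (noise only) and H1 (watermark plus noise).

    X_k = R_k - F_k is formed from the received and original coefficients,
    then compared against the candidate watermark.
    """
    x = [r - f for r, f in zip(received, original)]
    return similarity(x, watermark) > threshold


if __name__ == "__main__":
    original = [1.0, 2.0, 3.0, 4.0]
    watermark = [1.0, -1.0, 1.0, -1.0]
    marked = [f + w for f, w in zip(original, watermark)]
    print(watermark_present(marked, original, watermark))
    print(watermark_present(original, original, watermark))
```

Under H1 the similarity is close to 1 and under H0 close to 0, so a threshold between the two separates the cases as long as the noise term stays small relative to the watermark energy.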
References P. Bassia and I. Pitas, “Robust Audio Watermarking in the Time Domain,” Dept. of Informatics, University of Thessaloniki. J. Zhao, “Look, It’s Not There,” BYTE Magazine, January. M. Swanson, B. Zhu, and A. Tewfik, “Multiresolution Scene-Based Video Watermarking Using Perceptual Models,” IEEE Journal on Selected Areas in Communications, 1998.
Answer to Questions, p.1 How is the key embedded into the watermarked signal, y(i)? The key is a unique code that identifies the author. This unique code is used to generate a maximum-length pseudorandom noise (PN) sequence. This PN sequence is then used to generate the watermark signal w(i), as shown in the diagram above. Thus the key is really utilized by the function w(i). A masking threshold for the audio signal can be generated using MPEG Audio Psychoacoustic Model 1. The PN sequence generated by the key is then filtered with the masking filter M(w) to ensure that the spectrum of the watermark is below the masking threshold. This ensures that the watermark is inaudible after embedding into the signal. The figure on the right shows how y(i) is generated.
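As an illustration of turning a key into a maximum-length PN sequence, the sketch below seeds a 16-bit Fibonacci linear feedback shift register with the key. The register width, the tap positions (16, 14, 13, 11, a commonly cited maximal-length choice with period 65535), and the function name `pn_sequence` are assumptions; the slides do not say which generator was used, and the masking filter stage is omitted.

```python
def pn_sequence(key, length):
    """Chips in {-1, +1} from a 16-bit Fibonacci LFSR seeded by the key.

    Taps 16, 14, 13, 11 give a maximum-length sequence of period 2**16 - 1.
    """
    state = key & 0xFFFF
    if state == 0:
        state = 1  # the all-zero state would never leave zero
    chips = []
    for _ in range(length):
        chips.append(1 if state & 1 else -1)
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
    return chips


if __name__ == "__main__":
    period = 2 ** 16 - 1
    seq = pn_sequence(444, 2 * period)
    print(seq[:16])
    print("repeats after one period:", seq[:period] == seq[period:])
```

With a single fixed register, different keys only select different starting phases of the same sequence, so a practical design would likely need a wider register or a key-dependent generator to obtain a large set of distinct watermarks.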
Answer to Questions, p.2 What does “statistically undetectable” mean? How do we make a watermark “statistically undetectable”? By “statistically undetectable”, we mean that a pirate is unable to detect the watermark simply by generating the whole set of all possible watermarks. In other words, the probability of a pirate guessing the right key is close to zero. This ensures that a pirate is unable to remove the watermark or claim ownership of it in the audio signal. The condition for “statistically undetectable” is fulfilled simply by having a huge set of keys that generate distinct watermarks. This results in statistical safety for the watermarked audio signal.