Audio Watermarking Charalampos Laftsidis Artificial Intelligence and Information Analysis Lab Aristotle University of Thessaloniki February 2001
The technique’s motivation Due to contemporary technology, tools for reproducing and retransmitting multimedia data are broadly available. Potential for both legal and unauthorized manipulation. The objective of the watermarking technique Protection against data piracy (unauthorized copying and redistribution of data).
Rightful ownership authentication A watermarking system provides owner authentication. It processes the claim of whether the person under consideration is the owner of the digital data (hypothesis testing). The output of the system is therefore binary: Rightful owner or not. Probabilities of false detection and of false alarm (pfd, pfa). Watermarking: a data hiding technique (method for secretly and imperceptibly embedding signals into digital data)
Watermarking system’s requirements Inaudible watermarks Statistically invisible watermarks Compression characteristics of the watermark similar to those of the original signal Reliable detection scheme Robustness to deliberate attacks Robustness to signal manipulation (filtering, compression, resampling, requantization, cropping, noise corruption, D/A - A/D conversion etc.) The system’s algorithm should be available to users The system’s performance should be independent of the signal
Audio masking The effect of a stronger sound on the loudness and hearing threshold of a weaker one, when the latter lies in the frequency or temporal neighborhood of the former one. –Masker (host signal) –Maskee (embedded signal) The human auditory system is a frequency analyzer consisting of a set of 24 bandpass filters. Frequency masking Temporal masking
Different watermarking methods Watermark embedding in the time domain Watermark embedding in the frequency domain (temporal masking is unavoidable) Watermarking MPEG audio streams Echo-hiding techniques (also used for multi-bit information embedding) Phase coding method
Modules of a watermarking system Watermark-signal generation module Watermark embedding module Watermark detection module
Watermark generation Use of a chaotic map (recursive calls of a function). Thresholding the produced values. Formulation of a vector of 1 and –1 (actual watermark). The use of a chaotic map is significant in order to prevent the inverse calculation of the watermark.
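The generation steps above can be sketched as follows. This is a minimal illustration, not the generator used in the slides: the logistic map, the parameter r = 4, the 0.5 threshold and the function name `chaotic_watermark` are all assumptions, since the slides do not name a specific chaotic map.

```python
from typing import List


def chaotic_watermark(seed: float, length: int, r: float = 4.0,
                      threshold: float = 0.5) -> List[int]:
    """Iterate a chaotic map from `seed` and threshold the values to +1/-1.

    Assumption: the logistic map x <- r * x * (1 - x) stands in for the
    unspecified chaotic map of the slides.
    """
    if not 0.0 < seed < 1.0:
        raise ValueError("seed must lie strictly between 0 and 1")
    x = seed
    watermark = []
    for _ in range(length):
        x = r * x * (1.0 - x)  # one recursive application of the map
        watermark.append(1 if x >= threshold else -1)  # thresholding step
    return watermark


if __name__ == "__main__":
    # The seed plays the role of the secret key (starting point).
    print(chaotic_watermark(0.3141, 16))
```

The same seed always reproduces the same sequence, while a slightly different seed quickly diverges, which is the property that makes recovering the watermark without the seed difficult.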
Watermark embedding Segmentation of the original sound data in blocks of N samples. Generation of a watermark w(i) of length N using a seed (starting point). Modulation of the watermark with the original signal, producing a signal-dependent watermark w’(i) through a superposition law, which can be addition, multiplication, or an exponential law.
Watermark embedding Filtering of w’(i) through a lowpass filter (for example, a Hamming filter of order L with coefficients b_l). Adding the resulting watermark to the original data.
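A rough sketch of the embedding chain (modulation, lowpass filtering, addition) is given below. The symbols x (original block), y (watermarked block) and alpha (embedding strength), the multiplicative law alpha·|x(i)|·w(i), and the unit-gain normalisation of the Hamming taps are assumptions made for illustration; the slides' own formulas are not reproduced here.

```python
import math
from typing import List, Sequence


def hamming_coefficients(order: int) -> List[float]:
    """Hamming-window taps b_0..b_L, normalised to unit DC gain (assumption)."""
    if order == 0:
        return [1.0]
    taps = [0.54 - 0.46 * math.cos(2.0 * math.pi * l / order)
            for l in range(order + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Causal FIR filter: out(i) = sum_l b_l * signal(i - l)."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for l, b in enumerate(taps):
            if i - l >= 0:  # samples before the block start count as zero
                acc += b * signal[i - l]
        out.append(acc)
    return out


def embed(x: Sequence[float], w: Sequence[int], alpha: float = 0.05,
          order: int = 8, multiplicative: bool = True) -> List[float]:
    """Modulate w with x, lowpass-filter the result and add it to x."""
    if len(x) != len(w):
        raise ValueError("signal block and watermark must have equal length")
    if multiplicative:
        modulated = [alpha * abs(xi) * wi for xi, wi in zip(x, w)]
    else:  # additive law: the watermark does not depend on the signal
        modulated = [alpha * wi for wi in w]
    shaped = fir_filter(modulated, hamming_coefficients(order))
    return [xi + si for xi, si in zip(x, shaped)]
```

Because the taps are positive and sum to one, the filtered watermark never exceeds alpha times the largest sample magnitude in the multiplicative case, so alpha directly bounds the distortion.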
Test signal: segment from Vivaldi’s “L’amoroso” concerto for violin. Signal to Noise Ratio (SNR)=22dB
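For reference, an SNR figure of this kind can be computed by treating the embedded watermark as the noise. The sketch below assumes the usual definition, 10·log10 of the ratio of original-signal energy to watermark energy; the function name `snr_db` is hypothetical.

```python
import math
from typing import Sequence


def snr_db(original: Sequence[float], watermarked: Sequence[float]) -> float:
    """SNR in dB, with the difference (watermarked - original) as the noise."""
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((y - x) ** 2 for x, y in zip(original, watermarked))
    if noise_energy == 0.0:
        return math.inf  # identical signals: no watermark energy at all
    return 10.0 * math.log10(signal_energy / noise_energy)
```

A higher SNR means a weaker and less audible watermark, at the cost of a less reliable detection.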
Watermark detection Correlation Simple correlation Circular correlation The latter case can be calculated through the Fourier transform:
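The equivalence between circular correlation and its Fourier-domain computation can be checked with a small sketch. The convention r(k) = Σ x((i + k) mod N)·w(i) is an assumption, and a naive O(N²) DFT is used only to keep the example self-contained; a practical implementation would use an FFT.

```python
import cmath
from typing import List, Sequence


def dft(seq: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive discrete Fourier transform (illustration only)."""
    n = len(seq)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        acc = 0j
        for k, value in enumerate(seq):
            acc += value * cmath.exp(sign * 2j * cmath.pi * m * k / n)
        out.append(acc / n if inverse else acc)
    return out


def circular_correlation_direct(x: Sequence[float],
                                w: Sequence[float]) -> List[float]:
    """r(k) = sum_i x((i + k) mod N) * w(i), computed term by term."""
    n = len(x)
    return [sum(x[(i + k) % n] * w[i] for i in range(n)) for k in range(n)]


def circular_correlation_dft(x: Sequence[float],
                             w: Sequence[float]) -> List[float]:
    """Same quantity via the DFT: r = IDFT( DFT(x) * conj(DFT(w)) )."""
    spectrum_x, spectrum_w = dft(x), dft(w)
    product = [a * b.conjugate() for a, b in zip(spectrum_x, spectrum_w)]
    return [value.real for value in dft(product, inverse=True)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    w = [1.0, -1.0, 1.0, -1.0]
    print(circular_correlation_direct(x, w))  # [-2.0, 2.0, -2.0, 2.0]
    print(circular_correlation_dft(x, w))     # same values up to rounding
```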
Watermark detection Calculation of the filtered watermark vector w’(i) (filtering just the series of 1 and -1) Calculation of the circular correlation between the test signal and the watermark: If the signal is watermarked, then:
Watermark detection Definition of a scaling factor: Calculation of the detection ratio: If E(w’(i)) is not equal to 0, then where:
Watermark detection Fusion (average) of the detection ratios for all periods: Calculation of the final detection value: Comparison of R to a predefined threshold
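The fusion and thresholding steps can be illustrated with a simplified detector. This is a sketch under stated assumptions, not the slides' formulas: it assumes multiplicative embedding y(i) = x(i) + alpha·|x(i)|·w(i) without lowpass filtering, uses alpha times the mean absolute sample value as the scaling factor, and ignores the case where the watermark has a non-zero mean. The names `detection_ratio` and `detect` are hypothetical.

```python
from typing import List, Sequence, Tuple


def detection_ratio(segment: Sequence[float], w: Sequence[int],
                    alpha: float) -> float:
    """Correlation of one segment with w, divided by an assumed scaling factor.

    Under the assumptions above the ratio is near 1 for a watermarked
    segment and near 0 for an unwatermarked one.
    """
    n = len(segment)
    correlation = sum(s * wi for s, wi in zip(segment, w)) / n
    scale = alpha * sum(abs(s) for s in segment) / n
    return correlation / scale if scale > 0.0 else 0.0


def detect(signal: Sequence[float], w: Sequence[int], alpha: float,
           threshold: float = 0.5) -> Tuple[float, bool]:
    """Average the per-period ratios and compare the result to a threshold."""
    n = len(w)
    ratios: List[float] = [
        detection_ratio(signal[start:start + n], w, alpha)
        for start in range(0, len(signal) - n + 1, n)
    ]
    if not ratios:
        raise ValueError("signal is shorter than one watermark period")
    fused = sum(ratios) / len(ratios)  # fusion (average) over all periods
    return fused, fused >= threshold
```

Averaging over several periods reduces the variance of the decision statistic, which is why a minimum number of segments matters for reliable detection.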
Detection results Receiver operating characteristics (ROC): Choice of threshold’s position Probability of False Acceptance (Pfa) Probability of False Rejection (Pfr) Plotting Pfa versus Pfr (in logarithmic scale) Definition of the Equal Error Rate (EER)
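The error rates behind an ROC curve can be estimated empirically from detection values. In the sketch below, `clean_scores` and `marked_scores` are hypothetical lists of detector outputs for unwatermarked and watermarked signals, and the EER is approximated at the candidate threshold where the two error rates are closest.

```python
from typing import Sequence, Tuple


def error_rates(clean_scores: Sequence[float], marked_scores: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (Pfa, Pfr) for a given detection threshold."""
    # False acceptance: an unwatermarked signal reaches the threshold.
    pfa = sum(s >= threshold for s in clean_scores) / len(clean_scores)
    # False rejection: a watermarked signal stays below the threshold.
    pfr = sum(s < threshold for s in marked_scores) / len(marked_scores)
    return pfa, pfr


def equal_error_rate(clean_scores: Sequence[float],
                     marked_scores: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, EER estimate) where |Pfa - Pfr| is smallest."""
    best = None
    for t in sorted(set(clean_scores) | set(marked_scores)):
        pfa, pfr = error_rates(clean_scores, marked_scores, t)
        gap = abs(pfa - pfr)
        if best is None or gap < best[0]:
            best = (gap, t, (pfa + pfr) / 2.0)
    return best[1], best[2]


if __name__ == "__main__":
    clean = [0.0, 0.1, 0.2, 0.6]
    marked = [0.4, 0.8, 0.9, 1.0]
    print(equal_error_rate(clean, marked))  # (0.6, 0.25)
```

Sweeping the threshold and plotting the resulting (Pfa, Pfr) pairs on logarithmic axes gives the ROC curve described above.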
Parameters Segment’s size N. Smallest number of segments permitted. Power of the watermark (SNR). Watermark generation map. Watermark’s filtering. Type of embedding: Multiplicative: Additive: Fusion among periods. Detection threshold.
Subjective quality evaluation Presentation of the original and watermarked versions to a set of listeners. 1st test: try to find the watermarked version among 3 presentations: (original, watermarked, original) or (original, original, watermarked). 2nd test: mark the quality of the watermarked version as: 5. Imperceptible; 4. Perceptible, but not annoying; 3. Slightly annoying; 2. Annoying; 1. Very annoying. Present versions that contain multiple watermarks.
Frequency masking (MPEG-1 psychoacoustic model) Modification of the watermark according to the spectral characteristics of the original signal. Calculation of the spectrum and normalization by a constant value. (s(n): original signal, w(n): predefined window)
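The spectrum calculation can be sketched as a windowed power spectrum in dB plus a constant offset. The Hann window, the parameter name `offset_db` for the normalisation constant, and the naive DFT are assumptions chosen to keep the example short and self-contained; the slides do not give the constant's value.

```python
import cmath
import math
from typing import List, Sequence


def hann_window(n: int) -> List[float]:
    """Periodic Hann window w(n) of length n (assumed window choice)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]


def power_spectrum_db(s: Sequence[float], offset_db: float = 0.0) -> List[float]:
    """Power spectrum of the windowed block s(n)*w(n), in dB plus an offset.

    Only bins 0..N/2 are returned because the input is real.
    """
    n = len(s)
    w = hann_window(n)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i in range(n):
            acc += s[i] * w[i] * cmath.exp(-2j * cmath.pi * k * i / n)
        power = max(abs(acc) ** 2, 1e-12)  # floor avoids log10(0)
        spectrum.append(offset_db + 10.0 * math.log10(power))
    return spectrum
```

For a pure cosine centred on bin 8 of a 64-sample block, the largest value of the returned spectrum is at index 8, with the two adjacent bins 6 dB lower because of the Hann window's main lobe.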
Frequency masking (MPEG-1 psychoacoustic model) Identification of tonal components: where j defines a neighborhood around k and can be up to 6, depending on the value of k. Division of the frequency axis into 24 critical bands, according to the perceptual model of the human ear. The bandwidth of each of those critical bands is defined as 1 Bark.
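Both steps on this slide can be sketched in a few lines. The Hz-to-Bark conversion uses the widely quoted Zwicker and Terhardt approximation. The tonal test treats a bin as tonal when it is a local maximum that exceeds its neighbours at offsets ±j by at least 7 dB; the offset table in `neighbourhood` assumes a 512-point spectrum and should be read as an assumption rather than as the slides' exact rule.

```python
import math
from typing import List, Sequence


def hz_to_bark(f_hz: float) -> float:
    """Critical-band rate in Bark (Zwicker and Terhardt approximation)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))


def neighbourhood(k: int) -> List[int]:
    """Offsets j examined around bin k (assumed values, 512-point spectrum)."""
    if k < 63:
        return [2]
    if k < 127:
        return [2, 3]
    return [2, 3, 4, 5, 6]


def tonal_bins(power_db: Sequence[float], margin_db: float = 7.0) -> List[int]:
    """Indices of local maxima that stand margin_db above bins at +/- j."""
    tonal = []
    for k in range(1, len(power_db) - 1):
        is_peak = power_db[k] > power_db[k - 1] and power_db[k] >= power_db[k + 1]
        if not is_peak:
            continue
        offsets = neighbourhood(k)
        if k - offsets[-1] < 0 or k + offsets[-1] >= len(power_db):
            continue  # neighbourhood would fall outside the spectrum
        if all(power_db[k] >= power_db[k + sign * j] + margin_db
               for j in offsets for sign in (-1, 1)):
            tonal.append(k)
    return tonal
```

With this conversion, 1 kHz maps to roughly 8.5 Bark and 20 kHz to between 24 and 25 Bark, consistent with the division of the audible range into about 24 critical bands.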
Frequency masking (MPEG-1 psychoacoustic model) Calculation of non-tonal components for every critical band from the remaining signal energy. Calculation of the absolute hearing threshold. Removal of components that fall below the absolute hearing threshold, and of the weaker of any two components separated by less than 0.5 Bark. Calculation of individual and global masking thresholds.
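The absolute hearing threshold and the first removal step can be sketched as follows. The closed-form approximation of the threshold in quiet (attributed to Terhardt) is an assumption here, since the slides do not state which formula is used, and representing components as (frequency in Hz, level in dB SPL) pairs is a hypothetical choice.

```python
import math
from typing import List, Sequence, Tuple


def absolute_threshold_db(f_hz: float) -> float:
    """Approximate threshold in quiet, in dB SPL, for f_hz > 0."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def audible_components(
        components: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep only components whose level reaches the threshold in quiet."""
    return [(f, level) for f, level in components
            if level >= absolute_threshold_db(f)]
```

The curve is lowest around 3 to 4 kHz, where hearing is most sensitive, and rises steeply towards both ends of the audible range, so low-level components at very low or very high frequencies are the first to be discarded.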
Topics to be investigated The deadlock problem: the method cannot easily distinguish which watermark was embedded first if a pirate embeds their own watermark in already watermarked data. Special attacks on the watermark. Watermarking of short sound segments that may be used. Inability to exploit the full properties of chaotic generators (especially high-pass ones), because the watermark is filtered during the embedding procedure.