Probability and Random Processes with Applications to Signal Processing, Stark, H. and Woods, J.W.
Chapter 1: Random variables Probability Histogram or probability density function Cumulative function Mean Variance Moments Some representations of random variables Bi-dimensional random variables Marginal distributions Independence Correlations Gaussian expression of multiple random variables Changing random variables
Physical measurements. Introduction: a signal is any entity that carries physical information. Examples: acoustic waves (music, speech, ...) converted into an electric current by a microphone; light waves (from a star or other light source, ...) converted into a current by a spectrometer; number series (physical measurements, ...); photography.
Signal processing = the set of procedures used to: extract information (filtering, detection, estimation, spectral analysis, ...); adapt the signal (modulation, sampling, ...) in order to transmit or store it; perform pattern recognition. In physics: physical system, signal, transmission, detection, analysis and interpretation, with a noise source acting on the chain.
Examples. Astronomy: electromagnetic waves carry information about stars; signal processing (sampling, filtering, spectral analysis, ...) is applied to the measured signal V(t) in the presence of atmospheric noise. Laboratory measurement: incident light I(t) passes through the sample under test with a periodic opening (chopper) before reaching the detector; signal processing: spectral analysis, synchronous detection, ...
Classification of signals. Dimensional classification: number of free variables. Examples: electrical potential V(t) = one-dimensional signal; static black-and-white image, brightness B(x,y) = two-dimensional signal; black-and-white film B(x,y,t) = three-dimensional signal; ... Signal theory is independent of the physical phenomenon and of the type of variables. Phenomenological classification: random or deterministic evolution. Deterministic signal: the temporal evolution can be predicted or modeled by an appropriate mathematical model. Random signal: the signal cannot be predicted, so a statistical description is required. Every real signal has a random component (external perturbations, ...).
Morphological classification: [Fig.2.10,(I)]
Probability
Probability. If two events A and B occur, P(B|A) is the conditional probability of B given A. If A and B are independent, P(A,B) = P(A)·P(B).
Random variable and random process. Let us consider the random process: measure the temperature in a room. Many measurements can be taken simultaneously using different sensors (same sensors, same environment, ...) and give different signals. [Figure: signals z1(t), z2(t), z3(t) obtained when measuring temperature using several sensors, with the instants t1 and t2 marked.]
Random variable and random process. The random process is represented as a function. Each signal x(t), from each sensor, is a random signal. At a given instant t, the set of values taken by all the signals defines a random variable. [Same figure: z1(t), z2(t), z3(t) with instants t1 and t2 marked.]
Probability density function (PDF). The characteristics of a random process or a random variable can be interpreted from the histogram: N(m) = number of events falling in the bin from m·Δx to (m+1)·Δx, i.e. "x ≤ xi < x + Δx"; Δx is the precision of the measurement; Nmes = total number of measurements.
PDF properties. If Δx = dx (infinitesimally small), the histogram becomes a continuous function. In this case we can write:
Histogram or PDF. [Figures: PDF f(x) of a random signal; PDF of a sine wave taking values between -1 and 1; uniform PDF.]
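A minimal numerical sketch of the histogram/PDF estimate described above (the sine-wave example and the bin width are assumed values):

```python
import numpy as np

# A minimal sketch (assumed example): estimate the PDF of a signal from its
# histogram. For a sine wave the estimated PDF piles up near -1 and +1,
# unlike the flat (uniform) PDF.
t = np.arange(0, 1000, 0.01)
x = np.sin(2 * np.pi * 1.0 * t)          # sine wave taking values in [-1, 1]

Dx = 0.05                                 # bin width (measurement precision)
bins = np.arange(-1.0, 1.0 + Dx, Dx)
N_m, edges = np.histogram(x, bins=bins)   # N(m): counts per bin
f_hat = N_m / (len(x) * Dx)               # normalize so that sum(f_hat)*Dx ~ 1

print("Integral of estimated PDF:", f_hat.sum() * Dx)        # close to 1
print("Bin with the most samples starts at:", edges[np.argmax(N_m)])  # near -1 or +1
```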
Cumulative distribution function
Examples
Expectation and variance. Every function of a random variable is itself a random variable. If we know the probability distribution of a RV, we can deduce the expected value of any function of that RV. Statistical parameters: mean value, mean square value, variance, standard deviation.
Moments of higher order The definition of the moment of order r is: The definition of the characteristic function is: We can demonstrate:
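The definitions referred to above are standard; as a reference (stated here, not copied from the slide images):

\[
m_r = E[X^{r}] = \int_{-\infty}^{+\infty} x^{r} f_X(x)\,dx,
\qquad
\Phi_X(\omega) = E\!\left[e^{j\omega X}\right],
\qquad
m_r = \frac{1}{j^{r}}\left.\frac{d^{r}\Phi_X(\omega)}{d\omega^{r}}\right|_{\omega = 0}.
\]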
Exponential random variable
Uniform random variable. [Figure: PDF f(x) equal to a constant c between a and b.]
Gaussian random variable
Triangular random variable. [Figures: triangular PDF f(x) with peak c between a and b, and the corresponding CDF F(x).]
Bi-dimensional random variable. Two random variables X and Y have a joint probability density function fXY(x,y): it is the probability density function of the couple (X,Y). Example:
Bi-dimensional Random variables Cumulative functions: Marginal cumulative distribution functions Marginal probability density functions
Bi-dimensional random variables. Moments of a random variable X. If X and Y are independent, then fXY(x,y) = fX(x)·fY(y), and in this case E[XY] = E[X]·E[Y].
Covariance
Covariance
Correlation coefficient
Correlation coefficient
Correlation coefficient
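As a numerical sketch (with assumed synthetic data in which Y is built from X so that the correlation coefficient is about 0.8), covariance and correlation can be estimated from samples:

```python
import numpy as np

# A minimal sketch (assumed data): empirical covariance and correlation
# coefficient of two jointly sampled random variables X and Y.
rng = np.random.default_rng(0)
x = rng.normal(size=5_000)
y = 0.8 * x + 0.6 * rng.normal(size=5_000)   # Y correlated with X

cov_xy = np.cov(x, y)[0, 1]                  # Cov(X, Y)
rho = np.corrcoef(x, y)[0, 1]                # rho = Cov(X,Y) / (sigma_X * sigma_Y)

print(f"Cov(X,Y) = {cov_xy:.3f}, rho = {rho:.3f}")   # rho expected near 0.8
```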
PDF of a transformed RV Suppose X is a continuous RV with known PDF Y=h(X) a function of the RV X What is the PDF of Y?
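For a monotonic transformation Y = h(X), the standard change-of-variables result (a reference formula, not taken from the slide images) is:

\[
f_Y(y) = f_X\!\big(h^{-1}(y)\big)\,\left|\frac{d\,h^{-1}(y)}{dy}\right| ,
\]

and for a non-monotonic h the contributions of all roots \(x_k\) of \(h(x) = y\) are summed.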
PDF of a transformed RV: exercises. X is a uniform random variable between -2 and 2. Write the expression of the PDF of X. Find the PDF of Y = 5X + 9.
Exercise Let us consider the bidimensional RV: Find c Compute the CDF of f(x,y) Compute GX(x) and GY(y) Compute the moments of order 1 and 2
Sum of 2 RVs
Sum of 2 RVs
Sum of 2 RVs
Sum of 2 RVs
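The standard result is that the PDF of Z = X + Y, for independent X and Y, is the convolution of the two PDFs; a minimal numerical sketch (assuming both are uniform on [0, 1], which gives the triangular PDF on [0, 2]):

```python
import numpy as np

# A minimal sketch: PDF of Z = X + Y for independent X, Y via numerical
# convolution of their PDFs (both assumed uniform on [0, 1]).
dx = 0.001
x = np.arange(0.0, 1.0, dx)
f_x = np.ones_like(x)                     # uniform PDF on [0, 1]
f_y = np.ones_like(x)

f_z = np.convolve(f_x, f_y) * dx          # numerical convolution
z = np.arange(len(f_z)) * dx

print("Peak of f_Z:", f_z.max(), "at z =", z[np.argmax(f_z)])   # ~1 at z ~ 1
```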
Chapter 2: Random functions. Definitions; probability density functions; cumulative distribution functions; moments of a random function; covariance; stationary processes; statistical auto- and cross-correlation; spectral density estimation: from the autocorrelation, filter-bank method, periodogram; white noise analysis.
Random functions. Let us consider the random process: measure the temperature in a room. Many measurements can be taken simultaneously using different sensors (same sensors, same environment, ...) and give different signals. [Figure: signals z1(t), z2(t), z3(t) obtained when measuring temperature using several sensors, with the instants t1 and t2 marked.]
PDF and CDF of a random process. Probability density function: fX(x;t) or f(x1,x2,x3,…,xn; t1,t2,…,tn). Cumulative distribution function: we can write
Mean and correlations
Correlation coefficient
Stationarity
Stationarity
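For reference, the usual wide-sense stationarity conditions (standard definitions, stated here as a reminder rather than copied from the slides) are a constant mean and an autocorrelation depending only on the time lag:

\[
E[X(t)] = \mu_X \ \text{for all } t,
\qquad
R_X(t_1, t_2) = E[X(t_1)\,X(t_2)] = R_X(\tau), \quad \tau = t_2 - t_1 .
\]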
Ergodic
Ergodic
Ergodic
Ergodic
White noise
White noise effect
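As a minimal numerical sketch (assumed: unit-variance Gaussian samples), discrete white noise can be checked to have a delta-like autocorrelation and an approximately flat spectral density:

```python
import numpy as np
from scipy import signal

# A minimal sketch: white noise has an autocorrelation concentrated at lag 0
# and a roughly flat power spectral density.
rng = np.random.default_rng(1)
n = 4096
w = rng.normal(0.0, 1.0, n)                    # white Gaussian noise, variance 1

r = np.correlate(w, w, mode="full") / n        # biased autocorrelation estimate
lags = np.arange(-n + 1, n)
print("R(0):", r[lags == 0][0])                # close to the variance (1)
print("R(10):", r[lags == 10][0])              # close to 0

f, Pxx = signal.welch(w, fs=1.0, nperseg=256)  # approximately flat spectrum
print("PSD min/max:", Pxx.min(), Pxx.max())
```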
Noise
Autocorrelation properties
Cross correlation properties
Jointly stationary properties. Uncorrelated: Cxy(t1,t2) = 0, i.e. Rxy(t1,t2) = E[x(t1)]·E[y(t2)]. Orthogonal: Rxy(t1,t2) = 0. Independent: x(t1) and y(t2) are independent for all t1, t2 (the joint distribution is the product of the individual distributions).
Cross correlation
Spectral density
Spectral density
Spectral density
Cross spectral density
Simulation
Simulation
Simulation
Simulation
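The simulation slides are not reproduced here; as a minimal sketch (with an assumed first-order recursion as the process model), several realizations of a random process can be generated and ensemble statistics estimated at fixed instants, as in the temperature-sensor picture above:

```python
import numpy as np

# A minimal sketch (assumed model): simulate realizations of a random process
# by filtering white noise with a first-order recursion, then estimate the
# ensemble mean and autocorrelation at two fixed instants.
rng = np.random.default_rng(2)
n_realizations, n_samples, a = 200, 1000, 0.9

x = np.zeros((n_realizations, n_samples))
v = rng.normal(size=(n_realizations, n_samples))
for k in range(1, n_samples):
    x[:, k] = a * x[:, k - 1] + v[:, k]          # x(n) = a*x(n-1) + v(n)

t1, t2 = 500, 510
print("Ensemble mean at t1:", x[:, t1].mean())           # close to 0
print("R_X(t1, t2):", np.mean(x[:, t1] * x[:, t2]))      # ensemble autocorrelation
```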
Filtering of random signals
Filtering of random signals
Filtering of random signals
Filtering of random signals
Filtering of random signals
Filtering of random signals
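As a minimal numerical sketch (assuming a Butterworth low-pass filter as the example system), the standard input/output relation for spectral densities, S_y(f) = |H(f)|² S_x(f), can be checked on filtered white noise:

```python
import numpy as np
from scipy import signal

# A minimal sketch: pass white noise through an LTI filter and compare the
# output spectral density with |H(f)|^2 times the input spectral density.
rng = np.random.default_rng(3)
fs = 1000.0
x = rng.normal(size=200_000)                     # white noise input

b, a = signal.butter(4, 100.0, fs=fs)            # 4th-order low-pass at 100 Hz
y = signal.lfilter(b, a, x)

f, Sx = signal.welch(x, fs=fs, nperseg=1024)
_, Sy = signal.welch(y, fs=fs, nperseg=1024)
_, H = signal.freqz(b, a, worN=f, fs=fs)         # filter frequency response

k = np.argmin(np.abs(f - 50.0))                  # a passband frequency
print("S_y:", Sy[k], " |H|^2 * S_x:", np.abs(H[k]) ** 2 * Sx[k])  # should agree
```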
Chapter 3: Signal modelling Definition AR modeling Expression Spectral density estimations Coefficients calculation MA modeling Expression and spectral density estimation ARMA modeling
Numerical (digital) filtering: FIR filter, IIR filter
Filter realization. Non-recursive (FIR) realization, using delay elements, multiplications and additions. [Block diagram: x(n) feeds a chain of Z-1 delay elements; the taps are weighted by b0, b1, b2, b3, …, bM and summed to give y(n).]
Filter realization. Recursive (IIR) realization. [Block diagram: x(n) and the fed-back, delayed output are combined through an intermediate signal w(n) and Z-1 delay elements to give y(n).]
Example: equivalent numerical RC filter. Analog circuit: resistor R and capacitor C, input x(t), output y(t), described by a differential equation. A numerical approximation (equation of differences) gives N = 1, M = 0, a0 = RC + 1, a1 = -RC, b0 = 1. Filter realization: the recursive equation y(k) = (b0/a0)·x(k) - (a1/a0)·y(k-1), implemented as a computer algorithm (numerical filter) using a delay G(z) = z⁻¹ applied to y(k).
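A minimal sketch of the recursive implementation above (taking the sampling period equal to 1, so that a0 = RC + 1, a1 = -RC, b0 = 1; the value of RC is an assumed example):

```python
import numpy as np

# A minimal sketch of the recursive RC filter: y(k) = (b0*x(k) - a1*y(k-1)) / a0
# with a0 = RC + 1, a1 = -RC, b0 = 1 (sampling period taken as 1).
def rc_filter(x, RC=10.0):
    a0, a1, b0 = RC + 1.0, -RC, 1.0
    y = np.zeros_like(x, dtype=float)
    for k in range(len(x)):
        y_prev = y[k - 1] if k > 0 else 0.0
        y[k] = (b0 * x[k] - a1 * y_prev) / a0    # y(k) = (x(k) + RC*y(k-1)) / (RC+1)
    return y

step = np.ones(100)
print(rc_filter(step)[:5])   # smoothed rise toward 1, as for an analog RC filter
```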
Modeling?
AR modelling. The aim is to represent a stochastic signal using a parametric model. An autoregressive signal of order p is written as indicated in the following equation: a sample at instant n is estimated from its p previous samples. The difference between the estimated value and the original value is a white noise v(n), with z-transform V(z).
AR modelling. AR realization: generation of the signal from white noise. Problem: determine the order and the coefficients. [Block diagram: white noise v(n) is added to the delayed outputs, weighted by a1, a2, a3, …, ap (with a0 = 1) through a chain of Z-1 elements, to produce x(n).]
AR model : Coefficients
AR modeling
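As a minimal sketch (with an assumed AR(2) example whose coefficients are chosen for illustration), the model coefficients can be recovered from the sample autocorrelation by solving the Yule-Walker equations:

```python
import numpy as np

# A minimal sketch: generate x(n) = a1*x(n-1) + a2*x(n-2) + v(n) from white
# noise, then estimate (a1, a2) by solving the Yule-Walker equations built
# from the sample autocorrelation.
rng = np.random.default_rng(4)
a_true = np.array([1.3, -0.4])          # assumed (stable) AR(2) coefficients
n = 50_000
x = np.zeros(n)
v = rng.normal(size=n)
for k in range(2, n):
    x[k] = a_true[0] * x[k - 1] + a_true[1] * x[k - 2] + v[k]

def autocorr(sig, lag):
    return np.mean(sig[: len(sig) - lag] * sig[lag:])

p = 2
r = np.array([autocorr(x, k) for k in range(p + 1)])          # r(0), r(1), r(2)
R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
a_hat = np.linalg.solve(R, r[1 : p + 1])                      # Yule-Walker solution

print("Estimated AR coefficients:", a_hat)                    # close to [1.3, -0.4]
```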
Moving average (MA) model. A signal is MA-modelled with order q when it can be written as shown, where v(n) is a white noise. Problem: determine the structure of the filter, the order, and the coefficients.
MA model: realization. [Block diagram: white noise v(n) feeds a chain of Z-1 delay elements; the taps are weighted by the coefficients (a0 = 1, a1, a2, a3, …) and summed to give x(n).]
ARMA model. A signal is ARMA-modelled (AutoRegressive-Moving Average), with orders p and q, if it can be written as shown, where v(n) is a white noise. Problem: determine the structure of the filter, the order, and the coefficients.
ARMA realization. [Block diagram: v(n) feeds a recursive structure with Z-1 delay elements combining the feedback (AR) and feedforward (MA) branches to produce x(n).]
Spectral density. Properties: by inverse Fourier transform: ... The frequency distribution of the signal's energy is independent of the phase of the signal (Arg[X(f)]); it is therefore not sensitive to a delay of the signal.
Spectral density: periodogram. A periodogram is a method used to compute the spectral density from segments of duration T taken from the original signal x(t) at random positions. Limitations: width and type of the window used, duration of the measurement, number of segments X(f,T) averaged.
Spectral density: periodogram
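A minimal sketch (assumed test signal: a 50 Hz sine in noise) of the periodogram and the averaged (Welch) periodogram using scipy:

```python
import numpy as np
from scipy import signal

# A minimal sketch: periodogram and Welch (averaged periodogram) estimates of
# the spectral density of a noisy sine wave.
fs = 1000.0
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.random.randn(len(t))

f_per, P_per = signal.periodogram(x, fs=fs)             # single periodogram
f_w, P_w = signal.welch(x, fs=fs, nperseg=1024)         # averaged over segments

print("Peak frequency (Welch):", f_w[np.argmax(P_w)])   # close to 50 Hz
```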
Spectral density: bank of filters. Principle: the signal x(t) is fed to a set of selective band-pass filters with center frequencies f1, f2, …, fn and bandwidths B1, B2, …, Bn; the filter outputs are shown on a multi-channel display. [Figure: frequency response G(f) of a selective filter of bandwidth Bn centered at fn.]
Spectral density: from the autocorrelation. Wiener-Khinchine theorem: for stationary signals the spectral density is the Fourier transform of the autocorrelation; for ergodic signals the autocorrelation can be computed by time averaging. This gives a unique definition of the spectral density, whether the signal is random or deterministic. Experiment: finite duration T; x(t) → A/D converter → delay → FFT.
Spectral density: spectrum analyzer. Principle: x(t) is mixed to xm(t), passed through a selective filter of bandwidth B centered at fo, followed by a power measurement and a display; a swept (commanded) oscillator scans the frequency range. [Figure: |Xm(f)| with the analysis band around ±fo.]
Spectral density: After modelling
Spectral density: After modulation
Estimation of parameters of signals, Statistical parameters
Estimation of parameters from spectral density Spectral moment formula 1- Power of the signal : M0 2- Mean frequency: MPF=M1/M0 3- Dissymmetry coefficient: CD
Estimation of parameters from the spectral density. 4- Kurtosis (flatness coefficient). 5- Median frequency Fmed: divides the area under S(f) into two equal parts. 6- Peak frequency. 7- Relative energy per frequency band.
Estimation of parameters from spectral density 8- Ratio H/L (High/Low): 9- Percentiles or fractiles fk: 10- Spectral Entropy H
Estimation of parameters from spectral density
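A minimal sketch (assumed test signal and Welch PSD estimate) of how a few of the parameters listed above, the power M0, the mean power frequency MPF and the median frequency, can be computed:

```python
import numpy as np
from scipy import signal

# A minimal sketch: spectral moments M0 and M1, mean power frequency
# MPF = M1/M0, and median frequency from a Welch PSD estimate.
fs = 1000.0
t = np.arange(0, 5, 1 / fs)
x = np.sin(2 * np.pi * 80 * t) + 0.3 * np.random.randn(len(t))

f, S = signal.welch(x, fs=fs, nperseg=1024)

M0 = np.trapz(S, f)                                   # total power
M1 = np.trapz(f * S, f)
MPF = M1 / M0                                         # mean power frequency

cum = np.cumsum(S) / np.sum(S)
f_med = f[np.searchsorted(cum, 0.5)]                  # median frequency

print(f"M0 = {M0:.3f}, MPF = {MPF:.1f} Hz, Fmed = {f_med:.1f} Hz")
```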
Chapter 4: Detection and classification in random signals. Definition; statistical tests for detection; likelihood ratio; example of detection of a change in mean; example of detection of a change in variance; multidimensional detection.
Detection: definition Hypotheses : estimated Known or unknown
Gaussian (normal) distributions
Chi-squared distribution (Pearson's chi-squared): χ² with k degrees of freedom; E[χ²] = k, Var[χ²] = 2k. [Figure: χ² densities for 10 and 15 degrees of freedom.]
Student and Fisher distributions. Student's t distribution with k degrees of freedom; Fisher-Snedecor (F) distribution with k and l degrees of freedom. [Figure: F(6,7) and F(6,10) densities.] Example: detection in signals.
Detection: definition Hypotheses : estimated Known or unknown
Parameter definitions. False alarm: decide H1 when H0 is true. Detection: decide H1 when H1 is true. Missed detection: decide H0 when H1 is true.
Likelihood ratio Detection in signals
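The test referred to above is, in its standard form (a reference statement, not copied from the slide), the comparison of the likelihood ratio to a threshold:

\[
\Lambda(z) = \frac{p(z \mid H_1)}{p(z \mid H_0)}
\;\underset{H_0}{\overset{H_1}{\gtrless}}\; \eta .
\]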
Detection of a change in mean. H0: z(t) = 0 + b(t) = b(t) (noise only). H1: z(t) = m + b(t).
Detection of a change in variance
Parameters False Alarm probability Detection probability
Parameters
Neyman-Pearson method. Fix the probability of false alarm, then estimate the corresponding threshold.
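A minimal sketch for the Gaussian change-in-mean case above (the values of sigma, m and Pfa are assumed for illustration): the threshold follows from the fixed false-alarm probability, and the detection probability from the threshold:

```python
from scipy.stats import norm

# A minimal sketch (assumed Gaussian case): z ~ N(0, sigma^2) under H0 and
# z ~ N(m, sigma^2) under H1. Fixing Pfa gives the threshold, then Pd.
sigma, m, Pfa = 1.0, 2.0, 0.01

threshold = sigma * norm.ppf(1.0 - Pfa)        # P(z > threshold | H0) = Pfa
Pd = 1.0 - norm.cdf((threshold - m) / sigma)   # P(z > threshold | H1)

print(f"threshold = {threshold:.3f}, Pd = {Pd:.3f}")
```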
Detection: multidimensional case
Fisher-Snedecor distribution table, α = 0.05.
Chi-squared distribution table.
Chi-squared distribution table (continued).
Standard normal distribution table.
Chapter 5: Time-frequency and wavelet analysis. Definition; time-frequency analysis; short-time Fourier transform; Wigner-Ville and other representations; wavelet transform; scalogram; continuous wavelet transform; discrete wavelet transform: details and approximations; applications.
The Story of Wavelets: Theory and Engineering Applications. Time-frequency representation; instantaneous frequency and group delay; short-time Fourier transform: analysis; short-time Fourier transform: synthesis; discrete-time STFT.
Signal processing overview. Time-domain techniques (filters) and frequency-domain techniques (Fourier transform) for stationary signals; time-frequency domain techniques for non-stationary signals: STFT and wavelet transforms (CWT, DWT, MRA, 2-D DWT, SWT). Applications: denoising, compression, signal analysis, discontinuity detection, BME / NDE, and others.
FT At Work
FT At Work [Figure: signals and their Fourier transforms.]
FT At Work [Figure: signals and their Fourier transforms.]
Stationary and non-stationary signals. The FT identifies all spectral components present in the signal; however, it does not provide any information about the temporal (time) localization of these components. Why? Stationary signals consist of spectral components that do not change in time: all spectral components exist at all times, so there is no need for time information, and the FT works well. Non-stationary signals, however, consist of time-varying spectral components. How do we find out which spectral component appears when? The FT only tells us which spectral components exist, not where in time they are located. We need some other way to determine the time localization of spectral components.
Stationary and non-stationary signals. Stationary signals' spectral characteristics do not change with time; non-stationary signals have time-varying spectra. [Figure: concatenation of sinusoids of different frequencies.]
Non-stationary signals. [Figure: signal containing 5 Hz, 20 Hz and 50 Hz components and its spectrum.] Perfect knowledge of what frequencies exist, but no information about where these frequencies are located in time.
FT shortcomings. Complex exponentials stretch out to infinity in time; they analyze the signal globally, not locally. Hence, the FT can only tell which frequencies exist in the entire signal, but not at what time instants these frequencies occur. In order to obtain time localization of the spectral components, the signal needs to be analyzed locally. HOW?
Short Time Fourier Transform (STFT) Choose a window function of finite length Put the window on top of the signal at t=0 Truncate the signal using this window Compute the FT of the truncated signal, save. Slide the window to the right by a small amount Go to step 3, until window reaches the end of the signal For each time location where the window is centered, we obtain a different FT Hence, each FT provides the spectral information of a separate time-slice of the signal, providing simultaneous time and frequency information
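A minimal sketch of the procedure listed above, using scipy.signal.stft (the test signal, two concatenated sinusoids, is an assumed example; the window length sets the trade-off between time and frequency resolution):

```python
import numpy as np
from scipy import signal

# A minimal sketch: slide a finite window along the signal and take the FT of
# each windowed slice; scipy.signal.stft handles the window placement and FFTs.
fs = 1000.0
t = np.arange(0, 2, 1 / fs)
half = len(t) // 2
x = np.concatenate([np.sin(2 * np.pi * 50 * t[:half]),
                    np.sin(2 * np.pi * 200 * t[half:])])   # 50 Hz then 200 Hz

f, tau, Zxx = signal.stft(x, fs=fs, window="hann", nperseg=256)
# |Zxx| shows energy near 50 Hz in the first second and near 200 Hz afterwards
print(Zxx.shape)   # (number of frequencies, number of time slices)
```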
STFT
STFT Time parameter Frequency parameter Signal to be analyzed FT Kernel (basis function) STFT of signal x(t): Computed for each window centered at t=t’ Windowing function Windowing function centered at t=t’
STFT at work. A windowed sinusoid allows the FT to be computed only over the support of the windowing function. [Figure: four windowed segments of a sinusoid, samples 0 to 300.]
STFT: Time-Frequency Representation (TFR). The STFT provides time information by computing a different FT for each of a set of consecutive time intervals and then putting them together. A TFR maps a 1-D time-domain signal to a 2-D time-frequency representation. Consecutive time intervals of the signal are obtained by truncating the signal with a sliding windowing function. How do we choose the windowing function? What shape: rectangular, Gaussian, elliptic...? How wide? A wider window requires fewer time steps but gives low time resolution; the window should also be narrow enough that the portion of the signal falling within it is stationary. Can we choose an arbitrarily narrow window...?
Selection of the STFT window. Two extreme cases: W(t) infinitely long: the STFT turns into the FT, providing excellent frequency information (good frequency resolution) but no time information. W(t) infinitely short: the STFT gives the time signal back with a phase factor, providing excellent time information (good time resolution) but no frequency information. Wide analysis window: poor time resolution, good frequency resolution. Narrow analysis window: good time resolution, poor frequency resolution. Once the window is chosen, the resolution is set for both time and frequency.
Heisenberg principle. Time resolution: how well two spikes in time can be separated from each other in the transform domain. Frequency resolution: how well two spectral components can be separated from each other in the transform domain. Both time and frequency resolutions cannot be arbitrarily high: we cannot precisely know at what time instant a frequency component is located; we can only know what interval of frequencies is present in which time intervals. http://engineering.rowan.edu/~polikar/WAVELETS/WTpart2.html
STFT. [Figure: time-frequency plane, with window positions t0, t1, …, tk, tk+1, …, tn along the time axis.]
The Short-Time Fourier Transform. Take the FT of consecutive segments of a signal; each FT then provides the spectral content of that time segment only: spectral content for different time intervals, i.e. a time-frequency representation. STFT of signal x(t): computed for each windowing function (analysis window) centered at t = τ, giving the localized spectrum. [Equation annotated with: time parameter, frequency parameter, signal to be analyzed, FT kernel (basis function), windowing function.]
Properties of STFT Linear Complex valued Time invariant Time shift Frequency shift Many other properties of the FT also apply.
Alternate Representation of STFT STFT : The inverse FT of the windowed spectrum, with a phase factor
Filter interpretation of STFT. x(t) is passed through a band-pass filter whose center frequency is the analysis frequency; note that the window's spectrum itself is a low-pass filter.
Filter interpretation of STFT. [Block diagram: x(t) multiplied by the FT kernel and filtered by the window, in two equivalent arrangements.]
Resolution issues. All signal attributes located within the local window interval around time t will appear at t in the STFT. [Figure: time-frequency plane with the window placed around times k and n.]
Time-frequency resolution. Closely related to the choice of analysis window: a narrow window gives good time resolution; a wide window (narrow band) gives good frequency resolution. Two extreme cases: a window equal to a delta function gives excellent time resolution but no frequency resolution; a window identically equal to 1 gives excellent frequency resolution (the FT) but no time information. How do we choose the window length? The window length defines the time and frequency resolutions. Heisenberg's inequality: one cannot have arbitrarily good time and frequency resolutions; one must be traded for the other, and their product is bounded from below.
Time-Frequency Resolution
Time Frequency Signal Expansion and STFT Synthesis Basis functions Coefficients (weights) Synthesis window Synthesized signal Each (2D) point on the STFT plane shows how strongly a time frequency point (t,f) contributes to the signal. Typically, analysis and synthesis windows are chosen to be identical.
STFT example. [Figure: signal with components at 50 Hz, 100 Hz, 200 Hz and 300 Hz.]
STFT Example
STFT Example a=0.01
STFT Example a=0.001
STFT Example a=0.0001
STFT Example a=0.00001
Discrete-Time STFT
The Story of Wavelets Theory and Engineering Applications Time – frequency resolution problem Concepts of scale and translation The mother of all oscillatory little basis functions… The continuous wavelet transform Filter interpretation of wavelet transform Constant Q filters
Time-frequency resolution. The time-frequency resolution problem with the STFT: the analysis window dictates both time and frequency resolutions, once and for all. Narrow window: good time resolution. Narrow band (wide window): good frequency resolution. When do we need good time resolution, and when do we need good frequency resolution?
Scale and translation. Translation: time shift. Scaling: f(t) → f(a·t), a > 0: if 0 < a < 1, dilation (expansion) → lower frequency; if a > 1, contraction → higher frequency. Equivalently, f(t) → f(t/a), a > 0: if 0 < a < 1, contraction → low scale (high frequency); if a > 1, dilation (expansion) → large scale (lower frequency). The meaning of scale is similar to that in maps: large scale gives the overall view and long-term behavior; small scale gives the detailed view and local behavior.
The Mother of All Oscillatory Little Basis Functions The kernel functions used in Wavelet transform are all obtained from one prototype function, by scaling and translating the prototype function. This prototype is called the mother wavelet Translation parameter Scale parameter Normalization factor to ensure that all wavelets have the same energy
Continuous Wavelet Transform translation Mother wavelet Normalization factor Scaling: Changes the support of the wavelet based on the scale (frequency) CWT of x(t) at scale a and translation b Note: low scale high frequency
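In the usual notation (a standard reference form consistent with the quantities named above, not copied from the slide), the scaled and translated wavelet and the CWT read:

\[
\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right),
\qquad
W(a,b) = \int_{-\infty}^{+\infty} x(t)\,\psi^{*}_{a,b}(t)\,dt
       = \frac{1}{\sqrt{a}}\int_{-\infty}^{+\infty} x(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right)dt .
\]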
Computation of CWT. [Figure: the scaled wavelet is slid along the signal from translation b0 to bN, at several scales.]
WT at Work High frequency (small scale) Low frequency (large scale)
Why Wavelet? We require that the wavelet functions, at a minimum, satisfy the following: Wave… …let
The CWT as a Correlation Recall that in the L2 space an inner product is defined as then Cross correlation:
The CWT as a correlation. Meaning: W(a,b) is the cross-correlation of the signal x(t) with the mother wavelet at scale a, at the lag of b. If x(t) is similar to the mother wavelet at this scale and lag, then W(a,b) will be large.
Filtering interpretation of the wavelet transform. Recall that for a given system h[n], y[n] = x[n]*h[n]. Observe that, for any given scale a (frequency ~ 1/a), the CWT W(a,b) is the output of a filter whose impulse response is the scaled (conjugated, time-reversed) wavelet, applied to the input x(b); i.e., we have a continuum of filters, parameterized by the scale factor a.
What do Wavelets Look Like??? Mexican Hat Wavelet Haar Wavelet Morlet Wavelet
Constant-Q filtering. A special property of the filters defined by the mother wavelet is that they are so-called constant-Q filters. Q factor: the ratio of the center frequency to the bandwidth. We observe that the filters defined by the mother wavelet increase their bandwidth as the scale is reduced (i.e. as the center frequency is increased). [Figure: filter-bank responses versus ω (rad/s).]
Constant Q. [Figure: the STFT uses filters of constant bandwidth B centered at f0, 2f0, 3f0, 4f0, 5f0, 6f0, whereas the CWT uses filters whose bandwidth grows with the center frequency: B, 2B, 4B, 8B.]
Inverse CWT provided that
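The condition referred to above is, in its usual form, the admissibility condition on the mother wavelet (stated here as a standard reference, since the slide formula is not reproduced):

\[
C_\psi = \int_{-\infty}^{+\infty} \frac{|\Psi(f)|^{2}}{|f|}\,df < \infty ,
\]

which in particular requires \(\Psi(0) = \int \psi(t)\,dt = 0\) (zero mean).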
Properties of Continuous Wavelet Transform Linearity Translation Scaling Wavelet shifting Wavelet scaling Linear combination of wavelets
Example
Example
Example
Spectrogram & Scalogram Spectrogram is the square magnitude of the STFT, which provides the distribution of the energy of the signal in the time-frequency plane. Similarly, scalogram is the square magnitude of the CWT, and provides the energy distribution of the signal in the time-scale plane:
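A minimal sketch (assuming scipy and the PyWavelets package are available, with a chirp as the assumed test signal) of the spectrogram and scalogram as squared magnitudes of the STFT and CWT:

```python
import numpy as np
from scipy import signal
import pywt   # PyWavelets, assumed available

# A minimal sketch: spectrogram |STFT|^2 on the time-frequency plane and
# scalogram |CWT|^2 on the time-scale plane, for a linear chirp.
fs = 1000.0
t = np.arange(0, 2, 1 / fs)
x = signal.chirp(t, f0=10, f1=200, t1=2, method="linear")

f, tau, Sxx = signal.spectrogram(x, fs=fs, nperseg=256)        # spectrogram

scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(x, scales, "morl", sampling_period=1 / fs)
scalogram = np.abs(coeffs) ** 2                                # |CWT|^2

print(Sxx.shape, scalogram.shape)
```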
Energy. It can be shown that the energy of the signal is the same whether computed in the original time domain or in the scale-translation space. Compare this to Parseval's theorem for the Fourier transform.
CWT in terms of frequency. A time-frequency version of the CWT can also be defined, though note that this form is not standard. Here the mother wavelet is itself a band-pass function centered at t = 0 in time and f = f0 in frequency; that is, f0 is the center frequency of the mother wavelet. The original CWT expression can be obtained simply by using the substitution a = f0/f and b = … In Matlab, you can obtain the "pseudo-frequency" corresponding to any given scale, where fs is the sampling rate and Ts is the sampling period.
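The usual scale-to-frequency relation (the standard pseudo-frequency formula, consistent with the f0 and Ts mentioned above but not copied from the slide) is:

\[
f_a = \frac{f_0}{a\,T_s} ,
\]

where f0 is the center frequency of the mother wavelet, a is the scale, and Ts = 1/fs is the sampling period.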
Discretization of time and scale parameters. Recall that, if we use orthonormal basis functions as our mother wavelets, then we can reconstruct the original signal from W(a,b), the CWT of x(t). Q: Can we discretize the mother wavelet ψa,b(t) in such a way that a finite number of such discrete wavelets can still form an orthonormal basis (which in turn allows us to reconstruct the original signal)? If yes, how often do we need to sample the translation and scale parameters to be able to reconstruct the signal? A: Yes, but it depends on the choice of the wavelet!
Dyadic grid. Note that we do not have to use a uniform sampling rate for the translation parameter, since we do not need as high a time-sampling rate when the scale is high (low frequency). Let us consider a sampling grid where a is sampled on a logarithmic scale and b is sampled at a higher rate when a is small; a0 and b0 are constants, and j, k are integers.
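The sampling grid described above is usually written as (a standard form, consistent with the constants a0, b0 and the integers j, k named in the slide):

\[
a = a_0^{\,j}, \qquad b = k\,b_0\,a_0^{\,j}, \qquad j, k \in \mathbb{Z} .
\]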
Dyadic grid. If we use this discretization, we obtain the discrete wavelet family. A common choice is a0 = 2 and b0 = 1, which lends itself to a dyadic sampling grid. Then the discret(ized) wavelet transform (DWT) pair, DWT and inverse DWT, can be given as shown.
Note that... We have only discretized the translation and scale parameters a and b; time has not been discretized yet. The sampling steps of b depend on a; this makes sense, since we do not need as many samples at high scales (low frequencies). For a0 close to 1 and b0 close to zero, we obtain a very fine sampling grid, in which case the reconstruction formula is very similar to that of the CWT. For dense sampling we need not place heavy restrictions on ψ(t) to be able to reconstruct x(t), whereas sparse sampling puts heavy restrictions on ψ(t). It turns out that a0 = 2 and b0 = 1 (dyadic / octave sampling) provides a nice trade-off; for this selection, many orthonormal basis functions (to be used as mother wavelets) are available.
Discrete Wavelet Transform We have computed a discretized version of the CWT, however, we still cannot implement the given DWT as it includes a continuous time signal integrated over all times. We will later see that the dyadic grid selection will allow us to compute a truly discrete wavelet transform of a given discrete time signal.