Lecture 3 BME452 Biomedical Signal Processing 2013 (copyright Ali Işın, 2013) 1 BME452 Biomedical Signal Processing Lecture 3  Signal conditioning.

Lecture 3 Outline
In this lecture, we'll study the following signal conditioning methods (specifically for noise reduction):
- Ensemble averaging
- Median filtering
- Moving average filtering
- Principal component analysis
- Independent component analysis (in brief)
Before we study these, an introduction to some mathematics will be given.

Mean
The arithmetic mean is the "standard" average, often simply called the "mean":
$\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x[n]$
where N denotes the data size (length). In MATLAB, n=1,...,N, but sometimes we use n=0,1,...,N-1.
Example: an experiment yields the following data: 34, 27, 45, 55, 22, 34. To get the arithmetic mean:
- How many items? There are 6, therefore N=6.
- What is the sum of all items? 34+27+45+55+22+34 = 217.
- Divide the sum by N: 217/6 = 36.1667.
Expectation
- What is the expected value of X, E[X]? Simply said, it refers to the sum divided by the quantity, i.e. the mean of the value in the square brackets.
- Eg: $E[x^2] = \frac{1}{N}\sum_{n=1}^{N} x[n]^2$
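As a quick check of the example, the following MATLAB sketch computes the mean and E[x^2] for the same six data values. The variable names m and Ex2 are illustrative only.

```matlab
x = [34 27 45 55 22 34];   % data from the example
N = length(x);             % N = 6
m = sum(x) / N;            % arithmetic mean, same result as mean(x)
Ex2 = sum(x.^2) / N;       % E[x^2]: the mean of the squared values
fprintf('mean = %.4f, E[x^2] = %.4f\n', m, Ex2);
```

Note that E[x^2] is not the same as the square of the mean; the difference between the two is the (population) variance.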

Mean removal for signals
Very often, we set the mean to zero before performing any signal analysis. This is to remove the dc (0 Hz) noise:
- xm=x-mean(x)

Mean removal across channels/recordings
Sometimes a noise source corrupts all the channels of a multi-channel signal, or all the recordings of a single-channel signal.
- Since the noise is common to all the channels/recordings, the simplest way of removing it is to subtract, at each sample point, the mean taken across the channels/recordings.
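A minimal sketch of mean removal across channels, assuming the data are stored with one channel per row. The channel values and the common noise below are made up purely for illustration.

```matlab
% Hypothetical 3-channel recording: rows = channels, columns = sample points
S = [1 -1 1 -1; 2 0 -2 0; 0 3 0 -3];      % made-up channel signals
common = [5 5 -5 -5];                      % made-up noise shared by all channels
X = S + repmat(common, 3, 1);              % what is actually recorded

chanMean = mean(X, 1);                     % mean across channels at each sample point
Xclean = X - repmat(chanMean, size(X, 1), 1);
disp(Xclean);
```

One caveat of this approach: any part of the signals that is itself common to all channels is removed together with the noise, so Xclean equals S minus the across-channel mean of S.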

Standard deviation (σ)
Measures how spread out the values in a data set are.
Suppose we are given a signal x1,...,xN of real-valued numbers (all recorded signals are real-valued).
The arithmetic mean of this population is defined as
$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$
The standard deviation of this population is defined as
$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2}$
Given only a sample of values x1,...,xN from some larger population, many authors define the sample (or estimated) standard deviation by
$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}$
Its square, s^2, is an unbiased estimator of the population variance (s itself is still a slightly biased estimator of the standard deviation).

Standard deviation example

Interpreting standard deviation
A large standard deviation indicates that the data points are far from the mean, and a small standard deviation indicates that they are clustered closely around the mean.
For example, each of the three data sets (0, 0, 14, 14), (0, 6, 8, 14) and (6, 6, 8, 8) has an average of 7. Their population standard deviations (dividing by N) are 7, 5 and 1, respectively. The third set has a much smaller standard deviation than the other two because its values are all close to 7.
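The three data sets can be checked with a short MATLAB sketch. It uses std(x,1), which divides by N (population form), next to the default std(x), which divides by N-1 (sample form).

```matlab
sets = {[0 0 14 14], [0 6 8 14], [6 6 8 8]};
sigmaPop = zeros(1, 3);
sigmaSmp = zeros(1, 3);
for k = 1:numel(sets)
    x = sets{k};
    sigmaPop(k) = std(x, 1);   % flag 1: normalise by N
    sigmaSmp(k) = std(x);      % default: normalise by N-1
    fprintf('mean = %g, population std = %g, sample std = %.4f\n', ...
        mean(x), sigmaPop(k), sigmaSmp(k));
end
```

The sample values come out larger than 7, 5 and 1, which shows that the figures quoted for this example are the population form.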

Normalisation
Sometimes we may wish to normalise a signal so that its mean is 0 and its standard deviation is 1.
For example, if we record the same signal using different instruments with different amplification factors, it will be difficult to analyse the signals together.
In this case, we normalise the signals using
$x_{norm}[n] = \frac{x[n] - \bar{x}}{\sigma}$
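A small sketch of why normalisation helps, assuming two instruments that differ only by a gain and an offset. The gain of 10 and offset of 4 are hypothetical values chosen for illustration.

```matlab
x1 = [3 5 5 7];                            % recording from instrument 1 (made-up values)
x2 = 10 * x1 + 4;                          % same signal, hypothetical gain 10 and offset 4

normalise = @(x) (x - mean(x)) / std(x);   % zero mean, unit standard deviation
z1 = normalise(x1);
z2 = normalise(x2);

disp(max(abs(z1 - z2)));                   % difference between the normalised signals
```

After normalisation the two recordings coincide (up to rounding error), so they can be analysed together.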

Variance
Variance is simply the square of the standard deviation.
Uncertainty measure
- Variance may be thought of as a measure of uncertainty.
- When deciding whether measurements agree with a theoretical prediction, variance can be used.
- If the variance (computed using the predicted value as the mean) is high, then the measurements contradict the prediction.
- Example: say we have predicted that x[1]=7, x[2]=6, x[3]=5.
- x is measured 3 times => (7.2 6.7 5.6); (4.2 6.8 5.2); (11.2 6.3 5.9)
Do this =>
- Compute the variance of each x[n] over the 3 measurements, using the predicted value as the mean (and dividing by N-1=2).
- var[1]=12.76, var[2]=0.610, var[3]=0.605
- So the x[1] measurements contradict the prediction, while the x[2] and x[3] measurements probably do not.
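The variance figures in this example can be reproduced with the sketch below. It assumes each measurement is stored as one row, and it divides by N-1, which is the normalisation that matches the quoted values.

```matlab
predicted = [7 6 5];                                 % predicted x[1], x[2], x[3]
meas = [7.2 6.7 5.6; 4.2 6.8 5.2; 11.2 6.3 5.9];     % rows = the 3 repeated measurements
K = size(meas, 1);

dev = meas - repmat(predicted, K, 1);                % deviation from the PREDICTED value
varPred = sum(dev.^2, 1) / (K - 1);                  % variance about the prediction
disp(varPred);                                       % approximately 12.76  0.61  0.605
```

Using var(meas) instead would measure the spread about the measured mean, which answers a different question (repeatability rather than agreement with the prediction).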

Covariance
If we have multi-channel/multi-trial recorded signals, we can compute the cross variance, or simply covariance.
Covariance measures how two different signals (from different channels/recordings) vary together.
The covariance between two signals X and Y, with respective means μ and ν, is
$\mathrm{cov}(X,Y) = E[(X-\mu)(Y-\nu)]$
The covariance is sometimes used as a measure of "linear dependence" between the two signals, but correlation is a better measure.

Correlation
The correlation between two signals X and Y is
$\rho_{X,Y} = \frac{\mathrm{cov}(X,Y)}{\sigma_X \sigma_Y}$
- It is simply the normalised covariance.
- It measures the linear dependence between X and Y.
- The correlation is 1 in the case of an exact increasing linear relationship, −1 in the case of an exact decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables.
- The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.
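The sketch below computes covariance and correlation directly from their definitions, using made-up signals. The helper names covOf and corrOf are hypothetical; MATLAB's built-in cov and corrcoef give the same quantities arranged in 2 x 2 matrices.

```matlab
% Sample covariance and correlation written out from the definitions
covOf  = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / (numel(a) - 1);
corrOf = @(a, b) covOf(a, b) / (std(a) * std(b));

x      = [1 2 3 4 5];
yUp    = 2*x + 1;          % exact increasing linear relationship
yDown  = -3*x + 2;         % exact decreasing linear relationship
yOther = [2 1 4 3 5];      % roughly increasing, but not exactly linear

fprintf('cov(x, yOther)  = %.4f\n', covOf(x, yOther));
fprintf('corr(x, yUp)    = %.4f\n', corrOf(x, yUp));
fprintf('corr(x, yDown)  = %.4f\n', corrOf(x, yDown));
fprintf('corr(x, yOther) = %.4f\n', corrOf(x, yOther));
```

The covariance changes if a signal is rescaled, while the correlation does not, which is why correlation is the better measure of linear dependence.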

Application of correlation (example)
The diagram shows how an unknown signal can be identified:
- A copy of a known reference signal is correlated with the unknown signal.
- The correlation will be high if the reference is similar to the unknown signal.
- The unknown signal is correlated with a number of known reference functions.
- The value of the correlation shows the degree of similarity to the reference.
- The reference with the largest correlation is the most likely match.

Application of correlation (another example)
Application to heart disease detection using ECG signals:
- Cross correlation is one way in which different types of heart disease can be identified using ECG signals.
- Each heart disease is assumed to have a characteristic ECG waveform.
- Some examples of ECG signals for different diseases are shown on the slide.
- The system has a library of pre-recorded ECG signals (known as templates).
- An unknown ECG signal is correlated with all the ECG templates in this library.
- The template with the largest correlation is the most likely match for the heart disease.
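The template matching idea can be sketched as follows. The three templates and the "unknown" signal are synthetic waveforms invented for illustration (they are not ECG data), and corrOf is a hypothetical helper for the correlation coefficient.

```matlab
n = 0:19;
templates = [sin(2*pi*n/20);               % template 1: sine
             ones(1,10), -ones(1,10);      % template 2: square wave
             n/19 - 0.5];                  % template 3: ramp
names = {'sine', 'square', 'ramp'};

% Unknown signal: a scaled square wave plus a small disturbance (made up)
unknown = 3*templates(2,:) + 0.3*cos(2*pi*3*n/20);

corrOf = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / ...
    sqrt(sum((a - mean(a)).^2) * sum((b - mean(b)).^2));

r = zeros(1, size(templates, 1));
for k = 1:size(templates, 1)
    r(k) = corrOf(unknown, templates(k, :));   % correlate with every template
end
[rMax, best] = max(r);                         % largest correlation = most likely match
fprintf('Best match: %s (r = %.3f)\n', names{best}, rMax);
```

Because the correlation coefficient is normalised, the amplitude scaling of the unknown signal does not affect which template wins.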

Signal-to-noise ratio (SNR)
Before we move on to the noise reduction methods, we need a measure of the noise in the signals.
- This is important to gauge the performance of the noise reduction techniques.
For this purpose, we use SNR:
- $\mathrm{SNR} = 10\log_{10}\left(\frac{\text{signal energy}}{\text{noise energy}}\right)$ (in dB), where the energy of a signal is the sum of its squared sample values.
The original noise:
- x(noise) = x(original signal) – x(noisy signal)
After using some noise reduction method:
- x(noise) = x(original signal) – x(noise reduced signal)
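A minimal sketch of the SNR computation, assuming energy is the sum of squared samples. The helper names energy and snrdB and the test signal are illustrative only.

```matlab
energy = @(x) sum(x.^2);                                     % signal energy
snrdB  = @(orig, est) 10*log10(energy(orig) / energy(orig - est));

orig  = [1 -1 1 -1];                 % made-up clean signal, energy = 4
noisy = [1.1 -1.1 1.1 -0.9];         % made-up noisy version, noise energy = 0.04

s = snrdB(orig, noisy);              % 10*log10(4/0.04) = 20 dB
fprintf('SNR = %.2f dB\n', s);
```

The same snrdB helper can be applied to a noise reduced signal in place of noisy to see how many dB a method gains.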

Ensemble averaging
If we have many recordings, we can use ensemble averaging to reduce noise that is not correlated between the recordings.
Example: ensemble averaging to reduce noise in Evoked Potential (EP) EEG
- Repeated recordings are known as trials.
- EP EEG signals are about the same from one trial to another (high correlation).
- But the noise will be different from one trial to another (low correlation).
- Hence, it is possible to use ensemble averaging to reduce the noise.
[Figure: EP EEG; EP EEG+noise for trials 1, 2, ..., 20; EP EEG after ensemble averaging]

Worked example 1 - Ensemble averaging
Assume we have 3 signals corrupted with noise. Assume we also have the original (for SNR computation).
n                   0      1      2      3
Noisy signal 1      2.9    4.9    5.3    7.3
Noisy signal 2      2.9    5.1    5.1    6.9
Noisy signal 3      3.1    4.8    5.1    7
Original            3      5      5      7
1. Set the mean to zero first
n                   0      1      2      3
Noisy signal 1     -2.2   -0.2    0.2    2.2
Noisy signal 2     -2.1    0.1    0.1    1.9
Noisy signal 3     -1.9   -0.2    0.1    2.0
Original           -2.0    0.0    0.0    2.0
2. The ensemble average is (the average is done for each sample point n)
n                   0      1      2      3
Ensemble average   -2.1   -0.1    0.1    2.0
3. The noises in the signals are (original signal – noise corrupted signal)
n                        0      1      2      3
Signal 1 noise           0.2    0.2   -0.2   -0.2
Signal 2 noise           0.1   -0.1   -0.1    0.1
Signal 3 noise          -0.1    0.2   -0.1    0.0
Ensemble average noise   0.1    0.1   -0.1    0.0
4. SNR=10log10(e(signal)/e(noise)), with original signal energy = 8
                signal 1   signal 2   signal 3   ensemble average
noise energy    0.16       0.04       0.06       0.03
SNR (dB)        16.99      23.01      21.25      24.26
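Worked example 1 can be reproduced with the MATLAB sketch below, which starts from the mean-zero data. One point worth noting: the slide rounds the ensemble average to one decimal before computing the noise, which gives 24.26 dB; without that rounding the code gives about 23.80 dB. The helper names are illustrative.

```matlab
Xm = [-2.2 -0.2 0.2 2.2;          % mean-zero noisy signal 1
      -2.1  0.1 0.1 1.9;          % mean-zero noisy signal 2
      -1.9 -0.2 0.1 2.0];         % mean-zero noisy signal 3
orig = [-2 0 0 2];                % mean-zero original

ens = mean(Xm, 1);                % ensemble average at each sample point n

energy = @(x) sum(x.^2);
snrdB  = @(est) 10*log10(energy(orig) / energy(orig - est));

for k = 1:3
    fprintf('signal %d: %.2f dB\n', k, snrdB(Xm(k, :)));
end
snrEns = snrdB(ens);
fprintf('ensemble average (unrounded): %.2f dB\n', snrEns);
```

Either way, the ensemble average has a higher SNR than any of the three individual recordings.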

Median filtering
Similar to ensemble averaging, if we have many recordings, we can use median filtering to reduce noise that is not correlated between the recordings.
What is median filtering?
- If x[1] takes the values [3 2 1 0 6 7 9 3 2] over 9 trials, we sort the numbers from small to big and take the centre value (i.e. the 5th) as the median.
- Sorted x[1] is [0 1 2 2 3 3 6 7 9], so median x[1]=3
Median filtering is advantageous compared to ensemble averaging if one trial contains a lot of noise AND the number of trials/recordings is small.
- This is because the one heavily noise corrupted signal will distort the ensemble average values but is less likely to affect the median values.

Worked example 2 – median filtering
Assume we have 3 signals corrupted with noise, one heavily corrupted (assume the mean has been set to zero).
n                   0       1       2       3
Noisy signal 1     -2.15    0.95   -0.25    1.45
Noisy signal 2     -9       0       5       4
Noisy signal 3     -2.25   -0.45    0.85    1.85
Original           -2.0     0.0     0.0     2.0
1. The ensemble average and median filtered signals
n                   0       1       2       3
Ensemble average   -4.47    0.17    1.87    2.43
Median filter      -2.25    0       0.85    1.85
2. The noises in the signals are (original signal – noise corrupted signal)
n                               0       1       2       3
Noise in signal 1               0.15   -0.95    0.25    0.55
Noise in signal 2               7       0      -5      -2
Noise in signal 3               0.25    0.45   -0.85    0.15
Noise in ensemble averaging     2.47   -0.17   -1.87   -0.43
Noise in median filter signal   0.25    0      -0.85    0.15
3. SNR=10log10(e(signal)/e(noise)), with original signal energy = 8
                signal 1   signal 2   signal 3   ensemble average   median filter
noise energy    1.29       78         1.01       9.81               0.81
SNR (dB)        7.93       -9.89      8.99       -0.89              9.95
Which technique gave better noise reduction using SNR – ensemble averaging or median filtering? Why?
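The comparison in worked example 2 can be sketched in MATLAB as follows, using the same data. The helper names are illustrative; median(X,1) takes the median across the trials at each sample point.

```matlab
X = [-2.15  0.95 -0.25 1.45;      % noisy signal 1
     -9     0     5    4;         % noisy signal 2 (heavily corrupted)
     -2.25 -0.45  0.85 1.85];     % noisy signal 3
orig = [-2 0 0 2];

ens = mean(X, 1);                 % ensemble average across trials
med = median(X, 1);               % median across trials

energy = @(x) sum(x.^2);
snrdB  = @(est) 10*log10(energy(orig) / energy(orig - est));

fprintf('ensemble average: noise energy = %.2f, SNR = %.2f dB\n', ...
    energy(orig - ens), snrdB(ens));
fprintf('median filter:    noise energy = %.2f, SNR = %.2f dB\n', ...
    energy(orig - med), snrdB(med));
```

The heavily corrupted trial pulls every ensemble average value towards it, whereas the median simply ignores it at each sample point, so the median filter wins here.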

Moving average filtering
How do we reduce noise if we have only one signal from one recording/trial?
- We can't use ensemble averaging or median filtering.
Normally, in any signal, the few points before and after a certain point n are correlated (i.e. related), but generally the noise is not correlated.
So we can use moving average (MA) filtering. It is defined as
$y[n] = \frac{1}{S}\sum_{k=0}^{S-1} x[n+k]$
where S is the filter order.
Example: for S=3, y[5]=(x[5]+x[6]+x[7])/3
For signals x and y to remain of the same sample length:
- We have to pad (S-1) zeros to the end of the signal x to get the last (S-1) points of the signal y.

Moving average filtering – zero padding
[Figure: moving average computed if zero padding is NOT allowed, and if zero padding is allowed]

Example - moving average filtering
Assume we have an EEG signal corrupted with noise.
- Set the mean to zero.
- Apply a moving average filter to the noisy signal (use filter order=3 and 5).
A higher filter order will remove more noise, but it will also distort the signal more (i.e. remove parts of the signal as well). So a compromise has to be found for the value of S (normally by trial and error).
    load eeg;                          % assumes eeg.mat provides the vector eeg
    eeg = eeg - mean(eeg);             % set the mean to zero
    N = length(eeg);
    eegMA1 = zeros(1, N);              % filter order 3
    for i = 1:N-2
        eegMA1(i) = (eeg(i)+eeg(i+1)+eeg(i+2))/3;
    end
    eegMA1(N-1) = (eeg(N-1)+eeg(N))/2; % no zero padding: average the samples that remain
    eegMA1(N) = eeg(N);
    eegMA2 = zeros(1, N);              % filter order 5
    for i = 1:N-4
        eegMA2(i) = (eeg(i)+eeg(i+1)+eeg(i+2)+eeg(i+3)+eeg(i+4))/5;
    end
    eegMA2(N-3) = (eeg(N-3)+eeg(N-2)+eeg(N-1)+eeg(N))/4;
    eegMA2(N-2) = (eeg(N-2)+eeg(N-1)+eeg(N))/3;
    eegMA2(N-1) = (eeg(N-1)+eeg(N))/2;
    eegMA2(N) = eeg(N);
    subplot(3,1,1), plot(eeg, 'g');
    subplot(3,1,2), plot(eegMA1, 'r');
    subplot(3,1,3), plot(eegMA2, 'b');
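For an arbitrary filter order S, the same idea can be written with a loop over windows rather than one line per end point. The sketch below shows both ways of handling the last S-1 points: padding with zeros, or shrinking the window to the samples that remain. The test signal is made up, and the names yPad and yShrink are illustrative.

```matlab
x = [3 6 9 6 3 0];        % made-up signal
S = 3;                    % filter order
N = numel(x);

% Option A: pad (S-1) zeros so every window has S samples
xp = [x zeros(1, S-1)];
yPad = zeros(1, N);
for n = 1:N
    yPad(n) = sum(xp(n:n+S-1)) / S;
end

% Option B: no padding; near the end, average only the samples that remain
yShrink = zeros(1, N);
for n = 1:N
    last = min(n+S-1, N);
    yShrink(n) = mean(x(n:last));
end

disp(yPad);       % 6 7 6 3 1 0
disp(yShrink);    % 6 7 6 3 1.5 0
```

The two options agree everywhere except in the last S-1 points, where zero padding pulls the output towards zero.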

Median filter for noisy images
Consider applying median filtering to some noisy images.
In a computer, these grayscale images are stored as 2D arrays:
- x(i,j), where i and j are the coordinates and x is the grayscale value (in general from 0 (black) to 255 (white)).
[Figure: noisy images before and after applying the median filter]
A mean (averaging) filter could be applied in a similar manner, though for images the median filter normally gives better results.
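A minimal sketch of a 3 x 3 median filter on a tiny made-up image, written with plain loops so that no toolbox is needed. The window size and the choice to leave border pixels unchanged are simplifying assumptions for illustration.

```matlab
% Made-up 5 x 5 grayscale image: uniform background with two "salt and pepper" pixels
img = 10 * ones(5, 5);
img(2, 3) = 255;          % white speck
img(4, 2) = 0;            % black speck

out = img;                % border pixels are simply copied (a simplification)
[R, C] = size(img);
for i = 2:R-1
    for j = 2:C-1
        win = img(i-1:i+1, j-1:j+1);   % 3 x 3 neighbourhood around (i,j)
        out(i, j) = median(win(:));    % replace the pixel with the neighbourhood median
    end
end
disp(out);
```

Both specks disappear because each 3 x 3 window contains far more background pixels than outliers; a 3 x 3 mean filter would instead smear each speck over its neighbours.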

Principal component analysis
PCA can be used to reduce noise in signals, provided we have repeated recordings, signals from a number of trials, or multi-channel signals.
- Principal components (PCs) are obtained from PCA; they are orthogonal signals, i.e. signals that are uncorrelated with each other.
- Since noise is less correlated between the trials than the signals are, the first few PCs will account for the signals while the last few PCs will account for the noise.
- By discarding the last few PCs before reconstruction, we get the signals without noise/with less noise.

Principal component analysis - algorithm
PCA algorithm:
- Organise the data, X, in an M x N matrix
- Set the mean to zero
- Compute CX = covariance of the matrix X
- Compute the eigenvalues and eigenvectors of CX
- Sort the eigenvectors (i.e. principal components) in descending order of eigenvalue
- Compute the Zscores
- Decide how many PCs to keep using some criterion
- Reconstruct the noise reduced signals using the first few PCs and Zscores

Eigenvector, eigenvalue – a brief review
The steps of setting the mean to zero and computing the covariance have been covered earlier, so let us move to the step of computing the eigenvectors and eigenvalues.
- Let us assume that A=cov(X), where X is the mean zero data.
- In MATLAB, [V,D] = eig(A) produces a diagonal matrix of eigenvalues (D) and a matrix whose columns are the eigenvectors (V) of matrix A.
- They satisfy A*V = V*D (matrix products, not the element-wise .* operator).
- Note: A has to be a square matrix.
Eg: for the 2 x 2 matrix A on the slide, the vector (3, 2)' is an eigenvector and 4 is the eigenvalue.
- The eigenvector (3, 2)' can be thought of as the vector direction.
- The eigenvalue 4 is the weight of this vector.
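To make the eigenvector relation concrete, here is a small sketch. The matrix A below is an assumption: it is one matrix for which (3, 2)' is an eigenvector with eigenvalue 4, and it is not necessarily the matrix shown on the original slide.

```matlab
A = [2 3; 2 1];            % hypothetical matrix, chosen so that [3; 2] is an eigenvector
v = [3; 2];

disp(A * v);               % gives [12; 8], which is 4 * v

[V, D] = eig(A);           % columns of V: eigenvectors; diagonal of D: eigenvalues
lambda = diag(D);
disp(lambda');             % the eigenvalues of this A are -1 and 4 (order may vary)

residual = norm(A*V - V*D);    % should be close to zero
fprintf('||A*V - V*D|| = %.2e\n', residual);
```

eig returns eigenvectors scaled to unit length, so the column of V for eigenvalue 4 is proportional to (3, 2)' rather than equal to it.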

Eigenvector, eigenvalue (cont.)
Finding the eigenvalues and eigenvectors of matrices larger than 3 x 3 by hand is extremely tedious, so we will skip the algorithms and just use the MATLAB function eig.
Example: for the square matrix shown on the slide, decide which, if any, of the candidate vectors are eigenvectors of that matrix and give the corresponding eigenvalue.
Answer: the eigenvector is the candidate vector v for which A*v = 1*v, so the eigenvalue is 1.

Sort the eigenvectors
Sort the eigenvectors from big to small using their eigenvalues.
Let's use the example we saw earlier for ensemble averaging (worked example 1):
- X=[2.9 4.9 5.3 7.3; 2.9 5.1 5.1 6.9; 3.1 4.8 5.1 7]
- Xm=[-2.2 -0.2 0.2 2.2; -2.1 0.1 0.1 1.9; -1.9 -0.2 0.1 2.0]
- A=cov(Xm')
- The eigenvectors are obtained from [V,D]=eig(A)
- The corresponding eigenvalues are 0.0017, 0.0272, 8.4578
So now sort the eigenvectors in the order of eigenvalues: 8.4578, 0.0272, 0.0017. The sorted eigenvectors form the columns of Vsort.

Zscores
Zscores=Vsort'*Xm, where Vsort is the matrix of sorted eigenvectors and Xm is the mean zero data matrix.
- In the previous example, A is 3 x 3, so we will have 3 Zscores.
- Zscores will have the same dimensions as Xm.
[Figure: Zscore 1 and Zscore 2]

How to select the number of PCs to keep
The PCs with higher eigenvalues represent the signals, while the PCs with lower eigenvalues represent the noise. So we keep the first few PCs and discard the rest.
But how many PCs do we keep?
- Use a certain percentage of variance to retain, normally 95% or 99%.
- Eigenvalues represent the weight of the PCs, i.e. a variance (power) measure of the PCs.
- So we can use sum(D1:Dq)/sum(D1:Dlast)>0.99, where D represents the sorted eigenvalues [D1,D2,D3,...,Dlast] and q is the number of PCs kept.
In our example, say we wish to retain 99% of the variance. The eigenvalues are 8.4578, 0.0272, 0.0017, so sum(D1:Dlast)=8.4867.
- sum(D1:D1)=8.4578; sum(D1:D1)/sum(D1:Dlast)=0.9966
- sum(D1:D2)=8.4849; sum(D1:D2)/sum(D1:Dlast)=0.9998
- sum(D1:Dlast)=8.4867; sum(D1:Dlast)/sum(D1:Dlast)=1.0
Since the first eigenvalue accounts for 99.66% of the variance (which is more than 99%), we can discard the second and third PCs.
If we wish to retain 99.97%, how many PCs do we retain? Answer=2

Reconstruct using the selected PCs
To get back the signals with less noise, we reconstruct using the selected PCs only:
- Xnonoise=Vselected*Zscoreselected
In our example, only 1 PC was selected, so the first eigenvector and the first Zscore are used to get back the 3 noise reduced signals:
- Xnonoise=Vsort(:,1)*Zscore(1,:)
- noise=Xm-Xnonoise
The original signal is x=[-2 0 0 2] (the actual original mean removed signal from the earlier slide), so Energy(original signal)=8. Energy(noise) and the SNR are then computed as in the earlier worked examples.
SNR using PCA is generally higher than with ensemble averaging or median filtering, and we get 3 signal outputs, unlike the single output from ensemble averaging or median filtering.
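The whole PCA procedure for this example can be sketched in a few lines of MATLAB. The 99% threshold follows the slides; the SNR here is computed against the original signal [-2 0 0 2], which is an assumption about how the slide's SNR was obtained, so the printed values are not claimed to match it.

```matlab
Xm = [-2.2 -0.2 0.2 2.2; -2.1 0.1 0.1 1.9; -1.9 -0.2 0.1 2.0];  % mean-zero trials (rows)
orig = [-2 0 0 2];                          % original mean-removed signal

A = cov(Xm');                               % 3 x 3 covariance between the trials
[V, D] = eig(A);
[lambda, idx] = sort(diag(D), 'descend');   % eigenvalues, largest first
Vsort = V(:, idx);                          % eigenvectors in the same order

Z = Vsort' * Xm;                            % Zscores, same size as Xm

frac = cumsum(lambda) / sum(lambda);        % cumulative fraction of variance
q = find(frac >= 0.99, 1);                  % smallest number of PCs reaching 99%

Xnonoise = Vsort(:, 1:q) * Z(1:q, :);       % reconstruction from the kept PCs

energy = @(x) sum(x.^2);
for k = 1:size(Xm, 1)
    fprintf('trial %d: SNR = %.2f dB\n', k, ...
        10*log10(energy(orig) / energy(orig - Xnonoise(k, :))));
end
```

Keeping all three PCs (q = 3) reproduces Xm exactly, so all of the noise reduction comes from the components that are discarded.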

Principal component analysis – an example of application
Consider 3 noise corrupted EP signals (trials 1 to 3).
- Obtain the principal components (in descending order of eigenvalue magnitude).
- Obtain the Zscores.
- Decide how many PCs to retain; assume that we retain only the first PC.
- Reconstruct using only that one PC; this gives 3 noise reduced EP signals.
[Figure: noisy EP signals (trials 1, 2, 3) and the reconstructed EP signals (trials 1, 2, 3)]

Independent component analysis – a brief study
- ICA is a relatively new method that can be used to separate noise from signal.
- Sometimes known as blind source separation.
- Requires more than one signal recording (like PCA).
- ICA separates the recordings into independent signals (signals and noises; we keep the signals and discard the noises).
Example: assume we have 3 observed (i.e. recorded) signals x1[n], x2[n] and x3[n] from 3 original signal sources s1[n], s2[n] and s3[n]:
x1[n]=a11.s1[n]+a12.s2[n]+a13.s3[n]
x2[n]=a21.s1[n]+a22.s2[n]+a23.s3[n]
x3[n]=a31.s1[n]+a32.s2[n]+a33.s3[n]
The matrix
$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$
is known as the mixing matrix.
ICA can be used to obtain the original signals by estimating the unmixing matrix W:
- $W = A^{-1}$
The original signals can then be obtained using
s1[n]=w11.x1[n]+w12.x2[n]+w13.x3[n]
s2[n]=w21.x1[n]+w22.x2[n]+w23.x3[n]
s3[n]=w31.x1[n]+w32.x2[n]+w33.x3[n]
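The mixing model can be illustrated with a short sketch in which the mixing matrix is known, so that W can be computed directly as its inverse. The sources and the entries of A are made up; in a real recording A is unknown, and estimating W without it is exactly the job of ICA.

```matlab
n = 0:7;
s = [sin(2*pi*n/8);          % s1[n]: made-up source
     2*mod(n, 2) - 1;        % s2[n]: made-up alternating source
     n/7];                   % s3[n]: made-up ramp source

A = [1.0 0.5 0.2;            % hypothetical mixing matrix
     0.3 1.0 0.4;
     0.6 0.1 1.0];

x = A * s;                   % observed signals x1[n], x2[n], x3[n]

W = inv(A);                  % unmixing matrix (only possible here because A is known)
sHat = W * x;                % recovered sources

maxErr = max(abs(sHat(:) - s(:)));
fprintf('largest recovery error = %.2e\n', maxErr);
```

Each row of x is a weighted sum of all three sources, and each row of sHat is a weighted sum of all three observations, matching the equations for x1..x3 and s1..s3.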

Independent component analysis – a pictorial example
[Figures from Independent Component Analysis, Hyvarinen, Karhunen and Oja]

Maximising non-gaussianity using kurtosis
How does ICA work?
- The central limit theorem says that sums of independent non-gaussian random variables are closer to gaussian than the original ones => the independent signals are less gaussian than the combined signals.
- So by maximising non-gaussian behaviour, we get closer to the original signals.
- Kurtosis can be used to measure gaussian behaviour.
- BUT what is gaussian? See next slide.
[Figure: source (original) signals are less gaussian; mixed (combined) signals are more gaussian]

Gaussian and probability distributions
The Gaussian (or normal) probability distribution is
$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
BUT what is a probability distribution?
For discrete-time signals, the probability distribution is simply the number of occurrences vs value.
Eg: if x has values from 1 to 10
    x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
    count = zeros(1,10);
    for i=1:10
        count(i) = length(find(x==i));   % number of occurrences of the value i
    end
    plot(count);                         % probability distribution of x
- Gaussian distribution
- Super-gaussian distribution: the data close to the mean have more occurrences
- Sub-gaussian distribution: most of the data have a similar number of occurrences

Kurtosis
Non-gaussianity can be measured using kurtosis.
- Gaussian signals have kurtosis=3
- Sub-gaussian signals have a lower kurtosis value
- Super-gaussian signals have a higher kurtosis value
Examples
    x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
    y = kurtosis(x,0)      % bias-corrected kurtosis using MATLAB (Statistics Toolbox); y=1.9509
Gaussian distribution signal
    x = randn(1,100000);   % gaussian signal with mean=0, std=1
    plot(x);
    y = kurtosis(x,0)      % bias-corrected kurtosis; y is approximately 3.00
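For readers without the Statistics Toolbox, kurtosis can be computed from its definition as the fourth central moment divided by the squared variance. The sketch below does this for the same x; the bias-correction formula is the standard small-sample correction, which to my understanding is what kurtosis(x,0) applies. The names k1 and k0 are illustrative.

```matlab
x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
n = numel(x);
d = x - mean(x);                         % deviations from the mean

k1 = mean(d.^4) / mean(d.^2)^2;          % moment-based (biased) kurtosis
k0 = (n-1) / ((n-2)*(n-3)) * ((n+1)*k1 - 3*(n-1)) + 3;   % bias-corrected kurtosis

fprintf('biased kurtosis = %.4f, bias-corrected kurtosis = %.4f\n', k1, k0);
```

Both values are well below 3, which is consistent with x being sub-gaussian: its values are spread fairly evenly between 1 and 9.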

Example – Kurtosis for EP and noise
Original signals:
- EP signal, kurtosis=3.32
- noise, kurtosis=2.81
Recorded signals:
- X1=EP+noise, kurtosis=2.79
- X2=EP+noise, kurtosis=2.61
Can you see that kurtosis is lower for the combined signals, i.e. the actual independent signals (i.e. sources) have higher kurtosis?

Simple ICA algorithm – an example using EP and noise
ICA tries to obtain EP and noise by estimating the unmixing matrix W. The solution is
$\begin{bmatrix} EP[n] \\ noise[n] \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{bmatrix} \begin{bmatrix} X_1[n] \\ X_2[n] \end{bmatrix}$
In the beginning, we don't know the unmixing matrix!
A simple ICA method is:
- Randomly generate values in [0,1] for the unmixing matrix.
- Now EP[n]=w11.X1[n]+w12.X2[n] and noise[n]=w21.X1[n]+w22.X2[n]
- Compute the kurtosis values for these estimated EP and noise.
- Repeat with other random values for the unmixing matrix (say a thousand times).
- The unmixing matrix that gave the highest kurtosis values denotes the actual EP and noise.
Actual ICA algorithms use more complicated learning algorithms (some based on neural networks), so we'll skip them.
It suffices to know that by using certain measures like kurtosis (representing non-gaussianity), we can separate the signals into independent components.
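The random search idea can be sketched as below. This is only a rough illustration under several assumptions of mine, not the lecturer's implementation: the two sources and the mixing matrix are synthetic, candidate entries are drawn from [-1, 1] rather than [0, 1] so that negative unmixing weights are reachable, and the score is the total distance of the kurtosis from 3 (non-gaussianity) rather than the raw kurtosis.

```matlab
rng(1);                                        % reproducible random draws
n = 0:999;
ep    = exp(-((n-300)/40).^2) - 0.6*exp(-((n-500)/60).^2);  % hypothetical EP-like source
noise = sin(2*pi*n/37);                        % hypothetical sub-gaussian "noise" source
A = [0.8 0.6; 0.5 0.9];                        % hypothetical mixing matrix
X = A * [ep; noise];                           % the two recorded signals X1, X2

kurt  = @(y) mean((y - mean(y)).^4) / mean((y - mean(y)).^2)^2;
score = @(Y) abs(kurt(Y(1,:)) - 3) + abs(kurt(Y(2,:)) - 3);  % total non-gaussianity

baseScore = score(X);                          % score of the raw recordings (W = identity)
bestW = eye(2);
bestScore = baseScore;
for trial = 1:1000
    W = 2*rand(2) - 1;                         % random candidate, entries in [-1, 1]
    if abs(det(W)) < 0.1, continue; end        % skip nearly singular candidates
    sc = score(W * X);
    if sc > bestScore                          % keep the most non-gaussian result so far
        bestScore = sc;
        bestW = W;
    end
end
est = bestW * X;                               % estimated components
fprintf('score of recordings = %.3f, best score found = %.3f\n', baseScore, bestScore);
```

A known weakness of such a crude search is that nothing stops both rows of W from converging towards the same, most non-gaussian source; practical ICA algorithms add a decorrelation (whitening) step to prevent this.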

Study guide (Lecture 3)
From this week's lecture, you should know:
- Basic mathematics: mean, standard deviation, variance, covariance, correlation, autocorrelation, SNR, etc.
- Uses of these basic maths in signal analysis
- Noise reduction methods like ensemble averaging, median filtering, moving average filtering, principal component analysis and the basics of independent component analysis
End of lecture 3
