1
Signal processing
2
Example data – ChIP-Seq
3
Gaussian peak with normal noise
[Figure: three panels; axis label: Frequency]
4
Removing High Frequencies
[Plot axis: Frequency]
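A minimal sketch of removing high frequencies, assuming a plain Fourier low-pass filter: transform the data, zero every component above a cut-off, and transform back. The function name `lowpass_fft`, the cut-off `keep`, and the test signal are illustrative choices, not taken from the slides.

```python
import numpy as np

def lowpass_fft(y, keep):
    """Zero all Fourier components with index >= keep, then transform back."""
    spectrum = np.fft.rfft(y)
    spectrum[keep:] = 0.0
    return np.fft.irfft(spectrum, n=len(y))

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 1000)
clean = np.exp(-0.5 * (x / 0.1) ** 2)            # Gaussian peak
noisy = clean + 0.2 * rng.normal(size=len(x))    # add normal noise
filtered = lowpass_fft(noisy, keep=20)           # keep only the 20 lowest frequencies
```

The cut-off has to stay above the frequencies that carry the peak itself; a narrower peak needs a larger `keep`.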
5
Smoothing
w=ones(2*width+1,'d')
convolve(w/w.sum(),y,'valid')
[Plots: axis labels Intensity and Frequency]
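The two lines on the slide can be wrapped into a runnable function as follows. This is a sketch assuming NumPy's `ones` and `convolve`; the function name `moving_average` and the toy data are made up for illustration.

```python
import numpy as np

def moving_average(y, width):
    """Average over a window of 2*width+1 points.

    With mode 'valid' the output is 2*width points shorter than the input,
    and output index k is centred on input index k + width.
    """
    w = np.ones(2 * width + 1, 'd')
    return np.convolve(w / w.sum(), y, 'valid')

y = np.array([0.0, 0.0, 3.0, 0.0, 0.0, 3.0, 0.0])
smoothed = moving_average(y, width=1)   # every 3-point window contains exactly one 3.0
```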
6
Peak Finding
The derivative of a function is zero at its minima and maxima. The second derivative is negative at maxima and positive at minima.
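For sampled data the derivative rarely hits zero exactly, so one common discrete version of this rule looks for a sign change in the first difference. The sketch below assumes that approach; `find_maxima` and `find_minima` are hypothetical helper names.

```python
import numpy as np

def find_maxima(y):
    """Indices where the first difference goes from positive to non-positive."""
    dy = np.diff(y)
    return np.where((dy[:-1] > 0) & (dy[1:] <= 0))[0] + 1

def find_minima(y):
    """Indices where the first difference goes from negative to non-negative."""
    dy = np.diff(y)
    return np.where((dy[:-1] < 0) & (dy[1:] >= 0))[0] + 1

x = np.linspace(0, 2 * np.pi, 201)
y = np.sin(x)
maxima = find_maxima(y)       # one maximum, near x = pi/2
minima = find_minima(y)       # one minimum, near x = 3*pi/2
second = np.diff(y, 2)        # second[i] is centred on y[i + 1]
```

On noisy data this finds many spurious extrema, so it is normally applied after smoothing.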
7
Detection of steps
Motivation: To demonstrate a general strategy for separating signal from noise:
 - Characterize the signal and the noise
 - Make a model of the data
 - Select detection method
 - Select parameters using simulations
[Plot axis: Intensity]
8
Detection of steps: Characterization of noise
Remove signal by subtracting a moving average
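One way to sketch this step: smooth the data with a centred moving average, subtract it, and take the standard deviation of what is left as the noise level. The function name `residual_noise`, the window width, and the slowly varying test signal are assumptions for illustration.

```python
import numpy as np

def residual_noise(y, width):
    """Subtract a centred moving average (2*width+1 points) from y."""
    w = np.ones(2 * width + 1, 'd')
    trend = np.convolve(w / w.sum(), y, 'valid')
    # 'valid' drops width points at each end, so trim y to match
    return y[width:len(y) - width] - trend

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 2000)
signal = 5.0 * x                                  # hypothetical slowly varying signal
y = signal + 0.5 * rng.normal(size=len(x))        # true noise standard deviation is 0.5
noise = residual_noise(y, width=25)
noise_sd = noise.std()
```

The residual slightly underestimates the noise because each point also contributes to its own local average, and it is distorted wherever the signal changes faster than the window.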
9
Detection of steps: Model of data
[Plots: S/N=0.75, S/N=1, S/N=2]
points=1000
x = linspace(-1,1,points)
y=noise*random.normal(size=len(x))
y[points//2:]+=signal
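A self-contained version of the slide's model, as a sketch. It assumes S/N means `signal / noise` and uses integer division for the slice index, which Python 3 requires. The wrapper name `simulate_step` and the `rng` argument are additions for illustration.

```python
import numpy as np

def simulate_step(signal, noise=1.0, points=1000, rng=None):
    """Normal noise with a step of height `signal` starting at the midpoint."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(-1, 1, points)
    y = noise * rng.normal(size=len(x))
    y[points // 2:] += signal
    return x, y

x, y = simulate_step(signal=2.0, noise=1.0, rng=np.random.default_rng(2))
step_height = y[500:].mean() - y[:500].mean()   # should be close to 2
```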
10
Detection of steps: Detection method
Steps can be converted into peaks by calculating the difference between the moving averages in two windows.
[Plots: S/N=0.75, S/N=1, S/N=2]
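A sketch of this detection method, assuming the two windows are adjacent and of equal size (`bin_size`): at each candidate position the mean of the points just before it is subtracted from the mean of the points just after it. The name `step_score` and the cumulative-sum implementation are illustrative choices.

```python
import numpy as np

def step_score(y, bin_size):
    """Difference between the means of two adjacent windows of bin_size points.

    Returns (positions, score), where score[k] is
    mean(y[p:p+bin_size]) - mean(y[p-bin_size:p]) for p = positions[k].
    """
    c = np.concatenate(([0.0], np.cumsum(y)))
    means = (c[bin_size:] - c[:-bin_size]) / bin_size   # means[i] = mean(y[i:i+bin_size])
    score = means[bin_size:] - means[:-bin_size]
    positions = np.arange(len(score)) + bin_size
    return positions, score

rng = np.random.default_rng(3)
points = 1000
y = rng.normal(size=points)
y[points // 2:] += 2.0                        # step of height 2 at index 500 (S/N = 2)
positions, score = step_score(y, bin_size=30)
detected = positions[np.argmax(score)]        # the step shows up as a peak in the score
```

A larger `bin_size` averages away more noise but broadens the resulting peak, which is the trade-off the bin-size panels explore.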
11
Detection of steps: Detection method
[Figure: grid of panels for S/N = 0.75, 1, 2 and bin size = 10, 30, 100; axis labels: Average Intensity]
12
Detection of steps: Simulations - peak location
[Figure: grid of panels for S/N = 0.05, 0.25, 1 and bin size = 10, 30, 100]
13
Detection of steps: Simulations – correct peak
[Figure: grid of panels for S/N = 0.05, 0.25, 1 and bin size = 10, 30, 100; axes: Frequency and Score]
14
Detection of steps: Simulations - FDR and FNR
[Figure: False Negative Rate and False Discovery Rate curves for S/N = 0.05, 0.25, 1 and bin size = 10, 30, 100; axes: False Rate and Threshold]
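The slides do not give the simulation code, so the following is only a sketch under stated assumptions: each simulated trace contains exactly one step, the highest score in the trace is the only candidate, a candidate counts as correct if it lies within one bin of the true step, FDR = FP / (TP + FP), and FNR = 1 - TP / (number of traces). All function and parameter names are hypothetical.

```python
import numpy as np

def step_score(y, bin_size):
    """Difference between the means of two adjacent windows of bin_size points."""
    c = np.concatenate(([0.0], np.cumsum(y)))
    means = (c[bin_size:] - c[:-bin_size]) / bin_size
    score = means[bin_size:] - means[:-bin_size]
    return np.arange(len(score)) + bin_size, score

def simulate_rates(signal, noise, bin_size, thresholds,
                   n_sim=200, points=1000, seed=0):
    """Estimate FDR and FNR as a function of the score threshold."""
    rng = np.random.default_rng(seed)
    true_pos = points // 2
    best = np.empty(n_sim)
    correct = np.empty(n_sim, dtype=bool)
    for k in range(n_sim):
        y = noise * rng.normal(size=points)
        y[true_pos:] += signal
        positions, score = step_score(y, bin_size)
        i = np.argmax(score)
        best[k] = score[i]
        correct[k] = abs(positions[i] - true_pos) <= bin_size
    fdr, fnr = [], []
    for t in thresholds:
        called = best > t                     # traces whose best score passes the threshold
        tp = np.sum(called & correct)
        fp = np.sum(called & ~correct)
        fdr.append(fp / (tp + fp) if tp + fp > 0 else 0.0)
        fnr.append(1.0 - tp / n_sim)          # one true step per trace
    return np.array(fdr), np.array(fnr)

thresholds = np.linspace(0.0, 3.0, 7)
fdr, fnr = simulate_rates(signal=1.0, noise=1.0, bin_size=30, thresholds=thresholds)
```

Plotting `fdr` and `fnr` against `thresholds` for several values of `signal` and `bin_size` gives curves from which a threshold can be chosen.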
15
Peak Finding
 - Characterize the signal and the noise
 - Make a model of the data
 - Select detection method
 - Select parameters using simulations
[Plot axis: Intensity]
16
Peak Finding: Characterizing the noise
Let’s first try without removing the peaks
[Plot axis: Intensity]
17
Peak Finding: Characterizing the noise
Removing the peaks by looking for outliers in the root mean square deviation (RMSD)
[Plots: Intensity, RMSD]
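A sketch of one way to do this, assuming the RMSD is computed in a moving window around the local mean and that "outlier" means more than a few robust standard deviations (median absolute deviation, scaled by 1.4826) above the median RMSD. The names `moving_rmsd`, `noise_without_peaks`, and the cut-off `n_mad` are illustrative.

```python
import numpy as np

def moving_rmsd(y, width):
    """Root mean square deviation from the local mean in a 2*width+1 window."""
    w = np.ones(2 * width + 1, 'd')
    w /= w.sum()
    mean = np.convolve(w, y, 'valid')
    mean_sq = np.convolve(w, y * y, 'valid')
    return np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))

def noise_without_peaks(y, width, n_mad=3.0):
    """Noise estimate from windows whose RMSD is not an outlier.

    Returns (noise_estimate, keep); keep[k] refers to the window
    centred on y[k + width].
    """
    rmsd = moving_rmsd(y, width)
    med = np.median(rmsd)
    mad = np.median(np.abs(rmsd - med))
    keep = rmsd <= med + n_mad * 1.4826 * mad
    return np.median(rmsd[keep]), keep

rng = np.random.default_rng(4)
i = np.arange(2000)
y = rng.normal(size=len(i))                   # noise with standard deviation 1
for centre in (500, 1000, 1500):              # three hypothetical peaks
    y += 20.0 * np.exp(-0.5 * ((i - centre) / 5.0) ** 2)
noise_sd, keep = noise_without_peaks(y, width=10)
```

Compared with `y.std()`, which is inflated by the peaks, the estimate from the kept windows stays close to the true noise level.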
18
Peak Finding: Characterizing the peaks
[Plot axis: Intensity]
19
Peak Finding: Model of data
[Plots: S/N=1, S/N=2, S/N=4]
points=1000
x = linspace(-1,1,points)
y=noise*random.normal(size=len(x))
y+=signal*gaussian(x,0,0.01)
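The slide calls a `gaussian` function without defining it. The sketch below assumes it is a unit-height Gaussian with arguments (x, mean, standard deviation), so that `signal` is the peak height and S/N is `signal / noise`; that interpretation is a guess.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Unit-height Gaussian (assumed definition; not given on the slide)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def simulate_peak(signal, noise=1.0, points=1000, rng=None):
    """Normal noise plus a Gaussian peak of height `signal` at x = 0."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(-1, 1, points)
    y = noise * rng.normal(size=len(x))
    y += signal * gaussian(x, 0, 0.01)
    return x, y

x, y = simulate_peak(signal=4.0, noise=1.0, rng=np.random.default_rng(5))
x0, y0 = simulate_peak(signal=4.0, noise=0.0)   # noise-free version for reference
```

With 1000 points on [-1, 1], a standard deviation of 0.01 corresponds to about 5 samples.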
20
Peak Finding: Detection method
Peaks can be detected by finding maxima in the moving average with a window size similar to the peak width.
[Plots: S/N=1, S/N=2, S/N=4]
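A sketch of this detection method: smooth with a moving average, find local maxima from the sign change of the first difference, and keep those above a threshold. The function name `detect_peaks` and the threshold argument are assumptions; the slides select the threshold by simulation.

```python
import numpy as np

def detect_peaks(y, width, threshold):
    """Local maxima of the moving average (2*width+1 points) above threshold.

    Returns (positions in y, smoothed heights at those positions).
    """
    w = np.ones(2 * width + 1, 'd')
    smooth = np.convolve(w / w.sum(), y, 'valid')
    d = np.diff(smooth)
    idx = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    idx = idx[smooth[idx] > threshold]
    return idx + width, smooth[idx]       # shift back to indices of y

i = np.arange(600)
y = np.exp(-0.5 * ((i - 300) / 5.0) ** 2)     # noise-free peak, sigma = 5 samples
positions, heights = detect_peaks(y, width=5, threshold=0.5)
```

Note that smoothing lowers the apparent peak height, so the threshold applies to the smoothed height, not the raw one.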
21
Peak Finding: Detection method – moving average
[Figure: Signal panels for bin size = 5, 20, 80 and S/N = 1, 2, 4]
22
Peak Finding: Detection method – RMSD
[Figure: Signal panels for bin size = 5, 20, 80 and S/N = 1, 2, 4]
23
Peak Finding: Information about the Peak
Measures: maximum, mean, variance, skewness, kurtosis, full width at half maximum (FWHM), height, centroid (mean), area
[Plot axis: Intensity]
24
Information about a Peak
A peak is defined by measures such as its centroid (mean). To calculate any of these measures we need to know where the peak starts and ends.
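Given a start and end index, the measures can be computed by treating the intensities as weights. This is a sketch assuming uniformly spaced x values and a baseline of zero; `peak_measures` is a hypothetical name, and the FWHM here is a coarse estimate limited by the sample spacing.

```python
import numpy as np

def peak_measures(x, y, start, end):
    """Intensity-weighted moments and simple shape measures of y[start:end]."""
    xs, ys = x[start:end], y[start:end]
    dx = xs[1] - xs[0]                         # assumes uniform sampling
    p = ys / ys.sum()                          # normalized weights
    centroid = np.sum(p * xs)
    variance = np.sum(p * (xs - centroid) ** 2)
    sd = np.sqrt(variance)
    height = ys.max()
    above = np.where(ys >= height / 2)[0]      # samples at or above half maximum
    return {
        "height": height,
        "area": ys.sum() * dx,
        "centroid": centroid,
        "variance": variance,
        "skewness": np.sum(p * (xs - centroid) ** 3) / sd ** 3,
        "kurtosis": np.sum(p * (xs - centroid) ** 4) / sd ** 4,
        "fwhm": xs[above[-1]] - xs[above[0]],
    }

x = np.linspace(-1, 1, 2001)
y = np.exp(-0.5 * ((x - 0.2) / 0.05) ** 2)     # Gaussian: mean 0.2, sigma 0.05
m = peak_measures(x, y, 0, len(x))
```

For a Gaussian the expected values are skewness 0, kurtosis 3, FWHM about 2.355 sigma, and area equal to height times sigma times sqrt(2*pi).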
25
Where does a peak start and end?
26
Estimating quantity
Peak height, peak area, curve fitting
[Plot: Intensity vs. m/z]
27
What is the best way to estimate quantity?
Peak height
 - resistant to interference
 - poor statistics
Peak area
 - better statistics
 - more sensitive to interference
Curve fitting
 - better statistics
 - needs to know the peak shape
 - slow
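The three estimators can be compared side by side. This sketch makes a simplifying assumption for the curve fit: the peak position and width are already known, so only the amplitude is fitted, which reduces to a linear least-squares problem. A full fit of position and width would be nonlinear and slower. `estimate_quantity` is a hypothetical name.

```python
import numpy as np

def estimate_quantity(x, y, shape):
    """Return (height, area, fitted amplitude).

    shape: unit-height peak shape sampled at x (assumed known).
    """
    dx = x[1] - x[0]                                   # assumes uniform sampling
    height = y.max()                                   # single point: poor statistics
    area = y.sum() * dx                                # all points, but includes any interference
    amplitude = np.dot(y, shape) / np.dot(shape, shape)  # least-squares a in y ~ a * shape
    return height, area, amplitude

x = np.linspace(-1, 1, 2001)
shape = np.exp(-0.5 * (x / 0.05) ** 2)
rng = np.random.default_rng(6)
y = 3.0 * shape + 0.1 * rng.normal(size=len(x))
height, area, amplitude = estimate_quantity(x, y, shape)
```

With noise present, `height` is biased upward because it picks the largest noisy sample, while `amplitude` averages over the whole peak.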
28
Sampling
[Plot: Intensity vs. Retention Time]
29
Sampling
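To illustrate how the sampling interval affects the estimated area, the sketch below samples a unit-height Gaussian at a given spacing and sums the samples. The peak shape, the rectangle-rule sum, and the name `sampled_area` are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def sampled_area(spacing, offset=0.0, sigma=1.0):
    """Rectangle-rule area of a unit-height Gaussian sampled every `spacing`."""
    t = np.arange(-10 * sigma + offset, 10 * sigma, spacing)
    y = np.exp(-0.5 * (t / sigma) ** 2)
    return y.sum() * spacing

true_area = np.sqrt(2 * np.pi)                  # exact area for sigma = 1
fine = sampled_area(0.5)                        # about 5 samples per FWHM
coarse = sampled_area(3.0)                      # less than 1 sample per FWHM
fine_error = abs(fine - true_area) / true_area
coarse_error = abs(coarse - true_area) / true_area
```

With coarse sampling the result also depends on where the samples fall relative to the peak centre, which can be explored by varying `offset`.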
30
Summary
 - Fourier transform: transformation to frequency space and back
 - Signal: how do we detect and characterize signals?
 - Noise: how do we characterize noise?
 - Modeling signal and noise
 - Simulation to select thresholds and parameters
 - Filters: filtering by low-pass (i.e. smoothing) and high-pass filters (e.g. adaptive background correction)
 - Detection methods based on moving average and RMSD
 - Convolution: describes the response of a linear and time-invariant system to an input signal
 - Cross-correlation is a measure of similarity of two signals
 - Autocorrelation can be used for finding periodic signals obscured by noise
 - The dot product can be used to determine how similar two signals are
 - Coincidence measurements enhance the signal and suppress noise
 - The quantity associated with a peak: height and area
 - Sampling: how often do we need to sample a peak to get a good estimate of its area?
31
Peak Finding Examples
32-35
Peak Finding Example 1: RMSD, window sizes 40, 80, 160
36-40
Peak Finding Example 1: Smoothing, window sizes 40, 80, 160
41-44
Peak Finding Example 2: RMSD, window sizes 40, 80, 160
45-49
Peak Finding Example 2: Smoothing, window sizes 40, 80, 160
50
Peak Finding Example 3
51
Peak Finding Example 4
52
Peak Finding Example 5
53
Homework: Background Subtraction Using Smoothing
54
Extra Homework