Slide 1 — Lecture 3: Review of Linear Algebra; Simple Least Squares
Slide 2 — Setup for standard least squares

The straight-line model $y_i = a + b x_i$, written in matrix form:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} \quad\text{i.e.}\quad d = Gm$$
Slide 3 — Standard least-squares solution

$$m^{est} = [G^T G]^{-1} G^T d$$
Slide 4 — Practice

Set up a simple least-squares problem, identifying the vectors d and m and the matrix G. Solve it using the least-squares formula $m^{est} = [G^T G]^{-1} G^T d$.
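A minimal numerical sketch of this practice problem (the data values and the use of numpy are my own illustration, not from the slides):

```python
import numpy as np

# Hypothetical straight-line data: y_i = a + b*x_i plus noise
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # observations (vector d)

# Data kernel G: column of ones for the intercept a, column of x for the slope b
G = np.column_stack([np.ones_like(x), x])

# Least-squares formula m_est = [G^T G]^{-1} G^T d
m_est = np.linalg.inv(G.T @ G) @ G.T @ d
print("intercept a, slope b:", m_est)
```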
Slide 5 — Lecture 4: Probability and What It Has to Do with Data Analysis
Slide 6 — The Gaussian or normal distribution

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x-\bar{x})^2}{2\sigma^2} \right\}$$

with expected value $\bar{x}$ and variance $\sigma^2$. Memorize me!
Slide 7 — Properties of the normal distribution

[Figure: bell curve $p(x)$ with tick marks at $\bar{x}-2\sigma$, $\bar{x}$, $\bar{x}+2\sigma$.]

Expectation = median = mode = $\bar{x}$; 95% of the probability lies within $2\sigma$ of the expected value.
Slide 8 — Functions of a random variable

Any function of a random variable is itself a random variable. Errors propagate from observations to inferences.
Slide 9 — General rule

Given a distribution $p(x)$ (e.g. where x are observations) and a function $y(x)$ (e.g. where y are inferences):

$$p(y) = p[x(y)] \left| \frac{dx}{dy} \right|$$
Slide 10 — Linear functions

Suppose $y(x)$ is a linear function, $y = Mx$. Then, regardless of the type of distribution $p(x)$:

$$\bar{y} = M\bar{x} \qquad C_y = M C_x M^T$$

In the special case that $p(x)$ is a normal distribution, $p(y)$ is a normal distribution, too.
Slide 11 — Means and variances add

Special case: $y = Ax_1 \pm Bx_2$, so that $M = [A, B]$. Applying $\bar{y} = M\bar{x}$ and $C_y = M C_x M^T$:

$$\bar{y} = A\bar{x}_1 \pm B\bar{x}_2 \qquad \sigma_y^2 = A^2\sigma_{x_1}^2 + B^2\sigma_{x_2}^2$$

Note that the variances always add.
Slide 12 — Practice

I would say: practice transforming a distribution of two variables, $p(x_1, x_2) \rightarrow p(y_1, y_2)$, when the functions $y_1(x_1, x_2)$ and $y_2(x_1, x_2)$ are simple (but nonlinear) expressions and $p(x_1, x_2)$ is simple, too. But actually, even the simplest version would be too long for a midterm.
Slide 13 — Lecture 5: Probability and Statistics
Slide 14 — Rule for propagating error in least squares

Apply the general rule $C_y = M C_x M^T$ with $M = [G^T G]^{-1} G^T$. For uncorrelated data with equal variance, $C_d = \sigma_d^2 I$, so

$$C_m = M C_d M^T = \sigma_d^2 [G^T G]^{-1}$$
Slide 15 — Error of the mean

From this follows the famous rule for the error associated with the mean. If $G = [1, 1, \ldots, 1]^T$, then $M = [G^T G]^{-1} G^T = N^{-1}[1, 1, \ldots, 1]$ and $m^{est} = \sum_i d_i / N$: the estimated mean is a normally distributed random variable, and the width of its distribution, $\sigma_m = \sigma_d / \sqrt{N}$, decreases with the square root of the number of measurements.
Slide 16 — Practice

Set up a simple (e.g. linear) error-propagation problem by identifying the matrices M and $C_d$. Compute and interpret $C_m$ using the rule $C_y = M C_x M^T$, and then write down 95% confidence intervals.
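A minimal sketch of such a problem (the numbers and the 1.96 z-value for 95% confidence are my own illustration):

```python
import numpy as np

# Reuse the straight-line setup: M = [G^T G]^{-1} G^T
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
G = np.column_stack([np.ones_like(x), x])
M = np.linalg.inv(G.T @ G) @ G.T

# Uncorrelated data with equal variance: C_d = sigma_d^2 * I
sigma_d = 0.3                              # assumed data error
C_d = sigma_d**2 * np.eye(len(x))

# Propagate: C_m = M C_d M^T  (= sigma_d^2 [G^T G]^{-1} here)
C_m = M @ C_d @ M.T
sigma_m = np.sqrt(np.diag(C_m))

# 95% confidence intervals: m_est +/- 1.96 * sigma_m
m_est = M @ d
for name, m, s in zip(["a", "b"], m_est, sigma_m):
    print(f"{name} = {m:.3f} +/- {1.96*s:.3f} (95%)")
```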
Slide 17 — Lecture 6: Bootstraps; Maximum Likelihood Methods
Slide 18 — [Figure: bootstrap analogy. Take one cup from a pot with distribution p(y); duplicate the cup an infinite number of times; pour into a new pot. Is the distribution p(y) more or less the same thing in the two pots?]
Slide 19 — Bootstrap method

Random sampling with replacement: use the original dataset x to create many new datasets $x^{(i)}$, compute a $y(x^{(i)})$ from each, and empirically examine their distribution.
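A minimal bootstrap sketch (the statistic and the number of resamples are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # original dataset

# Resample with replacement many times; here y(x) is the sample mean
n_boot = 10000
y = np.empty(n_boot)
for i in range(n_boot):
    x_i = rng.choice(x, size=len(x), replace=True)   # new dataset x^(i)
    y[i] = x_i.mean()                                # statistic y(x^(i))

# Empirically examine the distribution of y
print("bootstrap mean:", y.mean())
print("bootstrap 95% interval:", np.percentile(y, [2.5, 97.5]))
```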
Slide 20 — The Principle of Maximum Likelihood

Given a parameterized distribution $p(x; m)$, choose m so that it maximizes

$$L(m) = \sum_i \ln p(x_i; m), \qquad \partial L / \partial m_i = 0$$

That is, the dataset that was in fact observed is the most probable one that could have been observed.
Slide 21 — Application to the normal distribution

The sample mean and sample variance are the maximum likelihood estimates of the true mean and variance of a normal distribution.
Slide 22 — Practice

I would say: use maximum likelihood to find the m associated with a parameterized distribution $p(d; m)$, when $p(d; m)$ is something fairly simple. But I think even the simplest such problem would be too long for a midterm.
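A minimal sketch of maximum likelihood for a normal distribution, maximizing L(m) on a grid (the data, the grid, and the assumption of known variance are my own illustration):

```python
import numpy as np

x = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # observed data
sigma = 2.0                                 # assume the variance is known

# Log-likelihood L(m) = sum_i ln p(x_i; m) for a normal with mean m
m_grid = np.linspace(0, 12, 1201)
L = np.array([np.sum(-0.5*np.log(2*np.pi*sigma**2)
                     - (x - m)**2 / (2*sigma**2)) for m in m_grid])

m_ml = m_grid[np.argmax(L)]
print("maximum-likelihood m:", m_ml)        # matches the sample mean
print("sample mean:", x.mean())
```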
Slide 23 — Lecture 7: Advanced Topics in Least Squares
Slide 24 — Generalized least squares

When the data are normally distributed with covariance $C_d$, maximum likelihood implies generalized least squares: minimize

$$(d - Gm)^T C_d^{-1} (d - Gm)$$

which has solution

$$m = [G^T C_d^{-1} G]^{-1} G^T C_d^{-1} d \qquad\text{and}\qquad C_m = [G^T C_d^{-1} G]^{-1}$$
Slide 25 — Uncorrelated data with different variances

In the special case of uncorrelated data with different variances,

$$C_d = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2)$$

let $d_i' = \sigma_i^{-1} d_i$ (multiply each datum by the reciprocal of its error) and $G_{ij}' = \sigma_i^{-1} G_{ij}$ (multiply each row of the data kernel by the same amount), then solve by ordinary least squares.
Slide 26 — Practice

Set up a simple least-squares problem when the data have non-uniform variance. Solve it: work out a formula for the least-squares estimate of the unknowns, and their variance as well. Interpret the results, e.g. write down 95% confidence intervals for the unknowns.
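A minimal sketch of the row-scaling recipe from the previous slide (the data values and errors are my own illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sigma = np.array([0.1, 0.3, 0.2, 0.5, 0.2])   # different error per datum

G = np.column_stack([np.ones_like(x), x])

# Scale each datum and each row of G by 1/sigma_i, then use ordinary LS
Gp = G / sigma[:, None]
dp = d / sigma

m_est = np.linalg.inv(Gp.T @ Gp) @ Gp.T @ dp
C_m = np.linalg.inv(Gp.T @ Gp)     # = [G^T C_d^{-1} G]^{-1}
print("m_est:", m_est)
print("95% half-widths:", 1.96 * np.sqrt(np.diag(C_m)))
```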
Slide 27 — Lecture 8: Advanced Topics in Least Squares, Part Two
Slide 28 — Prior information

Assumptions about the behavior of the unknowns that 'fill in' the data gaps.
Slide 29 — Overall strategy

1. Represent the observed data as a normal probability distribution with $d = d^{obs}$, $C_d$.
2. Represent prior information as a probability distribution with $m = m_A$, $C_m$.
…
5. Apply maximum likelihood to the combined distribution.
Slide 30 — Generalized least-squares solution

$$m^{est} = m_A + M[d^{obs} - Gm_A] \qquad\text{where}\qquad M = [G^T C_d^{-1} G + C_m^{-1}]^{-1} G^T C_d^{-1}$$
Slide 31 — Damped least squares

Special case: uncorrelated data and prior constraints, $C_d = \sigma_d^2 I$ and $C_m = \sigma_m^2 I$, give

$$M = [G^T G + (\sigma_d/\sigma_m)^2 I]^{-1} G^T$$

This is called damped least squares; unknown m's are filled in with their prior values $m_A$.
Slide 32 — Another special case: smoothness

$Dm$ is a measure of the roughness of m, e.g. the second derivative $d^2m/dx^2 \approx Dm$ with

$$D = \begin{bmatrix} 1 & -2 & 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & -2 & 1 & 0 & 0 & \cdots \\ & & & \ddots & & & \\ 0 & 0 & 0 & \cdots & 1 & -2 & 1 \end{bmatrix}$$
Slide 33 — The smoothness solution corresponds to generalized least squares with the choices $m_A = 0$ and $C_m^{-1} = D^T D$.
Slide 34 — Practice

Set up a simple least-squares problem when prior information about the model parameters is available. Most importantly, specify $m_A$ and $C_m$ in sensible ways. Solve it: work out a formula for the estimate of the unknowns, and their variance as well. Interpret the results, e.g. write down 95% confidence intervals for the unknowns.
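A minimal sketch of damped least squares under the uncorrelated special case of slide 31 (all numbers, including the prior values, are my own illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d_obs = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
G = np.column_stack([np.ones_like(x), x])

sigma_d, sigma_m = 0.3, 2.0          # assumed data and prior standard deviations
m_A = np.array([0.0, 1.0])           # assumed prior values of the unknowns

# M = [G^T G + (sigma_d/sigma_m)^2 I]^{-1} G^T
eps2 = (sigma_d / sigma_m)**2
M = np.linalg.inv(G.T @ G + eps2 * np.eye(2)) @ G.T

# m_est = m_A + M [d_obs - G m_A]
m_est = m_A + M @ (d_obs - G @ m_A)
print("damped least-squares estimate:", m_est)
```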
Slide 35 — Lecture 9: Interpolation and Splines
Slide 36 — Cubic splines

[Figure: a curve y(x) through knots $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$; a cubic $a + bx + cx^2 + dx^3$ in this interval, a different cubic in the next interval.]
Slide 37 — Properties

- the curve goes through the points at the ends of its interval
- $dy/dx$ match at interior points
- $d^2y/dx^2$ match at interior points
- $d^2y/dx^2 = 0$ at the end points
Slide 38 — Practice

Memorize the properties of cubic splines.
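A small sketch checking these properties numerically with SciPy's natural cubic spline (the knot values are my own illustration):

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 2.0, 1.0, 3.0, 2.5])

# 'natural' boundary conditions: second derivative = 0 at the end points
cs = CubicSpline(x, y, bc_type='natural')

print(cs(x))                 # the curve passes through every knot
print(cs(x, 2)[[0, -1]])     # d2y/dx2 is ~0 at the two end points
# First and second derivatives match at interior knots by construction
```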
Slide 39 — Lecture 10: Hypothesis Testing
Slide 40 — The Null Hypothesis

Always a variant of this theme: the results of an experiment differ from the expected value only because of random variation.
Slide 41 — 5 tests

1. $m^{obs} = m^{prior}$ when $m^{prior}$ and $\sigma^{prior}$ are known: normal distribution
2. $\sigma^{obs} = \sigma^{prior}$ when $m^{prior}$ and $\sigma^{prior}$ are known: chi-squared distribution
3. $m^{obs} = m^{prior}$ when $m^{prior}$ is known but $\sigma^{prior}$ is unknown: t distribution
4. $\sigma_1^{obs} = \sigma_2^{obs}$ when $m_1^{prior}$ and $m_2^{prior}$ are known: F distribution
5. $m_1^{obs} = m_2^{obs}$ when $\sigma_1^{prior}$ and $\sigma_2^{prior}$ are unknown: modified t distribution (not on midterm)
Slide 42 — Practice

Work through an example of each of the 4 tests: identify which test is being used, and why; identify the null hypothesis; compute the probability that the results deviate from the null hypothesis only because of random noise; interpret the results.
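A minimal sketch of the third test (a t-test of $m^{obs}$ against a known $m^{prior}$ with unknown variance) using SciPy; the data and the 0.05 threshold are my own illustration:

```python
import numpy as np
from scipy import stats

d = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
m_prior = 5.0                       # hypothesized prior mean

# Null hypothesis: the sample mean differs from m_prior only by random variation
t_stat, p_value = stats.ttest_1samp(d, popmean=m_prior)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("reject the null hypothesis at 95% confidence")
else:
    print("cannot reject the null hypothesis")
```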
Slide 43 — Lecture 11: Linear Systems
Slide 44 — The output ("response") of a linear system can be calculated by convolving its input ("forcing") with its impulse response.
Slide 45 — Convolution integral

[Figure: the impulse response h(t), the output $\theta(t) = g(t) * h(t)$, and the individual contributions $h(\tau)\,g(t-\tau)$ summed over $\tau$.]

$$\theta(t) = \int_{-\infty}^{t} g(t-\tau)\, h(\tau)\, d\tau$$
Slide 46 — How to do convolution by hand

Given $x = [x_0, x_1, x_2, x_3, x_4, \ldots]^T$ and $y = [y_0, y_1, y_2, y_3, y_4, \ldots]^T$, reverse one time series and line the two up as shown:

x0, x1, x2, x3, x4, …
…  y4, y3, y2, y1, y0

Multiply the overlapping rows: $x_0 y_0$ is the first element of $x * y$.
Slide 47 — Slide the reversed series to increase the overlap by one, multiply the rows, and add the products: $x_0 y_1 + x_1 y_0$ is the second element. Slide again, multiply, and add: $x_0 y_2 + x_1 y_1 + x_2 y_0$ is the third element. Repeat until the time series no longer overlap.
Slide 48 — Mathematically equivalent ways to write the convolution

$$\theta(t) = \int_{-\infty}^{t} g(t-\tau)\, h(\tau)\, d\tau \qquad\text{or alternatively}\qquad \theta(t) = \int_{0}^{\infty} g(\tau)\, h(t-\tau)\, d\tau$$

In the first form, $h(\tau)$ is "forward in time"; in the second, $g(\tau)$ is "forward in time".
Slide 49 — Matrix formulations

$$\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_N \end{bmatrix} = \begin{bmatrix} g_0 & 0 & \cdots & 0 \\ g_1 & g_0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ g_N & \cdots & g_1 & g_0 \end{bmatrix} \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_N \end{bmatrix} \quad\text{i.e.}\quad \theta = Gh$$

and, with the roles of g and h swapped,

$$\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_N \end{bmatrix} = \begin{bmatrix} h_0 & 0 & \cdots & 0 \\ h_1 & h_0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ h_N & \cdots & h_1 & h_0 \end{bmatrix} \begin{bmatrix} g_0 \\ g_1 \\ \vdots \\ g_N \end{bmatrix} \quad\text{i.e.}\quad \theta = Gg$$
Slide 50 — Practice

Do some convolutions by hand. Make sketch-plots of the input, output, and impulse response.
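A quick numerical check of the by-hand procedure (the series values are my own illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])      # input
y = np.array([1.0, -1.0, 0.5])     # impulse response

# np.convolve implements exactly the reverse-slide-multiply-add recipe
print(np.convolve(x, y))

# First few elements done by hand, as on slides 46-47:
print(x[0]*y[0])                          # element 0
print(x[0]*y[1] + x[1]*y[0])              # element 1
print(x[0]*y[2] + x[1]*y[1] + x[2]*y[0])  # element 2
```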
Slide 51 — Lecture 12: Filter Theory
Slide 52 — Digital filters

$$y_k = \sum_{p=-\infty}^{k} f_{k-p}\, x_p$$

The output $y_k$ is obtained from the input $x_k$ by convolving with the filter $f_k$. A "digital" filter is a generic way to construct a time series.
Slide 53 — The z-transform

Turn a time series into a polynomial and vice versa:

time series $x = [x_0, x_1, x_2, x_3, x_4, \ldots]^T$ ↔ polynomial $x(z) = x_0 + x_1 z + x_2 z^2 + x_3 z^3 + x_4 z^4 + \ldots$

Convolving time series is equivalent to multiplying their z-transforms.
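A small sketch verifying that convolution matches z-transform multiplication, using numpy's polynomial routines (coefficients in ascending powers of z, matching the slide; the series are my own illustration):

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = [1.0, 2.0, 3.0]    # x(z) = 1 + 2z + 3z^2
f = [1.0, -0.5]        # f(z) = 1 - 0.5z

# Multiplying the z-transforms...
prod = P.polymul(x, f)
# ...gives the same coefficients as convolving the time series
conv = np.convolve(x, f)

print(prod)   # [ 1.   1.5  2.  -1.5]
print(conv)   # [ 1.   1.5  2.  -1.5]
```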
Slide 54 — Inverse filters

If $f = [1, -f_1]^T$ then $f^{inv} = [1, f_1, f_1^2, f_1^3, \ldots]^T$. The inverse filter only exists when $|f_1| < 1$, for otherwise the elements of $f^{inv}$ grow without bound.
Slide 55 — Any filter of length N can be written as a cascade of N-1 length-2 filters:

$$f = [f_0, f_1, f_2, f_3, \ldots, f_{N-1}]^T = [-r_1, 1]^T * [-r_2, 1]^T * \cdots * [-r_{N-1}, 1]^T$$

where the $r_i$ are the roots of $f(z)$.
Slide 56 — In the general case, an inverse filter only exists when the roots $r_i$ of the corresponding $f(z)$ satisfy $|r_i| > 1$. Such a filter is said to be "minimum phase".
Slide 57 — Practice

Given a relatively short filter f (3 or 4 coefficients): factor it into a cascade of 2-element filters by computing the roots of $f(z)$, and determine whether the filter f has an inverse.
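A minimal sketch of this practice problem (the filter coefficients are my own illustration; note np.roots expects the highest-order coefficient first, so the filter is reversed):

```python
import numpy as np

f = np.array([1.0, -0.9, 0.2])     # f(z) = 1 - 0.9z + 0.2z^2

# Roots of f(z); np.roots wants descending powers, so reverse the filter
r = np.roots(f[::-1])
print("roots of f(z):", r)          # the cascade factors are [-r_i, 1]

# Minimum phase (an inverse filter exists) iff all |r_i| > 1
if np.all(np.abs(r) > 1):
    print("minimum phase: inverse filter exists")
else:
    print("not minimum phase: no stable inverse")
```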