John Federici NJIT Physics Department


Physics 114: Lecture 15 Least Squares Fit to Polynomials and Exponentials

Star Wars Trivia
Which of the following (according to IMDb.com) is NOT a ‘tagline’ from Star Wars: The Force Awakens?
- Every generation has a story.
- Coming to your galaxy this winter.
- The force is calling to you. Just let it in.
- I am out of commission awhile and everyone gets delusions of grandeur
- A long time ago in a galaxy far, far away...

Philosophy about ‘coding’ your own fitting functions
When I was your age, and blue jeans were really blue and only cost $5, we HAD to write our own analysis code to fit data because MATLAB, Mathcad and other commercial software tools had not been developed yet. Now, unless you are specializing in data analysis as a field, you will GENERALLY USE MATLAB, Mathcad, etc. and NEVER write your own C code (or Fortran or Basic, depending on how old you are!). So it's my philosophy that I will NOT make you write your own least-squares fitting functions. Instead we will use the built-in POWER OF THE FORCE… namely MATLAB.

Philosophy … continued
So… rather than have you write your own routines, we will ‘go over the math’ so that you understand the BASIC concepts of how a least-squares optimization is done, but I will NOT expect you to write MATLAB code for it. I DO EXPECT that you will use the Curve Fitting App. So, we will keep the math to a minimum, emphasize the overall process, and then get into details when we show how it is implemented in MATLAB.

Reminder, Linear Least Squares
We start with a smooth line of the form $y(x) = a + bx$, which is the “curve” we want to fit to the data. The chi-square for this situation is $\chi^2 = \sum_i \left[ (y_i - a - b x_i)/\sigma_i \right]^2$. To minimize any function, you know that you should take the derivative and set it to zero. But take the derivative with respect to what? Obviously, we want to find the constants $a$ and $b$ that minimize $\chi^2$, so we will form two equations: $\partial\chi^2/\partial a = 0$ and $\partial\chi^2/\partial b = 0$.
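Carrying out the two derivatives explicitly may make the next step easier to follow. This is a sketch that assumes the line is written as $y = a + bx$, that each point $(x_i, y_i)$ has uncertainty $\sigma_i$, and that $\chi^2 = \sum_i [(y_i - a - b x_i)/\sigma_i]^2$; the intermediate steps are filled in here and are not taken from the slide.

```latex
\begin{align}
\frac{\partial \chi^2}{\partial a} &= -2 \sum_i \frac{y_i - a - b x_i}{\sigma_i^2} = 0, \\
\frac{\partial \chi^2}{\partial b} &= -2 \sum_i \frac{x_i \,(y_i - a - b x_i)}{\sigma_i^2} = 0,
\end{align}
which rearrange to two linear equations in the two unknowns $a$ and $b$:
\begin{align}
a \sum_i \frac{1}{\sigma_i^2} + b \sum_i \frac{x_i}{\sigma_i^2} &= \sum_i \frac{y_i}{\sigma_i^2}, \\
a \sum_i \frac{x_i}{\sigma_i^2} + b \sum_i \frac{x_i^2}{\sigma_i^2} &= \sum_i \frac{x_i y_i}{\sigma_i^2}.
\end{align}
```

Because $a$ and $b$ appear only to the first power, the system can be solved exactly, for instance with 2 x 2 determinants.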

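Polynomial Least Squares
Let’s now allow a curved line of polynomial form, which is the curve we want to fit to the data. For simplicity, let’s consider a second-degree polynomial (quadratic), $y(x) = a + bx + cx^2$. The chi-square for this situation is $\chi^2 = \sum_i \left[ (y_i - a - b x_i - c x_i^2)/\sigma_i \right]^2$. Following exactly the same approach as before, we end up with three equations in three unknowns (the parameters $a$, $b$ and $c$): $\partial\chi^2/\partial a = 0$, $\partial\chi^2/\partial b = 0$ and $\partial\chi^2/\partial c = 0$.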

Second-Degree Polynomial
The solution, then, can be found from the same determinant technique we used before, except now we have 3 x 3 determinants. You can see that extending to arbitrarily high powers is straightforward, if tedious. SO LET’S NOT EXTEND IT IN THE MATH… Use the POWER OF THE FORCE, LUKE!
Note that LINEAR least-squares fitting works because the model is LINEAR with respect to the fitting coefficients.
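For readers who want to see the determinant technique in action, here is a minimal MATLAB sketch. It assumes the quadratic is written as y = a + b*x + c*x^2 and that every point has the same uncertainty, so the sigma terms cancel; the variable names (S0…S4, M, rhs) are illustrative and not from the slides. It solves the three normal equations with Cramer's rule and compares the result with polyfit.

```matlab
% Sketch: solve the 3x3 normal equations for y = a + b*x + c*x^2 directly.
% Assumes equal uncertainties on every point, so sigma_i drops out.
x = -3:0.1:3;
y = randn(size(x))*2 - 2 + 3*x + 1.5*x.^2;   % synthetic quadratic data with noise

% Sums that appear in the three normal equations
S0 = numel(x);  S1 = sum(x);  S2 = sum(x.^2);  S3 = sum(x.^3);  S4 = sum(x.^4);
M   = [S0 S1 S2; S1 S2 S3; S2 S3 S4];
rhs = [sum(y); sum(x.*y); sum(x.^2.*y)];

% Cramer's rule: replace one column of M by rhs for each unknown
D = det(M);
a = det([rhs M(:,2) M(:,3)]) / D;
b = det([M(:,1) rhs M(:,3)]) / D;
c = det([M(:,1) M(:,2) rhs]) / D;

% polyfit returns the highest power first, i.e. [c b a]
p = polyfit(x, y, 2);
disp([c b a; p])    % the two rows should agree to rounding error
```

In practice one would let polyfit (or the backslash operator) do this work; the explicit determinants are shown only to connect the code to the math.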

MATLAB Example: 2nd-Degree Polynomial Fit
First, create a set of points that follow a second-degree polynomial, with some random errors, and plot them:
    x = -3:0.1:3;
    y = randn(1,61)*2 - 2 + 3*x + 1.5*x.^2;
    plot(x,y,'.')
Now use polyfit to fit a second-degree polynomial:
    p = polyfit(x,y,2)
prints
    p = 1.5174 3.0145 -2.5130
The coefficients are returned highest power first, so these are the x^2, x and constant terms.
Now overplot the fit
    hold on
    plot(x,polyval(p,x),'r')
And the original function
    plot(x,-2 + 3*x + 1.5*x.^2,'g')
Notice that the points scatter about the fit. Look at the residuals.

MATLAB Example (cont’d): 2nd-Degree Polynomial Fit
The residuals are the differences between the points and the fit:
    resid = y - polyval(p,x)
    figure
    plot(x,resid,'.')
The residuals appear flat and random, which is good. Check the standard deviation of the residuals:
    std(resid)
prints
    ans = 1.9475
This is close to the value of 2 we used when creating the points.

MATLAB Example (cont’d): Chi-Square for Fit
We could take our set of points, generated from a 2nd-order polynomial, and fit a 3rd-order polynomial:
    p2 = polyfit(x,y,3)
    hold off
    plot(x,polyval(p2,x),'.')
The fit looks the same, but there is a subtle difference due to the use of an additional parameter. Let’s look at the standard deviation of the new residuals:
    resid2 = y - polyval(p2,x)
    std(resid2)
prints
    ans = 1.9312
Is this a better fit? The residuals are slightly smaller, BUT check chi-square.
    chisq1 = sum((resid/std(resid)).^2) % prints 60.00
    chisq2 = sum((resid2/std(resid2)).^2) % prints 60.00
They look identical, but now consider the reduced chi-square, which divides by the number of degrees of freedom (61 points minus 3 or 4 fitted parameters):
    sum((resid/std(resid)).^2)/58. % prints 1.0345
    sum((resid2/std(resid2)).^2)/57. % prints 1.0526
=> 2nd-order fit is preferred
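A short worked step may help explain why both chi-square values come out to exactly 60. This assumes, as in the code on this slide, that each set of residuals $r_i$ is normalized by its own sample standard deviation $s$ (MATLAB's std, which divides by $N - 1$), and that the residuals have zero mean, which holds for a least-squares polynomial fit that includes a constant term. Here $N = 61$ is the number of points and $m$ the number of fitted parameters.

```latex
\begin{align}
s^2 &= \frac{1}{N-1} \sum_i r_i^2
\quad\Longrightarrow\quad
\chi^2 = \sum_i \left( \frac{r_i}{s} \right)^2 = N - 1 = 60, \\
\chi^2_\nu &= \frac{\chi^2}{N - m}: \qquad
\frac{60}{61 - 3} \approx 1.0345, \qquad
\frac{60}{61 - 4} \approx 1.0526 .
\end{align}
```

With this normalization the comparison reflects only the parameter count. An independent estimate of $\sigma$ (for example the value 2 used to generate the points) would let chi-square itself respond to the quality of the fit.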

Now use the FORCE LUKE
Now let’s analyze similar data using the CURVE FITTING APP.
Linear model Poly1:
    f(x) = p1*x + p2
Coefficients (with 95% confidence bounds):
    p1 = 3.11 (2.39, 3.829)
    p2 = 2.138 (0.8715, 3.405)
Goodness of fit:
    SSE: 1443
    R-square: 0.559
    Adjusted R-square: 0.5515
    RMSE: 4.945
Linear model Poly2:
    f(x) = p1*x^2 + p2*x + p3
Coefficients (with 95% confidence bounds):
    p1 = 1.581 (1.381, 1.781)
    p2 = 3.11 (2.795, 3.425)
    p3 = -2.762 (-3.595, -1.929)
Goodness of fit:
    SSE: 271.9
    R-square: 0.9169
    Adjusted R-square: 0.914
    RMSE: 2.165

So 2 is better than 1, how about a cubic equation?
Linear model Poly3:
    f(x) = p1*x^3 + p2*x^2 + p3*x + p4
Coefficients (with 95% confidence bounds):
    p1 = -0.04318 (-0.1735, 0.08712)
    p2 = 1.581 (1.38, 1.782)
    p3 = 3.351 (2.558, 4.143)
    p4 = -2.762 (-3.599, -1.925)
Goodness of fit:
    SSE: 269.9
    R-square: 0.9175
    Adjusted R-square: 0.9132
    RMSE: 2.176
Based on the adjusted R-square value, the quadratic fit is a LITTLE bit better (0.914 versus 0.9132). Note also that the 95% confidence bounds on the cubic coefficient p1 include zero.

9th order polynomial
Linear model Poly9:
    f(x) = p1*x^9 + p2*x^8 + p3*x^7 + p4*x^6 + p5*x^5 + p6*x^4 + p7*x^3 + p8*x^2 + p9*x + p10
Coefficients (with 95% confidence bounds):
    p1 = 0.0004643 (-0.0103, 0.01122)
    p2 = 0.005786 (-0.01047, 0.02204)
    p3 = -0.02197 (-0.234, 0.19)
    p4 = -0.0778 (-0.3611, 0.2055)
    p5 = 0.2454 (-1.166, 1.657)
    p6 = 0.2856 (-1.289, 1.86)
    p7 = -0.9302 (-4.559, 2.699)
    p8 = 1.293 (-1.709, 4.295)
    p9 = 4.146 (1.198, 7.095)
    p10 = -2.675 (-4.084, -1.266)
Goodness of fit:
    SSE: 251.9
    R-square: 0.923
    Adjusted R-square: 0.9094
    RMSE: 2.222
Slightly worse based on adjusted R-square, but there is a bigger problem: if you use TOO many parameters you are ‘fitting the noise’.
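The goodness-of-fit numbers reported by the app can be approximated by hand. The sketch below assumes the standard degrees-of-freedom form of adjusted R-square, 1 - (SSE/(n - m)) / (SST/(n - 1)), with n points and m coefficients. Plugging the quoted Poly2 values (R-square 0.9169, n = 61, m = 3) into that formula gives 0.914, consistent with the output above, but the formula is still an assumption about the app's definition. The data are freshly generated, so the numbers will differ from the slides.

```matlab
% Sketch: SSE, R-square and adjusted R-square versus polynomial degree.
% The adjusted R-square formula is the usual degrees-of-freedom correction;
% it is assumed (not confirmed here) to match the Curve Fitting App.
x = -3:0.1:3;
y = randn(size(x))*2 - 2 + 3*x + 1.5*x.^2;   % synthetic quadratic data, sigma = 2

n       = numel(x);
degrees = [1 2 3 9];
sse     = zeros(size(degrees));
r2      = zeros(size(degrees));
adjr2   = zeros(size(degrees));
sst     = sum((y - mean(y)).^2);             % total sum of squares about the mean

for k = 1:numel(degrees)
    m        = degrees(k) + 1;               % number of fitted coefficients
    p        = polyfit(x, y, degrees(k));
    resid    = y - polyval(p, x);
    sse(k)   = sum(resid.^2);
    r2(k)    = 1 - sse(k)/sst;
    adjr2(k) = 1 - (sse(k)/(n - m)) / (sst/(n - 1));
end

% Columns: degree, SSE, R-square, adjusted R-square
disp([degrees; sse; r2; adjr2]')
```

SSE and R-square can only improve as parameters are added, which is why they cannot be used on their own to choose the polynomial order; the adjusted value penalizes the extra coefficients.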

Linear Fits, Polynomial Fits, Nonlinear Fits
When we talk about a fit being linear or nonlinear, we mean linear in the coefficients (parameters), not in the independent variable. Thus, a polynomial fit is linear in the coefficients $a$, $b$, $c$, etc., even though those coefficients multiply nonlinear terms in the independent variable $x$ (e.g. $cx^2$). Polynomial fitting is therefore still linear least-squares fitting, even though we are fitting a nonlinear function of the independent variable $x$.
The reason this is considered linear fitting is that for $n$ parameters we can obtain $n$ linear equations in $n$ unknowns, which can be solved exactly (for example, by the method of determinants using Cramer’s Rule, as we have done). In general, this cannot be done for functions that are nonlinear in the parameters (e.g. fitting a Gaussian function $f(x) = a \exp\{-[(x - b)/c]^2\}$, or a sine function $f(x) = a \sin[bx + c]$). We will discuss nonlinear fitting next time, when we discuss Chapter 8.
However, there is an important class of functions that are nonlinear in the parameters but can be linearized (cast in a form that becomes linear in the coefficients). We will now take a look at that.
Apr 12, 2010

Linearizing Non-Linear Fits
Consider the equation $y = a e^{bx}$, where $a$ and $b$ are the unknown parameters. Rather than consider $a$ and $b$, we can take the natural logarithm of both sides and consider instead the function $\ln y = \ln a + bx$. This is linear in the parameters $\ln a$ and $b$, where chi-square is $\chi^2 = \sum_i \left[ (\ln y_i - \ln a - b x_i)/\sigma_i' \right]^2$. Notice, though, that we must use uncertainties $\sigma_i'$ instead of the usual $\sigma_i$ to account for the transformation of the dependent variable: $\sigma_i' = \sigma_i / y_i$.
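The primed uncertainty follows from first-order error propagation. The short derivation below is a sketch that assumes the model $y = a e^{bx}$ with uncertainty $\sigma_i$ on each $y_i$. The numerical values are worked out here from the error model $\sigma = 0.03\sqrt{y}$ and the parameters $a = 0.5$, $b = -0.75$ used in this lecture's MATLAB example; they are not quoted from the slides.

```latex
\begin{align}
\sigma_i' &= \left| \frac{\partial (\ln y)}{\partial y} \right|_{y_i} \sigma_i = \frac{\sigma_i}{y_i}, \\
x = 1:&\quad y = 0.5\,e^{-0.75} \approx 0.2362,\quad \sigma \approx 0.0146,\quad \sigma' \approx 0.062, \\
x = 10:&\quad y = 0.5\,e^{-7.5} \approx 2.77\times10^{-4},\quad \sigma \approx 4.99\times10^{-4},\quad \sigma' \approx 1.8 .
\end{align}
```

The first-order formula is only a good approximation when $\sigma_i$ is small compared with $y_i$, so points in the tail of that example, where $\sigma'$ exceeds 1 and a noisy $y_i$ can even go negative, should be treated with caution.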

MATLAB Example: Linearizing An Exponential
First, create a set of points that follow the exponential, with some random errors, and plot them:
    x = 1:10;
    y = 0.5*exp(-0.75*x);
    sig = 0.03*sqrt(y); % errors proportional to sqrt(y)
    dev = sig.*randn(1,10);
    errorbar(x,y+dev,sig)
Now convert using log(yi), which is MATLAB for ln(yi):
    logy = log(y+dev);
    plot(x,logy,'.')
As predicted, the points now make a pretty good straight line. What about the errors? You might think this will work:
    errorbar(x, logy, log(sig))
Try it! What is wrong?

MATLAB Example (cont’d): Linearizing An Exponential
The correct errors are as noted earlier:
    logsig = sig./y;
    errorbar(x, logy, logsig)
This now gives the correct plot. Let’s go ahead and try a linear fit. Remember, to do a weighted linear fit we use glmfit(). The weights should be the inverse variances, 1/logsig^2, so that points with larger uncertainty count for less:
    p = glmfit(x,logy,'normal','weights',1./logsig.^2);
    p = circshift(p,1); % swap order of parameters
    hold on
    plot(x,polyval(p,x),'r')
To plot the line over the original data:
    hold off
    errorbar(x,y+dev,sig)
    hold on
    plot(x,exp(polyval(p,x)),'r')
Note that the true parameters are a′ = ln a = -0.6931 and b′ = b = -0.75; the fitted values in p should be close to these.
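If the Statistics Toolbox function glmfit is not available, the same kind of weighted straight-line fit can be sketched with the backslash operator. This is an illustrative alternative, not the method used in the slides. It assumes inverse-variance weighting and uses noise-free data so that the expected answer (a = 0.5, b = -0.75) is known in advance; the names A, w and coef are hypothetical.

```matlab
% Sketch: weighted straight-line fit to (x, ln y) without glmfit.
% Scaling each row by 1/sigma' is equivalent to weighting each point
% by the inverse variance 1/sigma'^2 in the least-squares sum.
x      = 1:10;
y      = 0.5*exp(-0.75*x);          % noise-free values, so the answer is known
sig    = 0.03*sqrt(y);              % same error model as the lecture example
logy   = log(y);
logsig = sig./y;                    % propagated uncertainty on ln(y)

A    = [ones(numel(x),1), x(:)];    % columns: intercept (ln a), slope (b)
w    = 1./logsig(:);                % row scaling = sqrt of the weight
coef = (A .* [w w]) \ (logy(:) .* w);

lna = coef(1);                      % estimate of ln(a)
b   = coef(2);                      % estimate of b
fprintf('a = %.4f, b = %.4f\n', exp(lna), b);
```

With real (noisy) data, replace y by the measured values; any points that come out zero or negative must be dropped before taking the logarithm.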

USE THE FORCE LUKE!
Do the same problem, but using the Curve Fitting Tool:
    x = 1:10;
    yy = 0.5*exp(-0.75*x);
    sig = 0.03*sqrt(yy); % errors proportional to sqrt(y)
    dev = sig.*randn(1,10);
    y = 0.5*exp(-0.75*x)+dev;
    errorbar(x,y,sig);
General model Exp1:
    f(x) = a*exp(b*x)
Coefficients (with 95% confidence bounds):
    a = 0.4626 (0.3933, 0.5319)
    b = -0.6631 (-0.7758, -0.5504)
Goodness of fit:
    SSE: 5.285e-06
    R-square: 0.9867
    Adjusted R-square: 0.9851
    RMSE: 0.0008128

Let’s do the fit as a LINEAR fit
    logsig = sig./yy;
    logy = log(y);
    errorbar(x, logy, logsig)
    p = glmfit(x,logy,'normal','weights',1./logsig.^2); % weights are inverse variances
    p = circshift(p,1); % swap order of parameters
    hold on
    plot(x,polyval(p,x),'r')
WHY does the generalized LINEAR fit look BETTER than the exponential fit (plotted on a log scale)?
The EMPIRE STRIKES BACK!!! The bottom curve is the best fit done with a LINEAR y scale, i.e. it fits the EXPONENTIAL function directly and does not LINEARIZE the equation.
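One way to explore the question is to run both fits on the same data and score each on both scales. The sketch below is illustrative only. It uses fminsearch for the direct exponential fit (the Curve Fitting Tool may use a different algorithm), the backslash operator in place of glmfit, and a fixed, made-up set of deviations standing in for one draw of randn so that the script is repeatable.

```matlab
% Sketch: direct (unweighted) exponential fit versus linearized weighted fit.
% The deviations are fixed, made-up multiples of sigma, not real random draws.
x   = 1:10;
yy  = 0.5*exp(-0.75*x);
sig = 0.03*sqrt(yy);
dev = sig .* [0.5 -1.2 0.3 0.8 -0.4 1.1 -0.7 0.2 0.6 -0.3];
y   = yy + dev;                               % all values stay positive here

% Linearized fit: weighted straight line through (x, ln y)
logsig = sig./yy;
A    = [ones(numel(x),1), x(:)];
w    = 1./logsig(:);
c    = (A .* [w w]) \ (log(y(:)) .* w);
pLin = [exp(c(1)), c(2)];                     % [a, b] from the linearized fit

% Direct fit: minimize the unweighted sum of squares on the linear y scale
sse  = @(q) sum((y - q(1)*exp(q(2)*x)).^2);
pDir = fminsearch(sse, pLin);                 % start from the linearized answer

% Score both fits on both scales
sseLin = [sse(pLin), sse(pDir)];              % linear-scale sum of squares
logres = @(q) sum(((log(y) - log(q(1)) - q(2)*x)./logsig).^2);
chiLog = [logres(pLin), logres(pDir)];        % weighted log-scale chi-square
disp([pLin; pDir]);  disp(sseLin);  disp(chiLog);
```

The direct fit wins on the linear scale, where the first few (largest) points dominate the sum of squares, while the linearized fit wins on the log scale, where the tail points count according to their weights. That difference in what is being minimized is one plausible reason the two curves look different on a log plot.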

Summary
Use THE FORCE LUKE… the Curve Fitting Tool. If you want to go ‘old school’, use polyfit() for polynomial fitting.
A polynomial fit is still considered linear least-squares fitting, despite its dependence on powers of the independent variable, because it is linear in the coefficients (parameters).
For some problems, such as exponentials, $y = a e^{bx}$, one can linearize the problem. Another type that can be linearized is a power-law expression, $y = a x^b$.
When linearizing, the errors must be handled properly, using the usual error propagation equation, e.g. $\sigma_{\ln y} = \sigma_y / y$.
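The power-law case can be sketched the same way as the exponential. Assuming the form y = a*x^b, taking logs of both sides gives ln(y) = ln(a) + b*ln(x), which is a straight line in (ln x, ln y). The values a = 2 and b = 1.5 below are arbitrary illustration choices, and the data are noise-free so that the fit should return them.

```matlab
% Sketch: linearizing a power law y = a*x^b by taking logs of both sides,
% ln(y) = ln(a) + b*ln(x), then fitting a straight line in (ln x, ln y).
% a = 2 and b = 1.5 are arbitrary illustration values.
x = 1:10;
y = 2 * x.^1.5;                     % noise-free, so the fit should return a and b

p = polyfit(log(x), log(y), 1);     % p(1) = slope b, p(2) = intercept ln(a)
b = p(1);
a = exp(p(2));
fprintf('a = %.4f, b = %.4f\n', a, b);
```

This version is unweighted. With measured data, the uncertainties on ln(y) would be sigma/y, as for the exponential, and a weighted fit would be the more careful choice.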

Linear Fit versus Non-Linear Fit
To fit data to a linearizable equation (for example, the exponential $y = a e^{bx}$ used above), one can EITHER do a NON-LINEAR fit or a LINEAR fit. Both use the Curve Fitting App, so both are equally ‘easy’ for the user. Why go through the effort to LINEARIZE a nonlinear equation when you can just as easily fit the nonlinear equation?
(1) As we just learned, you (generally) get BETTER fits with the LINEARIZED equation than with the nonlinear equation.
(2) If you do not care about the SPEED or EFFICIENCY of fitting (what’s a few seconds among friends…), then USE THE FORCE LUKE and use the nonlinear fitting. But if SPEED or TIME MATTERS…

Example of WHY speed matters
Sorting of objects on a conveyer belt:
- Detect objects on a conveyer belt using cameras (optical images).
- Process the optical images to determine WHAT TYPE of object each one is.
- Depending on the TYPE of object, switch the object to another conveyer belt.
Clearly, for this type of data analysis, SPEED matters. You have a LIMITED time to make a yes/no sorting decision before the object passes the sorting location.
Check out this YouTube video of the concept as applied to recycling: https://www.youtube.com/watch?v=SIVKmwzWSuc View from 1:12 to 1:50. Pay attention to the OPTICAL SORTER.
If speed matters, LINEARIZE a non-linear fit to increase the speed of computation.