GG 313 Geological Data Analysis Lecture 13 Solution of Simultaneous Equations October 4, 2005.



Homework discussion. People are having trouble with the null hypothesis and with what the outcome of a hypothesis test means. Rejecting the null hypothesis is the positive result of our tests. We need to understand what this means, and the easiest way is by example.

Consider the Mann-Whitney test and example 2-7 (page 45). Comparing grain sizes from two different locations on the Moon, we want to see whether the mean grain sizes of the two samples differ. Of course they won't be identical, but is the difference statistically significant? Our hypothesis is that the mean grain sizes are different, implying the two samples come from different populations. Our null hypothesis is thus that the mean grain sizes are the same, at some statistical confidence (95%). In this test we combine the two samples and rank each element of the pooled sample. If the two samples were identical, they would have the same rank sums W1 and W2 (eqn. 2.33), and U1 and U2 would equal zero.

U = 0 is certainly less than the critical value (Uα) obtained from the table. As U increases, the two means get farther apart. We cannot reject our null hypothesis if U is smaller than Uα. In the notes for this example, U = 24 and Uα = 20, so we can conclude with 95% confidence that the two means are not the same. IN THE NOTES, Paul says "This (U = 24) is larger than the critical value of 20, suggesting we cannot reject the null hypothesis." He's wrong. Don't believe everything you read. Another example: Homework 5, problem 1. It states: "… do these data support the claim that on average higher concentrations were obtained before cleaning versus after cleaning?" What is the hypothesis?
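The rank-sum bookkeeping behind U can be sketched in a few lines. This is an illustration, not the notes' code: the function and variable names are mine, it uses the common convention U = W − n(n+1)/2 with no tie correction, and the sample values are made up (not the lunar grain data).

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2) for two samples (common convention, no tie correction)."""
    # Pool the two samples, tagging each value with which sample it came from
    combined = [(v, 0) for v in sample1] + [(v, 1) for v in sample2]
    combined.sort()                       # rank the pooled data, smallest first
    w = [0.0, 0.0]                        # rank sums W1, W2 (ranks start at 1)
    for rank, (value, which) in enumerate(combined, start=1):
        w[which] += rank
    n1, n2 = len(sample1), len(sample2)
    u1 = w[0] - n1 * (n1 + 1) / 2         # U1 = W1 - n1(n1+1)/2
    u2 = w[1] - n2 * (n2 + 1) / 2
    return u1, u2

# Completely separated samples: one U drops to 0, the other rises to n1*n2.
print(mann_whitney_u([1, 2, 3], [10, 20, 30]))   # (0.0, 9.0)
```

Overlapping samples push both U values toward the middle (U1 + U2 always equals n1*n2 in this convention), which is the sense in which a large U signals well-separated samples.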

What is the null hypothesis? These data are in PAIRS. We are trying to figure out whether the "before" number is statistically larger than the "after" number. Using the "sign" test, what do you do first? Subtract the second number of each pair from the first: if the first is larger, the result is "+"; if the first is smaller, the result is "−". How many "+" are there? If there were as many + as −, what would that say about the null hypothesis? If the number of + is much larger than the number of −, what would that say?

You can calculate the probability of getting n "+" results using the binomial coefficients, but that's a lot of work. Since, for this problem, np > 5 and n(1 − p) > 5, you can use z-statistics (eqn. 2.32). If the z-value from your data is > 2, what does that mean? These tests are not difficult, but you do need to think the logic through. You cannot expect to blindly apply the notes and formulas and come up with the correct answer. Since blind guessing gives you a 50% chance of getting the answer right, the method is everything.
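The normal approximation to the sign-test count can be sketched as follows. This is the textbook z = (x − np)/sqrt(np(1 − p)); the notes' eqn. 2.32 may include a continuity correction, and the counts used below are hypothetical, not the homework data.

```python
import math

def sign_test_z(n_plus, n, p=0.5):
    """Normal approximation to the binomial count of '+' signs.

    Valid when n*p > 5 and n*(1-p) > 5. Under the null hypothesis
    (before and after equally likely to be larger), p = 0.5.
    """
    return (n_plus - n * p) / math.sqrt(n * p * (1 - p))

# e.g. 18 '+' signs out of 22 pairs (made-up counts):
z = sign_test_z(18, 22)
print(round(z, 2))   # about 2.98, well beyond 2, so reject the null
```

A z-value near 0 means the + and − counts are consistent with chance; a z-value above about 2 rejects the null hypothesis at roughly the 95% level.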

Linear algebra provides us with an easy method for solving systems of simultaneous equations. Consider a set of four equations in four unknowns x1, x2, x3, and x4. In matrix form, such a set is just A x = b. (3.77)

Here A is the matrix of coefficients, x the column vector of unknowns, and b the column vector of right-hand-side constants. The solution is obtained by multiplying both sides of eqn. 3.77 by A⁻¹:

A⁻¹ A x = A⁻¹ b    (3.82)
x = A⁻¹ b    (3.83)

We have thus solved for x. EXAMPLE: Consider a simple example. We have the three planes defined below. Three planes in general position intersect at a single point; at what point do these cross each other?

x − y − 2z = 2
x + y + 2z = 10
−2x − 2y − z = 3

Setting up the matrix:

A = [  1  -1  -2 ]      B = [  2 ]
    [  1   1   2 ]          [ 10 ]
    [ -2  -2  -1 ]          [  3 ]

In Matlab:

>> A = [1 -1 -2; 1 1 2; -2 -2 -1];
>> B = [2; 10; 3];
>> Ainv = inv(A)
Ainv =
    0.5000    0.5000         0
   -0.5000   -0.8333   -0.6667
         0    0.6667    0.3333

Multiplying by B,

>> X = Ainv*B
X =
    6.0000    = x
  -11.3333    = y
    7.6667    = z

Thus the planes cross at the point (6, −34/3, 23/3).

I think this is an easy way to solve such sets of equations, particularly as the number of equations and unknowns increases. Other methods may be computationally more efficient, but this one is easy to set up and solve. _______________ Try an example where you know the answer: take the x-z plane at y = 2 (0x + 1y + 0z = 2), the x-y plane at z = 0 (0x + 0y + 1z = 0), and the y-z plane at x = −2 (1x + 0y + 0z = −2). Where do these three planes cross? Solve graphically and using the matrix solution.
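The same kind of solution can be computed without Matlab. Here is a hedged, self-contained sketch in Python; the function name solve3 and the use of exact fractions are my choices, not from the lecture, and the elimination is the plain Gauss-Jordan procedure rather than anything numerically sophisticated.

```python
from fractions import Fraction

def solve3(A, b):
    """Solve a 3x3 linear system A x = b by Gauss-Jordan elimination,
    using exact fractions so the answer has no rounding error."""
    # Build the augmented matrix [A | b]
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        # Pivot: swap in a row with a nonzero entry in this column
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

A = [[1, -1, -2], [1, 1, 2], [-2, -2, -1]]   # coefficients of x, y, z
b = [2, 10, 3]
print([str(v) for v in solve3(A, b)])        # ['6', '-34/3', '23/3']
```

The same function can be used to check the three-axis-planes exercise above against your graphical answer.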

Solutions of this sort are relatively simple: we have no options, there is only one correct answer and no freedom to choose between possible answers. A more interesting case is when we have more data than we need to fit a model. For example, as we've said before, two points define a line. But what if we want to define a line with three points? Points A and B define a line, but added data (point C) sheds doubt on our original interpretation.

We would like to define a new line that is somehow the "best" line given the data we have. In addition, we would like to know how likely it is that the new line, which is an estimate based on a sample of all points in the population, reflects the population. We must be careful! We are assuming that our MODEL (a straight line) reflects the shape of reality; a "best fit" does not validate the model. A good bet for finding the "best" fit to a model curve is to minimize the square of the error of each point with respect to the curve. Measuring those errors vertically works well, but what if the line is nearly vertical?

The figure above shows the errors in the y-value (regression of y on x). Similarly, we could find the errors in x: Do these two methods yield the same answers? Consider the case below: In this case the errors in y are far smaller than the errors in x, and utilizing errors in y will likely yield a better result. If the curve had a steep slope, the opposite would be the case.

In any case, we can use a method that does not vary with the slope of the curve by measuring perpendiculars to the curve at each point. This method, called orthogonal regression, is most useful when the slope of the line is unknown and could be in any direction.
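The quantity orthogonal regression minimizes is the perpendicular distance from each point to the line, which for a line y = a1 + a2 x has a simple closed form. A minimal sketch (the function name and example point are mine, not from the lecture):

```python
import math

def perpendicular_distance(x, y, a1, a2):
    """Perpendicular distance from the point (x, y) to the line
    y = a1 + a2*x. Orthogonal regression minimizes the sum of the
    squares of these distances, instead of the vertical misfits."""
    return abs(y - a1 - a2 * x) / math.sqrt(1 + a2 ** 2)

# Point (0, 5) and the line y = x (a1 = 0, a2 = 1):
print(perpendicular_distance(0, 5, 0, 1))   # 5/sqrt(2), about 3.54
```

Note the 1/sqrt(1 + a2²) factor: for a nearly horizontal line it approaches 1 (perpendicular misfit is almost the vertical misfit), while for a steep line it shrinks the vertical misfit toward the horizontal one, which is why this measure does not favor either axis.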

We wish to find a line of the form y(x) = a1 + a2(x − x0). (3.89) Why does Paul use x0? Note that y(x) = a1 + a2 x − a2 x0 = (a1 − a2 x0) + a2 x = a3 + a2 x, so the x0 term only shifts the intercept. So why bother with x0? Let's ignore it. We have two unknowns, a1 and a2, and we want to find the values of these unknowns that minimize the sum of the squared errors Σ [yi − y(xi)]². (3.90)

We have one equation for each data point:

yi = a1 + a2 xi,   i = 1, …, n    (3.91)

We only need two equations to solve for a1 and a2, but we have n equations; this situation is called over-determined. These equations can only be satisfied exactly if n = 2 or if all the points lie exactly on a straight line.

We can write the equations in matrix notation, A x = b:

[ 1  x1 ]              [ y1 ]
[ 1  x2 ]   [ a1 ]     [ y2 ]
[  ...  ]   [ a2 ]  =  [ ...]
[ 1  xn ]              [ yn ]    (3.92)

Why can't we just invert as we did before and solve for x = A⁻¹b? Unfortunately, this isn't possible: A isn't square, so it has no inverse. But all is not lost… Consider the equations:

ei = yi − (a1 + a2 xi)    (3.93)

ei is the error: the observed y minus the theoretical value. We want to obtain the values of a1 and a2 that minimize the sum of the squares of the ei:

E = Σ ei²    (3.94)

Recall the definition of variance: minimizing E minimizes the variance of the errors. Recall also that at a minimum of E, the slope of E must equal zero.

E is a function of two variables, and the slope must be zero with respect to each:

∂E/∂a1 = 0,   ∂E/∂a2 = 0    (3.95)

Evaluating:

∂E/∂a1 = −2 Σ (yi − a1 − a2 xi) = 0    (3.96)
∂E/∂a2 = −2 Σ xi (yi − a1 − a2 xi) = 0    (3.97)

With the results from the two equations (3.96 and 3.97) we have two equations in two unknowns (a1 and a2), and we can solve. Re-arranging:

n a1 + a2 Σ xi = Σ yi
a1 Σ xi + a2 Σ xi² = Σ xi yi

Note that the unknowns are a1 and a2; the x and y values are known, so the summations in the equations above are all over known constants. We form the sums as follows (the notation is poor here; S stands for sum, not covariance):

Sx = Σ xi,   Sy = Σ yi    (3.98)
Sxx = Σ xi²    (3.99)
Sxy = Σ xi yi    (3.100)

Substituting these sums:

n a1 + Sx a2 = Sy
Sx a1 + Sxx a2 = Sxy

Solving the first equation for the y-intercept a1:

a1 = (Sy − a2 Sx) / n

Substituting this into the second equation and solving for a2 yields:

a2 = (n Sxy − Sx Sy) / (n Sxx − Sx²)    (3.107)

In matrix notation:

[ n    Sx  ] [ a1 ]     [ Sy  ]
[ Sx   Sxx ] [ a2 ]  =  [ Sxy ]    (3.109)

This matrix equation is of the form N x = B, which can be solved for x by x = N⁻¹B.
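The whole recipe, sums through the 2×2 solve, can be sketched in a few lines. This is Python rather than Matlab, the function name is mine, and the 2×2 system is solved by direct elimination rather than an explicit matrix inverse:

```python
def least_squares_line(xs, ys):
    """Fit y = a1 + a2*x by solving the 2x2 normal equations
        [ n   Sx  ] [a1]   [ Sy  ]
        [ Sx  Sxx ] [a2] = [ Sxy ]
    where S denotes a sum over the data points."""
    n = len(xs)
    sx = sum(xs)                              # Sx
    sy = sum(ys)                              # Sy
    sxx = sum(x * x for x in xs)              # Sxx
    sxy = sum(x * y for x, y in zip(xs, ys))  # Sxy
    det = n * sxx - sx * sx                   # determinant of N
    a2 = (n * sxy - sx * sy) / det            # slope
    a1 = (sy - a2 * sx) / n                   # intercept
    return a1, a2

# Points lying exactly on y = 1 + 2x recover the line exactly:
print(least_squares_line([0, 1, 2], [1, 3, 5]))   # (1.0, 2.0)
```

A zero determinant (all xs identical) means the line's slope is undefined, which is the two-unknown analogue of a singular matrix A in the square case.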

In-class problem: Use the following data and calculate the least-squares fit to a line using the matrix equation above. Data points: x, y.