Logical Line Fitting: One Step in the EDA Process by Shannon Guerrero Northern Arizona University NCTM 2008 Annual Meeting & Exposition Salt Lake City,


Logical Line Fitting: One Step in the EDA Process by Shannon Guerrero Northern Arizona University NCTM 2008 Annual Meeting & Exposition Salt Lake City, UT April 2008

EDA (Exploratory Data Analysis)
• Mostly graphical approach to data analysis
• Emphasizes uncovering the underlying structure of the data, extracting important variables, detecting outliers/anomalies, testing underlying assumptions, and maximizing insight into the data set
• Graph the data, graph the data, graph the data
• Focus on sense-making rather than theory

Why curve fitting?
• Applications in data analysis & algebra
• “Analyses of the relationships between two sets of measurement data are central in high school mathematics” (NCTM PSSM, p. 328)
• Modeling, prediction, symbolic representation, correlation, regression, residuals

“Line of Best Fit”
• Explains the relationship between two variables with a straight line that “best fits” the data
• Line may pass through some, none, or all of the points
• Used to predict future values from existing values (interpolate vs. extrapolate)
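The interpolate/extrapolate distinction can be sketched in a few lines. This is only an illustration: the function name `predict`, the line coefficients, and the observed x-values are all hypothetical, not taken from the presentation.

```python
def predict(a, b, x, observed_xs):
    """Predict y from the line y = a + b*x and say how the prediction was made.

    A prediction is an interpolation when x lies inside the range of the
    observed x-values, and an extrapolation when it lies outside that range.
    """
    y_hat = a + b * x
    inside = min(observed_xs) <= x <= max(observed_xs)
    kind = "interpolation" if inside else "extrapolation"
    return y_hat, kind


# Hypothetical fitted line and observed x-values (assumptions for illustration).
a, b = 2.2, 0.6
observed_xs = [1, 2, 3, 4, 5]

inside_prediction = predict(a, b, 3.5, observed_xs)
outside_prediction = predict(a, b, 10, observed_xs)
print(inside_prediction[1])   # interpolation
print(outside_prediction[1])  # extrapolation
```

Predictions flagged as extrapolation deserve extra caution, since nothing in the data says the linear pattern continues outside the observed range.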

Outliers
• An observation that lies outside the overall pattern of a distribution
• For one variable, a convenient definition is a point that falls more than 1.5 times the IQR above the 3rd quartile or below the 1st quartile
• Examine outliers carefully and understand why they appear in your data set
• Need to decide what to do with outliers: include or discard?
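The 1.5 × IQR rule can be sketched as follows. The data values and the function name `iqr_outliers` are hypothetical, and the quartile method is an assumption: textbooks and calculators use slightly different quartile conventions, so the fences can differ a little from tool to tool.

```python
import statistics


def iqr_outliers(data):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    # Quartile convention is an assumption; "inclusive" treats the data
    # as the whole population when interpolating.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical one-variable data set with one suspicious value.
data = [2, 4, 5, 5, 6, 7, 8, 9, 30]
outliers = iqr_outliers(data)
print(outliers)  # → [30]
```

The rule only flags candidates. Whether to include or discard a flagged point is still a judgment call about the data.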

Curve Fitting vs. Regression
• Power of curve fitting is often lost when we jump straight to regression calculations
• Curve fitting is more general and gives an approximation
• Equation found (using either method) can help uncover the underlying structure of the data, predict future values from past ones, model causal relationships, and maximize insight into a data set

Linear Regression
• Statistical approach to finding the relationship between two variables
• Least squares regression minimizes the sum of the squared residuals (residual: the difference between an observed value and the value given by the model)
• Assumption: for a fixed value of x, the value of y is normally distributed, with equal variance across x
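As a minimal sketch of what "minimizing the sum of squared residuals" amounts to for a line ŷ = a + bx, the standard closed-form solution is b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. The function name `least_squares_line` and the data below are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


# Hypothetical paired data (assumption for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(round(a, 3), round(b, 3))
```

Any other choice of intercept and slope gives a larger sum of squared residuals on the same data, which is the sense in which this line is "best".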

r² and residuals
• Residual: the difference between an observed value and the value predicted by the regression line
• A residual plot is a scatterplot of the regression residuals against the explanatory variable
• Helps us assess the fit of the regression line
• r² is another way to assess how well the line fits the data (the closer to 1, the better the fit)
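A short sketch of both diagnostics, assuming a least squares line ŷ = a + bx and using r² = 1 − SSE/SST, where SSE is the sum of squared residuals and SST is the total squared variation of y about its mean. The function name `fit_diagnostics` and the data are hypothetical.

```python
def fit_diagnostics(xs, ys):
    """Fit y_hat = a + b*x by least squares; return (residuals, r_squared)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    # Residual = observed y minus the y predicted by the line.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)      # variation left unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)   # total variation in y
    return residuals, 1 - sse / sst


# Hypothetical paired data (assumption for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
residuals, r_squared = fit_diagnostics(xs, ys)
print([round(e, 2) for e in residuals])
print(round(r_squared, 3))
```

Plotting `residuals` against `xs` gives the residual plot. A patternless scatter around zero suggests a line is a reasonable model, while a curve or fan shape suggests it is not, even when r² is high.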