Math 15 Introduction to Scientific Data Analysis Lecture 5 Association Statistics & Regression Analysis University of California, Merced.

Presentation transcript:

Math 15 Introduction to Scientific Data Analysis Lecture 5 Association Statistics & Regression Analysis University of California, Merced

Course Lecture Schedule

Week   Date          Concepts                                             Project Due
1
2      January 28    Introduction to the data analysis
3      February 4    Excel #1 – General Techniques
4      February 11   Excel #2 – Plotting Graphs/Charts                    Quiz #1
5      February 18   Holiday
6      February 25   Excel #3 – Statistical Analysis                      Quiz #2
7      March 3       Excel #4 – Regression Analysis
8      March 10      Excel #5 – Interactive Programming                   Quiz #3
9      March 17      Introduction to Computer Programming – Part I
       March 24      Spring Recess
10     March 31      Introduction to Computer Programming – Part II       Project #1
11     April 7       Programming – #1                                     Quiz #4
12     April 14      Programming – #2
13     April 21      Programming – #3                                     Quiz #5
14     April 28      Programming – #4
15     May 5         Programming – #5                                     Quiz #6
16     May 12        Movies / Evaluations                                 Project #2
Final  May ???       Final Examination

Quiz Next Week!

UC Merced3 Project #1 – Due March 31st, 2008
• Projects can be done individually or in teams of up to three, with the following rules:
  - A team turns in one project report and all members get the same grade.
  - A team consists of at most 3 people. No copying between teams!
  - A team project report must include a title page on which the team describes each member's contribution.
  - 10% bonus for projects done individually.
  - Individual projects must not be copied from anyone else.
  - No late projects will be accepted!
Project #1 will be posted at UCMCROP by next Monday!

UC Merced4 Review: Measures of dispersion or variability
• Variance or standard deviation
The one on the left is more dispersed than the one on the right. It has a higher variance or standard deviation.
[Figure labels: Average, Mode]

UC Merced5 Which is the more precise measurement?
• Although the standard deviation is a good measure of the precision of a given set of data, it can be difficult to compare the standard deviations of two different types of measurements directly.
• You might need to make such a comparison to determine the largest source of uncertainty in an experimentally determined answer.
Measurement in mg: average = 446, σ (standard deviation) = 23
Measurement in ml: average = 35.49, σ = 4.5

UC Merced6 Get the Right Tool for the Job!

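UC Merced7 Measures of dispersion or variability
• One way to make this comparison is the relative standard deviation (RSD): the ratio of the standard deviation to the mean, expressed here as a percentage.
Measurement in mg: average = 446, σ = 23, RSD = 100 × (23/446) = 5.2%
Measurement in ml: average = 35.49, σ = 4.5, RSD = 100 × (4.5/35.49) = 12.7%
Relative to its mean, the measurement with the smaller RSD is the more precise one.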
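The two RSD values on this slide can be checked with a few lines of Python. This is a minimal sketch and not part of the original slides; the function name `relative_std_dev` is made up for illustration, and the numbers are the standard deviations and means quoted on the slide.

```python
def relative_std_dev(std_dev, mean):
    """Return the relative standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * std_dev / abs(mean)


# Values quoted on the slide: (standard deviation, mean)
rsd_first = relative_std_dev(23, 446)
rsd_second = relative_std_dev(4.5, 35.49)

print(round(rsd_first, 1))
print(round(rsd_second, 1))
```

Because the RSD is unitless, the two results can be compared directly even though the underlying measurements use different units.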

UC Merced8 Any Questions?

UC Merced9 Common Practice for Data Analysis
• A common task in data analysis is to investigate an association between two variables.
  - To see if two variables vary together: Correlation
  - To see how one variable affects another: Regression

UC Merced10 Correlation
• A correlation tells us whether two variables vary together, i.e. as one goes up the other goes up (or goes down).
Correlation Coefficient (Pearson product-moment correlation coefficient, or Pearson's r)

UC Merced11 Correlation Coefficient
• Varies from +1 (perfect positive correlation) through 0 (no correlation) to -1 (perfect negative correlation).

UC Merced12 Correlation Coefficient – cont.
• Always draw a diagram to check that:
  - There are no OUTLIERS. If there are outliers, the following may not apply.
  - The relation is not curved (r only refers to LINEAR correlation).

r (approx.)     Strength of tendency   What with what
0.9 to 1        strong                 high y with high x and low y with low x
0.7 to 0.9      some                   high y with high x and low y with low x
0.3 to 0.7      little                 high y with high x and low y with low x
-0.3 to 0.3     none                   neither high nor low y with high or low x
-0.3 to -0.7    little                 low y with high x and high y with low x
-0.7 to -0.9    some                   low y with high x and high y with low x
-0.9 to -1      strong                 low y with high x and high y with low x
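The rule-of-thumb bands from this slide can be written as a small lookup. This is only a sketch: the function name `describe_correlation` is hypothetical, and because the slide's ranges share their endpoints, the code assumes that a boundary value (for example exactly 0.9) belongs to the stronger band.

```python
def describe_correlation(r):
    """Map a correlation coefficient to the slide's rough strength labels."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    size = abs(r)
    # Assumption: boundary values fall into the stronger band.
    if size >= 0.9:
        strength = "strong"
    elif size >= 0.7:
        strength = "some"
    elif size >= 0.3:
        strength = "little"
    else:
        return "none"

    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_correlation(0.927))
print(describe_correlation(-0.5))
print(describe_correlation(0.1))
```

The labels are only meaningful after the diagram check described on the slide, since outliers or a curved relation can make r misleading.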

UC Merced13 Excel Function – Correlation Coefficient
• =CORREL(array1,array2)
or
• =PEARSON(array1,array2)
Example of a positive correlation: lengths of a leg bone (in cm) in penguin mating pairs.
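For readers who want to see what these worksheet functions compute, here is a minimal Python sketch of Pearson's r. The function name `pearson_r` and the two short lists are assumptions made for illustration; the numbers are invented and are not the penguin data from the slide.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of products of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)

    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired measurements (NOT the data from the slide)
first = [17.1, 18.5, 19.7, 17.4, 18.3, 18.6]
second = [17.5, 18.7, 19.3, 17.2, 18.6, 18.9]

r = pearson_r(first, second)
print(round(r, 3))
```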

UC Merced14 Ice cream sales vs. number of people who drown at sea. Correlation coefficient: 0.927

UC Merced15 Wait! What kind of conclusion can we draw from a correlation?

UC Merced16 Examples
• Ice cream sales correlate with the number of people who drown at sea. Therefore, ice cream causes people to drown.
• Since the 1950s, both the atmospheric CO2 level and crime levels have increased sharply. Hence, atmospheric CO2 causes crime.
Not Good Ones!

UC Merced17 Ice cream sales vs. number of people who drown at sea. Correlation coefficient: 0.927

UC Merced18 Correlation does not imply causation
• No conclusion about the existence or the direction of a cause-and-effect relationship can be drawn solely from the fact that A is correlated with B. The correlation coefficient only tells you whether the two variables vary together.
• Determining whether there is an actual cause-and-effect relationship requires further investigation, even when the relationship between A and B is statistically significant, a large effect size is observed, or a large part of the variance is explained.
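One way such a misleading correlation can arise is a third variable that drives both quantities. The sketch below illustrates this with entirely synthetic numbers: a hypothetical temperature series, and invented ice cream sales and drowning counts that each follow temperature with some scatter. None of these values come from the slides, and `pearson_r` is a helper written for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Synthetic data: both series were made up to follow temperature.
temperature = [15, 18, 21, 24, 27, 30, 33]
ice_cream_sales = [155, 177, 214, 234, 272, 296, 333]
drownings = [6, 5, 7, 9, 8, 10, 12]

# The two outcomes correlate strongly with each other...
r_sales_vs_drownings = pearson_r(ice_cream_sales, drownings)
# ...because each of them correlates with the common driver.
r_temp_vs_sales = pearson_r(temperature, ice_cream_sales)
r_temp_vs_drownings = pearson_r(temperature, drownings)

print(round(r_sales_vs_drownings, 2))
print(round(r_temp_vs_sales, 2))
print(round(r_temp_vs_drownings, 2))
```

In this constructed example neither outcome influences the other, yet their correlation coefficient is high, which is why a coefficient alone cannot settle a question of cause and effect.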

UC Merced19 Any Questions?

UC Merced20 Regression
• Regression is used when we have some reason to believe that changes in one variable cause changes in the other. A correlation coefficient is not evidence of a causal relationship.
• The simplest kind of causal relationship is a straight-line (or linear) relationship: linear regression.

UC Merced21 Linear regression
• Linear regression assumes a linear relationship between two variables: the dependent factor, y, and the independent factor, x.
• Mathematically, this relationship is described by the linear equation y = ax + b, where a is called the slope and b is called the intercept. The equation lets you calculate y (dependent) from x (independent); the values of a and b are obtained by the least squares method.
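As a rough illustration of the least squares method, the sketch below computes the slope and intercept from the standard closed-form expressions (slope = sum of products of deviations divided by sum of squared x deviations; intercept = mean of y minus slope times mean of x). The function name `fit_line` is hypothetical, and the data points were chosen to lie exactly on the line y = 3x + 8 so the result is easy to verify by eye.

```python
def fit_line(xs, ys):
    """Least squares slope a and intercept b for the line y = a*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points lying exactly on y = 3x + 8
x_values = [1, 2, 3, 4]
y_values = [11, 14, 17, 20]

a, b = fit_line(x_values, y_values)
print(a, b)
```

With real, scattered data the fitted line will not pass through every point; it is the line that minimizes the sum of squared vertical distances to the points.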

UC Merced22 Review – Math
• Linear equation, slope and intercept: y = 3x + 8 (slope 3, intercept 8)

UC Merced23 Slope & Intercept formula
[Figure: worksheet formula with its Y-values and X-values arguments labeled. Data: lengths of a leg bone (in cm) in penguin mating pairs.]

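UC Merced24 y = ax + b
• a is the slope and b is the intercept.
Predicted Y-values are calculated from the X-values (column B) with a formula of the form =$C$10*B3+$C$
Don't forget the $ signs! They keep the references to the slope and intercept cells fixed when the formula is copied down the column.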
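The same calculation can be sketched outside a spreadsheet. In the snippet below the slope and intercept play the role of the fixed (absolute) cell references, while each x value plays the role of the relative reference that changes from row to row. The numbers and the name `predict` are assumptions for illustration only.

```python
def predict(x_values, slope, intercept):
    """Apply y = slope * x + intercept to every x value."""
    return [slope * x + intercept for x in x_values]


# Hypothetical coefficients and inputs (not taken from the worksheet on the slide)
slope = 3
intercept = 8
x_values = [1, 2, 3]

predicted_y = predict(x_values, slope, intercept)
print(predicted_y)
```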

UC Merced25 Plot a linear regression (or trend) line – Part 1 You can add a linear regression line

UC Merced26 Plot a linear regression (or trend) line – Part 2
• Right-click on any data point on the graph.
• Choose Add Trendline.
• Click on the Options tab, and select Display equation and Display R-squared.
• Click "OK".
Don't forget to check these two options!

UC Merced27 Plot a linear regression (or trend) line – Part 2 – cont.
• R² value (R-squared value, RSQ): a "measure of scatter".
• The closer this value comes to 1, the more closely the points follow the fitted line, and the more reliable predictions from the line become.
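To make the R² value less of a black box, the sketch below computes it as 1 minus the ratio of the residual sum of squares to the total sum of squares, after fitting a least squares line. This is one common definition and is offered as an illustration, not as a description of how the trendline feature is implemented. The function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the least squares line through the data."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)

    # Scatter left over after the fit vs. total scatter around the mean
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all y values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical, slightly scattered data
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

r2 = r_squared(x_values, y_values)
print(round(r2, 4))
```

For a straight-line fit with one independent variable, this value equals the square of Pearson's r for the same data.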

UC Merced28 Let's review the process!
Data: lengths of a leg bone (in cm) in penguin mating pairs.
• First, see if the two variables vary together.
• If there is some reason to believe that one variable causes changes in the other, then plot a graph and use regression to see how one variable affects the other.

UC Merced29 Any Questions?