MA-250 Probability and Statistics Nazar Khan PUCIT Lecture 5.

Presentation transcript:

MA-250 Probability and Statistics Nazar Khan PUCIT Lecture 5

Measurement Error In an ideal world, if the same thing is measured several times, the same result would be obtained each time. In reality, there are differences. – Each result is thrown off by chance error. Individual measurement = exact value + chance error

Measurement Error No matter how carefully it is made, a measurement could have been different than it is. If repeated, it will be different. But how much different? – Simple answer: Repeat the measurements. Consider the SD

Measurement Error Variability in measurements reflects the variability in the chance errors. Individual measurement = exact value + chance error. The exact value is the same for every measurement, so it adds nothing to the spread: SD(measurements) = SD(chance errors)
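A minimal simulation sketch of this point, assuming normally distributed chance errors and a made-up exact value of 10.0 (neither comes from the slides). Adding the same exact value to every chance error shifts the measurements but leaves their spread unchanged.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

exact_value = 10.0  # hypothetical exact value of the quantity being measured
# Assumption: chance errors are normal with mean 0 and SD 0.05.
chance_errors = [random.gauss(0, 0.05) for _ in range(1000)]

# Individual measurement = exact value + chance error
measurements = [exact_value + e for e in chance_errors]

sd_errors = statistics.pstdev(chance_errors)
sd_measurements = statistics.pstdev(measurements)

print("SD of chance errors:", round(sd_errors, 6))
print("SD of measurements: ", round(sd_measurements, 6))
```

The two printed SDs agree up to floating-point rounding, whatever value is chosen for `exact_value`.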

Measurement Error An outlier can affect the – Mean – Standard Deviation What if the majority of the data follows a normal curve? – The outliers will affect the mean and SD so that the rule of ~68% of the data lying within 1 SD of the mean might not hold. Solution: remove the outliers and then do the normal approximation.

Outliers The interval within 1 SD of the mean covers ~86% of the data, well above the ~68% expected under a normal curve, so the normal approximation cannot be used.

Outliers Removed The interval within 1 SD of the mean now covers ~68% of the data, so the normal approximation can be used.
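The effect of outliers on the 1 SD coverage can be sketched with simulated data. Everything here is an assumption for illustration: the bulk of the data is drawn from a standard normal curve, 100 outliers are placed at 30, and outliers are dropped with a simple "more than 3 SDs from the mean" rule, which is one possible choice and not necessarily the one used in the lecture.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def fraction_within_one_sd(data):
    """Fraction of the data lying within 1 SD of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    inside = [x for x in data if abs(x - mean) <= sd]
    return len(inside) / len(data)

# Assumption: most of the data follows a normal curve (mean 0, SD 1).
bulk = [random.gauss(0, 1) for _ in range(5000)]
# Assumption: a small group of extreme values far from the bulk.
outliers = [30.0] * 100
data = bulk + outliers

# Hypothetical removal rule: drop points more than 3 SDs from the mean.
mean = statistics.fmean(data)
sd = statistics.pstdev(data)
cleaned = [x for x in data if abs(x - mean) <= 3 * sd]

frac_with_outliers = fraction_within_one_sd(data)
frac_without_outliers = fraction_within_one_sd(cleaned)

print("Within 1 SD, outliers included:", round(frac_with_outliers, 3))
print("Within 1 SD, outliers removed: ", round(frac_without_outliers, 3))
```

With the outliers included, the inflated SD makes the 1 SD interval cover far more than 68% of the data. After removal the coverage returns to roughly 68%.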

Bias Chance error changes from measurement to measurement – sometimes positive and sometimes negative. Bias affects all measurements in the same way. Individual measurement = exact value + chance error + bias
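A short simulation sketch of the difference between chance error and bias, assuming normal chance errors and made-up values for the exact value (10.0) and the bias (0.3). Chance errors average out over many measurements, while bias shifts every measurement by the same amount and so does not show up in the SD.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

exact_value = 10.0  # hypothetical exact value
bias = 0.3          # hypothetical bias, the same for every measurement
# Assumption: chance errors are normal with mean 0 and SD 0.05.
chance_errors = [random.gauss(0, 0.05) for _ in range(10000)]

# Individual measurement = exact value + chance error + bias
measurements = [exact_value + e + bias for e in chance_errors]

# Averaging cancels most of the chance error but none of the bias.
mean_offset = statistics.fmean(measurements) - exact_value

sd_errors = statistics.pstdev(chance_errors)
sd_measurements = statistics.pstdev(measurements)

print("Average measurement minus exact value:", round(mean_offset, 4))
print("SD of chance errors:", round(sd_errors, 4))
print("SD of measurements: ", round(sd_measurements, 4))
```

The average stays off by about the bias, and the SD of the measurements matches the SD of the chance errors, so repeating the measurement cannot reveal the bias on its own.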


Dealing with bi-variate data So far, we have dealt with uni-variate data – One variable only – Age, Height, Income, Family Size, etc. How can we study relationships between 2 variables? – Relationship between height of father and height of son – Relationship between income and education Answer: scatter diagrams

Can we summarize the scatter diagram?

Summarizing a Scatter Diagram A scatter diagram can be summarized by its – Mean – Horizontal SD – Vertical SD But these statistics do not measure the strength of the association between the 2 variables. How can we summarize the strength of association? The two figures have the same mean and the same horizontal and vertical SDs, but the left figure shows more association between the 2 variables.

Correlation Correlation measures the strength of association between 2 variables – As one increases, what happens to the other? Denoted by r. r = average(x in standard units × y in standard units). Example: average = 0.4

How does r measure association strength? r=average(x in standard units* y in standard units) When both x and y are simultaneously above or below their means, their product in standard units is +ve. When +ve products dominate, the average of products is +ve (i.e., correlation r is +ve). Similarly for –ve products.
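The formula can be sketched directly in code. The five data pairs below are hypothetical, chosen so that the averages and SDs are round numbers. The SD used is the population SD (divide by n), which is an assumption about the convention in this course.

```python
import statistics

def standard_units(values):
    """Convert values to standard units: (value - mean) / SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (divide by n)
    return [(v - mean) / sd for v in values]

def correlation(xs, ys):
    """r = average of (x in standard units * y in standard units)."""
    zx = standard_units(xs)
    zy = standard_units(ys)
    products = [a * b for a, b in zip(zx, zy)]
    return statistics.fmean(products)

# Hypothetical data: mean(x) = 4, SD(x) = 2, mean(y) = 7, SD(y) = 4.
xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]

products = [a * b for a, b in zip(standard_units(xs), standard_units(ys))]
r = correlation(xs, ys)

print("Products in standard units:", products)
print("r =", round(r, 4))
```

Three of the pairs lie on the same side of both means (or at a mean) and give products of 0.75, 0 and 2.25. The other two give -0.25 and -0.75. The positive products dominate, so the average, r = 0.4, is positive.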

Correlation r is always between -1 and 1. r=0 implies no linear association between x and y. |r|=1 implies perfect linear association. – r=1 implies perfectly linear, positive association. – r=-1 implies perfectly linear, negative association.
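The two extreme cases can be checked with a small sketch. The lines y = 2x + 3 and y = -2x + 3 are arbitrary examples and not taken from the slides.

```python
import statistics

def correlation(xs, ys):
    """r = average of (x in standard units * y in standard units)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return statistics.fmean(products)

xs = [1, 2, 3, 4, 5, 6]

# Hypothetical perfectly linear data with positive and negative slope.
ys_up = [2 * x + 3 for x in xs]
ys_down = [-2 * x + 3 for x in xs]

r_up = correlation(xs, ys_up)
r_down = correlation(xs, ys_down)

print("r for y = 2x + 3: ", round(r_up, 6))
print("r for y = -2x + 3:", round(r_down, 6))
```

Any other nonzero slope and intercept give the same two values, because converting to standard units removes both the scale and the offset.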

Very hard to predict y from x

Easy to predict y from x

Negative association between x and y

Some Properties of the Correlation Coefficient r has no units. (Why?) – The correlation between June temperatures for Lahore and Karachi will be the same in Celsius and Fahrenheit. r(x,y)=r(y,x) (Why?)
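Both properties can be checked numerically. The temperatures below are invented for illustration and are not real June readings for Lahore or Karachi.

```python
import statistics

def correlation(xs, ys):
    """r = average of (x in standard units * y in standard units)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return statistics.fmean(products)

def to_fahrenheit(celsius):
    return [c * 9 / 5 + 32 for c in celsius]

# Hypothetical daily temperatures in Celsius (made-up values).
lahore_c = [40.1, 38.5, 41.2, 39.0, 42.3, 37.8]
karachi_c = [34.0, 33.1, 35.2, 33.5, 35.9, 32.8]

r_celsius = correlation(lahore_c, karachi_c)
r_fahrenheit = correlation(to_fahrenheit(lahore_c), to_fahrenheit(karachi_c))
r_swapped = correlation(karachi_c, lahore_c)

print("r in Celsius:   ", round(r_celsius, 6))
print("r in Fahrenheit:", round(r_fahrenheit, 6))
print("r with x and y swapped:", round(r_swapped, 6))
```

Standard units divide out the unit of measurement, so a change of scale leaves r unchanged. The product of two standard units does not depend on the order of the factors, so swapping x and y leaves r unchanged as well.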

Exceptions! Outliers: the data show a strong linear association without the outlier, but the outlier brings r down to almost 0. Non-linear association: r measures linear association only, not all kinds of association.
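Both exceptions can be reproduced with small hypothetical data sets: ten points lying exactly on a line plus one made-up outlier, and seven points lying exactly on the parabola y = x².

```python
import statistics

def correlation(xs, ys):
    """r = average of (x in standard units * y in standard units)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return statistics.fmean(products)

# Case 1: perfectly linear data, then the same data with one outlier.
xs = list(range(1, 11))
ys = list(range(1, 11))
r_without_outlier = correlation(xs, ys)
r_with_outlier = correlation(xs + [10], ys + [-10])  # hypothetical outlier (10, -10)

# Case 2: y is completely determined by x, but not linearly.
px = [-3, -2, -1, 0, 1, 2, 3]
py = [x ** 2 for x in px]
r_parabola = correlation(px, py)

print("r without outlier:", round(r_without_outlier, 3))
print("r with outlier:   ", round(r_with_outlier, 3))
print("r for y = x^2:    ", round(r_parabola, 3))
```

A single point drags r from 1 down to about 0.11. For the parabola, r is 0 even though y can be predicted exactly from x.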

Association is not Causation! Correlation measures association but association is not causation. – In kids, shoe-size and reading skills have a strong positive linear association. Does a larger foot improve your reading skills?
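One plausible explanation for the shoe-size example is that age drives both variables. The sketch below simulates that situation. The model and every number in it are assumptions made for illustration: shoe size and reading score each depend on age plus independent noise, with no direct link between them.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

def correlation(xs, ys):
    """r = average of (x in standard units * y in standard units)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return statistics.fmean(products)

# Hypothetical model: both variables depend on age and on nothing else.
ages = [random.uniform(5, 12) for _ in range(5000)]
shoe = [a + random.gauss(0, 1) for a in ages]            # made-up shoe-size scale
reading = [10 * a + random.gauss(0, 10) for a in ages]   # made-up reading score

r_all = correlation(shoe, reading)

# Look only at kids of nearly the same age (8 to 8.5 years).
same_age = [i for i, a in enumerate(ages) if 8.0 <= a < 8.5]
r_same_age = correlation([shoe[i] for i in same_age],
                         [reading[i] for i in same_age])

print("r over all kids:        ", round(r_all, 2))
print("r among kids aged 8-8.5:", round(r_same_age, 2))
```

Across all ages the correlation is strongly positive, yet among kids of nearly the same age it is close to 0. In this simulated world, a larger foot does nothing for reading skill.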

Summary Measurement Errors – Chance Error – Bias SD(chance errors) = SD(measurements) – Lets us determine whether an error is by chance or not. Correlation measures the strength of linear association between 2 variables. – Between -1 and 1 Not useful for summarizing scatter diagrams with – Outliers, or – Non-linear association. Association is not causation.