EE, NCKU Tien-Hao Chang (Darby Chang)


Numerical Analysis EE, NCKU Tien-Hao Chang (Darby Chang)

Correlation coefficient. What we need is a single summary number that answers the following questions: Does a relationship exist? If so, is it a positive or a negative relationship? And is it a strong or a weak relationship? Correlation coefficient: a single summary number that gives you a good idea of how closely one variable is related to another variable.

Correlation coefficient: two-way scatter plot. Suppose that we are interested in a pair of continuous random variables, for example the relationship between the percentage of children who have been immunized against the infectious diseases diphtheria, pertussis, and tetanus (DPT) and the mortality rate. Data for a random sample of 20 countries are shown in the next slide. X: the percentage of children immunized by age one year. Y: the under-five mortality rate. Before we do any analysis, we should create a two-way scatter plot of the data. Does a relationship exist between x and y? The mortality rate tends to decrease as the percentage of children immunized increases.

Pearson’s CC. In the underlying population from which the sample of points (xi, yi) is selected, the population correlation between the variables X and Y is denoted ρ. It quantifies the strength of the linear relationship between the outcomes x and y. The estimator of ρ, denoted r, is known as Pearson’s coefficient of correlation or simply the correlation coefficient.
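The formula on this slide did not survive in the transcript. As a minimal sketch, assuming the standard sample formula r = Σ(xi − x̄)(yi − ȳ) / sqrt(Σ(xi − x̄)² · Σ(yi − ȳ)²), the estimator can be computed as follows. The function name `pearson_r` and the data are illustrative, not taken from the slides.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: an exact increasing line and an exact decreasing line.
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])
r_down = pearson_r(x, [10, 8, 6, 4, 2])
print(r_up, r_down)
```

Because both numerator and denominator carry the units of x times the units of y, the ratio is dimensionless, and an exact linear relationship gives r = 1 or r = -1 depending on the sign of the slope.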

The correlation coefficient is a dimensionless number; it has no units of measurement. The values r = 1 and r = -1 occur when there is an exact linear relationship between x and y. If y tends to increase in magnitude as x increases, r is greater than 0, and x and y are said to be positively correlated. If y decreases as x increases, r is less than 0, and the two variables are negatively correlated. If r = 0, there is no linear relationship between x and y, and the variables are uncorrelated. http://cclearn.npue.edu.tw/tuition/ccchen-web/教育統計學/7.pdf

http://upload.wikimedia.org/wikipedia/commons/0/02/Correlation_examples.png

CC is not a percent. In addition to telling you whether two variables are related to one another, whether the relationship is positive or negative, and how large the relationship is, the correlation coefficient carries one more important piece of information: how much of the variation in one variable is related to changes in the other variable. However, a correlation coefficient is a “ratio”, not a percent. Many students tend to think that r = .90 means that 90% of the changes in one variable are accounted for by, or related to, the other variable. Even worse, some think it means that any predictions you make will be 90% accurate. Neither is correct!

Correlation coefficient: coefficient of determination. It is, however, very easy to translate the correlation coefficient into a percentage. All you have to do is “square the correlation coefficient”, which means that you multiply it by itself. So, if the symbol for a correlation coefficient is “r”, then the symbol for this new statistic is simply “r²”, which can be called “r squared”. r², also called the “coefficient of determination”, tells you how much variation in one variable is directly related to (or accounted for by) the variation in the other variable.

The correlation coefficient is r = 0.80. By squaring r to get r² = 0.64, you find that fully 64% of the variation in scores on Variable B is directly related to how they scored on Variable A.
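To make “variation accounted for” concrete, the sketch below fits a least-squares line and checks that r² equals 1 − SSE/SST, the fraction of the variation in y that the line explains. This identity holds for simple linear regression. The function name and the scores are made up for illustration.

```python
def r_squared_two_ways(x, y):
    """Return (r**2, 1 - SSE/SST) for the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)  # total variation in y (SST)
    r = sxy / (sxx * syy) ** 0.5
    # Least-squares line y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation in y left unexplained by the line (SSE).
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return r * r, 1 - sse / syy


# Hypothetical scores on Variable A and Variable B.
scores_a = [52, 60, 65, 71, 78, 84, 90]
scores_b = [48, 63, 59, 75, 70, 88, 85]
r2, explained = r_squared_two_ways(scores_a, scores_b)
print(f"r^2 = {r2:.4f}, explained fraction = {explained:.4f}")
```

The two numbers agree up to rounding, which is why r², rather than r itself, can be read as a percentage.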

Statistical test

Correlation coefficient: statistical inference. To test whether the correlation between two variables is significant: H0: ρ = 0, H1: ρ ≠ 0. The statistic (under H0): t = r · sqrt((n − 2) / (1 − r²)), which follows a t distribution with n − 2 degrees of freedom. http://zoro.ee.ncku.edu.tw/mlb2009/res/14-ch5.pdf (pp. 9-14)

Test the significance of the correlation coefficient for the age and blood pressure data. Suppose that n = 6, r = 0.897 and α = 0.05. Step 1: State the hypotheses. H0: ρ = 0, H1: ρ ≠ 0. Step 2: Find the critical values. Since α = 0.05 and there are 6 – 2 = 4 degrees of freedom, the critical values are t = +2.776 and t = –2.776. Step 3: Compute the test value. t = 4.059. Step 4: Make the decision. Reject the null hypothesis, since the test value falls in the critical region (4.059 > 2.776). Step 5: Summarize the results. There is a significant relationship between the variables of age and blood pressure.
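The five steps can be sketched in a few lines. This assumes the standard test statistic t = r · sqrt((n − 2) / (1 − r²)), which reproduces the slide's test value of about 4.059. The critical value 2.776 is copied from the slide rather than computed, and the function name is illustrative.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Values from the age and blood pressure example.
n, r = 6, 0.897
t_critical = 2.776  # two-sided, alpha = 0.05, 4 degrees of freedom (from the slide)

t_value = correlation_t_statistic(r, n)
reject_h0 = abs(t_value) > t_critical  # two-sided decision rule
print(f"t = {t_value:.3f}, reject H0: {reject_h0}")
```

With only six observations the test still rejects H0 because r is large. The same r with a smaller n would give a smaller t and a larger critical value.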

Correlation coefficient: limitations. It quantifies only the strength of the linear relationship between two variables. Care must be taken when the data contain any outliers, or pairs of observations that lie considerably outside the range of the other data points. A high correlation between two variables does not imply a cause-and-effect relationship.

Four sets of data with the same correlation of 0.816 http://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Anscombe%27s_quartet_3.svg/2000px-Anscombe%27s_quartet_3.svg.png

Spearman’s rank CC. Pearson’s correlation coefficient is very sensitive to outlying values. We may be interested in calculating a measure of association that is more robust. One approach is to rank the two sets of outcomes x and y separately and compute the correlation coefficient of the ranks. This is known as Spearman’s rank correlation coefficient, where xri and yri are the ranks associated with the ith subject rather than the actual observations.
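A minimal sketch of this idea follows: rank x and y separately, then apply the Pearson formula to the ranks. It assumes tied values receive the average of their ranks, a common convention that the slide does not specify. The helper names and the data are illustrative.

```python
import math


def average_ranks(values):
    """Ranks 1..n, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up data: y increases with x, but the last point is an extreme outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 7, 9, 100]
pearson_value = pearson_r(x, y)
spearman_value = spearman_rho(x, y)
print(f"Pearson r = {pearson_value:.3f}, Spearman rho = {spearman_value:.3f}")
```

The outlier pulls Pearson's r well below 1, while the ranks are unaffected by how far the last point lies from the rest, so Spearman's coefficient still reports a perfect monotone relationship.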

About Correlation Coefficient

Statistical inference. Basic tests: tests about proportions; tests about one mean; tests of the equality of two means; tests for variances. References: http://zoro.ee.ncku.edu.tw/mlb2009/res/14-ch5.pdf (pp. 27-33) http://www.math.isu.edu.tw/finance/course/sta/ch8.ppt http://www.tnb.org.tw/Image/ttest.ppt http://www.mis.ncyu.edu.tw/course/download/cftai/Chapter%206.%20Continuous%20Probability%20Distribution.PPT More advanced tests: ANOVA (analysis of variance); goodness of fit (Wilcoxon test, Kolmogorov-Smirnov test, …)

Multivariate analysis. Statistics: ANOVA; multiple linear regression (http://www.sjsu.edu/faculty/gerstman/biostat-text/Gerstman_PP15.ppt, http://www.stat.nuk.edu.tw/Ray-Bing/regression/regression/Chapter3.ppt); PCA (principal component analysis); ICA (independent component analysis); LDA (linear discriminant analysis). All of the techniques so far belong to statistics. You can find them in most statistical software, such as MATLAB, R (http://www.r-project.org/), SPSS… Machine learning: Naïve Bayes (http://zoro.ee.ncku.edu.tw/mlb2009/res/11-ch4.pdf pp. 13-27); LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm/); RVKDE (http://mbi.ee.ncku.edu.tw/wiki/doku.php?id=rvkde)