Sampling Distribution of Pearson Correlation


IMLMLIB: some IML functions and a subroutine
- The RANDNORMAL function
- The CORR function
- The QNTL subroutine

Generate random samples from a bivariate normal distribution and calculate the correlation for each sample

%let obs = 20;       /* size of each sample */
%let reps = 1000;    /* number of samples */
proc iml;
call randseed(54321);
mu = {0 0};                  /* mean vector */
print mu;
Sigma = {1 0.3, 0.3 1};      /* covariance matrix */
print Sigma;
rho = j(&reps, 1);           /* allocate vector for results */
do i = 1 to &reps;           /* simulation loop */
   x = RandNormal(&obs, mu, Sigma);   /* simulated data: &obs rows, 2 columns */
   /* corr returns a matrix; keep the Pearson correlation for the ith sample */
   rho[i] = corr(x)[1,2];
end;
print x;                     /* the last simulated sample */
print (rho[1:5,]);           /* the first five correlations */
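For reference, the two quantities in this step can be written out. Because both variances in Sigma are 1, the off-diagonal entry 0.3 is also the population correlation, and each pass of the loop computes the sample correlation r from the two columns of x. The notation below (x_{k1}, x_{k2} for the k-th row of x, and n for the sample size, here 20) is introduced for illustration and does not appear in the slides.

```latex
\rho = \frac{\sigma_{12}}{\sqrt{\sigma_{11}\,\sigma_{22}}}
     = \frac{0.3}{\sqrt{1 \cdot 1}} = 0.3,
\qquad
r = \frac{\sum_{k=1}^{n} (x_{k1}-\bar{x}_1)(x_{k2}-\bar{x}_2)}
         {\sqrt{\sum_{k=1}^{n} (x_{k1}-\bar{x}_1)^2}\;
          \sqrt{\sum_{k=1}^{n} (x_{k2}-\bar{x}_2)^2}}
```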

Compute quantiles and create a data set

call qntl(q, rho, {0.05 0.25 0.5 0.75 0.95});
print (q`)[colname={"P5" "P25" "Median" "P75" "P95"}];
create corr var {"Rho"};     /* write the rho vector to data set corr */
append;
close;
quit;

Visualize the approximate sampling distribution

proc univariate data=Corr;
   label Rho = "Pearson Correlation Coefficient";
   histogram Rho / kernel;
   ods select Histogram;
run;

Find the percentage of negative correlations in the approximate sampling distribution

proc sql;
   select sum(Rho<0)/count(*) as pctneg "Percent negative" format=percent6.1
   from corr;
quit;
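The same proportion could probably also be computed without PROC SQL by reading the data set back into IML. This is a minimal sketch, not part of the original slides; it assumes the data set corr created earlier still exists in the session, and the variable name propNeg is made up for illustration.

```sas
proc iml;
/* read the simulated correlations written by the earlier CREATE/APPEND step */
use corr;
read all var {"Rho"} into rho;
close corr;

/* (rho < 0) is a 0/1 indicator vector; the [:] subscript reduction
   operator takes its mean, i.e. the proportion of negative values */
propNeg = (rho < 0)[:];
print propNeg[format=percent6.1 label="Percent negative"];
quit;
```

Since the PROC SQL step evaluates the same ratio on the same data set, the two results should agree.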