POPULATION: Random Variable X, with pmf p(x) or pdf f(x)
Recall…
PARAMETERS (“population characteristics”)
Mean: μ = E[X]
Variance: σ² = E[(X − μ)²] = E[X²] − μ². σ² measures how much X varies about its mean μ.
Proof: See PowerPoint section 3.1-3.3, slides 56, 57 for discrete X.
POPULATION(S): Random Variables X, Y with joint pmf p(x, y) or pdf f(x, y)
Is there an association between X and Y, and if so, how is it measured?
PARAMETERS
Means: μX = E[X], μY = E[Y]
Variances: σX² = E[(X − μX)²], σY² = E[(Y − μY)²]
Covariance: σXY = Cov(X, Y) = E[(X − μX)(Y − μY)] = E[XY] − μX μY
Proof: E[(X − μX)(Y − μY)] = E[XY] − μX E[Y] − μY E[X] + μX μY = E[XY] − μX μY.
Properties: The covariance of two random variables measures how they vary together about their respective means. Variance is ≥ 0, but covariance is unrestricted in sign. Cov(X, X) = Var(X). Other properties based on expected value…
Covariance: Cov(X, Y) = E[(X − μX)(Y − μY)], computed as Σx Σy (x − μX)(y − μY) p(x, y) for discrete X, Y, or as ∫∫ (x − μX)(y − μY) f(x, y) dx dy for continuous X, Y.
… but what does it mean?
Joint pmf table, with the marginal pmfs pX and pY in the margins:

X \ Y    y1          y2          y3          y4          y5
x1       p(x1, y1)   p(x1, y2)   p(x1, y3)   p(x1, y4)   p(x1, y5)   pX(x1)
x2       p(x2, y1)   p(x2, y2)   p(x2, y3)   p(x2, y4)   p(x2, y5)   pX(x2)
x3       p(x3, y1)   p(x3, y2)   p(x3, y3)   p(x3, y4)   p(x3, y5)   pX(x3)
x4       p(x4, y1)   p(x4, y2)   p(x4, y3)   p(x4, y4)   p(x4, y5)   pX(x4)
x5       p(x5, y1)   p(x5, y2)   p(x5, y3)   p(x5, y4)   p(x5, y5)   pX(x5)
         pY(y1)      pY(y2)      pY(y3)      pY(y4)      pY(y5)      1
Example: X and Y each take the values 1, 2, 3, 4, 5. Every cell of the joint pmf table is p(x, y) = .04, and every marginal probability is pX(x) = pY(y) = .20.
In a uniform population, each of the points {(1, 1), (1, 2), …, (5, 5)} has the same probability. A scatterplot would reveal no particular association between X and Y. In fact, p(x, y) = pX(x) pY(y) for every pair (.04 = .20 × .20), i.e., X and Y are statistically independent! It is easy to see that Cov(X, Y) = 0.
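The uniform example can be checked numerically. The following is a minimal sketch, assuming the 5 × 5 table with every cell equal to .04 = 1/25; the variable names are illustrative only.

```python
from fractions import Fraction

# Joint pmf of the uniform example: every pair (x, y) in {1..5}^2 has probability 1/25 = .04.
values = range(1, 6)
joint = {(x, y): Fraction(1, 25) for x in values for y in values}

# Marginal pmfs: sum the joint pmf over the other variable.
p_x = {x: sum(joint[(x, y)] for y in values) for x in values}
p_y = {y: sum(joint[(x, y)] for x in values) for y in values}

# Independence check: p(x, y) == pX(x) * pY(y) in every cell.
independent = all(joint[(x, y)] == p_x[x] * p_y[y] for x in values for y in values)

# Means, then covariance from the definition Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)].
mu_x = sum(x * p for x, p in p_x.items())
mu_y = sum(y * p for y, p in p_y.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

print(independent, mu_x, mu_y, cov)  # prints: True 3 3 0
```

Exact fractions are used instead of floats so that the covariance comes out as exactly 0 rather than a tiny rounding error.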
Exercise: X and Y each take the values 1, 2, 3, 4, 5. Only the marginal probabilities are given: one marginal pmf is .04, .12, .20, .28, .36 and the other is .10, .15, .20, .25, .30. Fill in the table so that X and Y are statistically independent. Then show that Cov(X, Y) = 0.
THEOREM. If X and Y are statistically independent, then Cov(X, Y) = 0. However, the converse does not necessarily hold!
Exception: The Bivariate Normal Distribution, for which Cov(X, Y) = 0 does imply independence.
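To see why the converse can fail, here is a small sketch using a standard counterexample that is not taken from the slides: X uniform on {−1, 0, 1} and Y = X². Y is completely determined by X, yet the covariance is zero.

```python
from fractions import Fraction

# Assumed counterexample (not from the slides): X uniform on {-1, 0, 1}, Y = X^2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginal pmfs obtained by summing the joint pmf.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in xs}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in ys}

mu_x = sum(x * p for x, p in p_x.items())
mu_y = sum(y * p for y, p in p_y.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Independence would require p(x, y) == pX(x) * pY(y) in every cell,
# including cells where the joint probability is 0.
independent = all(
    joint.get((x, y), Fraction(0)) == p_x[x] * p_y[y] for x in xs for y in ys
)

print(cov, independent)  # prints: 0 False
```

The relationship between X and Y here is perfectly deterministic but not linear, which is exactly the kind of association covariance does not detect.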
Example:

X \ Y    1     2     3     4     5     pX(x)
1       .08   .04   .03   .02   .01    .18
2       .04   .08   .04   .03   .02    .21
3       .03   .04   .08   .04   .03    .22
4       .02   .03   .04   .08   .04    .21
5       .01   .02   .03   .04   .08    .18
pY(y)   .18   .21   .22   .21   .18    1

When X is large, Y is large with high probability; when X is small, Y is small with high probability. As X increases, Y also has a tendency to increase; thus, X and Y are said to be positively correlated. Likewise, for two negatively correlated variables, Y has a tendency to decrease as X increases. The simplest mathematical object to have this property is a straight line.
PARAMETERS
Linear Correlation Coefficient (“rho”): ρ = Cov(X, Y) / (σX σY). Always between −1 and +1.
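As an illustration, ρ can be computed for the preceding 5 × 5 example. This sketch assumes that the joint pmf depends only on |x − y|, with entries .08, .04, .03, .02, .01; that assumption reproduces the listed marginals .18, .21, .22. The helper `expect` is a hypothetical name introduced here.

```python
import math

# Assumed joint pmf: p(x, y) depends only on the distance |x - y|.
by_distance = [0.08, 0.04, 0.03, 0.02, 0.01]
values = range(1, 6)
joint = {(x, y): by_distance[abs(x - y)] for x in values for y in values}

def expect(g):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Linear correlation coefficient: covariance scaled by both standard deviations.
rho = cov / math.sqrt(var_x * var_y)

print(round(cov, 4), round(var_x, 4), round(rho, 4))
```

Under this assumption the covariance is about 0.82, each variance is about 1.86, and ρ is about 0.44, a positive linear correlation consistent with the probability mass concentrating near the diagonal.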
Linear Correlation Coefficient: ρ measures the strength of linear association between X and Y. Always between −1 and +1.
Scatterplot: JAMA. 2003;290:1486-1493
Linear Correlation Coefficient. Example scatterplot: IQ vs. Head circumference
Scale for ρ:
−1 to −0.75: strong negative linear correlation
−0.75 to −0.5: moderate negative linear correlation
−0.5 to +0.5: weak linear correlation
+0.5 to +0.75: moderate positive linear correlation
+0.75 to +1: strong positive linear correlation
Linear Correlation Coefficient. Example scatterplot: Body Temp vs. Age
A strong positive correlation exists between ice cream sales and drowning. Cause & Effect? NOT LIKELY… “Temp (F)” is a confounding variable.
Linear Correlation Coefficient. Example scatterplot: Profit vs. Price
PARAMETERS: Means, Variances, Covariance
Definition
Theorem
Special case: Y = constant c
Theorem
Theorem
Proof: (WLOG)
Theorem (WLOG)
Theorem: If X and Y are independent, then Cov(X, Y) = 0. Proof: Exercise… Hint: See slide 4 above. The converse is not necessarily true!
Corollary: If X and Y are independent, then …
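A minimal numerical check, assuming the theorem in question is the standard identity Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), so that for independent X and Y (where Cov(X, Y) = 0) the variance of the sum is Var(X) + Var(Y). The joint pmf below is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf on {0, 1} x {0, 1}; not taken from the slides.
joint = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(4, 10),
}

def expect(g):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Variance of the sum, computed directly from its definition.
mu_sum = mu_x + mu_y
var_sum = expect(lambda x, y: (x + y - mu_sum) ** 2)

# Compare with the right-hand side of the assumed identity.
identity_holds = var_sum == var_x + var_y + 2 * cov
print(var_sum, var_x + var_y + 2 * cov, identity_holds)
```

Here Cov(X, Y) = 1/10 is not zero, so Var(X + Y) = 69/100 exceeds Var(X) + Var(Y) = 49/100; the two agree only when the covariance term vanishes.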
Important Exception: BIVARIATE NORMAL DISTRIBUTION