10-3 Inferences
What about making inferences?
r, a and b are sample statistics.
Sample r becomes population ρ, and ŷ = a + bx becomes y = α + βx (a and α are intercepts; b and β are slopes).
The idealized model: certain assumptions must be met.
- (x, y) is a random sample from the population.
- For each fixed x, y has a normal distribution.
- All the y distributions have the same variance.
Other books use β0 and β1 for α and β.
Testing ρ
i.e. testing whether there is or is not a linear correlation
1. Set your hypotheses.
   H0: ρ = 0
   HA: ρ > 0, ρ < 0, or ρ ≠ 0
2. Compute the test statistic.
   t = r·√(n − 2) / √(1 − r²), with d.f. = n − 2
   Logic dictates how many data pairs (n)? The d.f. must be at least 1, so n ≥ 3.
3. Find the P-value.
4. Compare the P-value to the level of significance α and decide whether to reject H0.
5. State your conclusion.
What is the purpose? Seeing if there is a linear correlation of any type, NOT determining causation.
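The five steps for testing ρ can be sketched in Python. This is only a minimal sketch under stated assumptions: the five data pairs are hypothetical, the helper names (`correlation_t`, `t_density`, `two_tailed_p`) are made up for illustration, and the P-value comes from numerically integrating the Student's t density with Simpson's rule instead of reading a table or using a calculator.

```python
import math

def correlation_t(xs, ys):
    """Return sample r, t = r*sqrt(n-2)/sqrt(1-r^2), and d.f. = n-2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)  # undefined if |r| = 1
    return r, t, n - 2

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value: 1 - 2 * (area under the density from 0 to |t|)."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1 - 2 * (total * h / 3)

# Hypothetical data pairs, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, t, df = correlation_t(xs, ys)   # step 2
p_value = two_tailed_p(t, df)      # step 3, for HA: rho != 0
alpha = 0.05
reject_h0 = p_value <= alpha       # step 4

print(f"r = {r:.4f}, t = {t:.4f}, d.f. = {df}, P-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

For these hypothetical pairs r is about 0.775 and the P-value is about 0.124, so H0 is not rejected at α = 0.05. For a one-tailed alternate, half of the two-tailed P-value would be used when the sign of t matches the direction of HA.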
Measuring Spread
What if you feel there is a linear correlation but are not sure how “good” it is? That is, the line may touch two of the points while some other points are substantially off.
The line is a model. The model assumes that the means of the y distributions fall on the line, even though individual data points may not.
Measuring Spread
Error can be determined a number of ways.
Method 1: Using residuals (the standard error of estimate, Se)
Se = √( Σ(y − ŷ)² / (n − 2) ), where ŷ = a + bx (a is the intercept, b is the slope) and n ≥ 3
Se = √( (Σy² − aΣy − bΣxy) / (n − 2) )   Use this one!!
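A short Python sketch of Method 1 follows. It assumes a is the intercept and b is the slope of the least-squares line, uses five hypothetical data pairs, and computes Se two ways: directly from the residuals y − ŷ, and from the computational form built on Σy², Σy and Σxy. The function names are invented for illustration.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * sum_x / n
    return a, b

def se_from_residuals(xs, ys, a, b):
    """Se = sqrt( sum (y - y_hat)^2 / (n - 2) )."""
    n = len(xs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

def se_computational(xs, ys, a, b):
    """Se = sqrt( (sum y^2 - a*sum y - b*sum xy) / (n - 2) )."""
    n = len(xs)
    sum_y = sum(ys)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

# Hypothetical data pairs, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
se_resid = se_from_residuals(xs, ys, a, b)
se_comp = se_computational(xs, ys, a, b)
print(f"a = {a:.4f}, b = {b:.4f}")
print(f"Se (residuals) = {se_resid:.4f}, Se (computational) = {se_comp:.4f}")
```

Both forms give about 0.894 here. The computational form subtracts large, nearly equal sums, so when working by hand it is sensitive to rounding a and b too early.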
Measuring Spread (cont)
Method 2: Find a confidence interval for y
For the population, the true y is made up of a population slope, a population y-intercept, plus some sort of random error. Therefore, we can create a confidence interval for y that allows us to predict the true y.
Measuring Spread (cont)
Method 2: This will look familiar.
Based on n ≥ 3 data pairs, after finding ŷ use
ŷ − E < y < ŷ + E, with E = t_c · Se · √( 1 + 1/n + (x − x̄)² / (Σx² − (Σx)²/n) ) and d.f. = n − 2
where ŷ = a + bx (a is the intercept, b is the slope), x is the value at which y is predicted, x̄ is the sample mean of x, c = confidence level, n = number of data pairs, and Se is the standard error of estimate.
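As a sketch of Method 2, the Python below builds the interval ŷ − E < y < ŷ + E at one chosen x. Everything specific in it is an assumption for illustration: the data pairs are hypothetical, `interval_for_y` is an invented name, a is taken as the intercept and b as the slope, and the critical value t_c = 2.353 (c = 0.90, d.f. = 3) is copied from a standard t table rather than computed.

```python
import math

def interval_for_y(xs, ys, x0, t_c):
    """Return (y_hat, E) for the confidence interval for y at x = x0."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    x_bar = sum_x / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = ss_xy / ss_xx              # sample slope
    a = sum_y / n - b * x_bar      # sample intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))  # standard error of estimate
    y_hat = a + b * x0
    margin = t_c * se * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat, margin

# Hypothetical data pairs, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

T_C_90_DF3 = 2.353  # t table value for c = 0.90 with d.f. = n - 2 = 3
y_hat, margin = interval_for_y(xs, ys, x0=4, t_c=T_C_90_DF3)
print(f"{y_hat - margin:.2f} < y < {y_hat + margin:.2f}")
```

The (x − x̄)² term makes the interval widen as x moves away from x̄, so predictions near the middle of the data are the most precise.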
Measuring Spread (cont)
Method 3: Using a Student's t distribution, you will test the population slope (β).
The null hypothesis is β = 0, with an alternate hypothesis like before (β > 0, β < 0, or β ≠ 0).
What will the test statistic simplify to be? In general t = (b − β) / ( Se / √( Σx² − (Σx)²/n ) ), where b is the sample slope. With β = 0 this becomes
t = (b / Se) · √( Σx² − (Σx)²/n ), with d.f. = n − 2
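Measuring Spread (cont)
Method 3: Therefore, the confidence interval will be
b − E < β < b + E, with E = t_c · Se / √( Σx² − (Σx)²/n ) and d.f. = n − 2, where b is the sample slope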
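Method 3 can be sketched the same way. The code below is an illustration under assumptions, not a reference implementation: the data pairs are hypothetical, `slope_inference` is an invented name, b denotes the sample slope, and t_c = 2.353 (c = 0.90, d.f. = 3) is taken from a standard t table.

```python
import math

def slope_inference(xs, ys, t_c):
    """Return sample slope b, test statistic t for H0: beta = 0, d.f., and margin E."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * sum_x / n
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))
    std_err_b = se / math.sqrt(ss_xx)  # Se / sqrt(sum x^2 - (sum x)^2 / n)
    t = b / std_err_b                  # (b - beta) / std_err_b with beta = 0
    margin = t_c * std_err_b
    return b, t, n - 2, margin

# Hypothetical data pairs, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

T_C_90_DF3 = 2.353  # t table value for c = 0.90 with d.f. = 3
b, t, df, margin = slope_inference(xs, ys, T_C_90_DF3)
low, high = b - margin, b + margin
print(f"b = {b:.4f}, t = {t:.4f}, d.f. = {df}")
print(f"90% interval for beta: {low:.4f} < beta < {high:.4f}")
```

For these pairs t is about 2.121, the same value the test of ρ gives for the same data, which is why the two tests always agree. The interval here contains 0, which matches failing to reject β = 0 in a two-tailed test at the 10% level.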
Calculator
Your calculator can find the standard error and can also do a linear regression t test (called LinRegTTest).
Problem
Let x be a random variable that represents the batting average of a professional baseball player. Let y be a random variable that represents the percentage of strikeouts of a professional baseball player. A random sample of 6 professional baseball players gave the following information:

x   0.328   0.290   0.340   0.248   0.367   0.269
y   3.2     7.6     4.0     8.6     3.1     11.1
Using the data above:
a) Use a 5% level of significance to test the claim that ρ ≠ 0.
b) Find Se, a and b.
c) Find the predicted percentage of strikeouts for a player with an x = 0.300 batting average.
d) Find an 80% confidence interval for y when x = 0.300.
e) Use a 5% level of significance to test the claim that β ≠ 0.
f) Find a 90% confidence interval for β and interpret its meaning.
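One way to check answers to parts a) through f) is the Python sketch below. It uses the six data pairs from the problem and the formulas of this section, with a taken as the intercept and b as the slope. The variable and function names are illustrative, the P-values come from a Simpson's-rule integration of the t density rather than from LinRegTTest, and the critical values 1.533 (c = 0.80) and 2.132 (c = 0.90) for d.f. = 4 are copied from a standard t table.

```python
import math

xs = [0.328, 0.290, 0.340, 0.248, 0.367, 0.269]  # batting averages
ys = [3.2, 7.6, 4.0, 8.6, 3.1, 11.1]             # strikeout percentages
n = len(xs)
df = n - 2

sum_x, sum_y = sum(xs), sum(ys)
x_bar, y_bar = sum_x / n, sum_y / n
ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n

def t_density(t, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    coef = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coef * (1 + t * t / dof) ** (-(dof + 1) / 2)

def two_tailed_p(t, dof, steps=2000):
    """Two-tailed P-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, dof) + t_density(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, dof)
    return 1 - 2 * (total * h / 3)

# a) Test H0: rho = 0 against HA: rho != 0 at the 5% level.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
t_r = r * math.sqrt(df) / math.sqrt(1 - r * r)
p_r = two_tailed_p(t_r, df)
reject_rho = p_r <= 0.05

# b) Least-squares line y-hat = a + b*x and standard error of estimate.
b = ss_xy / ss_xx
a = y_bar - b * x_bar
se = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / df)

# c) Predicted strikeout percentage at x = 0.300.
x0 = 0.300
y_hat = a + b * x0

# d) 80% confidence interval for y at x = 0.300 (t_c = 1.533 from a t table).
e_y = 1.533 * se * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
ci_y = (y_hat - e_y, y_hat + e_y)

# e) Test H0: beta = 0 against HA: beta != 0 at the 5% level.
t_b = b / (se / math.sqrt(ss_xx))
p_b = two_tailed_p(t_b, df)
reject_beta = p_b <= 0.05

# f) 90% confidence interval for beta (t_c = 2.132 from a t table).
e_b = 2.132 * se / math.sqrt(ss_xx)
ci_b = (b - e_b, b + e_b)

print(f"a) r = {r:.3f}, t = {t_r:.3f}, P-value = {p_r:.4f}, reject H0: {reject_rho}")
print(f"b) Se = {se:.3f}, a = {a:.2f}, b = {b:.2f}")
print(f"c) y-hat = {y_hat:.2f}")
print(f"d) {ci_y[0]:.2f} < y < {ci_y[1]:.2f}")
print(f"e) t = {t_b:.3f}, P-value = {p_b:.4f}, reject H0: {reject_beta}")
print(f"f) {ci_b[0]:.2f} < beta < {ci_b[1]:.2f}")
```

Under these assumptions r is about −0.891 with a P-value near 0.017, so both H0: ρ = 0 and H0: β = 0 are rejected at the 5% level. For part f) the interval runs from roughly −100.4 to −29.8 per unit of x. Since batting averages are on a 0 to 1 scale, that reads as about 0.3 to 1.0 fewer percentage points of strikeouts for each 0.010 increase in batting average, with 90% confidence.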