1
Supplemental Lecture Notes
1 - Introduction
2 - Exploratory Data Analysis
3 - Probability Theory
4 - Classical Probability Distributions
5 - Sampling Distributions / Central Limit Theorem
6 - Statistical Inference
7 - Correlation and Regression
(8 - Survival Analysis)
2
Population Distribution of X
Suppose X ~ N(μ, σ), where X = Age of women in U.S. at first birth, with μ = 25.4 and σ = 1.5.
Each of the individual ages x1, x2, x3, x4, x5, … etc. is a particular value of the random variable X. Most are in the neighborhood of μ, but there are occasional outliers in the tails of the distribution.
[Figure: density curve of X, centered at μ = 25.4 with σ = 1.5, with individual values x marked along the axis.]
3
Population Distribution of X
Suppose X ~ N(μ, σ), where X = Age of women in U.S. at first birth, with μ = 25.4 and σ = 1.5.
Take repeated random samples, each of size n = 400, and record the sample mean of each one. Each of these sample mean values is a “point estimate” of the population mean μ. How are these values distributed?
[Figure: density curve of X, with several samples of size n = 400 drawn from it, … etc.]
4
Population Distribution of X
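Suppose X ~ N(μ, σ), where X = Age of women in U.S. at first birth, with μ = 25.4 and σ = 1.5.
Each sample mean value is a “point estimate” of the population mean μ. How are these values distributed?
Sampling Distribution of X̄: X̄ ~ N(μ, σ/√n), for any sample size n. Its mean is the population mean μ = 25.4, and its standard deviation σ/√n is called the “standard error”. With σ = 1.5 and samples of size n = 400, σ/√n = 1.5/20 = 0.075.
The vast majority of sample means are extremely close to μ, i.e., they show extremely small variability.
[Figure: density of X beside the much narrower density of X̄, both centered at μ = 25.4.]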
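The standard error on this slide can be checked numerically. The sketch below uses the slide's μ = 25.4, σ = 1.5 and n = 400; the helper names, the number of repetitions and the seed are illustrative choices, not part of the lecture.

```python
import math
import random
import statistics


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


MU, SIGMA, N = 25.4, 1.5, 400   # values quoted on the slides
se = standard_error(SIGMA, N)   # 1.5 / 20
means = simulate_sample_means(MU, SIGMA, N, reps=1000)  # reps is arbitrary

print(f"theoretical standard error:   {se:.3f}")
print(f"simulated SD of sample means: {statistics.stdev(means):.3f}")
print(f"simulated mean of sample means: {statistics.fmean(means):.2f}")
```

The simulated spread of the sample means should land close to σ/√n, which is far smaller than the spread σ of the individual ages.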
5
Population Distribution of X
Suppose X ~ N(μ, σ), where X = Age of women in U.S. at first birth, with μ = 25.4 and σ = 2.4.
Sampling Distribution of X̄: X̄ ~ N(μ, σ/√n), for any sample size n (and so, in particular, for large sample size n). Its standard deviation σ/√n is the “standard error”.
Each sample mean value is a “point estimate” of the population mean μ. The vast majority of sample means are extremely close to μ, i.e., they show extremely small variability.
[Figure: density of X beside the narrower density of X̄, both centered at μ = 25.4.]
6
Population Distribution of X
Suppose X ~ Anything with finite μ and σ (no longer necessarily N(μ, σ)), where X = Age of women in U.S. at first birth, with μ = 25.4 and σ = 2.4.
Sampling Distribution of X̄: X̄ is approximately N(μ, σ/√n) for large sample size n. (When X ~ N(μ, σ), this holds exactly, for any sample size n.) Its standard deviation σ/√n is the “standard error”.
Each sample mean value is a “point estimate” of the population mean μ. The vast majority of sample means are extremely close to μ, i.e., they show extremely small variability.
[Figure: density of X beside the narrower density of X̄, both centered at μ = 25.4.]
7
[Figure: density of X beside the density of X̄, whose standard deviation σ/√n is the “standard error”.]
8
Probability that a single house selected at random costs less than $300K = ?
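Example: X = Cost of new house ($K)
P(X < 300) = Cumulative area under density curve for X up to 300, found from the Z-score Z = (300 – μ)/σ.
[Figure: density of X with the area to the left of 300 shaded, beside the density of X̄ with its “standard error”.]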
9
Probability that a single house selected at random costs less than $300K = ?
Example: X = Cost of new house ($K)
P(X < 300) = 0.6554, found from the Z-score Z = (300 – μ)/σ.
10
Probability that a single house selected at random costs less than $300K = ?
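Example: X = Cost of new house ($K)
Single house: P(X < 300) = 0.6554, found from the Z-score.
Probability that the sample mean of n = 36 houses selected at random is less than $300K = ?
P(X̄ < 300) = Cumulative area under density curve for X̄ up to 300, where the “standard error” is σ/√n = $12.5K.
[Figure: densities of X and of X̄, each with the area to the left of 300 shaded.]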
11
Probability that a single house selected at random costs less than $300K = ?
Example: X = Cost of new house ($K)
Single house: P(X < 300) = 0.6554, found from the Z-score.
Probability that the sample mean of n = 36 houses selected at random is less than $300K = ?
P(X̄ < 300) = 0.9918, found from the Z-score that uses the “standard error” σ/√n = $12.5K.
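The two probabilities can be reproduced with a short calculation. The slides as transcribed do not show μ and σ, so the values below are an assumption: μ = 270 and σ = 75 ($K) are back-solved from the quoted standard error ($12.5K with n = 36) and the quoted probabilities, and should be read as a consistent guess rather than the lecture's stated parameters.

```python
from statistics import NormalDist

# ASSUMPTION: mu and sigma are not legible in the source; these values are
# back-solved from the quoted 0.6554, 0.9918, n = 36 and $12.5K.
MU = 270.0      # hypothetical population mean, $K
SIGMA = 75.0    # hypothetical population SD, $K
N = 36          # sample size from the slide
CUTOFF = 300.0  # $300K

se = SIGMA / N ** 0.5                 # standard error of the sample mean

z_single = (CUTOFF - MU) / SIGMA      # Z-score for one house
z_mean = (CUTOFF - MU) / se           # Z-score for the mean of 36 houses

p_single = NormalDist(MU, SIGMA).cdf(CUTOFF)   # P(X < 300)
p_mean = NormalDist(MU, se).cdf(CUTOFF)        # P(Xbar < 300)

print(f"SE = {se}")
print(f"single house: Z = {z_single:.1f}, P = {p_single:.4f}")
print(f"sample mean:  Z = {z_mean:.1f}, P = {p_mean:.4f}")
```

The same cutoff of 300 sits much further out in the tail of the sampling distribution, because the standard error is σ/√36, one sixth of σ.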
12
If X has a mild skew, then for large n the sample mean X̄ is approximately N(μ, σ/√n), where σ/√n is the “standard error”.
[Figure: mildly skewed density of X beside the approximately normal density of X̄.]
13
~ CENTRAL LIMIT THEOREM ~
For X continuous or discrete: X̄ is approximately N(μ, σ/√n) as n → ∞, i.e., for n large, where σ/√n is the “standard error”.
[Figure: density of X beside the density of X̄.]
14
~ CENTRAL LIMIT THEOREM ~
For X continuous or discrete: X̄ is approximately N(μ, σ/√n) as n → ∞, i.e., for n large, where σ/√n is the “standard error”.
Example: X = Cost of new house ($K)
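A small simulation illustrates the theorem for a clearly non-normal population. The exponential distribution with rate 1 (mean 1, standard deviation 1) is chosen here purely as an example of a skewed X; it is not taken from the slides, and the sample size, repetition count and seed are arbitrary.

```python
import math
import random
import statistics


def coverage_of_sample_means(n, reps, seed=1):
    """Share of exponential(1) sample means within 1.96 standard errors of mu."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0              # mean and SD of exponential(rate=1)
    se = sigma / math.sqrt(n)         # standard error of the sample mean
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        if abs(xbar - mu) <= 1.96 * se:
            hits += 1
    return hits / reps


share = coverage_of_sample_means(n=100, reps=2000)
print(f"share of sample means within 1.96 SE of mu: {share:.3f}")
print("a normal sampling distribution would give about 0.95")
```

If the sampling distribution of X̄ is close to N(μ, σ/√n), roughly 95% of the simulated means should fall within 1.96 standard errors of μ, even though the individual values are heavily skewed.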
15
Probability that a single house selected at random costs less than $300K = ?
Example: X = Cost of new house ($K)
P(X < 300) = Cumulative area under density curve for X up to 300.
Probability that the sample mean of n = 36 houses selected at random is less than $300K = ?
P(X̄ < 300) = 0.9918, found from the Z-score that uses the “standard error” σ/√n = $12.5K.
17
x      p(x)
0      0.5
10     0.3
20     0.2
18
Sampling distribution of the mean X̄ of n = 2 independent values of X:
x̄      p(x̄)
0      .25
5      .30
10     .29
15     .12
20     .04
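The second table follows from the first by listing every pair of values and multiplying probabilities. A minimal sketch of that enumeration, assuming the population values are 0, 10 and 20 with probabilities 0.5, 0.3 and 0.2 and that the sample size is n = 2 (the function names are illustrative):

```python
from fractions import Fraction
from itertools import product

# Population distribution: x -> p(x)
pmf = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}


def sampling_dist_of_mean(pmf, n):
    """Exact distribution of the mean of n i.i.d. draws from a finite pmf."""
    dist = {}
    for combo in product(pmf, repeat=n):      # every ordered sample
        prob = Fraction(1)
        for x in combo:
            prob *= pmf[x]                    # independence: multiply
        xbar = Fraction(sum(combo), n)
        dist[xbar] = dist.get(xbar, Fraction(0)) + prob
    return dict(sorted(dist.items()))


def mean_and_variance(dist):
    """Mean and variance of a finite distribution given as value -> prob."""
    m = sum(x * p for x, p in dist.items())
    v = sum((x - m) ** 2 * p for x, p in dist.items())
    return m, v


dist2 = sampling_dist_of_mean(pmf, 2)
for xbar, p in dist2.items():
    print(f"xbar = {float(xbar):>4}   p = {float(p):.2f}")

mu, var = mean_and_variance(pmf)
mu_bar, var_bar = mean_and_variance(dist2)
print(f"population:  mean = {mu}, variance = {var}")
print(f"sample mean: mean = {mu_bar}, variance = {var_bar}")
```

The mean of X̄ equals the population mean, and its variance is the population variance divided by n, which is the σ/√n standard error in squared form.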
21
More on CLT…
[Figure: population distribution, possibly log-normal…]
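More on CLT…
[Figure: histograms of sample means, each based on 1000 samples, drawn from a population with a heavily skewed tail.]
But remember the Cauchy and 1/x² distributions, both of which had nonexistent … For these, the CLT may not work!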
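The Cauchy warning can be seen in a simulation. The mean of n independent standard Cauchy values is itself standard Cauchy, so averaging does not narrow the distribution. The sketch below measures spread by the interquartile range (IQR), which is 2 for a standard Cauchy; the sample sizes, repetition count and seed are arbitrary choices.

```python
import math
import random
import statistics


def cauchy_sample_mean(rng, n):
    """Mean of n standard Cauchy draws, generated by the inverse-CDF method."""
    return statistics.fmean(
        math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)
    )


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


rng = random.Random(2)
iqr_by_n = {
    n: iqr([cauchy_sample_mean(rng, n) for _ in range(2000)])
    for n in (1, 10, 100)
}
for n, width in iqr_by_n.items():
    print(f"n = {n:>3}: IQR of sample means = {width:.2f}")
```

For a population with finite σ the IQR of the sample means would shrink like 1/√n; here it should stay near 2 for every n.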
22
Population Distribution of X
More on CLT…
X ~ Dist(μ, σ) is a Random Variable, where X = Age of women in U.S. at first birth.
If this first individual has been randomly chosen, and the value of X measured, then the result is a fixed number x1, with no random variability… and likewise for x2, x3, etc. This is DATA! BUT…
[Figure: density of the population distribution of X.]
23
Population Distribution of X
More…
X ~ Dist(μ, σ) is a Random Variable, where X = Age of women in U.S. at first birth.
If this first individual has been randomly chosen, and the value of X measured, then the result is a fixed number x1, with no random variability… and likewise for x2, x3, etc. This is DATA!
BUT… if this is not the case, then this first “value” of X is unknown, and thus can be considered as a random variable X1 itself… and likewise for X2, X3, etc.
The collection {X1, X2, X3, …, Xn} of “independent, identically-distributed” (i.i.d.) random variables is said to be a random sample.
24
Population Distribution of X
More…
X ~ Dist(μ, σ) is a Random Variable, where X = Age of women in U.S. at first birth. Take a random sample of size n.
CENTRAL LIMIT THEOREM: Sampling Distribution of X̄
Claim: […] for any n
Proof: […]
25
Population Distribution of X
More…
X ~ Dist(μ, σ) is a Random Variable, where X = Age of women in U.S. at first birth.
CENTRAL LIMIT THEOREM: Sampling Distribution of X̄
Claim: […] for any n
Proof: […]
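The claims on these two slides are stated “for any n”. A standard pair of results that fits is that X̄ has mean μ and standard deviation σ/√n for any n. The sketch below is the usual textbook derivation, assuming X1, …, Xn are i.i.d. with mean μ and standard deviation σ; it is offered as a likely reading, not necessarily the proof shown in the lecture.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i,
  \qquad X_1,\dots,X_n \ \text{i.i.d. with mean } \mu \text{ and standard deviation } \sigma \\[4pt]
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i]
  = \frac{1}{n}\,(n\mu) = \mu \\[4pt]
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{1}{n^{2}}\,(n\sigma^{2}) = \frac{\sigma^{2}}{n}
  \qquad \text{(the variance of a sum splits because of independence)} \\[4pt]
\operatorname{SD}(\bar{X}) &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}
\end{align*}
```

Neither step uses normality or a large n, which is why these two facts hold for any n; only the approximately normal shape of the sampling distribution needs n to be large.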
26
More on CLT… Recall: Normal Approximation to the Binomial Distribution
Suppose a certain outcome exists in a population, with constant probability π: P(Success) = π, P(Failure) = 1 – π.
We will select a random sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses).
Discrete random variable X = # Successes (0, 1, 2, …, n) in a random sample of size n.
Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function” p(x) = C(n, x) π^x (1 – π)^(n – x), x = 0, 1, 2, …, n, where C(n, x) = n!/(x!(n – x)!).
The Binomial distribution is discrete; the normal distribution used to approximate it is continuous.
27
Normal Approximation to the Binomial Distribution (CLT)
With X ~ Bin(n, π) as on the previous slide (X = # Successes in n independent Bernoulli trials), the CLT justifies approximating the discrete Binomial distribution by a continuous normal distribution. See Prob 5.3/7.
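The quality of the normal approximation can be checked against the exact Binomial probability function. In the sketch below the numbers n = 100, π = 0.5 and k = 55 are hypothetical, chosen only for illustration. The approximating normal has mean nπ and standard deviation √(nπ(1 – π)), and the half-unit continuity correction is a common refinement for moving from a discrete to a continuous distribution; the slides as transcribed do not say whether they use it.

```python
import math
from statistics import NormalDist


def binom_cdf(k, n, pi):
    """Exact P(X <= k) for X ~ Bin(n, pi), summing the probability function."""
    return sum(
        math.comb(n, x) * pi ** x * (1 - pi) ** (n - x) for x in range(k + 1)
    )


def normal_approx_cdf(k, n, pi):
    """Normal approximation to P(X <= k), with a continuity correction of 0.5."""
    mean = n * pi
    sd = math.sqrt(n * pi * (1 - pi))
    return NormalDist(mean, sd).cdf(k + 0.5)


# Hypothetical numbers for illustration only (not from the slides).
n, pi, k = 100, 0.5, 55
exact = binom_cdf(k, n, pi)
approx = normal_approx_cdf(k, n, pi)
print(f"exact P(X <= {k})       = {exact:.4f}")
print(f"normal approximation   = {approx:.4f}")
```

The approximation tends to be good when both nπ and n(1 – π) are reasonably large, and it degrades when π is close to 0 or 1 for a given n.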