Supplemental Lecture Notes

1 - Introduction
2 - Exploratory Data Analysis
3 - Probability Theory
4 - Classical Probability Distributions
5 - Sampling Distributions / Central Limit Theorem
6 - Statistical Inference
7 - Correlation and Regression
(8 - Survival Analysis)

Population Distribution of X
X = Age of women in U.S. at first birth. Suppose X ~ N(μ, σ), with μ = 25.4 and σ = 1.5.
Each individual age x1, x2, x3, x4, x5, … is a particular value of the random variable X. Most are in the neighborhood of μ, but there are occasional outliers in the tails of the distribution.

Now draw repeated random samples, each of size n = 400, from this population (X = Age of women in U.S. at first birth, μ = 25.4, σ = 1.5), and compute the sample mean x̄ of each sample. Each of these sample mean values is a “point estimate” of the population mean μ. How are these values distributed?

Sampling Distribution of X̄
Suppose X ~ N(μ, σ), then X̄ ~ N(μ, σ/√n) for any sample size n.
The mean of X̄ equals the population mean, μ = 25.4, and its standard deviation σ/√n is called the “standard error.” With σ = 1.5 and n = 400, the standard error is 1.5/√400 = 0.075, so the vast majority of sample means are extremely close to μ, i.e., they show extremely small variability. Each of these sample mean values is a “point estimate” of the population mean μ.
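The statement that sample means cluster tightly around μ can be checked by simulation. The sketch below is only an illustration, not part of the original slides: it uses the slide values μ = 25.4, σ = 1.5 and n = 400, while the number of repeated samples (2000) and the random seed are arbitrary assumptions.

```python
import math
import random
import statistics

MU, SIGMA, N = 25.4, 1.5, 400   # population mean, SD, and sample size from the slides
NUM_SAMPLES = 2000              # assumption: how many repeated samples to draw


def sample_mean(rng, n=N):
    """Draw one random sample of size n from N(MU, SIGMA) and return its mean."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))


def simulate(num_samples=NUM_SAMPLES, seed=0):
    """Return a list of sample means, one per repeated sample."""
    rng = random.Random(seed)   # fixed seed (assumption) for reproducibility
    return [sample_mean(rng) for _ in range(num_samples)]


means = simulate()
standard_error = SIGMA / math.sqrt(N)   # theoretical SD of the sample mean

print("mean of sample means:", statistics.fmean(means))
print("SD of sample means:  ", statistics.stdev(means))
print("sigma / sqrt(n):     ", standard_error)
```

The empirical standard deviation of the simulated means should land close to σ/√n, which is far smaller than the population σ.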

Sampling Distribution of X̄
The result X̄ ~ N(μ, σ/√n) holds for any sample size n when X ~ N(μ, σ). More generally, if X ~ anything with finite μ and σ, then X̄ ~ N(μ, σ/√n) approximately, for large sample size n.
(This slide uses μ = 25.4 and σ = 2.4 for X = Age of women in U.S. at first birth.) In either case σ/√n is the “standard error,” the vast majority of sample means are extremely close to μ, and each sample mean value is a “point estimate” of the population mean μ.


Example: X = Cost of new house ($K)
Probability that a single house selected at random costs less than $300K = ?
P(X < 300) = cumulative area under the density curve for X up to 300. Standardizing with the Z-score Z = (300 − μ)/σ gives P(X < 300) = 0.6554, the standard normal area below Z = 0.40.

Probability that the sample mean of n = 36 houses selected at random is less than $300K = ?
P(X̄ < 300) = cumulative area under the density curve for X̄ up to 300. The “standard error” is σ/√n = $12.5K, so the Z-score is Z = (300 − μ)/(σ/√n), and P(X̄ < 300) = 0.9918, the standard normal area below Z = 2.40. Compare this with 0.6554 for a single house.
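The two house-price probabilities can be reproduced numerically. The transcript does not state μ or σ for house cost, so the values below are assumptions inferred from the numbers that do appear: a standard error of 12.5 with n = 36 implies σ = 12.5 × 6 = 75, and 0.6554 is the standard normal area below Z = 0.40, which implies μ = 300 − 0.40 × 75 = 270. Treat this as a sketch under those inferred values.

```python
import math
from statistics import NormalDist

MU = 270.0      # assumption: inferred from 0.6554 = area below Z = 0.40
SIGMA = 75.0    # assumption: inferred from standard error 12.5 = SIGMA / sqrt(36)
N = 36          # sample size from the slides
CUTOFF = 300.0  # $300K, from the slides

single_house = NormalDist(MU, SIGMA)                 # distribution of X
sample_mean = NormalDist(MU, SIGMA / math.sqrt(N))   # distribution of the mean of 36 houses

z_single = (CUTOFF - MU) / SIGMA                     # Z-score for one house
z_mean = (CUTOFF - MU) / (SIGMA / math.sqrt(N))      # Z-score for the sample mean

p_single = single_house.cdf(CUTOFF)                  # P(X < 300)
p_mean = sample_mean.cdf(CUTOFF)                     # P(sample mean < 300)

print(f"single house: Z = {z_single:.2f}, P = {p_single:.4f}")
print(f"mean of {N}:   Z = {z_mean:.2f}, P = {p_mean:.4f}")
```

The same cutoff of 300 sits only 0.4 standard deviations above μ for a single house but 2.4 standard errors above μ for the mean of 36 houses, which is why the second probability is so much larger.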

If the population distribution of X has only a mild skew, then for large n the sample mean is still approximately normal: X̄ ~ N(μ, σ/√n), approximately, where σ/√n is the “standard error.”

~ CENTRAL LIMIT THEOREM ~
For any random variable X, continuous or discrete, with finite μ and σ: X̄ ~ N(μ, σ/√n) approximately, as n → ∞ (i.e., for large n). The quantity σ/√n is the “standard error.”

Example: X = Cost of new house ($K)
By the Central Limit Theorem, X may be continuous or discrete and need not be normal. Probability that the sample mean of n = 36 houses selected at random is less than $300K = ? It is the cumulative area under the density curve for X̄ up to 300, with “standard error” $12.5K, which is approximately 0.9918.

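Example: a discrete random variable X with probability function p(x)

x      p(x)
0      0.5
10     0.3
20     0.2

Sampling distribution of the sample mean x̄ (the probabilities below match samples of size n = 2)

x̄      p(x̄)
0      .25
5      .30
10     .29
15     .12
20     .04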
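The sampling distribution in this example can be derived by enumerating every possible sample. The sketch below assumes samples of size n = 2 drawn independently, and it takes x = 0 as the first value of X; both are inferences from the listed probabilities (for instance 0.5 × 0.5 = .25) rather than statements on the slide. The function name is hypothetical.

```python
from collections import defaultdict
from itertools import product

# Probability function of X; the value 0 is inferred, not stated on the slide.
pmf = {0: 0.5, 10: 0.3, 20: 0.2}


def sampling_distribution_of_mean(pmf, n):
    """Exact distribution of the mean of n independent draws from pmf."""
    dist = defaultdict(float)
    for outcome in product(pmf, repeat=n):   # every ordered sample (x1, ..., xn)
        prob = 1.0
        for x in outcome:                    # independence: multiply probabilities
            prob *= pmf[x]
        dist[sum(outcome) / n] += prob       # accumulate by value of the sample mean
    return dict(sorted(dist.items()))


dist2 = sampling_distribution_of_mean(pmf, 2)
for xbar, p in dist2.items():
    print(f"{xbar:5.1f}  {p:.2f}")
```

Calling the same function with a larger n (say 10) shows the distribution of the mean becoming more bell-shaped, which is the Central Limit Theorem at work on a discrete population.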

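More on CLT…
For a population with a heavily skewed tail (possibly log-normal…), the slide shows simulated sampling distributions of X̄, each based on 1000 samples. But remember the Cauchy and 1/x² distributions, both of which had nonexistent … For these, the CLT may not work!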
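The contrast between a skewed but well-behaved population and the Cauchy case can be explored with a small simulation. This is a rough sketch, not the simulation behind the slide: the log-normal parameters (0, 1), the sample sizes, and the seed are assumptions, and only the figure of 1000 samples per setting comes from the slide. Spread is measured by the interquartile range (IQR) because the Cauchy distribution has no finite standard deviation.

```python
import math
import random
import statistics


def lognormal_draw(rng):
    """One draw from a log-normal population (parameters 0 and 1 are assumptions)."""
    return rng.lognormvariate(0.0, 1.0)


def cauchy_draw(rng):
    """One draw from a standard Cauchy population via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_sample_means(draw, n, num_samples=1000, seed=0):
    """IQR of num_samples sample means, each computed from n independent draws."""
    rng = random.Random(seed)
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(num_samples)]
    q1, _, q3 = statistics.quantiles(means, n=4)   # the three quartile cut points
    return q3 - q1


results = {}
for n in (5, 50, 500):
    results[n] = (iqr_of_sample_means(lognormal_draw, n),
                  iqr_of_sample_means(cauchy_draw, n))
    print(f"n = {n:3d}  log-normal IQR = {results[n][0]:.3f}  Cauchy IQR = {results[n][1]:.3f}")
```

For the log-normal population the spread of the sample means shrinks as n grows. For the Cauchy population it does not, because the mean of n independent standard Cauchy variables is itself standard Cauchy.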

More on CLT…
Population Distribution of X: random variable X ~ Dist(μ, σ)
X = Age of women in U.S. at first birth
If the first individual has been randomly chosen, and the value of X measured, then the result is a fixed number x1, with no random variability, and likewise for x2, x3, etc. This is DATA!
BUT… if this is not the case, then this first “value” of X is unknown, and thus can be considered a random variable X1 itself, and likewise for X2, X3, etc. The collection {X1, X2, X3, …, Xn} of “independent, identically-distributed” (i.i.d.) random variables is said to be a random sample.

Sampling Distribution of X̄
Let {X1, X2, …, Xn} be a random sample of size n from the population X ~ Dist(μ, σ), where X = Age of women in U.S. at first birth, and let X̄ = (X1 + X2 + … + Xn)/n.
Claim: X̄ has mean μ and standard deviation σ/√n, for any n.
Proof: E[X̄] = (1/n)(E[X1] + … + E[Xn]) = (1/n)(nμ) = μ. Because the Xi are independent, Var(X̄) = (1/n²)(Var(X1) + … + Var(Xn)) = (1/n²)(nσ²) = σ²/n, so the standard deviation of X̄ is σ/√n.
The CENTRAL LIMIT THEOREM adds that the shape of this sampling distribution is approximately normal for large n.

Recall… More on CLT…
Normal Approximation to the Binomial Distribution
Suppose a certain outcome exists in a population, with constant probability π, so that P(Success) = π and P(Failure) = 1 – π. We randomly select a sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses).
The discrete random variable X = # Successes (0, 1, 2, …, n) in a random sample of size n is then said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function” p(x) = C(n, x) π^x (1 – π)^(n – x), x = 0, 1, 2, …, n, where C(n, x) = n!/(x!(n – x)!).
Since X is a sum of n i.i.d. Bernoulli outcomes, the CLT applies: for large n, this discrete distribution is approximated by a continuous one, X ~ N(nπ, √(nπ(1 – π))), approximately. See Prob 5.3/7.
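A quick numerical comparison shows how close the normal approximation can be. The values n = 100, π = 0.3 and the cutoff 35 are hypothetical, chosen only for illustration, and the function names are likewise made up. The sketch also applies a continuity correction (evaluating the normal curve at k + 0.5), a common refinement when a continuous curve approximates a discrete distribution; the slides do not say whether they use it.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, pi):
    """Exact P(X <= k) for X ~ Bin(n, pi), summing the probability function."""
    return sum(math.comb(n, x) * pi**x * (1 - pi) ** (n - x) for x in range(k + 1))


def normal_approx_cdf(k, n, pi):
    """Normal approximation to P(X <= k), with continuity correction (an addition)."""
    mean = n * pi
    sd = math.sqrt(n * pi * (1 - pi))
    return NormalDist(mean, sd).cdf(k + 0.5)


n, pi, k = 100, 0.3, 35   # hypothetical values, not from the slides
exact = binomial_cdf(k, n, pi)
approx = normal_approx_cdf(k, n, pi)

print(f"exact  P(X <= {k}) = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```

Rerunning with a small n or with π near 0 or 1 makes the two numbers drift apart, which is why the approximation is stated for large n.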
