Part A: Concepts & binomial distributions Part B: Normal distributions

Unit 4: Intro to probability (Biostat), 11/13/2018

Definitions
Random variable ≡ a numerical quantity that takes on different values depending on chance
Population ≡ the set of all possible values for a random variable
Event ≡ an outcome or set of outcomes for a random variable
Probability ≡ the proportion of times an event occurs in the population; the (long-run) expected proportion

Probability (definition #1)
The probability of an event is its relative frequency (proportion) in the population.
Example: Let A ≡ selecting a female at random from an HIV+ population.
There are 600 people in the population, of whom 159 are female.
Therefore, Pr(A) = 159 ÷ 600 = 0.265

Probability (definition #2)
The probability of an event is its expected proportion when the process is repeated again and again under the same conditions.
Example: Select 100 individuals at random; 24 are female.
Pr(A) ≈ 24 ÷ 100 = 0.24
This is only an estimate (unless n is very large).
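The difference between the population proportion (definition #1) and a sample-based estimate (definition #2) can be sketched in Python. This is a minimal illustration, not part of the original slides: the 0/1 coding of the population, the seed, and the variable names are assumptions made for the example.

```python
import random

# Hypothetical coding of the HIV+ population from the slides:
# 1 = female (159 people), 0 = not female (441 people); 600 in total.
population = [1] * 159 + [0] * 441

# Definition #1: relative frequency in the whole population.
pr_exact = sum(population) / len(population)
print(pr_exact)

# Definition #2: estimate from one random sample of n = 100.
# The seed is arbitrary and only makes the run repeatable.
rng = random.Random(4)
sample = rng.sample(population, 100)
pr_estimate = sum(sample) / len(sample)
print(pr_estimate)  # changes from sample to sample
```

Rerunning with different seeds shows the estimate scattering around the population value, which is why a single sample of 100 gives only an approximation.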

Probability (definition #3)
The probability of an event is a quantifiable level of belief between 0 and 1.
Probability   Verbal expression
0.00          Never
0.05          Seldom
0.20          Infrequent
0.50          As often as not
0.80          Very frequent
0.95          Highly likely
1.00          Always
Example: Prior experience suggests a quarter of the population is female. Therefore, Pr(A) ≈ 0.25

Some rules of probability

Types of random variables
Discrete random variables have a finite set of possible outcomes, e.g., the number of females in a sample of size n (0, 1, 2, …, n). We cover binomial random variables.
Continuous random variables have a continuum of possible outcomes, e.g., average body weight (lbs) in a sample (160, 160.5, …). We cover Normal random variables.
There are other random variable families, but only binomial and Normal RVs are covered for now.

Binomial distributions
Most popular type of discrete RV
Based on the Bernoulli trial ≡ a random event characterized by “success” or “failure”
Examples:
  Coin flip (heads or tails)
  Survival (yes or no)

Binomial random variables
Binomial random variable ≡ the random number of successes in n independent Bernoulli trials
A family of distributions identified by two parameters:
  n ≡ number of trials
  p ≡ probability of success for each trial
Notation: X~b(n, p)
  X ≡ random variable
  ~ ≡ “distributed as”
  b(n, p) ≡ binomial RV with parameters n and p

“Four patients” example
A treatment is successful 75% of the time.
We treat 4 patients.
X ≡ random number of successes, which varies: 0, 1, 2, 3, or 4, depending on the binomial distribution
X~b(4, 0.75)

Binomial formula
The probability of i successes is
$\Pr(X = i) = {}_nC_i \, p^i q^{n-i}$
where
  nCi = the binomial coefficient (next slide)
  p = probability of success for each trial
  q = probability of failure = 1 – p

Binomial coefficient (“choose function”)
$ {}_nC_i = \frac{n!}{i!\,(n-i)!} $
where ! ≡ the factorial function: x! = x × (x – 1) × (x – 2) × … × 1
Example: 4! = 4 × 3 × 2 × 1 = 24
By definition, 1! = 1 and 0! = 1
nCi ≡ the number of ways to choose i items out of n
Example: “4 choose 2”: $ {}_4C_2 = \frac{4!}{2!\,(4-2)!} = \frac{24}{2 \cdot 2} = 6 $
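As a quick check of the choose function, the factorial definition can be written directly in Python. This is a sketch for illustration; the helper name `choose` is hypothetical, and `math.comb` is the standard-library equivalent (Python 3.8+).

```python
from math import comb, factorial

def choose(n: int, i: int) -> int:
    """Number of ways to choose i items out of n, via the factorial formula."""
    return factorial(n) // (factorial(i) * factorial(n - i))

print(factorial(4))   # 4 x 3 x 2 x 1
print(choose(4, 2))   # "4 choose 2"
print(comb(4, 2))     # same quantity from the standard library
```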

“Four patients” example
n = 4 and p = 0.75 (so q = 1 – 0.75 = 0.25)
Question: What is the probability of 0 successes? ⇒ i = 0
$\Pr(X = 0) = {}_nC_i \, p^i q^{n-i} = {}_4C_0 \cdot 0.75^0 \cdot 0.25^{4-0} = 1 \cdot 1 \cdot 0.0039 = 0.0039$

X~b(4, 0.75), continued
$\Pr(X = 1) = {}_4C_1 \cdot 0.75^1 \cdot 0.25^{4-1} = 4 \cdot 0.75 \cdot 0.0156 = 0.0469$
$\Pr(X = 2) = {}_4C_2 \cdot 0.75^2 \cdot 0.25^{4-2} = 6 \cdot 0.5625 \cdot 0.0625 = 0.2109$
(Do not demonstrate all calculations. Students should prove to themselves that they can derive and interpret these values.)

X~b(4, 0.75), continued
$\Pr(X = 3) = {}_4C_3 \cdot 0.75^3 \cdot 0.25^{4-3} = 4 \cdot 0.4219 \cdot 0.25 = 0.4219$
$\Pr(X = 4) = {}_4C_4 \cdot 0.75^4 \cdot 0.25^{4-4} = 1 \cdot 0.3164 \cdot 1 = 0.3164$
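The hand calculations for the “four patients” example can be reproduced with a few lines of Python. This is a minimal sketch of the binomial formula; the function name `binom_pmf` is an assumption for illustration.

```python
from math import comb

def binom_pmf(i: int, n: int, p: float) -> float:
    """Pr(X = i) for X ~ b(n, p): nCi * p^i * q^(n - i), with q = 1 - p."""
    q = 1 - p
    return comb(n, i) * p**i * q**(n - i)

# "Four patients" example: n = 4, p = 0.75
pmf = [binom_pmf(i, 4, 0.75) for i in range(5)]
for i, prob in enumerate(pmf):
    print(i, round(prob, 4))
```

The five probabilities sum to 1, as they must for a complete probability distribution.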

The distribution X~b(4, 0.75)
Probability table for X~b(4, 0.75)
Successes   Probability
0           0.0039
1           0.0469
2           0.2109
3           0.4219
4           0.3164
(Figure: probability curve for X~b(4, 0.75))

Area under the curve (AUC) concept
The area under a probability curve (AUC) = probability! Get it?
Pr(X = 2) = .2109

Cumulative probability (left tail)
Cumulative probability = Pr(X ≤ i) = probability of a value less than or equal to i
Illustrative example: X~b(4, .75)
Pr(X ≤ 0) = Pr(X = 0) = .0039
Pr(X ≤ 1) = Pr(X ≤ 0) + Pr(X = 1) = .0039 + .0469 = .0508
Pr(X ≤ 2) = Pr(X ≤ 1) + Pr(X = 2) = .0508 + .2109 = .2617
Pr(X ≤ 3) = Pr(X ≤ 2) + Pr(X = 3) = .2617 + .4219 = .6836
Pr(X ≤ 4) = Pr(X ≤ 3) + Pr(X = 4) = .6836 + .3164 = 1.0000

X~b(4, 0.75)
i   Probability function Pr(X = i)   Cumulative probability Pr(X ≤ i)
0   0.0039                           0.0039
1   0.0469                           0.0508
2   0.2109                           0.2617
3   0.4219                           0.6836
4   0.3164                           1.0000
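Cumulative probabilities are running sums of the probability function, which can be sketched with `itertools.accumulate`. The variable names below are assumptions for illustration.

```python
from itertools import accumulate
from math import comb

n, p = 4, 0.75
# Probability function Pr(X = i) for i = 0..n
pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
# Running sum gives the cumulative probability Pr(X <= i)
cdf = list(accumulate(pmf))

for i in range(n + 1):
    print(i, round(pmf[i], 4), round(cdf[i], 4))
```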

Cumulative probability
Left tail = cumulative probability
Area under the shaded bars in the left tail sums to 0.2617, i.e., Pr(X ≤ 2) = 0.2617
Area under the “curve” = probability
Bring it on!

Reasoning
Use the probability model to reason about chance.
I hypothesize p = 0.75, but observe only 2 successes in 4 patients. Should I doubt my hypothesis?
ANS: No. When p = 0.75, you’ll see 2 or fewer successes about 26% of the time (Pr(X ≤ 2) = 0.2617), which is not that unusual.
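The same reasoning can be checked by simulation, treating probability as a long-run proportion. The sketch below repeats the four-patient experiment many times under the hypothesis p = 0.75 and counts how often 2 or fewer successes occur; the function name, seed, and number of repetitions are assumptions chosen for illustration.

```python
import random

def simulate_left_tail(n=4, p=0.75, cutoff=2, reps=100_000, seed=0):
    """Estimate Pr(X <= cutoff) for X ~ b(n, p) by repeated simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # One replication: n independent Bernoulli trials, success with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        if successes <= cutoff:
            hits += 1
    return hits / reps

estimate = simulate_left_tail()
print(estimate)  # a simulation estimate of Pr(X <= 2); the exact value is 0.2617
```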

StaTable probability calculator
Link on course homepage
Three versions: Java (browser), Windows, Palm
Reports probability and cumulative probability

Intro to Probability, Part B
The Normal distributions

The Normal distributions
Most popular continuous model
Recognized by de Moivre (1667–1754)
Extended by Laplace (1749–1827)

Probability density function (curve)
Example: vocabulary scores of 947 seventh graders
The smooth curve drawn over the histogram is a model of the actual distribution
The mathematical model is the Normal probability density function (pdf)

Area under curve
The area under the curve (AUC) concept applies
The shaded bars (left tail) represent scores ≤ 6.0, which account for 30.3% of scores
Pr(X ≤ 6) = 0.303

Areas under curve (cont.)
Now translate this to the area under the curve (AUC)
The scale of the Y-axis is adjusted so the total AUC = 1
The AUC to the left of 6.0 (shaded) = 0.293
Therefore, the AUC “models” the proportion of area in the bars of the histogram, i.e., the probabilities of the associated ranges

Density Curves

Normal distributions
Normal distributions = a family of distributions with common characteristics
Normal distributions have two parameters:
  Mean μ locates the center of the curve
  Standard deviation σ quantifies spread (at points of inflection)
Arrows indicate points of inflection

68-95-99.7 rule for Normal RVs
68% of the AUC falls within 1 standard deviation of the mean (μ ± σ)
95% falls within 2 standard deviations (μ ± 2σ)
99.7% falls within 3 standard deviations (μ ± 3σ)

Illustrative example: WAIS
Wechsler Adult Intelligence Scale (WAIS) scores vary according to a Normal distribution with μ = 100 and σ = 15

Another example (male height)
Adult male height is approximately Normal with μ = 70.0 inches and σ = 2.8 inches (NHANES, 1980)
Shorthand: X ~ N(70, 2.8)
Therefore:
  68% of heights fall within μ ± σ = 70.0 ± 2.8 = 67.2 to 72.8
  95% of heights fall within μ ± 2σ = 70.0 ± 2(2.8) = 64.4 to 75.6
  99.7% of heights fall within μ ± 3σ = 70.0 ± 3(2.8) = 61.6 to 78.4
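The 68-95-99.7 rule is a rounded summary of exact Normal areas. A minimal sketch using Python's standard-library `statistics.NormalDist` (Python 3.8+) computes the intervals and their areas for the height example; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

height = NormalDist(mu=70.0, sigma=2.8)  # X ~ N(70, 2.8) from the slides

aucs = {}
for k in (1, 2, 3):
    lo = height.mean - k * height.stdev
    hi = height.mean + k * height.stdev
    # Area under the curve between mu - k*sigma and mu + k*sigma
    aucs[k] = height.cdf(hi) - height.cdf(lo)
    print(f"within {k} sd: {lo:.1f} to {hi:.1f}, AUC = {aucs[k]:.4f}")
```

The computed areas are slightly different from 68%, 95%, and 99.7% because the rule rounds them for easy recall.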

Another example (male height)
What proportion of men are less than 72.8 inches tall? (Note: 72.8 is one σ above μ)
By the 68-95-99.7 rule, 68% of heights fall within μ ± σ, leaving 16% in each tail.
Therefore, 68% + 16% = 84% of men are less than 72.8 inches tall.

Male Height Example
What proportion of men are less than 68 inches tall?
68 does not fall on a ±σ marker. To determine the AUC, we must first standardize the value.

Standardized value = z score
To standardize a value, simply subtract μ and divide by σ:
$z = \frac{x - \mu}{\sigma}$
This is now a z-score
The z-score tells you the number of standard deviations the value falls from μ

Example: Standardize a male height of 68”
Recall X ~ N(70, 2.8)
$z = \frac{x - \mu}{\sigma} = \frac{68 - 70}{2.8} = -0.71$
Therefore, the value 68 is 0.71 standard deviations below the mean of the distribution
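Standardizing is a one-line calculation. The sketch below applies it to the 68-inch example; the function name `z_score` is hypothetical.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from mu (negative = below the mean)."""
    return (x - mu) / sigma

z = z_score(68, mu=70, sigma=2.8)
print(round(z, 2))
```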

Men’s Height (NHANES, 1980)
What proportion of men are less than 68 inches tall? = What proportion of a Standard Normal z curve is less than –0.71?
You can now look up the AUC in a Standard Normal “Z” table.

Using the Standard Normal table
z      .00     .01     .02
−0.8   .2119   .2090   .2061
−0.7   .2420   .2389   .2358
−0.6   .2743   .2709   .2676
Pr(Z ≤ −0.71) = .2389
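Table entries like these can be reproduced from the error function, since the Standard Normal cumulative probability is Pr(Z ≤ z) = ½[1 + erf(z/√2)]. The sketch below is an illustration only; the function name `std_normal_cdf` is an assumption.

```python
from math import erf, sqrt

def std_normal_cdf(z: float) -> float:
    """Pr(Z <= z) for a Standard Normal Z, computed via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Reproduce a few entries from the table, including the one used in the example.
for z in (-0.80, -0.71, -0.60):
    print(z, round(std_normal_cdf(z), 4))
```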

Summary (finding Normal probabilities)
Draw curve w/ landmarks
Shade area
Standardize value(s)
Use Z table to find appropriate AUC
In the height example, the shaded AUC = .2389

Right “tail”
What proportion of men are greater than 68” tall?
Greater than ⇒ look at right “tail”
Area in right tail = 1 – (area in left tail) = 1 – .2389 = .7611
Therefore, 76.11% of men are greater than 68 inches tall.
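The complement step can be sketched with the standard library's `statistics.NormalDist` (Python 3.8+), using the rounded z-score of –0.71 from the example. Variable names are assumptions for illustration.

```python
from statistics import NormalDist

Z = NormalDist()         # Standard Normal: mu = 0, sigma = 1
left = Z.cdf(-0.71)      # Pr(Z <= -0.71), the left-tail area
right = 1 - left         # Pr(Z > -0.71), the right-tail area
print(round(left, 4), round(right, 4))
```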

Z percentiles
zp ≡ the z score with cumulative probability p
What is the 50th percentile on Z? ANS: z.5 = 0
What is the 2.5th percentile on Z? ANS: z.025 ≈ –2
What is the 97.5th percentile on Z? ANS: z.975 ≈ 2

Finding Z percentile in the table
Look up the closest entry in the table
Find the corresponding z score
e.g., What is the 1st percentile on Z? The closest cumulative proportion is .0099, so z.01 = −2.33
z      .02     .03     .04
−2.3   .0102   .0099   .0096
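Instead of searching the table for the closest proportion, a percentile can be found by inverting the cumulative probability. The sketch below uses `statistics.NormalDist.inv_cdf` (Python 3.8+); the dictionary name is an assumption for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # Standard Normal

# z_p for several cumulative probabilities p
percentiles = {p: Z.inv_cdf(p) for p in (0.01, 0.025, 0.50, 0.975)}
for p, z in percentiles.items():
    print(p, round(z, 2))
```

The 2.5th and 97.5th percentiles come out near ∓1.96, which the 68-95-99.7 rule rounds to ∓2.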

Unstandardizing a value
How tall must a man be to place in the lower 10% for men aged 18 to 24?

Table A: Standard Normal Table
Use Table A
Look up the closest proportion in the table
Find the corresponding standardized score
Solve for X (“un-standardize” the score)

Table A: Standard Normal Proportion
z      .07     .08     .09
−1.3   .0853   .0838   .0823
−1.2   .1020   .1003   .0985
−1.1   .1210   .1190   .1170
Pr(Z < −1.28) = .1003

Men’s Height Example (NHANES, 1980)
How tall must a man be to place in the lower 10% for men aged 18 to 24?
The standardized value with cumulative probability closest to .10 is z = −1.28

Observed Value for a Standardized Score
“Unstandardize” the z-score to find the associated x:
$x = \mu + z\sigma$

Observed Value for a Standardized Score
x = μ + zσ = 70 + (−1.28)(2.8) = 70 + (−3.58) = 66.42
A man would have to be approximately 66.42 inches tall or less to place in the lower 10% of the population
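The un-standardizing step can be sketched two ways in Python: with the table value z = −1.28, as on the slide, and by inverting the cumulative probability directly with `statistics.NormalDist.inv_cdf` (Python 3.8+). The variable names are assumptions for illustration.

```python
from statistics import NormalDist

mu, sigma = 70.0, 2.8

# Table-based route from the slides: the closest table entry gives z = -1.28.
x_table = mu + (-1.28) * sigma

# Direct route: invert the cumulative probability without rounding z first.
x_direct = NormalDist(mu, sigma).inv_cdf(0.10)

print(round(x_table, 2), round(x_direct, 2))
```

The two answers differ only slightly, because the table rounds z to two decimal places.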
