Part A: Concepts & binomial distributions Part B: Normal distributions

Presentation transcript:

4: Probability
Part A: Concepts & binomial distributions
Part B: Normal distributions
11/13/2018 Unit 4: Intro to probability Biostat

Definitions
Random variable: a numerical quantity that takes on different values depending on chance
Population: the set of all possible values for a random variable
Event: an outcome or set of outcomes for a random variable
Probability: the proportion of times an event occurs in the population; the (long-run) expected proportion

Probability (definition #1)
The probability of an event is its relative frequency (proportion) in the population.
Example: Let A ≡ selecting a female at random from an HIV+ population. There are 600 people in the population, 159 of them female. Therefore, Pr(A) = 159 ÷ 600 = 0.265.

Probability (definition #2)
The probability of an event is its expected proportion when the process is repeated again and again under the same conditions.
Example: Select 100 individuals at random; 24 are female. Pr(A) ≈ 24 ÷ 100 = 0.24.
This is only an estimate (unless n is very, very big).
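
The contrast between definitions #1 and #2 can be sketched with a small simulation. This is only an illustration: the list `population`, the helper `estimate_pr_female`, and the seed are assumptions made for the example, not part of the original slides.

```python
import random

# Population from the slide: 600 people, 159 of them female.
population = ["F"] * 159 + ["M"] * 441

# Definition #1: relative frequency in the population.
true_pr = population.count("F") / len(population)  # 159 / 600 = 0.265

def estimate_pr_female(n, seed=0):
    """Definition #2: estimate Pr(A) from n random draws (with replacement)."""
    rng = random.Random(seed)  # arbitrary seed, used only for repeatability
    draws = rng.choices(population, k=n)
    return draws.count("F") / n

# The estimate settles toward 0.265 as n grows.
for n in (100, 10_000, 100_000):
    print(n, round(estimate_pr_female(n), 4))
```

With a small n the estimate can land some distance from 0.265 (the slide's sample gave 0.24); with a large n it settles close to the population proportion.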

Probability (definition #3)
The probability of an event is a quantifiable level of belief between 0 and 1.
Probability   Verbal expression
0.00          Never
0.05          Seldom
0.20          Infrequent
0.50          As often as not
0.80          Very frequent
0.95          Highly likely
1.00          Always
Example: Prior experience suggests a quarter of the population is female. Therefore, Pr(A) ≈ 0.25.

Some rules of probability

Types of random variables
Discrete random variables have a finite (or countable) set of possible outcomes, e.g., the number of females in a sample of size n (0, 1, 2, …, n). We cover binomial random variables.
Continuous random variables have a continuum of possible outcomes, e.g., average body weight (lbs) in a sample (160, 160.5, 160.75, 160.825, …). We cover Normal random variables.
There are other random variable families, but only binomial and Normal RVs are covered for now.

Binomial distributions
The most popular type of discrete RV
Based on the Bernoulli trial: a random event characterized by “success” or “failure”
Examples: coin flip (heads or tails), survival (yes or no)

Binomial random variables
Binomial random variable: the random number of successes in n independent Bernoulli trials
A family of distributions identified by two parameters:
n = number of trials
p = probability of success for each trial
Notation: X~b(n, p)
X = random variable
~ = “distributed as”
b(n, p) = binomial RV with parameters n and p

“Four patients” example
A treatment is successful 75% of the time. We treat 4 patients.
X ≡ the random number of successes, which takes the value 0, 1, 2, 3, or 4 according to the binomial distribution X~b(4, 0.75)

Binomial formula
The probability of i successes is
$\Pr(X = i) = {}_nC_i \, p^i q^{n-i}$
where
nCi = the binomial coefficient (next slide)
p = probability of success for each trial
q = probability of failure = 1 – p

Binomial coefficient (“choose function”)
$ {}_nC_i = \frac{n!}{i!\,(n-i)!} $
where ! ≡ the factorial function: x! = x × (x – 1) × (x – 2) × … × 1
Example: 4! = 4 × 3 × 2 × 1 = 24
By definition, 1! = 1 and 0! = 1
nCi ≡ the number of ways to choose i items out of n
Example: “4 choose 2”: $ {}_4C_2 = \frac{4!}{2!\,2!} = \frac{24}{2 \cdot 2} = 6 $
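
As a quick check of the choose function, here is a minimal Python sketch. The helper name `choose` is made up for this example; `math.factorial` and `math.comb` are standard-library functions.

```python
import math

def choose(n, i):
    """Binomial coefficient from the factorial definition: n! / (i! (n - i)!)."""
    return math.factorial(n) // (math.factorial(i) * math.factorial(n - i))

# "4 choose 2" from the slide; math.comb gives the same value directly.
print(choose(4, 2), math.comb(4, 2))  # 6 6

# The coefficients needed for the "four patients" example.
print([choose(4, i) for i in range(5)])  # [1, 4, 6, 4, 1]
```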

“Four patients” example
n = 4 and p = 0.75 (so q = 1 – 0.75 = 0.25)
Question: What is the probability of 0 successes? → i = 0
$\Pr(X = 0) = {}_nC_i \, p^i q^{n-i} = {}_4C_0 \cdot 0.75^0 \cdot 0.25^{4-0} = 1 \cdot 1 \cdot 0.0039 = 0.0039$

X~b(4, 0.75), continued
$\Pr(X = 1) = {}_4C_1 \cdot 0.75^1 \cdot 0.25^{4-1} = 4 \cdot 0.75 \cdot 0.0156 = 0.0469$
$\Pr(X = 2) = {}_4C_2 \cdot 0.75^2 \cdot 0.25^{4-2} = 6 \cdot 0.5625 \cdot 0.0625 = 0.2109$
(Do not demonstrate all calculations. Students should prove to themselves that they can derive and interpret these values.)

X~b(4, 0.75), continued
$\Pr(X = 3) = {}_4C_3 \cdot 0.75^3 \cdot 0.25^{4-3} = 4 \cdot 0.4219 \cdot 0.25 = 0.4219$
$\Pr(X = 4) = {}_4C_4 \cdot 0.75^4 \cdot 0.25^{4-4} = 1 \cdot 0.3164 \cdot 1 = 0.3164$
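
The five hand calculations for X~b(4, 0.75) can be reproduced with a short sketch of the binomial formula. The function name `binom_pmf` is an assumption for this example, not something from the slides.

```python
import math

def binom_pmf(i, n, p):
    """Pr(X = i) for X ~ b(n, p): nCi * p^i * q^(n - i), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, i) * p**i * q**(n - i)

# "Four patients" example: n = 4 trials, p = 0.75 success probability.
pmf = [binom_pmf(i, 4, 0.75) for i in range(5)]
for i, pr in enumerate(pmf):
    print(f"Pr(X = {i}) = {pr:.4f}")
```

Rounded to four decimals, these should agree with the slide values 0.0039, 0.0469, 0.2109, 0.4219, and 0.3164, and the five probabilities sum to 1.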

The distribution X~b(4, 0.75)
Probability table for X~b(4, 0.75):
Successes   Probability
0           0.0039
1           0.0469
2           0.2109
3           0.4219
4           0.3164
[Figure: probability curve for X~b(4, 0.75)]

Area under the curve (AUC) concept
The area under a probability curve (AUC) = probability! Get it?
Example: Pr(X = 2) = .2109

Cumulative probability (left tail)
Cumulative probability = Pr(X ≤ i) = probability of a value less than or equal to i
Illustrative example: X~b(4, 0.75)
Pr(X ≤ 0) = Pr(X = 0) = 0.0039
Pr(X ≤ 1) = Pr(X ≤ 0) + Pr(X = 1) = 0.0039 + 0.0469 = 0.0508
Pr(X ≤ 2) = Pr(X ≤ 1) + Pr(X = 2) = 0.0508 + 0.2109 = 0.2617
Pr(X ≤ 3) = Pr(X ≤ 2) + Pr(X = 3) = 0.2617 + 0.4219 = 0.6836
Pr(X ≤ 4) = Pr(X ≤ 3) + Pr(X = 4) = 0.6836 + 0.3164 = 1.0000

X~b(4, 0.75)
i   Probability function Pr(X = i)   Cumulative probability Pr(X ≤ i)
0   0.0039                           0.0039
1   0.0469                           0.0508
2   0.2109                           0.2617
3   0.4219                           0.6836
4   0.3164                           1.0000
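
The cumulative column is just a running sum of the probability function. A minimal sketch, assuming the variable names `pmf` and `cdf` (chosen here for illustration) and using only the standard library:

```python
import math
from itertools import accumulate

n, p = 4, 0.75

# Probability function Pr(X = i) for i = 0..n.
pmf = [math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

# Cumulative probability Pr(X <= i): running total of the pmf.
cdf = list(accumulate(pmf))

for i in range(n + 1):
    print(f"{i}  {pmf[i]:.4f}  {cdf[i]:.4f}")
```

The third row gives Pr(X ≤ 2) ≈ 0.2617, the left-tail value used in the slides that follow.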

Cumulative probability
Left tail = cumulative probability
The area under the shaded bars in the left tail sums to 0.2617, i.e., Pr(X ≤ 2) = 0.2617
Area under “curve” = probability. Bring it on!

Reasoning
Use the probability model to reason about chance.
I hypothesize p = 0.75 but observe only 2 successes in 4 trials. Should I doubt my hypothesis?
ANS: No. When p = 0.75, you’ll see 2 or fewer successes about 26% of the time (Pr(X ≤ 2) = 0.2617), which is not that unusual.

StaTable probability calculator
Link on course homepage
Three versions: Java (browser), Windows, Palm
Reports both probability and cumulative probability

Intro to Probability, Part B
The Normal distributions

The Normal distributions
Most popular continuous model
Recognized by de Moivre (1667–1754)
Extended by Laplace (1749–1827)

Probability density function (curve)
Example: vocabulary scores of 947 seventh graders
The smooth curve drawn over the histogram is a model of the actual distribution
The mathematical model is the Normal probability density function (pdf)

Area under curve
The area under the curve (AUC) concept applies
The shaded bars (left tail) represent scores ≤ 6.0, which make up 30.3% of scores
Pr(X ≤ 6) = 0.303

Areas under curve (cont.)
Now translate this to the area under the curve (AUC)
The scale of the Y-axis is adjusted so the total AUC = 1
The AUC to the left of 6.0 (shaded) = 0.293
Therefore, the AUC “models” the proportional area in the bars of the histogram, i.e., the probabilities of the associated ranges

Density Curves

Normal distributions
Normal distributions = a family of distributions with common characteristics
Normal distributions have two parameters:
Mean µ locates the center of the curve
Standard deviation σ quantifies spread (at the points of inflection)
[Figure: Normal curve; arrows indicate points of inflection]

68-95-99.7 rule for Normal RVs
68% of the AUC falls within 1 standard deviation of the mean (µ ± σ)
95% falls within 2 standard deviations (µ ± 2σ)
99.7% falls within 3 standard deviations (µ ± 3σ)

Illustrative example: WAIS
Wechsler Adult Intelligence Scale (WAIS) scores vary according to a Normal distribution with μ = 100 and σ = 15

Another example (male height)
Adult male height is approximately Normal with µ = 70.0 inches and σ = 2.8 inches (NHANES, 1980)
Shorthand: X ~ N(70, 2.8)
Therefore:
68% of heights fall within µ ± σ = 70.0 ± 2.8 = 67.2 to 72.8
95% of heights fall within µ ± 2σ = 70.0 ± 2(2.8) = 64.4 to 75.6
99.7% of heights fall within µ ± 3σ = 70.0 ± 3(2.8) = 61.6 to 78.4
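
The 68-95-99.7 rule can be checked numerically for the male height model. This is a sketch using Python's standard-library `statistics.NormalDist`; the helper `within_k_sd` is a name invented for the example.

```python
from statistics import NormalDist

# X ~ N(70, 2.8): adult male height in inches, as given on the slide.
heights = NormalDist(mu=70.0, sigma=2.8)

def within_k_sd(dist, k):
    """Return (low, high, AUC) for the interval mean ± k standard deviations."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return lo, hi, dist.cdf(hi) - dist.cdf(lo)

for k in (1, 2, 3):
    lo, hi, area = within_k_sd(heights, k)
    print(f"within {k} SD: {lo:.1f} to {hi:.1f} inches, AUC = {area:.4f}")
```

The computed areas are close to 0.68, 0.95, and 0.997, which is why the rule is a convenient approximation rather than an exact statement.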

Another example (male height)
What proportion of men are less than 72.8 inches tall? (Note: 72.8 is one σ above μ)
By the 68-95-99.7 rule, 68% of heights lie between −1σ and +1σ, leaving 16% in each tail
Therefore, 16% + 68% = 84% of men are less than 72.8 inches tall

Male Height Example
What proportion of men are less than 68 inches tall?
68 does not fall on a ±σ marker. To determine the AUC, we must first standardize the value.

Standardized value = z score
To standardize a value, subtract μ and divide by σ:
$z = \frac{x - \mu}{\sigma}$
The result is a z-score
The z-score tells you the number of standard deviations the value falls from μ

Example: Standardize a male height of 68”
Recall X ~ N(70, 2.8)
$z = \frac{x - \mu}{\sigma} = \frac{68 - 70}{2.8} = -0.71$
Therefore, the value 68 is 0.71 standard deviations below the mean of the distribution
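
Standardizing is a one-line computation. A minimal sketch, with the function name `z_score` chosen for illustration:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Male height example: X ~ N(70, 2.8), observed value 68 inches.
z = z_score(68, 70, 2.8)
print(round(z, 2))  # -0.71, i.e., 0.71 standard deviations below the mean
```

A negative z-score means the value lies below the mean; a height of 72.8 inches would give z = 1 (up to floating-point rounding).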

Men’s Height (NHANES, 1980)
What proportion of men are less than 68 inches tall?
= What proportion of a Standard Normal (z) curve is less than –0.71?
The height value 68 corresponds to the standardized value –0.71; the mean, 70, corresponds to 0
You can now look up the AUC in a Standard Normal “Z” table.

Using the Standard Normal table
z       .00     .01     .02
−0.8    .2119   .2090   .2061
−0.7    .2420   .2389   .2358
−0.6    .2743   .2709   .2676
Pr(Z ≤ −0.71) = .2389

Summary (finding Normal probabilities)
Draw curve w/ landmarks
Shade area
Standardize value(s)
Use Z table to find appropriate AUC
Example: height 68 → z = −0.71 → AUC to the left = .2389

Right “tail”
What proportion of men are greater than 68” tall?
Greater than → look at the right “tail”
Area in right tail = 1 – (area in left tail) = 1 – .2389 = .7611
Therefore, 76.11% of men are greater than 68 inches tall.
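
The table lookup and the right-tail complement can be reproduced in software instead of with a printed Z table. The sketch below uses the standard-library `statistics.NormalDist`; the variable names are assumptions made for the example.

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=2.8)   # X ~ N(70, 2.8) from the slides
z_table = round((68 - 70) / 2.8, 2)      # -0.71, rounded as for a Z table lookup

left_tail = NormalDist().cdf(z_table)    # Pr(Z <= -0.71)
right_tail = 1 - left_tail               # Pr(Z > -0.71)
print(f"left tail  = {left_tail:.4f}")
print(f"right tail = {right_tail:.4f}")

# Skipping the rounding of z changes the answer slightly in the third decimal.
left_exact = heights.cdf(68)
print(f"left tail without rounding z = {left_exact:.4f}")
```

The small gap between the two left-tail values comes only from rounding z to two decimals for the table; either answer is acceptable at the precision used in these slides.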

Z percentiles
$z_p$ ≡ the z score with cumulative probability p
What is the 50th percentile on Z? ANS: $z_{.5} = 0$
What is the 2.5th percentile on Z? ANS: $z_{.025} \approx -2$
What is the 97.5th percentile on Z? ANS: $z_{.975} \approx 2$

Finding Z percentile in the table
Look up the closest entry in the table
Find the corresponding z score
e.g., What is the 1st percentile on Z? The closest cumulative proportion is .0099, so $z_{.01} = -2.33$
z       .02     .03     .04
−2.3    .0102   .0099   .0096
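
Percentiles on Z can also be computed directly with the inverse cumulative distribution function. A minimal sketch with the standard-library `statistics.NormalDist`; the helper name `z_percentile` is an assumption for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # Standard Normal: mean 0, standard deviation 1

def z_percentile(p):
    """z score with cumulative probability p, i.e., z_p."""
    return Z.inv_cdf(p)

for p in (0.50, 0.025, 0.975, 0.01):
    print(f"z_{p} = {z_percentile(p):.2f}")
```

The 2.5th and 97.5th percentiles come out near −1.96 and 1.96, which the 68-95-99.7 rule rounds to ±2, and the 1st percentile comes out near −2.33, matching the table lookup.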

Unstandardizing a value
How tall must a man be to place in the lower 10% for men aged 18 to 24?
[Figure: Normal curve of height values centered at 70, with the lower 10% (.10) shaded and the cutoff height marked “?”]

Table A: Standard Normal Table
Use Table A
Look up the closest proportion in the table
Find the corresponding standardized score
Solve for X (“un-standardize” the score)

Table A: Standard Normal Proportion
z       .07     .08     .09
−1.3    .0853   .0838   .0823
−1.2    .1020   .1003   .0985
−1.1    .1210   .1190   .1170
Pr(Z < −1.28) = .1003

Men’s Height Example (NHANES, 1980)
How tall must a man be to place in the lower 10% for men aged 18 to 24?
The lower 10% (.10) of height values lies below the standardized value z = −1.28; the mean, 70, corresponds to z = 0

Observed Value for a Standardized Score
“Unstandardize” the z-score to find the associated x:
$x = \mu + z\sigma$

Observed Value for a Standardized Score
x = μ + zσ = 70 + (−1.28)(2.8) = 70 + (−3.58) = 66.42
A man would have to be approximately 66.42 inches tall or less to place in the lower 10% of the population
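
The un-standardizing step can be sketched the same way. The variable names below are assumptions for the example; `NormalDist.inv_cdf` stands in for the Table A lookup.

```python
from statistics import NormalDist

mu, sigma = 70, 2.8  # X ~ N(70, 2.8) from the slides

# Using the table value z = -1.28 for the 10th percentile.
z_table = -1.28
x_table = mu + z_table * sigma
print(f"x from table z: {x_table:.2f} inches")

# Using the inverse CDF instead of a table (z is about -1.2816).
z_exact = NormalDist().inv_cdf(0.10)
x_exact = mu + z_exact * sigma
print(f"x from inv_cdf: {x_exact:.2f} inches")
```

Both routes give a cutoff of roughly 66.4 inches; the tiny difference reflects the two-decimal rounding of z in the printed table.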