Supplemental Lecture Notes


Supplemental Lecture Notes
1 - Introduction
2 - Exploratory Data Analysis
3 - Probability Theory
4 - Classical Probability Distributions
5 - Sampling Distributions / Central Limit Theorem
6 - Statistical Inference
7 - Correlation and Regression
(8 - Survival Analysis)

What is the connection between probability and random variables? Events (and their corresponding probabilities) that involve experimental measurements can be described by random variables.

Discrete random variable X. Example: X = Cholesterol level (mg/dL).

POPULATION probability table:
Pop values xi    Probabilities p(xi)
x1               p(x1)
x2               p(x2)
x3               p(x3)
⋮                ⋮
Total            1

SAMPLE of size n: x1 x2 x3 x4 x5 x6 …etc…. xn, with relative-frequency table:
Data values xi   Relative frequencies p(xi) = fi / n
x1               p(x1)
x2               p(x2)
x3               p(x3)
⋮                ⋮
xk               p(xk)
Total            1

Discrete random variable X. Example: X = Cholesterol level (mg/dL). POPULATION probability table: pop values x1, x2, x3, … with probabilities p(x1), p(x2), p(x3), …, Total 1. Probability histogram (“Density”): Total Area = 1. Here p(x) = probability that the random variable X is equal to a specific value x, i.e., p(x) = P(X = x), the “probability mass function” (pmf).

Consider the following discrete random variable… Example: X = “value shown on a single random toss of a fair die (1, 2, 3, 4, 5, 6)”. X is said to be uniformly distributed over the values 1, 2, 3, 4, 5, 6. Probability table: x = 1, 2, 3, 4, 5, 6, each with p(x) = 1/6. Probability histogram of P(X = x): Total Area = 1. “What is the probability of rolling a 4?” Answer: p(4) = 1/6.

Discrete random variable X. Example: X = Cholesterol level (mg/dL). POPULATION probability table: pop values x1, x2, x3, … with probabilities p(x1), p(x2), p(x3), …, Total 1. Probability histogram: Total Area = 1. Here F(x) = probability that the random variable X is less than or equal to a specific value x, i.e., F(x) = P(X ≤ x), the “cumulative distribution function” (cdf).

Motivation ~ Consider the following discrete random variable… Example: X = “value shown on a single random toss of a fair die (1, 2, 3, 4, 5, 6)”. X is uniformly distributed over the values 1, 2, 3, 4, 5, 6. Cumulative distribution:

x    p(x) = P(X = x)    F(x) = P(X ≤ x)
1    1/6                1/6
2    1/6                2/6
3    1/6                3/6
4    1/6                4/6
5    1/6                5/6
6    1/6                1

The graph of F(x) is a “staircase graph” from 0 to 1.
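As a quick numerical check of this table, here is a small R sketch (the variable names are my own) that tabulates the pmf and cdf of the fair die and draws the staircase graph:

x <- 1:6
p <- rep(1/6, 6)      # pmf: p(x) = P(X = x) = 1/6 for each face
F <- cumsum(p)        # cdf: F(x) = P(X <= x) = 1/6, 2/6, ..., 1
cbind(x, p, F)        # probability table with cumulative column
plot(stepfun(x, c(0, F)), verticals = FALSE,
     main = "cdf of a fair die", xlab = "x", ylab = "F(x)")   # staircase graph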

Calculating “interval probabilities”… Discrete random variable X. Example: X = Cholesterol level (mg/dL). POPULATION table: pop values x1, x2, x3, … with pmf p(x) and cdf F(x) = P(X ≤ x), where F(x1) = p(x1), F(x2) = p(x1) + p(x2), F(x3) = p(x1) + p(x2) + p(x3), etc.; the cdf increases from 0 to 1. For an interval from a to b, write a– for the value just below a. Then F(b) – F(a–) = P(X ≤ b) – P(X ≤ a–) = P(a ≤ X ≤ b). This is the FUNDAMENTAL THEOREM OF CALCULUS (discrete form).

Hey!!! What about the population mean μ and the population variance σ² ???

Discrete random variable X. Example: X = Cholesterol level (mg/dL). POPULATION table: pop values x1, x2, x3, … with pmf probabilities p(x1), p(x2), p(x3), …, Total 1. Just as the sample mean and sample variance s² were used to characterize “measure of center” and “measure of spread” of a dataset, we can now define the “true” population mean μ and population variance σ², using probabilities. Population mean: μ = Σ x p(x), also denoted by E[X], the “expected value” of the variable X. Population variance: σ² = Σ (x – μ)² p(x).

Discrete random variable X. Example: X = Cholesterol level (mg/dL). POPULATION:

Pop values xi    Probabilities p(xi)
210              1/6
240              1/3
270              1/2
Total            1

μ = (210)(1/6) + (240)(1/3) + (270)(1/2) = 250
σ² = (210 – 250)²(1/6) + (240 – 250)²(1/3) + (270 – 250)²(1/2) = 500
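A minimal R check of this table (the values and probabilities are those shown above):

x <- c(210, 240, 270)
p <- c(1/6, 1/3, 1/2)
mu <- sum(x * p)                 # population mean: 250
sigma2 <- sum((x - mu)^2 * p)    # population variance: 500
c(mu, sigma2)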

Example 2: Equally likely outcomes result in a “uniform distribution.” Discrete random variable X. Example: X = Cholesterol level (mg/dL). POPULATION:

Pop values xi    Probabilities p(xi)
180              1/3
210              1/3
240              1/3
Total            1

μ = 210 (clear from symmetry)
σ² = (180 – 210)²(1/3) + (210 – 210)²(1/3) + (240 – 210)²(1/3) = 600

To summarize…

Discrete random variable X:
POPULATION: probability table (pop values xi with pmf p(xi): x1 p(x1), x2 p(x2), x3 p(x3), …, Total 1) and probability histogram with Total Area = 1.
SAMPLE of size n (x1, x2, x3, x4, x5, x6, …, xn): frequency table (data values xi with relative frequencies p(xi) = fi / n, up to xk p(xk), Total 1) and density histogram with Total Area = 1.

But what is the corresponding picture when the random variable X is continuous?

Example 3: TWO INDEPENDENT POPULATIONS. X1 = Cholesterol level (mg/dL) with values 210, 240, 270 and probabilities p1(x) = 1/6, 1/3, 1/2 (Total 1), so μ1 = 250 and σ1² = 500. X2 = Cholesterol level (mg/dL) with values 180, 210, 240 and probabilities p2(x) = 1/3, 1/3, 1/3 (Total 1), so μ2 = 210 and σ2² = 600. Consider the difference D = X1 – X2 ~ ??? The possible outcomes are:

d      Outcomes
-30    (210, 240)
0      (210, 210), (240, 240)
+30    (210, 180), (240, 210), (270, 240)
+60    (240, 180), (270, 210)
+90    (270, 180)

NOTE: By definition, this is the sample space of the experiment! What are the probabilities of the corresponding events “D = d” for d = -30, 0, 30, 60, 90?

Are the outcomes of D equally likely, so that we could simply count (1/9, 2/9, 3/9, …)? NO!!! The outcomes of D are NOT EQUALLY LIKELY, because the underlying values of X1 and X2 are not equally likely.

Example 3, continued: since the two populations are INDEPENDENT, each outcome probability is the product of the corresponding probabilities of X1 and X2. D = X1 – X2 has probability table:

d      Probabilities p(d)
-30    (1/6)(1/3) = 1/18
0      (1/6)(1/3) + (1/3)(1/3) = 3/18
+30    (1/6)(1/3) + (1/3)(1/3) + (1/2)(1/3) = 6/18
+60    (1/3)(1/3) + (1/2)(1/3) = 5/18
+90    (1/2)(1/3) = 3/18

(Probability histogram heights: 1/18, 3/18, 6/18, 5/18, 3/18.) From this table,

μD = (-30)(1/18) + (0)(3/18) + (30)(6/18) + (60)(5/18) + (90)(3/18) = 40, i.e., μD = μ1 – μ2, and
σD² = (-70)²(1/18) + (-40)²(3/18) + (-10)²(6/18) + (20)²(5/18) + (50)²(3/18) = 1100, i.e., σD² = σ1² + σ2².

What happens if the two populations are dependent? Later…
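The whole table for D = X1 – X2 can be generated mechanically in R, using independence to multiply probabilities (a sketch; the variable names are my own):

x1 <- c(210, 240, 270); p1 <- c(1/6, 1/3, 1/2)
x2 <- c(180, 210, 240); p2 <- c(1/3, 1/3, 1/3)
d  <- outer(x1, x2, "-")                   # all possible differences
pd <- outer(p1, p2)                        # joint probabilities via independence
tapply(as.vector(pd), as.vector(d), sum)   # pmf of D: 1/18, 3/18, 6/18, 5/18, 3/18
muD <- sum(d * pd)                         # 40   = mu1 - mu2
sum((d - muD)^2 * pd)                      # 1100 = sigma1^2 + sigma2^2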

In general, for two populations X1 and X2:

Mean(X1 – X2) = Mean(X1) – Mean(X2), i.e., μD = μ1 – μ2.
If the populations are independent, Var(X1 – X2) = Var(X1) + Var(X2), i.e., σD² = σ1² + σ2².
IF the two populations are dependent, the mean formula still holds, BUT the variance becomes Var(X1 – X2) = Var(X1) + Var(X2) – 2 Cov(X1, X2).

These formulas are valid for continuous as well as discrete distributions.

NOTICE TO STAT 324 Slides 29-41 contain more details on properties of Expected Values. They are not required for Stat 324, but if you are experiencing difficulty with the formulas, you may find them of some benefit. Special note regarding Slide 41: Similar to the “alternate computational formula” for sample variance s2, such a formula also exists for population variance σ 2, derived there. Stat 324 material picks up with the Binomial Distribution.

General Properties of “Expectation” of X (discrete random variable X with pop values x and pmf p(x); Example: X = Cholesterol level (mg/dL)):

Suppose X is transformed to another random variable h(X). Then by definition, E[h(X)] = Σ h(x) p(x).
Suppose X is constant, say b, throughout the entire population. Then E[b] = b.
Multiply X by any constant a: then by definition E[aX] = Σ a x p(x) = a Σ x p(x), i.e., E[aX] = a E[X], so μ is also multiplied by a.
Add any constant b to X: then E[X + b] = Σ (x + b) p(x) = Σ x p(x) + b Σ p(x), i.e., E[X + b] = E[X] + b, so b is also added to μ.
Combining the two: E[aX + b] = a E[X] + b.

Corresponding properties of the population variance σ²:

Multiply X by any constant a: then Var(aX) = a² Var(X), so σX is multiplied by |a|.
Add any constant b to X: then b is also added to μX, but the spread is unchanged, i.e., Var(X + b) = Var(X).
Finally, σ² = E[(X – μ)²] = E[X²] – μ². This is the analogue of the “alternate computational formula” for the sample variance s².
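For the cholesterol table used throughout this section, the computational formula can be checked numerically in R:

x <- c(210, 240, 270)
p <- c(1/6, 1/3, 1/2)
mu <- sum(x * p)          # 250
sum((x - mu)^2 * p)       # definition of the variance:          500
sum(x^2 * p) - mu^2       # computational formula E[X^2] - mu^2: 500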

~ The Binomial Distribution ~ Used only when dealing with binary outcomes (two categories: “Success” vs. “Failure”), with a fixed probability of Success (π) in the population. Calculates the probability of obtaining any given number of Successes in a random sample of n independent “Bernoulli trials.” Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.

POPULATION: 40% Male, 60% Female. For any randomly selected individual, define a binary random variable (1 = Male, 0 = Female). Take a RANDOM SAMPLE of n = 100 individuals and let the discrete random variable X = # Males in the sample (0, 1, 2, 3, …, 99, 100). How can we calculate the probabilities p(x) = P(X = x) and F(x) = P(X ≤ x) for x = 0, 1, 2, 3, …, 100, e.g., p(25) = P(X = 25)? Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where P(H) = 0.4, P(T) = 0.6 … etc….

How many possible outcomes of n = 100 tosses exist with X = 25 Heads, i.e., { H1, H2, H3, …, H25 } placed among positions 1, 2, 3, …, 100? There are 100 possible open slots for H1 to occupy. For each one of them, there are 99 possible open slots left for H2 to occupy. For each one of them, there are 98 possible open slots left for H3 to occupy. …etc…etc…etc… For each one of them, there are 77 possible open slots left for H24 to occupy. For each one of them, there are 76 possible open slots left for H25 to occupy. Hence, there are 100 × 99 × 98 × … × 77 × 76 possible outcomes. This value is the number of permutations of 25 among 100, denoted 100P25. HOWEVER…

HOWEVER… this number unnecessarily includes the distinct permutations of the 25 Heads among themselves, all of which have Heads in the same positions. We would not want to count these as distinct outcomes.

How many such rearrangements are there? By the same logic, 25 × 24 × 23 × … × 3 × 2 × 1, i.e., “25 factorial,” denoted 25!. Therefore the number of genuinely distinct outcomes is

(100 × 99 × 98 × … × 77 × 76) / (25 × 24 × 23 × … × 3 × 2 × 1) = 100! / (25! 75!),

denoted “100-choose-25,” or 100C25. This value counts the number of combinations of 25 Heads among 100 coins. R: choose(100, 25); Calculator: 100 nCr 25.

What is the probability of each such outcome? Recall that, per toss, P(Heads) = π = 0.4 and P(Tails) = 1 – π = 0.6. Via independence of the binary outcomes between any two coins, each specific sequence with 25 Heads and 75 Tails has probability 0.4 × 0.6 × 0.6 × 0.4 × 0.6 × … × 0.6 × 0.4 × 0.4 × 0.6 = (0.4)^25 (0.6)^75. Therefore, the probability P(X = 25) is equal to 100C25 (0.4)^25 (0.6)^75. R: dbinom(25, 100, .4)

Question: What if the coin were “fair” (unbiased), i.e., π = 1 – π = 0.5? Then each specific sequence has probability 0.5 × 0.5 × 0.5 × … × 0.5 = (0.5)^100, the same for every sequence. This is the “equally likely” scenario!

In general: POPULATION with binary outcome “Success” vs. “Failure,” P(“Success”) = π, P(“Failure”) = 1 – π. Take a random sample of size n, modeled as n Bernoulli trials: independent, with constant probability (π) per trial. Let the discrete random variable X = # “Successes” in the sample (0, 1, 2, 3, …, n). Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability mass function” p(x) = nCx π^x (1 – π)^(n – x), x = 0, 1, 2, …, n.
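A short R check that the counting argument and the built-in Binomial pmf agree for the example above (n = 100, π = 0.4, x = 25):

choose(100, 25) * 0.4^25 * 0.6^75   # combinations times per-sequence probability
dbinom(25, 100, 0.4)                # built-in Binomial pmf: same value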

Example: Blood Type probabilities, revisited.

Blood Type   Rh+     Rh–     Total
O            .384    .077    .461
A            .323    .065    .388
B            .094    .017    .111
AB           .032    .007    .039
Total        .833    .166    .999

Suppose n = 10 individuals are to be selected at random from the population, and we want the probability table for X = #(Type O). Does a Binomial model apply? Check: 1. Independent outcomes? Reasonably assume that outcomes “Type O” vs. “Not Type O” between two individuals are independent of each other. ✓ 2. Constant probability π? From the table, π = P(Type O) = .461 throughout the population. ✓

Binomial model applies: X ~ Bin(10, .461), with p(x) = 10Cx (.461)^x (.539)^(10 – x). R: dbinom(0:10, 10, .461). Probability table for X = #(Type O):

x     p(x)       F(x)
0     0.00207    0.00207
1     0.01770    0.01977
2     0.06813    0.08790
3     0.15538    0.24328
4     0.23257    0.47585
5     0.23870    0.71455
6     0.17013    0.88468
7     0.08315    0.96783
8     0.02667    0.99450
9     0.00507    0.99957
10    0.00043    1.00000

# Draw the probability histogram of X ~ Bin(10, .461) as an area-1 density histogram
n = 10
p = .461
pmf = function(x)(dbinom(x, n, p))    # Binomial pmf
N = 100000                            # total number of replicated values
x = 0:10
bin.dat = rep(x, N*pmf(x))            # repeat each x in proportion to its probability
hist(bin.dat, freq = F, breaks = c(-.5, x+.5), col = "green")
axis(1, at = x)
axis(2)

For this Binomial model, X ~ Bin(10, .461), one can also show that the mean is μ = Σ x p(x) = nπ = (10)(.461) = 4.61 and the variance is σ² = Σ (x – μ)² p(x) = nπ(1 – π) = (10)(.461)(.539) ≈ 2.48.
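These shortcut formulas can be checked in R against the definitions of μ and σ²:

x <- 0:10
p <- dbinom(x, 10, .461)
mu <- sum(x * p)          # 4.61  = n * pi
sum((x - mu)^2 * p)       # ~2.48 = n * pi * (1 - pi)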

Example: Blood Type probabilities, revisited. Now suppose n = 1500 individuals are to be selected at random from the population, and we want the probability table for X = #(Type AB–). From the table, π = P(Type AB–) = .007: a RARE EVENT! The Binomial model applies: X ~ Bin(1500, .007), with p(x) = 1500Cx (.007)^x (.993)^(1500 – x), x = 0, 1, 2, …, 1500, mean μ = nπ = 10.5, and variance σ² = nπ(1 – π) = 10.43.

This exact Binomial pmf is awkward to work with for x = 0, 1, 2, …, 1500: the distribution has a long positive skew as x → 1500, but the contribution out there is ≈ 0. Is there a better alternative?

Yes: for a RARE EVENT (n large, π small), use the Poisson distribution, p(x) = e^(–μ) μ^x / x!, x = 0, 1, 2, …, where the mean and variance are μ = nπ and σ² = nπ. Here μ = nπ = (1500)(.007) = 10.5, so X ~ Poisson(10.5) approximates X ~ Bin(1500, .007). Notation: Sometimes the symbol λ (“lambda”) is used instead of μ (“mu”).

Ex: Probability of exactly X = 15 Type(AB–) individuals in the sample of n = 1500? Poisson: e^(–10.5) (10.5)^15 / 15!. Binomial: 1500C15 (.007)^15 (.993)^1485. (Both ≈ .0437.)
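In R, the two calculations can be compared directly:

dbinom(15, 1500, .007)   # exact Binomial:        ~ .0437
dpois(15, 10.5)          # Poisson approximation: ~ .0437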

Example: Deaths in Wisconsin

Example: Deaths in Wisconsin. Assuming deaths among young adults are relatively rare, we know the following: there is an average of 584 deaths per year, so take λ = 584, and the mortality rate (α) seems constant. Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = “# deaths” per year for this population (15-24 yrs), assuming current values will still apply.

Probability of exactly X = 600 deaths next year: P(X = 600) = 0.0131. R: dpois(600, 584)
Probability of exactly X = 1200 deaths in the next two years: a mean of 584 deaths per yr means a mean of 1168 deaths per two yrs, so let λ = 1168: P(X = 1200) = 0.00746.
Probability of at least one death per day: λ = 584/365 = 1.6 deaths/day. P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … is true, but not practical. Instead, P(X ≥ 1) = 1 – P(X = 0) = 1 – e^(–1.6) = 0.798.
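The three Poisson calculations above can be reproduced in R:

dpois(600, 584)           # P(X = 600) in one year, lambda = 584:    0.0131
dpois(1200, 2 * 584)      # P(X = 1200) in two years, lambda = 1168: 0.00746
lambda.day <- 584 / 365   # ~ 1.6 deaths per day
1 - dpois(0, lambda.day)  # P(X >= 1) = 1 - e^(-1.6):                0.798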

Poisson Distribution (discrete). For x = 0, 1, 2, …, this calculates P(x Events) in a random sample of n trials coming from a population with rare P(Event) = π. But it may also be used to calculate P(x Events) within a random interval of T time units, for a “Poisson process” having a known “Poisson rate” α. Recall X = # “clicks” on a Geiger counter in normal background radiation. The companion quantity, X = time between “clicks” (more generally, time between events: failures, deaths, births, etc.), is the subject of “Time-to-Event Analysis,” “Time-to-Failure Analysis,” “Reliability Analysis,” and “Survival Analysis”; time between events is often modeled by the Exponential Distribution (continuous).

Classical Discrete Probability Distributions
Binomial ~ X = # Successes in n trials, P(Success) = π
Poisson ~ As above, but n large, π small, i.e., Success RARE
Negative Binomial ~ X = # trials for k Successes, P(Success) = π
Geometric ~ As above, but specialized to k = 1
Hypergeometric ~ As Binomial, but π changes between trials
Multinomial ~ As Binomial, but for multiple categories, with π1 + π2 + … + πlast = 1 and x1 + x2 + … + xlast = n

From discrete to continuous random variables. Example: X = Cholesterol level (mg/dL); Example: X = “reaction time” in the “Pain Threshold” experiment: volunteers place one hand on a metal plate carrying a low electrical current; measure the duration until the hand is withdrawn. In a SAMPLE, the density histogram can be drawn with time intervals of 5.0 secs, 2.0 secs, 1.0 secs, 0.5 secs, … In principle, as the number of individuals in the samples increases without bound, the class interval widths can be made arbitrarily small, i.e., the scale at which X is measured can be made arbitrarily fine, since it is continuous. “In the limit…” we obtain a density curve for the POPULATION, with Total Area = 1.

“In the limit…” we obtain a density curve f(x), the probability density function (pdf), with f(x) ≥ 0 and total Area = 1. The cumulative probability F(x) = P(X ≤ x) = area under the density curve up to x, and F(x) increases continuously from 0 to 1. As with discrete variables, the density f(x) is the height, NOT the probability p(x) = P(X = x). In fact, the zero-area “limit” argument would seem to imply P(X = x) = 0 ??? (Later…) However, we can define “interval probabilities” of the form P(a ≤ X ≤ b), using the cdf F(x): P(a ≤ X ≤ b) can be calculated as the amount of area under the curve f(x) between a and b, or the difference P(X ≤ b) – P(X ≤ a), i.e., F(b) – F(a). (Ordinarily, finding the area under a general curve requires calculus techniques… unless the “curve” is a straight line, for instance. Examples to follow…)

Consider the following continuous random variable… Example: X = “Ages of children from 1 year old to 6 years old.” Further suppose that X is uniformly distributed over the interval [1, 6], so the density has a constant height over a base of 6 – 1 = 5. Check: for the total area to be 1, Base × Height = 5 × 0.2 = 1, so Height = 0.2, i.e., f(x) = 0.2 ≥ 0 on [1, 6].

“What is the probability that a random child is 4 years old?” doesn’t mean P(X = exactly 4), because that is 0 !!!!! The probability that a continuous random variable is exactly equal to any single value is ZERO: a single value is one point out of an infinite continuum of points on the real number line. The question actually means something like “between 4 and 5 years old,” i.e., P(4 ≤ X ≤ 5) = (5 – 4)(0.2) = 0.2. NOTE: Since P(X = 5) = 0, there is no change for P(4 ≤ X < 5), P(4 < X ≤ 5), or P(4 < X < 5).

Cumulative probability: F(x) = P(X ≤ x) = area under the density curve up to x. For any x in [1, 6], that area is F(x) = 0.2 (x – 1), which increases continuously from 0 to 1 (compare with the “staircase graph” for the discrete case).

Using F(x) = 0.2 (x – 1):

“What is the probability that a random child is under 5 years old?” F(5) = 0.8
“What is the probability that a random child is under 4 years old?” F(4) = 0.6
“What is the probability that a random child is between 4 and 5 years old?” F(5) – F(4) = 0.8 – 0.6 = 0.2
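These uniform-distribution probabilities are also available directly in R via punif (using the [1, 6] example above):

punif(4, min = 1, max = 6)        # F(4) = 0.6
punif(5, min = 1, max = 6)        # F(5) = 0.8
punif(5, 1, 6) - punif(4, 1, 6)   # P(4 <= X <= 5) = 0.2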

More generally, for X = “ages of children from 1 year old to 6 years old” with any density f(x) ≥ 0, the total area under the curve (for the uniform case, Base × Height) must equal 1, and the cumulative probability F(x) = P(X ≤ x) is the area under the density curve up to x. Reading probabilities from the graph of the cumulative distribution function F(x): “What is the probability that a child is under 4 years old?” is F(4); “…under 5 years old?” is F(5); “…between 4 and 5?” is F(5) – F(4).

In summary… A continuous random variable X corresponds to a probability density function (pdf) f(x), whose graph is a density curve; f(x) is NOT a pmf! The cumulative probability function (cdf) is F(x) = P(X ≤ x), the accumulated area under f up to x, and F(x) increases continuously from 0 to 1. Moreover, by the Fundamental Theorem of Calculus, differentiating the cdf recovers the pdf: F′(x) = f(x).

SECTION 4.3 IN POSTED LECTURE NOTES

Four Examples: 1. For any b > 0, consider the following probability density function (pdf)… [formula shown in the posted notes]. Determine the cumulative distribution function (cdf): for any x below the support of f, F(x) = 0; within the support, F(x) is the accumulated area under f up to x; and beyond it, F(x) = 1. The resulting cdf is monotonic and continuous from 0 to 1.

Four Examples: 2. For any b > a > 0, consider the probability density function (pdf)… [formula shown in the posted notes]. Determine the cumulative distribution function (cdf), handling the cases x < a, a ≤ x ≤ b, and x > b separately, and then determine the mean and the variance.
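If (as its statement suggests, though the formula itself does not survive in this transcript) Example 2 is the uniform density f(x) = 1/(b – a) on [a, b], then the mean (a + b)/2 and variance (b – a)²/12 can be checked numerically in R; the particular a and b below are arbitrary illustration values:

a <- 2; b <- 7                    # any b > a > 0 would do
integrate(function(x) x * dunif(x, a, b), a, b)$value                   # mean: (a + b)/2 = 4.5
integrate(function(x) (x - (a + b)/2)^2 * dunif(x, a, b), a, b)$value   # variance: ~2.083
(b - a)^2 / 12                                                          # = 25/12, same value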

Four Examples: 3 and 4. Each considers a pdf defined on an unbounded interval… [formulas shown in the posted notes]. Confirming that each is a valid pdf requires an improper integral (WARNING: “IMPROPER INTEGRAL”). In Example 4, however, one of the improper integrals encountered does not exist (it diverges)!

DISCRETE vs. CONTINUOUS: as the time intervals shrink (5.0 secs, 2.0 secs, 1.0 secs, 0.5 secs, …), the interval widths can be made arbitrarily small, i.e., the scale at which X is measured can be made arbitrarily fine, since it is continuous. As Δx → 0 and the number of rectangles → ∞, this “Riemann sum” of “density” rectangles approaches the area under the density curve f(x), expressed as a definite integral.

~ The Normal Distribution ~ (a.k.a. “The Bell Curve”), X ~ N(μ, σ), with mean μ and standard deviation σ. [Johann Carl Friedrich Gauss, 1777-1855.] Symmetric, unimodal. Models many (but not all) natural systems. Its mathematical properties make it useful to work with.

SPECIAL CASE: the Standard Normal Distribution Z ~ N(0, 1), with mean 0, standard deviation 1, and Total Area = 1 under the curve. The cumulative distribution function (cdf) is denoted by Φ(z). It is not expressible in explicit, closed form, but it is tabulated, and computable in R via the command pnorm.

Example (Standard Normal Distribution Z ~ N(0, 1)): find Φ(1.2) = P(Z ≤ 1.2) for the “z-score” 1.2. Use the included table.

Lecture Notes Appendix…

Example, continued: Φ(1.2) = P(Z ≤ 1.2). From the table, or in R:
> pnorm(1.2)
[1] 0.8849303
So P(Z ≤ 1.2) = 0.88493 and P(Z > 1.2) = 0.11507. Note: Because this is a continuous distribution, P(Z = 1.2) = 0, so there is no difference between P(Z > 1.2) and P(Z ≥ 1.2), etc.

Why be concerned about the standard normal Z ~ N(0, 1), when most “bell curves” X ~ N(μ, σ) don’t have mean = 0 and standard deviation = 1? Because any normal distribution can be transformed to the standard normal distribution via a simple change of variable: the z-score Z = (X – μ) / σ.

Example: Random variable X = Age at first birth in the POPULATION (Year 2010), X ~ N(25.4, 1.5), so μ = 25.4 and σ = 1.5. Question: What proportion of the population had their first child before the age of 27.2 years old? P(X < 27.2) = ?

The x-score 27.2 must first be transformed to a corresponding z-score: z = (27.2 – 25.4) / 1.5 = 1.2.

Therefore P(X < 27.2) = P(Z < 1.2) = 0.88493. Using R directly:
> pnorm(27.2, 25.4, 1.5)
[1] 0.8849303
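As a compact check, the standardization step and the direct calculation agree in R:

z <- (27.2 - 25.4) / 1.5   # z-score = 1.2
pnorm(z)                   # P(Z < 1.2)           = 0.8849303
pnorm(27.2, 25.4, 1.5)     # P(X < 27.2) directly = 0.8849303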

Standard Normal Distribution Z ~ N(0, 1) 1 Z What symmetric interval about the mean 0 contains 95% of the population values? That is…

That is, find the cutoffs -z.025 = ? and +z.025 = ? such that the middle area is 0.95 and each tail area is 0.025. Use the included table.

Lecture Notes Appendix…

Answer: the “.025 critical values” are -z.025 = -1.96 and +z.025 = +1.96. From the table, or in R:
> qnorm(.025)
[1] -1.959964
> qnorm(.975)
[1] 1.959964

Back on the original scale X ~ N(25.4, 1.5): what symmetric interval about the mean age of 25.4 contains 95% of the population values? Answer: 22.46 ≤ X ≤ 28.34 yrs.
> areas = c(.025, .975)
> qnorm(areas, 25.4, 1.5)
[1] 22.46005 28.33995

Similarly… What symmetric interval about the mean 0 contains 90% of the population values? That is, find -z.05 = ? and +z.05 = ? with middle area 0.90 and tail areas 0.05 each. Use the included table.

In the table, 0.95 falls between Φ(1.64) = 0.94950 and Φ(1.65) = 0.95053 (0.95 ≈ their average), so average 1.64 and 1.65 to get z = 1.645.

Answer: the “.05 critical values” are -z.05 = -1.645 and +z.05 = +1.645. From the table, or in R:
> qnorm(.05)
[1] -1.644854
> qnorm(.95)
[1] 1.644854

In general… What symmetric interval about the mean 0 contains 100(1 – α)% of the population values? The cutoffs are the “α/2 critical values” ±z_α/2, leaving area α/2 in each tail and area 1 – α in the middle.

Normal Approximation to the Binomial Distribution (a continuous approximation to a discrete distribution). Suppose a certain outcome exists in a population with constant probability π. We randomly select a sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses), with P(Success) = π and P(Failure) = 1 – π. Let the discrete random variable X = # Successes in the sample (0, 1, 2, 3, …, n). Then X follows a Binomial distribution, written X ~ Bin(n, π), with “probability function” p(x) = nCx π^x (1 – π)^(n – x), x = 0, 1, 2, …, n.

> dbinom(10, 100, .2) [1] 0.00336282 Area

> pbinom(10, 100, .2) [1] 0.005696381 Area

“Sampling Distribution”: if X ~ Bin(n, π) with nπ ≥ 15 and n(1 – π) ≥ 15, then X is approximately normally distributed, X ≈ N(nπ, √(nπ(1 – π))), i.e., the Binomial probability histogram is well approximated by a normal curve with mean nπ and standard deviation √(nπ(1 – π)).
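A rough numerical comparison in R for the n = 100, π = 0.2 example above (nπ = 20 and n(1 – π) = 80, so the rule of thumb is met); the added 0.5 is the usual continuity correction, and agreement is only approximate this far out in the tail:

pbinom(10, 100, .2)                                 # exact Binomial cdf: 0.005696
pnorm(10.5, mean = 20, sd = sqrt(100 * .2 * .8))    # Normal approximation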

Classical Continuous Probability Distributions
Normal distribution
Log-Normal ~ X is not normally distributed (e.g., skewed), but Y = “logarithm of X” is normally distributed
Student’s t-distribution ~ Similar to the normal distribution, but more flexible
F-distribution ~ Used when comparing multiple group means
Chi-squared distribution ~ Used extensively in categorical data analysis
Others for specialized applications ~ Gamma, Beta, Weibull…