CHAPTER 4

4.1 - Discrete Models
  - General distributions
  - Classical: Binomial, Poisson, etc.

4.2 - Continuous Models
  - General distributions
  - Classical: Normal, etc.

What is the connection between probability and random variables? Events (and their corresponding probabilities) that involve experimental measurements can be described by random variables (e.g., “X = number of males” in the previous gender equity example).

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

Probability table
  Pop values x_i    Probabilities f(x_i)
  x_1               f(x_1)
  x_2               f(x_2)
  x_3               f(x_3)
  ⋮                 ⋮
  Total             1

SAMPLE of size n: x_1, x_2, x_3, x_4, x_5, x_6, …, x_n

Frequency table
  Data values x_i   Relative frequencies f(x_i) = f_i/n
  x_1               f(x_1)
  x_2               f(x_2)
  x_3               f(x_3)
  ⋮                 ⋮
  x_k               f(x_k)
  Total             1

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

f(x) = probability that the random variable X is equal to a specific value x, i.e.,
$f(x) = P(X = x)$, the “probability mass function” (pmf).

[Figure: Probability Histogram of X, with the value x marked on the axis; Total Area = 1]

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

F(x) = probability that the random variable X is less than or equal to a specific value x, i.e.,
$F(x) = P(X \le x)$, the “cumulative distribution function” (cdf).

[Figure: Probability Histogram of X, with the value x marked on the axis; Total Area = 1]

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

Calculating probabilities…
$P(a \le X \le b) = \sum_{a \le x \le b} f(x) = F(b) - F(a^-)$, where $F(a^-) = P(X < a)$ is the cdf just below a. If the left endpoint is excluded, $P(a < X \le b) = F(b) - F(a)$.

[Figure: Probability Histogram of X, with a, x, and b marked on the axis]

What about the population mean $\mu$ and the population variance $\sigma^2$?
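As a minimal sketch of how the cdf and interval probabilities follow from a probability table, the Python below uses the three-point cholesterol distribution that appears in Example 1 of this chapter. The function names `cdf` and `prob_between` are illustrative assumptions, not part of the original slides.

```python
from fractions import Fraction

# pmf from Example 1 of this chapter: cholesterol level (mg/dL) -> probability
pmf = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

def cdf(pmf, x):
    """F(x) = P(X <= x): add up f(t) over all values t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

def prob_between(pmf, a, b):
    """P(a <= X <= b): add up f(t) over the closed interval [a, b]."""
    return sum((p for t, p in pmf.items() if a <= t <= b), Fraction(0))

print(cdf(pmf, 240))                   # F(240) = 1/6 + 1/3 = 1/2
print(prob_between(pmf, 240, 270))     # f(240) + f(270) = 5/6
print(cdf(pmf, 270) - cdf(pmf, 240))   # F(270) - F(240) = 1/2, i.e. P(240 < X <= 270)
```

The last two lines differ by f(240) = 1/3: for a discrete variable, subtracting F(a) drops the probability sitting exactly at a, so the left endpoint has to be handled deliberately.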

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

Just as the sample mean $\bar{x}$ and sample variance $s^2$ were used to characterize the “measure of center” and “measure of spread” of a dataset, we can now define the “true” population mean $\mu$ and population variance $\sigma^2$, using probabilities.

Population mean: $\mu = \sum x \, f(x)$
Also denoted by E[X], the “expected value” of the variable X.

Population variance: $\sigma^2 = \sum (x - \mu)^2 \, f(x)$

Both sums run over all population values x in the probability table.


Example 1:
POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values x_i    Probabilities f(x_i)
  210               1/6
  240               1/3
  270               1/2
  Total             1

$\mu = 250$, $\sigma^2 = 500$

[Figure: Probability Histogram with bars of height 1/6, 1/3, 1/2]
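The values 250 and 500 can be checked directly from the definitions of the population mean and variance. The following is a small sketch using exact fractions; the helper name `mean_and_variance` is a hypothetical choice for illustration.

```python
from fractions import Fraction

def mean_and_variance(pmf):
    """Return (mu, sigma^2) for a discrete pmf given as {value: probability}."""
    assert sum(pmf.values()) == 1, "probabilities must add up to 1"
    mu = sum(x * p for x, p in pmf.items())                # mu = sum of x f(x)
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # sigma^2 = sum of (x - mu)^2 f(x)
    return mu, var

# Probability table of Example 1
example_1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

mu, var = mean_and_variance(example_1)
print(mu, var)  # 250 500
```

Exact `Fraction` arithmetic is used here only so that the results match the table without floating-point rounding.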

Example 2:
POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values x_i    Probabilities f(x_i)
  180               1/3
  210               1/3
  240               1/3
  Total             1

Equally likely outcomes result in a “uniform distribution.”

$\mu = 210$ (clear from symmetry), $\sigma^2 = 600$

[Figure: Probability Histogram with three bars of equal height 1/3]
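For readers who want the arithmetic behind Example 2 spelled out, here is the computation from the definitions $\mu = \sum x f(x)$ and $\sigma^2 = \sum (x - \mu)^2 f(x)$, written as a short worked block:

```latex
\begin{align*}
\mu      &= 180\cdot\tfrac13 + 210\cdot\tfrac13 + 240\cdot\tfrac13
          = \frac{630}{3} = 210, \\
\sigma^2 &= (180-210)^2\cdot\tfrac13 + (210-210)^2\cdot\tfrac13 + (240-210)^2\cdot\tfrac13
          = \frac{900 + 0 + 900}{3} = 600.
\end{align*}
```

The mean lands on the middle value because the three equally likely values are symmetric about 210.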

To summarize…

POPULATION: discrete random variable X

Probability Table
  Pop x_i    Probabilities f(x_i)
  x_1        f(x_1)
  x_2        f(x_2)
  x_3        f(x_3)
  ⋮          ⋮
  Total      1

[Figure: Probability Histogram of X; Total Area = 1]

SAMPLE of size n: x_1, x_2, x_3, x_4, x_5, x_6, …, x_n

Frequency Table
  Data x_i   Relative Frequencies f(x_i) = f_i/n
  x_1        f(x_1)
  x_2        f(x_2)
  x_3        f(x_3)
  ⋮          ⋮
  x_k        f(x_k)
  Total      1

[Figure: Density Histogram of X; Total Area = 1]

Everything above is for a discrete random variable X; continuous random variables are the subject of 4.2.
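The parallel between the probability table (population) and the frequency table (sample) can be illustrated by simulation. The sketch below assumes the Example 1 distribution as the population and draws a random sample from it; the sample size and seed are arbitrary choices made for illustration.

```python
import random
from collections import Counter

# Population probability table (Example 1 of this chapter), as floats
population = {210: 1 / 6, 240: 1 / 3, 270: 1 / 2}

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
n = 10_000               # sample size (arbitrary choice)
sample = rng.choices(list(population), weights=list(population.values()), k=n)

# Frequency table: relative frequency f_i / n for each observed value
counts = Counter(sample)
rel_freq = {x: counts[x] / n for x in population}

for x in population:
    print(x, round(population[x], 3), round(rel_freq[x], 3))
```

With a large sample the relative frequencies should sit close to the population probabilities, which is the sense in which the frequency table estimates the probability table.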

Example 3: TWO INDEPENDENT POPULATIONS

X_1 = Cholesterol level (mg/dL)
  x        f_1(x)
  210      1/6
  240      1/3
  270      1/2
  Total    1
$\mu_1 = 250$, $\sigma_1^2 = 500$

X_2 = Cholesterol level (mg/dL)
  x        f_2(x)
  180      1/3
  210      1/3
  240      1/3
  Total    1
$\mu_2 = 210$, $\sigma_2^2 = 600$

D = X_1 - X_2 ~ ???

  d      Outcomes (x_1, x_2)
  -30    (210, 240)
  0      (210, 210), (240, 240)
  +30    (210, 180), (240, 210), (270, 240)
  +60    (240, 180), (270, 210)
  +90    (270, 180)

NOTE: By definition, this is the sample space of the experiment!

What are the probabilities of the corresponding events “D = d” for d = -30, 0, 30, 60, 90?

Example 3 (continued): D = X_1 - X_2 ~ ???

Is f(d) simply the share of the nine outcomes that give D = d?

  d      Probabilities f(d)
  -30    1/9 ?
  0      2/9 ?
  +30    3/9 ?
  +60    2/9 ?
  +90    1/9 ?

No. The outcomes of D are NOT EQUALLY LIKELY!!!

Example 3 (continued): D = X_1 - X_2 ~ ???

Because X_1 and X_2 are independent, each outcome (x_1, x_2) has probability f_1(x_1) f_2(x_2), and f(d) is the sum of these products over the outcomes with x_1 - x_2 = d.

  d      Probabilities f(d)
  -30    (1/6)(1/3) = 1/18   via independence
  0      (1/6)(1/3) + (1/3)(1/3) = 3/18
  +30    (1/6)(1/3) + (1/3)(1/3) + (1/2)(1/3) = 6/18
  +60    (1/3)(1/3) + (1/2)(1/3) = 5/18
  +90    (1/2)(1/3) = 3/18

[Figure: Probability Histogram of D with bars of height 1/18, 3/18, 6/18, 5/18, 3/18]
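The table of f(d) can be reproduced mechanically by looping over every pair (x_1, x_2) and multiplying the two marginal probabilities, which is valid only under the independence assumption stated in the example. This is a minimal sketch; the function name `difference_pmf` is a hypothetical choice.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

f1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}  # X1
f2 = {180: Fraction(1, 3), 210: Fraction(1, 3), 240: Fraction(1, 3)}  # X2

def difference_pmf(f1, f2):
    """pmf of D = X1 - X2, assuming X1 and X2 are independent."""
    f_d = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for (x1, p1), (x2, p2) in product(f1.items(), f2.items()):
        # Independence: P(X1 = x1 and X2 = x2) = f1(x1) * f2(x2)
        f_d[x1 - x2] += p1 * p2
    return dict(sorted(f_d.items()))

f_d = difference_pmf(f1, f2)
for d, p in f_d.items():
    print(f"{d:+4d}  {p}")
```

Note that `Fraction` prints in lowest terms, so 3/18 appears as 1/6 and 6/18 as 1/3; the values agree with the table.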

Example 3 (continued): mean and variance of D = X_1 - X_2

$\mu_D = (-30)(1/18) + (0)(3/18) + (30)(6/18) + (60)(5/18) + (90)(3/18) = 40$

$\sigma_D^2 = (-70)^2(1/18) + (-40)^2(3/18) + (-10)^2(6/18) + (20)^2(5/18) + (50)^2(3/18) = 1100$

Compare with $\mu_1 = 250$, $\sigma_1^2 = 500$ and $\mu_2 = 210$, $\sigma_2^2 = 600$:

$\mu_D = \mu_1 - \mu_2$
$\sigma_D^2 = \sigma_1^2 + \sigma_2^2$
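As a quick numerical check of these two relationships, the sketch below recomputes the mean and variance of D from its probability table and compares them with the values quoted for the two populations. The variable names are illustrative.

```python
from fractions import Fraction

# Probability table of D = X1 - X2 from Example 3
f_d = {-30: Fraction(1, 18), 0: Fraction(3, 18), 30: Fraction(6, 18),
       60: Fraction(5, 18), 90: Fraction(3, 18)}

mu_d = sum(d * p for d, p in f_d.items())
var_d = sum((d - mu_d) ** 2 * p for d, p in f_d.items())

# Means and variances of the two populations, as given in the example
mu_1, var_1 = 250, 500
mu_2, var_2 = 210, 600

print(mu_d, var_d)                    # 40 1100
print(mu_d == mu_1 - mu_2)            # True: the means subtract
print(var_d == var_1 + var_2)         # True: the variances add
```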

General: TWO INDEPENDENT POPULATIONS

D = X_1 - X_2

$\mu_D = \mu_1 - \mu_2$
$\sigma_D^2 = \sigma_1^2 + \sigma_2^2$

These two formulas are valid for continuous as well as discrete distributions.

IF the two populations are dependent…
…then the formula for $\mu_D$ still holds, BUT the variance needs a covariance term:
$\sigma_D^2 = \sigma_1^2 + \sigma_2^2 - 2\,\mathrm{Cov}(X_1, X_2)$
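The covariance term can be seen by expanding the definition of the variance of D. The derivation below is a standard sketch that uses only $\mu_D = \mu_1 - \mu_2$ and the definition $\mathrm{Cov}(X_1, X_2) = E[(X_1 - \mu_1)(X_2 - \mu_2)]$:

```latex
\begin{align*}
\sigma_D^2 &= E\big[(D - \mu_D)^2\big]
            = E\Big[\big((X_1 - \mu_1) - (X_2 - \mu_2)\big)^2\Big] \\
           &= E\big[(X_1 - \mu_1)^2\big] + E\big[(X_2 - \mu_2)^2\big]
              - 2\,E\big[(X_1 - \mu_1)(X_2 - \mu_2)\big] \\
           &= \sigma_1^2 + \sigma_2^2 - 2\,\mathrm{Cov}(X_1, X_2).
\end{align*}
```

When X_1 and X_2 are independent, the covariance is 0 and the formula reduces to $\sigma_D^2 = \sigma_1^2 + \sigma_2^2$, as in Example 3.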
