Sections 4.1, 4.2, 4.3 Important Definitions in the Text:


Sections 4.1, 4.2, 4.3 Important Definitions in the Text:
The definition of joint probability mass function (joint p.m.f.): Definition 4.1-1
The definitions of marginal probability mass function (marginal p.m.f.) and the independence of random variables: Definition 4.1-2
If the joint p.m.f. of (X, Y) is f(x,y), and S is the corresponding outcome space, then the mathematical expectation, or expected value, of u(X,Y) is
E[u(X,Y)] = Σ u(x,y) f(x,y) , summed over (x,y) ∈ S .
If the marginal p.m.f. of X is f1(x), and S1 is the corresponding outcome space, then E[v(X)] can be calculated from either
Σ v(x) f1(x) , summed over x ∈ S1 , or Σ v(x) f(x,y) , summed over (x,y) ∈ S .
An analogous statement can be made about E[v(Y)] .

1. Twelve bags each contain two pieces of candy, one red and one green. In two of the bags each piece of candy weighs 1 gram; in three of the bags the red candy weighs 2 grams and the green candy weighs 1 gram; in three of the bags the red candy weighs 1 gram and the green candy weighs 2 grams; in the remaining four bags each piece of candy weighs 2 grams. One bag is selected at random, and the following random variables are defined:
X = weight of the red candy , Y = weight of the green candy .
The space of (X, Y) is {(1,1) (1,2) (2,1) (2,2)}, and the joint probabilities can be displayed in a table:

        x = 1   x = 2
y = 2    1/4     1/3
y = 1    1/6     1/4

The joint p.m.f. of (X, Y) is
f(x, y) = 1/6 if (x, y) = (1, 1)
          1/4 if (x, y) = (1, 2) , (2, 1)
          1/3 if (x, y) = (2, 2)

The marginal p.m.f. of X is f1(x) = 5/12 if x = 1 , 7/12 if x = 2 .
The marginal p.m.f. of Y is f2(y) = 5/12 if y = 1 , 7/12 if y = 2 .
A formula for the joint p.m.f. of (X,Y) is
f(x, y) = (x + y)/12 if (x, y) = (1, 1) , (1, 2) , (2, 1) , (2, 2) .
A formula for the marginal p.m.f. of X is
f1(x) = f(x, 1) + f(x, 2) = (x + 1)/12 + (x + 2)/12 = (2x + 3)/12 if x = 1, 2 .
A formula for the marginal p.m.f. of Y is
f2(y) = f(1, y) + f(2, y) = (1 + y)/12 + (2 + y)/12 = (2y + 3)/12 if y = 1, 2 .

1. - continued
E(X) = (1)(5/12) + (2)(7/12) = 19/12
E(X²) = (1)²(5/12) + (2)²(7/12) = 11/4
Var(X) = 11/4 – (19/12)² = 35/144
E(Y) = (1)(5/12) + (2)(7/12) = 19/12
E(Y²) = (1)²(5/12) + (2)²(7/12) = 11/4
Var(Y) = 11/4 – (19/12)² = 35/144
Since f(x, y) ≠ f1(x) f2(y) (for instance, f(1, 1) = 1/6 while f1(1) f2(1) = 25/144), the random variables X and Y are not independent.
Using the joint p.m.f.,
E(X + Y) = (1+1)(1/6) + (1+2)(1/4) + (2+1)(1/4) + (2+2)(1/3) = 19/6 .
Alternatively, E(X + Y) = E(X) + E(Y) = 19/12 + 19/12 = 19/6 .

Using the joint p.m.f.,
E(X – Y) = (1–1)(1/6) + (1–2)(1/4) + (2–1)(1/4) + (2–2)(1/3) = 0 .
Alternatively, E(X – Y) = E(X) – E(Y) = 19/12 – 19/12 = 0 .
E(X + Y) can be interpreted as the mean of the total weight of candy in the bag. E(X – Y) can be interpreted as the mean of how much more the red candy in the bag weighs than the green candy.
E(XY) = (1)(1)(1/6) + (1)(2)(1/4) + (2)(1)(1/4) + (2)(2)(1/3) = 5/2
Cov(X,Y) = E(XY) – E(X)E(Y) = 5/2 – (19/12)² = –1/144
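
These hand computations are easy to check by machine. Here is a minimal sketch in plain Python (the helper name expect is ours, not the textbook's), using fractions.Fraction for exact arithmetic, that recomputes the moments and covariance directly from the joint p.m.f. table above.

```python
from fractions import Fraction as F

# Joint p.m.f. of (X, Y) for the candy example (weights in grams).
joint_pmf = {(1, 1): F(1, 6), (1, 2): F(1, 4),
             (2, 1): F(1, 4), (2, 2): F(1, 3)}

def expect(u):
    """E[u(X, Y)] = sum of u(x, y) f(x, y) over the joint space."""
    return sum(u(x, y) * p for (x, y), p in joint_pmf.items())

EX, EY = expect(lambda x, y: x), expect(lambda x, y: y)   # 19/12, 19/12
EXY = expect(lambda x, y: x * y)                          # 5/2
print(EX, EY, EXY)
print(EXY - EX * EY)                                      # Cov(X, Y) = -1/144
print(expect(lambda x, y: x * x) - EX ** 2)               # Var(X) = 35/144
```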

1. - continued
ρ = Cov(X,Y) / (σX σY) = (–1/144) / (35/144) = –1/35
The least squares line for predicting Y from X is
y = μY + ρ(σY/σX)(x – μX) = 19/12 – (1/35)(x – 19/12) .
The least squares line for predicting X from Y is
x = μX + ρ(σX/σY)(y – μY) = 19/12 – (1/35)(y – 19/12) .

The conditional p.m.f. of Y | X = 1 is h(y | 1) = f(1, y)/f1(1) = (1 + y)/5 for y = 1, 2 , that is, h(1 | 1) = 2/5 and h(2 | 1) = 3/5 .
The conditional p.m.f. of Y | X = 2 is h(y | 2) = f(2, y)/f1(2) = (2 + y)/7 for y = 1, 2 , that is, h(1 | 2) = 3/7 and h(2 | 2) = 4/7 .
For x = 1, 2, a formula for the conditional p.m.f. of Y | X = x is
h(y | x) = f(x, y)/f1(x) = (x + y)/(2x + 3) if y = 1, 2 .

1. - continued
The conditional p.m.f. of X | Y = 1 is g(x | 1) = f(x, 1)/f2(1) = (x + 1)/5 for x = 1, 2 .
The conditional p.m.f. of X | Y = 2 is g(x | 2) = f(x, 2)/f2(2) = (x + 2)/7 for x = 1, 2 .
For y = 1, 2, a formula for the conditional p.m.f. of X | Y = y is
g(x | y) = f(x, y)/f2(y) = (x + y)/(2y + 3) if x = 1, 2 .
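
The same table yields the conditional p.m.f.s mechanically: divide the joint p.m.f. by the appropriate marginal. A small sketch (the function name is ours):

```python
from fractions import Fraction as F

joint_pmf = {(1, 1): F(1, 6), (1, 2): F(1, 4),
             (2, 1): F(1, 4), (2, 2): F(1, 3)}

def cond_pmf_y_given_x(x):
    """h(y | x) = f(x, y) / f1(x), where f1 is the marginal of X."""
    f1 = sum(p for (a, _), p in joint_pmf.items() if a == x)
    return {b: p / f1 for (a, b), p in joint_pmf.items() if a == x}

print(cond_pmf_y_given_x(1))   # {1: Fraction(2, 5), 2: Fraction(3, 5)}
print(cond_pmf_y_given_x(2))   # {1: Fraction(3, 7), 2: Fraction(4, 7)}
```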

E(Y | X = 1) = (1)(2/5) + (2)(3/5) = 8/5
E(Y² | X = 1) = (1)(2/5) + (4)(3/5) = 14/5
Var(Y | X = 1) = 14/5 – (8/5)² = 6/25
E(Y | X = 2) = (1)(3/7) + (2)(4/7) = 11/7
E(Y² | X = 2) = (1)(3/7) + (4)(4/7) = 19/7
Var(Y | X = 2) = 19/7 – (11/7)² = 12/49
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ? Yes: since x takes only the two values 1 and 2, the points (1, 8/5) and (2, 11/7) necessarily lie on a line, E(Y | X = x) = 57/35 – (1/35)x , which is exactly the least squares line for predicting Y from X.

1. - continued
E(X | Y = 1) = 8/5   E(X² | Y = 1) = 14/5   Var(X | Y = 1) = 6/25
E(X | Y = 2) = 11/7   E(X² | Y = 2) = 19/7   Var(X | Y = 2) = 12/49
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ? Yes: by the same reasoning, E(X | Y = y) = 57/35 – (1/35)y , which matches the least squares line for predicting X from Y.

2. An urn contains six chips: one $1 chip, two $2 chips, and three $3 chips. Two chips are selected at random and without replacement. The following random variables are defined:
X = dollar value of the first chip selected , Y = dollar value of the second chip selected .
The space of (X, Y) is {(1,2) (1,3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3)}, and the joint probabilities can be displayed in a table:

        x = 1   x = 2   x = 3
y = 3    1/10    1/5     1/5
y = 2    1/15    1/15    1/5
y = 1     0      1/15    1/10

The joint p.m.f. of (X, Y) is
f(x, y) = 1/15 if (x, y) = (1, 2) , (2, 1) , (2, 2)
          1/10 if (x, y) = (1, 3) , (3, 1)
          1/5 if (x, y) = (2, 3) , (3, 2) , (3, 3)

2. - continued
The marginal p.m.f. of X is f1(x) = 1/6 if x = 1 , 1/3 if x = 2 , 1/2 if x = 3 .
The marginal p.m.f. of Y is f2(y) = 1/6 if y = 1 , 1/3 if y = 2 , 1/2 if y = 3 .
A formula for the joint p.m.f. of (X,Y) is f(x, y) = (There seems to be no easy formula.)
A formula for the marginal p.m.f. of X is f1(x) = x/6 if x = 1, 2, 3 .
A formula for the marginal p.m.f. of Y is f2(y) = y/6 if y = 1, 2, 3 .

E(X) = 7/3   E(X²) = 6   Var(X) = 6 – (7/3)² = 5/9
E(Y) = 7/3   E(Y²) = 6   Var(Y) = 6 – (7/3)² = 5/9
Since f(x, y) ≠ f1(x) f2(y) (for instance, f(1, 1) = 0 while f1(1) f2(1) = 1/36), the random variables X and Y are not independent.
P(X + Y < 4) = P[(X,Y) = (1,2)] + P[(X,Y) = (2,1)] = 1/15 + 1/15 = 2/15
Using the joint p.m.f.,
E(XY) = (1)(2)(2/30) + (1)(3)(3/30) + (2)(1)(2/30) + (2)(2)(2/30) + (2)(3)(6/30) + (3)(1)(3/30) + (3)(2)(6/30) + (3)(3)(6/30) = 16/3
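
Because the chips are drawn without replacement from a small set, the entire joint p.m.f. can be built by brute-force enumeration of ordered pairs of distinct chips. The sketch below (our own scaffolding, not the text's) confirms E(XY) = 16/3 and P(X + Y < 4) = 2/15.

```python
from fractions import Fraction as F
from itertools import permutations

chips = [1, 2, 2, 3, 3, 3]          # one $1, two $2, three $3

# All ordered draws of two distinct chips are equally likely (30 pairs).
pairs = list(permutations(range(len(chips)), 2))
joint = {}
for i, j in pairs:
    key = (chips[i], chips[j])
    joint[key] = joint.get(key, F(0)) + F(1, len(pairs))

print(sum(x * y * p for (x, y), p in joint.items()))        # 16/3
print(sum(p for (x, y), p in joint.items() if x + y < 4))   # 2/15
```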

2. - continued
Cov(X,Y) = E(XY) – E(X)E(Y) = 16/3 – (7/3)² = –1/9
ρ = (–1/9) / √[(5/9)(5/9)] = –1/5
The least squares line for predicting Y from X is y = 7/3 – (1/5)(x – 7/3) .
The least squares line for predicting X from Y is x = 7/3 – (1/5)(y – 7/3) .

The conditional p.m.f. of Y | X = 1 is h(2 | 1) = 2/5 , h(3 | 1) = 3/5 .
The conditional p.m.f. of Y | X = 2 is h(1 | 2) = 1/5 , h(2 | 2) = 1/5 , h(3 | 2) = 3/5 .
The conditional p.m.f. of Y | X = 3 is h(1 | 3) = 1/5 , h(2 | 3) = 2/5 , h(3 | 3) = 2/5 .
For x = 1, 2, 3, a formula for the conditional p.m.f. of Y | X = x is
h(y | x) = f(x, y)/f1(x) , which equals y/5 if y ≠ x and (y – 1)/5 if y = x (the second draw is equally likely to be any of the five remaining chips).

2. - continued
The conditional p.m.f. of X | Y = 1 is g(2 | 1) = 2/5 , g(3 | 1) = 3/5 .
The conditional p.m.f. of X | Y = 2 is g(1 | 2) = 1/5 , g(2 | 2) = 1/5 , g(3 | 2) = 3/5 .
The conditional p.m.f. of X | Y = 3 is g(1 | 3) = 1/5 , g(2 | 3) = 2/5 , g(3 | 3) = 2/5 .
For y = 1, 2, 3, a formula for the conditional p.m.f. of X | Y = y is
g(x | y) = f(x, y)/f2(y) , which equals x/5 if x ≠ y and (x – 1)/5 if x = y .

E(Y | X = 1) = 13/5   E(X | Y = 1) = 13/5
E(Y² | X = 1) = 7   E(X² | Y = 1) = 7
Var(Y | X = 1) = 7 – (13/5)² = 6/25   Var(X | Y = 1) = 6/25
E(Y | X = 2) = 12/5   E(X | Y = 2) = 12/5
E(Y² | X = 2) = 32/5   E(X² | Y = 2) = 32/5
Var(Y | X = 2) = 32/5 – (12/5)² = 16/25   Var(X | Y = 2) = 16/25
E(Y | X = 3) = 11/5   E(X | Y = 3) = 11/5
E(Y² | X = 3) = 27/5   E(X² | Y = 3) = 27/5
Var(Y | X = 3) = 27/5 – (11/5)² = 14/25   Var(X | Y = 3) = 14/25

2. - continued
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ? Yes: E(Y | X = x) = (14 – x)/5 = 14/5 – (1/5)x , which matches the least squares line for predicting Y from X.
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ? Yes: E(X | Y = y) = (14 – y)/5 = 14/5 – (1/5)y , which matches the least squares line for predicting X from Y.

3. An urn contains six chips: one $1 chip, two $2 chips, and three $3 chips. Two chips are selected at random and with replacement. The following random variables are defined:
X = dollar value of the first chip selected , Y = dollar value of the second chip selected .
The space of (X, Y) is {(1,1) (1,2) (1,3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3)}, and the joint probabilities can be displayed in a table:

        x = 1   x = 2   x = 3
y = 3    1/12    1/6     1/4
y = 2    1/18    1/9     1/6
y = 1    1/36    1/18    1/12

The joint p.m.f. of (X, Y) is f(x, y) = xy/36 if x = 1, 2, 3 and y = 1, 2, 3 .

3. - continued
The marginal p.m.f. of X is f1(x) = 1/6 if x = 1 , 1/3 if x = 2 , 1/2 if x = 3 .
The marginal p.m.f. of Y is f2(y) = 1/6 if y = 1 , 1/3 if y = 2 , 1/2 if y = 3 .
A formula for the joint p.m.f. of (X,Y) was found previously: f(x, y) = xy/36 .
A formula for the marginal p.m.f. of X is f1(x) = x/6 if x = 1, 2, 3 .
A formula for the marginal p.m.f. of Y is f2(y) = y/6 if y = 1, 2, 3 .

E(X) = 7/3   E(X²) = 6   Var(X) = 6 – (7/3)² = 5/9
E(Y) = 7/3   E(Y²) = 6   Var(Y) = 6 – (7/3)² = 5/9
Since f(x, y) = f1(x) f2(y) for every (x, y) in the space, the random variables X and Y are independent.
P(X + Y < 4) = P[(X,Y) = (1,1)] + P[(X,Y) = (1,2)] + P[(X,Y) = (2,1)] = 1/36 + 1/18 + 1/18 = 5/36
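
With replacement, the factorization f(x, y) = f1(x) f2(y) can be verified point by point; a short sketch:

```python
from fractions import Fraction as F

values = (1, 2, 3)
joint = {(x, y): F(x * y, 36) for x in values for y in values}
f1 = {x: sum(joint[x, y] for y in values) for x in values}
f2 = {y: sum(joint[x, y] for x in values) for y in values}

# Independence holds iff the joint p.m.f. factors at every point.
print(all(joint[x, y] == f1[x] * f2[y] for x in values for y in values))  # True
```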

3. - continued
Using the joint p.m.f.,
E(XY) = Σ Σ (xy)(xy/36) = [Σ (x)(x/6)] [Σ (y)(y/6)] = E(X) E(Y) = (7/3)(7/3) = 49/9 ,
where each sum runs over the values 1, 2, 3.
Cov(X,Y) = 49/9 – (7/3)(7/3) = 0 , so ρ = 0 .
The least squares line for predicting Y from X is y = 7/3 (slope zero).
The least squares line for predicting X from Y is x = 7/3 .

For x = 1, 2, 3, the conditional p.m.f. of Y | X = x is h(y | x) = f(x, y)/f1(x) = f2(y) = y/6 , by independence.
E(Y | X = x) = E(Y) = 7/3   Var(Y | X = x) = Var(Y) = 5/9
For y = 1, 2, 3, the conditional p.m.f. of X | Y = y is g(x | y) = f1(x) = x/6 .
E(X | Y = y) = E(X) = 7/3   Var(X | Y = y) = Var(X) = 5/9
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ? Yes: E(Y | X = x) = 7/3 , a constant (a = 7/3, b = 0), which matches the least squares line.
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ? Yes: E(X | Y = y) = 7/3 (c = 7/3, d = 0).

For continuous type random variables (X, Y), the definitions of joint probability density function (joint p.d.f.), independence of X and Y, and mathematical expectation are each analogous to those for discrete type random variables, with summation signs replaced by integral signs.
The covariance between random variables X and Y is
Cov(X,Y) = E[(X – μX)(Y – μY)] = E(XY) – E(X)E(Y) .
The correlation between random variables X and Y is
ρ = Cov(X,Y) / (σX σY) .
Consider the equation of a line y = a + bx which comes "closest" to predicting the values of the random variable Y from the random variable X, in the sense that E{[Y – (a + bX)]²} is minimized.

We let k(a,b) = E{[Y – (a + bX)]²} . To minimize k(a,b), we set the partial derivatives with respect to a and b equal to zero. (Note: This is textbook exercise 4.2-5.)
∂k/∂a = –2 E[Y – a – bX] = 0 , which gives E(Y) = a + b E(X) .
∂k/∂b = –2 E{X[Y – a – bX]} = 0 , which gives E(XY) = a E(X) + b E(X²) .
(Multiply the first equation by E(X), subtract the resulting equation from the second equation, and solve for b. Then substitute in place of b in the first equation to solve for a.)

b = [E(XY) – E(X)E(Y)] / [E(X²) – E(X)²] = Cov(X,Y) / Var(X) = ρ σY / σX
a = E(Y) – b E(X)
The least squares line for predicting Y from X can be written
y = μY + ρ(σY/σX)(x – μX) .
The least squares line for predicting X from Y can be written
x = μX + ρ(σX/σY)(y – μY) .
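
In code, the least squares coefficients fall straight out of the moments. A generic sketch for a discrete joint p.m.f. (the function name is ours), demonstrated on the candy example from problem 1, where it reproduces b = –1/35:

```python
from fractions import Fraction as F

def least_squares_line(joint_pmf):
    """Return (a, b) with b = Cov(X,Y)/Var(X) and a = E(Y) - b E(X)."""
    EX  = sum(x * p for (x, y), p in joint_pmf.items())
    EY  = sum(y * p for (x, y), p in joint_pmf.items())
    EXY = sum(x * y * p for (x, y), p in joint_pmf.items())
    EX2 = sum(x * x * p for (x, y), p in joint_pmf.items())
    b = (EXY - EX * EY) / (EX2 - EX ** 2)
    return EY - b * EX, b

candy = {(1, 1): F(1, 6), (1, 2): F(1, 4), (2, 1): F(1, 4), (2, 2): F(1, 3)}
a, b = least_squares_line(candy)
print(a, b)    # 57/35, -1/35: the line y = 19/12 - (1/35)(x - 19/12)
```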

The conditional p.m.f./p.d.f. of Y given X = x is defined to be h(y | x) = f(x, y) / f1(x) .
The conditional p.m.f./p.d.f. of X given Y = y is defined to be g(x | y) = f(x, y) / f2(y) .
The conditional mean of Y given X = x is defined to be E(Y | X = x) = Σ y h(y | x) , summed over the values of y.
The conditional variance of Y given X = x is defined to be
Var(Y | X = x) = E{[Y – E(Y | X = x)]² | X = x} = E(Y² | X = x) – [E(Y | X = x)]² .

The conditional mean of X given Y = y and the conditional variance of X given Y = y are each defined similarly. For continuous type random variables (X, Y), the definitions of conditional mean and variance are each analogous to those for discrete type random variables, with summation signs replaced by integral signs.
Suppose X and Y are two discrete type random variables, and E(Y | X = x) = a + bx . Then, for each possible value of x,
Σy y f(x, y) / f1(x) = a + bx .
Multiplying each side by f1(x),
Σy y f(x, y) = (a + bx) f1(x) .
Summing each side over all x,
E(Y) = a + b E(X) .

Now, multiplying each side of Σy y f(x, y) / f1(x) = a + bx by x f1(x),
Σy x y f(x, y) = (a x + b x²) f1(x) .
Summing each side over all x,
E(XY) = a E(X) + b E(X²) .

The two equations E(Y) = a + b E(X) and E(XY) = a E(X) + b E(X²) are essentially the same as those in the derivation of the least squares line for predicting Y from X. The derivation is analogous for continuous type random variables, with summation signs replaced by integral signs. Consequently, if E(Y | X = x) = a + bx (i.e., if E(Y | X = x) is a linear function of x), then a and b must be respectively the intercept and slope in the least squares line for predicting Y from X. Similarly, if E(X | Y = y) = c + dy (i.e., if E(X | Y = y) is a linear function of y), then c and d must be respectively the intercept and slope in the least squares line for predicting X from Y.

Suppose a set contains N = N1 + N2 + N3 items, where N1 items are of one type, N2 items are of a second type, and N3 items are of a third type; n items are selected from the N items at random and without replacement. If the random variable X1 is defined to be the number of the n selected items that are of the first type, X2 the number that are of the second type, and X3 the number that are of the third type, then the joint distribution of (X1 , X2 , X3) is called a trivariate hypergeometric distribution. Since X3 = n – X1 – X2 , X3 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is
f(x1, x2) = C(N1, x1) C(N2, x2) C(N3, n – x1 – x2) / C(N, n) if x1 and x2 are "appropriate" integers,
where C(m, k) denotes the binomial coefficient "m choose k". Each Xi has a hypergeometric distribution. If the number of types of items is any integer k > 1 with (X1 , X2 , … , Xk) defined in the natural way, then the joint p.m.f. is called a multivariate hypergeometric distribution.

Suppose each trial in a sequence of n independent trials must result in one of outcome 1, outcome 2, or outcome 3. The probability of outcome 1 on each trial is p1 , the probability of outcome 2 on each trial is p2 , and the probability of outcome 3 on each trial is p3 = 1 – p1 – p2 . If the random variable X1 is defined to be the number of the n trials resulting in outcome 1, X2 the number resulting in outcome 2, and X3 the number resulting in outcome 3, then the joint distribution of (X1 , X2 , X3) is called a trinomial distribution. Since X3 = n – X1 – X2 , X3 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is
f(x1, x2) = [n! / (x1! x2! (n – x1 – x2)!)] p1^x1 p2^x2 (1 – p1 – p2)^(n – x1 – x2)
if x1 and x2 are non-negative integers such that x1 + x2 ≤ n . Each Xi has a b(n, pi) distribution. If the number of outcomes is any integer k > 1 with (X1 , X2 , … , Xk) defined in the natural way, then the joint p.m.f. is called a multinomial distribution.

4. (a) An urn contains 15 red chips, 10 blue chips, and 5 white chips. Eight chips are selected at random and without replacement. The following random variables are defined:
X1 = number of red chips selected , X2 = number of blue chips selected , X3 = number of white chips selected .
Find the joint p.m.f. of (X1 , X2 , X3) .
(X1 , X2 , X3) have a trivariate hypergeometric distribution, and X3 = 8 – X1 – X2 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is
f(x1, x2) = C(15, x1) C(10, x2) C(5, 8 – x1 – x2) / C(30, 8)
if x1 = 0, 1, …, 8 and x2 = 0, 1, …, 8 with 3 ≤ x1 + x2 ≤ 8 .
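
A quick way to sanity-check this p.m.f. is to evaluate it with math.comb and confirm that it sums to 1 over the joint space; the sketch below (our helper, not the textbook's notation) does exactly that.

```python
from math import comb

def f(x1, x2, N1=15, N2=10, N3=5, n=8):
    """Trivariate hypergeometric joint p.m.f. of (X1, X2)."""
    x3 = n - x1 - x2
    if not (0 <= x1 <= N1 and 0 <= x2 <= N2 and 0 <= x3 <= N3):
        return 0.0
    return comb(N1, x1) * comb(N2, x2) * comb(N3, x3) / comb(N1 + N2 + N3, n)

print(sum(f(x1, x2) for x1 in range(9) for x2 in range(9)))   # 1.0
```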

(b) Find the marginal p.m.f. for each of X1 , X2 , and X3 . Each of X1 , X2 , and X3 has a hypergeometric distribution.
f1(x1) = C(15, x1) C(15, 8 – x1) / C(30, 8) if x1 = 0, 1, …, 8
f2(x2) = C(10, x2) C(20, 8 – x2) / C(30, 8) if x2 = 0, 1, …, 8

4. - continued
(c) f3(x3) = C(5, x3) C(25, 8 – x3) / C(30, 8) if x3 = 0, 1, …, 5
Are X1 , X2 , and X3 independent? Why or why not? X1, X2, X3 cannot possibly be independent, because any one of these random variables is totally determined by the other two.

(d) Find the probability that at least two of the selected chips are blue or at least two chips are white.
P({X2 ≥ 2} ∪ {X3 ≥ 2}) = 1 – P({X2 ≤ 1} ∩ {X3 ≤ 1}) =
1 – [P(X2 = 0 , X3 = 0) + P(X2 = 1 , X3 = 0) + P(X2 = 0 , X3 = 1) + P(X2 = 1 , X3 = 1)] =
1 – [C(15, 8) + C(15, 7) C(10, 1) + C(15, 7) C(5, 1) + C(15, 6) C(10, 1) C(5, 1)] / C(30, 8)
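
Numerically (a sketch reusing math.comb):

```python
from math import comb

denom = comb(30, 8)
# P(X2 <= 1 and X3 <= 1): at most one blue and at most one white chip.
p_complement = (comb(15, 8)
                + comb(15, 7) * comb(10, 1)
                + comb(15, 7) * comb(5, 1)
                + comb(15, 6) * comb(10, 1) * comb(5, 1)) / denom
print(1 - p_complement)   # about 0.94
```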

4. - continued
(e) (f) Find the conditional p.m.f. of X1 | x2 . X1 | x2 can be treated as "the number of red chips selected when 8 – x2 chips are selected at random and without replacement from the 15 red chips and 5 white chips."
For x2 = 0, 1, …, 8, X1 | x2 has a hypergeometric distribution with p.m.f.
f(x1 | x2) = C(15, x1) C(5, 8 – x2 – x1) / C(20, 8 – x2) for appropriate integers x1 .
E(X1 | x2) can be written as a linear function of x2 , since
E(X1 | x2) = (8 – x2)(15/20) = 6 – (3/4) x2 . Therefore, the least squares

line for predicting X1 from X2 must be x1 = 6 – (3/4) x2 .
E(X2 | x1) can be written as a linear function of x1 , since
E(X2 | x1) = (8 – x1)(10/15) = 16/3 – (2/3) x1 . Therefore, the least squares line for predicting X2 from X1 must be x2 = 16/3 – (2/3) x1 .
(h) Find the covariance and correlation between X1 and X2 by making use of the following facts (instead of using direct formulas):
The slope in the least squares line for predicting X1 from X2 is b = Cov(X1,X2)/Var(X2) = –3/4 .
The slope in the least squares line for predicting X2 from X1 is d = Cov(X1,X2)/Var(X1) = –2/3 .
The product of the two slopes is bd = Cov(X1,X2)² / [Var(X1)Var(X2)] , which is equal to ρ² . Here bd = (–3/4)(–2/3) = 1/2 , so ρ = –1/√2 (negative because the slopes are negative), and with Var(X1) = 8(1/2)(1/2)(22/29) = 44/29 , Cov(X1,X2) = d Var(X1) = (–2/3)(44/29) = –88/87 .
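
These slope facts can be cross-checked by enumerating the joint p.m.f. directly; the sketch below computes Cov(X1, X2) and the correlation, and compares the product of the two slopes with ρ².

```python
from fractions import Fraction as F
from math import comb, sqrt

N1, N2, N3, n = 15, 10, 5, 8
f = {(x1, x2): F(comb(N1, x1) * comb(N2, x2) * comb(N3, n - x1 - x2),
                 comb(N1 + N2 + N3, n))
     for x1 in range(n + 1) for x2 in range(n + 1)
     if 0 <= n - x1 - x2 <= N3}

E1  = sum(x1 * p for (x1, x2), p in f.items())
E2  = sum(x2 * p for (x1, x2), p in f.items())
V1  = sum(x1 * x1 * p for (x1, x2), p in f.items()) - E1 ** 2
V2  = sum(x2 * x2 * p for (x1, x2), p in f.items()) - E2 ** 2
cov = sum(x1 * x2 * p for (x1, x2), p in f.items()) - E1 * E2

print(cov)                                  # -88/87
print(float(cov) / sqrt(float(V1 * V2)))    # about -0.7071, i.e. -1/sqrt(2)
print(F(3, 4) * F(2, 3))                    # product of slope magnitudes: 1/2 = rho**2
```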

5. (a) An urn contains 15 red chips, 10 blue chips, and 5 white chips. Eight chips are selected at random and with replacement. The following random variables are defined:
X1 = number of red chips selected , X2 = number of blue chips selected , X3 = number of white chips selected .
Find the joint p.m.f. of (X1 , X2 , X3) .
(X1 , X2 , X3) have a trinomial distribution, and X3 = 8 – X1 – X2 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is
f(x1, x2) = [8! / (x1! x2! (8 – x1 – x2)!)] (1/2)^x1 (1/3)^x2 (1/6)^(8 – x1 – x2)
if x1 = 0, 1, …, 8 and x2 = 0, 1, …, 8 with x1 + x2 ≤ 8 .
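
A sketch of the trinomial p.m.f. in code (helper names ours), together with a check that summing out x2 recovers the binomial(8, 1/2) marginal claimed in part (b):

```python
from math import comb, factorial

def trinomial_pmf(x1, x2, n=8, p1=1/2, p2=1/3):
    """f(x1, x2) = n!/(x1! x2! x3!) p1^x1 p2^x2 p3^x3 with x3 = n - x1 - x2."""
    x3 = n - x1 - x2
    if min(x1, x2, x3) < 0:
        return 0.0
    coef = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coef * p1**x1 * p2**x2 * (1 - p1 - p2)**x3

# Marginal of X1: sum out x2 and compare with binomial(8, 1/2).
for x1 in range(9):
    marginal = sum(trinomial_pmf(x1, x2) for x2 in range(9))
    assert abs(marginal - comb(8, x1) * (1/2)**8) < 1e-12
print("marginal of X1 matches b(8, 1/2)")
```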

(b) Find the marginal p.m.f. for each of X1 , X2 , and X3 . Each of X1 , X2 , and X3 has a binomial distribution.
f1(x1) = [8! / (x1! (8 – x1)!)] (1/2)^8 if x1 = 0, 1, …, 8
f2(x2) = [8! / (x2! (8 – x2)!)] (1/3)^x2 (2/3)^(8 – x2) if x2 = 0, 1, …, 8

5. - continued
(c) f3(x3) = [8! / (x3! (8 – x3)!)] (1/6)^x3 (5/6)^(8 – x3) if x3 = 0, 1, …, 8
Are X1 , X2 , and X3 independent? Why or why not? X1, X2, X3 cannot possibly be independent, because any one of these random variables is totally determined by the other two.

(d) Find the probability that at least two of the selected chips are blue or at least two chips are white.
P({X2 ≥ 2} ∪ {X3 ≥ 2}) = 1 – P({X2 ≤ 1} ∩ {X3 ≤ 1}) =
1 – [P(X2 = 0 , X3 = 0) + P(X2 = 1 , X3 = 0) + P(X2 = 0 , X3 = 1) + P(X2 = 1 , X3 = 1)] =
1 – [(1/2)^8 + (8!/(7! 1!)) (1/2)^7 (1/3) + (8!/(7! 1!)) (1/2)^7 (1/6) + (8!/(6! 1! 1!)) (1/2)^6 (1/3)(1/6)]
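
The same probability comes out numerically with a short, self-contained sketch (helper name ours):

```python
from math import factorial

def tri(x1, x2, n=8, p1=1/2, p2=1/3):
    """Trinomial p.m.f. of (X1, X2) with x3 = n - x1 - x2."""
    x3 = n - x1 - x2
    c = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return c * p1**x1 * p2**x2 * (1 - p1 - p2)**x3

# Complement: at most one blue (x2 <= 1) and at most one white (x3 <= 1).
p = 1 - (tri(8, 0)      # x2 = 0, x3 = 0
         + tri(7, 1)    # x2 = 1, x3 = 0
         + tri(7, 0)    # x2 = 0, x3 = 1
         + tri(6, 1))   # x2 = 1, x3 = 1
print(p)                # about 0.916
```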

5. - continued
(e) (f) Find the conditional p.m.f. of X1 | x2 . X1 | x2 can be treated as "the number of red chips selected when 8 – x2 chips are selected at random and with replacement from the 15 red chips and 5 white chips."
For x2 = 0, 1, …, 8, X1 | x2 has a b(8 – x2 , 3/4) distribution with p.m.f.
f(x1 | x2) = C(8 – x2 , x1) (3/4)^x1 (1/4)^(8 – x2 – x1) if x1 = 0, 1, …, 8 – x2 .
E(X1 | x2) can be written as a linear function of x2 , since
E(X1 | x2) = (8 – x2)(3/4) = 6 – (3/4) x2 . Therefore, the least squares

line for predicting X1 from X2 must be x1 = 6 – (3/4) x2 .
E(X2 | x1) can be written as a linear function of x1 , since
E(X2 | x1) = (8 – x1)(2/3) = 16/3 – (2/3) x1 . Therefore, the least squares line for predicting X2 from X1 must be x2 = 16/3 – (2/3) x1 .
(h) Find the covariance and correlation between X1 and X2 by making use of the following facts (instead of using direct formulas):
The slope in the least squares line for predicting X1 from X2 is b = Cov(X1,X2)/Var(X2) = –3/4 .
The slope in the least squares line for predicting X2 from X1 is d = Cov(X1,X2)/Var(X1) = –2/3 .
The product of the two slopes is equal to ρ² = 1/2 , so ρ = –1/√2 . With Var(X1) = 8(1/2)(1/2) = 2 , Cov(X1,X2) = d Var(X1) = (–2/3)(2) = –4/3 .

6. One chip is selected from each of two urns, one containing three chips labeled distinctively with the integers 1 through 3 and the other containing two chips labeled distinctively with the integers 1 and 2. The following random variables are defined:
X = largest integer among the labels on the selected chips , Y = smallest integer among the labels on the selected chips .
The space of (X, Y) is {(1,1) (2,1) (3,1) (2,2) (3,2)}. (Note: We immediately see that X and Y cannot be independent, since the joint space is not "rectangular".)
The joint probabilities can be displayed in a table:

        x = 1   x = 2   x = 3
y = 2     0      1/6     1/6
y = 1    1/6     1/3     1/6

The joint p.m.f. of (X, Y) is
f(x, y) = 1/6 if (x, y) = (1, 1) , (3, 1) , (2, 2) , (3, 2)
          1/3 if (x, y) = (2, 1)

The marginal p.m.f. of X is f1(x) = 1/6 if x = 1 , 1/x if x = 2, 3 .
The marginal p.m.f. of Y is f2(y) = (3 – y)/3 if y = 1, 2 .
E(X) = 13/6   E(X²) = 31/6   Var(X) = 31/6 – (13/6)² = 17/36
E(Y) = 4/3   E(Y²) = 2   Var(Y) = 2 – (4/3)² = 2/9

6. - continued
Since f(x, y) ≠ f1(x) f2(y), the random variables X and Y are not independent (as we previously noted).
Using the joint p.m.f.,
E(XY) = (1)(1)(1/6) + (3)(1)(1/6) + (2)(2)(1/6) + (3)(2)(1/6) + (2)(1)(1/3) = 3
Cov(X,Y) = 3 – (13/6)(4/3) = 1/9
ρ = (1/9) / √[(17/36)(2/9)] = 2/√34
The least squares line for predicting Y from X is
y = 4/3 + (4/17)(x – 13/6) , since Cov(X,Y)/Var(X) = (1/9)/(17/36) = 4/17 .

The least squares line for predicting X from Y is
x = 13/6 + (1/2)(y – 4/3) , since Cov(X,Y)/Var(Y) = (1/9)/(2/9) = 1/2 .
The conditional p.m.f. of Y | X = 1 is h(1 | 1) = 1 .
The conditional p.m.f. of Y | X = 2 is h(1 | 2) = 2/3 , h(2 | 2) = 1/3 .
The conditional p.m.f. of Y | X = 3 is h(1 | 3) = 1/2 , h(2 | 3) = 1/2 .

6. - continued
The conditional p.m.f. of X | Y = 1 is g(1 | 1) = 1/4 , g(2 | 1) = 1/2 , g(3 | 1) = 1/4 .
The conditional p.m.f. of X | Y = 2 is g(2 | 2) = 1/2 , g(3 | 2) = 1/2 .

E(Y | X = 1) = 1   E(Y² | X = 1) = 1   Var(Y | X = 1) = 0
E(Y | X = 2) = 4/3   E(Y² | X = 2) = 2   Var(Y | X = 2) = 2/9
E(Y | X = 3) = 3/2   E(Y² | X = 3) = 5/2   Var(Y | X = 3) = 1/4

6. - continued
E(X | Y = 1) = 2   E(X² | Y = 1) = 9/2   Var(X | Y = 1) = 1/2
E(X | Y = 2) = 5/2   E(X² | Y = 2) = 13/2   Var(X | Y = 2) = 1/4

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ? No: the values 1, 4/3, 3/2 at x = 1, 2, 3 do not lie on a line.
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ? Yes: E(X | Y = y) = 3/2 + (1/2) y , which matches the least squares line for predicting X from Y.

9. Random variables X and Y have joint p.d.f.
f(x,y) = 5xy²/2 if 0 < x/2 < y < 1 .
The space of (X, Y), displayed graphically, is the triangle with vertices (0,0), (2,1), and (0,1), bounded below by the line y = x/2. (Note: We immediately see that X and Y cannot be independent, since the joint space is not "rectangular".)

Event A = {(x,y) | 1/2 < x < 1 , 1/2 < y < 3/2} displayed graphically is a rectangle that extends above the space of (X, Y); the density is zero for y > 1, and every point of A with y < 1 satisfies y > x/2 .
P(A) = ∫∫_A f(x, y) dx dy = ∫_{y=1/2}^{1} ∫_{x=1/2}^{1} (5xy²/2) dx dy
     = ∫_{1/2}^{1} [5x²y²/4]_{x=1/2}^{1} dy = ∫_{1/2}^{1} (15y²/16) dy = [5y³/16]_{y=1/2}^{1} = 35/128
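
The double integral can be checked numerically with scipy.integrate.dblquad (assuming SciPy is available); the indicator in the integrand enforces the support 0 < x/2 < y < 1.

```python
from scipy.integrate import dblquad

def f(y, x):                     # dblquad expects func(y, x)
    return 2.5 * x * y * y if x / 2 < y < 1 else 0.0

# P(1/2 < X < 1, 1/2 < Y < 3/2); the density vanishes for y >= 1.
p_a, _ = dblquad(f, 0.5, 1.0, lambda x: 0.5, lambda x: 1.5)
print(p_a)     # about 0.2734375 = 35/128
```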

9. - continued
The marginal p.d.f. of X is
f1(x) = ∫ f(x, y) dy = ∫_{x/2}^{1} (5xy²/2) dy = [5xy³/6]_{y=x/2}^{1} = (40x – 5x⁴)/48 if 0 < x < 2 .
E(X) = ∫_0^2 x (40x – 5x⁴)/48 dx = ∫_0^2 (40x² – 5x⁵)/48 dx = [5x³/18 – 5x⁶/288]_{x=0}^{2} = 10/9

E(X²) = ∫_0^2 x² (40x – 5x⁴)/48 dx = ∫_0^2 (40x³ – 5x⁶)/48 dx = [5x⁴/24 – 5x⁷/336]_{x=0}^{2} = 10/7
Var(X) = 10/7 – (10/9)² = 110/567

9. - continued
The marginal p.d.f. of Y is
f2(y) = ∫ f(x, y) dx = ∫_0^{2y} (5xy²/2) dx = [5x²y²/4]_{x=0}^{2y} = 5y⁴ if 0 < y < 1 .
E(Y) = ∫_0^1 y (5y⁴) dy = 5/6

E(Y²) = ∫_0^1 y² (5y⁴) dy = 5/7
Var(Y) = 5/7 – (5/6)² = 5/252

9. - continued
Since f(x, y) ≠ f1(x) f2(y), the random variables X and Y are not independent (as we previously noted).
E(XY) = ∫∫ xy f(x, y) dx dy = ∫_{y=0}^{1} ∫_{x=0}^{2y} (5x²y³/2) dx dy = ∫_0^1 [5x³y³/6]_{x=0}^{2y} dy = ∫_0^1 (20y⁶/3) dy = [20y⁷/21]_{y=0}^{1} = 20/21
Cov(X,Y) = 20/21 – (10/9)(5/6) = 5/189
ρ = (5/189) / √[(110/567)(5/252)] = 2/√22

The least squares line for predicting Y from X is
y = 5/6 + (3/22)(x – 10/9) , since Cov(X,Y)/Var(X) = (5/189)/(110/567) = 3/22 .
The least squares line for predicting X from Y is
x = 10/9 + (4/3)(y – 5/6) = 4y/3 , since Cov(X,Y)/Var(Y) = (5/189)/(5/252) = 4/3 .
For 0 < x < 2, the conditional p.d.f. of Y | X = x is
h(y | x) = f(x, y)/f1(x) = (5xy²/2) / [(40x – 5x⁴)/48] = 24y² / (8 – x³) if x/2 < y < 1 .

9. - continued
For 0 < x < 2, E(Y | X = x) = ∫_{x/2}^{1} y · 24y²/(8 – x³) dy = 3(16 – x⁴) / [8(8 – x³)] .
For 0 < x < 2, E(Y² | X = x) = ∫_{x/2}^{1} y² · 24y²/(8 – x³) dy = 3(32 – x⁵) / [20(8 – x³)] .
For 0 < x < 2, Var(Y | X = x) = 3(32 – x⁵)/[20(8 – x³)] – {3(16 – x⁴)/[8(8 – x³)]}² .

For 0 < y < 1, the conditional p.d.f. of X | Y = y is
g(x | y) = f(x, y)/f2(y) = (5xy²/2) / (5y⁴) = x/(2y²) if 0 < x < 2y .
For 0 < y < 1, E(X | Y = y) = ∫_0^{2y} x · x/(2y²) dx = 4y/3 .
For 0 < y < 1, E(X² | Y = y) = ∫_0^{2y} x² · x/(2y²) dx = 2y² .

9. - continued
For 0 < y < 1, Var(X | Y = y) = 2y² – (4y/3)² = 2y²/9 .
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ? No: E(Y | X = x) = 3(16 – x⁴)/[8(8 – x³)] is not a linear function of x.
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ? Yes: E(X | Y = y) = (4/3) y , which matches the least squares line for predicting X from Y.

7. Random variables X and Y have joint p.d.f.
f(x,y) = (x + y)/8 if 0 < x < 2 , 0 < y < 2 .
The space of (X, Y), displayed graphically, is the square with corners (0,0), (2,0), (2,2), and (0,2).

The set A = {(x, y) | 1/2 < x < 1 , 1/2 < y < 3/2} is the rectangle with corners (1/2, 1/2), (1, 1/2), (1, 3/2), and (1/2, 3/2). The set B = {(x, y) | x > y} is the part of the square below the line y = x.

7. - continued
P(A) = P(1/2 < X < 1 , 1/2 < Y < 3/2) = ∫∫_A f(x, y) dx dy = ∫_{y=1/2}^{3/2} ∫_{x=1/2}^{1} (x + y)/8 dx dy
     = ∫_{1/2}^{3/2} [(x² + 2xy)/16]_{x=1/2}^{1} dy = ∫_{1/2}^{3/2} [(1 + 2y)/16 – (1/4 + y)/16] dy
     = ∫_{1/2}^{3/2} (3 + 4y)/64 dy = [(3y + 2y²)/64]_{y=1/2}^{3/2} = 7/64

P(B) = P(X > Y) = f(x, y) dx dy = x > y 2 2 2 x2 x + y —— dx dy or 8 x + y —— dy dx 8 y x2 2 2 2 2x3 + x4 ——— dx = 16 5x4 + 2x5 ———— = 160 9 — 10 2xy + y2 ——— dx = 16 y = 0 x = 0

7. - continued
The marginal p.d.f. of X is
f1(x) = ∫ f(x, y) dy = ∫_0^2 (x + y)/8 dy = [(2xy + y²)/16]_{y=0}^{2} = (x + 1)/4 if 0 < x < 2 .
E(X) = 7/6   E(X²) = 5/3   Var(X) = 11/36

The marginal p.d.f. of Y is
f2(y) = ∫ f(x, y) dx = ∫_0^2 (x + y)/8 dx = (y + 1)/4 if 0 < y < 2 .
E(Y) = 7/6   E(Y²) = 5/3   Var(Y) = 11/36
Since f(x, y) ≠ f1(x) f2(y), the random variables X and Y are not independent.

7. - continued
E(XY) = ∫∫ xy f(x, y) dx dy = ∫_{y=0}^{2} ∫_{x=0}^{2} (x²y + xy²)/8 dx dy
     = ∫_0^2 [(2x³y + 3x²y²)/48]_{x=0}^{2} dy = ∫_0^2 (4y + 3y²)/12 dy = [(2y² + y³)/12]_{y=0}^{2} = 4/3
Cov(X,Y) = 4/3 – (7/6)(7/6) = –1/36
ρ = (–1/36) / √[(11/36)(11/36)] = –1/11

The least squares line for predicting Y from X is
y = 7/6 – (1/11)(x – 7/6) , since Cov(X,Y)/Var(X) = (–1/36)/(11/36) = –1/11 .
The least squares line for predicting X from Y is
x = 7/6 – (1/11)(y – 7/6) .
For 0 < x < 2, the conditional p.d.f. of Y | X = x is
h(y | x) = f(x, y)/f1(x) = [(x + y)/8] / [(x + 1)/4] = (x + y) / [2(x + 1)] if 0 < y < 2 .

7. - continued
For 0 < x < 2, E(Y | X = x) = ∫_0^2 y (x + y)/[2(x + 1)] dy = (3x + 4)/[3(x + 1)] .
For 0 < x < 2, E(Y² | X = x) = ∫_0^2 y² (x + y)/[2(x + 1)] dy = 2(2x + 3)/[3(x + 1)] .
For 0 < x < 2, Var(Y | X = x) = 2(2x + 3)/[3(x + 1)] – {(3x + 4)/[3(x + 1)]}² .

For 0 < y < 2, the conditional p.d.f. of X | Y = y is
g(x | y) = f(x, y)/f2(y) = (x + y) / [2(y + 1)] if 0 < x < 2 .
For 0 < y < 2, E(X | Y = y) = (3y + 4)/[3(y + 1)] .
For 0 < y < 2, E(X² | Y = y) = 2(2y + 3)/[3(y + 1)] .

7. - continued
For 0 < y < 2, Var(X | Y = y) = 2(2y + 3)/[3(y + 1)] – {(3y + 4)/[3(y + 1)]}² .
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ? No: E(Y | X = x) = (3x + 4)/[3(x + 1)] is not a linear function of x.
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ? No: E(X | Y = y) = (3y + 4)/[3(y + 1)] is not a linear function of y.

8. Random variables X and Y have joint p.d.f.
f(x,y) = (y – 1)/(2x²) if 1 < x , 1 < y < 3 .
The space of (X, Y), displayed graphically, is the semi-infinite strip to the right of x = 1 between y = 1 and y = 3, with corners (1,1) and (1,3).

8. - continued
Event A = {(x,y) | 1 < x < 3 , 1 < y < (x+1)/2} displayed graphically is the region between y = 1 and the line y = (x + 1)/2 for 1 < x < 3, with corners (1,1), (3,1), and (3,2).
P(A) = ∫∫_A f(x, y) dx dy = ∫_{x=1}^{3} ∫_{y=1}^{(x+1)/2} (y – 1)/(2x²) dy dx

∫_{1}^{3} ∫_{1}^{(x+1)/2} (y – 1)/(2x²) dy dx = ∫_1^3 [(y² – 2y)/(4x²)]_{y=1}^{(x+1)/2} dx
= ∫_1^3 (x – 1)²/(16x²) dx = ∫_1^3 [1/16 – 1/(8x) + 1/(16x²)] dx
= [x/16 – (ln x)/8 – 1/(16x)]_{x=1}^{3} = 3/16 – (ln 3)/8 – 1/48 = 1/6 – (ln 3)/8

8. - continued
Event B = {(x,y) | x > y} displayed graphically is the part of the strip to the right of the line y = x. Note that describing B as {1 < x < 3 , 1 < y < x} ∪ {3 < x < ∞ , 1 < y < 3} makes the integration more work than describing B as {1 < y < 3 , y < x < ∞} .
P(B) = ∫∫_{x > y} f(x, y) dx dy = ∫_{y=1}^{3} ∫_{x=y}^{∞} (y – 1)/(2x²) dx dy

∫_{1}^{3} [–(y – 1)/(2x)]_{x=y}^{∞} dy = ∫_1^3 (y – 1)/(2y) dy = [y/2 – (ln y)/2]_{y=1}^{3} = 1 – (ln 3)/2

8. - continued
The marginal p.d.f. of X is
f1(x) = ∫ f(x, y) dy = ∫_1^3 (y – 1)/(2x²) dy = [(y² – 2y)/(4x²)]_{y=1}^{3} = 1/x² if 1 < x .
E(X) = ∫_1^∞ x (1/x²) dx = ∫_1^∞ (1/x) dx = [ln x]_{x=1}^{∞} = ∞ , so E(X) does not exist (the integral diverges).

E(X²) = ∫_1^∞ x² (1/x²) dx = ∫_1^∞ 1 dx = ∞
Var(X) likewise does not exist, since E(X) and E(X²) are both infinite.
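
The divergence can be confirmed symbolically with SymPy (assuming it is installed): the marginal integrates to 1, but the first and second moment integrals both come out infinite.

```python
import sympy as sp

x = sp.symbols("x", positive=True)
f1 = 1 / x**2                                    # marginal p.d.f. of X on x > 1

print(sp.integrate(f1, (x, 1, sp.oo)))           # 1 (a valid p.d.f.)
print(sp.integrate(x * f1, (x, 1, sp.oo)))       # oo: E(X) diverges
print(sp.integrate(x**2 * f1, (x, 1, sp.oo)))    # oo: E(X**2) diverges
```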

8. - continued
The marginal p.d.f. of Y is
f2(y) = ∫ f(x, y) dx = ∫_1^∞ (y – 1)/(2x²) dx = [–(y – 1)/(2x)]_{x=1}^{∞} = (y – 1)/2 if 1 < y < 3 .
E(Y) = ∫_1^3 y (y – 1)/2 dy = ∫_1^3 (y² – y)/2 dy = [y³/6 – y²/4]_{y=1}^{3} = 7/3

E(Y²) = ∫_1^3 y² (y – 1)/2 dy = ∫_1^3 (y³ – y²)/2 dy = [y⁴/8 – y³/6]_{y=1}^{3} = 17/3
Var(Y) = 17/3 – (7/3)² = 2/9

8. - continued
Since f(x, y) = f1(x) f2(y), the random variables X and Y are independent.
E(XY) = ∫_1^3 ∫_1^∞ xy (y – 1)/(2x²) dx dy = [∫_1^∞ (1/x) dx] [∫_1^3 (y² – y)/2 dy] = (∞)(7/3) = ∞
Cov(X,Y) = E(XY) – E(X)E(Y) does not exist, since E(X) and E(XY) are infinite, and consequently ρ does not exist either.

The least squares line for predicting Y from X is y = 7/3 (take b = 0: any b ≠ 0 makes E{[Y – (a + bX)]²} infinite, since E(X²) = ∞). No least squares line for predicting X from Y exists, since E{[X – (c + dY)]²} is infinite for every choice of c and d.
For 1 < x , the conditional p.d.f. of Y | X = x is h(y | x) = f(x, y)/f1(x) = f2(y) = (y – 1)/2 if 1 < y < 3 , by independence.
For 1 < y < 3, the conditional p.d.f. of X | Y = y is g(x | y) = f1(x) = 1/x² if 1 < x .
For 1 < x , E(Y | X = x) = E(Y) = 7/3 and Var(Y | X = x) = Var(Y) = 2/9 .
For 1 < y < 3, E(X | Y = y) and Var(X | Y = y) do not exist, since the defining integrals diverge.