
1 Multiple Discrete Random Variables

2 Introduction Consider the choice of a student at random from a population. We wish to know the student’s height, weight, blood pressure, pulse rate, etc. The mapping from the sample space of students to measurements of height and weight would be $H(s_i) = h_i$, $W(s_i) = w_i$ for the student selected. The table is a two-dimensional array that lists the probabilities $P[H = h_i \text{ and } W = w_j]$.

3 Introduction The information can also be displayed in a three-dimensional format. These probabilities are termed joint probabilities. We will study dependencies between multiple RVs, for example: “Can we predict a person’s height from his weight?” The height and weight can be represented as a 2 × 1 random vector.

4 Jointly Distributed RVs Consider two discrete RVs X and Y. They represent the functions that map an outcome $s_i$ of an experiment to a value in the plane: $(X(s_i), Y(s_i)) = (x_i, y_i)$ for all $s_i \in S$. As an example, consider an experiment that consists of the simultaneous tossing of a penny and a nickel. Two random variables that are defined on the same sample space S are said to be jointly distributed.

5 Jointly Distributed RVs For the two-coin experiment, four vectors comprise the sample space $S_{X,Y}$. The values of the random vector (multiple random variables) are denoted either by $(x, y)$, a point in the plane, or by $[x\ y]^T$, a 2D vector. The size of the sample space for discrete RVs can be finite or countably infinite. If X can take on 2 values ($N_X = 2$) and Y can take on 2 values ($N_Y = 2$), the total number of elements in $S_{X,Y}$ is $N_X N_Y = 4$.

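6 Jointly Distributed RVs Generally, if $S_X = \{x_1, x_2, \ldots, x_{N_X}\}$ and $S_Y = \{y_1, y_2, \ldots, y_{N_Y}\}$, then the random vector can take on values in $S_{X,Y} = S_X \times S_Y = \{(x_i, y_j): i = 1, \ldots, N_X;\ j = 1, \ldots, N_Y\}$. The notation $A \times B$ denotes a Cartesian product set. The joint PMF (bivariate PMF) is defined as $p_{X,Y}[x_i, y_j] = P[X = x_i, Y = y_j]$ for $i = 1, \ldots, N_X$; $j = 1, \ldots, N_Y$.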

7 Properties of joint PMF Property 1. Range of values of joint PMF: $0 \le p_{X,Y}[x_i, y_j] \le 1$. Property 2. Sum of values of joint PMF: $\sum_{i=1}^{N_X} \sum_{j=1}^{N_Y} p_{X,Y}[x_i, y_j] = 1$, and similarly for a countably infinite sample space. For two fair coins that do not interact as they are tossed we might assign $p_{X,Y}[i,j] = 1/4$.

8 The procedure to determine the joint PMF from the probabilities defined on S The procedure depends on whether the RV mapping is one-to-one or many-to-one. For a one-to-one mapping from S to $S_{X,Y}$ we have $p_{X,Y}[x_i, y_j] = P[\{s_k\}]$, where it is assumed that $s_k$ is the only solution to $X(s) = x_i$ and $Y(s) = y_j$. For a many-to-one transformation the joint PMF is found as $p_{X,Y}[x_i, y_j] = \sum_{\{k:\, X(s_k) = x_i,\ Y(s_k) = y_j\}} P[\{s_k\}]$.

9 Two dice toss with different colored dice A red die and a blue die are tossed. The die that yields the larger number of dots is chosen. If both dice display the same number of dots, the red die is chosen. The numerical outcome of the experiment is defined to be 0 if the blue die is chosen and 1 if the red die is chosen, along with its corresponding number of dots. What is $p_{X,Y}[1,3]$, for example?

10 Two dice toss with different colored dice To determine the desired value of the PMF, we assume that each outcome in S is equally likely and therefore has probability 1/36. Since three outcomes, (red, blue) = (3,1), (3,2), (3,3), map into (1,3), we have $p_{X,Y}[1,3] = 3/36 = 1/12$. In general, we can use the joint PMF to find the probability of an event A defined on $S_{X,Y} = S_X \times S_Y$: $P[(X,Y) \in A] = \sum\sum_{\{(i,j):\,(x_i, y_j) \in A\}} p_{X,Y}[x_i, y_j]$.
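The many-to-one procedure for the two-dice experiment can be checked by brute-force enumeration. The sketch below assumes the encoding described on the slides (X = 1 if the red die is chosen, X = 0 if the blue die is chosen, Y = number of dots on the chosen die); the function name `dice_joint_pmf` is illustrative only.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def dice_joint_pmf():
    """Joint PMF of (X, Y) for the red/blue dice experiment, built by enumeration."""
    pmf = defaultdict(Fraction)  # missing keys start at probability 0
    for red, blue in product(range(1, 7), repeat=2):
        # The die with more dots is chosen; ties go to the red die.
        if red >= blue:
            x, y = 1, red
        else:
            x, y = 0, blue
        # Many-to-one mapping: add up the equally likely outcomes (1/36 each).
        pmf[(x, y)] += Fraction(1, 36)
    return dict(pmf)

pmf = dice_joint_pmf()
print(pmf[(1, 3)])        # 1/12
print(sum(pmf.values()))  # 1
```

Note that the point (0, 1) never occurs, since the blue die cannot be chosen while showing a single dot.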

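11 Marginal PMFs and CDFs If $p_{X,Y}[x_i, y_j]$ is known, then the marginal probabilities $p_X[x_i]$ and $p_Y[y_j]$ can be determined. Consider an event of interest A on a countably infinite sample space. Let $A = \{x_k\} \times S_Y$. Then $P[(X,Y) \in A] = P[X = x_k]$, and since only the terms with $i = k$ contribute, $p_X[x_k] = \sum_{j=1}^{\infty} p_{X,Y}[x_k, y_j]$. Similarly, with $j = k$ only, $p_Y[y_k] = \sum_{i=1}^{\infty} p_{X,Y}[x_i, y_k]$.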

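12 Example: Two coin toss A penny (RV X) and a nickel (RV Y) are tossed and the outcomes are mapped into a 1 for a head and a 0 for a tail. Given the joint PMF $p_{X,Y}[i,j]$, the marginal PMFs are $p_X[i] = \sum_{j=0}^{1} p_{X,Y}[i,j]$ and $p_Y[j] = \sum_{i=0}^{1} p_{X,Y}[i,j]$, and each marginal PMF sums to 1.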

13 Joint PMF cannot be determined from marginal PMFs It is not possible in general to obtain the joint PMF from the marginal PMFs. Consider a different joint PMF whose marginal PMFs are the same as the ones before. There are an infinite number of joint PMFs that have the same marginal PMFs. In short: joint PMF ⇒ marginal PMFs, but marginal PMFs ⇏ joint PMF.
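A short numerical sketch of this point follows. The first table is an assumption inferred from values quoted elsewhere in these slides ($p_{X,Y}[0,0] = 1/8$, $p_X[0] = 1/4$, $p_Y[0] = 3/8$); the second table is simply the product of the marginals, chosen here for illustration and not taken from the slides.

```python
from fractions import Fraction as F

def marginals(joint):
    """Return (p_X, p_Y) by summing a joint PMF {(i, j): prob} over the other index."""
    p_x, p_y = {}, {}
    for (i, j), p in joint.items():
        p_x[i] = p_x.get(i, 0) + p
        p_y[j] = p_y.get(j, 0) + p
    return p_x, p_y

# Assumed two-coin joint PMF (inferred from values quoted in the slides).
joint_a = {(0, 0): F(1, 8), (0, 1): F(1, 8), (1, 0): F(1, 4), (1, 1): F(1, 2)}
p_x, p_y = marginals(joint_a)

# A different joint PMF with the same marginals: the product of the marginals.
joint_b = {(i, j): p_x[i] * p_y[j] for i in p_x for j in p_y}

print(marginals(joint_b) == (p_x, p_y))  # True: identical marginal PMFs
print(joint_a == joint_b)                # False: different joint PMFs
```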

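14 Joint cumulative distribution function A joint cumulative distribution function (CDF) can be defined for a random vector as $F_{X,Y}(x, y) = P[X \le x, Y \le y]$ and can be found explicitly by summing the joint PMF as $F_{X,Y}(x, y) = \sum_{\{i:\, x_i \le x\}} \sum_{\{j:\, y_j \le y\}} p_{X,Y}[x_i, y_j]$. The PMF can be recovered as $p_{X,Y}[x_i, y_j] = F_{X,Y}(x_i^+, y_j^+) - F_{X,Y}(x_i^+, y_j^-) - F_{X,Y}(x_i^-, y_j^+) + F_{X,Y}(x_i^-, y_j^-)$.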

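15 Properties of Cumulative distribution functions The marginal CDFs can be easily found from the joint CDF as $F_X(x) = F_{X,Y}(x, +\infty)$ and $F_Y(y) = F_{X,Y}(+\infty, y)$. Property 1. Range of values: $0 \le F_{X,Y}(x, y) \le 1$. Property 2. Values at “endpoints”: $F_{X,Y}(-\infty, -\infty) = 0$ and $F_{X,Y}(+\infty, +\infty) = 1$. Property 3. Monotonically increasing: $F_{X,Y}(x, y)$ monotonically increases as x and/or y increases. Property 4. “Right” continuous: the joint CDF takes the value after the jump.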
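The CDF definition and the PMF recovery can be sketched for integer-valued RVs, where the limits from the left and right reduce to evaluating the CDF at neighboring integers. The joint PMF used here is an assumed two-coin table (inferred from values quoted in these slides), and the function names are illustrative.

```python
from fractions import Fraction as F

# Assumed joint PMF for illustration.
joint = {(0, 0): F(1, 8), (0, 1): F(1, 8), (1, 0): F(1, 4), (1, 1): F(1, 2)}

def joint_cdf(x, y):
    """F_{X,Y}(x, y) = P[X <= x, Y <= y], summing the PMF over the lower-left quadrant."""
    return sum(p for (i, j), p in joint.items() if i <= x and j <= y)

def pmf_from_cdf(i, j):
    """Recover p[i, j] from the CDF (valid because X and Y are integer valued)."""
    return (joint_cdf(i, j) - joint_cdf(i - 1, j)
            - joint_cdf(i, j - 1) + joint_cdf(i - 1, j - 1))

print(joint_cdf(0, 1))     # 1/4, which is also the marginal CDF F_X(0)
print(pmf_from_cdf(1, 1))  # 1/2
```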

16 Independence of Multiple RV Consider the experiment of tossing a coin and then a die. The outcome of the coin is $X \in \{0, 1\}$ and the outcome of the die is $Y \in \{1, 2, 3, 4, 5, 6\}$. The probability of the random vector (X,Y) taking on a value $Y = y_j$ does not depend on $X = x_i$. X and Y are independent random variables if all the joint events on $S_{X,Y}$ are independent, so the probability of joint events may be reduced to probabilities of “marginal events”: $P[X \in A, Y \in B] = P[X \in A]\,P[Y \in B]$. If $A = \{x_i\}$ and $B = \{y_j\}$, then $p_{X,Y}[x_i, y_j] = p_X[x_i]\,p_Y[y_j]$, i.e. the joint PMF factors when X and Y are independent.

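17 Independence of Multiple RV The converse is also true: if the joint PMF factors, then X and Y are independent. Example: Two coin toss – independence Assume we toss a penny and a nickel. If all outcomes are equally likely, or equivalently the joint PMF is given by $p_{X,Y}[i,j] = 1/4$ for $i, j \in \{0, 1\}$, then each marginal probability is 1/2 and $p_{X,Y}[i,j] = 1/4 = (1/2)(1/2) = p_X[i]\,p_Y[j]$, so X and Y are independent.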

18 Independence of Multiple RV Example: Two coin toss – dependence Consider the same experiment but with a joint PMF given by $p_{X,Y}[0,0] = 1/8$, $p_{X,Y}[0,1] = 1/8$, $p_{X,Y}[1,0] = 1/4$, $p_{X,Y}[1,1] = 1/2$. Then $p_{X,Y}[0,0] = 1/8 \ne (1/4)(3/8) = p_X[0]\,p_Y[0]$ and hence X and Y cannot be independent. If two random variables are not independent, they are said to be dependent.

19 Independence of Multiple RV Example: Two coin toss – dependent but fair coins Consider the same experiment again but with the joint PMF given by $p_{X,Y}[0,0] = 3/8$, $p_{X,Y}[0,1] = 1/8$, $p_{X,Y}[1,0] = 1/8$, $p_{X,Y}[1,1] = 3/8$. Since $p_{X,Y}[0,0] = 3/8 \ne (1/2)(1/2) = p_X[0]\,p_Y[0]$, X and Y are dependent. By examining only the marginal PMFs we see P[heads] = 1/2 for each coin (both appear fair), and we might conclude that the RVs are independent. This is incorrect. If the RVs are independent, the joint CDF factors as well: $F_{X,Y}(x, y) = F_X(x)\,F_Y(y)$.
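The factorization test for independence can be sketched as follows. The three tables are assumptions: the fair-coin table uses $p_{X,Y}[i,j] = 1/4$, and the other two are the unique 2 × 2 tables consistent with the values quoted in the two dependence examples ($p_{X,Y}[0,0] = 1/8$ with $p_X[0] = 1/4$, $p_Y[0] = 3/8$; and $p_{X,Y}[0,0] = 3/8$ with fair marginals). The function name `is_independent` is illustrative.

```python
from fractions import Fraction as F

def is_independent(joint):
    """True if p[i, j] == p_X[i] * p_Y[j] at every point of S_X x S_Y."""
    xs = {i for i, _ in joint}
    ys = {j for _, j in joint}
    p_x = {i: sum(joint.get((i, j), 0) for j in ys) for i in xs}
    p_y = {j: sum(joint.get((i, j), 0) for i in xs) for j in ys}
    return all(joint.get((i, j), 0) == p_x[i] * p_y[j] for i in xs for j in ys)

fair_independent = {(i, j): F(1, 4) for i in (0, 1) for j in (0, 1)}
biased_dependent = {(0, 0): F(1, 8), (0, 1): F(1, 8), (1, 0): F(1, 4), (1, 1): F(1, 2)}
fair_dependent = {(0, 0): F(3, 8), (0, 1): F(1, 8), (1, 0): F(1, 8), (1, 1): F(3, 8)}

print(is_independent(fair_independent))  # True
print(is_independent(biased_dependent))  # False
print(is_independent(fair_dependent))    # False, even though both marginals are fair
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.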

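20 Transformations of Multiple Random Variables The PMF of Y = g(X), if the PMF of X is known, is given by $p_Y[y_i] = \sum_{\{j:\, g(x_j) = y_i\}} p_X[x_j]$. In the case of two discrete RVs X and Y that are transformed into W = g(X, Y) and Z = h(X, Y), we have $p_{W,Z}[w_i, z_j] = \sum\sum_{\{(k,l):\, g(x_k, y_l) = w_i,\ h(x_k, y_l) = z_j\}} p_{X,Y}[x_k, y_l]$. Sometimes we wish to determine the PMF of Z = h(X, Y) only. Then we can use the auxiliary RV W = X, so that $p_Z$ is a marginal PMF and can be found from the formula above as $p_Z[z_j] = \sum_{\{i:\, w_i \in S_W\}} p_{W,Z}[w_i, z_j]$.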

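21 Example: Independent Poisson RVs Assume that the joint PMF is given as the product of the marginal PMFs, and each PMF is a Poisson PMF: $p_{X,Y}[k,l] = \exp[-(\lambda_X + \lambda_Y)]\,\frac{\lambda_X^k \lambda_Y^l}{k!\,l!}$ for $k, l = 0, 1, \ldots$. Consider the transformation W = X, Z = X + Y. We need to determine all (k,l) so that $g(x_k, y_l) = w_i$ and $h(x_k, y_l) = z_j$. But $x_k, y_l$ and $w_i, z_j$ can be replaced by k, l and i, j, each taking the values 0, 1, ….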

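22 Example: Independent Poisson RVs Applying the given transformation we get $i = k$ and $j = k + l$. Solving for (k, l) for the given (i, j), we have $k = i$ and $l = j - i$. We must have $l \ge 0$, so that $l = j - i \ge 0$. Hence $p_{W,Z}[i,j] = p_{X,Y}[i, j - i]\,u[i]\,u[j - i] = \exp[-(\lambda_X + \lambda_Y)]\,\frac{\lambda_X^i \lambda_Y^{j-i}}{i!\,(j-i)!}\,u[i]\,u[j - i]$, where $u[n]$ is the discrete unit step ($u[n] = 1$ for $n \ge 0$ and 0 otherwise).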

23 Use the discrete unit step sequence to avoid mistakes The discrete unit step sequence was introduced to designate the region of the w-z plane over which $p_{W,Z}[i,j]$ is nonzero. The transformation will generally change the region over which the new joint PMF is nonzero. A common mistake is to disregard this region and assert that the joint PMF is nonzero over i = 0,1,…; j = 0,1,…. To avoid such errors, unit steps are applied.

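24 Example: Independent Poisson RVs To find the PMF of Z = X + Y from the joint PMF obtained earlier, we set W = X so that $S_W = S_X = \{0, 1, \ldots\}$ and $p_Z[j] = \sum_{i=0}^{\infty} p_{W,Z}[i,j] = \sum_{i=0}^{j} \exp[-(\lambda_X + \lambda_Y)]\,\frac{\lambda_X^i \lambda_Y^{j-i}}{i!\,(j-i)!}$. Since $u[i] = 1$ for i = 0,1,…, $u[j - i] = 1$ for i = 0,1,…,j, and $u[j - i] = 0$ for i > j, we drop the $u[i]\,u[j - i]$ multipliers and sum only up to i = j. By the binomial theorem the sum equals $\exp[-(\lambda_X + \lambda_Y)]\,\frac{(\lambda_X + \lambda_Y)^j}{j!}$, so Z is Poisson with parameter $\lambda_X + \lambda_Y$. Note that Z can take on the values j = 0,1,… since Z = X + Y.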
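As a numerical sanity check of this result, the finite sum over i = 0,…,j can be compared with a Poisson PMF whose parameter is the sum of the two rates. This is only a sketch: the rate values below are arbitrary choices, and `lam_x`, `lam_y` are names introduced here for the two Poisson parameters.

```python
import math

def poisson_pmf(k, lam):
    """Poisson PMF: exp(-lam) * lam**k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

def sum_pmf(j, lam_x, lam_y):
    """p_Z[j] = sum_{i=0}^{j} p_X[i] * p_Y[j - i] for independent Poisson X and Y."""
    return sum(poisson_pmf(i, lam_x) * poisson_pmf(j - i, lam_y) for i in range(j + 1))

lam_x, lam_y = 1.5, 2.0  # arbitrary illustrative rates
for j in range(6):
    direct = sum_pmf(j, lam_x, lam_y)
    closed_form = poisson_pmf(j, lam_x + lam_y)
    print(j, math.isclose(direct, closed_form))  # True for every j
```

The upper limit `j + 1` in `range` plays the role of the unit step $u[j - i]$: terms with i > j are never included.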

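25 Connection to characteristic function Generally, the formula for the PMF of the sum Z = X + Y of any two integer-valued discrete RVs X and Y, dependent or independent, is given by $p_Z[j] = \sum_{i=-\infty}^{\infty} p_{X,Y}[i, j - i]$. If the RVs are independent, then since the joint PMF must factor, we have the result $p_Z[j] = \sum_{i=-\infty}^{\infty} p_X[i]\,p_Y[j - i]$. This summation is a discrete convolution. Taking the Fourier transform (defined with a +j) of both sides produces $\phi_Z(\omega) = \phi_X(\omega)\,\phi_Y(\omega)$.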

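26 Example: Independent Poisson RVs using CF approach We showed that if X ~ Pois(λ), then $\phi_X(\omega) = \exp[\lambda(\exp(j\omega) - 1)]$. Thus, for independent $X \sim \mathrm{Pois}(\lambda_X)$ and $Y \sim \mathrm{Pois}(\lambda_Y)$, using the above Fourier property we have $p_Z[k] = \mathcal{F}^{-1}\{\exp[\lambda_X(\exp(j\omega) - 1)]\,\exp[\lambda_Y(\exp(j\omega) - 1)]\} = \mathcal{F}^{-1}\{\exp[(\lambda_X + \lambda_Y)(\exp(j\omega) - 1)]\}$. But the CF in the braces is that of a Poisson RV and corresponds to $Z \sim \mathrm{Pois}(\lambda_X + \lambda_Y)$. The use of the CF for determining the PMF of a sum of independent RVs has considerably simplified the derivation. In summary, if X and Y are independent RVs with integer values, then the PMF of Z = X + Y is given by $p_Z[k] = \mathcal{F}^{-1}\{\phi_X(\omega)\,\phi_Y(\omega)\} = \int_{-\pi}^{\pi} \phi_X(\omega)\,\phi_Y(\omega)\,\exp(-j\omega k)\,\frac{d\omega}{2\pi}$.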

27 Transformation of a finite sample space It is possible to obtain the PMF of Z = g(X,Y) by a direct calculation if the sample space $S_{X,Y}$ is finite. Rather than first obtaining the transformed joint PMF $p_{W,Z}$, we 1. determine the finite sample space $S_Z$; 2. determine which sample points $(x_i, y_j)$ in $S_{X,Y}$ map into each $z_k \in S_Z$; 3. sum the probabilities of those $(x_i, y_j)$ sample points to yield $p_Z[z_k]$. Mathematically this is equivalent to $p_Z[z_k] = \sum\sum_{\{(i,j):\, g(x_i, y_j) = z_k\}} p_{X,Y}[x_i, y_j]$.

28 Direct computation of PMF for transformed RV Consider the transformation of the RV (X,Y) into the scalar RV $Z = X^2 + Y^2$, where the joint PMF $p_{X,Y}[i,j]$ is given. To find the PMF for Z, first note that (X,Y) takes on the values (i,j) = (0,0), (1,0), (0,1), (1,1). Therefore, Z must take on the values $z_k = i^2 + j^2 = 0, 1, 2$. Then $p_Z[0] = p_{X,Y}[0,0]$, $p_Z[1] = p_{X,Y}[1,0] + p_{X,Y}[0,1]$, and $p_Z[2] = p_{X,Y}[1,1]$.
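The direct computation (find the values of Z, find which points map to each value, sum their probabilities) can be sketched in a few lines. The joint PMF values below are hypothetical, chosen only to make the example concrete; `transform_pmf` is an illustrative name.

```python
from collections import defaultdict
from fractions import Fraction as F

def transform_pmf(joint, g):
    """PMF of Z = g(X, Y): sum p[x, y] over all points (x, y) that map to each z."""
    pmf = defaultdict(F)  # missing keys start at probability 0
    for (x, y), p in joint.items():
        pmf[g(x, y)] += p
    return dict(pmf)

# Hypothetical joint PMF on the four points (0,0), (1,0), (0,1), (1,1).
joint = {(0, 0): F(3, 8), (1, 0): F(1, 8), (0, 1): F(1, 8), (1, 1): F(3, 8)}

p_z = transform_pmf(joint, lambda x, y: x**2 + y**2)
for z in sorted(p_z):
    print(z, p_z[z])  # 0 3/8, then 1 1/4, then 2 3/8
```

The same function works for any finite sample space and any transformation g, since it never needs the intermediate joint PMF of an auxiliary RV.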

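29 Expected Values If Z = g(X, Y), then by definition its expected value is $E_Z[Z] = \sum_i z_i\,p_Z[z_i]$, or, using a more direct approach, $E_{X,Y}[g(X,Y)] = \sum_i \sum_j g(x_i, y_j)\,p_{X,Y}[x_i, y_j]$. Example: expected value of a sum of random variables. For Z = g(X,Y) = X + Y, $E_{X,Y}[X + Y] = E_X[X] + E_Y[Y]$.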

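30 Expected value of a product of RV If Z = g(X, Y) = XY, then $E_{X,Y}[XY] = \sum_i \sum_j x_i y_j\,p_{X,Y}[x_i, y_j]$. If X and Y are independent, then since the joint PMF factors, we have $E_{X,Y}[XY] = E_X[X]\,E_Y[Y]$. More generally, $E_{X,Y}[g(X)h(Y)] = E_X[g(X)]\,E_Y[h(Y)]$.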

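31 Variance of a sum of RVs Consider the calculation of var(X + Y). Letting $Z = g(X,Y) = ((X + Y) - E_{X,Y}[X + Y])^2$, we have $\mathrm{var}(X + Y) = E_{X,Y}[Z] = \mathrm{var}(X) + \mathrm{var}(Y) + 2E_{X,Y}[(X - E_X[X])(Y - E_Y[Y])]$. The expectation in the last term is called the covariance and is defined as $\mathrm{cov}(X,Y) = E_{X,Y}[(X - E_X[X])(Y - E_Y[Y])]$, or alternatively $\mathrm{cov}(X,Y) = E_{X,Y}[XY] - E_X[X]\,E_Y[Y]$.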
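The variance-of-a-sum identity and the two equivalent covariance formulas can be verified exactly on a small joint PMF. The table below is an assumption (the two-coin table inferred from values quoted in these slides), and `expect` is an illustrative helper name.

```python
from fractions import Fraction as F

def expect(joint, g):
    """E[g(X, Y)] = sum over all points of g(x, y) * p[x, y]."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

# Assumed joint PMF for illustration.
joint = {(0, 0): F(1, 8), (0, 1): F(1, 8), (1, 0): F(1, 4), (1, 1): F(1, 2)}

ex = expect(joint, lambda x, y: x)
ey = expect(joint, lambda x, y: y)
var_x = expect(joint, lambda x, y: (x - ex) ** 2)
var_y = expect(joint, lambda x, y: (y - ey) ** 2)
cov = expect(joint, lambda x, y: (x - ex) * (y - ey))
cov_alt = expect(joint, lambda x, y: x * y) - ex * ey
var_sum = expect(joint, lambda x, y: (x + y - (ex + ey)) ** 2)

print(cov, cov_alt)                        # 1/32 1/32
print(var_sum, var_x + var_y + 2 * cov)    # 31/64 31/64
```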

32 Joint moments The question of interest: if the outcome of one RV is a given value, what can we say about the outcome of the other RV? For example, there is clearly a relationship between height and weight.

33 Joint moments To quantify these relationships we form the product XY, which can take on the values +1, −1, and ±1 for the joint PMF above. To determine the value of XY on the average we define the joint moment as $E_{X,Y}[XY]$. For the case (a)

34 Joint moments In the previous example $E_X[X] = E_Y[Y] = 0$. If the means are not zero, the joint moments will depend on the values of the means. To nullify this effect it is convenient to use the joint central moments, $E_{X,Y}[(X - E_X[X])(Y - E_Y[Y])]$. That will produce the desired +1 for the joint PMF above.

35 Independence implies zero covariance but zero covariance does not imply independence If X and Y are independent, the covariance is zero since $E_{X,Y}[XY] = E_X[X]\,E_Y[Y]$. The converse does not hold. Consider the joint PMF which assigns equal probability 1/4 to each of four points, for which $E_{X,Y}[XY] = E_X[X]\,E_Y[Y]$ and thus $\mathrm{cov}(X,Y) = 0$. However, X and Y are dependent because $p_{X,Y}[1,0] = 1/4$ but $p_X[1]\,p_Y[0] = (1/4)(1/2) = 1/8$.
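A concrete check of "uncorrelated but dependent" is sketched below. The four equally likely points are an assumption: (1,0), (−1,0), (0,1), (0,−1) is one configuration consistent with the values quoted on the slide ($p_{X,Y}[1,0] = 1/4$, $p_X[1] = 1/4$, $p_Y[0] = 1/2$).

```python
from fractions import Fraction as F

# Assumed configuration: four points, each with probability 1/4.
joint = {(1, 0): F(1, 4), (-1, 0): F(1, 4), (0, 1): F(1, 4), (0, -1): F(1, 4)}

ex = sum(x * p for (x, y), p in joint.items())
ey = sum(y * p for (x, y), p in joint.items())
exy = sum(x * y * p for (x, y), p in joint.items())
cov = exy - ex * ey

# Marginal probabilities needed for the independence check at the point (1, 0).
p_x1 = sum(p for (x, y), p in joint.items() if x == 1)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)

print(cov)                         # 0: X and Y are uncorrelated
print(joint[(1, 0)], p_x1 * p_y0)  # 1/4 versus 1/8: the joint PMF does not factor
```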

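36 The joint k-l th moment More generally, the joint k-l th moment is defined as $E_{X,Y}[X^k Y^l] = \sum_i \sum_j x_i^k y_j^l\,p_{X,Y}[x_i, y_j]$ for k = 1,2,…; l = 1,2,…, when it exists. The joint k-l th central moment is defined as $E_{X,Y}[(X - E_X[X])^k (Y - E_Y[Y])^l] = \sum_i \sum_j (x_i - E_X[X])^k (y_j - E_Y[Y])^l\,p_{X,Y}[x_i, y_j]$ for k = 1,2,…; l = 1,2,…, when it exists.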

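37 Prediction of a RV outcome The covariance between two RVs is useful for predicting Y based on knowledge of the outcome of X. We seek a predictor $\hat{Y}$ that is linear in X, or $\hat{Y} = aX + b$. The constants a and b are to be chosen so that “on the average” the observed value of $aX + b$ is close to the observed value of Y, i.e. so that the mean square error $\mathrm{mse}(a, b) = E_{X,Y}[(Y - (aX + b))^2]$ is minimized. The solution is given by $a_{opt} = \mathrm{cov}(X,Y)/\mathrm{var}(X)$ and $b_{opt} = E_Y[Y] - a_{opt}\,E_X[X]$.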

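38 Prediction of a RV outcome The optimal linear prediction of Y given the outcome X = x is $\hat{Y} = E_Y[Y] + \frac{\mathrm{cov}(X,Y)}{\mathrm{var}(X)}\,(x - E_X[X])$. Example: predicting one RV outcome from knowledge of a second RV outcome. The means and variances are found from the marginal PMFs, and the resulting predictor is called the regression line.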
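A minimal sketch of the optimal linear predictor, assuming the mean-square-error criterion and the standard solution (slope = cov(X,Y)/var(X), intercept = E[Y] − slope · E[X]). The joint PMF is an assumed two-coin table, not the example used on the slide, and `linear_predictor` is an illustrative name.

```python
from fractions import Fraction as F

def linear_predictor(joint):
    """Return (a, b) minimizing E[(Y - (a X + b))^2] for a joint PMF {(x, y): prob}."""
    ex = sum(x * p for (x, y), p in joint.items())
    ey = sum(y * p for (x, y), p in joint.items())
    var_x = sum((x - ex) ** 2 * p for (x, y), p in joint.items())
    cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())
    a = cov / var_x    # slope
    b = ey - a * ex    # intercept
    return a, b

# Assumed joint PMF for illustration.
joint = {(0, 0): F(1, 8), (0, 1): F(1, 8), (1, 0): F(1, 4), (1, 1): F(1, 2)}
a, b = linear_predictor(joint)
print(a, b)       # 1/6 1/2
print(a * 1 + b)  # 2/3, the prediction of Y when X = 1
```

Because X takes only two values here, the regression line passes through the conditional means of Y at X = 0 and X = 1; with more than two values of X this is generally not the case.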

39 Prediction of a RV outcome We could also have predicted X from Y = y by interchanging X and Y. If cov(X,Y) = 0, then $\hat{Y} = E_Y[Y]$, so X = x provides no information for linearly predicting Y. This is the case, for example, when X and Y are independent. However, if the covariance is zero, the RVs can still be dependent.

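40 A standardized random variable A standardized RV is defined to be $X_s = (X - E_X[X])/\sqrt{\mathrm{var}(X)}$, for which the mean is zero and the variance is one. Example: if X ~ Pois(λ), then $X_s = (X - \lambda)/\sqrt{\lambda}$. Let’s find the best linear prediction of the standardized $Y_s$ based on the standardized outcome $X_s = x_s$. Since $E[X_s] = E[Y_s] = 0$ and $\mathrm{var}(X_s) = 1$, the optimal slope is $\mathrm{cov}(X_s, Y_s)$ and the optimal intercept is 0, and therefore $\hat{Y}_s = \mathrm{cov}(X_s, Y_s)\,x_s = \frac{\mathrm{cov}(X,Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}}\,x_s$.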

41 Previous example continued For the previous example we have then and so

42 Correlation coefficient The factor that scales $x_s$ to produce $\hat{Y}_s$ is called the correlation coefficient (CC): $\rho_{X,Y} = \frac{\mathrm{cov}(X,Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}}$. When X and Y have $\rho_{X,Y} \ne 0$, then X and Y are said to be correlated. If the covariance is zero and hence $\rho_{X,Y} = 0$, then the RVs are said to be uncorrelated.

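43 Property Property: The correlation coefficient is always less than or equal to one in magnitude, or $|\rho_{X,Y}| \le 1$. Proof: For RVs V and W the Cauchy-Schwarz inequality says that $|E_{V,W}[VW]| \le \sqrt{E_V[V^2]}\,\sqrt{E_W[W^2]}$, with equality if and only if W = cV for c a constant. Thus, letting $V = X - E_X[X]$ and $W = Y - E_Y[Y]$, we have $|\mathrm{cov}(X,Y)| \le \sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}$, i.e. $|\rho_{X,Y}| \le 1$. Equality will hold if and only if W = cV, or equivalently if $Y - E_Y[Y] = c(X - E_X[X])$, which is easily shown to imply that $\rho_{X,Y} = 1$ if $c > 0$ and $\rho_{X,Y} = -1$ if $c < 0$.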
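The bound and the equality case can be illustrated numerically. Both joint PMFs below are assumptions made for illustration: a two-coin table with weak dependence, and a degenerate case in which Y = 2X + 1 exactly, so that the correlation coefficient should equal +1.

```python
import math

def correlation(joint):
    """Correlation coefficient cov(X, Y) / sqrt(var(X) var(Y)) of a joint PMF {(x, y): prob}."""
    ex = sum(x * p for (x, y), p in joint.items())
    ey = sum(y * p for (x, y), p in joint.items())
    var_x = sum((x - ex) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - ey) ** 2 * p for (x, y), p in joint.items())
    cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())
    return cov / math.sqrt(var_x * var_y)

coins = {(0, 0): 0.125, (0, 1): 0.125, (1, 0): 0.25, (1, 1): 0.5}  # assumed table
line = {(0, 1): 0.5, (1, 3): 0.5}                                  # Y = 2X + 1 exactly

rho_coins = correlation(coins)
rho_line = correlation(line)
print(abs(rho_coins) <= 1)           # True
print(math.isclose(rho_line, 1.0))   # True: perfect positive linear relationship
```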

44 Correlation does not imply a causal relationship between RVs A frequent misapplication of probability is to assert that two quantities that are correlated (ρ X,Y ≠0 ) are such because one causes the other. Incidence of prostate cancer per 1000 individuals older than age 55 versus height. Correlation between two variables only indicates an association i.e. if one increases, then so does the other (or vice versa).

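45 Joint Characteristic Functions For the RVs X and Y the joint characteristic function is defined as $\phi_{X,Y}(\omega_X, \omega_Y) = E_{X,Y}[\exp(j(\omega_X X + \omega_Y Y))]$. Assuming both RVs take on integer values, it is evaluated as $\phi_{X,Y}(\omega_X, \omega_Y) = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} p_{X,Y}[k,l]\,\exp[j(\omega_X k + \omega_Y l)]$, which is seen to be the 2D Fourier transform of the two-dimensional sequence $p_{X,Y}[k,l]$. The joint moments are given by the formula $E_{X,Y}[X^n Y^m] = \frac{1}{j^{n+m}} \left. \frac{\partial^{n+m} \phi_{X,Y}(\omega_X, \omega_Y)}{\partial \omega_X^n\,\partial \omega_Y^m} \right|_{\omega_X = \omega_Y = 0}$.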

46 Joint Characteristic Functions An important application is finding the PMF of the sum of independent RVs. If X and Y are independent, then the joint characteristic function factors, $\phi_{X,Y}(\omega_X, \omega_Y) = \phi_X(\omega_X)\,\phi_Y(\omega_Y)$, due to the property $E_{X,Y}[g(X)h(Y)] = E_X[g(X)]\,E_Y[h(Y)]$.

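47 Joint Characteristic Functions Conversely, if the joint CF factors, then X and Y are independent RVs. Consider the transformed RVs W = g(X) and Z = h(Y), where X and Y are independent. We prove that W and Z are independent as well. The joint CF of the transformed RVs is $\phi_{W,Z}(\omega_W, \omega_Z) = E_{X,Y}[\exp(j(\omega_W g(X) + \omega_Z h(Y)))]$. But, since X and Y are independent, we have that $E_{X,Y}[\exp(j\omega_W g(X))\,\exp(j\omega_Z h(Y))] = E_X[\exp(j\omega_W g(X))]\,E_Y[\exp(j\omega_Z h(Y))] = \phi_W(\omega_W)\,\phi_Z(\omega_Z)$, so the joint CF factors and W and Z are independent.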

48 Real-world Example: Assessing Health Risks Obesity is found to be associated with many life-threatening illnesses, especially diabetes. The definition of an obese person is given by the body mass index (BMI) as $\mathrm{BMI} = 703\,W/H^2$, where W is the weight in pounds and H is the height in inches. 25 < BMI < 30: overweight; 30 < BMI: obese. We will use a hypothetical population of college students. For this population we would like to know the probability of obese persons (i.e. 30 < BMI).

49 Real-world Example: Assessing Health Risks

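50 Computer simulation of Random Vectors If X and Y are independent, then we generate a realization of X according to $p_X[x_i]$ and a realization of Y according to $p_Y[y_j]$. Concatenating the realizations together, we form a realization of the vector of random variables. Example joint PMF $p_{X,Y}[i,j]$: $p_{X,Y}[0,0] = 1/8$, $p_{X,Y}[1,0] = 1/4$, $p_{X,Y}[1,1] = 1/2$, with $p_{X,Y}[0,1] = 1/8$ following from normalization.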

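51 Computer simulation of Random Vectors Once the realizations $(x_m, y_m)$, $m = 1, \ldots, M$, are available, we can estimate the joint PMF as $\hat{p}_{X,Y}[i,j] = (\text{number of outcomes equal to } (i,j))/M$ and the marginal PMFs as $\hat{p}_X[i] = (\text{number of outcomes with } x_m = i)/M$ and $\hat{p}_Y[j] = (\text{number of outcomes with } y_m = j)/M$. The joint moments are estimated as $\widehat{E_{X,Y}[X^k Y^l]} = \frac{1}{M} \sum_{m=1}^{M} x_m^k\,y_m^l$.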
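One way to simulate a (possibly dependent) random vector is to treat each point (i, j) as a single outcome and draw it with its joint probability. The sketch below uses only Python's standard library; the joint PMF is the assumed 2 × 2 table with its missing entry filled in by normalization, and the function names, seed, and number of realizations are illustrative choices.

```python
import random
from collections import Counter

# Assumed joint PMF (p[0,1] = 1/8 is inferred from normalization).
joint = {(0, 0): 0.125, (0, 1): 0.125, (1, 0): 0.25, (1, 1): 0.5}

def simulate(joint, m, seed=0):
    """Draw m realizations of (X, Y), treating each point (i, j) as one outcome."""
    rng = random.Random(seed)
    points = list(joint)
    weights = [joint[pt] for pt in points]
    return rng.choices(points, weights=weights, k=m)

def estimate_joint_pmf(samples):
    """Relative frequency of each observed point (i, j)."""
    m = len(samples)
    return {pt: count / m for pt, count in Counter(samples).items()}

samples = simulate(joint, m=100_000)
p_hat = estimate_joint_pmf(samples)
# Estimated joint moment E[XY]: the sample average of the products x_m * y_m.
exy_hat = sum(x * y for x, y in samples) / len(samples)

for pt in sorted(joint):
    print(pt, joint[pt], round(p_hat[pt], 3))
```

With this many realizations the estimates should agree with the true probabilities to within a few thousandths; fewer realizations give noisier estimates.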

52 Practice problems Two coins are tossed in succession with a head being mapped into a +1 and a tail mapped into a -1. If a RV is defined as (X,Y) with X representing the mapping of the first toss and Y representing the mapping of the second toss, draw the mapping. Also, what is $S_{X,Y}$? (Hint: see slides 4, 5.) Two dice are tossed. The numbers of dots observed on the dice are added together to form the random variable X and also subtracted to form Y. Determine the possible outcomes of the random vector (X,Y) and plot them in the plane. How many possible outcomes are there? Is the given $p_{X,Y}[i,j]$ a valid joint PMF? For the given joint PMF, find the marginal PMFs.

53 Practice problems Find a formula for var(X − Y) similar to $\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y) + 2\,\mathrm{cov}(X,Y)$. What can you say about the relationship between var(X + Y) and var(X − Y) if X and Y are uncorrelated? Find the covariance for the joint PMF given in the table. How do you know the value that you obtained is correct?

54 Homework 1 2) The values of a joint PMF are given below. Determine the marginal PMFs 3)

55 Homework 4) Prove that the minimum mean square error of the optimal linear predictor is given by $\mathrm{mse}_{\min} = \mathrm{var}(Y)\,(1 - \rho_{X,Y}^2)$.

