Multiple Discrete Random Variables

Introduction Consider the choice of a student at random from a population. We wish to know the student's height, weight, blood pressure, pulse rate, etc. The mapping from the sample space of students to measurements of height and weight would be H(s_i) = h_i, W(s_i) = w_i for the student selected. The table of these values is a two-dimensional array that lists the probability P[H = h_i and W = w_j].

Introduction The information can also be displayed in a three-dimensional format. We will study dependencies between the multiple RVs. For example: "Can we predict a person's height from his weight?" These probabilities are termed joint probabilities. The height and weight can be represented as a 2 × 1 random vector.

Jointly Distributed RVs Consider two discrete RVs X and Y. They represent functions that map an outcome of an experiment s_i to a value (x, y) in the plane: (X(s), Y(s)) ∈ R² for all s ∈ S. As an example, the experiment consists of the simultaneous tossing of a penny and a nickel. Two random variables that are defined on the same sample space S are said to be jointly distributed.

Jointly Distributed RVs There are four vectors that comprise the sample space S_{X,Y} of the penny-and-nickel experiment. The values of the random vector (multiple random variables) are denoted either by (x, y), a point in the plane, or by [x y]^T, a 2D vector. The size of the sample space for discrete RVs can be finite or countably infinite. If X can take on 2 values (N_X = 2) and Y can take on 2 values (N_Y = 2), the total number of elements in S_{X,Y} is N_X N_Y = 4.

Jointly Distributed RVs Generally, if S_X = {x_1, x_2, …, x_{N_X}} and S_Y = {y_1, y_2, …, y_{N_Y}}, then the random vector can take on values in S_{X,Y} = S_X × S_Y = {(x_i, y_j) : i = 1, …, N_X; j = 1, …, N_Y}. The notation A × B denotes a Cartesian product set. The joint PMF (bivariate PMF) is defined as p_{X,Y}[x_i, y_j] = P[X = x_i, Y = y_j].

Properties of joint PMF Property 1. Range of values of the joint PMF: 0 ≤ p_{X,Y}[x_i, y_j] ≤ 1. Property 2. Sum of values of the joint PMF: Σ_{i=1}^{N_X} Σ_{j=1}^{N_Y} p_{X,Y}[x_i, y_j] = 1; similarly for a countably infinite sample space. For two fair coins that do not interact as they are tossed we might assign p_{X,Y}[i, j] = 1/4.
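To make the properties concrete, here is a minimal Python sketch (the array layout and names are mine, not the slides') that stores the fair-coin joint PMF as a 2D array and checks both properties:

```python
import numpy as np

# Joint PMF of two fair, non-interacting coins: p_XY[i, j] = P[X = i, Y = j]
p_XY = np.full((2, 2), 1.0 / 4)

# Property 1: every value lies in [0, 1]
assert np.all((p_XY >= 0) & (p_XY <= 1))

# Property 2: the values sum to one
assert np.isclose(p_XY.sum(), 1.0)
print(p_XY)
```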

The procedure to determine the joint PMF from the probabilities defined on S The procedure depends on whether the RV mapping is one-to-one or many-to-one. For a one-to-one mapping from S to S_{X,Y} we have p_{X,Y}[x_i, y_j] = P[{s_k}], where it is assumed that s_k is the only solution to X(s) = x_i and Y(s) = y_j. For a many-to-one transformation the joint PMF is found as p_{X,Y}[x_i, y_j] = Σ_{{s: X(s) = x_i, Y(s) = y_j}} P[{s}].

Two dice toss with different colored dice A red die and a blue die are tossed. The die that yields the larger number of dots is chosen; if both dice display the same number of dots, the red die is chosen. The numerical outcome of the experiment is defined to be X = 0 if the blue die is chosen and X = 1 if the red die is chosen, along with Y, the number of dots on the chosen die. What is p_{X,Y}[1, 3], for example?

Two dice toss with different colored dice To determine the desired value of the PMF, we assume that each outcome in S is equally likely and therefore has probability 1/36. There are three outcomes that map into (1, 3), namely (red, blue) = (3, 1), (3, 2), and (3, 3), so p_{X,Y}[1, 3] = 3/36. In general, we can use the joint PMF to find the probability of an event A defined on S_{X,Y} = S_X × S_Y as P[A] = Σ_{{(i,j): (x_i, y_j) ∈ A}} p_{X,Y}[x_i, y_j].
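A brute-force enumeration of the 36 equally likely (red, blue) outcomes reproduces this value; the sketch below is illustrative and the variable names are my own:

```python
from fractions import Fraction

# X = 1 if the red die is chosen (red >= blue), 0 if the blue die is chosen;
# Y = number of dots on the chosen die.  Each of the 36 outcomes has probability 1/36.
p = {}
for red in range(1, 7):
    for blue in range(1, 7):
        x = 1 if red >= blue else 0          # ties go to the red die
        y = red if x == 1 else blue
        p[(x, y)] = p.get((x, y), Fraction(0)) + Fraction(1, 36)

print(p[(1, 3)])   # -> 1/12, i.e. 3/36
```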

Marginal PMFs and CDFs If p_{X,Y}[x_i, y_j] is known, then the marginal PMFs p_X[x_i] and p_Y[y_j] can be determined. Consider an event of interest A on a countably infinite sample space and let A = {x_k} × S_Y. Then p_X[x_k] = P[A] = Σ_{j=1}^{∞} p_{X,Y}[x_k, y_j] (the sum over i reduces to the single term i = k). Similarly, p_Y[y_k] = Σ_{i=1}^{∞} p_{X,Y}[x_i, y_k] (with j = k only).

Example: Two coin toss A penny (RV X) and a nickel (RV Y) are tossed and the outcomes are mapped into a 1 for a head and a 0 for a tail. Consider the joint PMF p_{X,Y}[0,0] = 1/8, p_{X,Y}[0,1] = 1/8, p_{X,Y}[1,0] = 1/4, p_{X,Y}[1,1] = 1/2. The marginal PMFs are given as p_X[0] = 1/4, p_X[1] = 3/4 and p_Y[0] = 3/8, p_Y[1] = 5/8, each summing to 1.
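A short sketch, assuming the joint PMF values above, that obtains the marginal PMFs by summing over rows and columns:

```python
import numpy as np

# p_XY[x, y]: penny X along rows, nickel Y along columns
p_XY = np.array([[1/8, 1/8],
                 [1/4, 1/2]])

p_X = p_XY.sum(axis=1)   # sum over y -> [1/4, 3/4]
p_Y = p_XY.sum(axis=0)   # sum over x -> [3/8, 5/8]
print(p_X, p_Y, p_X.sum(), p_Y.sum())
```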

Joint PMF cannot be determined from marginal PMFs It is not possible in general to obtain the joint PMF from the marginal PMFs. One can construct a different joint PMF whose marginal PMFs are the same as the ones before; in fact, there are an infinite number of joint PMFs that have the same marginal PMFs. In short: joint PMF ⇒ marginal PMFs, but marginal PMFs ⇏ joint PMF.

Joint cumulative distribution function A joint cumulative distribution function (CDF) can be defined for a random vector as F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y], and can be found explicitly by summing the joint PMF as F_{X,Y}(x, y) = Σ_{{i: x_i ≤ x}} Σ_{{j: y_j ≤ y}} p_{X,Y}[x_i, y_j]. For integer-valued RVs the PMF can be recovered as p_{X,Y}[i, j] = F_{X,Y}(i, j) − F_{X,Y}(i − 1, j) − F_{X,Y}(i, j − 1) + F_{X,Y}(i − 1, j − 1).
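For integer-valued RVs the joint CDF on the grid is a double cumulative sum, and the PMF is recovered by the four-term difference above; a sketch using the same two-coin PMF:

```python
import numpy as np

p_XY = np.array([[1/8, 1/8],
                 [1/4, 1/2]])

# Joint CDF evaluated at the grid points: F[i, j] = sum of p_XY[k, l] for k <= i, l <= j
F = p_XY.cumsum(axis=0).cumsum(axis=1)

# Recover the PMF: p[i, j] = F[i, j] - F[i-1, j] - F[i, j-1] + F[i-1, j-1]
Fp = np.pad(F, ((1, 0), (1, 0)))          # prepend zeros to supply the i-1, j-1 terms
p_rec = Fp[1:, 1:] - Fp[:-1, 1:] - Fp[1:, :-1] + Fp[:-1, :-1]
print(np.allclose(p_rec, p_XY))           # True
```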

Properties of cumulative distribution functions The marginal CDFs can easily be found from the joint CDF as F_X(x) = F_{X,Y}(x, +∞) and F_Y(y) = F_{X,Y}(+∞, y). Property 1. Range of values: 0 ≤ F_{X,Y}(x, y) ≤ 1. Property 2. Values at the "endpoints": F_{X,Y}(−∞, −∞) = 0 and F_{X,Y}(+∞, +∞) = 1. Property 3. Monotonically increasing: F_{X,Y}(x, y) monotonically increases as x and/or y increases. Property 4. "Right" continuous: the joint CDF takes the value after the jump.

Independence of Multiple RVs Consider the experiment of tossing a coin and then a die. The outcome of the coin is X ∈ {0, 1} and the outcome of the die is Y ∈ {1, 2, 3, 4, 5, 6}; the probability of the random vector (X, Y) taking on a value Y = y_j does not depend on X = x_i. X and Y are independent random variables if all the joint events on S_{X,Y} are independent. The probability of joint events may then be reduced to probabilities of "marginal events": if A = {x_i} and B = {y_j} are independent, then p_{X,Y}[x_i, y_j] = P[A ∩ B] = P[A]P[B] = p_X[x_i] p_Y[y_j], i.e. the joint PMF factors.

Independence of Multiple RVs The converse is true: if the joint PMF factors, then X and Y are independent. Example: Two coin toss – independence. Assume we toss a penny and a nickel. If all outcomes are equally likely, the joint PMF is given by p_{X,Y}[i, j] = 1/4 = p_X[i] p_Y[j], i.e. it factors into the product of the marginal probabilities.

Independence of Multiple RVs Example: Two coin toss – dependence. Consider the same experiment but with the joint PMF given earlier (p_{X,Y}[0,0] = 1/8, p_{X,Y}[0,1] = 1/8, p_{X,Y}[1,0] = 1/4, p_{X,Y}[1,1] = 1/2). Then p_{X,Y}[0,0] = 1/8 ≠ (1/4)(3/8) = p_X[0] p_Y[0], and hence X and Y cannot be independent. If two random variables are not independent, they are said to be dependent.

Independence of Multiple RVs Example: Two coin toss – dependent but fair coins. Consider the same experiment again but with a joint PMF for which p_{X,Y}[0,0] = 3/8 while both marginals are fair (for instance p_{X,Y}[0,0] = p_{X,Y}[1,1] = 3/8 and p_{X,Y}[0,1] = p_{X,Y}[1,0] = 1/8). Since p_{X,Y}[0,0] = 3/8 ≠ (1/2)(1/2), X and Y are dependent. But by examining the marginal PMFs we see P[heads] = 1/2 for each coin (so the coins are fair), and we might wrongly conclude that the RVs are independent; this is incorrect. If the RVs are independent, the joint CDF factors as well: F_{X,Y}(x, y) = F_X(x) F_Y(y).
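A quick numerical independence check is to compare the joint PMF with the outer product of its marginals. The sketch below uses the dependent-but-fair-coins idea; the off-diagonal values 1/8 are my assumption, chosen only so the marginals work out to 1/2:

```python
import numpy as np

p_XY = np.array([[3/8, 1/8],
                 [1/8, 3/8]])             # dependent, but both marginals are [1/2, 1/2]

p_X = p_XY.sum(axis=1)
p_Y = p_XY.sum(axis=0)
print(p_X, p_Y)                           # both [0.5, 0.5] -> "fair" coins
print(np.allclose(p_XY, np.outer(p_X, p_Y)))   # False -> X and Y are dependent
```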

Transformations of Multiple Random Variables The PMF of Y = g(X), if the PMF of X is known, is given by p_Y[y_j] = Σ_{{i: g(x_i) = y_j}} p_X[x_i]. In the case of two discrete RVs X and Y that are transformed into W = g(X, Y) and Z = h(X, Y), we have p_{W,Z}[w_i, z_j] = Σ_{{(k,l): g(x_k, y_l) = w_i, h(x_k, y_l) = z_j}} p_{X,Y}[x_k, y_l]. Sometimes we wish to determine the PMF of Z = h(X, Y) only. Then we can use the auxiliary RV W = X, so that p_Z is the marginal PMF and can be found from the formula above as p_Z[z_j] = Σ_i p_{W,Z}[w_i, z_j].

Example: Independent Poisson RVs Assume that the joint PMF is given as the product of the marginal PMFs, and each PMF is a Poisson PMF: p_X[k] = e^{−λ_X} λ_X^k / k! and p_Y[l] = e^{−λ_Y} λ_Y^l / l!. Consider the transformation W = X, Z = X + Y. We need to determine all (k, l) so that x_k = w_i and x_k + y_l = z_j. But x_k, y_l and w_i, z_j can be replaced by k, l and i, j, each taking values 0, 1, ….

Example: Independent Poisson RVs Applying the given transformation we get w_i = x_k = k and z_j = x_k + y_l = k + l. Solving for (k, l) for the given (i, j), we have k = i and l = j − i. We must have l ≥ 0, so that l = j − i ≥ 0. Hence p_{W,Z}[i, j] = e^{−λ_X} (λ_X^i / i!) e^{−λ_Y} (λ_Y^{j−i} / (j − i)!) u[i] u[j − i], where u[n] is the discrete unit step sequence (u[n] = 1 for n ≥ 0, u[n] = 0 for n < 0).

Use the discrete unit step sequence to avoid mistakes The discrete unit step sequence was introduced to designate the region of the w-z plane over which p_{W,Z}[i, j] is nonzero. The transformation will generally change the region over which the new joint PMF is nonzero. A common mistake is to disregard this region and assert that the joint PMF is nonzero over i = 0, 1, …; j = 0, 1, …. To avoid possible errors, unit step factors are applied to the joint PMF.

Example: Independent Poisson RVs To find the PMF of Z = X + Y from the joint PMF obtained earlier, we set W = X so that S_W = S_X = {0, 1, …} and p_Z[j] = Σ_{i=0}^{∞} p_{W,Z}[i, j]. Since u[i] = 1 for i = 0, 1, … and u[j − i] = 1 for i = 0, 1, …, j while u[j − i] = 0 for i > j, we can drop the u[i]u[j − i] multipliers and sum only over i = 0, 1, …, j, giving p_Z[j] = Σ_{i=0}^{j} e^{−λ_X} (λ_X^i / i!) e^{−λ_Y} (λ_Y^{j−i} / (j − i)!). Note that Z can take on the values j = 0, 1, … since Z = X + Y.

Connection to characteristic function Generally, the formula for the PMF of the sum of any two integer-valued discrete RVs X and Y, dependent or independent, is p_Z[j] = Σ_i p_{X,Y}[i, j − i]. If the RVs are independent, then since the joint PMF must factor, we have the result p_Z[j] = Σ_i p_X[i] p_Y[j − i]. This summation is a discrete convolution. Taking the Fourier transform (defined with a +j) of both sides produces φ_Z(ω) = φ_X(ω) φ_Y(ω).

Example: Independent Poisson RVs using the CF approach We showed that if X ~ Pois(λ), then φ_X(ω) = exp[λ(e^{jω} − 1)]. Thus, using the above Fourier property, we have φ_Z(ω) = exp[λ_X(e^{jω} − 1)] exp[λ_Y(e^{jω} − 1)] = exp[(λ_X + λ_Y)(e^{jω} − 1)]. But this CF is that of a Poisson RV and corresponds to Z ~ Pois(λ_X + λ_Y). The use of the CF to determine the PMF of a sum of independent RVs has considerably simplified the derivation. In summary, if X and Y are independent RVs with integer values, then the PMF of Z = X + Y is given by the discrete convolution p_Z[j] = Σ_i p_X[i] p_Y[j − i], i.e. the inverse Fourier transform of φ_X(ω)φ_Y(ω).
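A numerical check of this summary, assuming specific rates λ_X and λ_Y and truncating the Poisson PMFs at a finite count:

```python
import numpy as np
from scipy.stats import poisson

lam_x, lam_y, N = 2.0, 3.0, 40            # truncation at N is an approximation
k = np.arange(N + 1)
p_x = poisson.pmf(k, lam_x)
p_y = poisson.pmf(k, lam_y)

# PMF of Z = X + Y by discrete convolution of the marginal PMFs
p_z = np.convolve(p_x, p_y)[: N + 1]

# Compare with the Pois(lam_x + lam_y) PMF predicted by the CF argument
print(np.max(np.abs(p_z - poisson.pmf(k, lam_x + lam_y))))   # ~ machine precision
```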

Transformation of a finite sample space It is possible to obtain the PMF of Z = g(X, Y) by a direct calculation if the sample space S_{X,Y} is finite. Without first obtaining the transformed joint PMF p_{W,Z}, we 1. determine the finite sample space S_Z; 2. determine which sample points (x_i, y_j) in S_{X,Y} map into each z_k ∈ S_Z; 3. sum the probabilities of those (x_i, y_j) sample points to yield p_Z[z_k]. Mathematically this is equivalent to p_Z[z_k] = Σ_{{(i,j): g(x_i, y_j) = z_k}} p_{X,Y}[x_i, y_j].

Direct computation of PMF for transformed RV Consider the transformation of the RV (X, Y) into the scalar RV Z = X² + Y². The joint PMF is the one from the two coin toss example (p_{X,Y}[0,0] = 1/8, p_{X,Y}[0,1] = 1/8, p_{X,Y}[1,0] = 1/4, p_{X,Y}[1,1] = 1/2). To find the PMF for Z, first note that (X, Y) takes on the values (i, j) = (0,0), (1,0), (0,1), (1,1). Therefore, Z must take on the values z_k = i² + j² = 0, 1, 2. Then p_Z[0] = p_{X,Y}[0,0] = 1/8, p_Z[1] = p_{X,Y}[1,0] + p_{X,Y}[0,1] = 3/8, and p_Z[2] = p_{X,Y}[1,1] = 1/2.
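The same direct computation in a short sketch (the dictionary-based bookkeeping is mine; the PMF values are the assumed two-coin ones):

```python
from fractions import Fraction

# Joint PMF assumed from the earlier two-coin example
p_XY = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
        (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2)}

# Direct computation of the PMF of Z = X^2 + Y^2 over the finite sample space
p_Z = {}
for (i, j), p in p_XY.items():
    z = i**2 + j**2
    p_Z[z] = p_Z.get(z, Fraction(0)) + p
print(p_Z)   # {0: 1/8, 1: 3/8, 2: 1/2}
```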

Expected Values If Z = g(X, Y), then by definition its expected value is E_Z[Z] = Σ_k z_k p_Z[z_k]. Or, using a more direct approach, E_Z[Z] = E_{X,Y}[g(X, Y)] = Σ_i Σ_j g(x_i, y_j) p_{X,Y}[x_i, y_j]. Example: expected value of a sum of random variables, Z = g(X, Y) = X + Y, for which E_{X,Y}[X + Y] = E_X[X] + E_Y[Y].

Expected value of a product of RVs If Z = g(X, Y) = XY, then E_{X,Y}[XY] = Σ_i Σ_j x_i y_j p_{X,Y}[x_i, y_j]. If X and Y are independent, then since the joint PMF factors, we have E_{X,Y}[XY] = E_X[X] E_Y[Y]. More generally, E_{X,Y}[g(X)h(Y)] = E_X[g(X)] E_Y[h(Y)] for independent X and Y.

Variance of a sum of RVs Consider the calculation of var(X + Y). Letting Z = g(X, Y) = ((X + Y) − E_{X,Y}[X + Y])², we have var(X + Y) = E_{X,Y}[Z] = var(X) + var(Y) + 2E_{X,Y}[(X − E_X[X])(Y − E_Y[Y])]. The last term is called the covariance, defined as cov(X, Y) = E_{X,Y}[(X − E_X[X])(Y − E_Y[Y])], or alternatively cov(X, Y) = E_{X,Y}[XY] − E_X[X] E_Y[Y].
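A sketch verifying var(X + Y) = var(X) + var(Y) + 2 cov(X, Y) numerically for the two-coin joint PMF (array layout assumed, X along rows and Y along columns):

```python
import numpy as np

p_XY = np.array([[1/8, 1/8],
                 [1/4, 1/2]])
x = np.array([0, 1])
y = np.array([0, 1])
X, Y = np.meshgrid(x, y, indexing="ij")    # X varies along rows, Y along columns

EX  = np.sum(X * p_XY)
EY  = np.sum(Y * p_XY)
EXY = np.sum(X * Y * p_XY)
varX = np.sum((X - EX) ** 2 * p_XY)
varY = np.sum((Y - EY) ** 2 * p_XY)
cov  = EXY - EX * EY

# var(X + Y) computed directly agrees with var(X) + var(Y) + 2 cov(X, Y)
varZ = np.sum((X + Y - EX - EY) ** 2 * p_XY)
print(np.isclose(varZ, varX + varY + 2 * cov))   # True
```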

Joint moments Questions of interest: if the outcome of one RV is a given value, what can we say about the outcome of the other RV? There is clearly a relationship between height and weight, for example.

Joint moments To quantify these relationships we form the product XY, which can take on the values +1 and −1 for the joint PMFs above (defined on the points (±1, ±1)). To determine the value of XY on the average we define the joint moment E_{X,Y}[XY]. For case (a), where the probability mass lies on the points with x = y, the joint moment is E_{X,Y}[XY] = +1.

Joint moments In the previous example E_X[X] = E_Y[Y] = 0. If the means are not zero, the joint moment will also depend on the values of the means. To nullify this effect it is convenient to use the joint central moment E_{X,Y}[(X − E_X[X])(Y − E_Y[Y])], which will produce the desired +1 for the joint PMF above.

Independence implies zero covariance but zero covariance does not imply independence If X and Y are independent, the covariance is zero since cov(X, Y) = E_{X,Y}[XY] − E_X[X]E_Y[Y] = E_X[X]E_Y[Y] − E_X[X]E_Y[Y] = 0. Now consider the joint PMF which assigns equal probability 1/4 to each of the four points (1, 0), (−1, 0), (0, 1), (0, −1). Then E_X[X] = E_Y[Y] = 0 and XY = 0 at every point of nonzero probability, and thus cov(X, Y) = 0. However, X and Y are dependent because p_{X,Y}[1, 0] = 1/4 but p_X[1] p_Y[0] = (1/4)(1/2) = 1/8.
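The counterexample is easy to verify numerically; a small sketch, assuming the four points listed above:

```python
# Equal probability 1/4 on the four points (1,0), (-1,0), (0,1), (0,-1)
pts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
probs = [0.25] * 4

EX  = sum(p * x for (x, _), p in zip(pts, probs))
EY  = sum(p * y for (_, y), p in zip(pts, probs))
EXY = sum(p * x * y for (x, y), p in zip(pts, probs))
print(EXY - EX * EY)                       # 0.0 -> zero covariance

# ...yet X and Y are dependent: P[X=1, Y=0] != P[X=1] P[Y=0]
pX1 = sum(p for (x, _), p in zip(pts, probs) if x == 1)      # 1/4
pY0 = sum(p for (_, y), p in zip(pts, probs) if y == 0)      # 1/2
print(0.25, pX1 * pY0)                     # 0.25 vs 0.125
```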

The joint k-lth moment More generally, the joint k-lth moment is defined as E_{X,Y}[X^k Y^l] = Σ_i Σ_j x_i^k y_j^l p_{X,Y}[x_i, y_j] for k = 1, 2, …; l = 1, 2, …, when it exists. The joint k-lth central moment is defined as E_{X,Y}[(X − E_X[X])^k (Y − E_Y[Y])^l] for k = 1, 2, …; l = 1, 2, …, when it exists.

Prediction of a RV outcome The covariance between two RVs is useful for predicting Y based on knowledge of the outcome of X. We seek a predictor Ŷ that is linear in X, or Ŷ = aX + b. The constants a and b are to be chosen so that "on the average" the observed value of aX + b is close to the observed value of Y (minimum mean square error). The solution is given by a = cov(X, Y)/var(X) and b = E_Y[Y] − a E_X[X].

Prediction of a RV outcome The optimal linear prediction of Y given the outcome X = x is therefore Ŷ = E_Y[Y] + (cov(X, Y)/var(X))(x − E_X[X]); plotted versus x, this is the regression line. Example: predicting one RV outcome from knowledge of a second RV outcome — from the marginals of the example joint PMF we compute E_X[X], E_Y[Y], var(X), and cov(X, Y), which determine the regression line.
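A sketch that computes the slope and intercept of the regression line from a joint PMF; the PMF used here is the two-coin one, standing in for the slide's (unshown) example values:

```python
import numpy as np

# Hypothetical joint PMF (the slide's actual example values are not reproduced here)
p_XY = np.array([[1/8, 1/8],
                 [1/4, 1/2]])
x = np.array([0, 1]); y = np.array([0, 1])
X, Y = np.meshgrid(x, y, indexing="ij")

EX, EY = np.sum(X * p_XY), np.sum(Y * p_XY)
varX = np.sum((X - EX) ** 2 * p_XY)
cov = np.sum((X - EX) * (Y - EY) * p_XY)

a = cov / varX                 # slope of the optimal linear predictor
b = EY - a * EX                # intercept
print(f"Y_hat = {a:.4f} x + {b:.4f}")   # the regression line
```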

Prediction of a RV outcome We could also have predicted X from Y = y by interchanging X and Y. If cov(X, Y) = 0, then Ŷ = E_Y[Y], so the outcome X = x provides no information for (linearly) predicting Y. This does not mean that X and Y are independent: if the covariance is zero, the RVs can still be dependent.

A standardized random variable A standardized RV is defined to be X_s = (X − E_X[X]) / sqrt(var(X)), for which the mean is zero and the variance is one. Example: if X ~ Pois(λ), then X_s = (X − λ)/sqrt(λ). Let's find the best linear prediction of the standardized Y_s based on a standardized X_s = x_s. Then a = cov(X_s, Y_s), b = 0, and therefore Ŷ_s = cov(X_s, Y_s) x_s.

Previous example continued For the previous example we substitute the computed means, variances, and covariance into the standardized quantities, and so obtain the prediction Ŷ_s = cov(X_s, Y_s) x_s.

Correlation coefficient The factor that scales x_s to produce Ŷ_s is called the correlation coefficient (CC): ρ_{X,Y} = cov(X, Y) / sqrt(var(X) var(Y)). When ρ_{X,Y} ≠ 0, X and Y are said to be correlated. If the covariance is zero, and hence ρ_{X,Y} = 0, the RVs are said to be uncorrelated.
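A small helper, with names of my choosing, that evaluates ρ_{X,Y} for a joint PMF given on a grid:

```python
import numpy as np

def corr_coeff(p_XY, x, y):
    """Correlation coefficient of a discrete joint PMF given on a grid."""
    X, Y = np.meshgrid(x, y, indexing="ij")
    EX, EY = np.sum(X * p_XY), np.sum(Y * p_XY)
    cov = np.sum((X - EX) * (Y - EY) * p_XY)
    sx = np.sqrt(np.sum((X - EX) ** 2 * p_XY))
    sy = np.sqrt(np.sum((Y - EY) ** 2 * p_XY))
    return cov / (sx * sy)

p_XY = np.array([[1/8, 1/8],
                 [1/4, 1/2]])
print(corr_coeff(p_XY, [0, 1], [0, 1]))   # about 0.149 for this PMF
```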

Property Property: The correlation coefficient is always less than or equal to one in magnitude, |ρ_{X,Y}| ≤ 1. Proof: For RVs V and W the Cauchy–Schwarz inequality says that |E[VW]| ≤ sqrt(E[V²]) sqrt(E[W²]), with equality if and only if W = cV for c a constant. Thus, letting V = X − E_X[X] and W = Y − E_Y[Y], we have |cov(X, Y)| ≤ sqrt(var(X)) sqrt(var(Y)), i.e. |ρ_{X,Y}| ≤ 1. Equality holds if and only if W = cV, or equivalently if Y − E_Y[Y] = c(X − E_X[X]), which is easily shown to imply that ρ_{X,Y} = +1 if c > 0 and ρ_{X,Y} = −1 if c < 0.

Correlation does not imply a causal relationship between RVs A frequent misapplication of probability is to assert that two quantities that are correlated (ρ_{X,Y} ≠ 0) are such because one causes the other. (Figure on the slide: incidence of prostate cancer per 1000 individuals older than age 55 versus height.) Correlation between two variables only indicates an association, i.e. if one increases, then so does the other (or vice versa).

Joint Characteristic Functions For the RVs X and Y the joint characteristic function is defined as φ_{X,Y}(ω_X, ω_Y) = E_{X,Y}[exp(jω_X X + jω_Y Y)]. Assuming both RVs take on integer values, it is evaluated as φ_{X,Y}(ω_X, ω_Y) = Σ_k Σ_l p_{X,Y}[k, l] exp(jω_X k + jω_Y l), which is seen to be the 2D Fourier transform of the two-dimensional sequence p_{X,Y}[k, l]. The joint moments are given by the formula E_{X,Y}[X^m Y^n] = (1/j^{m+n}) ∂^{m+n} φ_{X,Y}(ω_X, ω_Y) / (∂ω_X^m ∂ω_Y^n), evaluated at ω_X = ω_Y = 0.

Joint Characteristic Functions The important application is finding the PMF of the sum of independent RVs. If Z = X + Y with X and Y independent, then the characteristic function factors, φ_Z(ω) = E_X[exp(jωX)] E_Y[exp(jωY)] = φ_X(ω) φ_Y(ω), due to the property E_{X,Y}[g(X)h(Y)] = E_X[g(X)] E_Y[h(Y)].

Joint Characteristic Functions If the joint CF factors, then X and Y are independent RVs. Consider the transformed RVs W = g(X) and Z = h(Y), where X and Y are independent; we can prove that W and Z are independent as well. The joint CF of the transformed RVs is φ_{W,Z}(ω_W, ω_Z) = E_{X,Y}[exp(jω_W g(X) + jω_Z h(Y))]. But we have that E_{X,Y}[g(X)h(Y)] = E_X[g(X)] E_Y[h(Y)], so φ_{W,Z}(ω_W, ω_Z) = E_X[exp(jω_W g(X))] E_Y[exp(jω_Z h(Y))] = φ_W(ω_W) φ_Z(ω_Z), and hence W and Z are independent.

Real-world Example: Assessing Health Risks Obesity is found to be associated with many life-threatening illnesses, especially diabetes. An obese person is defined through the body mass index, BMI = 703 W / H², where W is the weight in pounds and H is the height in inches. A person with 25 < BMI < 30 is classified as overweight and one with BMI > 30 as obese. We will use a hypothetical population of college students. For this population we would like to know the probability of obese persons (i.e. BMI > 30).
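Since the slide's population data are not reproduced here, the sketch below invents a purely hypothetical population (normal height and weight, assumed independent) just to show how P[BMI > 30] could be estimated by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
M = 100_000

# Hypothetical college population: height in inches, weight in pounds.
# These distributions (and their independence) are illustrative assumptions only.
H = rng.normal(68, 3, M)
W = rng.normal(160, 30, M)

BMI = 703 * W / H**2
print("P[overweight] ~", np.mean((BMI > 25) & (BMI < 30)))
print("P[obese]      ~", np.mean(BMI > 30))
```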


Computer simulation of Random Vectors If X and Y are independent, then we generate a realization of X according to p_X[x_i] and a realization of Y according to p_Y[y_j]; concatenating the realizations together we form the vector random variable. The joint PMF for the simulation example is given in the table:
p_{X,Y}[i, j]    j = 0    j = 1
i = 0            1/8      1/8
i = 1            1/4      1/2

Computer simulation of Random Vectors Once M realizations are available we can estimate the joint PMF as p̂_{X,Y}[x_i, y_j] = (number of outcomes equal to (x_i, y_j)) / M, the marginal PMFs as p̂_X[x_i] = (number of outcomes with X = x_i) / M and p̂_Y[y_j] = (number of outcomes with Y = y_j) / M, and the joint moments as Ê_{X,Y}[X^k Y^l] = (1/M) Σ_{m=1}^{M} x_m^k y_m^l.
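A sketch of this procedure for independent X and Y taking the values 0 and 1, with assumed marginal PMFs; the estimated joint PMF should be close to the outer product of the marginals:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 100_000

# Independent X and Y drawn from their marginal PMFs (values 0 and 1)
p_X = [0.25, 0.75]
p_Y = [0.375, 0.625]
x = rng.choice([0, 1], size=M, p=p_X)
y = rng.choice([0, 1], size=M, p=p_Y)

# Estimated joint PMF: relative frequency of each pair (i, j)
p_hat = np.zeros((2, 2))
for i in (0, 1):
    for j in (0, 1):
        p_hat[i, j] = np.mean((x == i) & (y == j))
print(p_hat)                       # ~ outer product of p_X and p_Y

# Estimated joint moment E[XY]
print(np.mean(x * y))              # ~ E[X] E[Y] = 0.75 * 0.625
```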

Practice problems 1. Two coins are tossed in succession, with a head mapped into +1 and a tail mapped into −1. If a random vector is defined as (X, Y), with X representing the mapping of the first toss and Y representing the mapping of the second toss, draw the mapping. Also, what is S_{X,Y}? (Hint: see slides 4, 5.) 2. Two dice are tossed. The numbers of dots observed on the dice are added together to form the random variable X, and their difference is taken to form Y. Determine the possible outcomes of the random vector (X, Y) and plot them in the plane. How many possible outcomes are there? 3. Is the given candidate function a valid joint PMF? 4. For the given joint PMF, find the marginal PMFs.

Practice problems 5. Find a formula for var(X − Y) similar to var(X + Y) = var(X) + var(Y) + 2 cov(X, Y). What can you say about its relationship to var(X + Y) if X and Y are uncorrelated? 6. Find the covariance for the joint PMF given in the table. How do you know the value that you obtained is correct?

Homework 1) … 2) The values of a joint PMF are given in the accompanying table. Determine the marginal PMFs. 3) …

Homework 4) Prove that the minimum mean square error of the optimal linear predictor is given by mse_min = var(Y)(1 − ρ_{X,Y}²).