Presentation transcript:

1 Multivariable Distributions ch4

2  It is often useful to take more than one measurement on a random experiment; the data may then be collected in pairs (xᵢ, yᵢ).  Def. 4.1-1: Let X and Y be two discrete random variables defined on the support S. The probability that X = x and Y = y is denoted f(x,y) = P(X=x, Y=y); f(x,y) is the joint probability mass function (joint p.m.f.) of X and Y.  Properties: 0 ≤ f(x,y) ≤ 1; ΣΣ_{(x,y)∈S} f(x,y) = 1; P[(X,Y) ∈ A] = ΣΣ_{(x,y)∈A} f(x,y), A ⊆ S.

3 Illustration Example  Ex. 4.1-3: Roll a pair of dice; X is the smaller and Y is the larger outcome.  The outcome (3,2) or (2,3) ⇒ X=2 & Y=3, with probability 2/36.  The outcome (2,2) ⇒ X=2 & Y=2, with probability 1/36.  Thus the joint p.m.f. of X and Y is f(x,y) = 2/36 for 1 ≤ x < y ≤ 6 and 1/36 for 1 ≤ x = y ≤ 6 (empty cells below are zero; the margins give the marginal p.m.f.s):

 y \ x    1      2      3      4      5      6     f2(y)
   1    1/36                                        1/36
   2    2/36   1/36                                 3/36
   3    2/36   2/36   1/36                          5/36
   4    2/36   2/36   2/36   1/36                   7/36
   5    2/36   2/36   2/36   2/36   1/36            9/36
   6    2/36   2/36   2/36   2/36   2/36   1/36    11/36
 f1(x) 11/36   9/36   7/36   5/36   3/36   1/36
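The table can be checked by brute-force enumeration of the 36 equally likely outcomes; a minimal Python sketch (all names illustrative):

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes of a pair of dice;
# X is the smaller value, Y is the larger value.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    x, y = min(a, b), max(a, b)
    pmf[(x, y)] = pmf.get((x, y), Fraction(0)) + Fraction(1, 36)

assert sum(pmf.values()) == 1                 # joint p.m.f. sums to 1
assert pmf[(2, 3)] == Fraction(2, 36)         # from outcomes (3,2) or (2,3)
assert pmf[(2, 2)] == Fraction(1, 36)         # only from (2,2)

# Marginal p.m.f.s, matching the table's row and column sums.
f1 = {x: sum(p for (x_, _), p in pmf.items() if x_ == x) for x in range(1, 7)}
f2 = {y: sum(p for (_, y_), p in pmf.items() if y_ == y) for y in range(1, 7)}
assert f1[1] == Fraction(11, 36) and f2[1] == Fraction(1, 36)
```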

4 Marginal Probability and Independence  Def. 4.1-2: Let X and Y have the joint p.m.f. f(x,y) with space S.  The marginal p.m.f. of X is f1(x) = Σ_y f(x,y) = P(X=x), x ∈ S1.  The marginal p.m.f. of Y is f2(y) = Σ_x f(x,y) = P(Y=y), y ∈ S2.  X and Y are independent iff P(X=x, Y=y) = P(X=x)P(Y=y), i.e., f(x,y) = f1(x)f2(y) for all x ∈ S1, y ∈ S2; otherwise X and Y are dependent.  X and Y in Ex. 4.1-3 are dependent: 1/36 = f(1,1) ≠ f1(1)f2(1) = (11/36)(1/36).  Ex. 4.1-4: The joint p.m.f. is f(x,y) = (x+y)/21, x = 1,2,3, y = 1,2.  Then f1(x) = Σ_{y=1,2} (x+y)/21 = (2x+3)/21, x = 1,2,3.  Likewise, f2(y) = Σ_{x=1..3} (x+y)/21 = (3y+6)/21, y = 1,2.  Since f(x,y) ≠ f1(x)f2(y), X and Y are dependent.  Ex. 4.1-6: f(x,y) = xy²/13, (x,y) = (1,1), (1,2), (2,2).
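A sketch of the independence test of Def. 4.1-2 applied to Ex. 4.1-4, using exact arithmetic:

```python
from fractions import Fraction

# Joint p.m.f. of Ex 4.1-4: f(x,y) = (x+y)/21, x = 1,2,3, y = 1,2.
f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}

f1 = {x: sum(f[(x, y)] for y in (1, 2)) for x in (1, 2, 3)}   # (2x+3)/21
f2 = {y: sum(f[(x, y)] for x in (1, 2, 3)) for y in (1, 2)}   # (3y+6)/21

# X and Y are independent iff f(x,y) = f1(x) * f2(y) for every (x,y).
independent = all(f[(x, y)] == f1[x] * f2[y] for (x, y) in f)
print(independent)  # False: X and Y are dependent
```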

5 Quick Dependence Checks  Practically, dependence can be detected quickly if  the support of X and Y is NOT rectangular, i.e., S is not the product set {(x,y): x ∈ S1, y ∈ S2}, as in Ex. 4.1-6; or  f(x,y) cannot be factored (separated) into the product of an x-alone expression and a y-alone expression.  In Ex. 4.1-4, f(x,y) is a sum, not a product, of x-alone and y-alone functions.  Ex. 4.1-7: [probability histogram for a joint p.m.f.]

6 Mathematical Expectation  If u(X1, X2) is a function of two random variables X1 & X2, then E[u(X1, X2)] = ΣΣ_{(x1,x2)∈S} u(x1, x2) f(x1, x2), if it exists, is called the mathematical expectation (or expected value) of u(X1, X2).  The mean of Xi, i = 1,2: μi = E(Xi).  The variance of Xi: σi² = E[(Xi − μi)²].  Ex. 4.1-8: A player selects a chip from a bowl having 8 chips: 3 marked (0,0), 2 marked (1,0), 2 marked (0,1), 1 marked (1,1); with (X1, X2) the pair on the selected chip, f(x1, x2) equals 3/8, 2/8, 2/8, 1/8 at those points.
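The chip example translates directly into code; a sketch computing E[u(X1, X2)] over the joint p.m.f. implied by the chip counts (the helper E is illustrative):

```python
from fractions import Fraction

# Ex 4.1-8: 8 chips -> joint p.m.f. of (X1, X2).
f = {(0, 0): Fraction(3, 8), (1, 0): Fraction(2, 8),
     (0, 1): Fraction(2, 8), (1, 1): Fraction(1, 8)}

def E(u):
    """Mathematical expectation E[u(X1, X2)] over the joint p.m.f."""
    return sum(u(x1, x2) * p for (x1, x2), p in f.items())

mu1 = E(lambda x1, x2: x1)                   # mean of X1 = 3/8
mu2 = E(lambda x1, x2: x2)                   # mean of X2 = 3/8
var1 = E(lambda x1, x2: (x1 - mu1) ** 2)     # variance of X1 = 15/64
print(mu1, mu2, var1)
```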

7 Joint Probability Density Function  The joint probability density function (joint p.d.f.) of two continuous-type random variables X & Y is an integrable function f(x,y) satisfying:  f(x,y) ≥ 0; ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) dx dy = 1;  P[(X,Y) ∈ A] = ∫∫_A f(x,y) dx dy, for an event A.  Ex. 4.1-9: X and Y have the joint p.d.f. shown, with A = {(x,y): 0 < x < 1, 0 < y < x}.  The respective marginal p.d.f.s are found by integrating out the other variable; here their product recovers the joint p.d.f., so X and Y are independent!

8 Independence of Continuous-Type R.V.s  Two continuous-type random variables X and Y are independent iff the joint p.d.f. factors into the product of their marginal p.d.f.s.  Ex. 4.1-10: X and Y have a joint p.d.f. with support S = {(x,y): 0 ≤ x ≤ y ≤ 1}, bounded by the lines x = 0, y = 1, and x = y.  The marginal p.d.f.s are found by integrating over the triangular support; their product does not equal the joint p.d.f. (and the support is not rectangular), so X and Y are dependent!  Various expected values can then be computed from the joint and marginal p.d.f.s.
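The density itself appeared only as an image on the slide; assuming f(x,y) = 2 on 0 < x < y < 1 (the density consistent with the conditional U(x,1) cited later in Ex. 4.3-5), the marginals and the dependence check can be verified numerically:

```python
from scipy import integrate

# Assumed joint p.d.f. (the slide's formula was an image): f(x,y) = 2 on 0 < x < y < 1.
f = lambda y, x: 2.0          # dblquad integrates over y first

# Total probability: y from x to 1, then x from 0 to 1 -> 1.
total, _ = integrate.dblquad(f, 0, 1, lambda x: x, lambda x: 1)
print(round(total, 6))        # 1.0

# Marginals: f1(x) = 2(1-x), f2(y) = 2y.
f1 = lambda x: integrate.quad(lambda y: 2.0, x, 1)[0]
f2 = lambda y: integrate.quad(lambda x: 2.0, 0, y)[0]
x, y = 0.3, 0.7
print(f1(x), f2(y))           # 1.4, 1.4
# f(x,y) = 2 but f1(x)*f2(y) = 1.96 here: X and Y are dependent.
```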

9 Multivariate Hypergeometric Distribution  Ex. 4.1-11: Of 200 students, 40 have A's, 60 have B's, and 100 have C's, D's, or F's.  A sample of size 25 is taken at random without replacement.  X1 is the number of A students, X2 is the number of B students, and 25 − X1 − X2 is the number of the other students.  The joint p.m.f. is f(x1, x2) = C(40, x1) C(60, x2) C(100, 25−x1−x2) / C(200, 25) on the space S = {(x1, x2): x1, x2 ≥ 0, x1 + x2 ≤ 25}.  The marginal p.m.f. of X1 can also be obtained from the knowledge of the model: X1 is hypergeometric, with 40 A's and 160 non-A's.  Since the joint p.m.f. does not factor, X1 and X2 are dependent!
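A sketch of this multivariate hypergeometric p.m.f., verifying that it sums to 1 and that the marginal of X1 matches the model-based hypergeometric:

```python
from math import comb

# Ex 4.1-11: 40 A's, 60 B's, 100 others; sample 25 without replacement.
def f(x1, x2):
    """Joint p.m.f. of (X1, X2): multivariate hypergeometric."""
    x3 = 25 - x1 - x2
    if x1 < 0 or x2 < 0 or x3 < 0:
        return 0.0
    return comb(40, x1) * comb(60, x2) * comb(100, x3) / comb(200, 25)

# The joint p.m.f. sums to 1 over the space S.
total = sum(f(x1, x2) for x1 in range(26) for x2 in range(26))
print(round(total, 10))       # 1.0

# Marginal of X1 is hypergeometric: f1(x1) = C(40,x1)C(160,25-x1)/C(200,25).
f1 = lambda x1: comb(40, x1) * comb(160, 25 - x1) / comb(200, 25)
print(abs(sum(f(5, x2) for x2 in range(21)) - f1(5)) < 1e-12)  # True
```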

10 Binomial ⇒ Trinomial Distribution  Trinomial Distribution: the experiment is repeated n times, with three outcomes per trial:  p1: perfect, p2: second, p3: defective, p3 = 1 − p1 − p2.  X1: the number of perfect items; X2: the number of seconds; X3: the number of defectives.  The joint p.m.f. is f(x1, x2) = n!/(x1! x2! x3!) p1^x1 p2^x2 p3^x3, where x3 = n − x1 − x2.  Marginally, X1 is b(n, p1) and X2 is b(n, p2); the two are dependent.  Ex. 4.1-13: In manufacturing a certain item,  95% of the items are good, 4% are "seconds," and 1% are defective.  An inspector observes n = 20 items selected at random, counting the number X of seconds and the number Y of defectives.  The probability that at least 2 seconds or at least 2 defective items are found, i.e., P[(X,Y) ∈ A] with A = {(x,y): x ≥ 2 or y ≥ 2}, equals 1 − P(X ≤ 1 and Y ≤ 1).
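The probability in Ex. 4.1-13 is easiest to compute through the complement; a sketch:

```python
from math import comb

# Ex 4.1-13: n = 20, P(second) = 0.04, P(defective) = 0.01, P(good) = 0.95.
n, p1, p2 = 20, 0.04, 0.01

def f(x, y):
    """Trinomial joint p.m.f.: x seconds, y defectives, n-x-y good items."""
    z = n - x - y
    return comb(n, x) * comb(n - x, y) * p1**x * p2**y * (1 - p1 - p2)**z

# P(X >= 2 or Y >= 2) = 1 - P(X <= 1 and Y <= 1).
p_complement = sum(f(x, y) for x in (0, 1) for y in (0, 1))
print(round(1 - p_complement, 4))   # approximately 0.2038
```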

11 Correlation Coefficient  For two random variables X1 & X2:  The mean of Xi, i = 1,2: μi = E(Xi).  The variance of Xi: σi² = E[(Xi − μi)²].  The covariance of X1 & X2 is Cov(X1, X2) = σ12 = E[(X1 − μ1)(X2 − μ2)].  The correlation coefficient of X1 & X2 is ρ = σ12/(σ1σ2).  Ex. 4.2-1: X1 & X2 have a joint p.m.f. that is not a product of an x1-alone and an x2-alone factor ⇒ dependent!
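Since the table of Ex. 4.2-1 was rendered as an image, the sketch below applies these definitions to the p.m.f. of Ex. 4.1-4 instead:

```python
from fractions import Fraction

# Reusing f(x,y) = (x+y)/21 (Ex 4.1-4), since Ex 4.2-1's table was an image.
f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}

E = lambda u: sum(u(x, y) * p for (x, y), p in f.items())
mu1, mu2 = E(lambda x, y: x), E(lambda x, y: y)
var1 = E(lambda x, y: (x - mu1) ** 2)
var2 = E(lambda x, y: (y - mu2) ** 2)
cov = E(lambda x, y: (x - mu1) * (y - mu2))      # covariance of X and Y
rho = float(cov) / (float(var1) ** 0.5 * float(var2) ** 0.5)
print(mu1, mu2, cov, rho)    # cov = -2/147: slightly negative correlation
```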

12 Insights into the Meaning of ρ  Among all points in S, ρ tends to be positive if points simultaneously above (or below) their respective means carry larger probability.  The least-squares regression line is the line through (μX, μY) with the best slope b, chosen so that K(b) = E{[(Y − μY) − b(X − μX)]²} is minimized.  K(b) is the expected squared vertical distance from a point to the line.  ρ = ±1: K(b) = 0 ⇒ all the probability lies on the least-squares regression line.  ρ = 0: K(b) = σY², and the line is y = μY; X and Y could be independent!!  ρ measures the amount of linearity in the probability distribution.
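Expanding K(b) makes both claims explicit (a short derivation, not on the slide):

```latex
K(b) = E\{[(Y-\mu_Y) - b(X-\mu_X)]^2\}
     = \sigma_Y^2 - 2b\,\rho\sigma_X\sigma_Y + b^2\sigma_X^2,
\qquad
K'(b) = 0 \;\Rightarrow\; b = \rho\,\frac{\sigma_Y}{\sigma_X},
\qquad
K\!\Big(\rho\frac{\sigma_Y}{\sigma_X}\Big) = \sigma_Y^2(1-\rho^2).
```

Setting ρ = ±1 gives K = 0 (all probability on the line), while ρ = 0 gives K = σY², matching the bullets above.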

13 Example  Ex. 4.2-2: Roll a pair of fair 4-sided dice; X is the number of ones, Y is the number of twos and threes.  (X, Y) is trinomial with n = 2, p1 = 1/4, p2 = 1/2, which gives the joint p.m.f.  The line of best fit is y = μY + ρ(σY/σX)(x − μX).

14 Independence ⇒ ρ = 0  Independence gives E[XY] = E[X]E[Y], so Cov(X,Y) = 0 and ρ = 0; the converse is not necessarily true!  Ex. 4.2-3: The joint p.m.f. of X and Y is f(x,y) = 1/3, (x,y) = (0,1), (1,0), (2,1).  Here ρ = 0, but the support is not rectangular, so X and Y are dependent.  Empirical data: from n bivariate observations (xi, yi), i = 1..n,  we can compute the sample mean and variance for each variate,  and also the sample correlation coefficient and the sample least-squares regression line (ref. p. 241).
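A quick check of Ex. 4.2-3 in exact arithmetic; the dependence shows up as a zero of the joint p.m.f. where the product of marginals is positive:

```python
from fractions import Fraction

# Ex 4.2-3: f(x,y) = 1/3 at (0,1), (1,0), (2,1).
f = {(0, 1): Fraction(1, 3), (1, 0): Fraction(1, 3), (2, 1): Fraction(1, 3)}

E = lambda u: sum(u(x, y) * p for (x, y), p in f.items())
mu_x, mu_y = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: (x - mu_x) * (y - mu_y))
print(cov)   # 0: rho = 0, yet X and Y are dependent

# Dependence directly: f(0,0) = 0, but f1(0)*f2(0) = (1/3)*(1/3) > 0.
```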

15 Conditional Distributions  Def. 4.3-1: The conditional probability mass function of X, given that Y = y, is defined by g(x|y) = f(x,y)/f2(y), provided f2(y) > 0.  Likewise, h(y|x) = f(x,y)/f1(x), provided f1(x) > 0.  Ex. 4.3-1: X and Y have the joint p.m.f. f(x,y) = (x+y)/21, x = 1,2,3; y = 1,2 (dependent, as shown earlier).  f1(x) = (2x+3)/21, x = 1,2,3; f2(y) = (3y+6)/21, y = 1,2.  Thus, given Y = y, the conditional p.m.f. of X is g(x|y) = (x+y)/(3y+6).  When y = 1, g(x|1) = (x+1)/9, x = 1,2,3; g(1|1):g(2|1):g(3|1) = 2:3:4.  When y = 2, g(x|2) = (x+2)/12, x = 1,2,3; g(1|2):g(2|2):g(3|2) = 3:4:5.  Similar relationships for h(y|x) can be obtained.

16 Conditional Mean and Variance  The conditional mean of Y, given X = x, is E(Y|x) = Σ_y y h(y|x).  The conditional variance of Y, given X = x, is Var(Y|x) = E{[Y − E(Y|x)]² | x} = E(Y²|x) − [E(Y|x)]².  Ex. 4.3-2 [from Ex. 4.3-1]: X and Y have the joint p.m.f. f(x,y) = (x+y)/21, x = 1,2,3; y = 1,2.
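A sketch computing the conditional mean and variance of Y given X = x for this p.m.f.:

```python
from fractions import Fraction

# Ex 4.3-2: f(x,y) = (x+y)/21, x = 1,2,3, y = 1,2.
f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}
f1 = {x: sum(f[(x, y)] for y in (1, 2)) for x in (1, 2, 3)}

def h(y, x):
    """Conditional p.m.f. h(y|x) = f(x,y) / f1(x)."""
    return f[(x, y)] / f1[x]

def cond_mean(x):
    return sum(y * h(y, x) for y in (1, 2))

def cond_var(x):
    m = cond_mean(x)
    return sum((y - m) ** 2 * h(y, x) for y in (1, 2))

for x in (1, 2, 3):
    print(x, cond_mean(x), cond_var(x))   # e.g. E(Y|1) = 8/5
```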

17 Relationship between Conditional Means  When the conditional means are linear, E(Y|x) = μY + ρ(σY/σX)(x − μX) and E(X|y) = μX + ρ(σX/σY)(y − μY).  The point (μX, μY) lies on both lines; it is their intersection.  The product of the two slopes is ρ².  The ratio of the slopes is σY²/σX².  These relations allow an unknown quantity to be derived from the others when they are known.

18 Example  Ex. 4.3-3: X and Y have the trinomial p.m.f. with parameters n, p1, p2, p3 = 1 − p1 − p2.  They have the marginal p.m.f.s b(n, p1) and b(n, p2), so the conditional distribution of Y, given X = x, is b(n − x, p2/(1 − p1)), and E(Y|x) = (n − x)p2/(1 − p1), a linear function of x.

19 Example for Continuous-Type R.V.s  Ex. 4.3-5 [from Ex. 4.1-10]: The conditional distribution of Y, given X = x, is U(x, 1).  [U(a,b) has mean (b+a)/2 and variance (b−a)²/12.]  Hence E(Y|x) = (1+x)/2 and Var(Y|x) = (1−x)²/12.

20 Bivariate Normal Distribution  The joint p.d.f. of X ~ N(μX, σX²) and Y ~ N(μY, σY²) with correlation coefficient ρ is f(x,y) = exp[−q(x,y)/2] / (2πσXσY√(1−ρ²)), where q(x,y) = [((x−μX)/σX)² − 2ρ((x−μX)/σX)((y−μY)/σY) + ((y−μY)/σY)²]/(1−ρ²).  Therefore E(Y|x) = μY + ρ(σY/σX)(x − μX), a linear function of x, and Var(Y|x) = σY²(1 − ρ²), a constant w.r.t. x.

21 Examples  Ex. 5.6-1.  Ex. 5.6-2.

22 Bivariate Normal: ρ = 0 ⇒ Independence  Thm. 5.6-1: For X and Y with a bivariate normal distribution with correlation coefficient ρ, X and Y are independent iff ρ = 0.  The same holds for trivariate and general multivariate normal distributions.  When ρ = 0, the joint p.d.f. factors into the product of the two marginal normal p.d.f.s.

23 Transformations of R.V.s  In Section 3.5, a single variable X with p.d.f. f(x) is transformed to Y = v(X), an increasing or decreasing function, via g(y) = f(v⁻¹(y)) |d v⁻¹(y)/dy| for the continuous type, and g(y) = f(v⁻¹(y)) for the discrete type.  Ex. 4.4-1: X ~ b(n,p), Y = X²; if n = 3, p = 1/4, then g(y) = f(√y), y = 0, 1, 4, 9.  What transformation u(X/n) leads to a variance free of p? Taylor's expansion of u about p gives Var[u(X/n)] ≈ [u′(p)]² p(1−p)/n; requiring this to be constant, or free of p, yields u(x) = arcsin(√x), with approximate variance 1/(4n).  Ex: X ~ b(100, 1/4) or b(100, 9/10).
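A simulation makes the variance-stabilizing claim concrete; the constant 1/(4n) emerges for several values of p (a sketch, assuming the arcsin-square-root form derived above):

```python
import math
import random

# Variance-stabilizing transformation for a binomial proportion:
# u(X/n) = arcsin(sqrt(X/n)) has approximate variance 1/(4n), free of p.
def sample_var(p, n=100, reps=20000, rng=random.Random(1)):
    vals = []
    for _ in range(reps):
        x = sum(rng.random() < p for _ in range(n))   # one b(n, p) draw
        vals.append(math.asin(math.sqrt(x / n)))
    m = sum(vals) / reps
    return sum((v - m) ** 2 for v in vals) / (reps - 1)

for p in (0.25, 0.5, 0.9):
    print(p, round(sample_var(p), 5))   # all close to 1/(4*100) = 0.0025
```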

24 Multivariate Transformations  When the function Y = u(X) does not have a single-valued inverse, the possible inverse functions must be considered individually,  with each range delimited to match the right inverse.  For multivariate transformations, the derivative is replaced by the Jacobian.  Let continuous random variables X1 and X2 have the joint p.d.f. f(x1, x2).  If Y1 = u1(X1, X2), Y2 = u2(X1, X2) has the single-valued inverse X1 = w1(Y1, Y2), X2 = w2(Y1, Y2), then the joint p.d.f. of Y1 and Y2 is g(y1, y2) = f(w1(y1, y2), w2(y1, y2)) |J|, where J = ∂(x1, x2)/∂(y1, y2) is the Jacobian.  [Most difficult] The mapping of the supports must also be worked out.

25 Transformation to the Independent  Ex. 4.4-2: X1 and X2 have the joint p.d.f. f(x1, x2) = 2, 0 < x1 < x2 < 1.  Consider Y1 = X1/X2, Y2 = X2, with inverse X1 = Y1Y2, X2 = Y2 and Jacobian J = y2.  The support maps onto 0 < y1 < 1, 0 < y2 < 1, so g(y1, y2) = 2y2 there.  The marginal p.d.f.s are g1(y1) = 1, 0 < y1 < 1, and g2(y2) = 2y2, 0 < y2 < 1.  ∵ g(y1, y2) = g1(y1)g2(y2) ∴ Y1, Y2 are independent.
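A Monte Carlo check of Ex. 4.4-2 by rejection sampling from the triangle; the near-zero sample covariance is consistent with (though not proof of) independence:

```python
import random

# (X1, X2) uniform on the triangle 0 < x1 < x2 < 1 has joint p.d.f. 2;
# then Y1 = X1/X2 and Y2 = X2 should be independent,
# with g1(y1) = 1 on (0,1) and g2(y2) = 2*y2 on (0,1).
rng = random.Random(7)
y1s, y2s = [], []
while len(y1s) < 100000:
    x1, x2 = rng.random(), rng.random()
    if x1 < x2:                      # accept points in the triangle
        y1s.append(x1 / x2)
        y2s.append(x2)

m1 = sum(y1s) / len(y1s)             # E[Y1] = 1/2 for U(0,1)
m2 = sum(y2s) / len(y2s)             # E[Y2] = 2/3 for density 2*y2
cov = sum((a - m1) * (b - m2) for a, b in zip(y1s, y2s)) / len(y1s)
print(round(m1, 3), round(m2, 3), round(cov, 4))  # ~0.5, ~0.667, ~0
```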

26 Transformation to the Dependent  Ex. 4.4-3: X1 and X2 are independent, each with p.d.f. f(x) = e^(−x), 0 < x < ∞.  Their joint p.d.f. is f(x1, x2) = e^(−x1) e^(−x2), 0 < x1 < ∞, 0 < x2 < ∞.  Consider Y1 = X1 − X2, Y2 = X1 + X2, with inverse X1 = (Y2 + Y1)/2, X2 = (Y2 − Y1)/2 and |J| = 1/2.  The support maps onto −y2 < y1 < y2, 0 < y2 < ∞, so g(y1, y2) = (1/2)e^(−y2) there.  The marginal p.d.f. of Y1 is g1(y1) = (1/2)e^(−|y1|), −∞ < y1 < ∞: the double exponential p.d.f.  ∵ g(y1, y2) ≠ g1(y1)g2(y2) ∴ Y1, Y2 are dependent.

27 Beta Distribution  Ex. 4.4-4: X1 and X2 have independent gamma distributions with parameters α, θ and β, θ.  Their joint p.d.f. is the product of the two gamma p.d.f.s.  Consider Y1 = X1/(X1 + X2), Y2 = X1 + X2, i.e., X1 = Y1Y2, X2 = Y2 − Y1Y2, with |J| = y2.  Then Y1 has the beta p.d.f. with parameters α and β, and Y2 has the gamma p.d.f. with parameters α + β and θ.  ∵ g(y1, y2) = g1(y1)g2(y2) ∴ Y1, Y2 are independent.

28 Box-Muller Transformation  Ex. 5.3-4: X1 and X2 have independent uniform distributions U(0,1).  Consider Z1 = √(−2 ln X1) cos(2πX2), Z2 = √(−2 ln X1) sin(2πX2).  Two independent U(0,1) ⇒ two independent N(0,1)!!
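A direct implementation of the transformation (a sketch; the 1 − U trick merely avoids log(0)):

```python
import math
import random

def box_muller(rng=random):
    """Map two independent U(0,1) draws to two independent N(0,1) draws."""
    x1 = 1.0 - rng.random()          # in (0, 1]; avoids log(0)
    x2 = rng.random()
    r = math.sqrt(-2.0 * math.log(x1))
    return r * math.cos(2.0 * math.pi * x2), r * math.sin(2.0 * math.pi * x2)

# Sanity check: sample mean ~ 0 and sample variance ~ 1.
zs = [z for _ in range(50000) for z in box_muller()]
m = sum(zs) / len(zs)
v = sum((z - m) ** 2 for z in zs) / (len(zs) - 1)
print(round(m, 3), round(v, 3))
```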

29 Distribution Function Technique  Ex. 5.3-5: Z is N(0,1), U is χ²(r), and Z and U are independent.  Starting from the joint p.d.f. of Z and U, the distribution function technique shows that Z² + U is χ²(r+1).

30 Another Example  Ex. 4.4-5: U ~ χ²(r1) and V ~ χ²(r2) are independent.  From the joint p.d.f. of U and V, the same technique shows that U + V is χ²(r1 + r2).  Knowledge of standard distributions and their associated integration identities is useful for deriving the distributions of new random variables.

31 Order Statistics  The order statistics are the observations of the random sample arranged in magnitude from the smallest to the largest.  Assume there are no ties (identical observations).  Ex. 6.9-1: n = 5 trials {0.62, 0.98, 0.31, 0.81, 0.53} from the p.d.f. f(x) = 2x, 0 < x < 1. The order statistics are {0.31, 0.53, 0.62, 0.81, 0.98}.  The sample median is 0.62, and the sample range is 0.98 − 0.31 = 0.67.  Ex. 6.9-2: Let Y1 < Y2 < Y3 < Y4 < Y5 be the order statistics of X1, X2, X3, X4, X5, each from the p.d.f. f(x) = 2x, 0 < x < 1.  Consider P(Y4 < 1/2) ≡ at least 4 of the Xi's must be less than 1/2; with success probability F(1/2) = 1/4 per trial, P(Y4 < 1/2) = C(5,4)(1/4)⁴(3/4) + (1/4)⁵ = 1/64.
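Both the exact binomial computation and a simulation of P(Y4 < 1/2) take only a few lines:

```python
from math import comb
import random

# Ex 6.9-2: F(y) = y^2 for f(x) = 2x on (0,1), so F(1/2) = 1/4.
# P(Y4 < 1/2) = P(at least 4 of the 5 observations are below 1/2).
p = 0.25
exact = sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in (4, 5))
print(exact)   # 1/64 = 0.015625

# Simulation check: draw X by inversion, X = sqrt(U) since F(x) = x^2.
rng = random.Random(3)
hits = sum(sorted(rng.random() ** 0.5 for _ in range(5))[3] < 0.5
           for _ in range(200000))
print(round(hits / 200000, 4))   # close to 0.0156
```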

32 General Cases  The event that the rth order statistic Yr is at most y, {Yr ≤ y}, occurs iff at least r of the n observations are no more than y.  The probability of "success" on each trial is F(y), and we must have at least r successes. Thus G_r(y) = P(Yr ≤ y) = Σ_{k=r}^{n} C(n,k)[F(y)]^k[1 − F(y)]^(n−k), and differentiating gives the p.d.f. g_r(y) = n!/[(r−1)!(n−r)!] [F(y)]^(r−1)[1 − F(y)]^(n−r) f(y).

33 Alternative Approach  A heuristic way to obtain g_r(y): within a short interval Δy about y,  (r−1) of the items fall below y, one falls in [y, y+Δy], and (n−r) fall above y+Δy.  The multinomial probability with n trials is then approximately g_r(y)Δy ≈ n!/[(r−1)! 1! (n−r)!] [F(y)]^(r−1) [f(y)Δy] [1 − F(y)]^(n−r), where f(y)Δy is the probability of landing in the short interval on a single trial.  Ex. 5.9-3 (from Ex. 6.9-2): Y1 < Y2 < Y3 < Y4 < Y5 are the order statistics of X1, ..., X5, each from the p.d.f. f(x) = 2x, 0 < x < 1; then g4(y) = 20(y²)³(1 − y²)(2y) = 40y⁷(1 − y²), 0 < y < 1.

34 More Examples  Ex: 4 independent trials (Y1 ~ Y4) from a distribution with f(x) = 1, 0 < x < 1.  Find the p.d.f. of Y3: g3(y) = 4!/(2!1!) y²(1 − y) = 12y²(1 − y), 0 < y < 1.  Ex: 7 independent trials (Y1 ~ Y7) from a distribution with f(x) = 3(1 − x)², 0 < x < 1.  Find the probability that the sample median, i.e. Y4, is less than a given value.  Method 1: find g4(y), then integrate it up to that value.  Method 2: find the probability of at least 4 binomial "successes," then use Table II on p. 647.

35 Order Statistics of Uniform Distributions  Thm. 3.5-2: if X has a continuous distribution function F(x), then F(X) is U(0,1). {F(X1), F(X2), …, F(Xn)} ⇒ the Wi's are the order statistics of n independent observations from U(0,1).  The distribution function of U(0,1) is G(w) = w, 0 < w < 1.  The p.d.f. of the rth order statistic Wr = F(Yr) is the beta p.d.f. g_r(w) = n!/[(r−1)!(n−r)!] w^(r−1)(1 − w)^(n−r), 0 < w < 1, with mean r/(n+1).  ⇒ The Y's partition the support of X into n+1 parts, and thus n+1 areas under f(x) and above the x-axis.  Each area equals 1/(n+1) on average.

36 Percentiles  The (100p)th percentile πp is defined s.t. the area under f(x) to the left of πp is p.  Therefore, Yr is an estimator of πp, where r = (n+1)p.  In case (n+1)p is not an integer, a weighted average of Yr and Yr+1 is used, where r = floor[(n+1)p] and the weight is the fractional part of (n+1)p.  The sample median is the 50th percentile.  Ex. 6.9-5: X is the weight of soap; n = 12 observations of X are listed in order:  1013, 1019, 1021, 1024, 1026, 1028, 1033, 1035, 1039, 1040, 1043, …  ∵ n = 12, the sample median is (y6 + y7)/2.  ∵ (n+1)(0.25) = 3.25, the 25th percentile or first quartile is 0.75·y3 + 0.25·y4.  ∵ (n+1)(0.75) = 9.75, the 75th percentile or third quartile is 0.25·y9 + 0.75·y10.  ∵ (n+1)(0.6) = 7.8, the 60th percentile is 0.2·y7 + 0.8·y8.
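A sketch of the r = (n+1)p rule; the data here are illustrative only, not the slide's soap weights (the last of which was cut off above):

```python
def sample_percentile(sorted_data, p):
    """(100p)th sample percentile via the r = (n+1)p rule:
    interpolate between y_r and y_{r+1} when (n+1)p is not an integer."""
    n = len(sorted_data)
    r = (n + 1) * p
    k = int(r)                      # floor[(n+1)p]
    if k < 1 or k >= n:
        raise ValueError("p too extreme for this sample size")
    w = r - k                       # fractional part of (n+1)p
    return (1 - w) * sorted_data[k - 1] + w * sorted_data[k]

# Hypothetical sorted sample of size 12 (the final value is made up).
data = [1013, 1019, 1021, 1024, 1026, 1028,
        1033, 1035, 1039, 1040, 1043, 1047]
print(sample_percentile(data, 0.25))   # 0.75*1021 + 0.25*1024 = 1021.75
print(sample_percentile(data, 0.5))    # (1028 + 1033)/2 = 1030.5
```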

37 Another Example  Ex. 5.6-7: Y1 < Y2 < … < Y13 are the order statistics of 13 independent trials from a continuous-type distribution with 35th percentile π0.35.  Find P(Y3 < π0.35 < Y7).  The event {Y3 < π0.35 < Y7} happens iff there are at least 3 but fewer than 7 "successes," where a success is an observation below π0.35, with success probability p = 0.35 on each trial.  Thus P(Y3 < π0.35 < Y7) = Σ_{k=3}^{6} C(13,k)(0.35)^k(0.65)^(13−k), evaluated via Table II on pp. 677–681.
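Without the tables, the sum is a one-liner:

```python
from math import comb

# P(Y3 < pi_0.35 < Y7) for n = 13: at least 3 but fewer than 7 of the
# observations fall below the 35th percentile (success probability 0.35).
p = 0.35
prob = sum(comb(13, k) * p**k * (1 - p)**(13 - k) for k in range(3, 7))
print(round(prob, 4))   # approximately 0.7574
```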