1
Chapter 4 DeGroot & Schervish
2
Variance Although the mean of a distribution is a useful summary, it does not convey very much information about the distribution. A random variable X with mean 2 has the same mean as the constant random variable Y such that Pr(Y = 2) = 1 even if X is not constant! To distinguish the distribution of X from the distribution of Y in this case, it might be useful to give some measure of how spread out the distribution of X is. The variance of X is one such measure. The standard deviation of X is the square root of the variance.
3
Stock Price Changes Consider the prices A and B of two stocks at a time one month in the future. Assume that A has the uniform distribution on the interval [25, 35] and B has the uniform distribution on the interval [15, 45]. Both stocks have a mean price of 30. But the distributions are very different.
4
Stock Price Changes
5
Variance/Standard Deviation Let X be a random variable with finite mean μ = E(X). The variance of X, denoted by Var(X), is defined as Var(X) = E[(X − μ)²]. The standard deviation of X is the nonnegative square root of Var(X), if the variance exists. When only one random variable is being discussed, it is common to denote its standard deviation by the symbol σ and its variance by σ².
6
Stock Price Changes Return to the two random variables A and B in the example. For the uniform distribution on the interval [a, b], the variance is (b − a)²/12. Hence Var(A) = (35 − 25)²/12 = 100/12 ≈ 8.33, so σ_A ≈ 2.89, while Var(B) = (45 − 15)²/12 = 900/12 = 75, so σ_B ≈ 8.66. Both stocks have mean price 30, but the distribution of B is far more spread out than that of A.
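A quick numerical check of these two variances (a sketch, assuming NumPy is available; the exact values come from the closed form (b − a)²/12 for the uniform distribution on [a, b]):

import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(25, 35, size=1_000_000)   # simulated prices of stock A
b = rng.uniform(15, 45, size=1_000_000)   # simulated prices of stock B
print(a.mean(), a.var())                  # roughly 30 and 100/12 ≈ 8.33
print(b.mean(), b.var())                  # roughly 30 and 900/12 = 75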
7
Variance and Standard Deviation of a Discrete Distribution Suppose that a random variable X can take each of the five values −2, 0, 1, 3, and 4 with equal probability. Then E(X) = (1/5)(−2 + 0 + 1 + 3 + 4) = 1.2. Let W = (X − μ)², so that Var(X) = E(W) = (1/5)[(−3.2)² + (−1.2)² + (−0.2)² + (1.8)² + (2.8)²] = 4.56.
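The same numbers can be checked directly from the definition Var(X) = E[(X − μ)²]; a minimal Python sketch:

values = [-2, 0, 1, 3, 4]
probs = [1/5] * 5
mu = sum(p * x for p, x in zip(probs, values))               # 1.2
var = sum(p * (x - mu) ** 2 for p, x in zip(probs, values))  # 4.56
print(mu, var)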
8
Properties of the Variance Theorem: Var(X) = 0 if and only if there exists a constant c such that Pr(X = c) = 1.
9
Properties of the Variance Theorem: For constants a and b, if Y = aX + b, then Var(Y) = a² Var(X) and σ_Y = |a| σ_X.
10
Calculating the Variance and Standard Deviation of a Linear Function Suppose that a random variable X can take each of the five values −2, 0, 1, 3, and 4 with equal probability. Determine the variance and standard deviation of Y = 4X − 7. The mean of X is μ = 1.2 and the variance of X is 4.56, so Var(Y) = 16 Var(X) = 72.96. Also, the standard deviation of Y is σ_Y = 4σ_X = 4(4.56)^(1/2) ≈ 8.54.
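As a check, the same result is obtained by transforming each value of X into the corresponding value of Y = 4X − 7 and computing the variance of Y directly (a small sketch in plain Python):

import math

values_x = [-2, 0, 1, 3, 4]
values_y = [4 * x - 7 for x in values_x]            # possible values of Y = 4X - 7
mu_y = sum(values_y) / 5                            # -2.2 = 4(1.2) - 7
var_y = sum((y - mu_y) ** 2 for y in values_y) / 5  # 72.96 = 16 * 4.56
print(var_y, math.sqrt(var_y))                      # 72.96  8.541...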
11
Theorem For every random variable X, Var(X) = E(X²) − [E(X)]². Proof Writing μ = E(X), Var(X) = E[(X − μ)²] = E(X² − 2μX + μ²) = E(X²) − 2μE(X) + μ² = E(X²) − μ².
12
Theorem If X1,..., Xn are independent random variables with finite means, then Var(X1 +... + Xn) = Var(X1) +... + Var(Xn).
14
The Variance of a Binomial Distribution Suppose that a box contains red balls and blue balls, and that the proportion of red balls is p (0 ≤ p ≤ 1). Suppose that n balls are selected from the box with replacement. For i = 1, ..., n, let Xi = 1 if the ith ball selected is red, and let Xi = 0 otherwise. If X denotes the total number of red balls in the sample, then X = X1 + ... + Xn, and X has the binomial distribution with parameters n and p.
15
Since X1, ..., Xn are independent, it follows from the preceding theorem that Var(X) = Var(X1) + ... + Var(Xn). For each i, E(Xi) = p, and since Xi² = Xi, E(Xi²) = E(Xi) = p. Hence Var(Xi) = E(Xi²) − [E(Xi)]² = p − p² = p(1 − p), and therefore Var(X) = np(1 − p).
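A short check of Var(X) = np(1 − p) against the exact binomial probability function (a sketch for hypothetical values n = 10 and p = 0.3):

from math import comb

n, p = 10, 0.3
pf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]  # binomial p.f.
mean = sum(k * pk for k, pk in enumerate(pf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pf))
print(mean, var, n * p * (1 - p))   # 3.0, 2.1, 2.1 (up to rounding)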
16
Moments For a random variable X, the means of the powers X^k (called moments) for k > 2 have useful theoretical properties, and some of them are used as additional summaries of a distribution. The moment generating function is a related tool.
17
Existence of Moments For each random variable X and every positive integer k, the expectation E(X^k) is called the kth moment of X. In particular, in accordance with this terminology, the mean of X is the first moment of X.
18
Existence of Moments Suppose that X is a random variable for which E(X) = μ. For every positive integer k, the expectation E[(X − μ)^k] is called the kth central moment of X, or the kth moment of X about the mean. In particular, in accordance with this terminology, the variance of X is the second central moment of X.
19
Moment Generating Functions Let X be a random variable. For each real number t, let ψ(t) = E(e^(tX)). The function ψ(t) is called the moment generating function (abbreviated m.g.f.) of X. The m.g.f. of X depends only on the distribution of X: since the m.g.f. is the expected value of a function of X, it must depend only on the distribution of X. If X and Y have the same distribution, they must have the same m.g.f.
20
Theorem Let X be a random variable whose m.g.f. ψ(t) is finite for all values of t in some open interval around the point t = 0. Then, for each integer n > 0, the nth moment of X, E(X^n), is finite and equals the nth derivative of ψ evaluated at t = 0. That is, E(X^n) = ψ^(n)(0) for n = 1, 2, ....
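A symbolic check of this theorem for the five-point distribution used earlier (a sketch assuming SymPy is available): differentiating its m.g.f. at t = 0 recovers E(X) = 1.2 and E(X²) = 6, hence Var(X) = 4.56.

import sympy as sp

t = sp.symbols('t')
values = [-2, 0, 1, 3, 4]
psi = sum(sp.Rational(1, 5) * sp.exp(t * x) for x in values)  # m.g.f. psi(t) = E(e^(tX))
m1 = sp.diff(psi, t).subs(t, 0)      # first moment E(X) = 6/5
m2 = sp.diff(psi, t, 2).subs(t, 0)   # second moment E(X^2) = 6
print(m1, m2, m2 - m1**2)            # 6/5, 6, 114/25 (= 4.56)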
21
Example
23
Properties of Moment Generating Functions Theorem Let X be a random variable for which the m.g.f. is ψ1; let Y = aX + b, where a and b are given constants; and let ψ2 denote the m.g.f. of Y. Then for every value of t such that ψ1(at) is finite, ψ2(t) = e^(bt) ψ1(at).
24
Example
25
Theorem Suppose that X1, ..., Xn are n independent random variables, and for i = 1, ..., n, let ψi denote the m.g.f. of Xi. Let Y = X1 + ... + Xn, and let the m.g.f. of Y be denoted by ψ. Then for every value of t such that ψi(t) is finite for i = 1, ..., n, ψ(t) = ψ1(t)ψ2(t) · · · ψn(t).
26
Proof Since X1, ..., Xn are independent, the random variables e^(tX1), ..., e^(tXn) are also independent, so ψ(t) = E(e^(t(X1+...+Xn))) = E(e^(tX1) · · · e^(tXn)) = E(e^(tX1)) · · · E(e^(tXn)) = ψ1(t) · · · ψn(t).
27
The Moment Generating Function for the Binomial Distribution Suppose that a random variable X has the binomial distribution with parameters n and p. The mean and the variance of X are determined by representing X as the sum of n independent random variables X1,..., Xn. The distribution of each variable Xi is as follows: Pr(Xi = 1) = p and Pr(Xi = 0) = 1− p. Now use this representation to determine the m.g.f. of X = X1 +... + Xn.
28
The Moment Generating Function for the Binomial Distribution Since each Xi takes the value 1 with probability p and the value 0 with probability 1 − p, its m.g.f. is ψi(t) = E(e^(tXi)) = pe^t + (1 − p). Because X = X1 + ... + Xn and the Xi are independent, the m.g.f. of X is ψ(t) = (pe^t + 1 − p)^n.
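A symbolic sketch (again assuming SymPy): differentiating ψ(t) = (pe^t + 1 − p)^n at t = 0 reproduces E(X) = np and Var(X) = np(1 − p), consistent with the earlier derivation.

import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)
psi = (p * sp.exp(t) + 1 - p) ** n         # m.g.f. of the binomial distribution
m1 = sp.diff(psi, t).subs(t, 0)            # E(X)
m2 = sp.diff(psi, t, 2).subs(t, 0)         # E(X^2)
print(sp.simplify(m1))                     # n*p
print(sp.factor(sp.simplify(m2 - m1**2)))  # n*p*(1 - p), up to an equivalent form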
29
Uniqueness of Moment Generating Functions Theorem If the m.g.f.’s of two random variables X1 and X2 are finite and identical for all values of t in an open interval around the point t = 0, then the probability distributions of X1 and X2 must be identical.
30
The Additive Property of the Binomial Distribution If X1 and X2 are independent random variables, and if Xi has the binomial distribution with parameters ni and p (i = 1, 2), then X1 + X2 has the binomial distribution with parameters n1 + n2 and p. This follows from the uniqueness theorem, since the m.g.f. of X1 + X2 is (pe^t + 1 − p)^(n1) (pe^t + 1 − p)^(n2) = (pe^t + 1 − p)^(n1+n2), which is the m.g.f. of the binomial distribution with parameters n1 + n2 and p.
31
The Mean and the Median The mean of a distribution is one measure of its central location; the median is another. Let X be a random variable. Every number m with the following property is called a median of the distribution of X: Pr(X ≤ m) ≥ 1/2 and Pr(X ≥ m) ≥ 1/2. In particular, the 1/2 quantile is a median.
32
Example The Median of a Discrete Distribution: Suppose that X has the following discrete distribution: Pr(X = 1) = 0.1, Pr(X = 2) = 0.2, Pr(X = 3) = 0.3, Pr(X = 4) = 0.4. The value 3 is a median of this distribution because Pr(X ≤ 3) = 0.6, which is greater than 1/2, and Pr(X ≥ 3) = 0.7, which is also greater than 1/2. Furthermore, 3 is the unique median of this distribution.
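A small sketch of how a median can be found mechanically for a discrete distribution: take the smallest value whose cumulative probability reaches 1/2.

values = [1, 2, 3, 4]
probs = [0.1, 0.2, 0.3, 0.4]
cum = 0.0
for x, pr in zip(values, probs):
    cum += pr
    if cum >= 0.5:          # first value with Pr(X <= x) >= 1/2
        print(x)            # prints 3
        break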
33
Example A Discrete Distribution for Which the Median Is Not Unique: Suppose that X has the following discrete distribution: Pr(X = 1) = 0.1, Pr(X = 2) = 0.4, Pr(X = 3) = 0.3, Pr(X = 4) = 0.2. Pr(X ≤ 2) = 1/2, and Pr(X ≥ 3) = 1/2. Therefore, every value of m in the closed interval 2 ≤ m ≤ 3 will be a median of this distribution. The most popular choice of median of this distribution would be the midpoint 2.5.
34
Example The Median of a Continuous Distribution. Suppose that X has a continuous distribution for which the p.d.f. is as follows:
35
Mean Squared Error/M.S.E. Suppose that X is a random variable with mean μ and variance σ². Suppose also that the value of X is to be observed in some experiment, but this value must be predicted before the observation can be made. One basis for making the prediction is to select some number d for which the expected value of the square of the error X − d will be a minimum. The number E[(X − d)²] is called the mean squared error (M.S.E.) of the prediction d. The number d for which the M.S.E. is minimized is E(X).
36
Mean Absolute Error/M.A.E. Another possible basis for predicting the value of a random variable X is to choose some number d for which E(|X − d|) will be a minimum. The M.A.E. is minimized when the chosen value of d is a median of the distribution of X.
37
Predicting a Discrete Uniform Random Variable. Suppose that the probability is 1/6 that a random variable X will take each of the following six values: 1, 2, 3, 4, 5, 6. Determine the prediction for which the M.S.E. is minimum and the prediction for which the M.A.E. is minimum. In this example, E(X) = 1/6(1+ 2 + 3 + 4 + 5 + 6) = 3.5. Therefore, the M.S.E. will be minimized by the unique value d = 3.5. Also, every number m in the closed interval 3 ≤ m ≤ 4 is a median of the given distribution. Therefore, the M.A.E. will be minimized by every value of d such that 3 ≤ d ≤ 4. Because the distribution of X is symmetric, the mean of X is also a median of X.
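A quick numerical check of both claims (plain Python; the grid of candidate predictions d is chosen just for illustration):

values = [1, 2, 3, 4, 5, 6]

def mse(d):
    return sum((x - d) ** 2 for x in values) / 6   # E[(X - d)^2]

def mae(d):
    return sum(abs(x - d) for x in values) / 6     # E|X - d|

for d in (3.0, 3.25, 3.5, 3.75, 4.0):
    print(d, round(mse(d), 4), round(mae(d), 4))   # M.S.E. smallest at 3.5; M.A.E. flat at 1.5 on [3, 4]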
38
Covariance and Correlation When we are interested in the joint distribution of two random variables, it is useful to have a summary of how much the two random variables depend on each other. The covariance and correlation are attempts to measure that dependence, but they only capture a particular type of dependence, namely linear dependence.
39
Covariance Let X and Y be random variables having finite means. Let E(X) = μ_X and E(Y) = μ_Y. The covariance of X and Y, denoted by Cov(X, Y), is defined as Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)].
40
Example Let X and Y have the joint p.d.f. f:
41
Theorem For all random variables X and Y with finite variances, Cov(X, Y) = E(XY) − E(X)E(Y). Proof Cov(X, Y) = E(XY − μ_X Y − μ_Y X + μ_X μ_Y) = E(XY) − μ_X E(Y) − μ_Y E(X) + μ_X μ_Y = E(XY) − μ_X μ_Y = E(XY) − E(X)E(Y).
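A check of this identity on a small joint p.m.f. (the table below is a hypothetical example, not the joint p.d.f. from the slide above, which is not reproduced here):

joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}   # hypothetical joint p.m.f.
ex = sum(pr * x for (x, y), pr in joint.items())
ey = sum(pr * y for (x, y), pr in joint.items())
exy = sum(pr * x * y for (x, y), pr in joint.items())
cov_def = sum(pr * (x - ex) * (y - ey) for (x, y), pr in joint.items())  # E[(X - mu_X)(Y - mu_Y)]
cov_short = exy - ex * ey                                                # E(XY) - E(X)E(Y)
print(cov_def, cov_short)   # both 0.05, up to floating-point rounding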
42
Correlation Let X and Y be random variables with finite variances σ_X² and σ_Y², respectively. Then the correlation of X and Y, denoted by ρ(X, Y), is defined as ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y).
43
Theorem For all random variables X and Y with finite variances, [Cov(X, Y)]² ≤ σ_X² σ_Y², and hence −1 ≤ ρ(X, Y) ≤ 1.
44
Properties of Covariance and Correlation Theorem If X and Y are independent random variables with finite variances, then Cov(X, Y) = ρ(X, Y) = 0. Proof If X and Y are independent, then E(XY) = E(X)E(Y), so Cov(X, Y) = E(XY) − E(X)E(Y) = 0, and hence ρ(X, Y) = 0 as well.
45
Theorem Suppose that X is a random variable with finite positive variance, and let Y = aX + b for constants a and b with a ≠ 0. If a > 0, then ρ(X, Y) = 1; if a < 0, then ρ(X, Y) = −1. Since Cov(X, Y) = a Var(X) and σ_Y = |a| σ_X, the result follows from the definition of correlation.
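A numerical illustration (assuming NumPy; the values a = −2 and b = 5 are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(size=10_000)
y = -2 * x + 5                   # Y = aX + b with a < 0
print(np.corrcoef(x, y)[0, 1])   # -1.0, up to floating-point rounding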
46
Theorem If X and Y are random variables with finite variances, then Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
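A simulation check of this identity with two dependent variables (a sketch assuming NumPy; the construction of Y is arbitrary, chosen only so that Cov(X, Y) ≠ 0):

import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500_000)
y = 0.5 * x + rng.normal(size=500_000)   # Y depends on X
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]
print(lhs, rhs)   # both close to 1 + 1.25 + 2(0.5) = 3.25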
47
Theorem If X1, ..., Xn are random variables with finite variances, then Var(X1 + ... + Xn) = Var(X1) + ... + Var(Xn) + 2 Σ_{i<j} Cov(Xi, Xj).