Expectations of Continuous Random Variables
Introduction
The expected value $E[X]$ for a continuous RV is motivated by the analogous definition for a discrete RV. Based on the PDF description of a continuous RV, the expected value is defined and its properties are explored.
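Determining the Expected Value
The expected value for a discrete RV $X$ was defined as
$$E[X] = \sum_i x_i \, p_X[x_i].$$
In the case of a continuous RV, the sample space $S_X$ is not countable. If $X \sim U(0, 1)$, let's find its average by approximating the PDF by a uniform PMF, using a fine partitioning of the interval $(0, 1)$. For $i = 1, 2, \ldots, M$ and with $\Delta x = 1/M$, we take $x_i = i\,\Delta x$ and $p_X[x_i] = 1/M$, so that we have
$$E[X] = \sum_{i=1}^{M} x_i \, p_X[x_i] = \sum_{i=1}^{M} (i\,\Delta x)\,\frac{1}{M}.$$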
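Determining the Expected Value
But $\sum_{i=1}^{M} i = M(M+1)/2$, so that
$$E[X] = \frac{1}{M^2}\sum_{i=1}^{M} i = \frac{1}{M^2}\,\frac{M(M+1)}{2} = \frac{1}{2} + \frac{1}{2M}$$
and as $M \to \infty$ we have $E[X] \to 1/2$, as expected. To extend this result to more general PDFs we first note that the PMF value $p_X[x_i]$ stands for the probability of the interval of width $\Delta x$ centered at $x_i$, so that
$$E[X] \approx \sum_{i=1}^{M} x_i \, \frac{P[x_i - \Delta x/2 \le X \le x_i + \Delta x/2]}{\Delta x}\,\Delta x.$$
But
$$\frac{P[x_i - \Delta x/2 \le X \le x_i + \Delta x/2]}{\Delta x} = \frac{1/M}{1/M} = 1$$
and as $\Delta x \to 0$, this is the probability per unit length, i.e. the PDF at $x = x_i$.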
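The partition argument can be checked numerically. The following is a minimal sketch, in which `partition_mean` is a hypothetical helper name, that evaluates the sum of $(i\,\Delta x)(1/M)$ over $i = 1, \ldots, M$ and prints it next to $1/2 + 1/(2M)$.

```python
def partition_mean(M):
    """Approximate E[X] for X ~ U(0, 1) using a uniform PMF on M points."""
    dx = 1.0 / M
    # The point x_i = i * dx carries probability 1/M for i = 1, ..., M.
    return sum((i * dx) * (1.0 / M) for i in range(1, M + 1))


if __name__ == "__main__":
    for M in (10, 100, 1000):
        # The two printed values should agree and approach 1/2 as M grows.
        print(M, partition_mean(M), 0.5 + 1.0 / (2 * M))
```

As the partition becomes finer, the approximation moves toward 1/2 at the rate $1/(2M)$.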
Determining the Expected Value
In this example, $p_X(x_i)$ does not depend on the interval center $x_i$, so that the PDF is uniform: $p_X(x) = 1$ for $0 < x < 1$. Thus, as $\Delta x \to 0$,
$$E[X] \to \sum_{i=1}^{M} x_i \, p_X(x_i)\,\Delta x$$
and this becomes the integral
$$E[X] = \int_0^1 x \, p_X(x)\,dx$$
where $p_X(x) = 1$ for $0 < x < 1$ and is zero otherwise. In general, the expected value for a continuous RV $X$ is defined as
$$E[X] = \int_{-\infty}^{\infty} x \, p_X(x)\,dx.$$
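The integral of $x\,p_X(x)$ can also be approximated on a computer. The sketch below uses a midpoint rule. The function names, the truncation of the exponential PDF at $x = 40$, and the choice $\lambda = 2$ are assumptions made only for illustration.

```python
import math


def expected_value(pdf, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of x * pdf(x) over [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx  # center of the i-th subinterval
        total += x * pdf(x) * dx
    return total


def uniform_pdf(x):
    """PDF of U(0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def exponential_pdf(x, lam=2.0):
    """PDF of exp(lam); lam = 2 is an arbitrary illustrative choice."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0


if __name__ == "__main__":
    print(expected_value(uniform_pdf, 0.0, 1.0))       # close to 1/2
    print(expected_value(exponential_pdf, 0.0, 40.0))  # close to 1/lam = 1/2
```

The infinite integration range has to be truncated in practice, which is harmless only when the PDF tails decay quickly.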
Expected value for RV with a nonuniform PDF
Example: The PDF is given by $p_X(x) = 1/2$ then the expected value is
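Not all PDFs have expected values
Before computing $E[X]$ we must make sure that it exists. Not all integrals of the form $\int_{-\infty}^{\infty} x\,p_X(x)\,dx$ exist, even if $\int_{-\infty}^{\infty} p_X(x)\,dx = 1$.
Example: if, for instance, $p_X(x) = \frac{1}{2x^{3/2}}$ for $x \ge 1$ and zero otherwise, then
$$\int_1^{\infty} \frac{1}{2x^{3/2}}\,dx = 1$$
but
$$\int_1^{\infty} x\,\frac{1}{2x^{3/2}}\,dx = \sqrt{x}\,\Big|_1^{\infty} \to \infty.$$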
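Not all PDFs have expected values
A more subtle and surprising example is the Cauchy PDF
$$p_X(x) = \frac{1}{\pi(1 + x^2)}, \quad -\infty < x < \infty.$$
Since the PDF is symmetric about $x = 0$, we would expect that $E[X] = 0$. However, by correctly interpreting the region of integration in a limiting sense, we have
$$E[X] = \lim_{L \to \infty} \int_{-L}^{0} x\,p_X(x)\,dx + \lim_{U \to \infty} \int_{0}^{U} x\,p_X(x)\,dx$$
(see next slide).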
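Not all PDFs have expected values
But for a Cauchy PDF, $p_X(x) = 1/(\pi(1 + x^2))$, with lower limit $-L$ and upper limit $U$,
$$E[X] = \lim_{L \to \infty} \frac{1}{2\pi}\ln(1 + x^2)\Big|_{-L}^{0} + \lim_{U \to \infty} \frac{1}{2\pi}\ln(1 + x^2)\Big|_{0}^{U} = \lim_{L \to \infty}\left[-\frac{1}{2\pi}\ln(1 + L^2)\right] + \lim_{U \to \infty}\frac{1}{2\pi}\ln(1 + U^2) = -\infty + \infty.$$
Hence, if the limits are taken independently, the result is indeterminate. To make the expected value useful in practice, the independent choice of limits (and not $L = U$) is necessary.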
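The dependence on how the limits are taken can be made concrete. The sketch below uses the closed form of the integral of $x/(\pi(1+x^2))$ from $-L$ to $U$, namely $[\ln(1+U^2) - \ln(1+L^2)]/(2\pi)$. The function name and the choice $U = 2L$ are illustrative assumptions.

```python
import math


def truncated_cauchy_mean(L, U):
    """Integral of x / (pi * (1 + x^2)) from -L to U, in closed form."""
    return (math.log(1.0 + U * U) - math.log(1.0 + L * L)) / (2.0 * math.pi)


if __name__ == "__main__":
    for L in (1e2, 1e4, 1e6):
        symmetric = truncated_cauchy_mean(L, L)      # limits tied together, U = L
        lopsided = truncated_cauchy_mean(L, 2 * L)   # limits tied together, U = 2L
        print(L, symmetric, lopsided)
```

With $U = L$ the truncated integral is always zero, while with $U = 2L$ it tends to $\ln(4)/(2\pi) \approx 0.22$, so no single value can be assigned to the improper integral.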
Not all PDFs have expected values
The indeterminacy can be avoided if we require “absolute convergence”:
$$\int_{-\infty}^{\infty} |x|\,p_X(x)\,dx < \infty.$$
Hence, $E[X]$ is defined to exist if $E[|X|] < \infty$. The Cauchy PDF fails this test: because of the slow decay of the “tails” of the PDF (as $1/x^2$), very large outcomes are possible.
Mean value is a best predictor
The expected value is the best guess of the outcome of the RV. By “best” we mean that using $b = E[X]$ as our estimator minimizes the mean square error, which is defined as
$$\mathrm{mse}(b) = E[(X - b)^2].$$
See Problem 11.5 for more.
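A small numerical illustration, under the assumption that a sample average stands in for the expectation: for a fixed list of hypothetical outcomes, the sample mean square error is smallest when the predictor $b$ equals the sample mean. The data values and candidate offsets below are made up for the example.

```python
def mse(samples, b):
    """Sample mean square error of predicting every outcome by the constant b."""
    return sum((x - b) ** 2 for x in samples) / len(samples)


samples = [0.2, 1.5, 2.1, 3.3, 4.9]  # hypothetical outcomes
mean = sum(samples) / len(samples)

# Compare the mean against a few nearby candidate predictors.
candidates = [mean + d for d in (-1.0, -0.1, 0.0, 0.1, 1.0)]
best = min(candidates, key=lambda b: mse(samples, b))

if __name__ == "__main__":
    for b in candidates:
        print(round(b, 3), round(mse(samples, b), 4))
    print("best candidate equals the mean:", best == mean)
```

This works because the sample MSE equals the sample variance plus $(\text{mean} - b)^2$, which is smallest at $b$ equal to the mean.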
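Expected Values for Important PDFs
Uniform: If $X \sim U(a, b)$, then $E[X] = (a + b)/2$, i.e. the mean lies at the midpoint of the interval. See Problem 11.8 for more.
Exponential: If $X \sim \exp(\lambda)$, then $E[X] = 1/\lambda$.
Gaussian: If $X \sim N(\mu, \sigma^2)$, then since the PDF is symmetric about the point $x = \mu$, we know that $E[X] = \mu$. We can also derive the mean as follows:
$$E[X] = \int_{-\infty}^{\infty} x\,\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx = \int_{-\infty}^{\infty} (x - \mu)\,\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx + \mu \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx.$$
Letting $u = x - \mu$ in the first integral, we have
$$E[X] = \underbrace{\int_{-\infty}^{\infty} u\,\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{u^2}{2\sigma^2}\right] du}_{0} + \mu \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx}_{1} = \mu.$$
The first integral is zero because its integrand is odd, and the second is one because it integrates a PDF.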
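Expected Values for Important PDFs
Laplacian: Since the PDF is symmetric about $x = 0$ (and the expected value exists, which is needed to avoid the situation of the Cauchy PDF), we must have $E[X] = 0$.
Gamma: If $X \sim \Gamma(\alpha, \lambda)$, then
$$E[X] = \int_0^{\infty} x\,\frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha - 1} \exp(-\lambda x)\,dx.$$
To evaluate this integral we modify the integrand so that it becomes the PDF of a $\Gamma(\alpha', \lambda')$ RV. Then we can equate the integral to one. With $\alpha' = \alpha + 1$ and $\lambda' = \lambda$,
$$E[X] = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,\frac{\Gamma(\alpha + 1)}{\lambda^{\alpha + 1}} \int_0^{\infty} \frac{\lambda^{\alpha + 1}}{\Gamma(\alpha + 1)}\,x^{\alpha} \exp(-\lambda x)\,dx = \frac{\Gamma(\alpha + 1)}{\lambda\,\Gamma(\alpha)} = \frac{\alpha}{\lambda}.$$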
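Expected Values for Important PDFs
Rayleigh: For a Rayleigh RV with parameter $\sigma^2$, it can be shown that $E[X] = \sqrt{\pi\sigma^2/2}$.
Chi-square: The chi-square PDF with $N$ degrees of freedom is a Gamma PDF with $\alpha = N/2$ and $\lambda = 1/2$. Since a $\Gamma(\alpha, \lambda)$ RV has mean $\alpha/\lambda$, we have $E[X] = N$.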
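The Gamma mean $\alpha/\lambda$, and with it the chi-square mean $N$, can be checked by simulation. The sketch below is a Monte Carlo estimate with a fixed seed. The helper name, sample size, and parameter values are assumptions for illustration. Note that Python's `random.gammavariate` takes a shape and a scale parameter, so the scale is $1/\lambda$.

```python
import random


def sample_mean_gamma(alpha, lam, n=200_000, seed=0):
    """Monte Carlo estimate of E[X] for X ~ Gamma(alpha, lam), lam being the rate."""
    rng = random.Random(seed)
    # gammavariate(shape, scale): the scale parameter is 1 / lam.
    return sum(rng.gammavariate(alpha, 1.0 / lam) for _ in range(n)) / n


if __name__ == "__main__":
    print(sample_mean_gamma(3.0, 2.0))  # should be near alpha / lam = 1.5
    N = 4                               # chi-square with N degrees of freedom
    print(sample_mean_gamma(N / 2, 0.5))  # should be near N = 4
```

The estimates fluctuate around the exact means, with a spread that shrinks as the number of samples grows.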
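Expected Value for a Function of a RV
If $Y = g(X)$, where $X$ is a continuous RV, then assuming that $Y$ is also a continuous RV with PDF $p_Y(y)$, we have by the definition of $E[Y]$
$$E[Y] = \int_{-\infty}^{\infty} y\,p_Y(y)\,dy.$$
However, we can use $Y = g(X)$ directly, without first finding $p_Y(y)$:
$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\,p_X(x)\,dx.$$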
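Partial Proof
For simplicity assume that $Y = g(X)$ is a continuous RV with PDF $p_Y(y)$. Also, assume that $g(x)$ is monotonically increasing so that the equation $y = g(x)$ has a single solution for all $y$. Then
$$E[Y] = \int_{-\infty}^{\infty} y\,p_Y(y)\,dy = \int_{-\infty}^{\infty} y\,p_X(g^{-1}(y)) \left|\frac{dg^{-1}(y)}{dy}\right| dy.$$
Figure: Monotonically increasing function used to derive $E[g(X)]$.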
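Partial Proof
Next change variables from $y$ to $x$ using $x = g^{-1}(y)$. Since we have assumed that $g(x)$ is monotonically increasing, the limits for $y$ of $\pm\infty$ also become $\pm\infty$ for $x$. Then, since $x = g^{-1}(y)$, the term $y\,p_X(g^{-1}(y))$ becomes $g(x)\,p_X(x)$, and
$$\left|\frac{dg^{-1}(y)}{dy}\right| dy = \frac{dg^{-1}(y)}{dy}\,dy = dx$$
because a monotonically increasing $g$ implies that the derivative of $g^{-1}$ is positive. From this the result follows:
$$E[Y] = \int_{-\infty}^{\infty} g(x)\,p_X(x)\,dx.$$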
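The two routes to $E[Y]$ can be compared numerically. The sketch below assumes, purely as an illustration, that $X \sim U(0, 1)$ and $Y = X^2$, for which $g$ is monotonically increasing on $(0, 1)$ and $p_Y(y) = 1/(2\sqrt{y})$ for $0 < y < 1$. The helper `midpoint_integral` is a hypothetical name.

```python
import math


def midpoint_integral(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx


def p_X(x):
    """PDF of X ~ U(0, 1) on its support."""
    return 1.0


def g(x):
    """Transformation Y = g(X) = X^2, increasing on (0, 1)."""
    return x * x


def p_Y(y):
    """PDF of Y = X^2: p_X(sqrt(y)) times the derivative of sqrt(y)."""
    return 1.0 / (2.0 * math.sqrt(y))


# Route 1: integrate y * p_Y(y).  Route 2: integrate g(x) * p_X(x) directly.
via_pdf_of_y = midpoint_integral(lambda y: y * p_Y(y), 0.0, 1.0)
direct = midpoint_integral(lambda x: g(x) * p_X(x), 0.0, 1.0)

if __name__ == "__main__":
    print(via_pdf_of_y, direct)  # both should be close to 1/3
```

The direct route avoids deriving $p_Y(y)$ at all, which is the practical value of the result.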
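Expectation of linear (affine) function
If $Y = aX + b$, then since $g(x) = ax + b$, we have
$$E[g(X)] = \int_{-\infty}^{\infty} (ax + b)\,p_X(x)\,dx = a\int_{-\infty}^{\infty} x\,p_X(x)\,dx + b\int_{-\infty}^{\infty} p_X(x)\,dx = aE[X] + b.$$
Thus
$$E[aX + b] = aE[X] + b.$$
More generally, it is easily shown that
$$E[a_1 g_1(X) + a_2 g_2(X)] = a_1 E[g_1(X)] + a_2 E[g_2(X)].$$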
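Power of N(0, 1) RV
If $X \sim N(0, 1)$ and $Y = X^2$, consider $E[Y] = E[X^2]$. The quantity $E[X^2]$ is the average squared value of $X$ and can be interpreted physically as a power. If $X$ is a voltage across a 1 ohm resistor, then $X^2$ is the power and therefore $E[X^2]$ is the average power.
$$E[X^2] = \int_{-\infty}^{\infty} x^2\,\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right) dx = 2\int_0^{\infty} x^2\,\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right) dx$$
since the integrand is even. We use integration by parts with $U = x$, $dU = dx$, $dV = \frac{1}{\sqrt{2\pi}}\,x \exp\left(-\tfrac{1}{2}x^2\right) dx$ and $V = -\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right)$:
$$E[X^2] = 2\left[-x\,\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right)\Big|_0^{\infty} + \int_0^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right) dx\right] = 2\left[0 + \tfrac{1}{2}\right] = 1.$$
Using L'Hospital's rule, the boundary term is 0. The remaining integral evaluates to 1/2 because it covers half of the $N(0, 1)$ PDF.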
Expected value of indicator RV
An indicator function indicates whether a point is in a given set. Example: if the set is $A = [3, 4]$, then the indicator function is defined as
$$I_A(x) = \begin{cases} 1 & 3 \le x \le 4 \\ 0 & \text{otherwise.} \end{cases}$$
Expected value of indicator RV
$I_A(x)$ may be thought of as a generalization of the unit step function since, if $u(x) = 1$ for $x \ge 0$ and zero otherwise, we have
$$I_{[0, \infty)}(x) = u(x).$$
If $X$ is a RV, then $I_A(X)$ is a transformed RV that takes on the values 1 and 0, depending upon whether the outcome belongs to $A$ or not. Its expectation is given by
$$E[I_A(X)] = \int_{-\infty}^{\infty} I_A(x)\,p_X(x)\,dx = \int_{\{x:\,x \in A\}} p_X(x)\,dx = P[X \in A].$$
The expected value of the indicator RV is the probability of the set or event.
Expected value of indicator RV: Example
Consider the estimation of $P[3 \le X \le 4]$. But this is just $E[I_A(X)]$! To estimate the expected value of a transformed RV, we use the observed outcomes $x_1, x_2, \ldots, x_M$, transform each one to the new RV, and finally compute the sample mean for our estimate using
$$\hat{P}[3 \le X \le 4] = \frac{1}{M}\sum_{i=1}^{M} I_A(x_i).$$
This is what we have been using all along, since $\sum_{i=1}^{M} I_A(x_i)$ counts all the outcomes for which $3 \le x \le 4$. The indicator function provides a means to connect the expected value with the probability.
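The estimator is a relative frequency: the sample mean of the indicator values. A minimal sketch follows, in which the function names and the list of outcomes are hypothetical.

```python
def indicator(x, lo=3.0, hi=4.0):
    """I_A(x) for the set A = [lo, hi]."""
    return 1 if lo <= x <= hi else 0


def estimate_probability(outcomes, lo=3.0, hi=4.0):
    """Sample mean of I_A(x_i), i.e. the fraction of outcomes that fall in A."""
    return sum(indicator(x, lo, hi) for x in outcomes) / len(outcomes)


outcomes = [2.5, 3.0, 3.7, 4.0, 4.2]  # hypothetical observed outcomes
estimate = estimate_probability(outcomes)

if __name__ == "__main__":
    # Three of the five outcomes lie in [3, 4] (endpoints included).
    print(estimate)
```

With only five outcomes the estimate is crude. In practice $M$ would be large.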
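Variance of a continuous RV
The variance of a continuous RV, as for a discrete RV, measures the average squared deviation from the mean:
$$\mathrm{var}(X) = E[(X - E[X])^2] = \int_{-\infty}^{\infty} (x - E[X])^2\,p_X(x)\,dx.$$
Variance of a Normal RV: Let's find the variance of a $N(\mu, \sigma^2)$ RV. Recall that $E[X] = \mu$, so
$$\mathrm{var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2\,\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx.$$
Letting $u = (x - \mu)/\sigma$ produces
$$\mathrm{var}(X) = \sigma^2 \int_{-\infty}^{\infty} u^2\,\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}u^2\right) du = \sigma^2,$$
since the remaining integral is $E[U^2] = 1$ for $U \sim N(0, 1)$.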
Variance of a Normal RV
It is common to refer to the square root of the variance as the standard deviation $\sigma$. The standard deviation indicates how closely outcomes tend to cluster about the mean. If the RV is $N(\mu, \sigma^2)$, then about 68.3% of the outcomes will be within the interval $[\mu - \sigma, \mu + \sigma]$, about 95.4% will be within $[\mu - 2\sigma, \mu + 2\sigma]$, and about 99.7% will be within $[\mu - 3\sigma, \mu + 3\sigma]$.
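The interval probabilities for a Gaussian RV follow from the error function, since $P[|X - \mu| \le k\sigma] = \mathrm{erf}(k/\sqrt{2})$ for any $\mu$ and $\sigma$. A short sketch, in which the function name is a hypothetical choice:

```python
import math


def prob_within_k_sigma(k):
    """P[mu - k*sigma <= X <= mu + k*sigma] for X ~ N(mu, sigma^2)."""
    return math.erf(k / math.sqrt(2.0))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, prob_within_k_sigma(k))
```

To four decimal places the values for $k = 1, 2, 3$ are 0.6827, 0.9545, and 0.9973.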
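Variance of a uniform RV
If $X \sim U(a, b)$, then
$$\mathrm{var}(X) = \int_a^b \left(x - \frac{a + b}{2}\right)^2 \frac{1}{b - a}\,dx$$
and letting $u = x - (a + b)/2$, we have
$$\mathrm{var}(X) = \frac{1}{b - a}\int_{-(b-a)/2}^{(b-a)/2} u^2\,du = \frac{1}{b - a}\,\frac{u^3}{3}\Big|_{-(b-a)/2}^{(b-a)/2} = \frac{(b - a)^2}{12}.$$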
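Properties of Variance
The variance of a continuous RV enjoys the same properties as for a discrete RV. An alternate form of the variance computation is
$$\mathrm{var}(X) = E[X^2] - E^2[X].$$
If $c$ is a constant then
$$\mathrm{var}(c) = 0, \quad \mathrm{var}(X + c) = \mathrm{var}(X), \quad \mathrm{var}(cX) = c^2\,\mathrm{var}(X).$$
The variance is a nonlinear type of operation in that, in general,
$$\mathrm{var}(g_1(X) + g_2(X)) \ne \mathrm{var}(g_1(X)) + \mathrm{var}(g_2(X)).$$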
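These properties can be illustrated on a small data set. The sketch below treats a list of hypothetical outcomes as equally likely, so it is an empirical (discrete) stand-in for the continuous case, and uses the population variance from Python's `statistics` module.

```python
import statistics

x = [0.5, 1.0, 2.5, 4.0]  # hypothetical outcomes, treated as equally likely
c = 3.0

var_x = statistics.pvariance(x)

# Alternate form: E[X^2] - E^2[X].
alt_form = statistics.fmean([v * v for v in x]) - statistics.fmean(x) ** 2

shifted = statistics.pvariance([v + c for v in x])  # var(X + c)
scaled = statistics.pvariance([c * v for v in x])   # var(cX)

# Nonlinearity with g1(X) = g2(X) = X: var(X + X) is 4 var(X), not 2 var(X).
doubled = statistics.pvariance([v + v for v in x])

if __name__ == "__main__":
    print(var_x, alt_form, shifted, scaled, doubled)
```

The shifted data keep the same variance, the scaled data have $c^2$ times the variance, and the doubled data show that the variance of a sum is not the sum of the variances.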
Moments
$E[X^n]$ is termed the $n$th moment and it is defined to exist if $E[|X|^n] < \infty$. If it is known that $E[X^s]$ exists, then it can be shown that $E[X^r]$ exists for $r < s$. Also, if $E[X^r]$ is known not to exist, then $E[X^s]$ cannot exist for $s > r$.
Moments of an exponential RV
For $X \sim \exp(\lambda)$ we have that
$$E[X^n] = \int_0^{\infty} x^n\,\lambda \exp(-\lambda x)\,dx.$$
To evaluate this, first show how the $n$th moment can be written recursively in terms of the $(n - 1)$st moment. Knowing that $E[X] = 1/\lambda$ and applying integration by parts will yield the recursive formula.
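Moments of an exponential RV
Thus, letting $U = x^n$ and $dV = \lambda \exp(-\lambda x)\,dx$ so that $dU = n x^{n-1}\,dx$ and $V = -\exp(-\lambda x)$, we have
$$E[X^n] = -x^n \exp(-\lambda x)\Big|_0^{\infty} + \int_0^{\infty} n x^{n-1} \exp(-\lambda x)\,dx = \frac{n}{\lambda}\int_0^{\infty} x^{n-1}\,\lambda \exp(-\lambda x)\,dx = \frac{n}{\lambda}\,E[X^{n-1}].$$
Hence, the $n$th moment can be written in terms of the $(n - 1)$st moment. Since we know that $E[X] = 1/\lambda$, we have
$$E[X^2] = \frac{2}{\lambda}\,E[X] = \frac{2}{\lambda^2}, \qquad E[X^3] = \frac{3}{\lambda}\,E[X^2] = \frac{6}{\lambda^3}$$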
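Moments of an exponential RV
and in general
$$E[X^n] = \frac{n!}{\lambda^n}.$$
The variance can be found to be $\mathrm{var}(X) = 1/\lambda^2$ using this result, since $\mathrm{var}(X) = E[X^2] - E^2[X] = 2/\lambda^2 - 1/\lambda^2$.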
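The recursion $E[X^n] = (n/\lambda)\,E[X^{n-1}]$, started from $E[X^0] = 1$, translates directly into a loop. A minimal sketch, in which the function name is a hypothetical choice, that also compares the result with $n!/\lambda^n$:

```python
import math


def exp_moment(n, lam):
    """n-th moment of X ~ exp(lam) via E[X^n] = (n / lam) * E[X^(n-1)]."""
    moment = 1.0  # E[X^0] = 1
    for k in range(1, n + 1):
        moment *= k / lam
    return moment


if __name__ == "__main__":
    lam = 2.0
    for n in range(1, 6):
        print(n, exp_moment(n, lam), math.factorial(n) / lam ** n)
    # Variance from the first two moments: E[X^2] - E^2[X] = 1 / lam^2.
    print(exp_moment(2, lam) - exp_moment(1, lam) ** 2)
```

The two columns agree, and the final line reproduces $\mathrm{var}(X) = 1/\lambda^2$.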
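Central moments
Often it is important to be able to compute moments about some point. Usually this point is the mean. The $n$th central moment about the point $E[X]$ is defined as $E[(X - E[X])^n]$. Expanding with the binomial theorem,
$$E[(X - E[X])^n] = E\left[\sum_{k=0}^{n} \binom{n}{k} X^k\,(-E[X])^{n-k}\right]$$
or finally we have that
$$E[(X - E[X])^n] = \sum_{k=0}^{n} \binom{n}{k} (-1)^{n-k}\,(E[X])^{n-k}\,E[X^k].$$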
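Central moments can be obtained from the ordinary moments by expanding $(X - E[X])^n$ with the binomial theorem and taking the expectation term by term. The sketch below assumes that route. The function name and the use of an $\exp(1)$ RV, whose moments are $E[X^k] = k!$, are illustrative choices.

```python
from math import comb


def central_moment(n, raw):
    """n-th central moment from raw moments, where raw[k] = E[X^k] and raw[0] = 1."""
    mean = raw[1]
    # Binomial expansion: sum over k of C(n, k) * (-mean)^(n - k) * E[X^k].
    return sum(comb(n, k) * (-mean) ** (n - k) * raw[k] for k in range(n + 1))


raw_exp = [1.0, 1.0, 2.0, 6.0]  # E[X^k] = k! for X ~ exp(1), k = 0, ..., 3

if __name__ == "__main__":
    print(central_moment(1, raw_exp))  # first central moment
    print(central_moment(2, raw_exp))  # second central moment (the variance)
    print(central_moment(3, raw_exp))  # third central moment
```

The first central moment is always zero and the second is the variance, which gives a quick sanity check on any list of raw moments.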
Properties of continuous RVs
Pafnuty Chebyshev
Pafnuty Lvovich Chebyshev was a Russian mathematician. Chebyshev studied at the college level at Moscow University, where he earned his bachelor's degree. After Chebyshev became a professor of mathematics in Moscow himself, his two most illustrious graduate students were Andrei Andreyevich Markov (the elder) and Alexandr Lyapunov. Chebyshev is known for his work in the fields of probability, statistics, mechanics, and number theory.
Chebyshev Inequality
The mean and variance of a RV indicate the average value and variability of the outcomes of a repeated experiment. However, they are not sufficient to determine probabilities of events. For example, the Gaussian PDF $p_X(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right)$ and the Laplacian PDF $p_X(x) = \frac{1}{\sqrt{2}} \exp\left(-\sqrt{2}\,|x|\right)$ both have $E[X] = 0$ (due to symmetry about $x = 0$) and $\mathrm{var}(X) = 1$. Yet the probability of a given interval can be very different. Although the relationship between the mean and variance and the probability of an event is not a direct one, we can still obtain some information about the probabilities based on the mean and variance.
Chebyshev Inequality
It is possible to bound the probability, i.e. to be able to assert that
$$P[|X - E[X]| > \gamma] \le B$$
where $B$ is a number less than one. This is useful if we only wish to make sure the probability is below a certain value, without having to find the probability itself.
Example: Chebyshev Inequality
If the probability of a speech signal of mean 0 and variance 1 exceeding a given magnitude $\gamma$ is to be no more than 1%, then we would be satisfied if we could determine a $\gamma$ so that
$$P[|X - E[X]| > \gamma] \le 0.01.$$
Let's show that the probability of the event $|X - E[X]| > \gamma$ can be bounded if we know the mean and variance. The PDF does not need to be known!
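Chebyshev Inequality
Using the definition of the variance we have
$$\mathrm{var}(X) = \int_{-\infty}^{\infty} (x - E[X])^2\,p_X(x)\,dx \ge \int_{\{x:\,|x - E[X]| > \gamma\}} (x - E[X])^2\,p_X(x)\,dx \ge \gamma^2 \int_{\{x:\,|x - E[X]| > \gamma\}} p_X(x)\,dx = \gamma^2\,P[|X - E[X]| > \gamma].$$
So that we have the Chebyshev inequality
$$P[|X - E[X]| > \gamma] \le \frac{\mathrm{var}(X)}{\gamma^2}.$$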
Chebyshev Inequality: Example
Figure: $P[|X| > \gamma]$ for Gaussian and Laplacian RVs with zero mean and unity variance, compared to the Chebyshev inequality.
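The comparison can be reproduced numerically. The sketch below assumes the zero-mean, unit-variance forms of the two PDFs, for which $P[|X| > \gamma] = \mathrm{erfc}(\gamma/\sqrt{2})$ in the Gaussian case and $P[|X| > \gamma] = \exp(-\sqrt{2}\,\gamma)$ in the Laplacian case, and sets both against the Chebyshev bound $\mathrm{var}(X)/\gamma^2$. The function names are hypothetical.

```python
import math


def gaussian_tail(gamma):
    """P[|X| > gamma] for X ~ N(0, 1)."""
    return math.erfc(gamma / math.sqrt(2.0))


def laplacian_tail(gamma):
    """P[|X| > gamma] for a zero-mean, unit-variance Laplacian RV."""
    return math.exp(-math.sqrt(2.0) * gamma)


def chebyshev_bound(gamma, variance=1.0):
    """Upper bound on P[|X - E[X]| > gamma] that needs only the variance."""
    return variance / gamma ** 2


if __name__ == "__main__":
    for gamma in (1.0, 2.0, 3.0, 10.0):
        print(gamma, gaussian_tail(gamma), laplacian_tail(gamma),
              chebyshev_bound(gamma))
```

For a unit-variance signal the bound equals 0.01 at $\gamma = 10$, so that choice of $\gamma$ keeps the exceedance probability at or below 1% whatever the PDF. The exact tail probabilities are much smaller, which shows how conservative the bound can be.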
Problems
Problem 1. Prove that if the PDF is symmetric about a point $x = a$, which is to say that it satisfies $p_X(a + u) = p_X(a - u)$ for all $-\infty < u < \infty$, then the mean will be $a$. Hint: Write the integral $\int_{-\infty}^{\infty} x\,p_X(x)\,dx$ as $\int_{-\infty}^{a} x\,p_X(x)\,dx + \int_{a}^{\infty} x\,p_X(x)\,dx$ and then let $u = x - a$ in the first integral and $u = a - x$ in the second integral.
Problem 2. Find the mean for a uniform PDF. Do so by using the definition and then rederive it using the result from Problem 1.
Problems
Problem 3. Prove that the best prediction of the outcome of a continuous RV is its mean. Best is to be interpreted as the value that minimizes the mean square error $\mathrm{mse}(b) = E[(X - b)^2]$.
Problem 4. Determine the mean of a chi-square PDF.