Expectations of Continuous Random Variables

Introduction The expected value E[X] for a continuous RV is motivated by the analogous definition for a discrete RV. Based on the PDF description of a continuous RV, the expected value is defined and its properties explored.

Determining the Expected Value The expected value for a discrete RV X was defined as E[X] = Σ_i x_i p_X[x_i]. In the case of a continuous RV, S_X is not countable. If X ~ U(0, 1), let us find its average by approximating the PDF by a uniform PMF, using a fine partitioning of the interval (0, 1). For i = 1, 2, …, M and with Δx = 1/M, we have E[X] ≈ Σ_{i=1}^{M} x_i P[x_i − Δx/2 < X ≤ x_i + Δx/2], where the x_i are the interval centers.

Determining the Expected Value But P[x_i − Δx/2 < X ≤ x_i + Δx/2] = Δx = 1/M, so that E[X] ≈ Σ_{i=1}^{M} x_i Δx, and as M → ∞ we have E[X] → 1/2, as expected. To extend this result to more general PDFs we first note that E[X] ≈ Σ_{i=1}^{M} x_i P[x_i − Δx/2 < X ≤ x_i + Δx/2]. But P[x_i − Δx/2 < X ≤ x_i + Δx/2] ≈ p_X(x_i) Δx, and as Δx → 0 the ratio P[x_i − Δx/2 < X ≤ x_i + Δx/2]/Δx is the probability per unit length, i.e., the PDF at x = x_i.

Determining the Expected Value In this example, p_X(x_i) does not depend on the interval center x_i, so that the PDF is uniform: p_X(x) = 1 for 0 < x < 1. Thus, E[X] ≈ Σ_{i=1}^{M} x_i p_X(x_i) Δx, and as Δx → 0 this becomes the integral E[X] = ∫_0^1 x p_X(x) dx = 1/2, where p_X(x) = 1 for 0 < x < 1 and is zero otherwise. In general, the expected value for a continuous RV X is defined as E[X] = ∫_{−∞}^{∞} x p_X(x) dx.
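The partition argument above can be sketched numerically. This is an illustrative sketch (the partition size M is an arbitrary choice, not from the slides): the interval centers x_i each carry probability p_X(x_i)·Δx = Δx, and the weighted sum approaches 1/2.

```python
# Approximate E[X] for X ~ U(0, 1) by treating the PDF as a PMF on a
# fine partition of (0, 1). Each interval center x_i carries probability
# p_X(x_i) * dx = dx, since p_X(x) = 1 on (0, 1).
M = 1000                                     # number of partition intervals
dx = 1.0 / M
centers = [(i - 0.5) * dx for i in range(1, M + 1)]
expected = sum(x * dx for x in centers)      # sum of x_i * P[interval around x_i]
print(expected)                              # approaches 1/2 as M grows
```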

Example: expected value for an RV with a nonuniform PDF Given a nonuniform PDF p_X(x), the expected value is obtained by evaluating E[X] = ∫_{−∞}^{∞} x p_X(x) dx over the support of the PDF.

Not all PDFs have expected values Before computing E[X] we must make sure that it exists. Not all integrals of the form ∫_{−∞}^{∞} x p_X(x) dx converge, even if ∫_{−∞}^{∞} p_X(x) dx = 1. Example: if p_X(x) = 1/x² for x ≥ 1 (and zero otherwise), then ∫_1^∞ p_X(x) dx = 1, but E[X] = ∫_1^∞ x (1/x²) dx = ∫_1^∞ (1/x) dx = ∞.

Not all PDFs have expected values A more subtle and surprising example is the Cauchy PDF, p_X(x) = 1/(π(1 + x²)). Since the PDF is symmetric about x = 0, we would expect that E[X] = 0. However, by correctly interpreting the region of integration in a limiting sense, we have E[X] = lim_{L→∞, U→∞} ∫_{−L}^{U} x p_X(x) dx.

Not all PDFs have expected values But for a Cauchy PDF, ∫_{−L}^{U} x p_X(x) dx = (1/(2π)) ln((1 + U²)/(1 + L²)). Hence, if the limits are taken independently, the result is indeterminate. To make the expected value useful in practice, the independent choice of limits (and not L = U) is necessary.

Not all PDFs have expected values The indeterminacy can be avoided if we require "absolute convergence": ∫_{−∞}^{∞} |x| p_X(x) dx < ∞. Hence, E[X] is defined to exist if E[|X|] < ∞. Because of the slow decay of the "tails" of the Cauchy PDF (∝ 1/x²), very large outcomes are possible.
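A quick simulation illustrates what the failure of E[|X|] < ∞ means in practice: running sample means of Cauchy draws never settle down the way they would for a distribution with a finite mean. A sketch (the seed, sampler, and sample sizes are arbitrary illustrative choices):

```python
import math
import random

random.seed(0)

def cauchy_sample():
    # Inverse-CDF sampling: tan(pi * (U - 1/2)) is standard Cauchy for U ~ U(0, 1)
    return math.tan(math.pi * (random.random() - 0.5))

# Sample means keep jumping around instead of converging to 0,
# because occasional enormous outcomes dominate the average.
for n in (10, 1_000, 100_000):
    mean = sum(cauchy_sample() for _ in range(n)) / n
    print(n, mean)
```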

Mean value is the best predictor The expected value is the best guess of the outcome of the RV. By "best" we mean that using b = E[X] as our estimator minimizes the mean square error, which is defined as mse(b) = E[(X − b)²]. See Problem 11.5 for more.
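The claim that b = E[X] minimizes mse(b) can be checked empirically by sweeping b over a grid. This sketch uses an assumed N(3, 1) example (the distribution, grid, and seed are my own illustrative choices); the minimizer lands at the sample mean, near E[X] = 3.

```python
import random

random.seed(1)
# Samples from an assumed N(3, 1) distribution, so E[X] = 3
xs = [random.gauss(3.0, 1.0) for _ in range(10_000)]
sample_mean = sum(xs) / len(xs)

def mse(b):
    # Empirical mean square error, an estimate of E[(X - b)^2]
    return sum((x - b) ** 2 for x in xs) / len(xs)

# Sweep b over a 0.01-spaced grid on [0, 6]; the minimizer sits at the sample mean
best_b = min((k / 100 for k in range(601)), key=mse)
print(best_b, sample_mean)
```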

Expected Values for Important PDFs Uniform: if X ~ U(a, b), then E[X] = (a + b)/2, i.e., the mean lies at the midpoint of the interval. See Problem 11.8 for more. Exponential: if X ~ exp(λ), then E[X] = 1/λ. Gaussian: if X ~ N(μ, σ²), then since the PDF is symmetric about the point x = μ, we know that E[X] = μ. We can also derive the mean directly: E[X] = ∫_{−∞}^{∞} x p_X(x) dx = ∫_{−∞}^{∞} (x − μ) p_X(x) dx + μ ∫_{−∞}^{∞} p_X(x) dx. Letting u = x − μ in the first integral makes the integrand odd, so that integral is 0; the second integral is μ · 1 = μ.

Expected Values for Important PDFs Laplacian: since the PDF is symmetric about x = 0 (and the expected value exists, which is needed to avoid the situation of the Cauchy PDF), we must have E[X] = 0. Gamma: if X ~ Γ(α, λ), then E[X] = ∫_0^∞ x p_X(x) dx. To evaluate this integral we modify the integrand so that it becomes the PDF of a Γ(α′, λ′) RV; then we can equate the integral to one, which yields E[X] = α/λ.

Expected Values for Important PDFs Rayleigh: it can be shown that E[X] = σ√(π/2). Chi-square: a chi-square RV is a gamma RV with α = N/2 and λ = 1/2; thus E[X] = α/λ = N.

Expected Value for a Function of a RV If Y = g(X), where X is a continuous RV, then, assuming that Y is also a continuous RV with PDF p_Y(y), we have by definition E[Y] = ∫_{−∞}^{∞} y p_Y(y) dy. Alternatively, we can use g(X) directly: E[g(X)] = ∫_{−∞}^{∞} g(x) p_X(x) dx.

Partial Proof For simplicity, assume that Y = g(X) is a continuous RV with PDF p_Y(y). Also, assume that g is monotonically increasing, so that y = g(x) has a single solution x = g⁻¹(y) for every y. (Figure: a monotonically increasing function used to derive E[g(X)].)

Partial Proof Next, change variables from y to x using x = g⁻¹(y). Since we have assumed that g(x) is monotonically increasing, the limits of ±∞ for y also become ±∞ for x. Then, since x = g⁻¹(y), the term y p_X(g⁻¹(y)) becomes g(x) p_X(x), and because g is monotonically increasing, the derivative of g⁻¹ is positive. From this, the result E[g(X)] = ∫_{−∞}^{∞} g(x) p_X(x) dx follows.

Expectation of a linear (affine) function If Y = aX + b, then since g(x) = ax + b, we have E[Y] = ∫_{−∞}^{∞} (ax + b) p_X(x) dx = a ∫ x p_X(x) dx + b ∫ p_X(x) dx. Thus E[aX + b] = aE[X] + b. More generally, it is easily shown that E[Σ_i a_i g_i(X)] = Σ_i a_i E[g_i(X)].
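The linearity property is easy to verify by simulation. A sketch with X ~ U(0, 1) and the arbitrary illustrative choice a = 2, b = 3, so that aE[X] + b = 2·(1/2) + 3 = 4:

```python
import random

random.seed(2)
xs = [random.random() for _ in range(200_000)]   # X ~ U(0, 1), E[X] = 1/2

lhs = sum(2 * x + 3 for x in xs) / len(xs)       # sample mean of Y = 2X + 3
rhs = 2 * (sum(xs) / len(xs)) + 3                # 2 * (sample mean of X) + 3
print(lhs, rhs)                                  # both near 4
```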

Power of N(0, 1) RV If X ~ N(0, 1) and Y = X², consider E[Y] = E[X²]. The quantity E[X²] is the average squared value of X and can be interpreted physically as a power: if X is a voltage across a 1 ohm resistor, then X² is the power and therefore E[X²] is the average power. Writing E[X²] = ∫ x · x p_X(x) dx, we use integration by parts with U = x, dU = dx, dV = x p_X(x) dx, and V = −p_X(x) (since dp_X/dx = −x p_X(x) for the standard normal PDF). The boundary term [−x p_X(x)] from −∞ to ∞ evaluates to 0 by L'Hospital's rule, and the remaining integral is ∫ p_X(x) dx = 1, so E[X²] = 1.
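The average-power interpretation can be checked by Monte Carlo: the sample mean of X² for standard normal draws estimates E[X²] = 1. A sketch (sample size and seed are arbitrary):

```python
import random

random.seed(3)
# Average power of X ~ N(0, 1): sample mean of X^2 estimates E[X^2] = 1
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]
avg_power = sum(x * x for x in xs) / len(xs)
print(avg_power)   # close to 1
```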

Expected value of an indicator RV An indicator function indicates whether a point is in a given set. Example: if the set A = [3, 4], then the indicator function is defined as I_A(x) = 1 for 3 ≤ x ≤ 4 and I_A(x) = 0 otherwise.

Expected value of an indicator RV I_A(x) may be thought of as a generalization of the unit step function: with u(x) = 1 for x ≥ 0 and zero otherwise, I_A(x) can be written as a difference of shifted unit steps. If X is a RV, then I_A(X) is a transformed RV that takes on the values 1 and 0, depending upon whether the outcome belongs to A or not. Its expectation is E[I_A(X)] = ∫_{−∞}^{∞} I_A(x) p_X(x) dx = ∫_A p_X(x) dx = P[X ∈ A]. The expected value of the indicator RV is the probability of the set or event.

Expected value of an indicator RV: Example Consider the estimation of P[3 ≤ X ≤ 4]. But this is just E[I_A(X)]! To estimate the expected value of a transformed RV, we take observed outcomes x_1, x_2, …, x_M, transform each one to the new RV, and finally compute the sample mean for our estimate: P̂[3 ≤ X ≤ 4] = (1/M) Σ_{i=1}^{M} I_A(x_i). This is what we have been using all along, since Σ_i I_A(x_i) counts all the outcomes for which 3 ≤ x_i ≤ 4. The indicator function provides a means to connect the expected value with the probability.
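The indicator estimator above can be sketched directly. The choice X ~ N(3, 1) is an assumption of mine for illustration; with it, the exact probability is Φ(1) − Φ(0) = 0.5·erf(1/√2) ≈ 0.3413, and the sample mean of I_A(X) should land close to it.

```python
import math
import random

random.seed(4)

def indicator(x):
    # I_A(x) for A = [3, 4]
    return 1.0 if 3.0 <= x <= 4.0 else 0.0

# Assumed example distribution: X ~ N(3, 1)
xs = [random.gauss(3.0, 1.0) for _ in range(200_000)]
estimate = sum(indicator(x) for x in xs) / len(xs)   # sample mean of I_A(X)
exact = 0.5 * math.erf(1.0 / math.sqrt(2.0))         # Phi(1) - Phi(0)
print(estimate, exact)
```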

Variance of a continuous RV The variance of a continuous RV, as for a discrete RV, measures the average squared deviation from the mean: var(X) = E[(X − E[X])²]. Variance of a normal RV: let us find the variance of a N(μ, σ²) RV. Recall that E[X] = μ, so var(X) = ∫_{−∞}^{∞} (x − μ)² p_X(x) dx. Letting u = (x − μ)/σ produces var(X) = σ² ∫_{−∞}^{∞} u² (1/√(2π)) e^{−u²/2} du = σ², where the last integral equals 1 by the E[X²] = 1 result for a N(0, 1) RV.

Variance of a Normal RV It is common to refer to the square root of the variance as the standard deviation σ; it indicates how closely outcomes tend to cluster about the mean. If the RV is N(μ, σ²), then 68.3% of the outcomes will be within the interval [μ − σ, μ + σ], 95.5% will be within [μ − 2σ, μ + 2σ], and 99.7% will be within [μ − 3σ, μ + 3σ].

Variance of a uniform RV If X ~ U(a, b), then var(X) = ∫_a^b (x − (a + b)/2)² (1/(b − a)) dx, and letting u = x − (a + b)/2 we have var(X) = (1/(b − a)) ∫_{−(b−a)/2}^{(b−a)/2} u² du = (b − a)²/12.
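The (b − a)²/12 formula is easy to sanity-check by simulation; a sketch with the arbitrary illustrative interval (a, b) = (2, 5), for which (b − a)²/12 = 9/12 = 0.75:

```python
import random

random.seed(5)
a, b = 2.0, 5.0                      # arbitrary interval for illustration
xs = [random.uniform(a, b) for _ in range(200_000)]

mean = sum(xs) / len(xs)             # should be near (a + b)/2 = 3.5
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(var, (b - a) ** 2 / 12)        # both near 0.75
```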

Properties of Variance The variance of a continuous RV enjoys the same properties as for a discrete RV. An alternative form for computing the variance is var(X) = E[X²] − (E[X])². If c is a constant, then var(c) = 0, var(X + c) = var(X), and var(cX) = c² var(X). The variance is a nonlinear type of operation in that var(g₁(X) + g₂(X)) ≠ var(g₁(X)) + var(g₂(X)) in general.

Moments E[X^n] is termed the nth moment, and it is defined to exist if E[|X|^n] < ∞. If it is known that E[X^s] exists, then it can be shown that E[X^r] exists for r < s. Conversely, if E[X^r] is known not to exist, then E[X^s] cannot exist for s > r.

Moments of an exponential RV For X ~ exp(λ), E[X^n] = ∫_0^∞ x^n λ e^{−λx} dx. To evaluate this, we first show how the nth moment can be written recursively in terms of the (n − 1)st moment. Knowing that E[X] = 1/λ and applying integration by parts will then yield all of the moments.

Moments of an exponential RV Thus, with U = x^n and dV = λ e^{−λx} dx, so that dU = n x^{n−1} dx and V = −e^{−λx}, we have E[X^n] = [−x^n e^{−λx}]_0^∞ + ∫_0^∞ n x^{n−1} e^{−λx} dx = (n/λ) ∫_0^∞ x^{n−1} λ e^{−λx} dx = (n/λ) E[X^{n−1}]. Hence, the nth moment can be written in terms of the (n − 1)st moment. Since we know that E[X] = 1/λ, we have E[X²] = (2/λ)(1/λ) = 2/λ².

Moments of an exponential RV and in general E[X^n] = n!/λ^n. Using this result, the variance is found to be var(X) = E[X²] − (E[X])² = 2/λ² − 1/λ² = 1/λ².
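The closed form E[X^n] = n!/λ^n can be compared against sample moments; a sketch with the arbitrary illustrative rate λ = 2, for which the first three moments are 1/2, 1/2, and 3/4:

```python
import math
import random

random.seed(6)
lam = 2.0                                        # arbitrary rate for illustration
xs = [random.expovariate(lam) for _ in range(500_000)]

# Compare sample moments of X ~ exp(lam) with the formula E[X^n] = n! / lam^n
for n in (1, 2, 3):
    sample_moment = sum(x ** n for x in xs) / len(xs)
    print(n, sample_moment, math.factorial(n) / lam ** n)
```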

Central moments Often it is important to be able to compute moments about some point; usually this point is the mean. The nth central moment about the point E[X] is defined as E[(X − E[X])^n]. Expanding (X − E[X])^n via the binomial theorem and using linearity, we finally have that E[(X − E[X])^n] = Σ_{k=0}^{n} C(n, k) E[X^k] (−E[X])^{n−k}.

Properties of continuous RVs

Pafnuty Chebyshev Pafnuty Lvovich Chebyshev was a Russian mathematician. Chebyshev earned his bachelor's degree at Moscow University. After Chebyshev became a professor of mathematics at St. Petersburg University, his two most illustrious graduate students were Andrei Andreyevich Markov (the elder) and Aleksandr Lyapunov. Chebyshev is known for his work in the fields of probability, statistics, mechanics, and number theory.

Chebyshev Inequality The mean and variance of a RV indicate the average value and variability of the outcomes of a repeated experiment. However, they are not sufficient to determine probabilities of events. For example, the Gaussian PDF p_X(x) = (1/√(2π)) e^{−x²/2} and the Laplacian PDF p_X(x) = (1/√2) e^{−√2 |x|} both have E[X] = 0 (due to symmetry about x = 0) and var(X) = 1. Yet, the probability of a given interval can be very different. Although the relationship between the mean and variance and the probability of an event is not a direct one, we can still obtain some information about the probabilities based on the mean and variance.

Chebyshev Inequality It is possible to bound the probability P[|X − E[X]| > γ], i.e., to be able to assert that it does not exceed a value computed from the mean and variance alone. This is useful if we only wish to make sure the probability is below a certain value, without having to find the probability itself.

Example: Chebyshev Inequality If the probability of a speech signal of mean 0 and variance 1 exceeding a given magnitude γ is to be no more than 1%, then we would be satisfied if we could determine a γ so that P[|X| > γ] ≤ 0.01. Let us show that the probability of the event |X − E[X]| > γ can be bounded if we know the mean and variance. The PDF does not need to be known!

Chebyshev Inequality Using the definition of the variance, we have var(X) = ∫_{−∞}^{∞} (x − E[X])² p_X(x) dx ≥ ∫_{|x − E[X]| > γ} (x − E[X])² p_X(x) dx ≥ γ² ∫_{|x − E[X]| > γ} p_X(x) dx = γ² P[|X − E[X]| > γ], so that we have the Chebyshev inequality P[|X − E[X]| > γ] ≤ var(X)/γ².

Chebyshev Inequality: Example P[|X| > γ] for Gaussian and Laplacian RVs with zero mean and unit variance, compared to the Chebyshev bound.
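For the Gaussian case, the comparison in the figure can be reproduced in closed form, since P[|X| > γ] = 2(1 − Φ(γ)) for X ~ N(0, 1). A sketch (the γ values are arbitrary illustrative choices) showing how loose the bound var(X)/γ² = 1/γ² is:

```python
import math

def gaussian_two_sided_tail(gamma):
    # Exact P[|X| > gamma] for X ~ N(0, 1): 2 * (1 - Phi(gamma))
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(gamma / math.sqrt(2.0))))

rows = []
for gamma in (1.5, 2.0, 3.0):
    chebyshev_bound = 1.0 / gamma ** 2        # var(X) / gamma^2 with var(X) = 1
    rows.append((gamma, gaussian_two_sided_tail(gamma), chebyshev_bound))
    print(rows[-1])                           # exact tail is well below the bound
```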

Problems Problem 1. Prove that if the PDF is symmetric about a point x = a, which is to say that it satisfies p_X(a + u) = p_X(a − u) for all −∞ < u < ∞, then the mean will be a. Hint: write the integral ∫_{−∞}^{∞} x p_X(x) dx as ∫_{−∞}^{a} x p_X(x) dx + ∫_{a}^{∞} x p_X(x) dx, and then let u = x − a in the first integral and u = a − x in the second integral. Problem 2. Find the mean for a uniform PDF. Do so by using the definition and then rederive it using the result from Problem 1.

Problems Problem 3. Prove that the best prediction of the outcome of a continuous RV is its mean. "Best" is to be interpreted as the value b that minimizes the mean square error mse(b) = E[(X − b)²]. Problem 4. Determine the mean of the chi-square PDF.