Chapter 16 Random Variables math2200
Life insurance A life insurance policy: –Pay $10,000 when the client dies –Pay $5,000 if the client is permanently disabled –Charge $50 per year
Random variable We call the variable X a random variable if the numeric value of X is based on the outcome of a random event. e.g. The amount the company pays out on one policy –Random variable is often denoted by a capital letter, e.g. X, Y and Z. A particular value of the variable is often denoted by the corresponding lower case letter, e.g. x, y and z
Random variable Discrete –If we can list all the outcomes (finite or countable), e.g. the amount the insurance pays out is either $10,000, $5,000 or $0 Continuous –Can take any numeric value within a range of values, e.g. the time it takes you to get from home to school
Probability model The collection of all possible values and the probabilities that they occur is called the probability model for the random variable.
Example Death rate: 1 out of every 1000 people per year Disability rate: 2 out of every 1000 people per year
Probability model
Policyholder outcome   Payment (x)   Probability Pr(X=x)
Death                  10,000        1/1000
Disability             5,000         2/1000
Neither                0             997/1000
What does the insurance company expect? 1000 people insured and in a year, –1 dies –2 disabled –pays $10,000 + $5,000*2 = $20,000 –payment per customer: $20,000/1000 = $20 –charge per customer: $50 –profit : $30 per customer!
Expected value $20 is the expected payment per customer:
\[ E(X) = 10000 \cdot \frac{1}{1000} + 5000 \cdot \frac{2}{1000} + 0 \cdot \frac{997}{1000} = 20 \]
If X is a discrete random variable,
\[ E(X) = \sum_x x \, P(X = x) \]
Expected value Of particular interest is the value we expect a random variable to take on, denoted μ (for population mean) or E(X) (for expected value). The expected value of a (discrete) random variable can be found by summing the products of each possible value and the probability that it occurs:
\[ \mu = E(X) = \sum_x x \, P(X = x) \]
Note: Be sure that every possible outcome is included in the sum, and verify that you have a valid probability model to start with.
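The sum above can be sketched in a few lines of Python. This is a minimal illustration using the life insurance probability model from the example; the names `model` and `expected_value` are made up for this sketch, and exact fractions are used so the validity check on the model does not trip over rounding.

```python
from fractions import Fraction

# Probability model from the life insurance example: payment -> probability.
model = {
    10_000: Fraction(1, 1000),  # death
    5_000: Fraction(2, 1000),   # disability
    0: Fraction(997, 1000),     # neither
}

def expected_value(model):
    """Sum of value * probability over every outcome of a discrete model."""
    # A valid probability model must have probabilities that sum to 1.
    if sum(model.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in model.items())

mean_payment = expected_value(model)
print(mean_payment)  # 20
```

Leaving out the "Neither" row would make the check fail, which is the point of the note about including every outcome.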
How about spread? Most of the time, the company makes $50 per customer. But, with small probability, the company needs to pay a lot ($10,000 or $5,000). The variation is big. How do we measure the variation?
Spread The variance of a random variable is:
\[ \mathrm{Var}(X) = \sum_x [x - E(X)]^2 \, P(X = x) \]
The standard deviation for a random variable is:
\[ SD(X) = \sqrt{\mathrm{Var}(X)} \]
Variance and standard deviation
Policyholder outcome   Payment (x)   Probability Pr(X=x)   Deviation x - E(X)
Death                  10,000        1/1000                10,000 - 20 = 9980
Disability             5,000         2/1000                5,000 - 20 = 4980
Neither                0             997/1000              0 - 20 = -20
\[ \mathrm{Var}(X) = \sum_x [x - E(X)]^2 \, P(X = x) = 9980^2 \cdot \frac{1}{1000} + 4980^2 \cdot \frac{2}{1000} + (-20)^2 \cdot \frac{997}{1000} = 149{,}600 \]
Standard deviation = square root of variance
SD(X) ≈ $386.78
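The same calculation can be checked numerically. The sketch below is only an illustration of the variance formula applied to the insurance model; the variable names are chosen for this example.

```python
import math

# Probability model from the life insurance example: payment -> probability.
model = {10_000: 0.001, 5_000: 0.002, 0: 0.997}

# Expected value: sum of value * probability.
mean = sum(x * p for x, p in model.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in model.items())

# Standard deviation: square root of the variance (back in dollars).
sd = math.sqrt(variance)

print(f"E(X) = {mean:.2f}, Var(X) = {variance:.1f}, SD(X) = {sd:.2f}")
```

The variance is in squared dollars, which is why the standard deviation is the figure usually quoted next to the $20 mean.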
Properties of expected value and standard deviation Shifting –E(X+c) = E(X) + c –Var(X+c) = Var(X), so SD(X+c) = SD(X) Example: Consider everyone in a company receiving a $5000 increase in salary. Rescaling –E(aX) = aE(X) –Var(aX) = a^2 Var(X), so SD(aX) = |a| SD(X) Example: Consider everyone in a company receiving a 10% increase in salary.
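A quick numerical check of the shifting and rescaling rules, using the two salary examples. The salary model below is hypothetical (the figures are invented for illustration), as is the helper `mean_var`.

```python
import math

def mean_var(model):
    """Return (mean, variance) of a discrete model given as value -> probability."""
    mean = sum(x * p for x, p in model.items())
    var = sum((x - mean) ** 2 * p for x, p in model.items())
    return mean, var

# Hypothetical salary model: salary -> probability (numbers invented).
salaries = {40_000: 0.5, 60_000: 0.3, 100_000: 0.2}
mean, var = mean_var(salaries)

# Shifting: everyone gets a $5000 raise (c = 5000).
mean_shift, var_shift = mean_var({x + 5_000: p for x, p in salaries.items()})

# Rescaling: everyone gets a 10% raise (a = 1.10).
a = 1.10
mean_scale, var_scale = mean_var({a * x: p for x, p in salaries.items()})

print(math.isclose(mean_shift, mean + 5_000))  # the mean moves by c
print(math.isclose(var_shift, var))            # the variance is unchanged
print(math.isclose(mean_scale, a * mean))      # the mean is multiplied by a
print(math.isclose(var_scale, a**2 * var))     # the variance is multiplied by a^2
```

A flat raise moves the center without changing the spread, while a percentage raise stretches both.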
Properties of expected value and standard deviation Additivity –E(X ± Y) = E(X) ± E(Y) –If X and Y are independent, Var(X ± Y) = Var(X) + Var(Y) Suppose the payments for two customers are independent. Then the variance of the total payment to these two customers is Var(X+Y) = Var(X) + Var(Y) = 149,600 + 149,600 = 299,200. If one customer is insured for twice as much, the variance is –Var(2X) = 4Var(X) = 4 × 149,600 = 598,400 –SD(2X) = 2SD(X) ≈ $773.56
X+Y and 2X Random variables do not simply add up together! –X and Y have the same probability model –But they are not the same random variable –So X + Y can NOT be written as X + X = 2X
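The difference between X + Y and 2X can be seen by building both probability models explicitly. The sketch below assumes two independent policies that each follow the insurance model from the earlier example; `mean_var`, `sum_model` and `double_model` are names made up for this illustration.

```python
from itertools import product

# One policy: payment -> probability (from the life insurance example).
model = {10_000: 0.001, 5_000: 0.002, 0: 0.997}

def mean_var(dist):
    """Return (mean, variance) of a discrete model given as value -> probability."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var

# X + Y: two independent policies, so joint probabilities multiply.
sum_model = {}
for (x, px), (y, py) in product(model.items(), repeat=2):
    sum_model[x + y] = sum_model.get(x + y, 0.0) + px * py

# 2X: a single policy insured for twice as much.
double_model = {2 * x: p for x, p in model.items()}

mean_sum, var_sum = mean_var(sum_model)
mean_double, var_double = mean_var(double_model)

print(f"X + Y: mean {mean_sum:.2f}, variance {var_sum:.1f}")
print(f"2X:    mean {mean_double:.2f}, variance {var_double:.1f}")
```

Both have the same mean, but 2X has twice the variance of X + Y: with two separate customers it is very unlikely that both claims happen at once, whereas a doubled policy always pays double.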
Example: Combining random variables Sell a used Isuzu Trooper and purchase a new Honda motor scooter –Selling the Isuzu for a mean of $6940 with a standard deviation of $250 –Purchasing a new scooter for a mean of $1413 with a standard deviation of $11 How much money do I expect to have after the transaction? What is the standard deviation?
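One way to work this example is sketched below. It assumes the selling price and the purchase price are independent, which the slide does not state explicitly; without that assumption the variances cannot simply be added. Variable names are chosen for this sketch.

```python
import math

# Figures from the example.
sale_mean, sale_sd = 6940, 250          # selling the Isuzu
purchase_mean, purchase_sd = 1413, 11   # buying the scooter

# Money left over is (sale price) - (purchase price).
# Expected values subtract: E(X - Y) = E(X) - E(Y).
left_mean = sale_mean - purchase_mean

# Variances ADD even for a difference (assuming independence):
# Var(X - Y) = Var(X) + Var(Y).
left_var = sale_sd**2 + purchase_sd**2
left_sd = math.sqrt(left_var)

print(f"Expected money left: ${left_mean}")
print(f"Standard deviation: about ${left_sd:.2f}")
```

The standard deviation comes out just above $250, not $250 - $11 = $239 or $250 + $11 = $261: standard deviations themselves neither add nor subtract.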
Combining Random Variables Bad news: the probability model for the sum of two random variables is often different from the models we start with. Good news: the Normal model is special. The probability model for the sum of independent Normal random variables is still Normal.
Example: Combining Normal random variables Packaging stereos –Packing the system: Normal with mean 9 min and sd 1.5 min –Boxing the system: Normal with mean 6 min and sd 1 min What is the probability that packing two consecutive systems takes over 20 minutes? What percentage of the stereo systems take longer to pack than to box?
X1: mean = 9, sd = 1.5 X2: mean = 9, sd = 1.5 T = X1 + X2: total time to pack two systems –E(T) = E(X1) + E(X2) = 9 + 9 = 18 –Var(T) = Var(X1) + Var(X2) = 1.5^2 + 1.5^2 = 4.5 (assuming independence) –T is Normal with mean 18 and sd √4.5 ≈ 2.12 –P(T>20) = normalcdf(20, 1E99, 18, 2.12) ≈ 0.173 (a table lookup with z rounded to 0.94 gives 0.1736)
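For readers without a calculator that has normalcdf, the same upper-tail probability can be sketched with Python's standard library. `statistics.NormalDist` (Python 3.8+) plays the role of the Normal model here; the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

# Packing time for one system, from the example: Normal with mean 9, sd 1.5.
pack_mean, pack_sd = 9.0, 1.5

# T = X1 + X2 for two independent packing times:
# means add, and variances (not standard deviations) add.
total_mean = pack_mean + pack_mean
total_sd = math.sqrt(pack_sd**2 + pack_sd**2)

# The sum of independent Normal variables is still Normal.
total = NormalDist(mu=total_mean, sigma=total_sd)

# Upper-tail probability P(T > 20) = 1 - P(T <= 20).
p_over_20 = 1 - total.cdf(20)

print(f"sd of T is about {total_sd:.2f}")
print(f"P(T > 20) is about {p_over_20:.3f}")
```

The result is roughly 0.17; the last digits depend on how much the standard deviation and z-score are rounded along the way.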
What percentage of the stereo systems take longer to pack than to box? –P: time for packing –B: time for boxing –D = P - B: difference in times to pack and box a system –The question is P(D>0) = ? –Assuming P and B are independent
E(D) = E(P-B) = E(P) - E(B) = 9 - 6 = 3 Var(D) = Var(P-B) = Var(P) + Var(B) = 1.5^2 + 1^2 = 3.25 SD(D) = √3.25 ≈ 1.80 D is Normal with mean 3 and sd 1.80 P(D>0) = normalcdf(0, 1E99, 3, 1.80) ≈ 0.952 About 95% of all the stereo systems will require more time for packing than for boxing
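The difference calculation can be sketched the same way with `statistics.NormalDist` from Python's standard library. Independence of packing and boxing times is assumed, as in the slide; the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

# From the example: packing is Normal(9, 1.5), boxing is Normal(6, 1).
pack_mean, pack_sd = 9.0, 1.5
box_mean, box_sd = 6.0, 1.0

# D = P - B: means subtract, but variances still ADD (independence assumed).
diff_mean = pack_mean - box_mean
diff_sd = math.sqrt(pack_sd**2 + box_sd**2)

# The difference of independent Normal variables is also Normal.
diff = NormalDist(mu=diff_mean, sigma=diff_sd)

# P(D > 0): packing takes longer than boxing.
p_pack_longer = 1 - diff.cdf(0)

print(f"sd of D is about {diff_sd:.2f}")
print(f"P(D > 0) is about {p_pack_longer:.3f}")
```

This agrees with the "about 95%" conclusion for the stereo systems.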
Correlation and Covariance (OPTIONAL) If E(X) = µ and E(Y) = ν, then the covariance of the random variables X and Y is defined as Cov(X,Y) = E((X - µ)(Y - ν)). The covariance measures how X and Y vary together.
Properties of covariance –Cov(X,Y) = Cov(Y,X) –Cov(X,X) = Var(X) –Cov(cX,dY) = c*d*Cov(X,Y) –Cov(X,Y) = E(XY) - µν –If X and Y are independent, Cov(X,Y) = 0 (the converse is NOT true) –Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X,Y)
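A small example shows why zero covariance does not imply independence. The model below is hypothetical and chosen only for illustration: X takes the values -1, 0 and 1 with equal probability, and Y = X^2, so Y is completely determined by X.

```python
from fractions import Fraction

# Hypothetical joint model: (x, y) -> probability, with Y = X**2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(f):
    """Expected value of f(X, Y) under the joint model."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

mu = expect(lambda x, y: x)  # E(X)
nu = expect(lambda x, y: y)  # E(Y)

# Covariance from the definition and from the shortcut E(XY) - mu*nu.
cov_definition = expect(lambda x, y: (x - mu) * (y - nu))
cov_shortcut = expect(lambda x, y: x * y) - mu * nu

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), 0)

print(cov_definition, cov_shortcut)  # both are 0
print(p_x0_y0, p_x0 * p_y0)          # 1/3 versus 1/9, so not independent
```

Covariance only picks up linear association, so a purely curved relationship like this one can have covariance 0.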
Correlation and Covariance (cont.) Covariance, unlike correlation, doesn’t have to be between -1 and 1. To fix the “problem” we can divide the covariance by each of the standard deviations to get the correlation coefficient:
\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{SD(X) \, SD(Y)} \]
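The effect of dividing by the standard deviations can be sketched numerically. The joint model below is hypothetical (the probabilities are invented for illustration), and `cov_and_corr` is a helper written for this sketch.

```python
import math

def cov_and_corr(joint):
    """Return (covariance, correlation) for a joint model (x, y) -> probability."""
    mu = sum(x * p for (x, _), p in joint.items())
    nu = sum(y * p for (_, y), p in joint.items())
    var_x = sum((x - mu) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - nu) ** 2 * p for (_, y), p in joint.items())
    cov = sum((x - mu) * (y - nu) * p for (x, y), p in joint.items())
    # Correlation: covariance divided by each standard deviation.
    return cov, cov / (math.sqrt(var_x) * math.sqrt(var_y))

# Hypothetical joint model: X and Y tend to be equal.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
cov, corr = cov_and_corr(joint)

# Rescale both variables by 100 (for example, dollars to cents).
rescaled = {(100 * x, 100 * y): p for (x, y), p in joint.items()}
cov_big, corr_big = cov_and_corr(rescaled)

print(f"original: Cov = {cov:.2f}, Corr = {corr:.2f}")
print(f"rescaled: Cov = {cov_big:.2f}, Corr = {corr_big:.2f}")
```

Changing units multiplies the covariance by 100 × 100 but leaves the correlation unchanged, which is why the correlation is the easier number to interpret.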
What Can Go Wrong? Don’t assume everything’s Normal. –You must Think about whether the Normality Assumption is justified. Watch out for variables that aren’t independent: –You can add expected values for any two random variables, but –you can only add variances of independent random variables.
What Can Go Wrong? (cont.) Don’t forget: Variances of independent random variables add. Standard deviations don’t. Don’t forget: Variances of independent random variables add, even when you’re looking at the difference between them. Don’t write independent instances of a random variable with notation that looks like they are the same variables.