Intensive Actuarial Training for Bulgaria, January 2007
Lecture 0 – Review of Probability Theory
By Michael Sze, PhD, FSA, CFA
Topics Covered
– Some definitions and properties
– Moment generating functions
– Some common probability distributions
– Conditional probability
– Properties of expectations
Some Definitions and Properties
Cumulative distribution function F(x) = P(X ≤ x)
– F is non-decreasing: a < b ⇒ F(a) ≤ F(b)
– lim(b→∞) F(b) = 1
– lim(a→−∞) F(a) = 0
– F is right continuous: b_n ↓ b ⇒ lim(n→∞) F(b_n) = F(b)
E[X] = Σ_i x_i p(x_i) = μ, where p(x) = P(X = x)
– E[g(X)] = Σ_i g(x_i) p(x_i)
– E[aX + b] = a E[X] + b
– E[X²] = Σ_i x_i² p(x_i)
– Var(X) = E[(X − μ)²] = E[X²] − (E[X])²
– Var(aX + b) = a² Var(X)
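As a quick numerical check of these definitions, here is a minimal Python sketch that computes E[X], E[X²], and Var(X) from a pmf; the fair-die pmf and the constants a, b are illustrative assumptions, not part of the lecture.

```python
# Minimal sketch: moments of a discrete r.v. from its pmf.
# The pmf below (a fair six-sided die) is an illustrative assumption.

xs = [1, 2, 3, 4, 5, 6]
ps = [1/6] * 6

mean = sum(x * p for x, p in zip(xs, ps))        # E[X]
ex2  = sum(x**2 * p for x, p in zip(xs, ps))     # E[X^2]
var  = ex2 - mean**2                             # Var(X) = E[X^2] - (E[X])^2

a, b = 2, 3
print(mean, var)         # 3.5 and ~2.9167
print(a * mean + b)      # E[aX + b] = a E[X] + b
print(a**2 * var)        # Var(aX + b) = a^2 Var(X)
```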
Moment Generating Functions
Definition: mgf M_X(t) = E[e^(tX)]
Properties:
– There is a 1–1 correspondence between the density f(x) and M_X(t)
– X, Y independent r.v. ⇒ M_{X+Y}(t) = M_X(t) · M_Y(t)
– X_1, …, X_n independent ⇒ M_{Σ_i X_i}(t) = Π_i M_{X_i}(t)
– The mgf is linear in the density: the mgf for f_1 + f_2 + f_3 is M_{X_1}(t) + M_{X_2}(t) + M_{X_3}(t)
– M′_X(0) = E[X]
– M^(n)_X(0) = E[X^n]
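The moment property M^(n)_X(0) = E[X^n] can be checked symbolically. The sketch below differentiates the Poisson mgf (introduced a few slides later); it assumes the sympy library is installed.

```python
# Sketch of M^(n)_X(0) = E[X^n], using the Poisson mgf exp[lam(e^t - 1)].
# Assumes sympy is available.
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = sp.exp(lam * (sp.exp(t) - 1))      # Poisson mgf

EX  = sp.diff(M, t, 1).subs(t, 0)      # M'(0)  = E[X]
EX2 = sp.diff(M, t, 2).subs(t, 0)      # M''(0) = E[X^2]

print(sp.simplify(EX))                 # lam
print(sp.simplify(EX2 - EX**2))        # Var(X) = lam
```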
Some Common Discrete Probability Distributions
– Binomial random variable (r.v.) with parameters (n, p)
– Poisson r.v. with parameter λ
– Geometric r.v. with parameter p
– Negative binomial r.v. with parameters (r, p)
Some Common Continuous Probability Distributions
– Uniform r.v. on (a, b)
– Normal r.v. with parameters (μ, σ²)
– Exponential r.v. with parameter λ
– Gamma r.v. with parameters (s, λ), s, λ > 0
Binomial r.v. B(n, p)
n is a positive integer, 0 ≤ p ≤ 1, q = 1 − p
Probability of getting i heads in n trials: p(i) = nCi p^i q^(n−i)
E[X] = np
Var(X) = npq
M_X(t) = (p e^t + q)^n
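A short sketch, using only the Python standard library, that verifies the binomial mean, variance, and mgf numerically; the values n = 10, p = 0.3, and t = 0.5 are illustrative assumptions.

```python
# Sketch checking the binomial formulas numerically.
from math import comb, exp

n, p = 10, 0.3
q = 1 - p

pmf = [comb(n, i) * p**i * q**(n - i) for i in range(n + 1)]

mean = sum(i * pi for i, pi in enumerate(pmf))
var  = sum(i**2 * pi for i, pi in enumerate(pmf)) - mean**2

print(mean, n * p)        # both ~3.0
print(var, n * p * q)     # both ~2.1

t = 0.5                   # mgf check: E[e^{tX}] vs (p e^t + q)^n
lhs = sum(exp(t * i) * pi for i, pi in enumerate(pmf))
rhs = (p * exp(t) + q)**n
print(abs(lhs - rhs) < 1e-9)   # True
```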
Poisson r.v. with parameter λ
λ > 0 is the expected number of events
Poisson is a good approximation to the binomial for large n, small p, and np not too big: λ ≈ np
p(i) = P(X = i) = e^(−λ) λ^i / i!
E[X] = Var(X) = λ
M_X(t) = exp[λ(e^t − 1)]
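The sketch below compares binomial and Poisson probabilities side by side; n = 1000 and p = 0.003 (so λ = np = 3) are illustrative assumptions.

```python
# Sketch of the Poisson approximation to the binomial for large n, small p.
from math import comb, exp, factorial

n, p = 1000, 0.003
lam = n * p

for i in range(6):
    binom_pi   = comb(n, i) * p**i * (1 - p)**(n - i)
    poisson_pi = exp(-lam) * lam**i / factorial(i)
    print(i, round(binom_pi, 5), round(poisson_pi, 5))   # nearly equal
```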
Geometric r.v. with parameter p
0 ≤ p ≤ 1 is the probability of success in one trial; q = 1 − p
The geometric r.v. gives the probability that the first success occurs on the n-th trial
p(n) = P(X = n) = q^(n−1) p
E[X] = 1/p
Var(X) = q/p²
M_X(t) = p e^t / (1 − q e^t)
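A simulation sketch of the geometric moments; p = 0.25 and the sample size are illustrative assumptions.

```python
# Simulation sketch of E[X] = 1/p and Var(X) = q/p^2 for the geometric r.v.
# (trial number of the first success).
import random

random.seed(0)
p = 0.25

def first_success(p):
    n = 1
    while random.random() >= p:   # failure with probability q = 1 - p
        n += 1
    return n

draws = [first_success(p) for _ in range(100_000)]
mean = sum(draws) / len(draws)
var  = sum(d * d for d in draws) / len(draws) - mean**2

print(mean, 1 / p)             # both near 4
print(var, (1 - p) / p**2)     # both near 12
```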
Negative Binomial r.v. with parameters (r, p)
p = probability of success in each trial; q = 1 − p
r = number of successes wanted
The negative binomial r.v. gives the probability that the r-th success occurs on the n-th trial
p(n) = P(X = n) = (n−1)C(r−1) q^(n−r) p^r
E[X] = r/p
Var(X) = rq/p²
M_X(t) = [p e^t / (1 − q e^t)]^r
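The sketch below checks the negative binomial mean and variance directly from the pmf, truncating the infinite support; r = 3 and p = 0.4 are illustrative assumptions.

```python
# Sketch checking the negative binomial moments from its pmf
# p(n) = C(n-1, r-1) q^{n-r} p^r.
from math import comb

r, p = 3, 0.4
q = 1 - p

# Truncate the infinite support; the tail beyond n = 200 is negligible here.
ns  = range(r, 200)
pmf = [comb(n - 1, r - 1) * q**(n - r) * p**r for n in ns]

mean = sum(n * pn for n, pn in zip(ns, pmf))
var  = sum(n**2 * pn for n, pn in zip(ns, pmf)) - mean**2

print(sum(pmf))              # ~1.0 (sanity check)
print(mean, r / p)           # both near 7.5
print(var, r * q / p**2)     # both near 11.25
```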
Uniform r.v. on (a, b), a < b
f(x) = 1/(b − a) for a < x < b, 0 otherwise
F(c) = (c − a)/(b − a) for a < c < b; F(c) = 0 for c ≤ a and 1 for c ≥ b
E[X] = (a + b)/2
Var(X) = (b − a)²/12
M_X(t) = (e^(tb) − e^(ta)) / [t(b − a)]
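A simulation sketch of the uniform moments; the endpoints a = 2 and b = 5 are illustrative assumptions.

```python
# Simulation sketch of E[X] = (a + b)/2 and Var(X) = (b - a)^2/12
# for the Uniform(a, b) r.v.
import random

random.seed(0)
a, b = 2.0, 5.0

draws = [random.uniform(a, b) for _ in range(100_000)]
mean = sum(draws) / len(draws)
var  = sum(d * d for d in draws) / len(draws) - mean**2

print(mean, (a + b) / 2)        # both near 3.5
print(var, (b - a)**2 / 12)     # both near 0.75
```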
Normal r.v. with parameters (μ, σ²)
By the central limit theorem, many r.v.'s can be approximated by a normal distribution
f(x) = [1/√(2πσ²)] exp[−(x − μ)²/(2σ²)]
E[X] = μ
Var(X) = σ²
M_X(t) = exp[μt + σ²t²/2]
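The central limit theorem can be seen in simulation: standardized sums of uniform r.v.'s already look normal for moderate n. The sample sizes below are illustrative assumptions.

```python
# Sketch of the central limit theorem: standardized sums of uniforms
# are close to N(0, 1).
import random
from math import sqrt

random.seed(0)
n, reps = 30, 50_000
mu, var = 0.5, 1 / 12            # mean/variance of Uniform(0, 1)

z = [(sum(random.random() for _ in range(n)) - n * mu) / sqrt(n * var)
     for _ in range(reps)]

# Compare P(Z <= 1) with the standard normal value ~0.8413.
print(sum(1 for v in z if v <= 1) / reps)
```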
Exponential r.v. with parameter λ > 0
The exponential r.v. X gives the waiting time until the next event happens
X is memoryless: P(X > s + t | X > t) = P(X > s) for all s, t ≥ 0
f(x) = λ e^(−λx) for x ≥ 0, 0 otherwise
F(a) = 1 − e^(−λa)
E[X] = 1/λ
Var(X) = 1/λ²
M_X(t) = λ/(λ − t) for t < λ
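A simulation sketch of the memoryless property; λ, s, t, and the sample size are illustrative assumptions.

```python
# Simulation sketch of P(X > s + t | X > t) = P(X > s).
import random

random.seed(0)
lam, s, t = 2.0, 0.4, 0.7

draws = [random.expovariate(lam) for _ in range(200_000)]

beyond_t = [x for x in draws if x > t]
lhs = sum(1 for x in beyond_t if x > s + t) / len(beyond_t)
rhs = sum(1 for x in draws if x > s) / len(draws)

print(lhs, rhs)   # both near e^{-lam*s} = e^{-0.8} ~ 0.449
```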
Gamma r.v. with parameters (s, λ), s, λ > 0
For integer s, the gamma r.v. X gives the waiting time until the next s events happen
f(x) = λ e^(−λx) (λx)^(s−1) / Γ(s) for x ≥ 0, 0 otherwise
Γ(s) = ∫₀^∞ e^(−y) y^(s−1) dy
Γ(n) = (n − 1)! for a positive integer n, so Γ(1) = 0! = 1
E[X] = s/λ
Var(X) = s/λ²
M_X(t) = [λ/(λ − t)]^s for t < λ
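For integer s the gamma r.v. is a sum of s independent exponential waiting times, which the simulation sketch below uses to check the mean and variance; s = 5 and λ = 2 are illustrative assumptions.

```python
# Sketch of the waiting-time interpretation: the sum of s independent
# Exponential(lam) waiting times has mean s/lam and variance s/lam^2.
import random

random.seed(0)
s, lam = 5, 2.0

draws = [sum(random.expovariate(lam) for _ in range(s))
         for _ in range(100_000)]
mean = sum(draws) / len(draws)
var  = sum(d * d for d in draws) / len(draws) - mean**2

print(mean, s / lam)        # both near 2.5
print(var, s / lam**2)      # both near 1.25
```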
Conditional Probability
Definition: For P(F) > 0, P(E|F) = P(EF)/P(F)
Properties:
– For A_1, …, A_n, where A_i ∩ A_j = ∅ for i ≠ j (mutually exclusive) and ∪_i A_i = S (exhaustive): P(B) = Σ_i P(B|A_i) P(A_i)
– Bayes' Theorem: For P(B) > 0, P(A|B) = [P(B|A) · P(A)] / P(B)
– E[X|A] = Σ_i x_i P(X = x_i | A)
– E[X] = Σ_i E[X|A_i] P(A_i)
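A small worked example of the law of total probability and Bayes' theorem for a two-event partition; all the probabilities used are illustrative assumptions.

```python
# Worked sketch of the law of total probability and Bayes' theorem.

# Partition A1, A2 of the sample space, and P(B | Ai):
P_A = {1: 0.3, 2: 0.7}
P_B_given_A = {1: 0.9, 2: 0.2}

# Law of total probability: P(B) = sum_i P(B | Ai) P(Ai)
P_B = sum(P_B_given_A[i] * P_A[i] for i in P_A)

# Bayes' theorem: P(A1 | B) = P(B | A1) P(A1) / P(B)
P_A1_given_B = P_B_given_A[1] * P_A[1] / P_B

print(P_B)            # 0.41
print(P_A1_given_B)   # ~0.6585
```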
Properties of Expectation
E[X + Y] = E[X] + E[Y]
E[Σ_i X_i] = Σ_i E[X_i]
If X, Y are independent, then E[g(X) h(Y)] = E[g(X)] E[h(Y)]
Def.: Cov(X, Y) = E[(X − E[X])(Y − E[Y])]
Cov(X, Y) = Cov(Y, X)
Cov(X, X) = Var(X)
Cov(aX, Y) = a Cov(X, Y)
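A minimal sketch of the covariance definition on a small joint pmf, using the shortcut Cov(X, Y) = E[XY] − E[X]E[Y]; the joint distribution is an illustrative assumption.

```python
# Sketch of the covariance definition on a small joint pmf.

# Joint pmf P(X = x, Y = y) on a 2x2 support.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

EX  = sum(x * p for (x, y), p in joint.items())
EY  = sum(y * p for (x, y), p in joint.items())
EXY = sum(x * y * p for (x, y), p in joint.items())

cov = EXY - EX * EY        # equivalent to E[(X - E[X])(Y - E[Y])]
print(EX, EY, cov)         # 0.5 0.6 0.1
```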
Properties of Expectation (continued)
Cov(Σ_i X_i, Σ_j Y_j) = Σ_i Σ_j Cov(X_i, Y_j)
Var(Σ_i X_i) = Σ_i Var(X_i) + Σ_{i≠j} Cov(X_i, X_j)
If S_N = X_1 + … + X_N is a compound process where
– the X_i are mutually independent,
– the X_i are independent of N, and
– the X_i have the same distribution,
then (see the simulation sketch below):
E[S_N] = E[N] E[X]
Var(S_N) = E[N] Var(X) + Var(N) (E[X])²
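A simulation sketch of the compound-sum formulas, taking N ~ Poisson(λ) and X ~ Exponential(μ) as illustrative assumptions (so E[N] = Var(N) = λ and E[X] = 1/μ).

```python
# Simulation sketch of E[S_N] = E[N] E[X] and
# Var(S_N) = E[N] Var(X) + Var(N) (E[X])^2.
import random

random.seed(0)
lam, mu = 3.0, 0.5                  # E[N] = Var(N) = lam; E[X] = 1/mu = 2

def poisson(lam):
    # Count exponential interarrival times that fit in [0, 1]:
    # this count is Poisson(lam).
    n, t = 0, random.expovariate(lam)
    while t <= 1.0:
        n += 1
        t += random.expovariate(lam)
    return n

draws = [sum(random.expovariate(mu) for _ in range(poisson(lam)))
         for _ in range(100_000)]
mean = sum(draws) / len(draws)
var  = sum(d * d for d in draws) / len(draws) - mean**2

EN, VN, EX, VX = lam, lam, 1 / mu, 1 / mu**2
print(mean, EN * EX)                 # both near 6
print(var, EN * VX + VN * EX**2)     # both near 24
```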