Limits and the Law of Large Numbers
Lecture XIII
Almost Sure Convergence

Let $\omega$ represent the entire random sequence $\{Z_t\}$. As discussed last time, our interest typically centers on the averages of this sequence: $b_n(\omega) = \frac{1}{n}\sum_{t=1}^{n} Z_t$.
Definition 2.9: Let $\{b_n(\omega)\}$ be a sequence of real-valued random variables. We say that $b_n(\omega)$ converges almost surely to $b$, written $b_n(\omega) \xrightarrow{a.s.} b$, if and only if there exists a real number $b$ such that $P\{\omega : \lim_{n \to \infty} b_n(\omega) = b\} = 1$.
The probability measure $P$ describes the distribution of $\omega$ and determines the joint distribution function for the entire sequence $\{Z_t\}$. Other common terminology is that $b_n(\omega)$ converges to $b$ with probability 1 (w.p.1) or that $b_n(\omega)$ is strongly consistent for $b$.
Example 2.10: Let $b_n(\omega) = \frac{1}{n}\sum_{t=1}^{n} Z_t$, where $\{Z_t\}$ is a sequence of independently and identically distributed (i.i.d.) random variables with $E(Z_t) = \mu < \infty$. Then $b_n(\omega) \xrightarrow{a.s.} \mu$ by the Kolmogorov strong law of large numbers (Theorem 3.1).
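A minimal simulation sketch of Example 2.10 (not part of the lecture; the exponential distribution, seed, and sample sizes are my choices for illustration): the running sample mean of i.i.d. draws settles down to $\mu = E(Z_t)$ as $n$ grows.

```python
# Illustrative sketch of the Kolmogorov SLLN: b_n = (1/n) * sum(Z_t) -> mu.
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0
z = rng.exponential(scale=mu, size=100_000)      # i.i.d. Z_t with E(Z_t) = mu

# Running means b_n for n = 1, ..., 100000
running_means = np.cumsum(z) / np.arange(1, z.size + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6d}   b_n = {running_means[n - 1]:.4f}   (mu = {mu})")
```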
Proposition 2.11: Given $g: R^k \to R^l$ ($k, l < \infty$) and any sequence $\{b_n\}$ such that $b_n \xrightarrow{a.s.} b$, where $b_n$ and $b$ are $k \times 1$ vectors, if $g$ is continuous at $b$, then $g(b_n) \xrightarrow{a.s.} g(b)$.
Theorem 2.12: Suppose
(i) $y = X\beta_0 + \varepsilon$;
(ii) $X'\varepsilon / n \xrightarrow{a.s.} 0$;
(iii) $X'X / n \xrightarrow{a.s.} M$, finite and positive definite.
Then $\beta_n$ exists a.s. for all $n$ sufficiently large, and $\beta_n \xrightarrow{a.s.} \beta_0$.
Proof: Since $X'X/n \xrightarrow{a.s.} M$, it follows from Proposition 2.11 that $\det(X'X/n) \xrightarrow{a.s.} \det(M)$. Because $M$ is positive definite by (iii), $\det(M) > 0$. It follows that $\det(X'X/n) > 0$ a.s. for all $n$ sufficiently large, so $(X'X/n)^{-1}$ exists a.s. for all $n$ sufficiently large. Hence $\beta_n = (X'X/n)^{-1}(X'y/n)$ exists a.s. for all $n$ sufficiently large.
In addition, $\beta_n = \beta_0 + (X'X/n)^{-1}(X'\varepsilon/n)$. It follows from Proposition 2.11 that $\beta_n \xrightarrow{a.s.} \beta_0 + M^{-1}\cdot 0 = \beta_0$.
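A hedged simulation sketch of Theorem 2.12 (my own construction; the design matrix, error scale, and $\beta_0$ are assumptions for illustration): with regressors and errors satisfying the stated conditions, the OLS estimate approaches $\beta_0$ as $n$ grows.

```python
# Illustrative check of OLS strong consistency: beta_n -> beta_0 as n grows.
import numpy as np

rng = np.random.default_rng(1)
beta0 = np.array([1.0, -2.0, 0.5])

for n in (100, 1_000, 10_000, 100_000):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    eps = rng.normal(scale=2.0, size=n)            # errors independent of X
    y = X @ beta0 + eps
    beta_n = np.linalg.solve(X.T @ X, X.T @ y)     # (X'X)^{-1} X'y
    print(f"n = {n:>6d}   beta_n = {np.round(beta_n, 3)}")
```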
Convergence in Probability

A weaker stochastic convergence concept is convergence in probability. Definition 2.23: Let $\{b_n(\omega)\}$ be a sequence of real-valued random variables. If there exists a real number $b$ such that, for every $\varepsilon > 0$, $P[\omega : |b_n(\omega) - b| < \varepsilon] \to 1$ as $n \to \infty$, then $b_n(\omega)$ converges in probability to $b$.
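A small Monte Carlo sketch of Definition 2.23 (illustrative only; the normal draws, $\varepsilon$, and replication counts are my choices): for a fixed $\varepsilon$, the probability that the sample mean misses $\mu$ by $\varepsilon$ or more shrinks toward zero as $n$ grows.

```python
# Illustrative check of convergence in probability for the sample mean.
import numpy as np

rng = np.random.default_rng(2)
mu, eps, reps = 0.0, 0.1, 5_000

for n in (10, 100, 1_000, 10_000):
    draws = rng.normal(loc=mu, size=(reps, n))
    b_n = draws.mean(axis=1)
    p_miss = np.mean(np.abs(b_n - mu) >= eps)
    print(f"n = {n:>5d}   P(|b_n - mu| >= {eps}) ~ {p_miss:.4f}")
```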
The almost sure measure of probability takes into account the joint distribution of the entire sequence $\{Z_t\}$, but with convergence in probability we only need to be concerned with the joint distribution of those elements that appear in $b_n(\omega)$. Convergence in probability is also referred to as weak consistency.
Theorem 2.24: Let $\{b_n(\omega)\}$ be a sequence of random variables. If $b_n$ converges in probability to $b$, then there exists a subsequence $\{b_{n_j}\}$ such that $b_{n_j} \xrightarrow{a.s.} b$.
Convergence in the rth Mean

Definition 2.37: Let $\{b_n(\omega)\}$ be a sequence of real-valued random variables. If there exists a real number $b$ such that $E|b_n(\omega) - b|^r \to 0$ as $n \to \infty$ for some $r > 0$, then $b_n(\omega)$ converges in the rth mean to $b$, written $b_n(\omega) \xrightarrow{r.m.} b$.
Proposition 2.38 (Jensen's inequality): Let $g: R^1 \to R^1$ be a convex function on an interval $B \subset R^1$ and let $Z$ be a random variable such that $P[Z \in B] = 1$. Then $g(E(Z)) \le E(g(Z))$. If $g$ is concave on $B$, then $g(E(Z)) \ge E(g(Z))$.
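A quick numerical check of Jensen's inequality (not from the lecture; the gamma distribution and the convex choice $g(z) = z^2$ are assumptions for illustration):

```python
# Illustrative check: for convex g(z) = z**2, g(E(Z)) <= E(g(Z)).
import numpy as np

rng = np.random.default_rng(3)
z = rng.gamma(shape=2.0, scale=1.5, size=1_000_000)

lhs = np.mean(z) ** 2          # g(E(Z))
rhs = np.mean(z ** 2)          # E(g(Z))
print(f"g(E(Z)) = {lhs:.3f} <= E(g(Z)) = {rhs:.3f}")
```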
Proposition 2.41 (Generalized Chebyshev Inequality): Let $Z$ be a random variable such that $E|Z|^r < \infty$ for some $r > 0$. Then for every $\varepsilon > 0$, $P[|Z| \ge \varepsilon] \le \dfrac{E|Z|^r}{\varepsilon^r}$.
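An illustrative check of the generalized Chebyshev bound with $r = 2$ (the $t$-distributed draws and the grid of $\varepsilon$ values are my assumptions, not the lecture's):

```python
# Illustrative check: P(|Z| >= eps) <= E|Z|^r / eps^r with r = 2.
import numpy as np

rng = np.random.default_rng(4)
z = rng.standard_t(df=5, size=1_000_000)   # heavier tails, finite variance
r = 2

for eps in (1.0, 2.0, 3.0):
    empirical = np.mean(np.abs(z) >= eps)
    bound = np.mean(np.abs(z) ** r) / eps ** r
    print(f"eps = {eps}:  P(|Z| >= eps) ~ {empirical:.4f}   bound = {bound:.4f}")
```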
Theorem 2.42: If $b_n(\omega) \xrightarrow{r.m.} b$ for some $r > 0$, then $b_n(\omega) \xrightarrow{p} b$.
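The link between Proposition 2.41 and Theorem 2.42 is essentially one line; a sketch of the argument (my paraphrase, not the lecture's stated proof) applies the generalized Chebyshev inequality to $b_n(\omega) - b$:

```latex
\[
P\bigl[\,\lvert b_n(\omega) - b\rvert \ge \varepsilon\,\bigr]
  \;\le\; \frac{E\lvert b_n(\omega) - b\rvert^{r}}{\varepsilon^{r}}
  \;\to\; 0 \quad \text{as } n \to \infty,
\]
% the numerator vanishes by Definition 2.37, so b_n converges in probability to b.
```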
Laws of Large Numbers

Proposition 3.0: Given restrictions on the dependence, heterogeneity, and moments of a sequence of random variables $\{Z_t\}$, $\bar{Z}_n - \bar{\mu}_n \to 0$ (almost surely or in probability, depending on the restrictions imposed), where $\bar{Z}_n = \frac{1}{n}\sum_{t=1}^{n} Z_t$ and $\bar{\mu}_n = E(\bar{Z}_n)$.
Independent and Identically Distributed Observations

Theorem 3.1 (Kolmogorov): Let $\{Z_t\}$ be a sequence of i.i.d. random variables. Then $\bar{Z}_n \xrightarrow{a.s.} \mu$ if and only if $E|Z_t| < \infty$ and $E(Z_t) = \mu$. This result is consistent with Theorem 6.2.1 (Khinchine): Let $\{X_i\}$ be independent and identically distributed (i.i.d.) with $E[X_i] = \mu$. Then $\bar{X}_n \xrightarrow{p} \mu$.
Proposition 3.4 (Hölder's Inequality): If $p > 1$ and $1/p + 1/q = 1$, and if $E|Y|^p < \infty$ and $E|Z|^q < \infty$, then $E|YZ| \le [E|Y|^p]^{1/p}\,[E|Z|^q]^{1/q}$. If $p = q = 2$, we have the Cauchy–Schwarz inequality $E|YZ| \le [E(Y^2)]^{1/2}\,[E(Z^2)]^{1/2}$.
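A numeric sanity check of the Cauchy–Schwarz case $p = q = 2$ (illustrative only; the correlated normal draws are my construction):

```python
# Illustrative check of Cauchy-Schwarz: E|YZ| <= sqrt(E Y^2) * sqrt(E Z^2).
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(size=1_000_000)
z = 0.6 * y + rng.normal(size=1_000_000)   # correlated with y

lhs = np.mean(np.abs(y * z))
rhs = np.sqrt(np.mean(y ** 2)) * np.sqrt(np.mean(z ** 2))
print(f"E|YZ| = {lhs:.4f} <= {rhs:.4f}")
```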
Asymptotic Normality

Under the traditional assumptions of the linear model (fixed regressors and normally distributed error terms), $\beta_n$ is distributed multivariate normal with $E(\beta_n) = \beta_0$ and $V(\beta_n) = \sigma^2 (X'X)^{-1}$ for any sample size $n$.
However, when the sample size becomes large, the distribution of $\beta_n$ is approximately normal under some general conditions.
Definition 4.1: Let $\{b_n\}$ be a sequence of random finite-dimensional vectors with joint distribution functions $\{F_n\}$. If $F_n(z) \to F(z)$ as $n \to \infty$ for every continuity point $z$, where $F$ is the distribution function of a random variable $Z$, then $b_n$ converges in distribution to the random variable $Z$, denoted $b_n \xrightarrow{d} Z$.
Other ways of stating this concept are that $b_n$ converges in law to $Z$, written $b_n \xrightarrow{L} Z$, or that $b_n$ is asymptotically distributed as $F$, written $b_n \stackrel{a}{\sim} F$. In this case, $F$ is called the limiting distribution of $b_n$.
Example 4.3: Let $\{Z_t\}$ be an i.i.d. sequence of random variables with mean $\mu$ and variance $\sigma^2 < \infty$. Define $b_n = \sqrt{n}\,(\bar{Z}_n - \mu)/\sigma$. Then $b_n \xrightarrow{d} N(0,1)$ by the Lindeberg–Lévy central limit theorem (Theorem 6.2.2).
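A simulation sketch of Example 4.3 (my own; the uniform draws and sample sizes are assumptions): the standardized mean of i.i.d. uniforms is close to $N(0,1)$ even though the underlying draws are far from normal.

```python
# Illustrative CLT check: b_n = sqrt(n)*(Zbar_n - mu)/sigma is roughly N(0,1).
import numpy as np

rng = np.random.default_rng(6)
mu, sigma = 0.5, np.sqrt(1 / 12)          # mean and std of Uniform(0, 1)
n, reps = 1_000, 100_000

draws = rng.uniform(size=(reps, n))
b_n = np.sqrt(n) * (draws.mean(axis=1) - mu) / sigma

print(f"mean ~ {b_n.mean():.3f}   var ~ {b_n.var():.3f}")
print(f"P(b_n <= 1.96) ~ {np.mean(b_n <= 1.96):.4f}   (N(0,1) gives 0.9750)")
```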
Theorem 6.2.2 (Lindeberg–Lévy): Let $\{X_i\}$ be i.i.d. with $E[X_i] = \mu$ and $V(X_i) = \sigma^2$. Then $Z_n = \sqrt{n}\,(\bar{X}_n - \mu)/\sigma \xrightarrow{d} N(0,1)$.

Definition 4.8: Let $Z$ be a $k \times 1$ random vector with distribution function $F$. The characteristic function of $Z$ is defined as $f(\lambda) = E[\exp(i\lambda' Z)]$, where $i^2 = -1$ and $\lambda$ is a $k \times 1$ real vector.
Example 4.10: Let $Z \sim N(\mu, \sigma^2)$. Then $f(\lambda) = \exp\left(i\mu\lambda - \tfrac{1}{2}\sigma^2\lambda^2\right)$. This result follows from the derivation of the moment generating function in Lecture VII.
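A Monte Carlo check of Example 4.10 (illustrative; the particular $\mu$, $\sigma$, and grid of $\lambda$ values are my choices): the simulated value of $E[\exp(i\lambda Z)]$ should match the closed-form expression.

```python
# Illustrative check of the normal characteristic function
# f(lam) = exp(i*mu*lam - sigma^2*lam^2/2).
import numpy as np

rng = np.random.default_rng(7)
mu, sigma = 1.0, 2.0
z = rng.normal(loc=mu, scale=sigma, size=1_000_000)

for lam in (0.25, 0.5, 1.0):
    empirical = complex(np.mean(np.exp(1j * lam * z)))
    formula = complex(np.exp(1j * mu * lam - 0.5 * sigma**2 * lam**2))
    print(f"lam = {lam}:  MC {empirical:.4f}   formula {formula:.4f}")
```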
Specifically, note the similarity between the definition of the moment generating function, $M_Z(t) = E[\exp(tZ)]$, and the characteristic function, $f(\lambda) = E[\exp(i\lambda Z)]$.

Theorem 4.11 (Uniqueness Theorem): Two distribution functions are identical if and only if their characteristic functions are identical.
Note that we have a similar theorem for moment generating functions.

Proof of Lindeberg–Lévy: First define $f(\lambda)$ as the characteristic function of $Z_t - \mu$, and let $f_n(\lambda)$ be the characteristic function of $Z_n = \sqrt{n}\,(\bar{Z}_n - \mu)/\sigma$.
By the structure of the characteristic function (the $Z_t$ are independent, and $Z_n$ is a scaled sum of the $Z_t - \mu$), we have $f_n(\lambda) = \left[f\!\left(\dfrac{\lambda}{\sigma\sqrt{n}}\right)\right]^n$.
Taking a second-order Taylor series expansion of $f(\lambda)$ around $\lambda = 0$ gives $f(\lambda) = 1 + i\lambda E(Z_t - \mu) - \tfrac{1}{2}\lambda^2 E[(Z_t - \mu)^2] + o(\lambda^2) = 1 - \tfrac{1}{2}\sigma^2\lambda^2 + o(\lambda^2)$. Thus, $f_n(\lambda) = \left[1 - \dfrac{\lambda^2}{2n} + o\!\left(\dfrac{1}{n}\right)\right]^n \to \exp\!\left(-\dfrac{\lambda^2}{2}\right)$ as $n \to \infty$.
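A numeric illustration of this limit (my construction, not the lecture's): take $Z_t \sim$ Exp(1), so $\mu = \sigma = 1$ and the characteristic function of $Z_t - \mu$ is $e^{-i\lambda}/(1 - i\lambda)$; the $n$-fold product $[f(\lambda/\sqrt{n})]^n$ then approaches $\exp(-\lambda^2/2)$, the $N(0,1)$ characteristic function.

```python
# Illustrative check that [f(lam/(sigma*sqrt(n)))]^n -> exp(-lam^2/2)
# for the centered Exp(1) summand (mu = sigma = 1).
import numpy as np

def f(lam):
    # Characteristic function of Z - 1 with Z ~ Exp(1).
    return np.exp(-1j * lam) / (1 - 1j * lam)

lam = 1.5
target = np.exp(-lam**2 / 2)
for n in (10, 100, 1_000, 10_000):
    fn = f(lam / np.sqrt(n)) ** n
    print(f"n = {n:>5d}   [f(lam/sqrt(n))]^n = {complex(fn):.6f}   "
          f"exp(-lam^2/2) = {target:.6f}")
```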
Thus, by the Uniqueness Theorem, the characteristic function of the standardized sample mean approaches the characteristic function of the standard normal, so $Z_n \xrightarrow{d} N(0,1)$.