Sufficient statistics. The Poisson and the exponential can be summarized by $(n, \bar{y})$. So too can the normal with known variance. Consider a statistic $S(Y)$

1 Sufficient statistics. The Poisson and the exponential can be summarized by $(n, \bar{y})$. So too can the normal with known variance. Consider a statistic $S(Y)$. Suppose that the conditional distribution of $Y$ given $S$ does not depend on $\theta$; then $S$ is a sufficient statistic for $\theta$ based on $Y$. This occurs iff the density of $Y$ factors into a function of $s(y)$ and $\theta$ and a function of $y$ that doesn't depend on $\theta$: $f(y;\theta) = g\{s(y);\theta\}\,h(y)$. More Chapter 4.

2 Example. Exponential. $Y \sim$ IExp($\theta$), $E(Y) = \theta$, $\mathrm{Var}(Y) = \theta^2$. Data $y_1, \ldots, y_n$. $L(\theta) = \prod_j \theta^{-1}\exp(-y_j/\theta)$. $l(\theta) = -n\log\theta - \sum_j y_j/\theta$. $\bar{y} = \sum_j y_j/n$ is sufficient.
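The sufficiency claim on this slide can be checked numerically. The sketch below is only an illustration: the two samples are hypothetical, chosen to share the same $n$ and the same mean, and the function names are not from the slides. If $\bar{y}$ is sufficient, both samples should give the same log likelihood at every $\theta$, and the maximum should sit at $\theta = \bar{y}$.

```python
import math


def exp_loglik(theta, ys):
    """l(theta) = -n log(theta) - sum(y_j)/theta for exponential data with mean theta."""
    n = len(ys)
    return -n * math.log(theta) - sum(ys) / theta


def exp_loglik_from_summary(theta, n, ybar):
    """The same log likelihood, computed from the summary (n, ybar) alone."""
    return -n * math.log(theta) - n * ybar / theta


# Hypothetical samples: different values, same n = 4 and same mean 3.0.
sample_a = [1.0, 2.0, 3.0, 6.0]
sample_b = [3.0, 3.0, 3.0, 3.0]

for theta in (1.0, 2.5, 4.0):
    print(theta,
          exp_loglik(theta, sample_a),
          exp_loglik(theta, sample_b),
          exp_loglik_from_summary(theta, 4, 3.0))
```

The three columns printed for each $\theta$ agree up to rounding, which is what the factorization criterion predicts.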

3 maximum

6 Approximate $100(1-2\alpha)\%$ CI for $\theta_0$. Example. Spring data.

7 Weibull.

8 Note. Expected information

9 Gamma.

10 Example. Bernoulli. $\Pr\{Y = 1\} = 1 - \Pr\{Y = 0\} = \pi$, $0 \le \pi \le 1$. $L(\pi) = \prod_j \pi^{y_j}(1-\pi)^{1-y_j} = \pi^r(1-\pi)^{n-r}$. $l(\pi) = r\log\pi + (n-r)\log(1-\pi)$, $r = \sum_j y_j$. $R = \sum_j Y_j$ is sufficient for $\pi$, as is $R/n$. $L(\pi)$ factors into a function of $r$ and a constant.

11 Score: $u(\pi) = r/\pi - (n-r)/(1-\pi)$, with $r = \sum_j y_j$. Observed information: $J(\pi) = r/\pi^2 + (n-r)/(1-\pi)^2$. M.l.e.: $\hat{\pi} = r/n$.
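A minimal sketch of the Bernoulli score, observed information and m.l.e., assuming the forms $r/\pi - (n-r)/(1-\pi)$ and $r/\pi^2 + (n-r)/(1-\pi)^2$ with $r$ the number of successes. The data and function names are hypothetical.

```python
def bernoulli_score(pi, r, n):
    """Derivative of l(pi) = r log(pi) + (n - r) log(1 - pi)."""
    return r / pi - (n - r) / (1.0 - pi)


def bernoulli_observed_info(pi, r, n):
    """Minus the second derivative of l(pi)."""
    return r / pi ** 2 + (n - r) / (1.0 - pi) ** 2


# Hypothetical 0/1 observations.
ys = [1, 0, 0, 1, 1, 0, 1, 1]
n = len(ys)
r = sum(ys)

pi_hat = r / n                                   # m.l.e. r/n
score_at_mle = bernoulli_score(pi_hat, r, n)     # should vanish at the m.l.e.
info_at_mle = bernoulli_observed_info(pi_hat, r, n)
std_error = info_at_mle ** -0.5                  # approximate standard error

print(pi_hat, score_at_mle, info_at_mle, std_error)
```

At the m.l.e. the observed information simplifies to $n/\{\hat{\pi}(1-\hat{\pi})\}$, so the standard error is the familiar $\sqrt{\hat{\pi}(1-\hat{\pi})/n}$.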

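12 Cauchy. ICau($\theta$). $f(y;\theta) = 1/[\pi\{1+(y-\theta)^2\}]$, $E|Y| = \infty$, $\mathrm{Var}(Y) = \infty$. $L(\theta) = \prod_j 1/[\pi\{1+(y_j-\theta)^2\}]$. Many local maxima. $l(\theta) = -\sum_j \log\{1+(y_j-\theta)^2\}$ (up to an additive constant). $J(\theta) = 2\sum_j \{1-(y_j-\theta)^2\}/\{1+(y_j-\theta)^2\}^2$. $I(\theta) = n/2$.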
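The remark about many local maxima can be illustrated with a grid scan of $l(\theta) = -\sum_j \log\{1+(y_j-\theta)^2\}$. This is a sketch under assumptions: the three widely spaced observations are hypothetical, and a grid scan is a crude stand-in for a proper optimizer.

```python
import math


def cauchy_loglik(theta, ys):
    """Cauchy log likelihood for location theta, dropping the constant -n log(pi)."""
    return -sum(math.log1p((y - theta) ** 2) for y in ys)


# Hypothetical, widely separated observations.
ys = [-10.0, 0.0, 10.0]

step = 0.01
grid = [-15.0 + step * i for i in range(3001)]     # theta from -15 to 15
values = [cauchy_loglik(t, ys) for t in grid]

# A grid point is a local maximum if it beats both neighbours.
local_maxima = [
    grid[i]
    for i in range(1, len(grid) - 1)
    if values[i] > values[i - 1] and values[i] > values[i + 1]
]
print(local_maxima)
```

With observations this far apart, each one produces its own bump in the log likelihood, so a hill-climbing routine started in the wrong place can stop at a local rather than the global maximum.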

14 Uniform. $f(u;\theta) = 1/\theta$ for $0 < u < \theta$, $= 0$ otherwise. $L(\theta) = 1/\theta^n$ for $0 < y_1, \ldots, y_n < \theta$, $= 0$ otherwise.

15 $l(\theta)$ becomes increasingly spiky. $E\,u(\theta) = -1$ $i(\theta) = -$

16 Logistic regression. Challenger data. Independent binomials $R_j$, $m_j$, $\pi_j$.

18 Likelihood ratio. Model includes $\theta$, $\dim(\theta) = p$, true (unknown) value $\theta_0$. Likelihood ratio statistic: $W(\theta_0) = 2\{l(\hat{\theta}) - l(\theta_0)\}$.

19 Justification. Multinormal result: if $Y \sim N_p(\mu, \Omega)$ then $(Y-\mu)^T \Omega^{-1} (Y-\mu) \sim \chi^2_p$.

20 Uses. $\Pr[W(\theta_0) \le c_p(1-2\alpha)] \approx 1-2\alpha$. Approx $100(1-2\alpha)\%$ confidence region.

21 Example. Exponential. Spring data: $96 < \theta < 335$ vs. asymptotic normal approximation $64 < \theta < 273$ kcycles.
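The two kinds of interval compared on this slide can be sketched as follows for the exponential model, where $W(\theta) = 2n\{\log(\theta/\bar{y}) + \bar{y}/\theta - 1\}$ and the approximate standard error of $\hat{\theta} = \bar{y}$ is $\bar{y}/\sqrt{n}$. The slides do not list the spring data, so the values of `n` and `ybar` below are assumptions made only for illustration, and the bisection helper is a hypothetical utility rather than anything from the source.

```python
import math


def lr_stat(theta, n, ybar):
    """W(theta) = 2{l(theta_hat) - l(theta)} for exponential data, theta_hat = ybar."""
    return 2.0 * n * (math.log(theta / ybar) + ybar / theta - 1.0)


def bisect(f, lo, hi, tol=1e-8):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n, ybar = 10, 168.3      # ASSUMED summary values, not given in the slides
c = 3.841                # 0.95 quantile of chi-squared on 1 df (alpha = 0.025)


def excess(theta):
    return lr_stat(theta, n, ybar) - c


# Likelihood ratio interval: all theta with W(theta) <= c.
lower = bisect(excess, 1e-3 * ybar, ybar)
upper = bisect(excess, ybar, 20.0 * ybar)

# Interval from the asymptotic normal approximation.
se = ybar / math.sqrt(n)
normal_interval = (ybar - 1.96 * se, ybar + 1.96 * se)

print((lower, upper), normal_interval)
```

The likelihood ratio interval is not symmetric about $\bar{y}$: it extends further to the right, reflecting the skewness of the exponential likelihood, whereas the normal-approximation interval is symmetric by construction.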

22 Prob-value/P-value. See (7.28). Choose $T$ whose large values cast doubt on $H_0$. $\Pr_0(T \ge t_{obs})$. Example. Spring data. Exponential, $E(Y) = \theta$. $H_0: \theta = 100$?

23 Nesting. $\psi$: $p \times 1$ parameter of interest. $\lambda$: $q \times 1$ nuisance parameter. Model with params $(\psi_0, \lambda)$ nested within $(\psi, \lambda)$. Second model reduces to first when $\psi = \psi_0$.

24 Example. Weibull, params $(\theta, \alpha)$; exponential when $\alpha = 1$. How to examine $H_0: \alpha = 1$?

25 Spring failure times. Weibull

26 Challenger data. Logistic regression. Temperature $x_1$, pressure $x_2$. $\pi(\beta_0, \beta_1, \beta_2) = \exp\{\eta\}/(1+\exp\{\eta\})$, $\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ linear predictor. Loglike $l(\beta_0, \beta_1, \beta_2) = \beta_0 \sum_j r_j + \beta_1 \sum_j r_j x_{1j} + \beta_2 \sum_j r_j x_{2j} - m \sum_j \log(1+\exp\{\eta_j\})$. Does pressure matter?
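The log likelihood on this slide can be written as a short function and checked against the binomial form $\sum_j \{r_j \log\pi_j + (m - r_j)\log(1-\pi_j)\}$, which it should equal once the binomial coefficients are dropped. This is a sketch: the arrays below are hypothetical and are not the Challenger data, and the function names are illustrative.

```python
import math


def logistic_loglik(beta, r, x1, x2, m):
    """sum_j [ r_j * eta_j - m * log(1 + exp(eta_j)) ], eta_j = b0 + b1*x1j + b2*x2j."""
    b0, b1, b2 = beta
    total = 0.0
    for rj, x1j, x2j in zip(r, x1, x2):
        eta = b0 + b1 * x1j + b2 * x2j
        total += rj * eta - m * math.log1p(math.exp(eta))
    return total


def binomial_loglik(beta, r, x1, x2, m):
    """The same quantity via pi_j = exp(eta_j) / (1 + exp(eta_j))."""
    b0, b1, b2 = beta
    total = 0.0
    for rj, x1j, x2j in zip(r, x1, x2):
        eta = b0 + b1 * x1j + b2 * x2j
        pi = math.exp(eta) / (1.0 + math.exp(eta))
        total += rj * math.log(pi) + (m - rj) * math.log(1.0 - pi)
    return total


# HYPOTHETICAL data: m trials per case, r_j successes, two covariates.
m = 6
r = [0, 1, 0, 2]
x1 = [70.0, 57.0, 66.0, 53.0]
x2 = [50.0, 200.0, 100.0, 200.0]
beta = (5.0, -0.1, 0.001)

print(logistic_loglik(beta, r, x1, x2, m), binomial_loglik(beta, r, x1, x2, m))
```

To ask whether the second covariate matters, one would maximize this function with and without $\beta_2$ and compare twice the difference in maximized log likelihoods with $\chi^2_1$.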

27 Model fit. Are labor times Weibull? Nest its model in a more general one. Generalized gamma: gamma for $\alpha = 1$, Weibull for $\kappa = 1$, exponential for $\alpha = \kappa = 1$.

28 Likelihood results. Max log likelihood: generalized gamma $-250.65$, gamma $-251.12$, Weibull $-251.17$. Gamma vs. generalized gamma, 2 log like diff: $2(-250.65+251.12) = .94$. P-value $\Pr_0(\chi^2_1 > .94) = \Pr(|Z| > .969) = 2(.166) = .332$.
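The P-value arithmetic on this slide can be reproduced with the standard library alone, using the identity $\Pr(\chi^2_1 > w) = \Pr(|Z| > \sqrt{w}) = \mathrm{erfc}(\sqrt{w/2})$. The function name is illustrative; the log likelihood values are the ones quoted above.

```python
import math


def chi2_1_upper_tail(w):
    """Pr(chi-squared on 1 df > w), via Pr(|Z| > sqrt(w)) = erfc(sqrt(w / 2))."""
    return math.erfc(math.sqrt(w / 2.0))


loglik_generalized_gamma = -250.65
loglik_gamma = -251.12

# Likelihood ratio statistic for gamma nested within generalized gamma.
w = 2.0 * (loglik_generalized_gamma - loglik_gamma)
p_value = chi2_1_upper_tail(w)

print(round(w, 2), round(math.sqrt(w), 3), round(p_value, 3))
```

A P-value of this size gives no evidence against the gamma model relative to the generalized gamma.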

29 Chi-squared statistics. Pearson's chi-squared. Categories $1, \ldots, k$. Count of cases in category $i$: $Y_i$. $\Pr(\text{case in } i) = \pi_i$, $0 < \pi_i < 1$, $\sum_1^k \pi_i = 1$. $E(Y_i) = n\pi_i$, $\mathrm{var}(Y_i) = \pi_i(1-\pi_i)n$, $\mathrm{cov}(Y_i, Y_j) = -\pi_i\pi_j n$, $i \ne j$. E.g. $k = 2$ case: $\mathrm{cov}(Y, n-Y) = -\mathrm{var}(Y) = -n\pi_1\pi_2$. Parameter space $\{(\pi_1, \ldots, \pi_k): \sum_1^k \pi_i = 1,\ 0 < \pi_1, \ldots, \pi_k < 1\}$, dimension $k-1$.

30 Reduced dimension possible? Model $\pi_i(\theta)$, $\dim(\theta) = p$. Log like, general model: $\sum_1^{k-1} y_i \log\pi_i + y_k \log[1-\pi_1-\cdots-\pi_{k-1}]$, $\sum_1^k y_i = n$. Log like, restricted model: $l(\theta) = \sum_1^{k-1} y_i \log\pi_i(\theta) + y_k \log[1-\pi_1(\theta)-\cdots-\pi_{k-1}(\theta)]$.

31 Likelihood ratio statistic: approximately $\chi^2_{k-1-p}$ if restricted model true. The statistic is sometimes written $W = 2\sum O_i \log(O_i/E_i) \approx \sum (O_i - E_i)^2/E_i$.

32 Pearson's chi-squared.

33 Example. Birth data. Poisson? Split into $k = 13$ categories [0,7.5), [7.5,8.5), ..., [18.5,24] hours.
O(bserved)  6     3     3     8     ...
E(xpected)  5.23  4.37  6.26  8.08  ...
$P = 4.39$. P-value $\Pr(\chi^2_{11} > 4.39) = .96$.

34 Two way contingency table. $r$ rows and $c$ columns, $n$ individuals. Blood groups A, B, AB, O. A, B antigens: substances causing the body to produce antibodies. Table columns: group, count, model I, model II. O = 1 - A - B.

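36 Question. Rows and columns independent? $W = 2\sum_{i,j} y_{ij} \log\{n y_{ij}/(y_{i\cdot}\, y_{\cdot j})\}$ with $y_{i\cdot} = \sum_j y_{ij}$ and $y_{\cdot j} = \sum_i y_{ij}$; $W \sim \chi^2_{k-1-p} = \chi^2_{(r-1)(c-1)}$ with $k = rc$, $p = (r-1)+(c-1)$. $P = \sum_{i,j} (y_{ij} - y_{i\cdot}\, y_{\cdot j}/n)^2 / (y_{i\cdot}\, y_{\cdot j}/n) \sim \chi^2_{(r-1)(c-1)}$.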
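Both statistics for the independence question can be computed directly from a table of counts. The sketch below assumes the forms $W = 2\sum y_{ij}\log(y_{ij}/e_{ij})$ and $P = \sum (y_{ij}-e_{ij})^2/e_{ij}$ with fitted values $e_{ij} = y_{i\cdot}\,y_{\cdot j}/n$; the 2 by 2 table is hypothetical and the function name is illustrative.

```python
import math


def independence_stats(table):
    """Return (W, P, df) for testing row/column independence in a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    w_stat = 0.0
    p_stat = 0.0
    for i, row in enumerate(table):
        for j, y in enumerate(row):
            fitted = row_totals[i] * col_totals[j] / n   # y_i. * y_.j / n
            if y > 0:                                    # 0 * log(0) is taken as 0
                w_stat += 2.0 * y * math.log(y / fitted)
            p_stat += (y - fitted) ** 2 / fitted

    df = (len(table) - 1) * (len(table[0]) - 1)          # (r - 1)(c - 1)
    return w_stat, p_stat, df


# HYPOTHETICAL 2 x 2 table of counts.
table = [[20, 30],
         [30, 20]]

W, P, df = independence_stats(table)
print(W, P, df)
```

For this table every fitted value is 25, so $P$ is exactly 4 and $W$ is slightly larger; the two statistics typically agree closely when the fitted counts are not small.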

37 Model 1. $W = 17.66$, $\Pr(\chi^2_1 > 17.66) = \Pr(|Z| > 4.202) = 2.646\text{E-}05$. $P = 15.73$, $\Pr(\chi^2_1 > 15.73) = \Pr(|Z| > 3.966) = 7.309\text{E-}05$. $k-1-p = 4-1-2 = 1$. Model 2. $W = 3.17$, $\Pr(|Z| > 1.780) = .075$. $P = 2.82$, $\Pr(|Z| > 1.679) = .093$.

38 Incorrect model. True model $g(y)$, fit $f(y;\theta)$.

39 Example 1. Quadratic, fit linear

40 Example 2. True lognormal, but fit exponential

41 Large sample distribution.

42 Model selection. Various models: non-nested. Ockham's razor: prefer the simplest model.

43 Formal criteria. Look for minimum
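As a sketch of the "look for minimum" rule, the block below assumes the common definitions $\mathrm{AIC} = -2\hat{l} + 2p$ and $\mathrm{BIC} = -2\hat{l} + p\log n$, which may differ by a constant factor from the forms the slides use. It applies AIC to the three maximized log likelihoods quoted on slide 28. The parameter counts (3 for the generalized gamma, 2 for gamma and Weibull) are an assumption, and the sample size passed to `bic` is purely hypothetical.

```python
import math


def aic(max_loglik, p):
    """AIC = -2 * (maximized log likelihood) + 2 * (number of parameters)."""
    return -2.0 * max_loglik + 2.0 * p


def bic(max_loglik, p, n):
    """BIC = -2 * (maximized log likelihood) + p * log(sample size)."""
    return -2.0 * max_loglik + p * math.log(n)


# (maximized log likelihood, ASSUMED number of parameters)
models = {
    "generalized gamma": (-250.65, 3),
    "gamma": (-251.12, 2),
    "Weibull": (-251.17, 2),
}

aic_values = {name: aic(ll, p) for name, (ll, p) in models.items()}
best_by_aic = min(aic_values, key=aic_values.get)

n_hypothetical = 100   # NOT from the slides; only to exercise bic()
bic_values = {name: bic(ll, p, n_hypothetical) for name, (ll, p) in models.items()}

print(aic_values, best_by_aic)
print(bic_values)
```

Because the extra parameter of the generalized gamma buys less than one unit of log likelihood, both criteria penalize it relative to the two-parameter models.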

44 Example. Spring failure.
Model   p    AIC      BIC
M1      12   744.8*   769.9*
M2      7    771.8    786.5
M3      2    827.8    831.2
M4      2    925.1    929.3
6 stress levels. $M_1$: Weibull, unconnected $\alpha$, $\theta$ at each stress level.

