1 Nonparametric Methods I. Henry Horng-Shing Lu, Institute of Statistics, National Chiao Tung University
2 Parametric vs. Nonparametric. MLE: probability distribution and likelihood. Bayes: conditional, prior, and posterior distributions. Distribution free?
3 Motivation (1) In many applications, direct access to a measurement is not possible, yet an estimate of the measurement is needed. Most of the time, large-scale repetition of an experiment is not economically feasible. What can one do?
4 Motivation (2) Q1: What estimator can be used for the problem of interest? Q2: Having chosen an estimator, how accurate is it? What are the bias and variance of the estimator? Q3: How to make inferences? What is the confidence interval? What is the p-value for a hypothesis test?
5 References
B. Efron (1979) Computers and the theory of statistics: thinking the unthinkable. SIAM Review, 21.
B. Efron and R. J. Tibshirani (1993) An Introduction to the Bootstrap. Chapman & Hall.
J. I. De la Rosa and G. A. Fleury (2006) Bootstrap methods for a measurement estimation problem. IEEE Transactions on Instrumentation and Measurement, 55(3), 820–827.
6 Resampling Techniques. Data resampling. PART 1: Jackknife (resampling without replacement). PART 2: Bootstrap (resampling with replacement).
7 PART 1: Jackknife. Naming; Illustration; Math Expression; Examples; R code; C code
8 Why the funny name of Jackknife? Jackknife: a pocket knife. Mosteller and Tukey (1977, p. 133) described a predecessor resampling method, the jackknife, in the following way: “The name ‘jackknife’ is intended to suggest the broad usefulness of a technique as a substitute for specialized tools that may not be available, just as the Boy Scout’s trustworthy tool serves so variedly…”
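9 Illustration of Jackknife [Figure: a sample is drawn from the population (sampling) and resampled N times (resampling); statistics computed on the resamples are used for inference about the population.]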
10 Math Expression
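For reference, the standard jackknife estimates can be written as below. The notation is an assumption made here, not taken from the slides: $x = (x_1, \dots, x_n)$ is the sample, $s(\cdot)$ is the statistic, and $x_{(i)}$ is the sample with the $i$-th observation left out.

```latex
\begin{align*}
\hat\theta &= s(x), &
\hat\theta_{(i)} &= s\bigl(x_{(i)}\bigr), \quad i = 1, \dots, n, &
\hat\theta_{(\cdot)} &= \frac{1}{n} \sum_{i=1}^{n} \hat\theta_{(i)}, \\[4pt]
\widehat{\mathrm{bias}}_{\mathrm{jack}} &= (n-1)\bigl(\hat\theta_{(\cdot)} - \hat\theta\bigr), &
\widehat{\mathrm{var}}_{\mathrm{jack}} &= \frac{n-1}{n} \sum_{i=1}^{n} \bigl(\hat\theta_{(i)} - \hat\theta_{(\cdot)}\bigr)^{2}, &
\hat\theta_{\mathrm{jack}} &= n\hat\theta - (n-1)\hat\theta_{(\cdot)}.
\end{align*}
```

When $s$ is the sample mean, the bias estimate is zero and the variance estimate reduces to the usual $s_x^2 / n$, where $s_x^2$ is the sample variance.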
11 An Example of Jackknife (1) HW
12 An Example of Jackknife (2)
13 Summary of the Jackknife Method
14 How do quartiles lead to an estimate?
15 Jackknife by R 1. Open “R”
16 2. Install add-on packages
17 3. Select a mirror site, such as Taiwan (Taipeh)
18 4. Select the package “bootstrap”
20 5. Type: library(bootstrap)
21 If you want to see the manual, you can type “?jackknife”.
23 R-package
24 Select the menu to open the editor in R
25 You can edit your program in this box and then store this program.
26 You can save your program.
27 main.jackknife.function
28 (1) Use the mouse to select the R commands you want to run. (2) Press “F5” to run them.
29 Output
30 Jackknife by C: define functions
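A minimal sketch of how the jackknife functions might be defined in C. This is not the code shown in the talk; the names `statistic_fn`, `jackknife_result`, `sample_mean`, and `jackknife` are hypothetical, and the variance formula is the standard leave-one-out estimate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interface: a statistic maps a sample of length n to a number. */
typedef double (*statistic_fn)(const double *x, size_t n);

/* Example statistic: the sample mean. */
double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

typedef struct {
    double estimate;  /* statistic on the full sample              */
    double bias;      /* (n-1) * (mean of replicates - estimate)   */
    double variance;  /* (n-1)/n * sum of squared deviations       */
} jackknife_result;

/* Leave-one-out jackknife. Returns 0 on success, -1 if allocation fails. */
int jackknife(const double *x, size_t n, statistic_fn theta,
              jackknife_result *out)
{
    assert(n > 1);
    double *loo = malloc((n - 1) * sizeof *loo);  /* sample without x[i] */
    double *reps = malloc(n * sizeof *reps);      /* n replicates        */
    if (loo == NULL || reps == NULL) {
        free(loo);
        free(reps);
        return -1;
    }

    for (size_t i = 0; i < n; i++) {
        size_t k = 0;
        for (size_t j = 0; j < n; j++)
            if (j != i)
                loo[k++] = x[j];
        reps[i] = theta(loo, n - 1);
    }

    double mean_rep = sample_mean(reps, n);
    double ss = 0.0;
    for (size_t i = 0; i < n; i++)
        ss += (reps[i] - mean_rep) * (reps[i] - mean_rep);

    out->estimate = theta(x, n);
    out->bias = (double)(n - 1) * (mean_rep - out->estimate);
    out->variance = (double)(n - 1) / (double)n * ss;

    free(loo);
    free(reps);
    return 0;
}
```

Calling `jackknife(x, n, sample_mean, &r)` fills `r` with the estimate, the jackknife bias, and the jackknife variance. Any other statistic with the same signature can be passed in place of `sample_mean`.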
33 An example for jackknife
37 PART 2: Bootstrap. Naming; Illustration; Math Expression; Examples; R code, three approaches (package “bootstrap”, package “boot”, write your own R code); C code
38 The Bootstrap. The bootstrap technique was proposed by Bradley Efron (1979, 1981, 1982). Bootstrapping is an application of intensive computing to traditional inferential methods.
39 Why the funny name of bootstrap? In the book ‘Singular Travels, Campaigns and Adventures of Baron Munchausen’ by R. E. Raspe (1786), the main character, finding himself in a deep hole, extracts himself using only the straps of his boots.
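40 Illustration of Bootstrap [Figure: a sample is drawn from the population (sampling) and resampled B times (resampling); statistics computed on the resamples are used for inference about the population.]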
41 Math Expression
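For reference, the standard nonparametric bootstrap estimates can be written as below. The notation is an assumption made here, not taken from the slides: $x = (x_1, \dots, x_n)$ is the sample, $s(\cdot)$ is the statistic, and $x^{*b}$ is the $b$-th resample of size $n$ drawn with replacement from $x$.

```latex
\begin{align*}
\hat\theta &= s(x), &
\hat\theta^{*}_{b} &= s\bigl(x^{*b}\bigr), \quad b = 1, \dots, B, &
\hat\theta^{*}_{(\cdot)} &= \frac{1}{B} \sum_{b=1}^{B} \hat\theta^{*}_{b}, \\[4pt]
\widehat{\mathrm{bias}}_{B} &= \hat\theta^{*}_{(\cdot)} - \hat\theta, &
\widehat{\mathrm{var}}_{B} &= \frac{1}{B-1} \sum_{b=1}^{B} \bigl(\hat\theta^{*}_{b} - \hat\theta^{*}_{(\cdot)}\bigr)^{2}, &
\widehat{\mathrm{se}}_{B} &= \sqrt{\widehat{\mathrm{var}}_{B}}.
\end{align*}
```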
42 Population
43 Step 1: sampling from the population
44 Step 2: resampling B times
45 Step 3: computing the statistics
47 Summary of the Bootstrap Method
48 Bootstrap by R. Approach 1: use package “bootstrap”. Approach 2: use package “boot”. Approach 3: write your own R code.
49 Approach 1
50 1. Install the add-on package
51 2. Select a mirror site, such as “Taiwan (Taipeh)”
52 3. Select the package “bootstrap”
54 4. Type: library(bootstrap)
55 If you want to see the manual, you can type “?bootstrap”.
56 bias
57 Use this package to do bootstrap
60 Approach 2
61 library(boot)
63 Arguments. sim: A character string indicating the type of simulation required. Possible values are "ordinary" (the default), "parametric", "balanced", "permutation", or "antithetic". Importance resampling is specified by including importance weights; the type of importance resampling must still be specified but may only be "ordinary" or "balanced" in this case.
64 R code Approach 3
65 An example
66 Run functions
67 Run main function
68 Bootstrap by C
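A minimal sketch of a nonparametric bootstrap in C. This is not the code shown in the talk; the names `statistic_fn`, `bootstrap_result`, `random_index`, and `bootstrap` are hypothetical. It uses the standard library's `rand()`, seeded by the caller, which is adequate for illustration but not a high-quality generator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interface: a statistic maps a sample of length n to a number. */
typedef double (*statistic_fn)(const double *x, size_t n);

/* Example statistic: the sample mean. */
double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Uniform index in [0, n); rejection sampling avoids the bias of rand() % n. */
static size_t random_index(size_t n)
{
    unsigned long limit = ((unsigned long)RAND_MAX + 1UL) / n * n;
    unsigned long r;
    do {
        r = (unsigned long)rand();
    } while (r >= limit);
    return (size_t)(r % n);
}

typedef struct {
    double estimate;  /* statistic on the original sample     */
    double bias;      /* mean of the replicates - estimate    */
    double variance;  /* sample variance of the B replicates  */
} bootstrap_result;

/* Draws B resamples with replacement. Returns 0 on success, -1 on failure. */
int bootstrap(const double *x, size_t n, size_t B, statistic_fn theta,
              unsigned seed, bootstrap_result *out)
{
    assert(n >= 1 && n <= (size_t)RAND_MAX && B >= 2);
    double *resample = malloc(n * sizeof *resample);
    double *reps = malloc(B * sizeof *reps);
    if (resample == NULL || reps == NULL) {
        free(resample);
        free(reps);
        return -1;
    }

    srand(seed);  /* a fixed seed makes the run reproducible */
    for (size_t b = 0; b < B; b++) {
        for (size_t i = 0; i < n; i++)
            resample[i] = x[random_index(n)];
        reps[b] = theta(resample, n);
    }

    double mean_rep = sample_mean(reps, B);
    double ss = 0.0;
    for (size_t b = 0; b < B; b++)
        ss += (reps[b] - mean_rep) * (reps[b] - mean_rep);

    out->estimate = theta(x, n);
    out->bias = mean_rep - out->estimate;
    out->variance = ss / (double)(B - 1);

    free(resample);
    free(reps);
    return 0;
}
```

Calling `bootstrap(x, n, 2000, sample_mean, 1u, &r)` fills `r` with the estimate, the bootstrap bias, and the bootstrap variance. The results depend on the seed and on the platform's `rand()`, so they vary slightly between systems.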
70 Hands-on practice: an example
74 Exercises. Write your own programs similar to the examples presented in this talk. Write programs for the examples mentioned on the reference web pages. Write programs for other examples that you know. Prove the theoretical statements in this talk.