Statistical Data Analysis 2011/2012 M. de Gunst Lecture 4.

Presentation transcript:

Statistical Data Analysis 2 Statistical Data Analysis: Introduction
Topics:
- Summarizing data
- Exploring distributions
- Bootstrap (continued)
- Robust methods
- Nonparametric tests
- Analysis of categorical data
- Multiple linear regression

Statistical Data Analysis 3 Today’s topics: Bootstrap (Chapter 4: 4.3, 4.4)
4. Bootstrap
4.1. Simulation (read yourself) (last week)
4.2. Bootstrap estimators for distribution (last week)
4.3. Bootstrap confidence intervals
4.4. Bootstrap tests

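Statistical Data Analysis 4 Bootstrap: recap (1)
Situation: x_1, ..., x_n are realizations of X_1, ..., X_n, independent, with unknown distribution P.
Bootstrap: used to estimate the distribution of an estimator or test statistic T_n.
Which steps?
Step 1. Estimate the unknown distribution of T_n by its distribution under an estimate of P. (First error)
Step 2. Estimate the result of Step 1 by simulation, i.e. by the empirical distribution of the bootstrap values T_n*. (Second error)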

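Statistical Data Analysis 5 Bootstrap: recap (2)
Step 1: Determine the theoretical bootstrap estimator. (First error)
i) Estimate P by either the empirical distribution or a parametric distribution with estimated parameter. The estimate of P is stochastic: an estimator.
ii) Estimate the distribution of T_n under P by its distribution under the estimate of P. This is stochastic: the bootstrap estimator.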

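Statistical Data Analysis 6 Bootstrap: recap (3)
Step 2: From estimator to estimate, with the observed data fixed. (Second error)
i) If the bootstrap estimate has an explicit expression, then done.
ii) If not, then estimate the estimate: use the bootstrap (sampling) scheme to estimate it by the empirical distribution of the bootstrap values T_n*, where each T_n* is computed from a sample drawn from the estimate of P.
The empirical distribution of the T_n* is stochastic: an estimator. The empirical distribution of the simulated realizations of T_n* is the estimate.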

Statistical Data Analysis 7 Bootstrap: recap (4)
Obtain the empirical distribution of simulated realizations of T_n* with the bootstrap (sampling) scheme.
With the B bootstrap values, get an impression of (characteristics of) the unknown distribution of T_n:
- draw histogram
- compute sample variance
- compute sample sd
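
The sampling scheme itself is not preserved in this transcript. A minimal sketch of the empirical (nonparametric) version follows, written in Python with only the standard library rather than the R used in the course. The function name `bootstrap_values`, the made-up data, and the choice of the median as T_n are illustrative assumptions, not taken from the slides.

```python
import random
import statistics

def bootstrap_values(data, statistic, B=1000, seed=0):
    """Return B bootstrap values T*_1, ..., T*_B of `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    values = []
    for _ in range(B):
        # Draw X*_1, ..., X*_n with replacement from the data,
        # i.e. sample from the empirical distribution.
        resample = [rng.choice(data) for _ in range(n)]
        # T* is the same function of the resample as T_n is of the data.
        values.append(statistic(resample))
    return values

# Made-up sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
tstar = bootstrap_values(data, statistics.median, B=1000)

boot_var = statistics.variance(tstar)  # sample variance of the bootstrap values
boot_sd = statistics.stdev(tstar)      # sample sd of the bootstrap values
print(len(tstar), boot_var, boot_sd)
```

A histogram of `tstar` would give the visual impression mentioned on the slide; the sample variance and sd are the two numerical summaries.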

Statistical Data Analysis 8 Bootstrap confidence intervals (1)
T_n: estimator of unknown parameter θ
Seen: accuracy of estimator T_n: variance of the estimator's distribution
Now: accuracy of estimator T_n: confidence interval
A (1 - 2α) x 100% confidence interval for θ is an interval around T_n that contains the 'true' θ with probability ≥ 1 - 2α.
If the interval is [T_n - b_1, T_n + b_2], how to determine b_1 and b_2? (blackboard)

Statistical Data Analysis 9 Bootstrap confidence intervals (2)
A (1 - 2α) x 100% confidence interval for θ is an interval around T_n that contains the 'true' θ with probability ≥ 1 - 2α.
If the interval is [T_n - b_1, T_n + b_2], then b_1 and b_2 are determined by
[T_n - b_1, T_n + b_2] = [T_n - G_n^{-1}(1 - α), T_n - G_n^{-1}(α)],
with G_n the distribution of T_n - θ.
So b_1 and -b_2 are quantiles of the unknown distribution G_n.
How to estimate the quantiles b_1 and -b_2?
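
The blackboard step is not in the transcript. One way to see why b_1 and -b_2 are quantiles is sketched below. The symbol G_n for the distribution function of T_n - θ is an assumed name, and continuity of G_n is assumed to keep the argument short.

```latex
\begin{align*}
P\bigl(T_n - b_1 \le \theta \le T_n + b_2\bigr)
  &= P\bigl(-b_2 \le T_n - \theta \le b_1\bigr) \\
  &= G_n(b_1) - G_n(-b_2) \qquad \text{(assuming $G_n$ is continuous).}
\end{align*}
Choosing $G_n(b_1) = 1 - \alpha$ and $G_n(-b_2) = \alpha$ makes this probability
equal to $1 - 2\alpha$, so
\[
  b_1 = G_n^{-1}(1 - \alpha), \qquad -b_2 = G_n^{-1}(\alpha),
\]
\[
  [\,T_n - b_1,\; T_n + b_2\,]
  = \bigl[\,T_n - G_n^{-1}(1 - \alpha),\; T_n - G_n^{-1}(\alpha)\,\bigr].
\]
```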

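Statistical Data Analysis 10 Bootstrap confidence intervals (3)
Interval is [T_n - b_1, T_n + b_2] = [T_n - G_n^{-1}(1 - α), T_n - G_n^{-1}(α)], with G_n the distribution of T_n - θ.
How to estimate the quantiles b_1 and -b_2 of the unknown distribution of T_n - θ?
Estimate G_n with an estimate Ĝ_n: use bootstrap.
Gives estimate of confidence interval:
[T_n - Ĝ_n^{-1}(1 - α), T_n - Ĝ_n^{-1}(α)]   (4.1)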

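Statistical Data Analysis 11 Bootstrap confidence intervals (4)
Estimate of confidence interval:
[T_n - Ĝ_n^{-1}(1 - α), T_n - Ĝ_n^{-1}(α)]   (4.1)
with Ĝ_n an estimate of G_n, the distribution of T_n - θ.
In practice, determine in steps:
1. Estimate the unknown distribution of T_n - θ with Ĝ_n: use bootstrap. Same as before? No: for T_n - θ we need the bootstrap values Z_n* = T_n* - T_n.
2. Estimate the quantiles by the empirical quantiles of the bootstrap values Z_n*.
3. Bootstrap confidence interval:
[T_n - Z*_(1-α), T_n - Z*_(α)]   (4.2)
where Z*_(p) denotes the empirical p-quantile of the bootstrap values Z_n*.
(You have to know this formula!!)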

Statistical Data Analysis 12 Bootstrap confidence intervals (5)
We just discussed:
Estimate of confidence interval: (4.1)
Corresponding bootstrap confidence interval: (4.2)
This is the original bootstrap confidence interval, also called the reflection method. We will use this one!!
Other method: percentile method
Estimate of confidence interval: bounds are the α- and (1 - α)-quantiles of the estimated distribution of T_n.
Corresponding bootstrap confidence interval: bounds are the empirical α- and (1 - α)-quantiles of the bootstrap values T_n*.
Only suitable if the distribution of T_n - θ is symmetric around 0.
(Asymptotically the two methods give the same result.)
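
To make the difference between the two methods concrete, here is a hedged Python sketch (standard library only, not the course's R). The names `empirical_quantile` and `bootstrap_intervals` and the data are illustrative assumptions. The quantile rule used, the ceil(pB)-th order statistic, is one simple convention and need not match the default of R's `quantile`.

```python
import math
import random
import statistics

def empirical_quantile(values, p):
    """Simple empirical p-quantile: the ceil(p*B)-th order statistic."""
    ordered = sorted(values)
    k = min(len(ordered), max(1, math.ceil(p * len(ordered))))
    return ordered[k - 1]

def bootstrap_intervals(data, statistic, alpha=0.05, B=2000, seed=1):
    """Return T_n and the (1 - 2*alpha) reflection and percentile intervals."""
    rng = random.Random(seed)
    n = len(data)
    t_n = statistic(data)
    # Bootstrap values T*_n from resamples drawn with replacement.
    tstar = [statistic([rng.choice(data) for _ in range(n)]) for _ in range(B)]
    # Bootstrap values Z*_n = T*_n - T_n, mimicking T_n - theta.
    zstar = [t - t_n for t in tstar]
    z_lo = empirical_quantile(zstar, alpha)
    z_hi = empirical_quantile(zstar, 1 - alpha)
    reflection = (t_n - z_hi, t_n - z_lo)   # quantiles subtracted, order swapped
    percentile = (t_n + z_lo, t_n + z_hi)   # same as quantiles of T*_n itself
    return t_n, reflection, percentile

# Made-up sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
t_n, reflection, percentile = bootstrap_intervals(data, statistics.mean)
print(t_n, reflection, percentile)
```

The two intervals have the same width and are mirror images of each other around T_n, which is why they coincide when the bootstrap distribution of Z_n* is symmetric around 0.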

Statistical Data Analysis 13 Bootstrap confidence intervals (5)
How to obtain the (sample) α-quantile of the bootstrap values?
R: if zstar contains the bootstrap values
> quantile(zstar, alpha)
Note: T_n* is always the same function of X_1*, ..., X_n* as T_n is of X_1, ..., X_n.
For two samples X_1, ..., X_n and Y_1, ..., Y_m the method is the same.
Example: if T_{n,m} = X̄_n - Ȳ_m, then T_{n,m}* = X̄_n* - Ȳ_m* and Z_n* = X̄_n* - Ȳ_m* - (X̄_n - Ȳ_m) (cf. Example 4.4 in Reader)
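
The two-sample example can be sketched as follows, again in standard-library Python rather than R. This assumes the statistic is the difference of the two sample means and that each sample is resampled separately from its own empirical distribution. The function name, the data, and the order-statistic quantile rule are illustrative assumptions and are not taken from Example 4.4 of the Reader.

```python
import math
import random
import statistics

def two_sample_bootstrap_ci(x, y, alpha=0.05, B=2000, seed=2):
    """Reflection-method interval for the difference of two means."""
    rng = random.Random(seed)
    t_nm = statistics.mean(x) - statistics.mean(y)
    zstar = []
    for _ in range(B):
        # Resample each sample separately, keeping sizes n and m.
        xs = [rng.choice(x) for _ in x]
        ys = [rng.choice(y) for _ in y]
        # Z* = (mean of X*) - (mean of Y*) - (mean of X - mean of Y)
        zstar.append(statistics.mean(xs) - statistics.mean(ys) - t_nm)
    zstar.sort()
    # Empirical quantiles taken as order statistics (one simple convention).
    k_lo = min(B, max(1, math.ceil(alpha * B)))
    k_hi = min(B, max(1, math.ceil((1 - alpha) * B)))
    z_lo, z_hi = zstar[k_lo - 1], zstar[k_hi - 1]
    return t_nm, (t_nm - z_hi, t_nm - z_lo)

# Made-up samples, for illustration only.
x = [5.1, 4.8, 6.2, 5.5, 4.9, 5.8, 6.0, 5.2]
y = [3.2, 2.9, 3.8, 3.5, 3.1, 2.7, 3.6, 3.3, 3.0, 3.4]
t_nm, (lower, upper) = two_sample_bootstrap_ci(x, y)
print(t_nm, lower, upper)
```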

Statistical Data Analysis 14 Bootstrap Tests (1)
Remember last week’s slide:

Statistical Data Analysis 15 From lecture 3: Kolmogorov-Smirnov test (5)
Example
Data: y
H0: F is normal ← composite null hypothesis
H1: F is not normal
Test statistic: D_adj
R:
> ks.test(y,pnorm)
D = , p-value = 6.661e-16
Incorrect: this is a test for H0: F = N(0,1) against H1: F ≠ N(0,1)
> ks.test(y,pnorm,mean=mean(y),sd=sd(y))
D = , p-value =
> mean(y)
[1]
> sd(y)
[1]
Correct? Incorrect: this is a test for H0: F = N(mean(y), sd(y)^2) against H1: F ≠ N(mean(y), sd(y)^2), with the mean and sd of y plugged in.
We have not used D_adj!!
p-value should be (next week)

Statistical Data Analysis 16 Bootstrap Tests (2) Solve this with bootstrap test! General idea on blackboard
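
The blackboard explanation is not in the transcript. The sketch below shows one common way to set up such a test and should be read as an assumption about the general idea, not as the lecture's own procedure. It computes the Kolmogorov-Smirnov distance with mean and sd estimated from the sample, simulates B samples under the fitted normal, recomputes the statistic on each (re-estimating the parameters every time), and takes the p-value as the fraction of simulated statistics at least as large as the observed one. It is written in standard-library Python rather than R, and the names `ks_adj` and `bootstrap_normality_test` and the generated data are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

def ks_adj(sample):
    """KS distance between the sample and N(mean, sd^2), with both
    parameters estimated from that same sample."""
    fitted = NormalDist(statistics.mean(sample), statistics.stdev(sample))
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, value in enumerate(ordered):
        cdf = fitted.cdf(value)
        # Compare the fitted cdf with the empirical cdf just before
        # and just after its jump at this observation.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    return d

def bootstrap_normality_test(sample, B=1000, seed=3):
    """Parametric bootstrap p-value for H0: F is normal."""
    rng = random.Random(seed)
    n = len(sample)
    d_obs = ks_adj(sample)
    mu, sigma = statistics.mean(sample), statistics.stdev(sample)
    # Simulate under H0 from the fitted normal and recompute the statistic,
    # re-estimating mean and sd in every bootstrap sample.
    dstar = [ks_adj([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(B)]
    p_value = sum(d >= d_obs for d in dstar) / B
    return d_obs, p_value

# Made-up, clearly non-normal data: 100 draws from an exponential distribution.
gen = random.Random(42)
y = [gen.expovariate(1.0) for _ in range(100)]
d_obs, p_value = bootstrap_normality_test(y)
print(d_obs, p_value)
```

Re-estimating the parameters inside every bootstrap sample is the point of the exercise: it is what makes the simulated statistics comparable to the observed one under the composite null hypothesis.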

Statistical Data Analysis 17 Bootstrap Tests (3) Example

Statistical Data Analysis 18 Bootstrap Tests (4)
Example: dprec
> hist(dprec, prob=TRUE)
> qqnorm(dprec)

Statistical Data Analysis 19 Bootstrap Tests (5) Example

Statistical Data Analysis 20 Bootstrap Tests (6) Example

Statistical Data Analysis 21 Bootstrap Tests (7) Example

Statistical Data Analysis 22 Bootstrap Tests (8) Example

Statistical Data Analysis 23 Bootstrap Tests (9) Example

Statistical Data Analysis 24 Recap
Bootstrap
4.3. Bootstrap confidence intervals
4.4. Bootstrap tests

Statistical Data Analysis 25 Bootstrap The end