DATA ANALYSIS AND MODEL BUILDING LECTURE 5 Prof. Roland Craigwell Department of Economics University of the West Indies Cave Hill Campus and Rebecca Gookool.

Slides:



Advertisements
Similar presentations
Slides prepared by Elizabeth Newton (MIT)
Advertisements

Hypothesis testing Another judgment method of sampling data.
Chapter 12 Tests of Hypotheses Means 12.1 Tests of Hypotheses 12.2 Significance of Tests 12.3 Tests concerning Means 12.4 Tests concerning Means(unknown.
Our goal is to assess the evidence provided by the data in favor of some claim about the population. Section 6.2Tests of Significance.
1 1 Slide © 2009 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
1 1 Slide STATISTICS FOR BUSINESS AND ECONOMICS Seventh Edition AndersonSweeneyWilliams Slides Prepared by John Loucks © 1999 ITP/South-Western College.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 9 Hypothesis Testing Developing Null and Alternative Hypotheses Developing Null and.
Probability & Statistical Inference Lecture 6
Hypothesis Testing: One Sample Mean or Proportion
9-1 Hypothesis Testing Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental.
Pengujian Hipotesis Nilai Tengah Pertemuan 19 Matakuliah: I0134/Metode Statistika Tahun: 2007.
Hypothesis : Statement about a parameter Hypothesis testing : decision making procedure about the hypothesis Null hypothesis : the main hypothesis H 0.
8 - 10: Intro to Statistical Inference
4-1 Statistical Inference The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population.
BCOR 1020 Business Statistics Lecture 21 – April 8, 2008.
Inference about a Mean Part II
Inferences About Process Quality
Chapter 9 Hypothesis Testing.
BCOR 1020 Business Statistics
Tests of significance: The basics BPS chapter 15 © 2006 W.H. Freeman and Company.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 9 Hypothesis Testing.
1 Dr. Jerrell T. Stracener EMIS 7370 STAT 5340 Probability and Statistics for Scientists and Engineers Department of Engineering Management, Information.
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Chapter 9 Introduction to Hypothesis Testing.
Introduction to Biostatistics and Bioinformatics
1/2555 สมศักดิ์ ศิวดำรงพงศ์
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University ECON 4550 Econometrics Memorial University of Newfoundland.
4-1 Statistical Inference The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population.
14. Introduction to inference
CHAPTER 16: Inference in Practice. Chapter 16 Concepts 2  Conditions for Inference in Practice  Cautions About Confidence Intervals  Cautions About.
1 Power and Sample Size in Testing One Mean. 2 Type I & Type II Error Type I Error: reject the null hypothesis when it is true. The probability of a Type.
Chapter 4 Introduction to Hypothesis Testing Introduction to Hypothesis Testing.
Chapter 8 Introduction to Hypothesis Testing
STA Statistical Inference
LECTURE 19 THURSDAY, 14 April STA 291 Spring
9-1 Hypothesis Testing Statistical Hypotheses Definition Statistical hypothesis testing and confidence interval estimation of parameters are.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
1 1 Slide © 2007 Thomson South-Western. All Rights Reserved Chapter 9 Hypothesis Testing Developing Null and Alternative Hypotheses Developing Null and.
1 1 Slide IS 310 – Business Statistics IS 310 Business Statistics CSU Long Beach.
1 Lecture 19: Hypothesis Tests Devore, Ch Topics I.Statistical Hypotheses (pl!) –Null and Alternative Hypotheses –Testing statistics and rejection.
Biostatistics Class 6 Hypothesis Testing: One-Sample Inference 2/29/2000.
Chapter 9 Tests of Hypothesis Single Sample Tests The Beginnings – concepts and techniques Chapter 9A.
4 Hypothesis & Testing. CHAPTER OUTLINE 4-1 STATISTICAL INFERENCE 4-2 POINT ESTIMATION 4-3 HYPOTHESIS TESTING Statistical Hypotheses Testing.
Lecture 16 Section 8.1 Objectives: Testing Statistical Hypotheses − Stating hypotheses statements − Type I and II errors − Conducting a hypothesis test.
Introduction to Inferece BPS chapter 14 © 2010 W.H. Freeman and Company.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 8 Hypothesis Testing.
Economics 173 Business Statistics Lecture 4 Fall, 2001 Professor J. Petry
Chapter 20 Testing Hypothesis about proportions
1 9 Tests of Hypotheses for a Single Sample. © John Wiley & Sons, Inc. Applied Statistics and Probability for Engineers, by Montgomery and Runger. 9-1.
Fall 2002Biostat Statistical Inference - Confidence Intervals General (1 -  ) Confidence Intervals: a random interval that will include a fixed.
MeanVariance Sample Population Size n N IME 301. b = is a random value = is probability means For example: IME 301 Also: For example means Then from standard.
Introduction to inference Tests of significance IPS chapter 6.2 © 2006 W.H. Freeman and Company.
© Copyright McGraw-Hill 2004
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
One Sample Inf-1 In statistical testing, we use deductive reasoning to specify what should happen if the conjecture or null hypothesis is true. A study.
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University.
Chapter 12 Tests of Hypotheses Means 12.1 Tests of Hypotheses 12.2 Significance of Tests 12.3 Tests concerning Means 12.4 Tests concerning Means(unknown.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 7 Inferences Concerning Means.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Hypothesis Testing Chapter Hypothesis Testing  Developing Null and Alternative Hypotheses  Type I and Type II Errors  One-Tailed Tests About.
Hypothesis Testing Concepts of Hypothesis Testing Statistical hypotheses – statements about population parameters Examples mean weight of adult.
Chapter 9 -Hypothesis Testing
Slides by JOHN LOUCKS St. Edward’s University.
Chapter 5 STATISTICAL INFERENCE: ESTIMATION AND HYPOTHESES TESTING
FINAL EXAMINATION STUDY MATERIAL III
Statistics for Business and Economics (13e)
CONCEPTS OF HYPOTHESIS TESTING
Introduction to Inference
Introduction to Inference
Confidence Intervals.
Presentation transcript:

DATA ANALYSIS AND MODEL BUILDING LECTURE 5 Prof. Roland Craigwell Department of Economics University of the West Indies Cave Hill Campus and Rebecca Gookool Department of Economics University of the West Indies St. Augustine Campus

Basic Concepts of Inference Basic Concepts of Inference

Statistical Inference “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.”H. G. Wells Statistical Inference: Deals with methods for making statements about a population based on a sample drawn from the population. Point Estimation: Estimate an unknown population parameter. Confidence Interval Estimation: Find an interval that contains the parameter with pre-assigned probability. Hypothesis testing: Testing hypothesis about an unknown population parameter.

Examples Point Estimation: estimate the mean package weight of a cereal box filled during a production shift. Confidence Interval Estimation: Find an interval [L,U] based on the data that includes the mean weight of the cereal box with a specified probability. Hypothesis testing: Do the cereal boxes meet the minimum mean weight specification of 16 oz?

Two Levels of Statistical Inference Informal, using summary statistics (may only be descriptive statistics) Formal, which uses methods of probability and sampling distributions to develop measures of statistical accuracy

Estimation Problems Point estimation: estimation of an unknown population parameter by a single statistic calculated from the sample data. Confidence interval estimation: calculation of an interval from sample data that includes the unknown population parameter with a pre- assigned probability.

Point Estimation Terminology

Methods of Evaluating Estimators Bias and Variance

Accuracy and Precision

Mean Squared Error

Example: estimators of variance

Standard Error (SE)

Precision and Standard Error A precise estimate has a small standard error, but exactly how are the precision and standard error related? If the sampling distribution of an estimator is normal with mean equal to the true parameter value (i.e., unbiased). Then we know that about 95% of the time the estimator will be within two SE’s from the true parameter value.

Methods of Point Estimation Method of Moments Maximum Likelihood Estimation Least Squares

Method of Moments

Maximum Likelihood Parameter Estimation By far the most popular estimation method! (Casella& Berger). MLE is the parameter point for which observed data is most likely under the assumed probability model. Likelihood function: L(θ|x) = f(x| θ), where x is the vector of sample values, θ also a vector possibly. When we consider f(x| θ), we consider θ as fixed and x as the variable. When we consider L(θ|x), we are considering x to be the fixed observed sample point and θ to be varying over all possible parameter values.

MLE (continued)

Confidence Interval Estimation

95% Confidence Interval: Normal Variance known

Normal Distribution, 95% of area under curve is between and 1.96xdnorm

Frequentist Interpretation of CI’s In an infinitely long series of trials in which repeated samples of size n are drawn from the same population and 95% CI’s for μ are calculated using the same method, the proportion of intervals that actually include μ will be 95% (coverage probability). However, for any particular CI, it is not known whether or not the CI includes μ, but the probability that it includes μ is either 0 or 1, that is, either it does or it doesn’t. It is incorrect to say that the probability is 0.95 that the true μ is in a particular CI.

95% CI, 50 samples from unit normal distribution

Arbitrary Confidence Level for CI: variance known

One-sided Confidence Intervals

Hypothesis Testing The objective of hypothesis testing is to access the validity of a claim against a counterclaim using sample data. The claim to be “proved” is the alternative hypothesis(H 1 ). The competing claim is called the null hypothesis(H 0 ). One begins by assuming that H 0 is true. If the data fails to contradict H 0 beyond a reasonable doubt, then H 0 is not rejected. However, failing to reject H 0 does not mean that we accept it as true. It simply means that H 0 cannot be ruled out as a possible explanation for the observed data. A proof by insufficient data is not a proof at all.

Testing Hypotheses “The process by which we use data to answer questions about parameters is very similar to how juries evaluate evidence about a defendant.” –from Geoffrey Vining, Statistical Methods for Engineers, Duxbury, 1st edition, For more information, see that textbook.

Hypothesis Tests A hypothesis test is a data-based rule to decide between H 0 and H 1. A test statistic calculated from the data is used to make this decision. The values of the test statistics for which the test rejects H 0 comprise the rejection region of the test. The complement of the rejection region is called the acceptance region. The boundaries of the rejection region are defined by one or more critical constants (critical values).

Hypothesis Testing as a Two-Decision Problem

Probabilities of Type I and II Errors α= P{Type I error} = P{Reject H 0 when H 0 is true} = P{Reject H 0 | H 0 } also called α-risk or producer’s risk or false alarm rate β= P{Type II error} = P{Fail to reject H 0 when H 1 is true} = P{Fail to reject H 0 | H 1 } also called β-risk or consumer’s risk or prob. of not detecting π= 1 -β= P{Reject H 0 | H 1 } is prob. of detection or power of the test We would like to have low α and low β (or equivalently, high power). α and 1-βare directly related, can increase power by increasing α. These probabilities are calculated using the sampling distributions from either the null hypothesis (for α) or alternative hypothesis (for β).

Power Function and OC Curve The operating characteristic function of a test is the probability that the test fails to reject H 0 as a function of θ, where θ is the test parameter. OC(θ) = P{test fails to reject H0| θ} For θ values included in H 1 the OC function is the β–risk. The power function is: π(θ) = P{Test rejects H0 | θ} = 1 –OC(θ) Example: In SAT coaching, for the test that rejects the null hypothesis when mean change is 25 or greater, the power = 1- pnorm(25,mean=0:50,sd=40/sqrt(20))

Level of Significance The practice of test of hypothesis is to put an upper bound on the P(Type I error) and, subject to that constraint, find a test with the lowest possible P(Type II error). The upper bound on P(Type I error) is called the level of significance of the test and is denoted by α (usually some small number such as 0.01, 0.05, or 0.10). The test is required to satisfy: P{ Type I error } = P{ Test Rejects H0 | H0 } ≤α Note that α is now used to denote an upper bound on P(Type I error). Motivated by the fact that the Type I error is usually the more serious. A hypothesis test with a significance level α is called an a α-level test.

Choice of Significance Level What α level should one use? Recall that as P(Type I error) decreases P(Type II error) increases. A proper choice of α should take into account the relative costs of Type I and Type II errors. (These costs may be difficult to determine in practice, but must be considered!) Fisher said: α=0.05 Today α= 0.10, 0.05, 0.01 depending on how much proof against the null hypothesis we want to have before rejecting it. P-values have become popular with the advent of computer programs.

Observed Level of Significance or P-value

One-sided and Two-sided Tests

Relationship Between Confidence Intervals and Hypothesis Tests An α-level two-sided test rejects a hypothesis H0: μ= μ 0 if and only if the (1-α)100% confidence interval does not contain μ 0.

Use/Misuse of Hypothesis Tests in Practice Difficulties of Interpreting Tests on Non-random samples and observational data Statistical significance versus Practical significance –Statistical significance is a function of sample size Perils of searching for significance Ignoring lack of significance Confusing confidence (1 -α) with probability of detecting a difference (1 -β)

The end

Jerzy NeymanEgon Pearson ( ) ( ) Carried on a decades-long feud with Fisher over the foundations of statistics (hypothesis testing and confidence limits) -Fisher never recognized Type II error & developed fiducial limits