Statistics and Data Analysis

Presentation transcript:

Statistics and Data Analysis Professor William Greene Stern School of Business IOMS Department Department of Economics

Statistics and Data Analysis Part 14 – Statistical Tests: 2

Statistical Testing Applications Methodology Analyzing Means Analyzing Proportions

Classical Testing Methodology
1. Formulate the hypothesis.
2. Determine the appropriate test.
3. Decide upon the α level. (How confident do we want to be in the results?) The conventional choice is 0.05.
4. Formulate the decision rule (reject vs. not reject): define the rejection region.
5. Obtain the data.
6. Apply the test and make the decision.

Comparing Two Populations These are data on the number of calls cleared by the operators at two call centers on the same day. Call center 1 employs a different set of procedures for directing calls to operators than call center 2. Do the data suggest that the populations are different?
Call Center 1 (28 observations): 797 794 817 813 817 793 762 719 804 811 747 804 790 796 807 801 805 811 835 787 800 771 794 805 797 724 820 701
Call Center 2 (32 observations): 817 801 798 797 788 802 821 779 803 807 789 799 794 792 826 808 808 844 790 814 784 839 805 817 804 807 800 785 796 789 842 829

Application 1: Equal Means Application: Mean calls cleared at the two call centers are the same H0: μ1 = μ2 H1: μ1 ≠ μ2 Rejection region: Sample means from centers 1 and 2 are very different. Complication: What to use for the variance(s) for the difference?

Standard Approach H0: μ1 = μ2 H1: μ1 ≠ μ2 Equivalent: H0: μ1 – μ2 = 0 Test is based on the two sample means: Reject the null hypothesis if x̄1 – x̄2 is very different from zero (in either direction). Rejection region is large positive or negative values of x̄1 – x̄2.

Rejection Region for Two Means

Easiest Approach: Large Samples Assume relatively large samples, so we can use the central limit theorem. It won’t make much difference whether the variances are assumed (actually are) the same or not.

Variance Estimator
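A sketch of the large-sample estimator that fits the approach described above, where the variances are not assumed equal. The notation is assumed here: s_1^2 and s_2^2 are the two sample variances and N_1 and N_2 are the two sample sizes.

```latex
\widehat{\operatorname{Var}}\left[\bar{x}_1 - \bar{x}_2\right]
  = \frac{s_1^2}{N_1} + \frac{s_2^2}{N_2},
\qquad
z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}
```

Under H0: μ1 – μ2 = 0 and with relatively large samples, z is treated as approximately standard normal.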

Test of Means H0: μCall Center 1 – μCall Center 2 = 0 H1: μCall Center 1 – μCall Center 2 ≠ 0 Use α = 0.05 Rejection region (large-sample test): |z| > 1.96

Basic Comparisons
Descriptive Statistics: Center1, Center2
Variable   N    Mean     SE Mean   StDev    Min.     Med.     Max.
Center1    28   790.07   6.05      32.00    701.00   798.50   835.00
Center2    32   805.44   2.98      16.87    779.00   802.50   844.00
Means look different. Standard deviations (variances) look quite different.
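The comparison can be sketched from the summary statistics alone. This is a minimal illustration, assuming the large-sample normal approximation and unequal variances. The function name `two_sample_z` and its parameters are hypothetical, not part of the slides.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = hypothesized_diff.

    Hypothetical helper: uses s1^2/n1 + s2^2/n2 as the variance of the
    difference in means (variances not assumed equal).
    """
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2 - hypothesized_diff) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Summary statistics from the descriptive table for the two call centers.
z, p_value = two_sample_z(790.07, 32.00, 28, 805.44, 16.87, 32)

alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
reject = abs(z) > critical

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these numbers the statistic falls in the rejection region, so the hypothesis of equal means is rejected at the 5% level. A t-based procedure with estimated degrees of freedom would use a slightly larger critical value.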

Test for the Difference Note the "minus 0" in the test statistic: 0 is the hypothesized value of the difference. It could have been some other value. For example, suppose we were investigating a claim that a test prep course would raise scores by 50 points; the hypothesized difference would then be 50. Stat → Basic Statistics → 2-sample t (do not check the equal variances box). This can also be done by providing just the sample sizes, means and standard deviations.

Application: Paired Samples Example: Do-overs on SAT tests Hypothesis: Scores on the second test are no better than scores on the first. (Hmmm… one sided test…) Hypothesis: Scores on the second test are the same as on the first. Rejection region: Mean of a sample of second scores is very different from the mean of a sample of first scores. Subsidiary question: Is the observed difference (to the extent there is one) explained by the test prep courses? How would we test this? Interesting question: Suppose the samples were not paired – just two samples.

Paired Samples No new theory is needed. Compute differences for each observation. Treat the differences as a single sample from a population with a hypothesized mean of zero.
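The paired procedure can be sketched as follows. The scores below are made up purely for illustration (they are not from the slides), and the critical value is the standard two-sided 5% t value for 7 degrees of freedom, hard-coded because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

# Hypothetical first and second test scores for the same 8 students.
first = [520, 610, 480, 700, 560, 640, 590, 530]
second = [540, 600, 510, 720, 590, 650, 600, 560]

# Step 1: compute the difference for each pair.
diffs = [s - f for f, s in zip(first, second)]

# Step 2: treat the differences as one sample with hypothesized mean zero.
n = len(diffs)
d_bar = mean(diffs)
se = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 divisor
t_stat = (d_bar - 0) / se

# Two-sided 5% critical value of the t distribution with n - 1 = 7
# degrees of freedom (tabulated value, assumed here).
T_CRITICAL_7DF = 2.365
reject = abs(t_stat) > T_CRITICAL_7DF

print(f"mean difference = {d_bar:.2f}, t = {t_stat:.3f}, reject H0: {reject}")
```

With a large number of pairs, the normal critical value 1.96 could be used in place of the t value.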

Testing Application 2: Proportion Investigate: Proportion = a value Quality control: The rate of defectives produced by a machine has changed. H0: θ = θ0 (θ0 = the value we thought it was) H1: θ ≠ θ0 Rejection region: A sample of rates produces a proportion that is far from θ0

Procedure for Testing a Proportion Use the central limit theorem: The sample proportion, p, is a sample mean. Treat this as normally distributed. The sample variance is p(1-p). The estimator of the variance of the mean is p(1-p)/N.

Testing a Proportion H0: θ = θ0 H1: θ ≠ θ0 As usual, set α = .05 Treat this as a test of a mean. Rejection region = sample proportions that are far from θ0. Note, assuming θ = θ0 implies we are assuming that the variance is θ0(1 – θ0)
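Written out, the statistic implied by the points above would look like the following. This is a sketch under the normal approximation, with p the sample proportion and N the sample size.

```latex
z = \frac{p - \theta_0}{\sqrt{\dfrac{\theta_0 (1 - \theta_0)}{N}}},
\qquad
\text{reject } H_0 \text{ if } |z| > 1.96 \quad (\alpha = 0.05)
```

The denominator uses θ0 rather than p because the variance is computed under the assumption that the null hypothesis is true.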

Default Rate Investigation: Of the 13,444 card applications, 10,499 were accepted. The default rate for those 10,499 was 996/10,499 = 0.09487. I am fairly sure that this number is higher than was really appropriate for cardholders at this time. I think the right number is closer to 6%. Do the data support my hypothesis?

Testing the Default Rate Sample data: p = 0.09487 Hypothesis: θ0 = 0.06 As usual, use α = 5%.
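A minimal sketch of this test, assuming the normal approximation and the variance implied by the hypothesized rate. The variable names are illustrative only.

```python
import math
from statistics import NormalDist

# Counts from the default rate investigation.
defaults = 996
accepted = 10499
theta0 = 0.06  # hypothesized default rate

p = defaults / accepted  # sample proportion, about 0.09487

# Standard error under H0: the variance of the mean is theta0 * (1 - theta0) / N.
se0 = math.sqrt(theta0 * (1 - theta0) / accepted)
z = (p - theta0) / se0

alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
reject = abs(z) > critical

print(f"p = {p:.5f}, z = {z:.2f}, reject H0: {reject}")
```

The statistic is far beyond the critical value, so under these assumptions the data do not support a default rate of 6%.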

Application 3: Comparing Proportions Investigate: Owners and Renters have the same credit card acceptance rate H0: θRENTERS = θOWNERS H1: θRENTERS ≠ θOWNERS Rejection region: Acceptance rates for sample of the two types of applicants are very different.

Comparing Proportions Note, here we are not assuming a specific θO or θR so we use the sample variance.

The Evidence = Homeowners

Analysis of Acceptance Rates

Followup Analysis of Default
Rows: OWNRENT. Columns: default (0 = no default, 1 = default). Each cell shows the count, then the percent of the total.
OWNRENT       0        1      All
0          4854      615     5469
          46.23     5.86    52.09
1          4649      381     5030
          44.28     3.63    47.91
All        9503      996    10499
          90.51     9.49   100.00
Are the default rates the same for owners and renters? The data for the 10,499 applicants who were accepted are in the table above. Test the hypothesis that the two default rates are the same.
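One way to approach this exercise is sketched below. It assumes the normal approximation and the unpooled sample variances used for comparing proportions earlier in the slides. The groups are labeled only by their OWNRENT code, since the table itself does not say which code denotes owners. A pooled-variance version of the test is a common alternative and is not shown here.

```python
import math
from statistics import NormalDist

# Counts from the cross-tabulation: (defaults, accepted applicants).
defaults_0, n_0 = 615, 5469  # OWNRENT = 0
defaults_1, n_1 = 381, 5030  # OWNRENT = 1

p_0 = defaults_0 / n_0
p_1 = defaults_1 / n_1

# Variance of the difference, using each group's sample variance p(1 - p)/N.
se = math.sqrt(p_0 * (1 - p_0) / n_0 + p_1 * (1 - p_1) / n_1)
z = (p_0 - p_1 - 0) / se

alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
reject = abs(z) > critical

print(f"p_0 = {p_0:.4f}, p_1 = {p_1:.4f}, z = {z:.2f}, reject H0: {reject}")
```

Under these assumptions the difference between the two default rates is well inside the rejection region at the 5% level.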