Variance and Hypothesis Tests


Variance and Hypothesis Tests Econ. 201 – Econ. Data Analysis

Difference in Means - Design: Try to isolate the experimental and control groups from each other while the experiment is conducted. This amounts to ensuring either that both groups are random samples drawn from the “universe” of such populations, or that each group has equivalent values of any variable that might, in addition to the experiment, alter its outcome.

Example: “Men have higher cholesterol than women.” We want to isolate gender as a cause of variation in cholesterol levels, yet we know that many other things affect cholesterol; two of them are diet and genetics. We could isolate gender in two ways. 1: Draw a random sample from the population and ask, is cm > cw? 2: Draw samples of men and women from populations with similar diets and genetics (say, 20-year-old college students) and pose the same question as in #1.

Example, contd.: The data in the example take the latter approach. Then pose the question formally as: H0: cm = cw (the null hypothesis, that which you hope to disprove). H1: cm ≠ cw (the alternative hypothesis).

An alternative statement of the hypothesis: Define the difference in cholesterol between the two groups and ask whether it is zero or not. Using the mean levels of the data collected on each group, form the difference d = cw - cm. H0: d = 0 (the null hypothesis). H1: d ≠ 0 (the alternative hypothesis).

Construction of mean and variance: Look at the raw data on p. 56. Construct the following from this and the sheet “Some Basic Statistical Formulae”: the mean value of cholesterol for each group, and the variance and standard deviation of each group.
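The formula sheet itself is not reproduced in this transcript. As a reminder, the standard sample formulas are sketched below; the n - 1 denominator in the variance is an assumption about which convention the sheet uses.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2, \qquad
s = \sqrt{s^2}
```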

Standard Error of the Mean: Another measure of a distribution is the standard error of the mean. The formula is on the sheet: look at it. It is an estimate of the variability (dispersion) of the sample mean, treating that mean as if it were itself constructed from repeated samples of size n from a population. It is a measure of our uncertainty about the true (population) mean, given that we are “estimating” it.

The Central Limit Theorem: If the underlying experimental design that generated the data is a random one, then the means of various such experiments will be drawn from an approximately normal distribution whose mean is estimated by (∑x)/n and whose standard deviation is estimated by s/√n. The area under the standard normal curve (p. 57) then gives the probability attached to various ranges around the mean. A general rule of thumb says that we have a 95(.4)% confidence level that the true (population) mean lies within +/- 2(s/√n) of the sample mean.
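The rule of thumb can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 30, the 2000 repetitions and the seed are all assumptions chosen for the demonstration, not values from the course data.

```python
import random
import statistics

random.seed(201)  # fixed seed so the sketch is reproducible

POP_MEAN = 1.0  # mean of an exponential(1) population (illustrative choice)
POP_SD = 1.0    # standard deviation of the same population
N = 30          # observations per experiment
REPS = 2000     # number of repeated experiments

# Each experiment draws N observations and records their mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(REPS)
]

center = statistics.fmean(sample_means)   # should be close to POP_MEAN
spread = statistics.stdev(sample_means)   # should be close to POP_SD / sqrt(N)
theory_se = POP_SD / N ** 0.5

# Share of experiments whose mean lands within 2 standard errors of the true mean.
within_2se = sum(abs(m - POP_MEAN) <= 2 * theory_se for m in sample_means) / REPS

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f} (theory: {theory_se:.3f})")
print(f"share within 2 SE:    {within_2se:.3f}")
```

Even though the simulated population is skewed rather than normal, the share of sample means falling within two standard errors should come out near 95%.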

Intuition: Calculate the interval around each group’s mean using the standard errors of the means (see p. 57). The further apart the means are, and the smaller the dispersions around these means (the standard errors), the more likely we are to conclude that the mean levels of the two groups are different.

Alternative formulation, d: Look at the formation and resulting distribution of d on p. 57. d = 173.5 - 163.3 = 10.2. Now form the standard error of this mean difference. Its variance is defined as the sum of the squared standard errors of the individual means, and the standard error of the difference is the square root of that sum (see p. 57 for formulas) = 6.02.
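In symbols, and assuming the two samples are independent, the standard error of the difference can be written as follows. The subscripts m and w (men and women) are notation introduced here for illustration; the individual standard errors are on p. 57 and are not reproduced in this transcript.

```latex
SE_d = \sqrt{SE_m^2 + SE_w^2}
     = \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}}
```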

Formation of 95% confidence interval around mean d: d +/- 2(standard error of the difference), here 10.2 +/- 2(6.02). We can be 95% confident that the true mean d lies in the range from -1.84 to 22.24. This interval includes zero, so we cannot be 95% sure that d is not 0: at the 95% confidence level, given the data, we fail to reject the null hypothesis, H0, in favor of the alternative, H1.
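The interval arithmetic can be sketched in a few lines of Python. The values 10.2 and 6.02 come from the slides; the function name `confidence_interval` and the default multiplier of 2 (the rule-of-thumb value rather than an exact critical value) are choices made for this illustration.

```python
def confidence_interval(estimate, std_error, multiplier=2.0):
    """Return (low, high) for estimate +/- multiplier * std_error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


d = 10.2     # difference in mean cholesterol, from the slides
se_d = 6.02  # standard error of that difference, from the slides

low, high = confidence_interval(d, se_d)
contains_zero = low <= 0.0 <= high

print(f"approx. 95% interval: {low:.2f} to {high:.2f}")
print(f"interval contains zero: {contains_zero}")
```

Because the interval contains zero, the data do not rule out d = 0 at this confidence level.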

Cholesterol Example, contd.: Look at the raw data by frequency (p. 57). Understand that the two equivalent ways of framing the hypothesis concern either: 1. the degree of overlap between the confidence interval we construct around the mean of the men’s cholesterol observations and the one we construct around the mean of the women’s cholesterol observations, or 2. whether the confidence interval we construct around the mean of d contains 0.

Construction of the true confidence level: We know we can meet the requirement at d +/- 1(stnd. error of the mean difference). This would give us about a 68% level of confidence, because about 68% of the area under the standard normal curve lies in this range. Here that is 10.2 +/- 6.02: from 4.18 to 16.22, which excludes zero. But we cannot meet the criterion for a 95% level of confidence, so the true level lies somewhere between 68% and 95%. So there is weak support for the contention that cholesterol varies by gender.
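One way to pin down where the level falls is to ask how much area under the standard normal curve lies within d/SE standard errors of the mean. The sketch below assumes the normal approximation is adequate; the helper name `two_sided_confidence` is hypothetical.

```python
import math


def two_sided_confidence(z):
    """Area under the standard normal curve between -z and +z."""
    return math.erf(z / math.sqrt(2.0))


d, se_d = 10.2, 6.02   # values from the slides
z_observed = d / se_d  # how many standard errors d lies from zero

level_1se = two_sided_confidence(1.0)              # the "+/- 1 SE" level
level_2se = two_sided_confidence(2.0)              # the "+/- 2 SE" level
level_observed = two_sided_confidence(z_observed)  # level at which zero is just excluded

print(f"+/- 1 SE: {level_1se:.3f}")
print(f"+/- 2 SE: {level_2se:.3f}")
print(f"observed: {level_observed:.3f}")
```

Under this approximation the level comes out at roughly 91%, between the one- and two-standard-error benchmarks.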

Or could consult a t-statistic: t = mean/(its standard error). “Critical values” of t depend on the size of the sample and give the significance level at which a particular sample mean can be taken to be different from zero. Here t = 10.2/6.02 = 1.69. For a sample of 30, a t-statistic of 1.69 is significant at approximately the 90% level.
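A minimal sketch of the comparison follows. The critical values are the usual two-sided table entries for 28 degrees of freedom; treating a sample of 30 as two groups with 30 - 2 = 28 degrees of freedom is an assumption, since the slides do not state the degrees of freedom.

```python
d, se_d = 10.2, 6.02  # values from the slides
t_stat = d / se_d

# Two-sided critical values from a standard t table.
# ASSUMPTION: 28 degrees of freedom (30 observations, two groups).
T_CRIT_90 = 1.701
T_CRIT_95 = 2.048

print(f"t = {t_stat:.2f}")
print(f"exceeds the 95% critical value: {t_stat > T_CRIT_95}")
print(f"distance from the 90% critical value: {abs(t_stat - T_CRIT_90):.3f}")
```

The statistic falls well short of the 95% critical value and sits almost exactly at the 90% one, which is why the slide describes it as significant at approximately the 90% level.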

Automate the calculation: Use Excel. It is convenient for “big” data sets with many observations. Use it to calculate: 1. avg. cholesterol, 2. differences from the avg., 3. differences squared, 4. the sum of the squared differences.

Excel Computations, contd.: Use a calculator and the formula sheet for the rest. Calculate the variance and the standard deviation of each of the two samples. Calculate the stnd. error of each mean. Then calculate the stnd. error of the difference in means.
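For readers who prefer a script to a spreadsheet, the same sequence of steps can be sketched in Python. The readings below are HYPOTHETICAL placeholders (the real data are on p. 56 and are not reproduced here), the function name `summarize` is made up for this sketch, and the n - 1 variance denominator is an assumption about the formula sheet.

```python
import math

# HYPOTHETICAL cholesterol readings, for illustration only.
men = [180.0, 165.0, 172.0, 190.0, 158.0]
women = [160.0, 171.0, 155.0, 168.0, 162.0]


def summarize(values):
    """Mirror the spreadsheet columns, then finish with the formula-sheet steps."""
    n = len(values)
    avg = sum(values) / n                  # 1. average
    diffs = [x - avg for x in values]      # 2. differences from the average
    squared = [dx * dx for dx in diffs]    # 3. differences squared
    sum_sq = sum(squared)                  # 4. squared differences summed
    variance = sum_sq / (n - 1)            # sample variance (n - 1 is an assumption)
    std_dev = math.sqrt(variance)          # standard deviation
    std_err = std_dev / math.sqrt(n)       # standard error of the mean
    return {"mean": avg, "variance": variance, "std_dev": std_dev, "std_err": std_err}


men_stats = summarize(men)
women_stats = summarize(women)

# Difference in means, following the slides' definition d = cw - cm.
d = women_stats["mean"] - men_stats["mean"]
# Standard error of the difference: square root of the summed squared standard errors.
se_d = math.sqrt(men_stats["std_err"] ** 2 + women_stats["std_err"] ** 2)

print(f"men:   {men_stats}")
print(f"women: {women_stats}")
print(f"d = {d:.2f}, standard error of d = {se_d:.2f}")
```

Steps 1 to 4 correspond to the columns built in Excel; the remaining lines are the calculator steps.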