Review: Estimating a Mean

Review: Estimating a Mean
Suppose we want to know the mean of a variable in a population, but we can only take the mean over a sample.
Law of Large Numbers: the sample mean is a consistent estimator for the population mean.
But the sample mean would be different each time we collected a sample; it varies according to some sampling distribution (which we don’t know).
To find out how good an estimate we have, we can compute the standard error as SE = s / √n, the sample standard deviation divided by the square root of the sample size.
Central Limit Theorem: if the sample is large (roughly 30+), the sampling distribution will be approximately normal, and we can compute probabilities of finding the true mean within ranges of our estimate.
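To make this concrete, here is a minimal Python sketch, using a made-up skewed population and an assumed sample size of 50, of the sample mean, the standard error formula above, and the Central Limit Theorem behavior of repeated sample means:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: a skewed (clearly non-normal) variable.
population = rng.exponential(scale=6.0, size=100_000)

n = 50
sample = rng.choice(population, size=n, replace=False)

sample_mean = sample.mean()
standard_error = sample.std(ddof=1) / np.sqrt(n)   # SE = s / sqrt(n)
print(f"sample mean = {sample_mean:.2f}, SE = {standard_error:.2f}")

# Central Limit Theorem: means of repeated samples of size n are roughly
# normal, even though the population itself is skewed.
means = rng.choice(population, size=(2000, n)).mean(axis=1)
print(f"spread of the sample means = {np.std(means):.2f} (close to the SE above)")
```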

Research Problems and Questions
Justification → Research Question → Testable Hypothesis 1, Testable Hypothesis 2.
This is ultimately why we care about all of this: we want to be able to formulate good research problems, turn them into questions, and then test specific predictions.

Three Criteria for a Research Problem
What are we going to learn as the result of the proposed project that we do not know now? (Will this tell us something new?)
Why is it worth knowing? (What is so important about the problem?)
How will we know that the conclusions are valid? (Only this one really deals with methods; the other two are achieved through argumentation.)
Might seem obvious to you, but it isn’t always clear what is “problematic” and why we should bother. Only the third criterion actually involves our methodology; the other two are achieved through our arguments.
Photo credit: http://www.qualtrics.com/blog/

Justifying Research Problems
Explain what is not known about the problem. Why does the problem matter?
Provide documentation that this is actually a problem: available statistics? Available literature that shows that this is a needed area of inquiry? Discrepancies in the literature?
What would convince *you* that something is actually problematic and worthy of further study?

What is not usually good enough for a full justification?
“No one has looked at it or studied it before.” (But institutional blind spots could hide important results.)
“The issue or problem is not discussed in the literature, media, etc.” e.g., research has failed to examine the effect of shirt color on food preference.

Research Problem
Some examples from my own work:
What would the internet look like if we could build it from scratch, knowing everything we know about networking today? (NSF 100x100 Clean-Slate Design Project)
How can the resistance to new technologies in the network layer be addressed? (My iSchool thesis)
How does the graph structure of supply chains influence market outcomes? (Ongoing work with Pedro Ferreira)

Defining Research Questions
Forming a research question: “an interrogative sentence or statement that asks: What relation exists between two or more concepts?”
A good problem can produce many research questions. Some may be oriented to quantitative or qualitative investigation; we are focusing on the former.
The research question(s) can be restated in one or more ways to produce testable hypotheses.

Testable Hypotheses
Once a research problem has been identified and one or more research questions emerge, specific hypotheses are used to relate specific variables.
Note the significant role of operationalization at this point!
Research problem → questions → hypotheses → variables.

Hypothesis
“Hypothesis statements contain two or more variables that are measurable or potentially measurable and that specify how the variables are related” (Kerlinger 1986).
A good research question will produce one or more testable hypotheses. Testable hypotheses predict a relationship between variables (not concepts).

Propositions and Hypotheses
Propositions link concepts together with specific relationships; hypotheses link variables together with specific relationships.
(Slide example: the concepts “Wikipedia Editing” and “Online Leadership”, operationalized as the variables “Total Number of Edits by Category during Time X” and “Observed Activities in Organizational Wiki Pages”.)

Null and Alternative Hypotheses Null Hypothesis, H0: The default, specific assumption. Usually that there is no effect. Alternative Hypothesis, HA: A prediction that comes from our expectation, theory, etc. Often that some relationship exists. We cannot prove our alternative hypothesis. All we can do is reject the null hypothesis or not.

Statistics
How we wish it worked: Observation → Probability(Model).
How it actually works: Specific Model → Probability(Observations).
The space of models is too unstructured to assign probabilities:
Given a sample mean, what is the probability that the average height of humans is exactly 5.5 feet? Um, probably zero?
Given that the sun has come up for each of the last 10 days, what is the probability that it comes up at least 90% of the time?
One question we can answer: given the null hypothesis, how unlikely are the observations?
Why do we focus our investigation on the null hypothesis? This relates to the capabilities of statistics. We can’t find the probability that a model is true. We certainly can never prove a point estimate: no amount of data will ever prove that the average height is exactly 5.5 feet. In fact, we really can’t even prove that a model is within a range with any probability. (OK, sometimes we want the probability that a parameter is within a confidence interval, but this requires many more assumptions: we are asking for the probability that we are within a confidence interval, assuming that one of a family of models is true and that they all have equal probability before any evidence.) One thing we can do is take a point estimate and find the probability for a set of observations.

Example
Suppose that students in the US read an average of 6 hours a week, with a standard deviation of 2 hours. We want to know if Berkeley students are different from the national average.
Our null hypothesis should be that Berkeley students’ reading habits are exactly the same as the national average (drawn from the same distribution).
Given the null hypothesis, we can compute how likely we are to find any average number of reading hours in our Berkeley sample.

The Z-test as a hypothesis test
z is a test statistic. More generally, z tells us how many standard errors an observed value is from its expected value.
Finding a z-score for a single variable, or using a z-score as a statistical test:
z = (observed − expected) / SE
Note that we are now using observed and expected values, along with standard errors. So, specifically, we are using the spread of a chance process.
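A minimal sketch of this formula in Python, reusing the reading-hours setup from the previous slide; the Berkeley sample mean (6.8 hours) and sample size (36) are made-up numbers for illustration:

```python
import numpy as np
from scipy import stats

# Null-hypothesis values from the previous slide: national mean of 6 hours,
# standard deviation of 2 hours.
mu_0, sigma = 6.0, 2.0

# Hypothetical Berkeley sample: 36 students averaging 6.8 hours (made-up numbers).
n, sample_mean = 36, 6.8

se = sigma / np.sqrt(n)                    # standard error of the mean
z = (sample_mean - mu_0) / se              # z = (observed - expected) / SE
p_two_tailed = 2 * stats.norm.sf(abs(z))   # probability in both tails beyond |z|

print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.3f}")
```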

One- and two-tailed hypothesis testing
One-tailed test: directional hypothesis; probability at one end of the curve.
Two-tailed test: non-directional hypothesis; probability at both ends of the curve.

Logic of Hypothesis Testing: One-tailed tests
Null hypothesis: H0: μ1 = μ2, where μ1 and μ2 are the two population means.
Alternative hypotheses: H1: μ1 < μ2 or H1: μ1 > μ2.

Logic of Hypothesis Testing: Two-tailed test
Null hypothesis: H0: μ1 = μ2, where μ1 and μ2 are the two population means.
Alternative hypothesis: H1: μ1 ≠ μ2.

Example 1
Do Berkeley Information students read either more or less than the national average of 6 hours a week?
H0: μ = 6 (the mean for Berkeley students is equal to 6).
H1: μ ≠ 6 (the mean for Berkeley students is not equal to 6).

Example 2
Do Berkeley Information students read more than the national average of 6 hours a week?
H0: μ = 6 (there is no difference between Berkeley students and other students).
H1: μ > 6 (the mean for Berkeley students is higher than the mean for all students).
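A short sketch contrasting the two examples: for the same hypothetical z statistic (the made-up value from the sketch above), the directional test in Example 2 puts all of its rejection probability in one tail:

```python
from scipy import stats

z = 2.4   # hypothetical observed z statistic (same made-up value as above)

# Example 1 (two-tailed, H1: mu != 6): probability in both tails.
p_two_tailed = 2 * stats.norm.sf(abs(z))

# Example 2 (one-tailed, H1: mu > 6): probability in the upper tail only.
p_one_tailed = stats.norm.sf(z)

print(f"two-tailed p = {p_two_tailed:.4f}, one-tailed p = {p_one_tailed:.4f}")
# The one-tailed p is half the two-tailed p, so a directional hypothesis reaches
# significance more easily, but only for a difference in the predicted direction.
```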

Statistical Significance http://xkcd.com/539/

The meaning of p-value
“The P-value of a test is the chance of getting a big test statistic– assuming the null hypothesis to be right. P is not the chance of the null hypothesis being right.” – David Freedman (Statistics, 4th Edition)
This is subtle, important, and constantly misunderstood.

The meaning of p-value Put differently: The p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
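One way to see this definition in action is to simulate: draw many samples assuming the null hypothesis is true and count how often the test statistic comes out at least as extreme as a hypothetical observed one. The parameters below reuse the reading-hours example; the observed z of 2.0 is a made-up value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Assume the null is true: reading hours with mean 6 and SD 2 (example values),
# samples of 36 students, and a hypothetical observed z statistic of 2.0.
mu_0, sigma, n = 6.0, 2.0, 36
observed_z = 2.0

# Draw many samples *from the null* and compute their z statistics.
samples = rng.normal(mu_0, sigma, size=(100_000, n))
z_stats = (samples.mean(axis=1) - mu_0) / (sigma / np.sqrt(n))

# Fraction of null samples at least as extreme as the observed statistic.
simulated_p = np.mean(np.abs(z_stats) >= observed_z)
analytic_p = 2 * stats.norm.sf(observed_z)
print(f"simulated p = {simulated_p:.4f}, analytic p = {analytic_p:.4f}")  # both near 0.046
```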

“Cheating” in Hypothesis Testing The entire point of hypothesis testing is that we want to evaluate an outcome given our hypothesis (our prediction). If we change our hypothesis in light of the data and results, it becomes increasingly likely that we will report irreproducible results. This is cheating in research. Remember: the p-value does not tell you if something is right or not. It only tells you the chance of getting a big test statistic, assuming the null hypothesis is true.

Common Evaluation of p-values
When the p-value > .10 → the observed difference is “not significant”.
When the p-value ≤ .10 → the observed difference is “marginally significant” or “borderline significant”.
When the p-value ≤ .05 → the observed difference is “statistically significant”.
When the p-value ≤ .01 → the observed difference is “highly significant”.

More on Evaluating Statistical Hypothesis Testing
Recall that of our two hypotheses, null and alternative, only one can be supported (they cannot both be supported at the same time).
Null hypothesis: H0: μ1 = μ2. Alternative hypothesis: H1: μ1 ≠ μ2.
Our decision to retain or reject our null hypothesis leaves an opportunity to commit two different types of error.

Type I Error in Statistical Hypothesis Testing
Type I error: falsely rejecting a null hypothesis (a false positive). Occurs when we think we are supporting our alternative hypothesis but the effect is not real.
Denoted by alpha: α.
Our alpha value is our statistical significance level, such as the standard .05 level (the test statistic must fall within the 5% region of one tail, or within the 2.5% region of either tail).
Might help to think of a pregnancy test: null = not pregnant (or not guilty).
In much social science work, a Type I error is a bit more dangerous than a Type II error.

Type II Error in Statistical Hypothesis Testing
Type II error: failing to reject the null hypothesis when it is false (a false negative). Can occur with very conservative tests.
Denoted by beta: β.
If we reduce α in a given test, we will increase β, and vice versa.

Statistical Power
We know that the probability of a Type II error (β) is the probability of missing a potentially important finding by retaining the null when it is actually false.
1 − β is known as statistical power, or just power. It is the probability of rejecting the null hypothesis when it is, in reality, false.
Image: http://rpsychologist.com/

More on Statistical Power Sample size is one of the most common ways that we increase our statistical power. As sample size increases, the sample becomes more and more like the population. The criterion used for statistical significance as well as the magnitude of the effect that we hope to observe also affect power. http://what-if.xkcd.com/
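A rough simulation of these ideas, under assumed numbers (a true mean of 6.5 reading hours against a null of 6, a known SD of 2, and a two-tailed z-test at α = .05), showing how power grows with sample size:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

mu_0, sigma = 6.0, 2.0   # null-hypothesis mean and (assumed known) SD
true_mu = 6.5            # hypothetical true mean: the effect really exists
alpha = 0.05

def simulated_power(n, trials=20_000):
    # Sample from the *true* distribution and count how often the two-tailed
    # z-test rejects the null at the chosen alpha level.
    samples = rng.normal(true_mu, sigma, size=(trials, n))
    z = (samples.mean(axis=1) - mu_0) / (sigma / np.sqrt(n))
    p = 2 * stats.norm.sf(np.abs(z))
    return np.mean(p < alpha)

for n in (15, 30, 60, 120):
    print(f"n = {n:3d}: power ~ {simulated_power(n):.2f}")
# Power (1 - beta) climbs toward 1 as the sample size grows.
```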

Can we “prove” the null or alternative hypothesis?
We cannot support the null hypothesis: it’s a point estimate.
As odd as it may seem at first, we reject or do not reject the null; a traditional hypothesis test is evaluated against the null.
Again, this is very important: we never use the word “prove” with hypothesis testing and statistics; we reject or fail to reject.
We always run the risk of a Type I or Type II error because we are making optimal use of available data. We minimize error, but we are always making a choice in the presence of uncertainty.

What your p-value won’t tell you
Statistical significance (the p-value of our statistical hypothesis test) only tells us whether we should reject or fail to reject our null hypothesis.
Statistical significance does not tell us whether a difference or an effect is actually large, small, useful, or not practically useful.

Effect Size
After we conduct a statistical test, we look at the size of an effect: the effect size.
Different effect size calculations exist for different types of tests, including the r coefficient for correlations, Cohen’s d for differences in means, and regression coefficients.
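A minimal sketch of one of these, Cohen’s d (the difference in means scaled by the pooled standard deviation), computed for two made-up groups:

```python
import numpy as np

def cohens_d(x, y):
    # Difference in means scaled by the pooled standard deviation.
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

rng = np.random.default_rng(3)
group_x = rng.normal(7.0, 2.0, size=50)   # made-up approval scores
group_y = rng.normal(6.0, 2.0, size=50)
print(f"Cohen's d = {cohens_d(group_x, group_y):.2f}")
```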

Some Examples
Example 1. Hypothesis: approval ratings of the US president will be higher among group X compared to group Y. Measure: approval scale from 1-10 (where 10 = highest approval). We get a random sample, N=30. Result: the mean for group X is 4, the mean for group Y is 7. We get a p-value of .06 (marginally statistically significant).
Example 2. Hypothesis: approval ratings of the US president will be higher among group X compared to group Y. Measure: approval scale from 1-10 (where 10 = highest approval). We get a random sample, N=10,000. Result: the mean for group X is 7, the mean for group Y is 7.2. We get a p-value of .001 (highly statistically significant).
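To preview the next slide’s point, here is a rough sketch based on the two examples above; the common group SD (2.5) and the even split of N into two groups are assumptions, so the computed p-values will not match the slide’s quoted ones exactly, but the contrast between significance and effect size comes through:

```python
import numpy as np
from scipy import stats

def summary_test(mean_x, mean_y, sd, n_per_group):
    # Two-sample z approximation from summary statistics (equal SDs assumed).
    se = sd * np.sqrt(2 / n_per_group)
    z = (mean_y - mean_x) / se
    p = 2 * stats.norm.sf(abs(z))
    d = (mean_y - mean_x) / sd          # standardized effect size (Cohen's d)
    return p, d

# Example 1: large difference, small sample.
p1, d1 = summary_test(4.0, 7.0, sd=2.5, n_per_group=15)
# Example 2: tiny difference, huge sample.
p2, d2 = summary_test(7.0, 7.2, sd=2.5, n_per_group=5_000)

print(f"Example 1: p = {p1:.3f}, d = {d1:.2f}")
print(f"Example 2: p = {p2:.5f}, d = {d2:.2f}")
# Example 2 comes out "highly significant" even though its standardized effect
# is more than ten times smaller than Example 1's.
```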

Practical vs. Statistical Significance
Statistical significance: tells us whether we should reject or fail to reject our null hypothesis. This is determined by probability.
Practical significance: tells us the magnitude of a given effect, and whether the observed effect is meaningful or important.