EDUC 200C Section 5–Hypothesis Testing Forever November 2, 2012.


Goals
– Quick review of hypothesis testing
– Confidence intervals
– Stata
– Practice Problem
– Questions?

Review of the General Idea of Hypothesis Testing
We’re good at “the SAT question”: given a population mean and standard deviation, how rare is observing a particular score? (We all know our percentile on the GRE, for example, and what that means.)
Hypothesis testing is the same, except we have:
– Sample means instead of test scores
– Null hypothesis instead of population mean
– Standard error instead of standard deviation
We want to know: is our sample mean likely to have come from the population described by the null hypothesis?
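The analogy above can be sketched numerically. This is a minimal illustration, not part of the original slides: every number below (scores, means, standard deviations, sample size) is hypothetical and chosen only to show that the "SAT question" and the hypothesis test are the same calculation with different inputs. It also assumes the population standard deviation is known, so a z statistic applies.

```python
from math import sqrt
from statistics import NormalDist


def z_for_score(score, pop_mean, pop_sd):
    """The 'SAT question': distance of one score from the population mean, in SDs."""
    return (score - pop_mean) / pop_sd


def z_for_sample_mean(sample_mean, null_mean, pop_sd, n):
    """Same idea for a sample mean: the null mean replaces the population mean
    and the standard error (pop_sd / sqrt(n)) replaces the standard deviation."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Hypothetical numbers, for illustration only.
z_score = z_for_score(score=600, pop_mean=500, pop_sd=100)
z_mean = z_for_sample_mean(sample_mean=52, null_mean=50, pop_sd=10, n=100)

# Two-tailed p-value: how rare is a sample mean at least this far from the null mean?
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z_mean)))

print(z_score, z_mean, round(p_two_tailed, 4))
```

The only differences between the two functions are the three substitutions listed on the slide.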

Confidence Intervals
A confidence interval gives a range of scores in which we are “confident” that the true mean of the population our sample was drawn from lies.
We know our sample mean has a 95% chance of being within a certain distance of the mean of the true population from which the sample was drawn (this might not be the null hypothesis population).
What is this distance?
– Depends on the critical t value for our sample, t_α

Confidence intervals with Z-scores
We know with 95% confidence that our sample mean is no more than 1.96 standard errors from the true mean. That is, the z score of the true mean (of the population from which our sample was drawn, which might not be the null hypothesis population) is within 1.96 of our sample mean’s z score.
Another way to see it: we reject the null hypothesis for any z value not between -1.96 and 1.96.
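As a rough check on where 1.96 comes from, the sketch below looks up the two-tailed critical z value from the standard normal distribution and applies the rejection rule. The function names are made up for this illustration and are not from the slides.

```python
from statistics import NormalDist


def critical_z(confidence=0.95):
    """Two-tailed critical z: the value that leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


def reject_null(z, confidence=0.95):
    """Reject the null hypothesis when z falls outside [-z*, +z*]."""
    return abs(z) > critical_z(confidence)


print(round(critical_z(0.95), 2))
print(reject_null(2.3), reject_null(-1.2))
```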

Confidence Interval math…
– The z score of the true mean is always zero
– Substitute the z score formula
– Multiply by the standard error
– Add the population mean
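The equations for these steps did not survive in the transcript. One plausible reconstruction is sketched below. The notation is an assumption: \bar{X} is the sample mean, \mu is the true population mean, and \sigma_{\bar{X}} is the standard error. The last line is the same inequality solved for \mu, which gives the interval form used for a confidence interval.

```latex
\begin{align*}
-1.96 &\le z_{\bar{X}} - z_{\mu} \le 1.96 \\
-1.96 &\le z_{\bar{X}} \le 1.96
  && \text{since } z_{\mu} = 0 \\
-1.96 &\le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le 1.96
  && \text{substitute the z score formula} \\
-1.96\,\sigma_{\bar{X}} &\le \bar{X} - \mu \le 1.96\,\sigma_{\bar{X}}
  && \text{multiply by the standard error} \\
\mu - 1.96\,\sigma_{\bar{X}} &\le \bar{X} \le \mu + 1.96\,\sigma_{\bar{X}}
  && \text{add the population mean} \\
\bar{X} - 1.96\,\sigma_{\bar{X}} &\le \mu \le \bar{X} + 1.96\,\sigma_{\bar{X}}
  && \text{solve for } \mu
\end{align*}
```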

Confidence Intervals
Thus we have that the true population mean lies, with 95% confidence, in the range from the sample mean minus 1.96 standard errors to the sample mean plus 1.96 standard errors.
We can generalize this for other levels of confidence by changing our critical z value.
We can also generalize for the t distribution.
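A minimal sketch of this interval, generalized to other confidence levels by changing the critical z value. The function name and the input numbers are hypothetical. It uses the normal distribution only; for the t version, the critical z would be replaced by a critical t with n - 1 degrees of freedom (from a t table or from Stata), which Python's standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Return (low, high) = sample_mean -/+ z* times the standard error."""
    standard_error = sd / sqrt(n)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 52, SD 10, n = 100, so the standard error is 1.
low95, high95 = z_confidence_interval(52, 10, 100)
low99, high99 = z_confidence_interval(52, 10, 100, confidence=0.99)

print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

Raising the confidence level widens the interval, because a larger critical value is needed to capture more of the sampling distribution.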

Stata…
Quick command to describe your data:
summarize varname
This also has the “detail” option, which gives more detail.
“summarize” can be shortened to “sum” and “detail” to “d”, so we can write
summarize varname, detail
or
sum varname, d

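Say we have a sample of reading scores…
. sum rdg
(Output columns: Variable, Obs, Mean, Std. Dev., Min, Max. The values are not preserved in this transcript.)
. sum rdg, d
(Detailed output for RDG: Percentiles, Smallest and Largest values, Obs, Sum of Wgt., Mean, Std. Dev., Variance, Skewness, Kurtosis. The only surviving value, 52.1, appears in the 50% percentile row.)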

Using Stata to test our null hypothesis
Kenji talked yesterday about running a t-test to test our null hypothesis. You can use this to compare the mean of a sample to a particular value:
ttest var==[null hyp. value]
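To see the arithmetic behind a one-sample t-test, here is a minimal Python sketch. It is not Stata code and not the course's method; the function name and the scores are hypothetical. It computes the t statistic as the sample mean minus the null value, divided by the standard error, with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, null_value):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    standard_error = stdev(values) / sqrt(n)
    t_stat = (mean(values) - null_value) / standard_error
    return t_stat, n - 1


# Hypothetical reading scores, for illustration only.
scores = [48, 52, 55, 47, 53, 51, 49, 56]
t_stat, df = one_sample_t(scores, 50)

print(round(t_stat, 3), df)
```

The resulting t would then be compared with the critical t value for the same degrees of freedom, from a t table or from the p-values Stata reports.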

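. ttest rdg==50
One-sample t test
(Output columns: Variable, Obs, Mean, Std. Err., Std. Dev., [95% Conf. Interval]. The values are not preserved in this transcript.)
mean = mean(rdg)
Ho: mean = 50
degrees of freedom = 299
Ha: mean < 50, Ha: mean != 50, Ha: mean > 50, with Pr(T < t), Pr(|T| > |t|), and Pr(T > t) reported for each (values not preserved)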

Practice Problem
Fifteen years ago, a complete survey of undergraduate students at a large university indicated that the average student smoked 8.3 cigarettes per day. The director of the student health center wishes to determine whether the incidence of cigarette smoking at his university has decreased over the 15-year period. He obtains results from a recently selected random sample of undergraduate students (the sample results are not preserved in this transcript).
– What are H0 and H1?
– Can you reject the null hypothesis with α=0.05?
– What is the 95% confidence interval for the true value of current mean cigarettes smoked per day?
– Draw final conclusions

Confidence Intervals for Hands data

Questions?