IS 4800 Empirical Research Methods for Information Science Class Notes March 13 and 15, 2012 Instructor: Prof. Carole Hafner, 446 WVH



Parametric Statistics (numeric variables)
Assumes a (near-enough-to) normal population distribution, so these parameters make sense:
–$\mu$ = the population mean (unknown)
–$\sigma^2$ = the population variance
–$\sigma$ = the population standard deviation
Samples of size N are used to estimate these parameters.
M is the sample mean, used to estimate $\mu$. Calculate: $M = \frac{\sum X}{N}$

3 Relationship Between Population and Samples When a Treatment Had No Effect

4 Relationship Between Population and Samples When a Treatment Had An Effect

What we must decide
Which one of these diagrams to believe?
–How to express belief in the first diagram (the treatment had no effect)
–How to express belief in the second diagram (the treatment had an effect)
How do we make that decision?
–How far apart do the sample means need to be? We calculate this relative to information about the variance.
–We use a criterion, alpha, which is our tolerance for being wrong.

Estimating population variance
–“Sum of Squares”: $SS = \sum (X - M)^2$
–Sample variance: $SD^2 = \frac{\sum (X - M)^2}{N}$
–Estimated population variance: $S^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{SS}{N - 1}$
–True variance of the sample means (unknown): $\sigma^2_M = \frac{\sigma^2}{N}$
–Estimated variance of the sample means: $S^2_M = \frac{S^2}{N}$
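The variance estimates on this slide can be sketched in a few lines of Python. The function name and the example scores are made up for illustration; only the formulas come from the notes.

```python
from math import sqrt

def variance_estimates(scores):
    """Return M, SS, SD^2, S^2, S^2_M and S_M for one sample of scores."""
    n = len(scores)
    m = sum(scores) / n                      # sample mean M
    ss = sum((x - m) ** 2 for x in scores)   # sum of squares SS
    sd2 = ss / n                             # sample variance SD^2 (divide by N)
    s2 = ss / (n - 1)                        # estimated population variance S^2
    s2_m = s2 / n                            # estimated variance of the sample means
    return {"M": m, "SS": ss, "SD2": sd2, "S2": s2,
            "S2_M": s2_m, "S_M": sqrt(s2_m)}

# Hypothetical scores, invented for this example only.
example = variance_estimates([4, 6, 8, 10, 12])
print(example)
```

Dividing by N − 1 rather than N makes S² slightly larger than SD², which compensates for the sample mean M sitting closer to the sample's own scores than the unknown μ does.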

Why do we care about the variance of the sample means?
Sampling Distribution
–The distribution of the means of every possible sample of size N taken from a population
Sampling Error
–The difference between a sample mean and the population mean: $M - \mu$
–The standard error of the mean is a measure of sampling error (the standard deviation of the distribution of means)

8 Understanding numeric measures
Sources of variance
–IV
–Other uncontrolled factors (“error variance”)
If (many) independent random variables with the same distribution are added, the result is approximately a normal curve
–The Central Limit Theorem
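A small simulation can make the Central Limit Theorem and the distribution of means concrete. This is only a sketch: it draws a few thousand random samples instead of enumerating every possible sample, and the die-roll population, the sample size, and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(population, n, draws):
    """Draw `draws` random samples of size n (with replacement); return their means."""
    return [mean(random.choices(population, k=n)) for _ in range(draws)]

# Hypothetical, deliberately non-normal population: fair die rolls 1..6.
population = [1, 2, 3, 4, 5, 6]
n = 30
means = sample_means(population, n, draws=5000)

print("population mean:", mean(population))
print("mean of the sample means:", round(mean(means), 3))
print("sigma^2 / N:", round(pvariance(population) / n, 4))
print("variance of the sample means:", round(pvariance(means), 4))
```

The mean of the sample means should land close to the population mean, and their variance close to σ²/N, even though individual die rolls are uniform rather than normal.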

9 The most important parts of the normal curve (for testing) Z=1.65 5%

10 The most important parts of the normal curve (for testing) Z= % Z= %
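The Z cutoffs that mark off tail areas of the normal curve can be recomputed rather than read from a chart. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_cutoff` and the choice of alpha levels are assumptions for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, standard deviation 1

def z_cutoff(alpha, tails):
    """Positive Z cutoff leaving alpha in one tail, or alpha/2 in each of two tails."""
    return z.inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    one_tailed = round(z_cutoff(alpha, tails=1), 3)
    two_tailed = round(z_cutoff(alpha, tails=2), 3)
    print(f"alpha={alpha}: one-tailed Z={one_tailed}, two-tailed Z={two_tailed}")
```

For alpha = .05 the one-tailed cutoff comes out at about 1.645, which is often rounded to 1.65, and the two-tailed cutoff at about 1.96.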

11 Hypothesis testing – two tailed
Hypothesis: sample (of 1) will be significantly different from known population distribution
Example – WizziWord experiment:
–H1: $\mu_{WizziWord} \neq \mu_{Word}$
–$\alpha = .05$ (two-tailed)
–Population (Word users): $\mu_{Word}$, $\sigma_{Word}$
–What level of performance do we need to see before we can accept H1?

12 Hypothesis testing – two tailed
Hypothesis: sample (of 1) will be significantly different from known population distribution
Example – WizziWord experiment:
–H1: $\mu_{WizziWord} \neq \mu_{Word}$
–$\alpha = .05$ (two-tailed)
–Population (Word users): $\mu_{Word}$, $\sigma_{Word}$
–What level of performance do we need to see before we can accept H1?
Must see performance more than 1.96 standard deviations above the mean (> 199).
BUT also, if performance is more than 1.96 standard deviations below the mean (< 101), we will reject H0.
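The two-tailed decision rule can be sketched as follows. Note the assumption: a population mean of 150 and standard deviation of 25 are not stated in these notes; they are inferred here only because they reproduce the cutoffs 101 and 199 at 1.96 standard deviations. The function names and test scores are hypothetical.

```python
from statistics import NormalDist

def two_tailed_cutoffs(mu, sigma, alpha=0.05):
    """Lower and upper raw-score cutoffs for a two-tailed Z test on a single score."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    return mu - z_crit * sigma, mu + z_crit * sigma

def reject_h0(score, mu, sigma, alpha=0.05):
    """True if the score falls in either rejection region."""
    low, high = two_tailed_cutoffs(mu, sigma, alpha)
    return score < low or score > high

# ASSUMPTION: values inferred from the cutoffs 101 and 199, not given in the notes.
MU_WORD, SIGMA_WORD = 150, 25

low, high = two_tailed_cutoffs(MU_WORD, SIGMA_WORD)
print(round(low), round(high))
print(reject_h0(205, MU_WORD, SIGMA_WORD))   # hypothetical score above the upper cutoff
print(reject_h0(160, MU_WORD, SIGMA_WORD))   # hypothetical score inside both cutoffs
```

A two-tailed test rejects H0 for extreme scores in either direction, which is why both a lower and an upper cutoff are needed.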

13 Standard testing criteria for experiments  Two-tailed

14 Don’t try this at home You would never do a study this way. Why? –Can’t control extraneous variables through randomization. –Usually don’t know population statistics. –Can’t generalize from one individual.

Population (Mean? Variance?) → Sampling → Sample of size N
Mean values from all possible samples of size N, aka “distribution of means”
$Z_M = \frac{M - \mu}{\sigma_M}$

Z tests and t-tests
t is like Z: $Z = \frac{M - \mu}{\sigma_M}$ and $t = \frac{M - 0}{S_M}$
We use a stricter criterion (t) instead of Z because $S_M$ is based on an estimate of the population variance, while $\sigma_M$ is based on a known population variance.

T-test with paired samples
Given info about the population of change scores and the sample size we will be using (N), we can compute the distribution of means.
Now, given a particular sample of change scores of size N, we compute its mean and finally determine the probability that this mean occurred by chance.
–$\mu = 0$
–$S^2$ = estimate of $\sigma^2$ from the sample = $SS/df$, with $df = N - 1$
–$S^2_M = S^2 / N$
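The paired-samples procedure can be sketched as below. The before/after scores are invented, and the cutoff 2.776 is the standard t-table value for df = 4 at the .05 level, two-tailed; for other designs the cutoff would have to be looked up again.

```python
from math import sqrt

def paired_t(before, after):
    """t statistic and df for paired samples, computed from the change scores."""
    changes = [a - b for b, a in zip(before, after)]
    n = len(changes)
    df = n - 1
    m = sum(changes) / n                        # mean change score M
    ss = sum((c - m) ** 2 for c in changes)     # sum of squares SS
    s2 = ss / df                                # S^2: estimated population variance
    s_m = sqrt(s2 / n)                          # S_M: sqrt of S^2_M = S^2 / N
    return (m - 0) / s_m, df                    # comparison mean is 0 under H0

# Hypothetical scores for five participants, made up for illustration.
before = [10, 12, 9, 14, 11]
after = [13, 14, 10, 18, 12]

t, df = paired_t(before, after)
T_CUTOFF = 2.776  # t table: df = 4, alpha = .05, two-tailed
print(round(t, 3), df, abs(t) > T_CUTOFF)
```

Working on change scores turns the two related samples into a single sample, so the test reduces to asking whether the mean change differs from 0.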

t test for independent samples
Given two samples:
–Estimate population variances (assume same)
–Estimate variances of distributions of means
–Estimate variance of differences between means (mean = 0)
This is now your comparison distribution

Estimating the Population Variance
$S^2$ is an estimate of $\sigma^2$
$S^2 = SS/(N - 1)$ for one sample (take the square root for S)
For two independent samples – “pooled estimate”:
$S^2 = \frac{df_1}{df_{Total}} S_1^2 + \frac{df_2}{df_{Total}} S_2^2$
$df_{Total} = df_1 + df_2 = (N_1 - 1) + (N_2 - 1)$
From this calculate the variance of sample means, $S^2_M = S^2/N$, needed to compute the t statistic

t test for independent samples, continued
Distribution of differences between means
–This is your comparison distribution
–NOT normal, is a ‘t’ distribution
–Shape changes depending on df: $df = (N_1 - 1) + (N_2 - 1)$
Compute $t = (M_1 - M_2)/S_{Difference}$
Determine if beyond cutoff score for test parameters (df, sig, tails) from lookup table.
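The independent-samples steps can be sketched end to end. Two things here are assumptions rather than content of the notes: the variance of the difference is taken as the sum of the two variances of the means (the usual pooled-variance form), and the two groups of scores are invented. The cutoff 2.306 is the standard t-table value for df = 8 at the .05 level, two-tailed.

```python
from math import sqrt

def sum_of_squares(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def independent_t(sample1, sample2):
    """t statistic and total df for two independent samples (pooled variance)."""
    n1, n2 = len(sample1), len(sample2)
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2
    s2_1 = sum_of_squares(sample1) / df1     # variance estimate from sample 1
    s2_2 = sum_of_squares(sample2) / df2     # variance estimate from sample 2
    # Pooled estimate: each sample weighted by its share of the degrees of freedom.
    s2_pooled = (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2
    s2_m1 = s2_pooled / n1                   # variance of the means, sample 1
    s2_m2 = s2_pooled / n2                   # variance of the means, sample 2
    s_difference = sqrt(s2_m1 + s2_m2)       # ASSUMPTION: variances of the means add
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    return (m1 - m2) / s_difference, df_total

# Hypothetical scores for two groups, made up for illustration.
group1 = [5, 7, 9, 11, 13]
group2 = [2, 4, 6, 8, 10]

t, df = independent_t(group1, group2)
T_CUTOFF = 2.306  # t table: df = 8, alpha = .05, two-tailed
print(t, df, abs(t) > T_CUTOFF)
```

In this made-up example the means differ by 3 points, but the spread within each group is large enough that t stays below the cutoff, so H0 would not be rejected.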

21 Effect size
The amount of change seen in the DVs.
A test can be statistically significant and still have a small effect size.

22 Power Analysis
Power
–Increases with effect size
–Increases with sample size
–Decreases as alpha is made stricter (smaller)
Should determine the number of subjects you need ahead of time by doing a ‘power analysis’
Standard procedure:
–Fix alpha and beta (power = 1 − beta)
–Estimate effect size from prior studies; categorize based on Table 13-8 in Aron (sm/med/lg)
–Determine number of subjects you need
–For Chi-square, see Table in Aron reading
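The relationships listed for power can be checked with a rough calculation. This sketch assumes the simplest case, a one-tailed Z test on a single sample mean with known σ, and expresses effect size as a standardized difference d = (μ1 − μ0)/σ. It is a normal-curve approximation, not the procedure or the tables from the Aron reading, and the values 0.2, 0.5 and 0.8 are Cohen's conventional small, medium and large effect sizes, used here only as illustrative inputs.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal curve

def power_one_tailed_z(d, n, alpha=0.05):
    """Approximate power of a one-tailed Z test for effect size d and sample size n."""
    z_crit = Z.inv_cdf(1 - alpha)
    return 1 - Z.cdf(z_crit - d * sqrt(n))

def subjects_needed(d, alpha=0.05, power=0.80):
    """Smallest n reaching the requested power under the same approximation."""
    z_alpha = Z.inv_cdf(1 - alpha)
    z_power = Z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / d) ** 2)

for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    n = subjects_needed(d)
    print(label, d, n, round(power_one_tailed_z(d, n), 3))
```

Fixing alpha and the target power first, then plugging in an effect size estimated from prior studies, yields the number of subjects before any data are collected. Smaller expected effects require many more subjects.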