Inferential Statistics Introduction

Bivariate (two variables):
– If both variables are categorical, build tables. Convention: each value of the independent (causal) variable has its own column.
– One table: insert the observed frequencies (the number of cases that share corresponding values of the independent and dependent variables).
– Another table: insert percentages, computed separately for each value of the independent variable (columns total 100%).
Multivariate (three or more variables):
– For additional independent (control) variables, construct a first-order partial table for each level of the new variable.
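A minimal sketch of this two-table convention, using pandas; the DataFrame, column names, and values below are all made-up assumptions, not from the original slides:

```python
import pandas as pd

# Hypothetical data: "poverty" is the independent variable, "crime" the dependent.
df = pd.DataFrame({
    "poverty": ["low", "low", "high", "high", "high", "low"],
    "crime":   ["no",  "no",  "yes",  "no",   "yes",  "yes"],
})

# Table 1: observed frequencies; the independent variable forms the columns.
observed = pd.crosstab(df["crime"], df["poverty"])

# Table 2: percentages computed separately for each column (columns total 100%).
percents = pd.crosstab(df["crime"], df["poverty"], normalize="columns") * 100

print(observed)
print(percents.round(1))
```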

Hypothesis: POVERTY → CRIME (r = .87, r² = .76)
Bivariate (two variables):
– If both variables are continuous, use the r (correlation) and r² (regression) statistics.
Multivariate (three or more variables):
– For additional (control) variables, use partial correlation or multiple regression.
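As a sketch of how r and r² are obtained in practice (the poverty and crime figures below are illustrative, not real data):

```python
from scipy.stats import pearsonr

# Hypothetical poverty rates and crime rates for six areas.
poverty = [12.1, 18.4, 9.8, 22.5, 15.0, 27.3]
crime   = [31.0, 44.2, 25.7, 58.9, 39.5, 63.1]

r, p_value = pearsonr(poverty, crime)
print(f"r = {r:.2f}, r^2 = {r**2:.2f}, p = {p_value:.3f}")
```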

Inferential statistics
– Go beyond simple evaluations, such as comparing proportions or calculating a correlation (r).
– Inferential statistics allow us to legitimately "generalize" our findings – apply the results from a sample to a population.
– Since we are projecting our findings to a population, we must draw samples; if our "sample" is actually the entire population, these methods do not apply. We must use probability sampling (i.e., random sampling).
– We calculate a "test statistic", such as an r, X², t, F, Beta, etc. If this statistic is sufficiently large, we can say that there is a "statistically significant" relationship between the variables.

How inferential statistics work
– Always based on a POPULATION from which a probability (random) sample is taken.
– Except for Chi-Square, which has its own sampling distribution, it is assumed that the dependent variable scores in the population are normally distributed around a mean, or centerpoint.
– This centerpoint is the score on the dependent variable that one would expect by randomly drawing a single case from the population.
– To assess the effect of independent variables, the computer analyzes the dependent variable scores for a particular sample. For example, in the difference-between-means test, the centerpoint is the average difference between all possible sample means (this is computed automatically).
– If the difference between a sample mean and the centerpoint is large enough to wind up in the distribution's "tail", the relationship is deemed statistically "significant".
– Essentially the same logic applies to all inferential statistics.
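A minimal simulation of the idea behind a sampling distribution: draw many random samples from a population and see how their means distribute around a centerpoint. All numbers here are made-up assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical population of normally distributed scores.
population = rng.normal(loc=100, scale=15, size=100_000)

# Means of 10,000 random samples of size 30.
sample_means = [rng.choice(population, size=30).mean() for _ in range(10_000)]

print(f"population mean: {population.mean():.2f}")
print(f"mean of sample means (centerpoint): {np.mean(sample_means):.2f}")
print(f"std. dev. of sample means (standard error): {np.std(sample_means):.2f}")
```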

Systematic and Error Effects
Whenever we "observe" that one variable may be causing corresponding changes in another variable, we are really seeing the sum of two things:
– A "systematic" effect: the portion of the relationship produced by the action of the independent variable.
– An "error" or "chance" effect: the portion produced by random factors.
(The original slide illustrated this with a shape whose entire surface area represents the observed relationship between two variables, with one portion marked as due to chance. How does removing what is due to chance affect our estimate of the strength of the relationship between the variables?)

Null Hypothesis
– The null hypothesis holds that the relationship between variables (including any difference between groups) is caused by chance – essentially, that our working hypothesis is incorrect.
– To "reject" the null hypothesis, we must demonstrate that the association between variables is substantially larger than would be produced by chance alone.
– The first step is to compute a "test statistic" – an r, X², t, F, Beta, etc. Based on the size of the test statistic, the sample size, etc., the computer calculates the probability that the null hypothesis is true.
– If at least ONE asterisk appears next to a test statistic (r, t, F, X², Beta, etc.), the statistic is sufficiently large to overcome the null hypothesis:
[test statistic]* = .05 (5 chances in 100 that the null hypothesis is true)
[test statistic]** = .01 (1 chance in 100 that the null hypothesis is true)
[test statistic]*** = .001 (1 chance in 1,000 that the null hypothesis is true)
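A minimal sketch of the asterisk convention, using the thresholds from the slide (.05, .01, .001); the function name and example p-values are made up:

```python
def stars(p: float) -> str:
    """Map a probability to the conventional significance asterisks."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""

for p in (0.20, 0.04, 0.008, 0.0004):
    print(f"p = {p}: [test statistic]{stars(p)}")
```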

Tests of Significance
The specific test to be used depends on the level of measurement of the variables:
– All variables categorical: Chi-Square (X²)
– Independent variable categorical, dependent variable continuous: t-test and ANOVA
– All variables continuous: correlation and regression
Tests of significance yield a statistic (e.g., r, X², t, z, F, Beta), which is the ratio:

STATISTIC = systematic effect / chance (random) effect

where the systematic effect is the "real" influence of the independent variable, and the chance (random) effect is the apparent influence actually produced by chance.
– The larger the numerator and the smaller the denominator, the larger the statistic.
– As statistics get bigger, the probability that the null hypothesis is true falls.
– When the probability that the null hypothesis is true falls to five chances in a hundred (.05) or less, the relationship between the variables is deemed "statistically significant".
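A sketch matching each level of measurement to its test, using scipy; all datasets below are small made-up examples:

```python
import numpy as np
from scipy import stats

# All variables categorical: chi-square on a contingency table.
table = np.array([[20, 10],
                  [5, 25]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")

# Categorical independent, continuous dependent: t-test for two groups.
group_a = [4.1, 5.0, 5.5, 4.8, 5.2]
group_b = [6.0, 6.4, 5.9, 6.8, 6.1]
t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.4f}")

# All variables continuous: correlation.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r, p = stats.pearsonr(x, y)
print(f"r = {r:.2f}, p = {p:.4f}")
```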

Estimating the error between sample and population statistics
– Standard error of the mean: an estimate of the amount of (chance) error we can expect between the means of repeated samples and the population "parameter":

Sx̄ = s / √(n − 1)

where s is the standard deviation of our sample and n is the sample size.
– We can't use s without adjustment because the standard deviation of a sample contains real (systematic) differences in addition to differences that are due to error.
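A minimal sketch of the slide's formula with made-up sample scores (taking s to be the sample standard deviation, as the slide describes):

```python
import math
import statistics

sample = [98, 110, 105, 120, 97, 115, 108, 102]
n = len(sample)
s = statistics.stdev(sample)       # s: standard deviation of our sample
sem = s / math.sqrt(n - 1)         # the slide's adjustment: divide by sqrt(n - 1)
print(f"s = {s:.2f}, standard error of the mean = {sem:.2f}")
```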

Confidence Limit and z scores
– Confidence limit: the range of values within which we are confident the population mean will fall.
– The center point for estimating this range is the mean of a single, randomly drawn sample:

cl = x̄ ± z (Sx̄)

– How "confident" must we be? In social science research, we do not want to exceed a chance of more than 5 in 100 of being wrong.
– z-scores represent an actual value on the variable being measured – the more extreme the value, the more extreme the z-score.
– Cases with extreme values are so rare that we could say they do not "belong" to the population from which the sample was drawn.
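A minimal sketch of cl = x̄ ± z(Sx̄) with z = 1.96 for 95% confidence; the mean, standard deviation, and n below are made-up values:

```python
import math

x_bar, s, n = 50.0, 10.0, 26
sem = s / math.sqrt(n - 1)          # standard error of the mean (slide's formula)
z = 1.96
lower, upper = x_bar - z * sem, x_bar + z * sem
print(f"95% confidence limits: {lower:.2f} to {upper:.2f}")   # 46.08 to 53.92
```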

Since the means of repeated samples are normally distributed around the population mean, we can use the normal curve and z scores to find the confidence limits. 95% of the area under a standard normal curve lies between z = −1.96 and z = +1.96, so for z we usually use 1.96.

Class exercise – confidence limit
Using data from the offender sample drawn earlier, calculate a confidence limit into which the population parameter will fall 95 out of 100 times:

Sx̄ = s / √(n − 1)
cl = x̄ ± z (Sx̄)

Police recruit IQ test
A sample of 100 recruits is tested for IQ (it is well established that IQ scores are normally distributed in the general population).

x̄ = 120, n = 100, s = 16

Standard error of the mean: Sx̄ = s / √(n − 1)
Confidence limit: cl = x̄ ± z (Sx̄)

There is a 95% probability that the mean IQ of the population from which this sample was drawn falls between these scores: _______ (lower limit) and _______ (upper limit).

x̄ = 120, n = 100, s = 16

Sx̄ = s / √(n − 1) = 16 / √99 = 1.61

cl = 120 ± 1.96 (1.61)

There is a 95% probability that, based on the statistics of this sample, the mean IQ of the population from which the sample was drawn falls between 116.85 and 123.15.
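A quick arithmetic check of the IQ example:

```python
import math

sem = 16 / math.sqrt(99)            # = 1.61
lower = 120 - 1.96 * sem            # = 116.85
upper = 120 + 1.96 * sem            # = 123.15
print(f"{lower:.2f} to {upper:.2f}")
```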