Statistics for the Social Sciences Psychology 340 Spring 2009 Introductions & review of some basics
This course How is this course different from PSY 138? Format: Longer lectures, but no separate lab times Content: review of PSY 138 and beyond (remember the “beyond 138”) Dealing with more complex situations (more than 2 variables) Hypothesis testing with: Correlation and regression Multiple regression Tests for differences with more than two groups
Course Format Lectures - will take up most of class time Homework - practice items, ARE collected for credit Labs - practice items, NOT collected for credit Two parts: By hand step-by-step example(s) Large SPSS dataset practice question(s) Exams - three cumulative exams Quizzes - on-line Blackboard quizzes (about 10 of them), based primarily on textbook readings
Variability is key At the heart of research methodology and statistics is variability. Variables - characteristics with values that aren’t constant (across individuals, time, place, etc.). We’re interested in explaining (predicting) the variability of variables. We use experimental control to constrain variability, which makes it easier to see how different variables affect each other. We use statistical procedures to examine which variables vary together (and which don’t).
Statistical analysis follows design Statistical analysis follows from the design of a study Our decision tree helps us ask the right design questions which will lead us to the appropriate statistical test
Statistical analysis follows design Decide if there is a difference: experimental & quasi-experimental studies, testing for differences between groups (conditions). Vs. Decide if there is a relationship between variables: observational studies, testing for similarities between variables. This is a generalization, and there are exceptions. Towards the end of the course we’ll see that the two may be considered essentially the same kinds of analyses.
Basic Research Methods Observational study Researcher observes and measures variables of interest to find relationships between the variables No attempt is made to manipulate or influence responses Experimental methodology One (or more) independent variable(s) is manipulated while changes are observed in another variable (dependent) Used to establish cause-and-effect relationships between variables Uses extensive methods of control to minimize extraneous sources of variability Quasi-experimental methodology One (or more) of the independent variables is a pre-existing characteristic (e.g., sex, age, etc.)
Different basic methods Experimental versus Observational methods Experiments involve manipulation of variables Observational methods involve examining things as they already are
Example Issue: What’s the best way to study for a test? Experimental: Randomly select individuals; randomly assign them to groups (crammed study group, distributed study group); see how they do on a test. Observational: Randomly select individuals; watch their study habits; see how they do on a test.
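The experimental version of the study-method example (random selection followed by random assignment to groups) can be sketched in a few lines of Python. This is only an illustration: the population of 200 students, the sample size of 20, the seed, and the function name `assign_to_groups` are all assumptions, not part of the course materials.

```python
import random

def assign_to_groups(participants, groups, seed=None):
    """Randomly assign participants to groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants out to the groups in turn.
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment

# Hypothetical population of 200 students.
population = [f"student_{i}" for i in range(1, 201)]

# Step 1: random selection of a sample from the population.
sample = random.Random(340).sample(population, 20)

# Step 2: random assignment of the sample to the two study conditions.
groups = assign_to_groups(sample, ["crammed", "distributed"], seed=340)

for name, members in groups.items():
    print(name, len(members))
```

An observational version would stop after step 1 and simply record each person's study habits rather than assigning them.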
Experimental Control Our goal: to test the possibility of a relationship between the variability in our IV and how that affects our DV. Control is used to minimize excessive variability and to reduce the potential for confounds.
Logic of experimental control Sources of Total (T) Variability in test performance: T = NonRandom_exp + NonRandom_other + Random. NonRandom_exp: manipulated independent variables (IV). NonRandom_other: variables which covary with the IV (confounds). Random: imprecision in manipulation (IV) & measurement (DV) & randomly varying extraneous variables. In the study-method example: the manipulated IV is study method (crammed vs. distributed); a confound would be distributed studiers never getting to practice problems; other variation includes different study times, different study methods, etc.
Logic of experimental control Sources of Total (T) Variability: T = NonRandom_exp + NonRandom_other + Random. Constrain variability by carefully controlling the levels of the IV; eliminate confounds; use good measures. Experimental procedures are used to reduce R and NR_other so that we can detect NR_exp. That is, so we can see the changes in the DV that are due to the changes in the independent variable(s).
Weight analogy Imagine the different sources of variability as weights. The treatment group carries NR_exp, NR_other, and R; the control group carries only NR_other and R.
Weight analogy If NR_other and R are large relative to NR_exp, then detecting a difference may be difficult.
Weight analogy But if we reduce the size of NR_other and R relative to NR_exp, then detecting a difference gets easier.
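The weight analogy can be illustrated with a small simulation. The sketch below assumes a hypothetical 5-point treatment effect (standing in for NR_exp) and lumps NR_other and R together as one noise term. It uses a t statistic, the difference in group means divided by its standard error, as a rough measure of how detectable the difference is. All numbers and function names are illustrative assumptions.

```python
import random
import statistics

def simulate_scores(n, mean, noise_sd, rng):
    """Draw n test scores around a group mean with the given noise level."""
    return [rng.gauss(mean, noise_sd) for _ in range(n)]

def t_statistic(group_a, group_b):
    """Difference in means divided by its standard error (unpooled variances)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    standard_error = (var_a / len(group_a) + var_b / len(group_b)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

def run_study(noise_sd, effect=5.0, n=30, seed=340):
    """Simulate one two-group study and return its t statistic."""
    rng = random.Random(seed)
    control = simulate_scores(n, 70.0, noise_sd, rng)
    treatment = simulate_scores(n, 70.0 + effect, noise_sd, rng)
    return t_statistic(treatment, control)

# Same treatment effect, different amounts of uncontrolled variability.
t_noisy = run_study(noise_sd=20.0)
t_controlled = run_study(noise_sd=2.0)
print(f"large noise: t = {t_noisy:.2f}")
print(f"small noise: t = {t_controlled:.2f}")
```

Because both runs use the same seed, the only thing that differs is the size of the noise, so the larger t statistic in the second run reflects the reduced variability rather than a luckier sample.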
Logic of observational approaches Suppose that you wish to predict exam performance using an observational method. Sources of Total (T) Variability: T = NonRandom_other + Random. Observe and record variables (hours of sleep, total study time, test time, study topic, breakfast food), but we don’t know in advance which group each will fit into: variables that do covary with test performance, or variables that don’t covary with test performance. That’s what we’ll use statistics to find out.
Logic of observational approaches Total variability in test performance. Total study time: r = .6. Some co-variance between the two variables. If we know the total study time, we can predict 36% of the variance in test performance (r squared = .36). Unexplained variance: 64%.
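The 36% and 64% figures follow from squaring the correlation. A minimal sketch of that arithmetic, using the slide's r = .6 (the function name is just for illustration):

```python
def variance_explained(r):
    """Proportion of variance in one variable predictable from the other."""
    return r ** 2

r_study_time = 0.6
explained = variance_explained(r_study_time)
unexplained = 1 - explained

print(f"explained: {explained:.0%}, unexplained: {unexplained:.0%}")
# prints "explained: 36%, unexplained: 64%"
```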
Logic of observational approaches Total variability in test performance. Total study time: r = .6. Test time: r = .1. Unexplained variance: 51%. A little co-variance between test performance and test time. If we add it to study time, then we can explain more of the variance in test performance.
Logic of observational approaches Total variability in test performance. Breakfast: r = .0. Total study time: r = .6. Test time: r = .1. Unexplained variance: 51%. No co-variance between test performance and breakfast food. If we add it to the other two, then we can NOT explain more of the variance in test performance.
Logic of observational approaches Total variability in test performance. Breakfast: r = .0. Total study time: r = .6. Hrs of sleep: r = .45. Test time: r = .1. Unexplained variance: 40%. Some co-variance between test performance and hours of sleep. If we add it to study time, then we can explain more of the variance in test performance (but notice what happens with the overlap).
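The point about overlap can be made concrete with the standard formula for R squared with two predictors, which needs the correlation between the predictors themselves. The slide does not give the correlation between study time and hours of sleep, so the value 0.3 below is purely hypothetical, and this sketch is not meant to reproduce the percentages on the slide.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R squared for predicting y from two predictors.

    r_y1, r_y2: correlations of each predictor with the outcome.
    r_12: correlation between the two predictors (their overlap).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

r_study, r_sleep = 0.6, 0.45  # correlations with test performance, from the slide

# If the two predictors were uncorrelated, their contributions would simply add.
no_overlap = r_squared_two_predictors(r_study, r_sleep, r_12=0.0)

# Hypothetical overlap: study time and sleep correlate at 0.3 (an assumption).
with_overlap = r_squared_two_predictors(r_study, r_sleep, r_12=0.3)

print(f"no overlap:   {no_overlap:.4f}")
print(f"with overlap: {with_overlap:.4f}")
```

With these values, the overlapping predictors explain less variance together than the sum of their separate squared correlations, because the shared portion is only counted once.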
Statistical analysis follows design Statistical analysis follows from the design of a study The next question in the tree
Statistical analysis follows design Decide if there is a difference: How many separate samples? 1: Testing for a difference between a sample and a known population value, or within-groups designs. 2: Testing for a difference between two samples; various “t-tests”. >2: Testing for a difference between more than two samples; various “ANOVA” designs.
What are samples? Who do we test? Population: the set of all individuals of interest. Typically we don’t have access to all of the population. Sample: a subset of the population from whom data is collected. We test these folks and then generalize the results to the population as a whole.
What are samples? “Sample” may also be used to refer to the participants (randomly) assigned to a particular condition of the experiment (e.g., condition A, B, C, or D).
Statistical analysis follows design Statistical analysis follows from the design of a study Next question in the tree
Statistical analysis follows design Are the samples related or independent? Related: there is a pre-existing relationship between the groups (“non-independent groups”, “matched samples”), or the same subjects participate in multiple conditions (“within-groups”, “repeated-measures”). Independent: there is no pre-existing relationship between the groups (“between-groups”).
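The two tree questions covered so far for "difference" designs (how many samples, and whether they are related or independent) can be written as a small lookup. This is a simplified sketch, not the course's full decision tree: the function name and the exact test labels are assumptions, and `n_samples` here counts the sets of scores being compared.

```python
def choose_difference_test(n_samples, related=False):
    """Suggest a family of tests for a 'difference' design.

    n_samples: number of samples (sets of scores) being compared.
    related: True for matched samples or repeated measures,
             False for independent (between-groups) samples.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if n_samples == 1:
        # One sample compared against a known population value.
        return "one-sample test against a known population value"
    if n_samples == 2:
        return "related-samples t-test" if related else "independent-samples t-test"
    # More than two samples: some form of ANOVA.
    return "repeated-measures ANOVA" if related else "between-groups ANOVA"

print(choose_difference_test(2, related=False))
print(choose_difference_test(3, related=True))
```

Designs with more than one independent variable need further questions from the tree, which this sketch does not cover.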
Example Dr. Charles investigated the impact of three types of video taped teaching programs for two types of subjects (math and Spanish). 12 participants were randomly assigned to one type of teaching program and one subject. After two weeks of training Dr. Charles assessed their learning. What test should he use to analyze his data (which program works best for which subject)?
Using SPSS The design of a study also has an impact on how you need to set up your SPSS data file
Brief review of SPSS Two view windows: Data view This is where you type in all of the data To switch between the views click on the tabs
Brief review of SPSS Two view windows: Variable view This is where you specify the details about the variables
Variable view Name of the variable, limited to 8 characters. Type of variable: numeric, text, monetary, date, etc.
The Data View Each row corresponds to an experimental unit (called “cases” in SPSS lingo) Each column corresponds to a variable So each column in the data view corresponds to a row in the variable view
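The relationship between the two views can be mimicked outside SPSS with plain Python structures. The sketch below is an analogy only, not SPSS syntax: the variable names, the scores, and the `columns_match` helper are made up for illustration.

```python
# Data view: one row per case, one column (key) per variable.
data_view = [
    {"id": 1, "group": "crammed", "score": 72},
    {"id": 2, "group": "crammed", "score": 65},
    {"id": 3, "group": "distrib", "score": 81},
    {"id": 4, "group": "distrib", "score": 78},
]

# Variable view: one row per variable, describing a column of the data view.
variable_view = [
    {"name": "id", "type": "numeric"},
    {"name": "group", "type": "text"},
    {"name": "score", "type": "numeric"},
]

def columns_match(data_rows, variable_rows):
    """Check that every data-view column has exactly one variable-view row."""
    declared = [variable["name"] for variable in variable_rows]
    return all(list(row.keys()) == declared for row in data_rows)

print(columns_match(data_view, variable_view))
```

Here each case is one participant in a between-groups layout, with the condition stored in its own column.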
In-class lab With the remaining time, go ahead and work through the lab. For a few study descriptions, use the decision tree to determine the appropriate statistical test. Download the “majors.sav” datafile and open it up in SPSS.