[Seating chart: Integrated Learning Center (ILC) 120. Screen and lecturer's desk at the front; Rows A through L; computer, storage cabinet, cabinet, and table marked; one broken desk noted.]
Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Fall, 2014 Room 120 Integrated Learning Center (ILC) 10: :50 Mondays, Wednesdays & Fridays.
Reminder A note on doodling
Extra Credit - Due November 24th - There are five parts
1. A one-page report of your design (includes all of the information from the writing assignment)
   Describe your experiment: what is your question / what is your prediction?
   State your Independent Variable (IV), number of levels, and the operational definition
   State your Dependent Variable (DV) and operational definition
   How many participants did you measure, and how did you recruit (sample) them?
   Was this a between- or within-participant design (why?)
2. Gather the data
   Try to get at least 10 people (or data points) per level
   If you are working with other students in the class, you should have 10 data points per level for each member of your group
3. Input data into Excel (hand in data)
4. Complete ANOVA analysis (hand in ANOVA table)
5. Statement of results (see next slide for example), including a graph of your means (just like we did in the homework)
This will be graded on attention to detail, creativity, and interest of topic
Exam 3 – This past Friday Thanks for your patience and cooperation Grades are posted It went really well!
Remember… In a negatively skewed distribution: mean < median < mode
[Figure: distribution of exam scores. x-axis: Score on Exam; y-axis: Frequency]
92 = mode = tallest point; 80 = median = middle score; 76 = mean = balance point
Note: the y-axis is always “frequency”. Note: include both a label and numbers on each axis.
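The ordering mean < median < mode can be checked on a small data set. The sketch below uses Python's standard `statistics` module on made-up exam scores (hypothetical values, not the class's grades) that have a tail of low scores.

```python
import statistics

# Hypothetical exam scores with a tail of low values (negative skew).
scores = [40, 60, 70, 80, 85, 90, 90]

mean = statistics.mean(scores)      # balance point, pulled toward the low tail
median = statistics.median(scores)  # middle score once the data are sorted
mode = statistics.mode(scores)      # most frequent score (tallest point)

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")
print("mean < median < mode:", mean < median < mode)
```

The few low scores drag the mean down more than the median, while the mode stays at the most common score.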
Schedule of readings: Before next exam (Monday, December 8th), please read Chapters 10 - 14, and Chapters 17 and 18 in Plous (Chapter 17: Social Influences; Chapter 18: Group Judgments and Decisions). Study Guide is already up
Next couple of lectures 12/1/14 Use this as your study guide Logic of hypothesis testing with Correlations Interpreting the Correlations and scatterplots Simple and Multiple Regression
Labs continue this week with multiple regression
Homework due - Wednesday (December 3rd): Assignment 20, Completing Correlation Hypothesis Testing using Excel
Correlation
Correlation: a measure of how two variables co-occur; it can also be used for prediction
   Ranges between -1 and +1
   The closer to zero, the weaker the relationship and the worse the prediction
   Positive or negative
Remember, we’ll call the correlation “r”
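As an illustration of where a value of r comes from, here is a minimal sketch of the Pearson correlation computed from deviations around each mean. The function name `pearson_r` and the data are hypothetical, chosen only for this example; in the course itself the calculation is done in Excel.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [55, 60, 70, 72, 85]
print(round(pearson_r(hours, score), 3))
```

The result always falls between -1 and +1. The sketch assumes neither variable is constant, since a variable with no variability would make the denominator zero.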
Positive correlation
Positive correlation: as values on one variable go up, so do values for the other variable
   pairs of observations tend to occupy similar relative positions
   higher scores on one variable tend to co-occur with higher scores on the second variable
   lower scores on one variable tend to co-occur with lower scores on the second variable
   scatterplot shows clusters of points from lower left to upper right
Remember, Correlation = “r”
Negative correlation
Negative correlation: as values on one variable go up, values for the other variable go down
   pairs of observations tend to occupy dissimilar relative positions
   higher scores on one variable tend to co-occur with lower scores on the second variable
   lower scores on one variable tend to co-occur with higher scores on the second variable
   scatterplot shows clusters of points from upper left to lower right
Remember, Correlation = “r”
Zero correlation as values on one variable go up, values for the other variable go... anywhere pairs of observations tend to occupy seemingly random relative positions scatterplot shows no apparent slope
Correlation does not imply causation
Is it possible that they are causally related? Yes, but the correlational analysis does not answer that question
What if it’s a perfect correlation, isn’t that causal? No, it feels more compelling, but it is neutral about causality
Remember the birthday cakes! [Scatterplot of Number of Birthday Cakes and Number of Birthdays]
Correlation - How do numerical values change? [Figure: four scatterplots, each labeled with its r value: r = , r = , r = , r = 0.61]
Correlation: The more closely the dots approximate a straight line, the stronger the relationship is. Perfect correlation = +1.00 or -1.00: one variable perfectly predicts the other, there is no variability in the scatterplot, and the dots fall on a straight line
Finding a statistically significant correlation
The result is “statistically significant” if:
   the observed correlation is larger than the critical correlation (we want our r to be big, either negative or positive, just far away from zero, if it is to be significantly different from zero)
   the p value is less than 0.05, which is our alpha (we want our “p” to be small)
Then we reject the null hypothesis, and we have support for our alternative hypothesis
Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis). Describe the null and alternative hypotheses. For correlation, the null is that r = 0 (no relationship)
Step 2: Decision rule. Alpha level (α = .05 or .01)? Critical statistic (e.g. critical r) value from table? Degrees of freedom: df = n - 2 = # pairs - 2
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis. If observed r is bigger than critical r, then reject null
Step 5: Conclusion: tie findings back in to research problem
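Steps 2 through 4 can be sketched as a small function. This is only an illustration: the name `correlation_test` and the example numbers are hypothetical, and the critical r is passed in as a value read from a critical-r table, not computed here.

```python
def correlation_test(r_observed, n_pairs, r_critical):
    """Steps 2-4: degrees of freedom and the reject / fail-to-reject decision."""
    df = n_pairs - 2  # df = number of pairs - 2
    # Compare the size of r, ignoring its sign, with the critical value.
    reject_null = abs(r_observed) > r_critical
    return df, reject_null

# Hypothetical inputs: 30 pairs, observed r = -0.45, and a critical r of 0.361
# (the usual table value for df = 28, alpha = .05, two-tailed).
df, reject = correlation_test(r_observed=-0.45, n_pairs=30, r_critical=0.361)
print(f"r({df}) = -0.45, reject null: {reject}")
```

Taking the absolute value reflects the point that a significant r can be negative or positive, as long as it is far enough from zero.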
Five steps to hypothesis testing, Problem 1: Is there a relationship between the price and the square feet of a home? We measured 150 homes recently sold
Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis): Is there a relationship between the cost of a home and the size of the home? Describe the null and alternative hypotheses: the null is that there is no relationship (r = 0.0); the alternative is that there is a relationship (r ≠ 0.0)
Step 2: Decision rule: find critical r (from table). Alpha level: α = .05. Degrees of freedom: df = n - 2 = # pairs - 2 = 150 pairs - 2 = 148
Critical r value from table: df = # pairs - 2 = 148, α = .05, critical value r(148) = 0.195
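Critical r tables are derived from the t distribution, so a table entry can be reproduced from a critical t value with r = t / sqrt(t^2 + df). The sketch below is a rough check under stated assumptions: the function name `critical_r` is hypothetical, and the critical t values are typed in from a standard t table (α = .05, two-tailed), not computed.

```python
import math

def critical_r(t_critical, df):
    """Convert a two-tailed critical t value into the matching critical r."""
    return t_critical / math.sqrt(t_critical ** 2 + df)

# Assumed critical t values, read from a standard t table
# (alpha = .05, two-tailed); they are not computed here.
print(round(critical_r(1.984, 100), 3))  # table row for df = 100
print(round(critical_r(1.976, 148), 3))  # df = 148
```

With these inputs the df = 100 row gives about 0.195, the value quoted for this problem, while df = 148 itself gives about 0.160. Printed tables often have no row for df = 148, and falling back to the nearest smaller df is the conservative choice, which may be how 0.195 was obtained.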
Five steps to hypothesis testing Step 3: Calculations
Five steps to hypothesis testing
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis. If observed r is bigger than critical r, then reject null. The observed correlation r(148) is greater than the critical value r(148) = 0.195, so yes, we reject the null
Correlation matrices
Correlation matrix: a table showing correlations for all possible pairs of variables
Variables (rows and columns): Education, Age, IQ, Income
Entries of 1.0** mark a variable correlated with itself; the other correlations shown are 0.65**, 0.52*, 0.27*, 0.41*, 0.38*
* p < 0.05   ** p < 0.01
Remember, Correlation = “r”
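To make the idea of "all possible pairs" concrete, here is a minimal sketch that builds a correlation matrix from a few columns of data. The data values and the helper names `pearson_r` and `correlation_matrix` are hypothetical and are not the numbers from the table above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def correlation_matrix(data):
    """Correlate every variable with every other variable (and with itself)."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

# Hypothetical measurements on six people (made up for illustration).
data = {
    "Education": [12, 16, 14, 18, 12, 20],
    "Age": [25, 40, 31, 52, 46, 38],
    "Income": [30, 62, 45, 80, 41, 95],
}
matrix = correlation_matrix(data)

# Print the matrix with variable names as row and column labels.
names = list(data)
print(" " * 10 + "".join(f"{n:>10}" for n in names))
for a in names:
    print(f"{a:<10}" + "".join(f"{matrix[a][b]:>10.2f}" for b in names))
```

The diagonal is always 1.00 and the matrix is symmetric, which is why printed tables often show only one triangle.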
Correlation matrices: Variable names. Make up any name that means something to you (VARX = “Variable X”, VARY = “Variable Y”, VARZ = “Variable Z”). On the diagonal: correlation of X with X, correlation of Y with Y, correlation of Z with Z
Correlation matrices: Correlation of X with Y, and the p value for the correlation of X with Y. Does this correlation reach statistical significance?
Correlation matrices: Correlation of X with Z, and the p value for the correlation of X with Z. Does this correlation reach statistical significance?
Correlation matrices: Correlation of Y with Z, and the p value for the correlation of Y with Z. Does this correlation reach statistical significance?
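Reading significance off a correlation matrix comes down to comparing each p value with alpha. The sketch below applies the star convention (* p < 0.05, ** p < 0.01) to a few pairs; the r and p values are invented for illustration and do not come from the output on the slides, and `significance_stars` is a hypothetical helper name.

```python
def significance_stars(p_value):
    """Return '**' for p < 0.01, '*' for p < 0.05, and '' otherwise."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""

# Hypothetical (r, p) pairs for the three variable pairings.
results = {
    ("VARX", "VARY"): (0.42, 0.003),
    ("VARX", "VARZ"): (0.21, 0.048),
    ("VARY", "VARZ"): (0.08, 0.610),
}

for (a, b), (r, p) in results.items():
    print(f"{a} with {b}: r = {r:.2f}{significance_stars(p)} (p = {p:.3f})")
```

A pair with no star has a p value of 0.05 or more, so for that pair we would not reject the null hypothesis of no relationship.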
What do we care about? Correlation matrices
What do we care about? We measured the following characteristics of 150 homes recently sold Price Square Feet Number of Bathrooms Lot Size Median Income of Buyers