Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2016 Room 150 Harvill Building 9:00 - 9:50 Mondays, Wednesdays & Fridays.
Schedule of readings
Before next exam (February 12th):
Please read chapters in OpenStax textbook
Please read Appendix D, E & F online (on the syllabus these are referred to as online readings 1, 2 & 3)
Please read Chapters 1, 5, 6 and 13 in Plous
Chapter 1: Selective Perception
Chapter 5: Plasticity
Chapter 6: Effects of Question Wording and Framing
Chapter 13: Anchoring and Adjustment
Remember to bring your writing assignment forms, notebook and clickers to each lecture. Register your clicker by February 1st and receive extra credit! student.turningtechnologies.com (Please note there is no “www”)
Everyone will want to be enrolled in one of the lab sessions Labs continue next week
Project 1 - Likert Scale - Correlations - Comparing two means (bar graph) Questions?
By the end of lecture today (1/29/16). Use this as your study guide:
Frequency distributions and frequency tables
Guidelines for constructing frequency distributions
1. Classes should be mutually exclusive
2. Set of classes should be exhaustive
3. All classes should have equal intervals
4. Selecting number of classes is subjective (5-15 will often work)
5. Class width should be round (easy) numbers
6. Try to avoid open-ended classes
Cumulative Frequency
Relative Frequency and percentages
Predicting frequency of larger sample based on relative frequency
Pie Charts
Relative Cumulative Frequency
Homework Assignment 5: Frequency Tables and Graphing with Excel. Please print out and complete this homework worksheet and hand it in during class on Monday. Due: Monday, February 1st
Homework review
You are looking to see if “class standing” affects the “level of sales”.
Independent variable (IV): Class standing
Number of levels of IV (how many means?): 4
Quasi or true experiment: Quasi
Dependent variable: Level of sales
Between or within participant design: Between
In this study, what is the operational definition of “class standing”? Classification based on units earned
In this study, what is the operational definition of “level of sales”? Number of bags of peanuts sold
Homework review
You are looking to see whether “type of program” has an effect on “body transformation”. Please identify the following variables:
Independent variable (IV): Type of program
Number of levels of IV (how many means?): 2
Quasi or true experiment: True
Dependent variable: Body transformation
Between or within participant design: Between
What is the operational definition of “type of program”? Type of diet (regular versus programmatic diet)
What is the operational definition of “body transformation”? Number of pounds lost
Homework review
You are looking to see which driving choice is most efficient. So you ask each driver to drive each of the three routes and time themselves on how long it takes. Please identify the following variables:
Independent variable (IV): Type of route
Number of levels of IV (how many means?): 3
Dependent variable: Driving efficiency
Between or within participant design: Within
What is the operational definition of “driving efficiency”? Travel time (measured in minutes)
What is the operational definition of “driving choice”? Route taken
Homework review
Notice that the operational definition of each construct matters
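Homework review
Independent variable (IV): gender
Number of levels of IV: 2
Quasi or true experiment: quasi
Dependent variable: salary
Between or within participant design: between
Scale of measurement: gender is nominal, salary is ratio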
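Homework review
Independent variable (IV): name of city
Number of levels of IV: 3
Quasi or true experiment: quasi-experiment
Dependent variable: temperature
Between or within participant design: between
Scale of measurement: city is nominal, temperature is interval
Homework must be complete and must be stapled. Hand in your homework.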
You’ve gathered your data…what’s the best way to display it??
Describing Data Visually Lists of numbers too hard to see patterns Organizing numbers helps Graphical representation even more clear This is a dot plot
Describing Data Visually: measuring the “frequency of occurrence”. We’ve got to put these data into groups (“bins”), then figure the “frequency of occurrence” for each of the bins.
Frequency distributions: an organized list of observations and their frequency of occurrence. How many kids are in your family? What is the most common family size?
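The idea of an ungrouped frequency distribution can be sketched in a few lines of Python. This is only an illustration: the survey answers below are made up (the slide's actual class data is not available), and `frequency_distribution` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical survey answers: number of kids in each respondent's family.
kids_per_family = [2, 1, 3, 2, 4, 2, 1, 3, 2, 5, 2, 3]


def frequency_distribution(observations):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(observations)
    return sorted(counts.items())


table = frequency_distribution(kids_per_family)
for value, freq in table:
    print(f"{value} kids: {freq}")

# The most common family size is the value with the highest frequency.
most_common_size = max(table, key=lambda pair: pair[1])[0]
print("Most common family size:", most_common_size)
```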
Another example: How many kids in your family? Number of kids in family
Frequency distributions
Crucial guidelines for constructing frequency distributions:
1. Classes should be mutually exclusive: each observation should be represented only once (no overlap between classes).
2. Set of classes should be exhaustive: it should include all possible data values (no data points should fall outside the range).
3. All classes should have equal intervals (even if the frequency for that class is zero).
Example: How many kids are in your family? What is the most common family size?
[Tables: wrong and correct sets of classes for “Number of kids in family”, written in the form “0 - under ...”. One wrong set leaves no place for our families of 4, 5, 6 or 7.]
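Guidelines 1 to 3 can be checked mechanically. The sketch below assumes classes are written as (lower, upper) pairs meaning "lower to under upper"; the function name `check_classes` and the example classes are hypothetical, not taken from the slides.

```python
def check_classes(classes, data):
    """Check guidelines 1-3 for classes given as (lower, upper) pairs,
    where each class means 'lower to under upper'."""
    ordered = sorted(classes)
    # 1. Mutually exclusive: no class starts before the previous one ends.
    exclusive = all(prev[1] <= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))
    # 2. Exhaustive: every observation falls into at least one class.
    exhaustive = all(any(lo <= x < hi for lo, hi in ordered) for x in data)
    # 3. Equal intervals: every class has the same width.
    equal = len({hi - lo for lo, hi in ordered}) == 1
    return {
        "mutually_exclusive": exclusive,
        "exhaustive": exhaustive,
        "equal_intervals": equal,
    }


# Hypothetical family sizes, including families of 4, 5, 6 and 7 kids.
family_sizes = [0, 1, 2, 3, 4, 5, 6, 7]

too_few = check_classes([(0, 2), (2, 4)], family_sizes)
overlapping = check_classes([(0, 3), (2, 5), (5, 8)], family_sizes)
good = check_classes([(0, 2), (2, 4), (4, 6), (6, 8)], family_sizes)

print(too_few)      # not exhaustive: no place for families of 4 to 7
print(overlapping)  # not mutually exclusive, and widths differ
print(good)         # passes all three checks
```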
4. Selecting number of classes is subjective. Generally 5-15 classes will often work. How about 6 classes (“bins”)? How about 8 classes (“bins”)? How about 16 classes (“bins”)?
5. Class width should be round (easy) numbers. Clear & easy round numbers: 5, 10, 15, 20 etc. or 3, 6, 9, 12 etc. The lower boundary can be a multiple of the interval size.
6. Try to avoid open-ended classes, for example: 10 and above, greater than 100, less than 50.
Remember: This is all about helping readers understand quickly and clearly.
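One way to get round class limits is to start the first class at a multiple of the interval size. The sketch below assumes hypothetical exam scores running from 53 to 99 and a hypothetical helper `class_boundaries`; classes are "lower to under upper", so none is open ended.

```python
def class_boundaries(smallest, largest, width):
    """Return 'lower to under upper' classes whose lower limits are
    multiples of the interval size (width)."""
    # Round the smallest value down to the nearest multiple of the width.
    lower = (smallest // width) * width
    bounds = []
    while lower <= largest:
        bounds.append((lower, lower + width))
        lower += width
    return bounds


# Hypothetical range of exam scores: smallest 53, largest 99.
bins = class_boundaries(53, 99, 5)
for lo, hi in bins:
    print(f"{lo} - under {hi}")
```

With these assumed values the first class starts at 50 rather than 53, and every class has a closed lower limit and a stated upper limit.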
Let’s do one: Scores on an exam
Step 1: List scores
Step 2: List scores in order
Step 3: Decide whether grouped or ungrouped. If less than 10 groups, “ungrouped” is fine. If more than 10 groups, “grouped” might be better.
How to figure how many values: largest number - smallest number + 1 = 47
Step 4: Generate number and size of intervals (or size of bins)
[Table: suggested number of classes by sample size (n), from n = 10 up to n = 1,024; the remaining values were not recoverable]
If we have 6 bins, we’d have intervals of 8. Whaddya think? Would intervals of 5 be easier to read? Let’s just try it and see which we prefer…
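The arithmetic in Step 4 can be sketched as follows. The smallest and largest scores (53 and 99) are assumptions chosen only so that the number of possible values comes out to the 47 on the slide, and rounding the interval size up to a whole number is also an assumption of this sketch.

```python
import math


def interval_size(smallest, largest, number_of_bins):
    """Return (number of possible values, interval size rounded up)."""
    # Count every possible value, including both end points.
    span = largest - smallest + 1
    return span, math.ceil(span / number_of_bins)


# Hypothetical smallest and largest exam scores.
span, width_for_6_bins = interval_size(53, 99, 6)
_, width_for_10_bins = interval_size(53, 99, 10)

print("Possible values:", span)
print("6 bins  -> interval of", width_for_6_bins)
print("10 bins -> interval of", width_for_10_bins)
```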
Scores on an exam [two frequency tables (Score, Frequency): 10 bins with an interval of 5, and 6 bins with an interval of 8] Let’s just try it and see which we prefer… Remember: This is all about helping readers understand quickly and clearly.
Scores on an exam [frequency table: Score, Frequency] Let’s make a frequency histogram using 10 bins and a bin width of 5!
Scores on an exam [frequency table: Score, Frequency; 6 bins, interval of 8]
Step 6: Complete the frequency table by adding columns for cumulative frequency, relative frequency and relative cumulative frequency.
Cumulative frequency: just add up the frequency data from the smallest to the largest numbers.
Relative frequency: just divide each frequency by the total number to get a ratio (like a percent). Please note: 1/28 = .0357 and 4/28 = .1429.
Relative cumulative frequency: just add up the relative frequency data from the smallest to the largest numbers. Please note: this is the same as dividing each cumulative frequency by the total number, e.g. 1/28 = .0357 and 5/28 = .1786.
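The three extra columns can be computed directly from the frequency column. In this sketch the six frequencies are hypothetical (the slide's table did not survive); they were only chosen to add up to the total of 28 used on the slide.

```python
from itertools import accumulate

# Hypothetical frequencies for 6 bins, smallest scores first (total = 28).
frequencies = [1, 4, 5, 8, 7, 3]
total = sum(frequencies)

# Cumulative frequency: running total from the smallest bin upward.
cumulative = list(accumulate(frequencies))

# Relative frequency: each frequency divided by the total.
relative = [round(f / total, 4) for f in frequencies]

# Relative cumulative frequency: each cumulative frequency divided by the total.
relative_cumulative = [round(c / total, 4) for c in cumulative]

print("Freq  Cum  Rel     RelCum")
for f, c, r, rc in zip(frequencies, cumulative, relative, relative_cumulative):
    print(f"{f:<5} {c:<4} {r:<7.4f} {rc:.4f}")
```

The last cumulative frequency always equals the total, and the last relative cumulative frequency is always 1, which makes a quick check on the table.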
Simple Frequency Table: Qualitative Data
Data based on Gallup poll on 8/24/11
We asked 100 Democrats “Who is your favorite candidate?” If 22 million Democrats voted today, how many would vote for each candidate?

Candidate          Frequency   Relative frequency   Percent   Number expected to vote
Hillary Clinton    45          .4500                45%       9,900,000
Bernie Sanders     23          .2300                23%       5,060,000
Joe Biden          17          .1700                17%       3,740,000
Jim Webb           1           .0100                1%        220,000
Other/Undecided    14          .1400                14%       3,080,000

Relative frequency: just divide each frequency by the total number. Please note: 45/100 = .4500, 23/100 = .2300, 17/100 = .1700, 1/100 = .0100
Percent: just multiply each relative frequency by 100. Please note: .4500 x 100 = 45%, .2300 x 100 = 23%, .1700 x 100 = 17%, .0100 x 100 = 1%
Number expected to vote: just multiply each relative frequency by 22 million. Please note: .4500 x 22m = 9,900k, .2300 x 22m = 5,060k, .1700 x 22m = 3,740k, .0100 x 22m = 220k
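The same calculation can be written as a short script. The frequencies and the 22 million figure come from the slide; the variable names are illustrative, and the projection uses whole-number arithmetic (an implementation choice of this sketch) so the vote counts come out exact.

```python
# Frequencies from the slide: 100 Democrats asked for their favorite candidate.
poll = {
    "Hillary Clinton": 45,
    "Bernie Sanders": 23,
    "Joe Biden": 17,
    "Jim Webb": 1,
    "Other/Undecided": 14,
}
total = sum(poll.values())
voters = 22_000_000  # size of the larger group we are predicting for

projected = {}
for candidate, freq in poll.items():
    relative = freq / total            # relative frequency
    percent = relative * 100           # relative frequency as a percent
    # Same as relative frequency x 22 million, kept in whole numbers.
    projected[candidate] = freq * voters // total
    print(f"{candidate:<16} {relative:.4f}  {percent:3.0f}%  {projected[candidate]:>10,}")
```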
Scores on an exam [frequency table (Score, Frequency) and frequency histogram; x-axis: Score on exam]
Remember dot plots
Step 1: List scores
Step 2: List scores in order
Step 3: Decide grouped
Step 4: Decide 10 for # bins (classes) and 5 for bin width (interval size)
Step 5: Generate frequency histogram
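Steps 1 to 5 can be sketched end to end with a text histogram. The 28 scores below are invented for illustration (the slide's scores are not available), and `grouped_frequencies` is a hypothetical helper; the 10 bins of width 5 follow the slide.

```python
def grouped_frequencies(scores, start, width, number_of_bins):
    """Count how many scores fall into each bin of the given width."""
    counts = [0] * number_of_bins
    for score in scores:
        index = (score - start) // width
        if not 0 <= index < number_of_bins:
            raise ValueError(f"score {score} falls outside the bins")
        counts[index] += 1
    return counts


# Step 1: list scores (hypothetical data, 28 students).
scores = [82, 67, 91, 75, 53, 88, 72, 96, 80, 64, 77, 85, 58, 93,
          71, 84, 99, 62, 76, 87, 73, 90, 68, 81, 78, 94, 82, 86]

# Step 2: list scores in order.
ordered = sorted(scores)

# Steps 3 and 4: grouped, with 10 bins (classes) of width 5 starting at 50.
start, width, number_of_bins = 50, 5, 10
counts = grouped_frequencies(ordered, start, width, number_of_bins)

# Step 5: generate a frequency histogram, one star per score.
for i, count in enumerate(counts):
    lower = start + i * width
    print(f"{lower}-{lower + width - 1}: {'*' * count}")
```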