Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Fall, 2014 Room 120 Integrated.

Presentation transcript:

Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Fall, 2014 Room 120 Integrated Learning Center (ILC) 10: :50 Mondays, Wednesdays & Fridays.

Preview of Questionnaire Homework. There are four parts: (1) Statement of Objectives; (2) Questionnaire itself (which is the operational definitions of the objectives); (3) Data collection and creation of database; (4) Creation of graphs representing results. Hand in questionnaire project (Homework assignments 3 & 4).

Everyone will want to be enrolled in one of the lab sessions. Labs next week. Remember: Bring an electronic copy of your data (flash drive or email it to yourself). Your data should have correct formatting. See the Lab Materials link on the class website to double-check that the formatting of your Excel file is exactly consistent.

Schedule of readings. Before next exam (September 26th): Please read chapters in Ha & Ha textbook. Please read Appendix D, E & F online (on the syllabus this is referred to as online readings 1, 2 & 3). Please read Chapters 1, 5, 6 and 13 in Plous. Chapter 1: Selective Perception. Chapter 5: Plasticity. Chapter 6: Effects of Question Wording and Framing. Chapter 13: Anchoring and Adjustment.

Reminder A note on doodling

By the end of lecture today 9/12/14 Use this as your study guide Dot Plots Frequency Distributions - Frequency Histograms Frequency, relative frequency Guidelines for constructing frequency distributions Correlational methodology Positive, Negative and Zero correlation

Homework due Monday (September 15th). On class website: please print and complete homework worksheet #5.

Descriptive statistics - organizing and summarizing data Descriptive vs inferential statistics Inferential statistics - generalizing beyond actual observations making “inferences” based on data collected Sample versus population

Descriptive or inferential? Descriptive statistics: organizing and summarizing data. Inferential statistics: generalizing beyond actual observations, making “inferences” based on data collected.
What is the average height of the basketball team? Measured all of the players and reported the average height. Measured only a sample of the players and reported the average height for the team.
In this class, what percentage of students support the death penalty? Measured all of the students in class and reported the percentage who said “yes”. Measured only a sample of the students in class and reported the percentage who said “yes”.
Based on the data collected from the students in this class, we can conclude that 60% of the students at this university support the death penalty.

Descriptive or inferential? Descriptive statistics: organizing and summarizing data. Inferential statistics: generalizing beyond actual observations, making “inferences” based on data collected.
Claim: Men are in general taller than women. Data: Measured all of the citizens of Arizona and reported heights.
Claim: Shoe size is not a good predictor of intelligence. Data: Measured all of the shoe sizes and IQ of students of 20 universities.
Claim: Blondes have more fun. Data: Asked 500 actresses to complete a happiness survey.
Claim: The average age of students at the U of A is 21. Data: Asked all students in the fraternities and sororities their age.

Descriptive statistics - organizing and summarizing data Descriptive vs inferential statistics Inferential statistics - generalizing beyond actual observations making “inferences” based on data collected To determine this we have to consider the methodologies used in collecting the data

You’ve gathered your data…what’s the best way to display it??

Describing Data Visually Lists of numbers too hard to see patterns Organizing numbers helps Graphical representation even more clear This is a dot plot

Describing Data Visually. Measuring the “frequency of occurrence”: we’ve got to put these data into groups (“bins”), then figure the “frequency of occurrence” for the bins.

Frequency distributions Frequency distributions an organized list of observations and their frequency of occurrence How many kids are in your family? What is the most common family size?
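The definition above can be illustrated with a minimal sketch. The survey answers below are hypothetical (they are not the class data), and `frequency_distribution` is an illustrative helper name rather than anything from the lecture.

```python
from collections import Counter

# Hypothetical answers to "How many kids are in your family?"
kids_per_family = [2, 1, 3, 2, 4, 2, 1, 3, 2, 5, 1, 2]


def frequency_distribution(observations):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(observations)
    return sorted(counts.items())


table = frequency_distribution(kids_per_family)

# The most common family size is the value with the highest frequency.
most_common_size, _ = Counter(kids_per_family).most_common(1)[0]

for value, freq in table:
    print(f"{value:>2} kids: {freq}")
print("Most common family size:", most_common_size)
```

Reading the table from top to bottom gives the organized list of observations; the largest frequency answers the "most common family size" question.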

Another example: How many kids in your family? Number of kids in family

Frequency distributions. Crucial guidelines for constructing frequency distributions: 1. Classes should be mutually exclusive: each observation should be represented only once (no overlap between classes). 2. The set of classes should be exhaustive: it should include all possible data values (no data points should fall outside the range). Example: How many kids are in your family? What is the most common family size? (Tables of wrong and correct class sets for number of kids in family; in the wrong set for guideline 2 there is no place for our family of 14!)

Frequency distributions. Crucial guidelines for constructing frequency distributions: 3. All classes should have equal intervals (even if the frequency for that class is zero). Example: How many kids are in your family? What is the most common family size? (Tables of wrong and correct class sets for number of kids in family.)
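The first three guidelines (mutually exclusive, exhaustive, equal intervals) can be checked mechanically. The sketch below assumes each class is written as "lower - under upper", that is, a pair `(lower, upper)` covering `lower <= x < upper`; the class sets, the data, and the function name `check_classes` are all hypothetical.

```python
def check_classes(classes, data):
    """Check a set of (lower, upper) classes, each meaning lower <= x < upper."""
    ordered = sorted(classes)
    # Mutually exclusive: each class ends at or before the next one starts.
    mutually_exclusive = all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:]))
    # Exhaustive: every data value falls inside at least one class.
    exhaustive = all(any(lo <= x < hi for lo, hi in ordered) for x in data)
    # Equal intervals: every class has the same width.
    equal_intervals = len({hi - lo for lo, hi in ordered}) == 1
    return {
        "mutually_exclusive": mutually_exclusive,
        "exhaustive": exhaustive,
        "equal_intervals": equal_intervals,
    }


# Hypothetical family sizes, including one family of 14.
family_sizes = [1, 2, 2, 3, 4, 6, 14]

good = check_classes([(0, 5), (5, 10), (10, 15)], family_sizes)
too_short = check_classes([(0, 5), (5, 10)], family_sizes)
overlapping = check_classes([(0, 5), (4, 10), (10, 15)], family_sizes)

print(good)
print(too_short)    # not exhaustive: no place for the family of 14
print(overlapping)  # classes overlap and widths differ
```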

4. Selecting number of classes is subjective Generally will often work How about 6 classes? (“bins”) How about 8 classes? (“bins”) How about 16 classes? (“bins”)

5. Class width should be round (easy) numbers 6. Try to avoid open ended classes For example 10 and above Greater than 100 Less than 50 Clear & Easy Round numbers: 5, 10, 15, 20 etc or 3, 6, 9, 12 etc Lower boundary can be multiple of interval size Remember: This is all about helping readers understand quickly and clearly.

Let’s do one: scores on an exam.
Step 1: List scores
Step 2: List scores in order
Step 3: Decide whether grouped or ungrouped. If fewer than 10 groups, “ungrouped” is fine; if more than 10 groups, “grouped” might be better. How to figure how many values: largest number - smallest number + 1 = 47
Step 4: Generate number and size of intervals (or size of bins). (A table of number of classes by sample size (n), for sample sizes up to 1,024, appeared here.) If we have 6 bins, we’d have intervals of 8. Whaddya think? Would intervals of 5 be easier to read? Let’s just try it and see which we prefer…
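The arithmetic in Steps 3 and 4 can be sketched as follows. The slide gives only the result (47 possible values); the smallest and largest scores used here (53 and 99) are assumptions chosen to reproduce that number, and rounding the interval size up to a whole number is also an assumption.

```python
import math


def score_range(scores):
    """Number of possible values: largest number - smallest number + 1."""
    return max(scores) - min(scores) + 1


def interval_size(value_range, num_bins):
    """Smallest whole-number interval that covers the range with num_bins bins."""
    return math.ceil(value_range / num_bins)


# Hypothetical scores; only the smallest (53) and largest (99) matter here.
scores = [53, 99, 75, 88, 61]

values = score_range(scores)
print(values)                     # 47
print(interval_size(values, 6))   # 8
print(interval_size(values, 10))  # 5
```

Both candidate layouts from the slide fall out of the same calculation: 6 bins need intervals of 8, and 10 bins need intervals of 5.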

Scores on an exam: two frequency tables (Score, Frequency) for the same scores, one with 10 bins (interval of 5) and one with 6 bins (interval of 8). Let’s just try it and see which we prefer… Remember: This is all about helping readers understand quickly and clearly.

Scores on an exam (frequency table: Score, Frequency). Let’s make a frequency histogram using 10 bins and bin width of 5!!
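A rough sketch of how scores could be sorted into 10 bins of width 5 and drawn as a text histogram. The scores and the lowest bin boundary (50) are hypothetical; the slide's actual data are not recoverable from the transcript.

```python
def bin_scores(scores, lowest, width, num_bins):
    """Count how many scores fall in each bin of equal width."""
    counts = [0] * num_bins
    for score in scores:
        index = (score - lowest) // width
        if not 0 <= index < num_bins:
            # Classes must be exhaustive: every score needs a bin.
            raise ValueError(f"score {score} falls outside the bins")
        counts[index] += 1
    return counts


# Hypothetical exam scores.
scores = [53, 58, 61, 64, 67, 67, 72, 75, 78, 81, 84, 88, 91, 95, 99]
lowest, width, num_bins = 50, 5, 10

counts = bin_scores(scores, lowest, width, num_bins)

# One row per bin, one star per score in that bin.
for i, count in enumerate(counts):
    lower = lowest + i * width
    upper = lower + width - 1
    print(f"{lower:>3} - {upper:<3} {'*' * count}")
```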

Scores on an exam. Step 6: Complete the frequency table (columns: Score, Frequency, Cumulative Frequency, Relative Frequency, Relative Cumulative Frequency; bins with an interval of 8). Cumulative frequency: just adding up the frequency data from the smallest to largest numbers. Relative frequency: just dividing each frequency by the total number to get a ratio (like a percent). Please note: 1/28 = .0357 and 4/28 = .1429. Relative cumulative frequency: just adding up the relative frequency data from the smallest to largest numbers; this is also just dividing each cumulative frequency by the total number. Please note: 1/28 = .0357 and 5/28 = .1786.
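The three derived columns can be computed as in the sketch below. The frequencies are hypothetical; they were chosen only so that they total 28, matching the denominators in the slide's notes.

```python
from itertools import accumulate

# Hypothetical class frequencies, smallest scores first (total = 28).
frequencies = [1, 4, 5, 9, 6, 3]
total = sum(frequencies)

# Cumulative frequency: running total from the smallest to the largest class.
cumulative = list(accumulate(frequencies))

# Relative frequency: each frequency divided by the total number.
relative = [round(f / total, 4) for f in frequencies]

# Relative cumulative frequency: each cumulative frequency divided by the total.
relative_cumulative = [round(c / total, 4) for c in cumulative]

for row in zip(frequencies, cumulative, relative, relative_cumulative):
    print("{:>3} {:>3} {:.4f} {:.4f}".format(*row))
```

The last relative cumulative frequency is always 1, which is a quick check that no observation was dropped.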

Scores on an exam: cumulative frequency data (columns: Score, Frequency, Cumulative Frequency, Relative Frequency, Cumulative Rel. Freq.), shown as a cumulative frequency histogram. Where are we?

Scores on an exam. Step 1: List scores. Step 2: List scores in order. Step 3: Decide grouped. Step 4: Decide 10 for # bins (classes), 5 for bin width (interval size). Step 5: Generate frequency histogram (x-axis: Score on exam).

Scores on an exam: generate a frequency polygon (x-axis: Score on exam). Plot the midpoint of each histogram interval, then connect the midpoints.

Scores on an exam: generate a frequency ogive (“oh-jive”). A frequency ogive is used for cumulative data (x-axis: Score on exam; y-axis: Cumulative Frequency). Plot the midpoint of each histogram interval, then connect the midpoints.
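The points to plot for a frequency polygon and for an ogive, following the midpoint approach described on these slides, can be sketched as below. The classes and frequencies are hypothetical, and only the coordinates are computed (no plotting library is assumed).

```python
from itertools import accumulate

# Hypothetical classes: (lower limit, upper limit, frequency), interval of 5.
classes = [
    (50, 54, 1),
    (55, 59, 4),
    (60, 64, 5),
    (65, 69, 9),
    (70, 74, 6),
    (75, 79, 3),
]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [freq for _, _, freq in classes]

# Frequency polygon: plot (midpoint, frequency) and connect the points.
polygon_points = list(zip(midpoints, frequencies))

# Ogive: plot (midpoint, cumulative frequency) and connect the points.
ogive_points = list(zip(midpoints, accumulate(frequencies)))

print(polygon_points)
print(ogive_points)
```

Because the ogive uses running totals, its points never go down from left to right, and the last one equals the sample size.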

Pareto Chart: Categories are displayed in descending order of frequency
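The ordering that defines a Pareto chart can be sketched in a few lines. The category counts below are hypothetical, and `pareto_order` is an illustrative helper name.

```python
# Hypothetical counts of students by major (illustrative only).
major_counts = {
    "Sociology": 12,
    "Communication": 30,
    "Geography": 7,
    "Political Science": 21,
}


def pareto_order(counts):
    """Return (category, frequency) pairs in descending order of frequency."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


for category, freq in pareto_order(major_counts):
    print(f"{category:<18} {'#' * freq} {freq}")
```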

Stacked Bar Chart: Bar Height is the sum of several subtotals

Simple Line Charts: Often used for time series data (continuous data) (the space between data points implies a continuous flow) Note: Can use a two-scale chart with caution Note: Fewer grid lines can be more effective Note: For multiple variables lines can be better than bar graph

Pie Charts: General idea of data that must sum to a total (these are problematic and overused; use with much caution). Bar charts can often be more effective. Exploded 3-D pie charts look cool, but a simple 2-D chart may be more clear.

Simple Frequency Table: Qualitative Data (data based on Gallup poll on 8/24/11). We asked 100 Republicans “Who is your favorite candidate?”
Candidate          Frequency  Relative Frequency  Percent  Number expected to vote
Rick Perry         29         .2900               29%      6,380,000
Mitt Romney        17         .1700               17%      3,740,000
Ron Paul           13         .1300               13%      2,860,000
Michelle Bachman   10         .1000               10%      2,200,000
Herman Cain         4         .0400                4%        880,000
Newt Gingrich       4         .0400                4%        880,000
No preference      23         .2300               23%      5,060,000
Relative frequency: just divide each frequency by the total number. Please note: 29/100 = .2900, 17/100 = .1700, 13/100 = .1300, 4/100 = .0400.
Percent: just multiply each relative frequency by 100. Please note: .2900 x 100 = 29%, .1700 x 100 = 17%, .1300 x 100 = 13%, .0400 x 100 = 4%.
If 22 million Republicans voted today, how many would vote for each candidate? Just multiply each relative frequency by 22 million. Please note: .2900 x 22m = 6,380k, .1700 x 22m = 3,740k, .1300 x 22m = 2,860k, .0400 x 22m = 880k.
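The three calculations on this slide (relative frequency, percent, expected number of voters) can be reproduced with the sketch below. The frequencies and the 22 million figure come from the slide; the variable names are illustrative, and integer arithmetic is used so the expected counts come out exact.

```python
# Frequencies from the slide (100 respondents in total).
poll = {
    "Rick Perry": 29,
    "Mitt Romney": 17,
    "Ron Paul": 13,
    "Michelle Bachman": 10,
    "Herman Cain": 4,
    "Newt Gingrich": 4,
    "No preference": 23,
}

total = sum(poll.values())
voters = 22_000_000

# Relative frequency: frequency divided by the total number.
relative = {name: freq / total for name, freq in poll.items()}

# Percent: relative frequency multiplied by 100.
percent = {name: 100 * freq / total for name, freq in poll.items()}

# Expected votes: relative frequency multiplied by 22 million.
expected_votes = {name: freq * voters // total for name, freq in poll.items()}

for name in poll:
    print(f"{name:<17} {relative[name]:.4f} {percent[name]:>3.0f}% {expected_votes[name]:>10,}")
```

For Rick Perry this gives .2900 x 22 million = 6,380,000, and the expected counts across all rows add back up to 22 million.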

Designed our study / observation / questionnaire Collected our data Organize and present our results

Scatterplot displays relationships between two continuous variables. Correlation: measure of how two variables co-occur; it can also be used for prediction. Range between -1 and +1. The closer to zero, the weaker the relationship and the worse the prediction. Positive or negative.

Correlation: range between -1 and +1. At +1 or -1: perfect relationship = perfect predictor. Far from zero: strong relationship = good predictor. Close to zero: weak relationship = poor predictor. At 0: no relationship = very poor predictor.

Height of Mothers by Height of Daughters: positive correlation (scatterplot axes: Height of Daughters, Height of Mothers). Positive correlation: as values on one variable go up, so do values for the other variable. Negative correlation: as values on one variable go up, the values for the other variable go down.

Brushing teeth by number of cavities: negative correlation (scatterplot axes: Number of Cavities, Brushing Teeth). Positive correlation: as values on one variable go up, so do values for the other variable. Negative correlation: as values on one variable go up, the values for the other variable go down.

Perfect correlation = +1.00 or -1.00. One variable perfectly predicts the other. Positive correlation: height in inches and height in feet. Negative correlation: speed (mph) and time to finish race.

Correlation. Perfect correlation = +1.00 or -1.00: one variable perfectly predicts the other, there is no variability in the scatterplot, and the dots form a straight line. The more closely the dots approximate a straight line (the less spread out they are), the stronger the relationship is.
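A minimal sketch of how a correlation in the range -1 to +1 can be computed. It uses the standard Pearson formula, which the slides do not name, so treating it as the intended measure is an assumption; all data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(spread_x * spread_y)


# Perfect positive correlation: the same heights in inches and in feet.
heights_in = [60, 62, 65, 68, 72]
heights_ft = [h / 12 for h in heights_in]
r_perfect = pearson_r(heights_in, heights_ft)

# Hypothetical negative correlation: more brushing, fewer cavities.
brushing_per_day = [0, 1, 2, 3, 4]
cavities = [6, 4, 3, 1, 1]
r_negative = pearson_r(brushing_per_day, cavities)

print(round(r_perfect, 4))
print(round(r_negative, 4))
```

The first pair falls exactly on a straight line, so the result is +1 up to floating-point rounding; the second pair is scattered around a downward line, so the result is negative but not exactly -1.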

Correlation