Sociology 690 – Data Analysis: Simple Quantitative Data Analysis


Four Issues in Describing Quantity
1. Grouping/Graphing Quantitative Data
2. Describing Central Tendency
3. Describing Variation
4. Describing Co-variation

1. Grouping Quantitative Data
– Intervals and Real Limits
– Widths and Midpoints
– Graphing Grouped Data
If there are a large number of quantitative scores, one would not simply create a raw-score frequency distribution: it would contain too many unique scores and therefore fail the goal of data reduction.

Grouping Data - Intervals
To group quantitative data, three rules are followed:
– 1. Make the intervals no wider than the amount of information you are willing to lose.
– 2. Make the interval widths multiples of five.
– 3. Make the intervals few enough that the distribution can be taken in at a glance.

Grouping Data – Intervals Example
If these are the scores on a midterm: {9, 13, 18, 19, 22, 25, 31, 34, 35, 36, 36, 38, 41, 43, 44, 45}
The corresponding grouped frequency distribution would look like:

Interval (i)    f_i
 1–10            1
11–20            3
21–30            2
31–40            6
41–50            4
Total           16
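The tally on this slide can be sketched in a few lines of Python. This is only an illustration: the width-10 stated intervals (1–10 through 41–50) are an assumption inferred from the midpoints used on the later slides, and the function name `grouped_frequencies` is hypothetical.

```python
# Midterm scores from the slide.
scores = [9, 13, 18, 19, 22, 25, 31, 34, 35, 36, 36, 38, 41, 43, 44, 45]

# Assumed stated limits: width-10 intervals 1-10, 11-20, ..., 41-50.
intervals = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]


def grouped_frequencies(values, bins):
    """Count how many values fall inside each (low, high) stated interval."""
    return [sum(low <= v <= high for v in values) for low, high in bins]


freqs = grouped_frequencies(scores, intervals)
for (low, high), f in zip(intervals, freqs):
    print(f"{low:>2}-{high:<2}  f = {f}")
print("Total", sum(freqs))  # Total 16
```

Sixteen raw scores collapse into five rows, which is the data reduction the grouping rules aim for.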

Grouping Data - Real Limits
Stated intervals such as 1–10 and 11–20 leave “gaps” between them, which implies the need for real limits. The real limits of an interval lie one-half unit of measurement below and above its stated limits. For example:
– the interval 11–20 becomes 10.5–20.5
– the interval 3.5–4.5 becomes 3.45–4.55

Grouped Data – Width and Midpoint
The width of an interval is simply the difference between the upper and lower real limits, e.g. 20.5 - 10.5 = 10.
The midpoint is determined by calculating the interval width, dividing it by 2, and adding that number to the lower real limit, e.g. 10/2 + 10.5 = 15.5.
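As a rough sketch of the real-limit, width, and midpoint rules, the helpers below take an interval's stated limits and the unit of measurement. The function names and the `unit` parameter are illustrative assumptions, not part of the slides.

```python
def real_limits(low, high, unit=1.0):
    """Extend the stated limits by half a measurement unit on each side."""
    return low - unit / 2, high + unit / 2


def width(low, high, unit=1.0):
    """Interval width: upper real limit minus lower real limit."""
    lower, upper = real_limits(low, high, unit)
    return upper - lower


def midpoint(low, high, unit=1.0):
    """Lower real limit plus half the interval width."""
    lower, _ = real_limits(low, high, unit)
    return lower + width(low, high, unit) / 2


print(real_limits(11, 20))  # (10.5, 20.5)
print(width(11, 20))        # 10.0
print(midpoint(11, 20))     # 15.5
```

For scores recorded to one decimal place, passing `unit=0.1` gives real limits of roughly 3.45 and 4.55 for the stated interval 3.5–4.5.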

Graphing Grouped Data
A quantitative version of a bar graph is called a histogram. When the frequencies are connected by a line, it is called a frequency polygon.

2. Describing Central Tendency
– Modes
– Medians
– Means
– Skew
But we can do more than simply create a frequency distribution. We can also describe how these observations “bunch up” and how they “distribute”. Describing how they bunch up involves measures of central tendency.

Central Tendency - Modes
The mode for raw data is simply the most frequent score, e.g. for {2,3,5,6,6,8} the mode is 6.
The mode for grouped data is the midpoint of the interval containing the highest frequency. In the midterm distribution, the interval 31–40 has the highest frequency (f = 6), so the mode is 35.5.

Central Tendency - Medians
The median for raw data is simply the score at the middle position. This involves taking the (N+1)/2 position in the ordered scores and stating the value at that position:
– e.g. {2,3,5,6,8}: (5+1)/2 = 3, and the third-position score is 5.
– e.g. {2,3,5,8}: (4+1)/2 = 2.5, and the 2.5-position score is halfway between the second and third scores, (3+5)/2 = 4.
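The position rule can be written out directly. The sketch below is one possible reading of the slide's procedure, with a hypothetical function name; Python's built-in `statistics.median` is used only as a cross-check.

```python
import statistics


def median_raw(values):
    """Return the score at position (N + 1) / 2 of the ordered values."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2      # 1-based; ends in .5 when N is even
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower
    upper = ordered[int(position)]         # next score up
    return (lower + upper) / 2             # halfway between the two neighbours


print(median_raw([2, 3, 5, 6, 8]))  # 5
print(median_raw([2, 3, 5, 8]))     # 4.0

# Cross-check against the standard library.
print(statistics.median([2, 3, 5, 8]))  # 4.0
```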

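Medians for Grouped Data
The median for grouped data is:
$$\text{Median} = L + \frac{N/2 - cf}{f} \times w$$
where L is the lower real limit of the interval containing the median, cf is the cumulative frequency below that interval, f is the frequency within it, and w is the interval width.
For our previous distribution of scores (N = 16, median interval 31–40), the answer would be: 30.5 + ((16/2 - 6)/6) × 10 = 30.5 + 3.33 = 33.83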
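A minimal sketch of the grouped-median interpolation, assuming the standard textbook form L + ((N/2 - cf) / f) × w and the midterm distribution with lower real limits 0.5, 10.5, ..., 40.5 and frequencies 1, 3, 2, 6, 4. The function name and argument layout are hypothetical.

```python
def grouped_median(lower_real_limits, freqs, width):
    """Interpolated median: L + ((N/2 - cf) / f) * w."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty distribution")
    cumulative = 0  # cf: cases below the current interval
    for lower, f in zip(lower_real_limits, freqs):
        if cumulative + f >= n / 2:
            # The N/2-th case falls in this interval; interpolate within it.
            return lower + ((n / 2 - cumulative) / f) * width
        cumulative += f


lowers = [0.5, 10.5, 20.5, 30.5, 40.5]
freqs = [1, 3, 2, 6, 4]
print(round(grouped_median(lowers, freqs, 10), 2))  # 33.83
```

The interpolation treats the six scores in the 31–40 interval as spread evenly across its width, which is why the result need not equal any observed score.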

Central Tendency - Mean
For raw data, the mean is simply the sum of the values divided by N:
$$\bar{X} = \frac{\sum X_i}{N}$$
Suppose X_i = {2,3,5,6}. The mean would be 16/4 = 4.

Means for Grouped Data
For grouped data, the mean is the sum of each interval's frequency times its midpoint, with that sum divided by N:
$$\bar{X} = \frac{\sum f_i m_i}{N}$$
where m_i is the midpoint of interval i. For our previous distribution, the answer would be:
(1(5.5) + 3(15.5) + 2(25.5) + 6(35.5) + 4(45.5)) / 16 = 498 / 16 = 31.125
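The grouped mean is a frequency-weighted average of midpoints. The sketch below assumes the midterm distribution with midpoints 5.5 through 45.5 and frequencies 1, 3, 2, 6, 4, and compares the result with the mean of the raw scores; the function name is illustrative.

```python
def grouped_mean(midpoints, freqs):
    """Sum of frequency * midpoint, divided by N."""
    return sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)


midpoints = [5.5, 15.5, 25.5, 35.5, 45.5]
freqs = [1, 3, 2, 6, 4]
print(grouped_mean(midpoints, freqs))  # 31.125

# The raw scores give a slightly different mean, because grouping
# replaces every score with the midpoint of its interval.
scores = [9, 13, 18, 19, 22, 25, 31, 34, 35, 36, 36, 38, 41, 43, 44, 45]
print(sum(scores) / len(scores))       # 30.5625
```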

3. Describing Variation
– Range
– Mean Deviation
– Variance
– Standard Scores (Z score)

Describing Variation - Range
The range for raw scores is the highest score minus the lowest score, plus one (i.e. inclusive).
The range for grouped scores is the upper real limit of the highest interval minus the lower real limit of the lowest interval. In the case of our previous distribution this would be 50.5 - 0.5 = 50.

Describing Variation – Mean Deviation
The mean deviation (MD) is the sum of all deviations from the mean, in absolute value, divided by N.
Consider the set of observations {6,7,9,10}. The mean is 8 and the MD is (|6-8|+|7-8|+|9-8|+|10-8|)/4 = 6/4 = 1.5

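Mean Deviation for Grouped Data
Again, grouped data implies we substitute frequencies and midpoints for the individual values. Take a distribution in thousands of dollars with midpoints 38, 43, 48, 53, 58, 63 and frequencies 6, 8, 12, 12, 8, 4 (N = 50). The mean would be $50,000 (satisfy yourself that that is true) and the MD would be
(6|38-50| + 8|43-50| + 12|48-50| + 12|53-50| + 8|58-50| + 4|63-50|) / 50 = (72 + 56 + 24 + 36 + 64 + 52) / 50 = 304/50 = 6.08, and 6.08 x 1000 = $6,080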
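Both versions of the mean deviation can be checked with a short sketch. The midpoints (38 through 63, in $1,000s) and frequencies (6, 8, 12, 12, 8, 4) are read off the calculation on this slide; the function names are hypothetical.

```python
def mean_deviation(values):
    """Average absolute deviation from the mean, raw data."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


def grouped_mean_deviation(midpoints, freqs):
    """Same idea, with frequency-weighted midpoints standing in for scores."""
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    return sum(f * abs(m - mean) for f, m in zip(freqs, midpoints)) / n


print(mean_deviation([6, 7, 9, 10]))  # 1.5

midpoints = [38, 43, 48, 53, 58, 63]   # in $1,000s
freqs = [6, 8, 12, 12, 8, 4]
md = grouped_mean_deviation(midpoints, freqs)
print(round(md, 2))                    # 6.08, i.e. about $6,080
```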

Variation – The Variance
The variance for raw data is the sum of the squared deviations from the mean, divided by N.
Consider the set X_i = {6,7,9,10}. The mean is 8 and the variance is ((6-8)^2 + (7-8)^2 + (9-8)^2 + (10-8)^2)/4 = 10/4 = 2.5

Variance for Grouped Data
Frequencies and midpoints are still substituted for the values of X_i. Again the mean is 50 and the variance is
(6(38-50)^2 + 8(43-50)^2 + 12(48-50)^2 + 12(53-50)^2 + 8(58-50)^2 + 4(63-50)^2) / 50 = (864 + 392 + 48 + 108 + 512 + 676) / 50 = 2600 / 50 = 52, in squared thousands of dollars.
The standard deviation is the square root of this: sqrt(52) ≈ 7.21, or about $7,211.
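The hand calculation can be verified with a small sketch. It assumes the same midpoints (in $1,000s) and frequencies as the grouped mean-deviation example and divides by N, as the slides do; the function name is illustrative.

```python
import math
import statistics


def grouped_variance(midpoints, freqs):
    """Frequency-weighted sum of squared deviations from the mean, over N."""
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    return sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / n


# Raw-data check against the standard library's population variance.
print(statistics.pvariance([6, 7, 9, 10]))  # 2.5

midpoints = [38, 43, 48, 53, 58, 63]   # in $1,000s
freqs = [6, 8, 12, 12, 8, 4]
variance = grouped_variance(midpoints, freqs)
print(variance)                        # 52.0 (squared thousands of dollars)
print(round(math.sqrt(variance), 2))   # 7.21 (thousands of dollars)
```

Because the deviations are squared, the variance is expressed in squared units; only the standard deviation returns to the original scale of thousands of dollars.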

4. Covariance and Correlation
– The Definition and Concept
– The Formula
– Proportional Reduction in Error and r^2

Correlation – Definition and Concept
Visually, we can observe the co-variation of two variables in a scatter diagram, where the abscissa and ordinate are the two quantitative continua and each point simultaneously maps one pair of scores.

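Correlation - Formula
Think of the correlation as a proportional measure of the relationship between two variables. It consists of the co-variation divided by the average variation:
$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}}$$
Here the "average variation" in the denominator is the geometric mean of the variation in X and the variation in Y.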
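A minimal sketch of Pearson's r as co-variation over the geometric mean of the two variations. The five (x, y) pairs are made-up illustrative values, not data from the slides, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Co-variation of X and Y divided by the geometric mean of their variations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(variation_x * variation_y)


# Hypothetical paired scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 3))  # 0.775
```

Dividing by N in both numerator and denominator would cancel out, so the sums of products and squares can be used directly.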

Correlation and P.R.E.
Consider a scatter diagram with a regression line fitted through the points. The variation around the Y mean (variation before knowing X), less the variation around the regression line (variation after knowing X), taken as a proportion of the variation around the Y mean, is r^2.
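The proportional-reduction-in-error reading of r^2 can be illustrated by fitting a least-squares line and comparing the two amounts of variation. The data are made-up illustrative values and the function names are hypothetical; this is a sketch, not the slide's own computation.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def proportional_reduction_in_error(xs, ys):
    """(variation around Y mean - variation around line) / variation around Y mean."""
    intercept, slope = least_squares_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    before = sum((y - mean_y) ** 2 for y in ys)                    # before knowing X
    after = sum((y - (intercept + slope * x)) ** 2
                for x, y in zip(xs, ys))                           # after knowing X
    return (before - after) / before


# Hypothetical paired scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(proportional_reduction_in_error(xs, ys), 3))  # 0.6
```

For these values Pearson's r is about 0.775, and 0.775 squared is about 0.6, matching the reduction in error.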

IV. Quantitative Statistical Example of Elaboration
Step 1 – Construct the zero-order Pearson’s correlations (r). Assume r_xy = .55, where x = divorce rates and y = suicide rates. Further, assume that unemployment rates (z) is our control variable and that r_xz = .60 and r_yz = .40.
Step 2 – Calculate the partial correlation (r_xy.z):
$$r_{xy.z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}} = \frac{.55 - (.6)(.4)}{\sqrt{(1 - .36)(1 - .16)}} = .42$$
Step 3 – Draw conclusions. Before z: (r_xy)^2 = .30. After z: (r_xy.z)^2 = .18. Therefore, z accounts for (.30 - .18) or 12% of the variation in Y, and (.12/.30) or 40% of the relationship between X and Y.
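The three steps can be reproduced with a short sketch, assuming the standard first-order partial correlation formula and the zero-order correlations given on the slide. The function name is hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_xy, r_xz, r_yz = 0.55, 0.60, 0.40
r_xy_z = partial_correlation(r_xy, r_xz, r_yz)

print(round(r_xy_z, 2))       # 0.42
print(round(r_xy ** 2, 2))    # 0.3  (before controlling for z)
print(round(r_xy_z ** 2, 2))  # 0.18 (after controlling for z)

# Share of the x-y relationship accounted for by z.
# Unrounded this is about 0.41; the slide's 40% uses the rounded .12 and .30.
share = (r_xy ** 2 - r_xy_z ** 2) / r_xy ** 2
print(round(share, 2))
```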