Organizing the Data
Levin and Fox, Elementary Statistics in Social Research, Chapter 2
Frequency Distributions
We organize our data into frequency distribution tables so that we can make sense of the raw data we have collected and transform them into meaningful sets of measures.
Frequency Distributions: Nominal Data
Consists of two columns:
Left column: the characteristic being presented and its categories of analysis
Right column: the number of occurrences (or frequency, f)

Table 1: Types of Political Activities

Types of Political Activity               f
Voting                                   51
Writing letters to elected officials     37
Canvassing                               14
Protesting                               12
                                    N = 114
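The table above can be produced by tallying raw responses. The following is a minimal sketch, assuming the raw data are a list of category labels; the list `responses` is hypothetical and is constructed only so that its counts match Table 1.

```python
from collections import Counter

# Hypothetical raw responses, built to reproduce the counts in Table 1.
responses = (
    ["Voting"] * 51
    + ["Writing letters to elected officials"] * 37
    + ["Canvassing"] * 14
    + ["Protesting"] * 12
)

freq = Counter(responses)  # category -> frequency (f)
N = sum(freq.values())     # total number of cases

# Left column: category; right column: frequency.
for category, f in freq.most_common():
    print(f"{category:<40}{f:>4}")
print(f"{'N':<40}{N:>4}")
```

Because the data are nominal, the row order is a presentation choice; here the rows are simply sorted from most to least frequent.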
Frequency Distributions: Nominal Data – Making Comparisons

Types of Political Activity              Males    Females      f
Voting                                                        51
Writing letters to elected officials                          37
Canvassing                                   9          5     14
Protesting                                   6          6     12
Totals                                  n = 60     n = 54    N = 114
Proportions and Percentages
Proportions provide a standardized way to compare across groups by relating the number of cases in a given category to the total size of the distribution. To convert a frequency to a proportion, divide the number of cases in the category (f) by the total number of cases in the distribution (N): P = f / N.
Divide the frequency (f) by the number of cases (N) to obtain the proportion (P).
51 of the 114 respondents participated by voting (P = .447).
37 of the 114 respondents participated by writing letters (P = .3246).
12 of the 114 respondents participated by protesting (P = .1053).
Percentages
A percentage is the frequency of occurrence of a category per 100 cases. To calculate a percentage, multiply the proportion by 100: % = 100 × (f / N).
Percentages: Examples
12 of the 114 respondents participated by protesting. Proportion: .1053; percentage: 10.53%
51 of the 114 respondents participated by voting. Proportion: .447; percentage: 44.7%
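The two conversions can be checked with a few lines of code. This is a small sketch using the frequencies from Table 1; the function names `proportion` and `percentage` are illustrative, not from the source.

```python
def proportion(f, n):
    """Proportion: frequency divided by the total number of cases."""
    return f / n


def percentage(f, n):
    """Percentage: the proportion multiplied by 100."""
    return 100 * proportion(f, n)


N = 114
table = {
    "Voting": 51,
    "Writing letters": 37,
    "Canvassing": 14,
    "Protesting": 12,
}

for category, f in table.items():
    print(f"{category:<18} P = {proportion(f, N):.4f}   % = {percentage(f, N):.2f}")
```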
Frequency Distributions: Ordinal and Interval Data
Nominal data: not ordered. Nominal data are simply labeled, not scaled, so their categories can be arranged in any order.
Ordinal and interval data: ordered. The categories of ordinal and interval data represent the degree to which a particular characteristic is present, so they are arranged to reflect that order.
Ordinal and Interval Frequency Distribution Tables
Nominal Frequency Distribution Table: which is correct?

Marital Status     f
Married           12
Single            10
Divorced           8
Total             30

Marital Status     f
Married           10
Single            12
Divorced           8
Total             30

Marital Status     f
Married           12
Single             8
Divorced          10
Total             30
Ordinal and Interval Frequency Distribution Tables
Ordinal Frequency Distribution Table: which is correct?

Tuition Hike              f
Slightly Favorable        2
Somewhat Unfavorable     21
Strongly Favorable        0
Slightly Unfavorable      4
Strongly Unfavorable     10
Somewhat Favorable        1
Total                    38

Tuition Hike              f
Strongly Favorable        0
Somewhat Favorable        1
Slightly Favorable        2
Slightly Unfavorable      4
Somewhat Unfavorable     21
Strongly Unfavorable     10
Total                    38
Grouped Frequency Distributions of Interval-Level Data
Grouped frequencies are used to reveal patterns in raw data. Each group is known as a class interval.

Class Interval     f      %
Total             71    100%

Tips:
Make the class interval a whole number rather than a decimal.
Make the lowest score in a class interval some multiple of ten.
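Grouping raw scores into class intervals can be sketched as follows. The scores, the interval width of 10, and the starting value of 50 are all assumptions chosen for illustration; they are not the data behind the slide, and `grouped_frequencies` is a hypothetical helper.

```python
# Hypothetical integer exam scores; these are not the data behind the slide.
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 80, 82, 85, 88, 91, 94, 97]


def grouped_frequencies(values, width, start):
    """Tally integer values into class intervals of the given width.

    Intervals run start..start+width-1, start+width..start+2*width-1, and so on.
    Intervals that contain no cases are omitted from the result.
    """
    counts = {}
    for v in values:
        lower = start + ((v - start) // width) * width
        interval = (lower, lower + width - 1)
        counts[interval] = counts.get(interval, 0) + 1
    return dict(sorted(counts.items()))


table = grouped_frequencies(scores, width=10, start=50)
N = len(scores)

for (low, high), f in table.items():
    print(f"{low}-{high}  f = {f:>2}  % = {100 * f / N:5.1f}")
print(f"Total  f = {N}  % = 100.0")
```

Both tips are reflected in the sketch: the interval width is a whole number, and each interval starts at a multiple of ten.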
Frequency Distributions: Class Interval Midpoints (m)
The midpoint (m) is the middlemost score in the class interval. It is the point at which an interval can be divided into two equal parts.
The midpoint (m) can also be calculated as follows:
m = (lowest score value + highest score value) / 2
m = (...) / 2 = 50
m = (...) / 2 = 3.5
Formula for finding the midpoint (m)
Obtaining a midpoint: add the highest and lowest values of the interval, then divide the sum by 2.

Intervals     f     X                    Xf
10-20         3     (10+20)/2 = 15       3 * 15 = 45
20-30         9     (20+30)/2 = 25       9 * 25 = 225
30-40         5     (30+40)/2 = 35       5 * 35 = 175

X = midpoint, f = frequency
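The midpoint arithmetic in the table can be reproduced directly. This is a brief sketch using the three intervals and frequencies shown above; the function name `midpoint` is illustrative.

```python
def midpoint(lowest, highest):
    """Midpoint of a class interval: (lowest + highest) / 2."""
    return (lowest + highest) / 2


# (interval limits, frequency) as given in the table.
rows = [((10, 20), 3), ((20, 30), 9), ((30, 40), 5)]

for (low, high), f in rows:
    x = midpoint(low, high)
    print(f"{low}-{high}  f = {f}  X = {x:g}  Xf = {f * x:g}")
```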
Cumulative Distributions
The cumulative frequency (cf) for any category is obtained by adding the frequency in that category to the total frequency for all categories below it. We can also construct a distribution that indicates cumulative percentage (c%).

Class Interval     f      %     cf     c%
Total            176    100
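Cumulative frequencies and cumulative percentages are running totals, which the sketch below computes. The intervals and frequencies are hypothetical (they are not the distribution from the slide) and are listed from the lowest interval up so that each running total covers the category and everything below it.

```python
from itertools import accumulate

# Hypothetical grouped distribution, listed from the lowest interval upward.
intervals = ["50-59", "60-69", "70-79", "80-89", "90-99"]
f = [2, 3, 5, 4, 3]

N = sum(f)
cf = list(accumulate(f))            # cumulative frequency
c_pct = [100 * c / N for c in cf]   # cumulative percentage

print(f"{'Interval':<10}{'f':>4}{'%':>8}{'cf':>5}{'c%':>8}")
for label, freq, c, cp in zip(intervals, f, cf, c_pct):
    print(f"{label:<10}{freq:>4}{100 * freq / N:>8.1f}{c:>5}{cp:>8.1f}")
```

The last cumulative frequency always equals N, and the last cumulative percentage always equals 100, which makes a convenient check on the table.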
Cross-tabulation: A cross-tabulation is a tabular summary of data for two variables.
Cross-tabulation can be used when:
one variable is qualitative and the other is quantitative,
both variables are qualitative, or
both variables are quantitative.
The left and top margin labels define the classes for the two variables.
Cross-Tabulation:
Seat Belt Use by Gender with Total Percents (Table 2.16)

SB Use               Male    Female     Total
All the Time
Most of the Time
Some of the Time
Seldom
Never
Total                         63.2%    100.0%

Row: each category of seat belt use. Column: each gender. Marginal: the row and column totals.
Cross-Tabulation: Seat Belt Use by Gender with Row Percents (Table 2.17)

SB Use               Male    Female     Total
All the Time
Most of the Time
Some of the Time
Seldom
Never
Total                         63.2%    100.0%
Cross-Tabulation: Seat Belt Use by Gender with Column Percents (Table 2.18)

SB Use               Male    Female     Total
All the Time
Most of the Time
Some of the Time
Seldom
Never
Total                         63.2%    100.0%
Cross-Tabulation: Choosing among Total, Row, and Column Percents
When determining which percent to use, the rule of thumb is: when the independent variable (IV) is on the rows, use row percents; when the IV is on the columns, use column percents.
Determining the IV and DV
It is not always easy to determine which variable is the independent variable (IV) and which is the dependent variable (DV) in a cross-tab.
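The three kinds of percents differ only in the denominator, as the sketch below shows. The counts are hypothetical (they are not the seat belt data from the tables), and the three-category row variable is a simplification made for illustration.

```python
# Hypothetical counts: rows are seat belt use, columns are gender.
counts = {
    "Always":    {"Male": 20, "Female": 40},
    "Sometimes": {"Male": 30, "Female": 30},
    "Never":     {"Male": 50, "Female": 30},
}
columns = ["Male", "Female"]

# Marginal totals and grand total.
row_totals = {r: sum(cells.values()) for r, cells in counts.items()}
col_totals = {c: sum(counts[r][c] for r in counts) for c in columns}
N = sum(row_totals.values())


def total_pct(r, c):
    """Cell count as a percentage of all cases."""
    return 100 * counts[r][c] / N


def row_pct(r, c):
    """Cell count as a percentage of its row total."""
    return 100 * counts[r][c] / row_totals[r]


def col_pct(r, c):
    """Cell count as a percentage of its column total."""
    return 100 * counts[r][c] / col_totals[c]


# Gender is treated as the IV here and sits on the columns,
# so the rule of thumb points to column percents.
for r in counts:
    cells = "  ".join(f"{c}: {col_pct(r, c):5.1f}%" for c in columns)
    print(f"{r:<10} {cells}")
```

With column percents, each column sums to 100%, so the male and female distributions of seat belt use can be compared directly even when the groups differ in size.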
Graphic Presentations Pie Charts: Circular graphs whose pieces add up to 100%
Bar Charts or Histograms: accommodate any number of categories at any level of measurement.
Bar charts: used to display the frequency or percentage distribution of variables whose categories do not represent a smooth continuum.
Histograms: used to display more continuous measurements, especially at the interval level.
Frequency Polygons: Stress continuity (rather than differences) along a scale
Frequency polygons can be useful for comparing data.
Three Distributions Representing Directions of Skewness
Frequency polygons help us visualize the forms taken by frequency distributions.
Positively skewed: longer tail on the right
Negatively skewed: longer tail on the left
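The direction of skew can also be checked numerically. The sketch below uses the standard moment coefficient of skewness, which is an assumption here since the slides present skewness only graphically; the three small data sets are hypothetical. A positive value corresponds to a longer right tail and a negative value to a longer left tail.

```python
def skewness(values):
    """Moment coefficient of skewness: third central moment / (variance ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data sets.
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]   # one extreme high score
left_tail = [-x for x in right_tail]     # mirror image: one extreme low score
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right tail", right_tail),
                   ("left tail", left_tail),
                   ("symmetric", symmetric)]:
    print(f"{name:<11} skewness = {skewness(data):+.3f}")
```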
Some Variations on Kurtosis (Peakedness) among Symmetrical Distributions