Published by Anne Gardner. Modified over 9 years ago.
1 Chapter 10: Describing the Data
"Science is facts; just as houses are made of stones, so is science made of facts; but a pile of stones is not a house and a collection of facts is not necessarily science." - Henri Poincaré, French mathematician and physicist (1854-1912)
2 Data Analysis
Telling your story about the data, whether quantitative or qualitative.
What is typical or common in the data?
  All respondents are teenagers.
  Most have improved after an intervention.
What is the extent of difference or variation?
  Income ranges from $2,300 to $56,000.
  Attitudes differ by income.
3 Data Verification
Garbage in, garbage out: ensure that data have been entered correctly.
Data ordering
  Use an array: order the data in ascending or descending order by column and visually inspect them for possible errors.
  Use a frequency distribution of the values of each variable's attributes.
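The two checks above can be sketched in Python. This is a minimal illustration, assuming a small hypothetical list of age values; the data and variable names are not from the presentation.

```python
from collections import Counter

# Hypothetical age values as entered; 150 is a deliberate entry error.
ages = [15, 17, 16, 150, 15, 18, 16, 15]

# Array: order the values so that extreme entries stand out at either end.
ordered = sorted(ages)
print(ordered)

# Frequency distribution: count how often each value occurs.
frequencies = Counter(ages)
for value, count in sorted(frequencies.items()):
    print(value, count)
```

In the sorted array the implausible value sits at the end, and in the frequency distribution it appears as a value with a single, isolated count.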
4 Data Verification
Scatterplots are used to identify outliers.
For each outlier, determine whether it is a data entry error or a data recording error.
[Figure: Scatterplot of Age Variable]
5 Recoding Data
Collapse or create categories:
  When some attributes have only a few responses.
  When it makes theoretical sense.
E.g., collapse Asian/Pacific Islander with "Other."
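A minimal sketch of collapsing sparse categories, assuming hypothetical responses and an assumed cutoff (MIN_COUNT) for what counts as "only a few responses". The cutoff is illustrative only; whether a collapse makes theoretical sense remains a judgment call that code cannot make.

```python
from collections import Counter

# Hypothetical responses; the categories and counts are illustrative.
responses = [
    "White", "Black", "Hispanic", "White", "Asian/Pacific Islander",
    "Black", "White", "Other", "Hispanic", "White",
]

MIN_COUNT = 2  # assumed cutoff for "only a few responses"

# Count each category, then recode any category below the cutoff as "Other".
counts = Counter(responses)
recoded = [r if counts[r] >= MIN_COUNT else "Other" for r in responses]
recoded_counts = Counter(recoded)
print(recoded_counts)
```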
6 Computing Scales
Add items that are thought to measure the same concept, producing a total score.
Substitute the item mean for any missing values on that item.
Cronbach's alpha is a measure of the internal consistency of the scale. A scale should have an alpha of .60 or higher.
Change scales to conform to a 0-100 range (this assumes each item is scored from 1 up to its highest score):
  Count the number of completed items (a).
  Add the item responses (b).
  Compute the highest score for the item minus 1 (c).
  Calculate: ((b - a) × 100) / (a × c).
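The scale steps above can be sketched as follows. This is a hedged illustration, not the presenter's own procedure: the function names and the sample responses are hypothetical, items are assumed to be scored from 1 to `highest`, and Cronbach's alpha is computed with the standard formula k/(k-1) × (1 - sum of item variances / variance of total scores).

```python
from statistics import mean, variance

def fill_missing_with_item_mean(rows):
    """Replace None with the mean of that item across respondents."""
    n_items = len(rows[0])
    item_means = [
        mean(row[i] for row in rows if row[i] is not None)
        for i in range(n_items)
    ]
    return [
        [item_means[i] if row[i] is None else row[i] for i in range(n_items)]
        for row in rows
    ]

def rescale_0_100(item_scores, highest):
    """Convert a total score to a 0-100 range (items scored 1..highest)."""
    a = len(item_scores)   # number of completed items
    b = sum(item_scores)   # sum of the item responses
    c = highest - 1        # highest item score minus 1
    return (b - a) * 100 / (a * c)

def cronbach_alpha(rows):
    """Standard Cronbach's alpha using sample variances."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 4 respondents, 3 items scored 1-5, one missing value.
responses = [[4, 5, 4], [2, None, 3], [5, 5, 5], [1, 2, 1]]
filled = fill_missing_with_item_mean(responses)
scores = [rescale_0_100(row, highest=5) for row in filled]
alpha = cronbach_alpha(filled)
print(filled, scores, alpha)
```

A respondent who gives the lowest answer on every item scores 0, and one who gives the highest answer on every item scores 100, which is the point of the (b - a) and (a × c) terms.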
7 Quantitative Data
Descriptive statistics: summary measures.
Univariate analysis: describing one variable at a time.
Measures of Central Tendency
Mode: the most frequent value.
  What was the most frequently visited domestic violence shelter in the city?
  Unimodal: a distribution with one mode.
  Bimodal: two values are most frequently reported.
  Multimodal: more than two values are most frequently reported.
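A short sketch of finding the mode(s) with Python's standard library (`statistics.multimode`, Python 3.8+). The helper name `describe_modes` and the shelter names are hypothetical. Note that when every value occurs equally often, this helper still reports the tied values as modes, so the label should be read with the frequencies in hand.

```python
from statistics import multimode

def describe_modes(values):
    """Return the most frequent value(s) and a label for the distribution."""
    modes = multimode(values)  # all values tied for the highest frequency
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    return modes, label

# Hypothetical shelter visits (names are illustrative).
visits = ["Shelter A", "Shelter B", "Shelter A", "Shelter C"]
print(describe_modes(visits))

# Two values tied for most frequent.
print(describe_modes([1, 1, 2, 2, 3]))
```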
8 Quantitative Data (cont.)
Median: for interval or ratio level data.
  The value that divides the distribution in half; 50% of scores are on each side of the median.
  If there is an even number of values, the median is the average of the two most central values.
  Useful if extreme scores affect the mean, e.g., income or age.
Mean: for interval or ratio level data.
  The average: the sum of the values divided by the number of values.
  1, 2, 2, 1, 4, 2, 1, 3, 2, 1: sum = 19; 19/10 = 1.9
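The slide's worked example can be checked with Python's `statistics` module. The income list is a hypothetical illustration (only its lowest and highest values echo the range quoted earlier in the deck); it shows how one extreme score pulls the mean above the median.

```python
from statistics import mean, median

# Values from the slide's example.
scores = [1, 2, 2, 1, 4, 2, 1, 3, 2, 1]
print(mean(scores))    # 1.9
print(median(scores))  # 2.0 (average of the two central values, 2 and 2)

# Hypothetical incomes with one extreme score at the high end.
incomes = [2300, 18000, 21000, 25000, 56000]
print(median(incomes))  # 21000
print(mean(incomes))    # 24460
```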
9 Measures of Variability
Minimum: lowest value.
Maximum: highest value.
Range: (highest value - lowest value) + 1.
Standard Deviation (SD)
  A statistical measure of the amount by which a set of values differs from the mean.
  Used to compare the variability of distributions.
  Used in statistical analysis.
  Used to interpret scores in the normal distribution.
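A minimal sketch of these measures on hypothetical values. The range follows the slide's definition (highest minus lowest, plus 1). The slide does not say whether the SD is the population or the sample version, so both are shown.

```python
from statistics import pstdev, stdev

# Hypothetical values (illustrative only); their mean is 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

minimum = min(values)
maximum = max(values)
value_range = (maximum - minimum) + 1  # range as defined on the slide

population_sd = pstdev(values)  # divides by n
sample_sd = stdev(values)       # divides by n - 1

print(minimum, maximum, value_range)
print(population_sd, sample_sd)
```

The sample SD is always somewhat larger than the population SD for the same values, and the gap shrinks as the number of values grows.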
10 The Normal Distribution
Properties:
  Bell-shaped curve.
  Symmetric and unimodal; in standardized (z-score) form the mean is 0.
  50% of the data fall on either side of the mean.
  68.26% fall within one SD of the mean; about 95% within 2 SD.
[Figure: Scores approximate Normal]
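The percentages above can be reproduced with `statistics.NormalDist` from Python's standard library. This sketch uses the standard normal distribution (mean 0, SD 1); the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

below_mean = z.cdf(0)            # proportion of scores below the mean
within_1 = z.cdf(1) - z.cdf(-1)  # proportion within one SD of the mean
within_2 = z.cdf(2) - z.cdf(-2)  # proportion within two SDs of the mean

print(round(below_mean, 4), round(within_1, 4), round(within_2, 4))
```

The two-SD figure comes out slightly above 95%; the commonly quoted 95% corresponds more exactly to about 1.96 SD.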
11 Skewed Distributions
The distribution is not symmetrical when plotted.
Positively skewed: scores trail out at the high end.
Negatively skewed: scores trail out at the low end.
[Figures: Positive skew; Negative skew]
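The direction of skew can also be checked numerically rather than by eye. The slides do not give a formula; the sketch below assumes one common definition of the skewness coefficient (the mean of the cubed z-scores, using the population SD), and the function name and data are hypothetical.

```python
from statistics import mean, pstdev

def skewness(values):
    """Mean of cubed z-scores: > 0 for positive skew, < 0 for negative skew."""
    m = mean(values)
    s = pstdev(values)  # population SD; must be non-zero
    return mean(((x - m) / s) ** 3 for x in values)

# Hypothetical scores with a long tail at the high end.
right_tailed = [1, 2, 2, 3, 3, 3, 10]
# The mirror image has a long tail at the low end.
left_tailed = [-x for x in right_tailed]

print(skewness(right_tailed))  # positive
print(skewness(left_tailed))   # negative
```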
12 Frequency Tables
Summarize descriptive data.

Table 1: GPA of Black Feather Youth

GPA range*     n          Percent
1.5 - 1.99     3          1.6
2.0 - 2.49     7          3.7
2.5 - 2.99     10         5.3
3.0 - 3.49     124        65.3
3.5 - 4.0      46         24.2
TOTAL          N = 190    100.0

* Mean = 3.21; STD = .41; Median = 3.2; Mode = 3.0; Min. = 1.5; Max. = 4.0; Range = 2.5
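The percent column of Table 1 can be recomputed from the counts. The counts below are taken from the table; the variable names are illustrative.

```python
# Counts (n) per GPA range, as reported in Table 1.
counts = {
    "1.5 - 1.99": 3,
    "2.0 - 2.49": 7,
    "2.5 - 2.99": 10,
    "3.0 - 3.49": 124,
    "3.5 - 4.0": 46,
}

total = sum(counts.values())  # N

# Percent of respondents in each range, rounded to one decimal place.
percents = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label:<12}{n:>5}{percents[label]:>8}")
print(f"{'TOTAL':<12}{total:>5}")
```

Because each percentage is rounded separately, the rounded values sum to 100.1 rather than exactly 100.0.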
13 Using Graphs to Display Data
Trend graph / Pie chart
[Figure: Line graph illustrating Attitude Towards Police by Age]
[Figure: Pie chart illustrating the proportion of males and females responding to the Black Feather Youth Survey (N = 190)]