Chapter 10: Describing the Data


1 Chapter 10: Describing the Data
"Science is facts; just as houses are made of stones, so is science made of facts; but a pile of stones is not a house and a collection of facts is not necessarily science."
- Henri Poincaré, French mathematician & physicist (1854 - 1912)

2 Data Analysis
- Telling your story about the data, whether quantitative or qualitative.
- What is typical or common in the data?
  - All respondents are teenagers.
  - Most have improved after an intervention.
- What is the extent of difference or variation?
  - Income ranges from $2,300 to $56,000.
  - Attitudes differ by income.

3 Data Verification
- Garbage in, garbage out: ensure that data have been entered correctly.
- Data ordering
  - Use an array: order the data in ascending or descending order by column and visually inspect them for possible errors.
  - Use a frequency distribution for the values of each variable's attributes.
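A minimal sketch of both checks, using only Python's standard library. The `ages` values are hypothetical, and 150 is planted as a deliberate entry error to show how sorting and counting expose it.

```python
from collections import Counter

# Hypothetical survey ages; 150 is a deliberate data-entry error.
ages = [15, 17, 16, 150, 14, 16, 17, 16]

# Array: sort ascending so implausible values collect at the ends.
ordered = sorted(ages)
print(ordered)

# Frequency distribution: how often each value occurs.
frequencies = Counter(ages)
for value, count in sorted(frequencies.items()):
    print(value, count)
```

A value that appears once at the far end of the sorted list is a natural candidate for checking against the original questionnaire.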

4 Data Verification
- Scatterplots are used to identify outliers.
- It is then necessary to determine whether the error is a data entry error or a data recording error.
[Figure: Scatterplot of Age Variable]

5 Recoding Data
- Collapse or create categories:
  - when some attributes have only a few responses;
  - when it makes theoretical sense.
- E.g., collapse Asian/Pacific Islander with "Other."
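Collapsing a category can be sketched as a simple lookup. The `responses` list and the other category labels are hypothetical; only the Asian/Pacific Islander to "Other" mapping comes from the slide.

```python
from collections import Counter

# Hypothetical raw responses to a race/ethnicity item.
responses = [
    "White", "Black", "Hispanic", "Asian/Pacific Islander",
    "White", "Other", "Black", "White",
]

# Recode map: categories not listed here are kept unchanged.
recode_map = {"Asian/Pacific Islander": "Other"}

recoded = [recode_map.get(r, r) for r in responses]
counts = Counter(recoded)
print(counts)
```

Keeping the raw responses and writing the recoded values to a new variable makes the collapse reversible.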

6 Computing Scales
- Add items that are thought to measure the same concept, producing a total score.
- Substitute the mean for the item for any missing values.
- Cronbach's alpha is a measure of the internal consistency of the scale.
  - A scale should have an alpha of .60 or higher.
- Change scales to conform to a 0-100 range.
  - Count the number of completed items (a).
  - Add the item responses (b).
  - Compute the highest score for the item minus 1 (c).
  - Calculate: (b - a) * 100 / (a * c).
  - With items scored from 1 upward, the lowest response on every item gives 0 and the highest response on every item gives 100.
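The scale steps can be sketched as follows. The data, the 1-5 response format, and the function names are all assumptions for illustration; the alpha formula used is the standard one, k / (k - 1) * (1 - sum of item variances / variance of total scores), which the slide does not spell out.

```python
from statistics import mean, variance

# Hypothetical data: 4 respondents x 3 items, each scored 1-5; None = missing.
raw = [
    [4, 5, None],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 3],
]
HIGHEST = 5  # assumed highest possible response per item

def to_0_100(responses, highest=HIGHEST):
    """Rescale one respondent's answers to 0-100: (b - a) * 100 / (a * c)."""
    completed = [x for x in responses if x is not None]
    a = len(completed)   # number of completed items
    b = sum(completed)   # sum of the item responses
    c = highest - 1      # highest score for an item, minus 1
    return (b - a) * 100 / (a * c)

def impute_item_means(rows):
    """Replace each missing value with the mean of that item's observed values."""
    item_means = [mean(x for x in col if x is not None) for col in zip(*rows)]
    return [[item_means[j] if x is None else x for j, x in enumerate(row)]
            for row in rows]

def cronbach_alpha(rows):
    """Cronbach's alpha from complete data (sample variances throughout)."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

scaled = [to_0_100(row) for row in raw]
filled = impute_item_means(raw)
alpha = cronbach_alpha(filled)
print(scaled)
print(round(alpha, 2))
```

Four respondents are far too few for a meaningful alpha; the numbers only show the mechanics.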

7 Quantitative Data
- Descriptive statistics: summary measures.
- Univariate analysis: describing one variable at a time.
- Measures of central tendency
  - Mode: the most frequent value.
    - What was the most frequently visited domestic violence shelter in the city?
    - Unimodal: a distribution with one mode.
    - Bimodal: when two values are most frequently reported.
    - Multimodal: when more than two values are most frequently reported.
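A short sketch of finding the mode with `statistics.multimode` (Python 3.8+), which returns every value tied for most frequent. The shelter names and score lists are hypothetical.

```python
from statistics import multimode

# Hypothetical shelter visits: the mode works for nominal data too.
visits = ["Shelter A", "Shelter B", "Shelter A", "Shelter C", "Shelter B", "Shelter A"]
unimodal = multimode(visits)

# Two values tied for most frequent: a bimodal distribution.
bimodal = multimode([1, 1, 2, 2, 3])

print(unimodal)
print(bimodal)
```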

8 Quantitative Data (cont.)
- Median: for interval or ratio level data.
  - The value that divides the distribution in half.
  - 50% of scores are on each side of the median.
  - If there is an even number of values, the median is the average of the two most central values.
  - Useful if extreme scores affect the mean, e.g., income or age.
- Mean: for interval or ratio level data.
  - The average.
  - The sum of the values divided by the number of values.
  - E.g., for the values 1, 2, 2, 1, 4, 2, 1, 3, 2, 1: sum = 19, n = 10, mean = 19/10 = 1.9.
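The slide's own ten values can be checked with the standard library. The `incomes` list is hypothetical and is there only to show how one extreme score pulls the mean but not the median.

```python
from statistics import mean, median

# The ten values from the slide.
scores = [1, 2, 2, 1, 4, 2, 1, 3, 2, 1]
scores_mean = mean(scores)      # 19 / 10
scores_median = median(scores)  # average of the two central values (2 and 2)

# Hypothetical incomes with one extreme value at the top.
incomes = [2300, 18000, 21000, 24000, 56000]
income_mean = mean(incomes)
income_median = median(incomes)

print(scores_mean, scores_median)
print(income_mean, income_median)
```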

9 Measures of Variability
- Minimum: lowest value.
- Maximum: highest value.
- Range: highest value minus lowest value. For whole-number scores, 1 is sometimes added to give the inclusive range: (highest value - lowest value) + 1.
- Standard deviation (SD)
  - A statistical measure of the amount by which a set of values differs from the mean.
  - Used to compare the variability of distributions.
  - Used in statistical analysis.
  - Used to interpret scores in the normal distribution.
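A minimal sketch of these measures on a hypothetical list of whole-number scores. Both the simple difference and the "+ 1" inclusive form of the range are computed, and `stdev` is the sample standard deviation (n - 1 in the denominator), which is an assumption about which SD is intended.

```python
from statistics import stdev

# Hypothetical whole-number scores.
scores = [1, 2, 2, 1, 4, 2, 1, 3, 2, 1]

minimum = min(scores)
maximum = max(scores)
simple_range = maximum - minimum   # highest - lowest
inclusive_range = simple_range + 1  # (highest - lowest) + 1
sd = stdev(scores)                  # sample standard deviation

print(minimum, maximum, simple_range, inclusive_range)
print(round(sd, 2))
```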

10 The Normal Distribution
- Properties: bell-shaped curve.
  - Symmetric and unimodal; the standard normal distribution (z-scores) has a mean of 0 and an SD of 1.
  - 50% of data fall on either side of the mean.
  - About 68.26% of data fall within 1 SD of the mean; about 95.44% within 2 SD.
[Figure: Scores approximate Normal]
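The percentages within 1 and 2 SD can be checked against the standard normal curve with `statistics.NormalDist` (Python 3.8+). This is a quick verification sketch, not part of the original slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
z = NormalDist(mu=0, sigma=1)

below_mean = z.cdf(0)             # share of data below the mean
within_1 = z.cdf(1) - z.cdf(-1)   # share within 1 SD of the mean
within_2 = z.cdf(2) - z.cdf(-2)   # share within 2 SD of the mean

print(round(below_mean, 2))
print(round(within_1 * 100, 2))
print(round(within_2 * 100, 2))
```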

11 Skewed Distributions
- The distribution is not symmetrical when plotted.
- Positively skewed: scores trail out at the high end.
- Negatively skewed: scores trail out at the low end.
[Figure: Positive skew]
[Figure: Negative skew]
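One way to put a number on skew is the moment-based skewness coefficient, sketched below. This coefficient is an assumption on my part (the slides only show the plotted shapes), and both data sets are hypothetical.

```python
from statistics import mean, median, pstdev

def skewness(values):
    """Moment-based skewness: mean of cubed z-scores (population SD)."""
    m = mean(values)
    s = pstdev(values)
    return sum(((x - m) / s) ** 3 for x in values) / len(values)

# Hypothetical scores trailing out at the high end.
positive = [1, 1, 2, 2, 2, 3, 3, 10]
# Mirror image: scores trailing out at the low end.
negative = [11 - x for x in positive]

print(skewness(positive) > 0, mean(positive) > median(positive))
print(skewness(negative) < 0, mean(negative) < median(negative))
```

In these examples the mean is pulled toward the long tail, so it sits above the median under positive skew and below it under negative skew.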

12 Frequency Tables
- Summarize descriptive data.

Table 1: GPA of Black Feather Youth

GPA range*     n          Percent
1.5-1.99       3          1.6
2.0-2.49       7          3.7
2.5-2.99       10         5.3
3.0-3.49       124        65.3
3.5-4.0        46         24.2
TOTAL          N = 190    100.0

* Mean = 3.21; SD = .41; Median = 3.2; Mode = 3.0; Min. = 1.5; Max. = 4.0; Range = 2.5
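The percent column follows directly from the counts. A minimal sketch that recomputes it from the n values in Table 1 (the variable names are illustrative):

```python
# Counts per GPA range, taken from Table 1.
counts = {
    "1.5-1.99": 3,
    "2.0-2.49": 7,
    "2.5-2.99": 10,
    "3.0-3.49": 124,
    "3.5-4.0": 46,
}

total = sum(counts.values())

# Percent = count / N * 100, rounded to one decimal place.
percents = {gpa: round(100 * n / total, 1) for gpa, n in counts.items()}

for gpa, n in counts.items():
    print(f"{gpa:<10} {n:>4} {percents[gpa]:>6}")
print(f"{'TOTAL':<10} {total:>4}")
```

Because each row is rounded separately, the rounded percentages can sum to slightly more or less than 100.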

13 Using Graphs to Display Data
- Trend graph / pie chart
[Figure: Line graph illustrating Attitude Towards Police by Age]
[Figure: Pie chart illustrating the proportion of males and females responding to the Black Feather Youth Survey (N = 190)]

