1
Organizing, Displaying and Interpreting Data
Chapter 3: Hawkes STAT 3090
2
Statistics: The science of data
Data: information about individuals
Variables: characteristics of individuals
The information
Exploratory Data Analysis
How to extract information from data?
First step => “Plot the data”
“A picture is worth 1000 words.”
3
What can we learn from a picture?
Distribution of the variable(s)
  Value and Frequency
  Shape
  Center
  Spread
  Variability
  Extreme values
4
Two Basic Types of Data
Categorical (Qualitative)
  male/female, colors, phone numbers, race
  Places individual into one or several “categories”
  Have no “value”; can’t perform math functions
Quantitative
  Numerical values – temperature, money, time
  Can perform mathematical functions
  Can be discrete (countable) – number of children in family
  Or continuous – age, time, distance
5
Displaying Data
Quantitative
  Numerical values
  Continuous or discrete
  Frequency plots
  Maintain original values
  Histograms
    Create ‘groups’ ‘bins’
    Bars touch (continuous data)
  Stemplots
  Dot plots
  Time Series
Categorical (Qualitative)
  Counts, Discrete Data
  Pie Charts
  Bar Graphs (bars do not touch)
  Frequency Tables
  Spreadsheets
  Relative Frequency Tables
6
Qualitative Data: The Common Displays
7
Frequency distribution
Summarizes data into classes and provides, in tabular form, a list of classes along with the number of observations in each class
Must have a frequency distribution before any type of graph can be constructed
Use the Excel COUNTIF function
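For readers working outside Excel, the tallying step can be sketched in Python as a rough analogue of COUNTIF. The `responses` list is made-up illustration data, not course data.

```python
from collections import Counter

# Hypothetical survey responses (illustration only, not course data).
responses = ["red", "blue", "red", "green", "blue", "red"]

# Counter tallies how many observations fall in each class.
frequency = Counter(responses)

# Print the distribution as a two-column table, largest class first.
for category, count in frequency.most_common():
    print(f"{category:<8}{count}")
```

The frequencies should always sum to the number of observations, which is a quick check that no response was dropped.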
8
Frequency Table
                Cell Phone   No Cell Phone
Accidents           60            15
No Accidents         5            20
Column Total        65            35
9
Relative Frequency Distribution
The proportion (or percent) of observations within a category
Found using the formula:
  relative frequency = frequency / (sum of all frequencies)
A Relative Frequency Distribution lists each category of data with its relative frequency.
What is the advantage of using a relative frequency distribution over simple counts?
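As a small sketch of the formula, the Python below applies it to the Cell Phone column of the frequency table (60 accidents, 5 no accidents). The variable names are illustrative choices.

```python
# Cell Phone column of the frequency table.
frequencies = {"Accidents": 60, "No Accidents": 5}

# Sum of all frequencies (the column total, 65).
total = sum(frequencies.values())

# relative frequency = frequency / sum of all frequencies
relative = {category: count / total for category, count in frequencies.items()}

for category, proportion in relative.items():
    print(f"{category:<14}{proportion:.1%}")
```

The proportions come out to roughly 92.3% and 7.7%, and together they sum to 1.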
10
Relative Frequency Table
                Cell Phone      %
Accidents           60         92%
No Accidents         5          8%
Column Total        65        100%
11
Bar graph (or chart)
A simple graphical display in which each bar corresponds to the number of observations in a category
Label each category of data on either the horizontal or vertical axis
Rectangles of equal width for each category
Height of each rectangle represents the category’s frequency OR relative frequency
Bars don’t touch
Bars should always start at zero
Used for qualitative or discrete quantitative data
12
Example of Bar Graph with Time (side-by-side)
13
Pareto Chart
A Pareto chart is a bar graph where the bars are drawn in decreasing order of frequency or relative frequency
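The ordering step is the only thing that separates a Pareto chart from an ordinary bar graph. A minimal Python sketch follows, with hypothetical category counts and a text rendering in place of a real chart.

```python
# Hypothetical category counts (illustration only, not course data).
frequencies = {"scratch": 12, "dent": 30, "misalignment": 7, "other": 3}

# Pareto order: categories sorted by decreasing frequency.
pareto_order = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Text rendering: one '#' per observation, tallest bar first.
for category, count in pareto_order:
    print(f"{category:<14}{'#' * count} {count}")
```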
14
Pie Charts
A circle divided into sectors.
Each sector represents a category of data.
The area of each sector is proportional to the frequency of the data.
Proportions (percents) MUST add to 1 (100%)
Angle of the “wedges”:
  (frequency)/(total # observations) = proportion
  (proportion)(360) = Angle
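The wedge-angle calculation can be worked through in a few lines of Python. This sketch reuses the Cell Phone column counts from the frequency table slide (60 and 5); the variable names are illustrative.

```python
# Cell Phone column counts from the frequency table slide.
frequencies = {"Accidents": 60, "No Accidents": 5}
total = sum(frequencies.values())

angles = {}
for category, count in frequencies.items():
    proportion = count / total           # frequency / total observations
    angles[category] = proportion * 360  # proportion times 360 degrees

for category, angle in angles.items():
    print(f"{category:<14}{angle:.1f} degrees")
```

The angles are about 332.3 and 27.7 degrees, and they must add to 360 just as the proportions add to 1.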
15
Pie Chart (What’s Missing?)
16
QuantITative Data: The Common Displays
17
Frequency Distribution for Quantitative Data
Not appropriate to have a bar for each value, so develop classes
Number of classes generally more than 4 but less than 20
Class Width = (Largest Value – Smallest Value)/Number of Classes
  Should be rounded to whole numbers for ease of understanding
Class boundaries: Subtract 0.5 from lower limit and add 0.5 to upper limit. (See page 93)
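One way to see these steps together is the Python sketch below. It borrows the 20 observations from the Stemplot slide later in this deck, and it assumes 5 classes, whole-number data, and rounding the width up; none of those choices comes from the slide itself.

```python
import math

# Observations from the Stemplot slide in this deck.
data = [32, 37, 39, 40, 41, 41, 41, 42, 42, 43,
        44, 45, 45, 45, 46, 47, 47, 49, 50, 51]
num_classes = 5  # assumption for this example

# Class Width = (Largest Value - Smallest Value) / Number of Classes
raw_width = (max(data) - min(data)) / num_classes
class_width = math.ceil(raw_width)  # assumption: round up to a whole number

# Build (lower limit, upper limit, frequency) for each class.
classes = []
lower = min(data)
for _ in range(num_classes):
    upper = lower + class_width - 1
    count = sum(1 for x in data if lower <= x <= upper)
    classes.append((lower, upper, count))
    lower = upper + 1

# Class boundaries: lower limit - 0.5 and upper limit + 0.5.
for lower, upper, count in classes:
    print(f"{lower}-{upper}  boundaries {lower - 0.5}-{upper + 0.5}  frequency {count}")
```

Here the raw width is 3.8, which rounds up to 4, giving the classes 32-35, 36-39, 40-43, 44-47, and 48-51.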
18
Other Frequency Distribution for Quantitative Data
Relative Frequency Distribution
  Same process as for Qualitative Data
Cumulative Frequency Distribution
  Add frequencies in each successive class
  Total must equal total number of observations
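The running total can be sketched in Python with `itertools.accumulate`. The class labels and frequencies below are assumed for illustration.

```python
from itertools import accumulate

# Illustrative classes and frequencies (assumed for this example).
class_labels = ["32-35", "36-39", "40-43", "44-47", "48-51"]
frequencies = [1, 2, 7, 7, 3]

# Add the frequencies in each successive class.
cumulative = list(accumulate(frequencies))

for label, freq, running in zip(class_labels, frequencies, cumulative):
    print(f"{label}  frequency {freq:>2}  cumulative {running:>2}")
```

The last cumulative value (20 here) equals the total number of observations, which is the check the slide describes.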
19
Histogram
A bar graph of a frequency or relative frequency distribution in which the height of each bar corresponds to the frequency or relative frequency of each class.
Edges touching
Covers entire range of values of a variable
May need to create “bins”
20
Guidelines for “Bins”
Cover complete range of data
Group or bin size is basically arbitrary
Can have open or closed bins
  Less than 5, 5 to 10, 11 to 15, over 15 (for example)
Bins are mutually exclusive
Bins can be of equal or unequal size
  Reflects the clumping of the observations
General formula for interval size: I = (H – L)/k
  Where H = value of highest observation, L = value of lowest, and k = number of classes (bins, groups)
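The example bins (less than 5, 5 to 10, 11 to 15, over 15) can be sketched in Python to show what open-ended, mutually exclusive bins look like in practice. The `bin_label` helper and the `values` list are hypothetical, and the sketch assumes whole-number data.

```python
from collections import Counter

def bin_label(value):
    """Return the example bin that a whole-number value falls into."""
    if value < 5:
        return "Less than 5"  # open bin: no lower limit
    if value <= 10:
        return "5 to 10"
    if value <= 15:
        return "11 to 15"
    return "Over 15"          # open bin: no upper limit

# Hypothetical observations (illustration only).
values = [2, 5, 7, 10, 11, 15, 16, 23]

# Each value lands in exactly one bin, so the bins are mutually exclusive.
counts = Counter(bin_label(v) for v in values)

for label in ["Less than 5", "5 to 10", "11 to 15", "Over 15"]:
    print(f"{label:<12}{counts[label]}")
```

Because the first and last bins are open, every possible value is covered, so the complete range of the data is included.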
21
Interpreting Histograms
Patterns of the data
  Shape, center, spread
Shape
  Symmetrical
  Skewed right (tail stretches to the right)
  Skewed left (tail stretches to the left)
22
Stem – and – Leaf Display
Separate each observation into a stem and a leaf
  Stem: all but the final digit
  Leaf: the final digit
Stems have as many digits as needed; each leaf is only one digit
23
Stemplot
Data: 32, 37, 39, 40, 41, 41, 41, 42, 42, 43, 44, 45, 45, 45, 46, 47, 47, 49, 50, 51
3 | 2 7 9
4 | 0 1 1 1 2 2 3 4 5 5 5 6 7 7 9
5 | 0 1
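The stem and leaf split can be reproduced in a short Python sketch using the data on this slide. It assumes non-negative whole numbers, and it lists only stems that actually occur in the data.

```python
from collections import defaultdict

# Data from this slide.
data = [32, 37, 39, 40, 41, 41, 41, 42, 42, 43,
        44, 45, 45, 45, 46, 47, 47, 49, 50, 51]

# Stem = all but the final digit, leaf = the final digit.
stems = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

# One row per stem, leaves in increasing order.
lines = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
```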
24
Timeplots
Plots each observation against the time at which it was measured
Time on x-axis (horizontal)
Variable on y-axis (vertical)
25
Scatterplot showing Sales by Region over Time