Organizing, Displaying and Interpreting Data


1 Organizing, Displaying and Interpreting Data
Chapter 3: Hawkes STAT 3090

2 Statistics: The science of data
Data: information about individuals
Variables: characteristics of individuals
The information
Exploratory Data Analysis: how to extract information from data?
First step: “Plot the data.”
“A picture is worth 1000 words.”

3 What can we learn from a picture?
Distribution of the variable(s): value and frequency
Shape
Center
Spread (variability)
Extreme values

4 Two Basic Types of Data: Categorical (Qualitative) and Quantitative
Categorical (Qualitative)
  Examples: male/female, colors, phone numbers, race
  Places an individual into one or several “categories”
  Has no numerical “value”; can’t perform math functions
Quantitative
  Numerical values: temperature, money, time
  Can perform mathematical functions
  Can be discrete (countable values), e.g. number of children in a family
  Or continuous, e.g. age, time, distance

5 Displaying Data
Quantitative
  Numerical values, continuous or discrete
  Frequency plots: maintain original values
  Histograms: create ‘groups’ or ‘bins’; bars touch (continuous data)
  Stemplots
  Dot plots
  Time series
Categorical (Qualitative)
  Counts, discrete data
  Pie charts
  Bar graphs (bars do not touch)
  Frequency tables
  Spreadsheets
  Relative frequency tables

6 Qualitative Data: The Common Displays

7 Frequency distribution
Summarizes data into classes and lists, in tabular form, each class along with the number of observations in it.
A frequency distribution is needed before any type of graph can be constructed.
Use the Excel COUNTIF function to count the observations in each class.
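As a rough Python counterpart to the Excel counting approach, the sketch below tallies a small list of categorical responses into a frequency distribution. The data and variable names are made up for illustration and do not come from the slides.

```python
from collections import Counter

# Hypothetical categorical observations (not from the slides).
responses = ["red", "blue", "red", "green", "blue", "red"]

# Counter tallies how many observations fall in each class.
frequency = Counter(responses)

# Print the classes with their counts, most frequent first.
for category, count in frequency.most_common():
    print(f"{category:<6} {count}")
```

The counts always sum to the number of observations, which is a quick sanity check on any frequency table.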

8 Frequency Table
                Cell Phone    No Cell Phone
Accidents           60             15
No Accidents         5             20
Column Total        65             35

9 Relative Frequency Distribution
The proportion (or percent) of observations within a category.
Found using the formula: relative frequency = frequency / (sum of all frequencies)
A relative frequency distribution lists each category of data with its relative frequency.
What is the advantage of using a relative frequency distribution over simple counts?

10 Relative Frequency Table
                Cell Phone      %
Accidents           60         92%
No Accidents         5          8%
Column Total        65        100%
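The percentages in this table can be checked with a short calculation. The sketch below applies relative frequency = frequency / (sum of all frequencies) to the Cell Phone column counts (60 and 5); the variable names are illustrative. To one decimal place, 60/65 is about 92.3% and 5/65 is about 7.7%.

```python
# Counts from the Cell Phone column of the frequency table.
counts = {"Accidents": 60, "No Accidents": 5}

total = sum(counts.values())  # column total: 65

# relative frequency = frequency / sum of all frequencies
relative = {category: count / total for category, count in counts.items()}

for category, share in relative.items():
    print(f"{category:<13} {share:.1%}")
```

Because each count is divided by the same total, the relative frequencies add to 1 (100%), apart from rounding in the displayed values.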

11 Bar graph (or chart)
A simple graphical display in which each bar corresponds to the number of observations in a category.
Label each category of data on either the horizontal or the vertical axis.
Rectangles of equal width for each category.
Height of each rectangle represents the category’s frequency OR relative frequency.
Bars don’t touch.
Bars should always start at zero.
Used for qualitative or discrete quantitative data.

12 Example of Bar Graph with Time (side-by-side)

13 Pareto Chart
A Pareto chart is a bar graph where the bars are drawn in decreasing order of frequency or relative frequency.
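The ordering step is the only thing that separates a Pareto chart from an ordinary bar graph. A minimal sketch, using made-up defect counts that are not from the slides, sorts the categories by decreasing frequency and prints a crude text bar for each:

```python
# Hypothetical category counts (not from the slides).
defects = {"scratch": 12, "dent": 30, "misprint": 7, "crack": 18}

# A Pareto chart draws the bars in decreasing order of frequency.
pareto_order = sorted(defects.items(), key=lambda item: item[1], reverse=True)

for category, count in pareto_order:
    print(f"{category:<9} {'#' * count}")
```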

14 Pie Charts
A circle divided into sectors. Each sector represents a category of data.
The area of each sector is proportional to the frequency of the data.
Proportions (percents) MUST add to 1 (100%).
Angle of the “wedges”:
  (frequency)/(total # observations) = proportion
  (proportion)(360) = angle in degrees
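The wedge-angle rule can be worked through in a few lines. The sketch below uses the No Cell Phone column counts from the frequency table (15 accidents, 20 no accidents) as input; the variable names are illustrative assumptions.

```python
# Counts from the No Cell Phone column of the frequency table.
counts = {"Accidents": 15, "No Accidents": 20}

total = sum(counts.values())  # total number of observations: 35

angles = {}
for category, frequency in counts.items():
    proportion = frequency / total        # frequency / total # observations
    angles[category] = proportion * 360   # proportion * 360 = angle in degrees

for category, angle in angles.items():
    print(f"{category:<13} {angle:.1f} degrees")
```

Since the proportions add to 1, the angles add to 360 degrees, which is a useful check before drawing the chart.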

15 Pie Chart (What’s Missing?)

16 QuantITative Data: The Common Displays

17 Frequency Distribution for Quantitative Data
Not appropriate to have a bar for each value, so develop classes.
Number of classes: generally more than 4 but less than 20.
Class Width = (Largest Value - Smallest Value)/Number of Classes
Class width should be rounded to a whole number for ease of understanding.
Class boundaries: subtract 0.5 from the lower limit and add 0.5 to the upper limit. (See page 93)
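A small worked example may make the class-width and class-boundary rules concrete. The data set and the choice of 5 classes are assumptions for illustration, and the width is rounded up here (also an assumption) so that the classes are sure to cover the largest value.

```python
import math

# Hypothetical whole-number data (not from the slides).
data = [12, 15, 21, 22, 27, 30, 34, 38, 41, 45, 49, 53]
num_classes = 5  # assumed; the slide suggests more than 4 and fewer than 20

# Class width = (largest value - smallest value) / number of classes,
# rounded up to a whole number (an assumption about rounding direction).
width = math.ceil((max(data) - min(data)) / num_classes)

classes = []  # list of (lower limit, upper limit) pairs
lower = min(data)
for _ in range(num_classes):
    upper = lower + width - 1  # upper limit for whole-number data
    classes.append((lower, upper))
    lower = upper + 1

for low, high in classes:
    # Class boundaries: subtract 0.5 from the lower limit, add 0.5 to the upper.
    print(f"limits {low}-{high}, boundaries {low - 0.5}-{high + 0.5}")
```

With these numbers the raw width is (53 - 12)/5 = 8.2, which rounds up to 9, giving classes that start at 12, 21, 30, 39 and 48.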

18 Other Frequency Distribution for Quantitative Data
Relative Frequency Distribution: same process as for qualitative data.
Cumulative Frequency Distribution: add the frequencies in each successive class; the final cumulative total must equal the total number of observations.
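A cumulative frequency distribution is just a running total of the class frequencies. The sketch below uses made-up class frequencies (not from the slides) and confirms that the last cumulative value equals the total number of observations.

```python
from itertools import accumulate

# Hypothetical class frequencies, in class order (not from the slides).
frequencies = [3, 7, 12, 6, 2]

# Running total: each entry adds the current class to all earlier classes.
cumulative = list(accumulate(frequencies))

for freq, cum in zip(frequencies, cumulative):
    print(f"frequency {freq:>2}   cumulative {cum:>2}")
```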

19 Histogram
A bar graph of a frequency or relative frequency distribution in which the height of each bar corresponds to the frequency or relative frequency of each class.
Edges of the bars touch.
Covers the entire range of values of a variable.
May need to create “bins”.

20 Guidelines for “Bins”
Cover the complete range of the data.
Group or bin size is basically arbitrary.
Bins can be open or closed: less than 5, 5 to 10, 11 to 15, over 15 (for example).
Bins are mutually exclusive.
Bins can be of equal or unequal size.
Bins should reflect the clumping of the observations.
General formula for interval size: I = (H - L)/k, where H = value of the highest observation, L = value of the lowest observation, and k = number of classes (bins, groups).
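To illustrate open-ended, mutually exclusive bins, the sketch below assigns whole-number observations to the example bins named in the guidelines (less than 5, 5 to 10, 11 to 15, over 15). The observations and the helper name `bin_label` are hypothetical.

```python
from collections import Counter

def bin_label(value):
    """Assign a whole-number value to exactly one of the example bins."""
    if value < 5:
        return "Less than 5"   # open-ended bin at the low end
    if value <= 10:
        return "5 to 10"
    if value <= 15:
        return "11 to 15"
    return "Over 15"           # open-ended bin at the high end

# Hypothetical whole-number observations (not from the slides).
observations = [2, 5, 7, 10, 11, 14, 15, 16, 23, 4]

bin_counts = Counter(bin_label(v) for v in observations)

for label in ["Less than 5", "5 to 10", "11 to 15", "Over 15"]:
    print(f"{label:<12} {bin_counts[label]}")
```

Because every value lands in exactly one bin, the bin counts add up to the number of observations.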

21 Interpreting Histograms
Patterns of the data: shape, center, spread
Shape:
  Symmetrical
  Skewed right (tail stretches to the right)
  Skewed left (tail stretches to the left)

22 Stem-and-Leaf Display
Separate each observation into a stem (all but the final digit) and a leaf (the final digit).
Stems have as many digits as needed; each leaf is only one digit.

23 Stemplot
Data: 32, 37, 39, 40, 41, 41, 41, 42, 42, 43, 44, 45, 45, 45, 46, 47, 47, 49, 50, 51
3 | 2 7 9
4 | 0 1 1 1 2 2 3 4 5 5 5 6 7 7 9
5 | 0 1
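The stem-and-leaf split can also be done mechanically. The sketch below builds a stemplot from the 20 values listed on this slide, taking the final digit as the leaf and everything before it as the stem; the variable names are illustrative.

```python
from collections import defaultdict

# The 20 observations listed on the slide.
data = [32, 37, 39, 40, 41, 41, 41, 42, 42, 43,
        44, 45, 45, 45, 46, 47, 47, 49, 50, 51]

stems = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)  # stem: all but final digit; leaf: final digit
    stems[stem].append(leaf)

# One line per stem, with leaves in increasing order.
lines = [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
         for stem, leaves in sorted(stems.items())]
print("\n".join(lines))
```

Most of the leaves fall on stem 4, so the display shows at a glance that the data cluster in the 40s.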

24 Timeplots
Plots each observation against the time at which it was measured.
Time on the x-axis (horizontal).
Variable on the y-axis (vertical).

25 Scatterplot showing Sales by Region over Time

