Graphs for Data Mrs. Watkins AP Statistics Chapter 3/4
Frequency Table
Purpose: To organize raw data
Variable   Tally       Frequency
9th        III         ?
10th       IIIIIII     ?
11th       IIIIIIIII   ?
12th       IIIII       ?
Relative Frequency Table
Purpose: To show proportions/percents
Variable   Tally       Freq.   Rel. Freq.
9th        III         3       ?
10th       IIIIIII     6       ?
11th       IIIIIIIII   9       ?
12th       IIIII       5       ?
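A relative frequency is each count divided by the total count. Here is a minimal Python sketch using the Freq. column of this table (3, 6, 9, 5); the variable names are illustrative only.

```python
# Frequencies from the Freq. column: grade level -> count
freq = {"9th": 3, "10th": 6, "11th": 9, "12th": 5}

total = sum(freq.values())  # total number of observations

# Relative frequency = count / total
rel_freq = {grade: count / total for grade, count in freq.items()}

for grade, p in rel_freq.items():
    print(f"{grade}: {p:.3f} ({p:.1%})")
```

The proportions can be reported either as decimals or as percents; both describe the same share of the data.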
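Cumulative Relative Frequency
Variable   Freq.   Rel. Freq.   Cum. Rel. Freq.
9th        3       ?            ?
10th       6       ?            ?
11th       9       ?            ?
12th       5       ?            ?
The relative frequencies should add up to 100% or 1.00 (or close, because of rounding), so the last cumulative entry should be 100% or 1.00.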
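A cumulative relative frequency is a running total of the relative frequencies, taken in the order of the categories. A short Python sketch, assuming the same counts (3, 6, 9, 5) and using `itertools.accumulate` for the running total:

```python
from itertools import accumulate

grades = ["9th", "10th", "11th", "12th"]
counts = [3, 6, 9, 5]
total = sum(counts)

rel = [c / total for c in counts]   # relative frequencies
cum_rel = list(accumulate(rel))     # running total of relative frequencies

for g, r, c in zip(grades, rel, cum_rel):
    print(f"{g}: rel={r:.3f}  cum={c:.3f}")
```

The last cumulative value works as a quick check: if it is not 1.00 (or very close), a count or a division went wrong somewhere.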
Bar Chart
Purpose: To display counts or percentages for categorical data
**Should have space between bars
Contingency Table (two-way table)
Displays two categorical variables
        Male   Female
Yes
No
Vocabulary: cells (entries), variables, experimental units
Marginal Distribution
Gives the relative frequency of each category of ONE variable, using the row or column totals and ignoring the other variable
        Male   Female
Yes
No
Marginal Distribution by Gender
Marginal Distribution by Instrument
Conditional Distributions
Breakdown of ONE variable within the CONDITIONED category:
Given MALE, what is the distribution of instruments? _____ Yes _____ No
Given NO instruments, what is the distribution of gender? _____ Male _____ Female
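The difference between marginal and conditional distributions is the denominator: marginal uses the grand total, conditional uses the total of the conditioned category only. The Python sketch below uses HYPOTHETICAL counts (the slides do not give the cell values) for a gender by instrument table.

```python
# HYPOTHETICAL counts (not from the slides): plays an instrument? by gender
table = {
    "Yes": {"Male": 10, "Female": 15},
    "No":  {"Male": 30, "Female": 20},
}
genders = ["Male", "Female"]
grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution by instrument: row totals / grand total
marginal_instrument = {ans: sum(row.values()) / grand_total
                       for ans, row in table.items()}

# Marginal distribution by gender: column totals / grand total
marginal_gender = {g: sum(table[ans][g] for ans in table) / grand_total
                   for g in genders}

# Conditional: instrument GIVEN Male -> divide by the Male column total
male_total = sum(table[ans]["Male"] for ans in table)
instrument_given_male = {ans: table[ans]["Male"] / male_total for ans in table}

# Conditional: gender GIVEN No -> divide by the No row total
no_total = sum(table["No"].values())
gender_given_no = {g: table["No"][g] / no_total for g in genders}

print(marginal_instrument, marginal_gender)
print(instrument_given_male, gender_given_no)
```

Each distribution sums to 1, which is a useful check that the right denominator was used.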
Segmented Bar Chart
Issues with Two-Way Tables and Categorical Data Analysis
1. Independence
2. Simpson’s Paradox
3. Relative Risk
Independence
Two variables are independent when the distribution of one variable is the same across the categories of the other variable.
It means that one variable does not seem to affect the other.
Segmented bar graphs will be identical.
Example: text page 28, Eye Color
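One way to check for independence is to compute the conditional distribution within each category and compare them. The sketch below uses HYPOTHETICAL counts; pairing eye color with gender is an assumption for illustration and is not the textbook's data.

```python
# HYPOTHETICAL counts (not from the text): eye color by gender
counts = {
    "Male":   {"Brown": 20, "Blue": 10, "Green": 10},
    "Female": {"Brown": 30, "Blue": 15, "Green": 15},
}

def conditional(row):
    """Distribution of eye color within one gender."""
    total = sum(row.values())
    return {color: n / total for color, n in row.items()}

dists = {gender: conditional(row) for gender, row in counts.items()}

# The variables look independent if the conditional distributions match
looks_independent = all(
    abs(dists["Male"][c] - dists["Female"][c]) < 1e-9 for c in dists["Male"]
)
print(dists)
print("Looks independent:", looks_independent)
```

With real data the distributions rarely match exactly, so in practice the question is whether they are close.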
Simpson’s Paradox
Occurs when an overall percentage is used instead of breaking the data down into relevant categories
**The combined data show an “unreal” preference that can reverse within the categories
Examples: hospital death rates, pilot errors, school admission (page 33 in text)
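A small numeric case makes the reversal concrete. The counts below are ILLUSTRATIVE only (they are not taken from the slides or quoted from the text): hospital A has the lower death rate within each patient condition, yet the higher death rate overall, because it treats far more patients in poor condition.

```python
# ILLUSTRATIVE counts: condition -> (deaths, patients)
hospital_a = {"good": (6, 600), "poor": (57, 1500)}
hospital_b = {"good": (8, 600), "poor": (8, 200)}

def rate(deaths, patients):
    """Death rate as a proportion."""
    return deaths / patients

def overall(hospital):
    """Death rate with all conditions combined."""
    deaths = sum(d for d, _ in hospital.values())
    patients = sum(n for _, n in hospital.values())
    return rate(deaths, patients)

for cond in ("good", "poor"):
    a, b = rate(*hospital_a[cond]), rate(*hospital_b[cond])
    print(f"{cond}: A={a:.3f}  B={b:.3f}")

print(f"overall: A={overall(hospital_a):.3f}  B={overall(hospital_b):.3f}")
```

Patient condition is the lurking variable here: comparing the hospitals fairly requires looking within each condition.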
Relative Risk
Usually used for disease: the ratio of the incidence (risk) in one group to the incidence in another group
Example: Risk of heart attack for smokers compared with non-smokers
RR for Female: 2.24:1
RR for Male: 1.43:1
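The calculation can be sketched in a few lines of Python. The counts are HYPOTHETICAL (the slides give only the final ratios), and `relative_risk` is an illustrative helper name, not a library function.

```python
def relative_risk(cases_a, total_a, cases_b, total_b):
    """Risk in group A divided by risk in group B."""
    risk_a = cases_a / total_a   # incidence in group A
    risk_b = cases_b / total_b   # incidence in group B
    return risk_a / risk_b

# HYPOTHETICAL counts: heart attacks among smokers (A) and non-smokers (B)
rr = relative_risk(cases_a=30, total_a=1000, cases_b=15, total_b=1000)
print(f"RR = {rr:.2f}:1")
```

A relative risk of 1 means the two groups have the same risk; values above 1 mean the first group's risk is higher.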
More Relative Risk
Injuries at Naval Academy (soccer, rugby and basketball): RR Female:Male :1
Injuries during military training at Naval Academy: RR Female:Male 9.74:1
Graphs for Quantitative Data
Dotplot
Stem/Leaf Plot
Histogram
Dot Plot
Types: Quantitative; small data sets
Best Used: To show the distribution of discrete values; shows gaps
Stem/Leaf Plot
Types: Quantitative; small data sets
Best Used: When data values should be preserved
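A stem/leaf plot can be built by hand or with a few lines of code. The Python sketch below assumes non-negative two-digit integers (stem = tens digit, leaf = ones digit); the scores are HYPOTHETICAL and `stem_and_leaf` is an illustrative helper name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem/leaf plot (stem = tens, leaf = ones)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems show gaps
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# HYPOTHETICAL test scores
scores = [72, 85, 91, 68, 77, 85, 93, 70, 88, 79, 55]
print("\n".join(stem_and_leaf(scores)))
```

Every original value can be read back from the plot, which is why it suits small data sets where the values should be preserved.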
Back to Back Stem/Leaf Plot
Histogram
Types: Quantitative; large range of values
Best Used: To display a large amount of data
Histograms: two choices
FREQUENCY: shows actual counts for each interval of values
RELATIVE FREQUENCY: shows the proportion/percent for each interval of values
The choice affects the vertical axis only
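The two versions come from the same bin counts; the relative frequency version just divides each count by the number of observations. A minimal Python sketch with HYPOTHETICAL data and an assumed bin width of 5:

```python
# HYPOTHETICAL data and bin width (not from the slides)
data = [2, 3, 5, 7, 8, 8, 9, 12, 13, 14, 14, 15, 18, 21, 22]
bin_width = 5

# Count how many values fall in each bin [start, start + bin_width)
counts = {}
for x in data:
    start = (x // bin_width) * bin_width
    counts[start] = counts.get(start, 0) + 1

n = len(data)
for start in sorted(counts):
    freq = counts[start]   # height on a FREQUENCY histogram
    rel = freq / n         # height on a RELATIVE FREQUENCY histogram
    print(f"[{start:>2}, {start + bin_width:>2}): freq={freq}  rel={rel:.2f}")
```

Because every height is divided by the same number, the shape of the histogram is identical in both versions.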
BEWARE of CUMULATIVE
Read histograms carefully: the vertical axis might be CUMULATIVE
GRAPH ANALYSIS: SOCS
AP questions will ask you to “comment on the distribution”.
S: SHAPE? Symmetric, skewed, bimodal
O: OUTLIERS? Any unusual values, gaps
C: CENTER? Middle of the data
S: SPREAD? Range of data: is the range big or small?
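Numbers can back up a SOCS description. The Python sketch below uses a HYPOTHETICAL data set and the standard `statistics` module. The 1.5 x IQR rule for flagging outliers is an assumption added for illustration; it is a common convention but is not stated in these slides.

```python
import statistics

# HYPOTHETICAL data set (not from the slides)
data = [4, 7, 8, 8, 9, 10, 11, 12, 13, 30]

center = statistics.median(data)   # C: middle of the data
spread = max(data) - min(data)     # S: range of the data

# Quartiles via statistics.quantiles (default "exclusive" method)
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# O: flag possible outliers with the 1.5 x IQR rule (assumed convention)
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low or x > high]

print(f"center (median) = {center}")
print(f"spread (range)  = {spread}")
print(f"possible outliers = {outliers}")
```

Shape still has to be judged from the graph itself; a single large value like the one flagged here would also pull the distribution toward a right skew.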