DATA TABLES…
KINDS OF VARIABLES…
    Qualitative vs. Quantitative
    Discrete vs. Continuous
DISPLAYING a QUALITATIVE VARIABLE
MORE about HISTOGRAMS
DATA TABLE

parties   politics   salary   singer
5         mod.       80000    1.5
1         liberal    35000    3
2                    40000    4
DATA TABLE
Columns: VARIABLES
Rows: CASES, INDIVIDUALS, whatever

parties   politics   salary   singer
5         mod.       80000    4.5
1         liberal    35000    3
2                    40000    4
Kinds of Variables
QUALITATIVE
    Nominal: just labels
    Ordinal: order matters
QUANTITATIVE
    Discrete: only certain values (for example, whole numbers)
    Continuous: any value is possible (at least in some range)
DISPLAYING ONE QUALITATIVE VARIABLE
    Frequency table…
    Bar chart…
    Pie chart…
Frequency table - politics

VALUE          COUNT
Liberal        20
Moderate       3
Conservative   2
TOTAL          25
Frequency table - politics

VALUE          COUNT   PERCENT (of column)
Liberal        20      80%
Moderate       3       12%
Conservative   2       8%
TOTAL          25      100%
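The table above can be rebuilt from a raw column of answers. The following is a minimal sketch, assuming the column is available as a plain Python list; the function name `frequency_table` and the printed layout are illustrative choices, not something taken from the slides.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows, followed by a TOTAL row."""
    counts = Counter(values)           # tallies each distinct value
    total = sum(counts.values())       # number of cases in the column
    rows = [(value, count, 100 * count / total)
            for value, count in counts.items()]
    rows.append(("TOTAL", total, 100.0))
    return rows


# Rebuild the raw "politics" column from the counts shown on the slide.
politics = ["Liberal"] * 20 + ["Moderate"] * 3 + ["Conservative"] * 2

for value, count, percent in frequency_table(politics):
    print(f"{value:<14}{count:>5}{percent:>7.0f} %")
```

Each percent is the count divided by the column total, which is why the PERCENT column is labelled "of column".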
Frequency table – politics (Swarthmore)

VALUE          COUNT   PERCENT (of column)
Liberal        46      64%
Moderate       22      31%
Conservative   4       6%
TOTAL          72      100%
Frequency table – singing VALUE COUNT PERCENT 1 10 39 % 2 6 23 % 3 4 8 % 5 TOTAL 26 100 %
Frequency table – singing VALUE COUNT PERCENT 1 10 39 % 2 6 23 % 3 4 8 % 5 TOTAL 26 101 %
Frequency table – singing VALUE COUNT PERCENT 1 10 38 % 2 6 23 % 3 4 8 % 5 TOTAL 26 100 %
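One reading of the three versions of the singing table (totals of 100% and 101%, and 39% versus 38% for the first row) is that percentages rounded to whole numbers need not add up to exactly 100. The sketch below checks that reading against counts that do appear on the slides: the Swarthmore politics counts (46, 22, 4) and the first singing row (10 out of 26). The helper name `rounded_percents` is an illustrative assumption.

```python
def rounded_percents(counts):
    """Round each count's share of the total to a whole percent."""
    total = sum(counts)
    return [round(100 * count / total) for count in counts]


# Swarthmore politics counts from the earlier frequency table.
swarthmore_politics = [46, 22, 4]
percents = rounded_percents(swarthmore_politics)
print(percents, sum(percents))   # [64, 31, 6] 101

# First row of the singing table: 10 cases out of 26.
print(round(100 * 10 / 26))      # 38
```

The exact shares always sum to 100; only the rounded figures can drift to 99 or 101.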
Frequency table – singing (Swarthmore)

VALUE   COUNT   PERCENT
1       16      22%
2
3       23      32%
4       12      17%
5               7%
TOTAL   72      100%
EQUAL-AREA PRINCIPLE Areas are in proportion to data values.
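For a histogram whose bins are not all the same width, one common way to honour this principle is to draw each bar with height equal to count divided by bin width, so that the bar's area (height times width) matches the count. The sketch below illustrates that; the bins and counts are hypothetical, and `bar_heights` is an illustrative name rather than anything from the slides.

```python
def bar_heights(bins):
    """bins is a list of (low, high, count) tuples.

    Returns one height per bin, chosen so that height * width == count.
    """
    return [count / (high - low) for low, high, count in bins]


# Hypothetical salary bins with unequal widths.
bins = [(0, 50_000, 10), (50_000, 100_000, 8), (100_000, 500_000, 4)]

heights = bar_heights(bins)
areas = [height * (high - low)
         for height, (low, high, _count) in zip(heights, bins)]

for (low, high, count), height, area in zip(bins, heights, areas):
    print(f"{low:>7,} to {high:>7,}: count={count}, "
          f"height={height:.6f}, area={area:.1f}")
```

With equal-width bins the division changes every height by the same factor, so plotting raw counts already satisfies the principle.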
Salaries (Swarthmore) 600000 60000 50000 30000 very small 250000 75000 70000 80000 125000 35000 65000 200000 400000 72500 40000 150000 120000 20000 500000 45000 85000 90000 100000 1000000
Salaries - $20,000 bin sizes
$50,000 bin sizes
$100,000 bin sizes
Constructing a histogram…
Define bins
    All the same size? Easier if they are, but…
    Boundaries: ENDPOINT RULE
Make a frequency table for your chosen bins…
Construct histogram…
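The steps above can be sketched in a few lines. This is a rough illustration under stated assumptions: bins are equal width, the endpoint rule is taken to mean that a value sitting exactly on a boundary goes into the higher bin (the slides do not say which way the rule goes), and the salary sample is hypothetical. The name `bin_counts` is illustrative.

```python
def bin_counts(values, start, width, n_bins):
    """Frequency table for n_bins equal-width bins beginning at start.

    Assumed endpoint rule: each bin includes its lower boundary and
    excludes its upper one, so a boundary value counts in the higher bin.
    Values outside the covered range are ignored.
    """
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical sample of salaries (not the class data).
salaries = [20_000, 35_000, 40_000, 40_000, 50_000, 60_000, 72_500, 100_000]

# Same data, two bin sizes: the picture changes with the choice of bins.
for width, n_bins in ((20_000, 6), (50_000, 3)):
    print(f"${width:,} bin sizes")
    for i, count in enumerate(bin_counts(salaries, 0, width, n_bins)):
        low = i * width
        print(f"  {low:>7,} to {low + width:>7,} | {'*' * count}")
```

Note how 40,000 lands in the 40,000 to 60,000 bin rather than the 20,000 to 40,000 bin under the assumed rule; the opposite convention would be equally valid as long as it is applied consistently.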
Shape of a distribution…
    Symmetrical?
    Skew: right or left
    Unimodal, bimodal, multimodal
    Outliers