Lecture 3. Data Compression for Two Variables: Scatterplots, Cross- Tabulations, and Correlation David R. Merrell Intermediate Empirical Methods for Public Policy and Management
Lecture 3: Agenda Review of Lecture 2 Cross-Tabulations Comparison Bar Charts Parallel Box Plots Scatterplots Correlation Coefficients
Review of Lecture 2 Mean or Median Models for Data
Mean or Median Complaints have reached the city manager that Tardy City is taking too long to pay its bills. Data are days taken to pay seven bills: Calculate the mean and median. What do you conclude?
Models for Data Data = Fit + Residual Fit as a Center Mean Median Mode Example: Number of Stat Courses Taken by Students in
Summary Statistics (Excel)
Summary Statistics (Minitab) Descriptive Statistics Variable N Mean Median Tr Mean StDev SE Mean C Variable Min Max Q1 Q3 C
Measures of Error
Data Compression for Two Variables...And More Two-Variable Description Cross-Tabulations Comparison Bar Charts Parallel Box Plots Scatterplots Scatterplot Matrix Correlation Coefficients
Two-Variable Description
Structure of a Cross-Tabulation
Street Repair Practices Study street repair practices of local government Cities and counties handle street repairs: using their own public employees exclusively by contracting out part of the work contracting out all the work
Table 1. Street Repair: Counts Type of Local Government Street Repair Practices by Type of Government: Public Employees and Contracting by Cities and Counties in the United States
Table 2. Street Repair: Percents Type of Local Government Street Repair Practices by Type of Government: Public Employees and Contracting by Cities and Counties in the United States
Educational Achievement Residents of Allegheny County that are in labor force Random sample survey of Allegheny County residents in labor force in 199? Variables: gender and highest educational achievement
Educational Achievement: Coding of Ordinal Variables 1 if grade 4 or less 2 if grades if grade 8 4 if high school incomplete (9-11) 5 if high school graduate (12) 6 if technical, trade, or business after high school 7 if college/ university incomplete 8 if college/university graduate or more
Educational Achievement Table
Bar Chart
Job Satisfaction and Income for Postal Employees
Five Number Summary Age of Allegheny County residents by location: individuals in labor force in 199?.
Parallel Box Plots o o o o The Mon ValleyPittsburgh Other
Scatterplots Creating via Excel ChartWizard Transformation of Variables Scatterplot Matrices
Scatterplot 1 Salary Years employed
Scatterplot 2 Salary Years employed
Scatterplot 3 Salary Years employed
Scatterplot Matrix
Correlation Coefficient, r
Properties of r
International Adoption Visas: 1991 vs 1988 r:/academic/90-786/ Chatterjee/ Adopt.dat
International Adoption Visas Country Etc.
Excel Calculation of r Use statistical function, correl Eliminate missing data values Identify X data Identify Y data Finish Value: r = (.88)
Minitab Calculation of r Correlations (Pearson) Correlation of log 1988 and log 1992 = 0.873
Next Time... Ethics and the Value of Data Social Value of Data Privacy Issues Confidentiality Applications in Health Care