Finding data for journalism
Steve Doig

Data-driven investigations
Sociology: “Color of Money”, census
Weather disaster: “What Went Wrong”, Katrina
Environment: “Boss Hog”, “Toxic Waters”, Florida wetlands, “Ghost Factories”, “Smokestack Effect”
Medical: “Culture of Resistance”, radiation errors, Medicare fraud, “Playing with Fire”
Justice: Racial disparities in sentencing
Safety: railroad crossings, aviation

Technical Tools
Search (browser and Google)
Spreadsheet (Excel)
Database manager (Access, MySQL)
Statistical software (SPSS, R)
Programming (SAS, Perl, Python, et al.)
Mapping (ArcMap, QGIS, Mapmaker)
Visualization (Google Fusion Tables, R, Stata, et al.)
Exotica: GPS, satellite imagery

Data feature stories
Dog licenses
Sports statistics
Lottery winners
“Personal” ads

Methods
Newsroom math: Percentage change, crowd counting, etc.
Descriptive statistics: Mean, median, range
Correlation and regression
Understanding p-values and confidence intervals
Indexes:
  ◦ Dissimilarity (measures segregation)
  ◦ Diversity (measures population mix)
  ◦ Benford’s Law (used in forensic accounting)
  ◦ HHI (measures market competitiveness)

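Several of the Methods items reduce to short formulas, so a minimal Python sketch may help make them concrete. The function names and sample numbers below are hypothetical and chosen only for illustration. The diversity index uses one common form (the chance that two randomly chosen people belong to different groups), which is an assumption; the slides do not say which variant is meant.

```python
import math
from collections import Counter
from statistics import mean, median


def percentage_change(old, new):
    """Percentage change from old to new: (new - old) / old * 100."""
    return (new - old) / old * 100


def dissimilarity_index(group_a, group_b):
    """0.5 * sum(|a_i/A - b_i/B|) over areas; 0 = evenly mixed, 1 = fully segregated."""
    total_a, total_b = sum(group_a), sum(group_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(group_a, group_b))


def diversity_index(counts):
    """1 - sum(p_i^2): chance two random people are in different groups (assumed variant)."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


def hhi(market_shares_percent):
    """Herfindahl-Hirschman Index: sum of squared market shares, in percent (max 10,000)."""
    return sum(s ** 2 for s in market_shares_percent)


def benford_expected(digit):
    """Expected share of leading digit d under Benford's Law: log10(1 + 1/d)."""
    return math.log10(1 + 1 / digit)


def leading_digit_shares(values):
    """Observed share of each leading digit 1-9 among the nonzero values."""
    # Scientific notation puts the first significant digit at position 0.
    digits = [int(f"{abs(v):.15e}"[0]) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    salaries = [1, 2, 3, 100]                      # made-up numbers with one outlier
    print(mean(salaries), median(salaries))        # 26.5 2.5
    print(percentage_change(200, 250))             # 25.0
    print(dissimilarity_index([10, 0], [0, 10]))   # 1.0
    print(diversity_index([50, 50]))               # 0.5
    print(hhi([50, 30, 20]))                       # 3800
    print(round(benford_expected(1), 3))           # 0.301
```

To screen a set of figures against Benford’s Law, compare `leading_digit_shares(values)` with `benford_expected(d)` for each digit 1 through 9; a large gap is a prompt for more reporting, not proof of fraud.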
Search engine

Strategies for getting data
Which government office, in China or elsewhere, would have the data you want?
Look for “data” links on websites
Get on lists of agencies and organizations that interest you
Join Investigative Reporters & Editors (IRE)
Use search to find scholars who are experts in your topic of interest
Gather your own data!

Questions?