A city-based focus on data access – legal, ethical and technical aspects. Julia Lane, New York University
Key ideas: New data => new scale of questions; New sources => new scale of challenges; Both combined => major research AND engagement infrastructures
Outline Context Scale Access Training Next steps
Federal
City and State
More city data
Paradigm shift in scale of users: social scientists, government agency staff, computer scientists, private industry, citizen scientists
Paradigm shift in scale of uses: Research (social science, computer science, other domain sciences); Data analytics; Operations management; Evaluation; Policy analysis
New challenges: Define a research question (what are we measuring?); Think about what data are available and the measurement error (how are we measuring it?); Link datasets (what are we missing?); Statistical approaches (how can we draw inference?); Address privacy and confidentiality/ethics (are we protecting human subjects?); Disseminate results (can people be reidentified?). Need access. Need training.
And here is just one reason why (thanks to Stefan)
Probabilistic Matching Machine Learning
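The slide does not say which probabilistic matching method is meant. As a minimal sketch, assuming the widely used Fellegi-Sunter style of scoring, two records can be compared field by field and given a summed log-likelihood weight. The field names, the m- and u-probabilities, and the two thresholds below are all hypothetical values chosen for illustration; in practice they would be estimated from the data.

```python
import math

# Hypothetical m- and u-probabilities per field (assumed, not estimated):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_year": (0.98, 0.02),
    "zip_code": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 likelihood-ratio weights over the compared fields."""
    total = 0.0
    for field, (m, u) in params.items():
        a = str(rec_a.get(field, "")).strip().lower()
        b = str(rec_b.get(field, "")).strip().lower()
        if a and a == b:
            total += math.log2(m / u)              # agreement: positive weight
        else:
            # Disagreement (missing values are treated as disagreement here,
            # which is a simplifying assumption): negative weight.
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=8.0, lower=0.0):
    """Map a weight to a decision using two hypothetical thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible"  # would go to clerical review


a = {"last_name": "Rivera", "birth_year": 1980, "zip_code": "10003"}
b = {"last_name": "rivera", "birth_year": 1980, "zip_code": "10003"}
c = {"last_name": "Rivera", "birth_year": 1981, "zip_code": "10003"}
d = {"last_name": "Chen", "birth_year": 1975, "zip_code": "60601"}

print(classify(match_weight(a, b)))  # match
print(classify(match_weight(a, c)))  # possible
print(classify(match_weight(a, d)))  # non-match
```

A machine learning approach would instead learn the decision rule from labeled match and non-match pairs rather than from fixed per-field weights.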
Engagement in environment
Juliazon: Related to data you've viewed; New data similar to data you've used; What others have done with similar data (recipes); Recipes like yours. Thank you, Charlie Catlett
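One way to picture "new data similar to data you've used" is a simple co-usage recommender. The sketch below is an illustration only, not the system described on the slide: the `usage` mapping, the user names, and the dataset names are all hypothetical. It suggests datasets that other users with overlapping usage have worked with.

```python
from collections import Counter


def recommend(usage, user, top_n=3):
    """Rank datasets the user has not used by overlap-weighted co-usage.

    usage: dict mapping a user name to the set of dataset names they have used.
    """
    seen = usage.get(user, set())
    scores = Counter()
    for other, datasets in usage.items():
        if other == user:
            continue
        overlap = len(seen & datasets)
        if overlap == 0:
            continue  # no shared datasets, so no evidence of similarity
        for name in datasets - seen:
            scores[name] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical usage log.
usage = {
    "ana": {"jail_bookings", "mental_health_visits"},
    "ben": {"jail_bookings", "housing"},
    "cy": {"jail_bookings", "mental_health_visits", "transit"},
    "dee": {"transit"},
}

print(recommend(usage, "ana"))  # ['transit', 'housing']
```

The same scoring could be applied to "recipes" (analysis code) instead of datasets by swapping what the sets contain.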
Sandbox environment
11 million people move through 3,100 jails; $22 billion in costs; 64% suffer from mental illness, 68% have a substance abuse disorder, 44% suffer from chronic health problems
Machine learning systems can support targeted, preventative interventions to help people at risk of interactions with the criminal justice system. Mental Health Services / Jail & Criminal Justice: of the top 200 predicted individuals, 104 went to jail within 1 year (19 years total jail time)
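The "top 200" figure can be read as a precision-at-k measure: 104 of the 200 highest-risk predictions went to jail within a year, or 52%. The sketch below shows how such a number is computed from model scores and observed outcomes. The function name and the toy scores are hypothetical; the slide does not describe the model or its data.

```python
def precision_at_k(scores, outcomes, k):
    """Share of the k highest-scored individuals with a positive outcome."""
    if k <= 0:
        raise ValueError("k must be positive")
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    if not top:
        raise ValueError("no individuals to rank")
    return sum(1 for _, outcome in top if outcome) / len(top)


# Toy data (hypothetical): risk scores and whether the person was later jailed.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1]
toy_outcomes = [True, False, True, False, True]
toy_precision = precision_at_k(toy_scores, toy_outcomes, k=3)  # 2 of the top 3

# The slide's own figure, computed directly from its two numbers.
slide_precision = 104 / 200
print(slide_precision)  # 0.52
```

Choosing k to match the number of intervention slots available ties the metric to the operational decision rather than to overall model accuracy.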
Goals of the Course: 1. Train the workforce in rigorous and modern computational data analysis methods and tools for decision-making; 2. Develop new data products for government agencies; 3. Create new integrated data to address cross-agency challenges; 4. Establish new networks across agencies and geographies to address shared problems
Approach: The program provides hands-on projects with real microdata in a secure environment so that participants can learn the basics of how to: code and collect new data; work with spatial data; manage complex data; apply machine learning, text and network analysis; visualize relationships; address inference issues; manage privacy and confidentiality
First set of courses: data on ex-offenders, welfare recipients and veterans; data on housing and transportation. Trained staff, new products, new networks, joined-up datasets
Textbook
Combine forces: Technology, Data, Classes, Outreach
Identify key priority areas: Crime/justice, Welfare recipients, Veterans, …
Learn from other fields
One vision: a networked set of university/city institutes, interoperable with the federal statistical system and program areas; teams of city/federal staff working with researchers on empirical problems (like agricultural extension programs)
Key ideas: New data => new scale of questions; New sources => new scale of challenges; Both combined => major research AND engagement infrastructures
Thank you Julia.lane@nyu.edu