Visualization, analysis and mining of geo-spatial information in educational data sets using web-based tools
Aniruddha Desai | Winter 2013 Presentation
Center for Web and Data Science, University of Washington, Tacoma
Outline
– Motivation / Background
– Data sets for OSPI projects
– Goals: Visualization, Analysis, Mining
– Goals: Tools, Strategies
Motivation / Background
– Paper: “Top-k popular routes based on Foursquare check-ins”
– Paper: “Top-k optimal store locations based on Foursquare check-ins”
– Class Project: Using GPS traces created by a user to infer mode of transportation
– OSPI Project: Educational data
– Is the domain data-rich and tool-poor?
– Does the data have geo-spatial dimensions?
OSPI – RTI Reports
– The RTI (Response to Intervention) project website collects data using a standard rubric with a “rating scale”.
– Goal: Measuring the effectiveness of interventions at individual school sites across various metrics.
OSPI – SNP Reports
– The SNP (State Needs Projects) data dashboard collects data using web-based surveys.
– Goal: Measuring the effectiveness of professional development / training efforts in special needs education.
Collect, Analyze, Share Data
– Surveys
– Data
– Field Evidence
– Text
– Audio / Video
– Geographically distributed
– Managing participants
– Sharing data
– Analysis
– Multiple databases
– Variety of data types
Data Visualization Goals
– Choropleth maps visualize the number of educators trained by region at the state and national level (for SNP).
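A choropleth map needs each region's count mapped to a small number of color classes. The following is a minimal sketch of one common way to do that (quantile classes), not the project's actual implementation. The county names, counts, and palette are hypothetical placeholders.

```python
from bisect import bisect_right


def quantile_breaks(values, n_classes):
    """Return n_classes - 1 break points that split values into roughly equal-sized classes."""
    ordered = sorted(values)
    if n_classes < 2 or len(ordered) < n_classes:
        raise ValueError("need at least two classes and one value per class")
    return [ordered[(i * len(ordered)) // n_classes] for i in range(1, n_classes)]


def classify(value, breaks):
    """Map a value to a class index: 0 is the lowest class, len(breaks) the highest."""
    return bisect_right(breaks, value)


# Hypothetical participant counts per county (placeholder numbers, not OSPI data).
trained_by_county = {
    "County A": 12,
    "County B": 140,
    "County C": 310,
    "County D": 95,
    "County E": 48,
    "County F": 77,
}

# Hypothetical light-to-dark palette, one color per class.
palette = ["#deebf7", "#9ecae1", "#3182bd"]

breaks = quantile_breaks(trained_by_county.values(), len(palette))
classes = {county: classify(count, breaks) for county, count in trained_by_county.items()}
fill_colors = {county: palette[cls] for county, cls in classes.items()}
```

Quantile classes keep a few very large regions from washing out the rest of the map. Equal-interval breaks would be a reasonable alternative when the counts are evenly spread.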
Data Visualization Goals
– Interactive regions on the choropleth map display more information, for example:
  Lincoln County
  No. of Participants Trained: X
  No. of Responses Rcvd: X’
  Population Density: Y
  Number of Trainings: Z
  Go to Results
Data Visualization Goals
– Visualize data points at individual school sites by zooming in.
– RTI data is spread across several districts / counties across the state.
Data Visualization Goals
– At a low zoom factor, mouse-overs display more information about data points and link to RTI results, for example:
  Tahoma School District
  ESD No
  Go to Results
Data Analysis / Mining Goals
Can visualizations answer some of these questions?
– Can we predict which area needs more professional development training next year?
– Are the response rate on surveys and the participant attendance rate correlated?
– Given a high volume and variety of data, some of it geo-spatial (survey responses / qualitative assessments / user zip codes / school locations / district boundaries), how do we extract useful information?
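The correlation question above can be checked with a Pearson correlation coefficient once both rates are available per training. This is a minimal sketch under that assumption. The function name and the sample rates are hypothetical and not taken from the SNP data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-training rates (placeholder numbers, not SNP data).
response_rate = [0.40, 0.55, 0.62, 0.70, 0.81]
attendance_rate = [0.50, 0.58, 0.66, 0.75, 0.90]

r = pearson(response_rate, attendance_rate)
```

A value of `r` near 1 or -1 suggests a strong linear relationship, and a value near 0 suggests little linear relationship. Correlation alone does not show that one rate drives the other.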
Data Analysis / Mining Goals
– Are demographic data (census), income levels, crime statistics and employment rates related to:
  the outcomes of intervention (for RTI)?
  the quality of professional development (for SNP)?
– Data collection, reporting and visualization are the first step; finding patterns is potentially the next step.
– How do the visualizations scale up from the state to the national level?
Tools
– Drupal CMS (already in place)
– Google Maps API
– Gmap module to create an interface to the Google Maps API within Drupal
– D3.js (Data-Driven Documents) visualizations such as heat maps, choropleth maps, bar charts, pie charts
– OpenStreetMap API (Drupal integration?)
Strategy
– Implement geographic map-based visualizations with an appropriate amount of information at different zoom factors.
– At a high level of granularity, link data points on the map to bar charts / reports for more detail.
– Analyze data visualizations for patterns.
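One way to show an appropriate amount of information at each zoom factor is to aggregate school sites into county or district markers when zoomed out and show individual sites when zoomed in. The sketch below illustrates that idea only. The zoom thresholds, field names, site names, and coordinates are hypothetical assumptions.

```python
from collections import defaultdict


def markers_for_zoom(sites, zoom, county_max_zoom=7, district_max_zoom=10):
    """Return map markers aggregated by county, by district, or per site, depending on zoom."""
    if zoom <= county_max_zoom:
        key = "county"
    elif zoom <= district_max_zoom:
        key = "district"
    else:
        # Zoomed in far enough: one marker per school site.
        return [
            {"label": s["name"], "lat": s["lat"], "lon": s["lon"], "count": 1}
            for s in sites
        ]

    groups = defaultdict(list)
    for site in sites:
        groups[site[key]].append(site)

    markers = []
    for label, members in groups.items():
        # Place the aggregate marker at the mean position of its member sites.
        markers.append({
            "label": label,
            "lat": sum(m["lat"] for m in members) / len(members),
            "lon": sum(m["lon"] for m in members) / len(members),
            "count": len(members),
        })
    return markers


# Hypothetical school sites (placeholder names and coordinates, not RTI data).
sites = [
    {"name": "School 1", "district": "District 1", "county": "County 1", "lat": 47.4, "lon": -122.0},
    {"name": "School 2", "district": "District 1", "county": "County 1", "lat": 47.6, "lon": -122.2},
    {"name": "School 3", "district": "District 2", "county": "County 1", "lat": 47.3, "lon": -122.3},
    {"name": "School 4", "district": "District 3", "county": "County 2", "lat": 47.2, "lon": -122.5},
]

state_view = markers_for_zoom(sites, zoom=6)
district_view = markers_for_zoom(sites, zoom=9)
site_view = markers_for_zoom(sites, zoom=12)
```

Each marker's `count` could drive the marker size or a tooltip, and the `label` could link to the matching bar chart or report.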
Thank you! Q&A