Heuristic Search and Information Visualization Methods for School Redistricting
Marie desJardins, Blazej Bulka, Ryan Carr, Andrew Hunt, Priyang Rathod, and Penny Rheingans
University of Maryland Baltimore County
This work was partially supported by NSF # IIS
Thanks to David Drown and the Howard County Public School System for data and valuable input.
July 18,
Overview
  The Problem: School Redistricting
  Searching for Good Plans
  Results
  Future Work and Conclusions
The Problem: School Redistricting
School Redistricting
  Assign neighborhoods (or planning polygons) within a school district to schools while considering multiple factors, such as busing costs, test score distribution, and school utilization
  Finding the best assignment (or plan) is a multiattribute optimization problem
  Also want to generate qualitatively different plans that represent tradeoffs among the criteria, and help users visualize these tradeoffs
  Search space is very large: O(s^p), where
    s is the number of schools (12 high schools; 30 elementary)
    p is the number of polygons (~263)
  Currently in Howard County, Maryland, the process is almost entirely manual
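To give a feel for the O(s^p) bound above, the following minimal sketch counts the decimal digits of s^p for the figures quoted on the slide. It assumes every polygon may be assigned to any school independently and that roughly 263 polygons apply at both school levels; the function name `plan_count_digits` is hypothetical.

```python
def plan_count_digits(num_schools: int, num_polygons: int) -> int:
    """Return the number of decimal digits in num_schools ** num_polygons.

    Assumption: each polygon can be assigned to any school independently,
    so the number of candidate plans is num_schools ** num_polygons.
    """
    return len(str(num_schools ** num_polygons))


if __name__ == "__main__":
    # Figures from the slide: 12 high schools, 30 elementary schools, ~263 polygons.
    for label, schools in (("high", 12), ("elementary", 30)):
        digits = plan_count_digits(schools, 263)
        print(f"{label}: about 10^{digits - 1} candidate plans")
```

Even the smaller high-school case has a plan count with hundreds of digits, which is why exhaustive enumeration is not an option.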
Evaluation Criteria
  Educational benefits for students
  Frequency with which students are redistricted
  Number and distance of students bused
  Total busing cost
  Demographics and academic performance of schools
  Number of students redistricted
  Maintenance of feeder patterns
  Changes in school capacity
  Impact on specialized programs
  Functional and operational capacity of school infrastructure
  Building utilization
Evaluation Criteria
  1. Number of students bused
     Students who can walk to a school should be assigned to that school
  2. Busing cost
     Estimated as a population-weighted sum of polygon-school distances
  3. Demographics
     FARM (Free and Reduced Meal) ratio at each school should ideally be the same as that of the county as a whole
  4. Academic performance
     MSA (Maryland State Assessment) scores at each school should ideally be the same as those of the county as a whole
  5. Capacity
     Each school should be between 90% and 110% of available capacity
  Penalty functions are defined for each of the five criteria above: per-polygon, per-school, and per-plan cost measures ranging from 0 (good) to 1 (bad)
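The slide gives only the targets and the 0-to-1 range of the penalty functions, not their exact shape. The sketch below shows one plausible form for two of them. The linear ramp outside the 90% to 110% band, the `ramp` width, and all function and parameter names are assumptions made for illustration, not the authors' definitions.

```python
def capacity_penalty(enrollment, capacity, low=0.90, high=1.10, ramp=0.25):
    """Per-school capacity penalty in [0, 1].

    0 inside the 90%-110% utilization band named on the slide.
    Assumption: outside the band the penalty grows linearly and saturates
    at 1 once utilization is `ramp` beyond the nearest band edge.
    """
    utilization = enrollment / capacity
    if low <= utilization <= high:
        return 0.0
    gap = (low - utilization) if utilization < low else (utilization - high)
    return min(1.0, gap / ramp)


def busing_cost(assignment, population, distance):
    """Population-weighted sum of polygon-to-school distances.

    assignment: dict polygon -> school
    population: dict polygon -> number of students
    distance:   dict polygon -> dict school -> distance
    This raw cost is not yet normalized to [0, 1].
    """
    return sum(
        population[polygon] * distance[polygon][school]
        for polygon, school in assignment.items()
    )
```

A raw busing cost like this would still need to be scaled, for example against the worst case over all schools, before it could be combined with the other 0-to-1 penalties.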
Selecting Multiple Plans: Diversity
  One plan dominates another if it is better along all dimensions
  Two plans are incomparable if each is better than the other along at least one dimension
  A good set of plans should:
    contain no dominated plans
    consist of qualitatively different plans
  We measure “qualitatively different” using Euclidean distance in the evaluation space:
    Div(P) = \frac{1}{|P|} \sum_{p_i, p_j \in P} \mathrm{Dist}(p_i, p_j)
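The definitions above can be sketched directly on evaluation vectors, that is, tuples of per-criterion penalties where lower is better. This is a minimal illustration, not the authors' code. Two assumptions: "better along all dimensions" is read as strictly lower on every criterion, and the sum in Div(P) is taken over unordered pairs (summing over ordered pairs would double the value).

```python
import math
from itertools import combinations


def dominates(a, b):
    """True if plan `a` is strictly better (lower penalty) than `b` on every criterion."""
    return all(x < y for x, y in zip(a, b))


def incomparable(a, b):
    """True if each plan is better than the other on at least one criterion."""
    a_wins = any(x < y for x, y in zip(a, b))
    b_wins = any(y < x for x, y in zip(a, b))
    return a_wins and b_wins


def diversity(plans):
    """Div(P): sum of pairwise Euclidean distances divided by |P|.

    Assumption: each unordered pair of plans is counted once.
    """
    if not plans:
        return 0.0
    total = sum(math.dist(p, q) for p, q in combinations(plans, 2))
    return total / len(plans)


def nondominated(plans):
    """Keep only the plans that no other plan in the set dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]
```

Under the strict reading of dominance, two plans that tie on one criterion are neither dominated nor incomparable; a weaker definition (no worse everywhere, better somewhere) would remove that gap.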
Closest-School Plan
  [Figure: map of the closest-school plan; labeled school: Marriottsville High (new)]
Closest [outer] vs. Recommended [inner]
Searching for Good Plans
Multiattribute Optimization
  Previous approaches to multiattribute optimization:
    Weighted methods: Combine attributes into a single weighted sum
    Priority-based methods: Optimize one attribute, then perform constrained optimization on the other attributes
    MOA* (and variations): Find all nondominated solutions using heuristic search
    Evolutionary methods: Use genetic search to explore the population space using recombination and fitness-based selection
  Redistricting domain:
    No single set of weights or prioritization scheme
    Very large search space, so we can’t find all (or even most) nondominated solutions
    Use local search
Basic Hill-Climbing
  Baseline: Choose an initial plan as a starting point (seed), then hill-climb through “[weighted] sum of criteria” space
  Seed options:
    closest-school plan
    current plan
    random plan
    “breadth-first” assignment
    “minimum-spanning-tree” assignment
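The baseline can be sketched as steepest-descent local search over a weighted sum of penalties (lower is better, so "climbing" here means reducing the sum). The slide does not say how neighboring plans are generated; moving a single polygon to a different school is an assumption, as are all names below. A realistic version would also restrict moves, for example to keep attendance areas contiguous.

```python
def weighted_sum(penalties, weights):
    """Collapse a tuple of per-criterion penalties into a single score."""
    return sum(w * p for w, p in zip(weights, penalties))


def neighbors(plan, schools):
    """Yield every plan that differs from `plan` by one polygon's school.

    Assumption: this single-reassignment move is illustrative only.
    """
    for polygon, current in plan.items():
        for school in schools:
            if school != current:
                candidate = dict(plan)
                candidate[polygon] = school
                yield candidate


def hill_climb(seed, schools, evaluate, weights):
    """Repeatedly move to the best-scoring neighbor until none improves.

    seed:     dict polygon -> school (e.g. the closest-school or current plan)
    evaluate: hypothetical callback, plan -> tuple of penalties in [0, 1]
    Returns the locally optimal plan and its weighted score.
    """
    current = dict(seed)
    current_score = weighted_sum(evaluate(current), weights)
    while True:
        best, best_score = None, current_score
        for candidate in neighbors(current, schools):
            score = weighted_sum(evaluate(candidate), weights)
            if score < best_score:
                best, best_score = candidate, score
        if best is None:  # no neighbor improves: local optimum reached
            return current, current_score
        current, current_score = best, best_score
```

Running this from several seeds gives the unweighted baseline when all weights are equal, and varying `weights` between runs gives the weighted one.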
Biased Hill-Climbing
  General approach:
    Choose an initial plan as a seed
    Hill-climb through “dominated plan” space
    Save “incomparable plans” as they are encountered
    Stop at a local maximum
    Restart search starting from a plan in the incomparable list
  Blind bias: At restart, choose a plan from the incomparable list at random
  Diversity bias: At restart, choose the plan that is farthest in evaluation space from the solutions found so far
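The steps above can be sketched as follows. This is a rough illustration under several assumptions, not the authors' implementation: neighbors differ by one polygon's assignment, the search moves to the first neighbor that strictly dominates the current plan, and "farthest from the solutions found so far" is read as the largest minimum Euclidean distance to any solution already found. The `evaluate` callback and all function names are hypothetical.

```python
import math
import random


def dominates(a, b):
    return all(x < y for x, y in zip(a, b))


def incomparable(a, b):
    return any(x < y for x, y in zip(a, b)) and any(y < x for x, y in zip(a, b))


def neighbors(plan, schools):
    # Assumed move: reassign a single polygon to a different school.
    for polygon, current in plan.items():
        for school in schools:
            if school != current:
                candidate = dict(plan)
                candidate[polygon] = school
                yield candidate


def climb(seed, schools, evaluate, saved):
    """Move to dominating neighbors until none exists; save incomparable ones."""
    current, current_eval = dict(seed), evaluate(seed)
    while True:
        move = None
        for candidate in neighbors(current, schools):
            cand_eval = evaluate(candidate)
            if dominates(cand_eval, current_eval):
                move = (candidate, cand_eval)
                break
            if incomparable(cand_eval, current_eval):
                saved.append((candidate, cand_eval))
        if move is None:  # local optimum in "dominated plan" space
            return current, current_eval
        current, current_eval = move


def pick_restart(saved, found, bias, rng):
    """Remove and return the next restart plan from the incomparable list."""
    if bias == "blind":
        return saved.pop(rng.randrange(len(saved)))

    def distance_to_found(entry):
        # Assumed reading of "farthest": maximize the minimum distance.
        return min(math.dist(entry[1], ev) for _, ev in found)

    index = max(range(len(saved)), key=lambda i: distance_to_found(saved[i]))
    return saved.pop(index)


def biased_hill_climb(seed, schools, evaluate, restarts=5, bias="diversity", rng=None):
    """Return a list of (plan, evaluation) local optima, one per climb."""
    rng = rng or random.Random(0)
    saved, found = [], []
    start = seed
    for _ in range(restarts + 1):
        found.append(climb(start, schools, evaluate, saved))
        if not saved:
            break
        start = pick_restart(saved, found, bias, rng)[0]
    return found
```

Passing `bias="blind"` selects restart plans at random from the saved list; the default diversity bias steers each restart toward regions of the evaluation space not yet covered.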
Results
Quality of Generated Plans
  Quality of generated plans is better than manually generated plans
    ...with respect to the particular evaluation criteria we’ve defined
  Original-plan seed does better than closest-plan seed
    Closest-plan seed leads to the “wrong” local maximum
  Compared to recommended plan, generated plans generally perform:
    ...better with respect to capacity
    ...comparably with respect to socioeconomic and academic measures
    ...better with respect to busing costs
    ...comparably with respect to walk utilization
  [Figure legend: Outer: Recommended; Inner: Best generated]
Diversity of Generated Plans
  Baselines:
    Manual plans: closest, recommended, and alternative (diversity measure = )
    Unweighted hill-climbing: multiple runs of basic hill-climbing with different initial seeds (diversity measure = )
    Weighted hill-climbing: multiple runs of basic hill-climbing with different weight vectors (diversity measure = )
  Biased hill-climbing:
    with blind bias: diversity measure =
    with diversity bias: diversity measure =
  Note: Biased hill-climbing yields a somewhat worse average unweighted sum, so the plans are not quite “as good” in a direct comparison
Future Work and Conclusions
Future Work
  Modeling additional evaluation criteria:
    Feeder statistics
    Redistricting frequency
  Incorporating projected future demographic shifts into evaluation, search, and visualization
  Extensions to search methods:
    Other definitions of diversity (e.g., dispersion, similar to k-means mean-squared error)
    Other multiattribute optimization methods (particularly genetic methods)
  Visualization extensions:
    Visualizing feeder patterns
    Computing and visualizing gradients in search space
  Deployment and user testing
Conclusions
  School redistricting is an important and challenging problem
  The multiattribute optimization framework is a good paradigm for this application
  Novel search techniques and evaluation methods are needed
  Diversity-biased hill climbing is a promising initial approach