Predicting permit activity with cellular automata calibrated with genetic algorithms
Sushil J. Louis, Department of Computer Science
Gary Raines, US Geological Survey
Outline
- What is the problem? Calibrating a CA
- What is the technique? Genetic Algorithm
- What are the issues? Discretization, Encoding, Evaluation
- What are our results?
What is the problem?
- Project mineral-related activity on public land to 2010
- Predicting permit activity in an area
- Spatially explicit
- USGS and others have data on permit activity from 1989 to 1998, as well as data on natural resources
- Use cellular automata to model (predict) mineral activity over the next ten years
- Problem: it takes weeks to tune CA rules to match the available data
What is the problem?
- Can we automate calibrating a cellular automaton?
  - As good as a CA calibrated by a human
  - In the same or less time
Model calibration as search
- Search through the space of possible model parameters to find a parameter set that fits the observed data
- Many search methods exist
- We use genetic algorithms
Genetic Algorithms
- For poorly understood problems (Holland, '75; Goldberg, '89)
- Empirical evidence supports their use in this kind of problem
- Physics models: Physical Review Letters, Volume 88, Issue 4; Journal of Quantitative Spectroscopy and Radiative Transfer, Volume 75, 2002
- Seismic models: Congress on Evolutionary Computation 1999
- Hydrology models: in progress
- Proceedings of GECCO, CEC, …
Genetic algorithm calibration
What is a GA?
- Randomized, parallel search
- Models natural selection
- Population based
- Uses fitness to guide search
Genetic algorithm search
Genetic Algorithm
  Randomly initialize P(0) with candidate parameter sets
  Loop
    Select P(t+1) from P(t)
    Crossover and Mutate P(t+1)
    Evaluate P(t+1) (run CA model)
    t = t+1
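The loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `run_ga`, binary tournament selection, one-point crossover, the mutation probability of 0.01, and the generation count are all assumptions, since the slides do not specify them. In the real setting `fitness` would run the CA model and score its match to the observed permit data; here a toy fitness stands in for it.

```python
import random

def run_ga(fitness, n_bits=36, pop_size=50, generations=40,
           p_cross=1.0, p_mut=0.01, seed=0):
    """Minimal GA over fixed-length bit strings (illustrative sketch)."""
    rng = random.Random(seed)
    # Randomly initialize P(0) with candidate parameter sets.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]

    def pick():
        # Binary tournament selection (assumed; the slides do not name an operator).
        a, b = rng.randrange(pop_size), rng.randrange(pop_size)
        return pop[a] if scores[a] >= scores[b] else pop[b]

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = pick()[:], pick()[:]
            if rng.random() < p_cross:
                # One-point crossover (assumed operator).
                cut = rng.randrange(1, n_bits)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                # Flip each bit independently with probability p_mut (assumed value).
                children.append([b ^ 1 if rng.random() < p_mut else b for b in child])
        children = children[:pop_size]
        child_scores = [fitness(c) for c in children]
        # Elitist step: next generation is the best of parents and offspring.
        merged = sorted(zip(scores + child_scores, pop + children),
                        key=lambda pair: pair[0], reverse=True)[:pop_size]
        scores = [s for s, _ in merged]
        pop = [ind for _, ind in merged]

    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

# Toy usage: maximize the number of 1 bits instead of running a CA.
best_bits, best_score = run_ga(sum)
```

Because the survivor step keeps the best of parents and offspring, the best score never decreases from one generation to the next.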
Modified Annealed Voting Rule
Probability of Life in Next Generation

Number of Live Neighbors    Status of Center Cell
                            Alive                   Dead
> Annealing Window          Very Likely             Likely
Annealing Window            Likely                  Somewhat Likely
< Annealing Window          Very Somewhat Likely    Unlikely
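One way to read the table is as a lookup followed by a random draw per cell. The sketch below makes several assumptions that the slides do not state: an 8-cell Moore neighborhood, non-wrapping edges, an annealing window given by inclusive integer bounds `bottom` and `top`, and a `probs` dictionary holding the five probabilities. It also leaves out the resource threshold. The names `life_probability` and `step` are hypothetical.

```python
import random

def life_probability(live_neighbors, center_alive, bottom, top, probs):
    """Look up the probability of life from the voting-rule table."""
    if live_neighbors > top:        # above the annealing window
        return probs["very_likely"] if center_alive else probs["likely"]
    if live_neighbors >= bottom:    # inside the annealing window
        return probs["likely"] if center_alive else probs["somewhat_likely"]
    # below the annealing window
    return probs["very_somewhat_likely"] if center_alive else probs["unlikely"]

def step(grid, bottom, top, probs, rng):
    """Advance a 0/1 grid by one generation (Moore neighborhood assumed)."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count live cells among the up-to-8 neighbors; edges do not wrap.
            n = sum(grid[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))
                    if (rr, cc) != (r, c))
            p = life_probability(n, grid[r][c] == 1, bottom, top, probs)
            new[r][c] = 1 if rng.random() < p else 0
    return new

# Usage with made-up probabilities and window bounds.
example_probs = {"very_likely": 0.95, "likely": 0.9, "somewhat_likely": 0.5,
                 "very_somewhat_likely": 0.7, "unlikely": 0.05}
next_grid = step([[0, 1, 0], [1, 1, 0], [0, 0, 1]], 3, 5,
                 example_probs, random.Random(0))
```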
Definitions of Parameters

Parameter               Definition
Very Likely             Square root of Likely (larger)
Likely                  A high probability of life
Somewhat Likely         An intermediate probability of life
Very Somewhat Likely    Square root of Somewhat Likely (larger)
Unlikely                A low probability of life
Resource Threshold      Minimum fuzzy membership defining where a reasonable explorationist would explore
Anneal Window           Position and width control response of CA
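The two derived entries follow directly from the definitions: the square root of a probability strictly between 0 and 1 is larger than the probability itself, which is why the "Very" variants come out larger. A small sketch, with the hypothetical helper name `probability_table`:

```python
import math

def probability_table(likely, somewhat_likely, unlikely):
    """Build the five rule probabilities from the three free ones."""
    for name, p in (("likely", likely),
                    ("somewhat_likely", somewhat_likely),
                    ("unlikely", unlikely)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1], got {p}")
    return {
        "very_likely": math.sqrt(likely),                    # derived
        "likely": likely,
        "somewhat_likely": somewhat_likely,
        "very_somewhat_likely": math.sqrt(somewhat_likely),  # derived
        "unlikely": unlikely,
    }

# Example values are made up for illustration.
table = probability_table(0.81, 0.25, 0.05)
```

With these example inputs, Very Likely comes out at about 0.9 and Very Somewhat Likely at 0.5.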
GA Encoding
- A GA usually works with string structures, each representing a candidate solution
- String fields: top | bottom | likely | slikely | unlikely | rt
- 2^36 = 64 Gig possibilities
- Fitness = scaled match to observed data
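A 36-bit string with six named fields could be decoded as in the sketch below. The even split into 6 bits per field is an assumption (the slides give only the total of 2^36 possibilities), as is the reading of `top` and `bottom` as annealing-window bounds, `slikely` as Somewhat Likely, and `rt` as Resource Threshold. The helpers `decode` and `to_unit_interval` are hypothetical.

```python
FIELDS = ("top", "bottom", "likely", "slikely", "unlikely", "rt")
BITS_PER_FIELD = 6  # assumption: 36 bits divided evenly among six fields

def decode(bits):
    """Split a 36-element 0/1 list into six unsigned integers (0..63)."""
    expected = len(FIELDS) * BITS_PER_FIELD
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")
    values = {}
    for i, name in enumerate(FIELDS):
        chunk = bits[i * BITS_PER_FIELD:(i + 1) * BITS_PER_FIELD]
        values[name] = int("".join(str(b) for b in chunk), 2)
    return values

def to_unit_interval(value):
    """Map a raw field value onto [0, 1], e.g. for a probability (assumed scaling)."""
    return value / (2 ** BITS_PER_FIELD - 1)

# Size of the search space for a 36-bit string.
search_space = 2 ** 36   # 68,719,476,736, i.e. 64 * 2**30
```

This discretization is what makes the search space finite: each probability can take only 64 distinct values under the assumed field width.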
GA Parameters
- Population size: 50
- Elitist selection: next generation is the best of parents and offspring
- Probability of crossover: 1.00
- Probability of mutation
- Fitness scaling: 1.05
Model parameters
- 496 × 503 = 249,488 cell CA
- 4 or 5 years (iterations)
- Average over 3 runs
- Cell data imported from GIS
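Because the CA rule is probabilistic, a single run gives a noisy score, which is presumably why results are averaged over 3 runs. The sketch below shows one way to do that. The match measure (fraction of cells that agree with the observed grid) is an assumption, as the slides say only "scaled match to observed data", and `simulate` is a hypothetical callable that runs the CA with a given random generator and returns its final 0/1 grid.

```python
import random

def match_fraction(predicted, observed):
    """Fraction of cells where the predicted grid equals the observed grid."""
    total = 0
    agree = 0
    for prow, orow in zip(predicted, observed):
        for p, o in zip(prow, orow):
            total += 1
            agree += (p == o)
    return agree / total

def averaged_fitness(simulate, observed, n_runs=3, seed=0):
    """Average the match over several independently seeded CA runs."""
    scores = []
    for run in range(n_runs):
        rng = random.Random(seed + run)   # a different random stream per run
        scores.append(match_fraction(simulate(rng), observed))
    return sum(scores) / n_runs

# Toy usage: a "simulation" that ignores the rule and guesses each cell at random.
observed = [[1, 0, 0], [0, 1, 0]]
guess = lambda rng: [[rng.randint(0, 1) for _ in row] for row in observed]
score = averaged_fitness(guess, observed)
```

On the full 496 × 503 grid each evaluation touches 249,488 cells per iteration, so the number of runs trades evaluation noise against GA run time.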
Results
- GA produces good parameter values (20% better than human)
- GA is a viable tool for model exploration
- Many different parameter sets give about the same fit?
- Modeling rare events?
Cross-Tabulation
Number of cells, by CA Trace Sum and Actual Trace Sum
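A cross-tabulation of this kind can be computed as sketched below. The sketch assumes that a cell's "trace sum" is the number of years in which it was active, which is an interpretation rather than something the slides define. The function names `trace_sum` and `cross_tabulate` are hypothetical.

```python
from collections import Counter

def trace_sum(yearly_grids):
    """Per-cell count of years in which the cell was active (assumed meaning)."""
    rows, cols = len(yearly_grids[0]), len(yearly_grids[0][0])
    return [[sum(grid[r][c] for grid in yearly_grids) for c in range(cols)]
            for r in range(rows)]

def cross_tabulate(ca_sum, actual_sum):
    """Count cells for each (CA trace sum, actual trace sum) pair."""
    table = Counter()
    for ca_row, actual_row in zip(ca_sum, actual_sum):
        for ca_value, actual_value in zip(ca_row, actual_row):
            table[(ca_value, actual_value)] += 1
    return table

# Toy usage with two years of made-up 2 x 2 grids.
ca_years = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
actual_years = [[[1, 0], [0, 1]], [[0, 0], [0, 1]]]
table = cross_tabulate(trace_sum(ca_years), trace_sum(actual_years))
```

Cells on the diagonal, where both sums agree, are the ones the CA matched; off-diagonal counts show where it over- or under-predicted activity.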