Evolving Pre-Sort Algorithms
Joe Barker

Why Sorting?
Sorting is used everywhere
Classical methods for developing new algorithms have been pretty much mined out
Genetic Programming (GP) will allow us to find new improvements

Why Pre-Sorting?
When evolving programs, the question of generality arises
How do we know the solution works for all possible inputs?
Solution: Provide a method for graceful failure
  –GP creates pre-sort algorithms and classic sorts are used to “clean up”

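The graceful-failure idea can be sketched as a two-stage pipeline: run the evolved pre-sort, then let a classic sort finish the job and count what it costs. This is a minimal sketch, not the author's implementation. The names `bubble_sort_counted`, `sort_with_presort`, and `reverse_presort` are hypothetical, and the hand-written `reverse_presort` merely stands in for an evolved program.

```python
from typing import Callable, List, Tuple


def bubble_sort_counted(data: List[int]) -> Tuple[int, int]:
    """Bubble-sort data in place and return (compares, swaps).

    Stops after the first pass that makes no swap, so an already
    sorted array costs only len(data) - 1 compares.
    """
    compares = swaps = 0
    n = len(data)
    while True:
        swapped = False
        for i in range(n - 1):
            compares += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swaps += 1
                swapped = True
        n -= 1  # the largest remaining element is now in place
        if not swapped or n <= 1:
            break
    return compares, swaps


def sort_with_presort(
    data: List[int], presort: Callable[[List[int]], None]
) -> Tuple[List[int], int, int]:
    """Run a pre-sort, then clean up with bubble sort.

    The result is always sorted, even if the pre-sort does a poor job;
    a poor pre-sort only makes the cleanup more expensive.
    """
    work = list(data)
    presort(work)  # assumed to only permute elements in place
    compares, swaps = bubble_sort_counted(work)
    return work, compares, swaps


def reverse_presort(a: List[int]) -> None:
    """Hypothetical stand-in for an evolved pre-sort: reverse the array."""
    a.reverse()
```

On the reversed input `[5, 4, 3, 2, 1]`, the cleanup after `reverse_presort` costs 4 compares and 0 swaps, against 10 compares and 10 swaps with no pre-sort at all.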
Previously
John Koza did major work on GP
Kenneth Kinnear did some work evolving sort algorithms
No obvious prior work on pre-sort algorithms

Previously (2)
Kinnear’s work mostly examined the node types (and generality)
  –Action nodes:
      Non-terminal: dobl, if, if-lt
      Terminal: order, swap
  –Expression nodes:
      Non-terminal: e-, e1-, e1+, wismaller, wislarger, less
      Terminal: *len*, index

Previously (3)
  –Genetic Operators
      Single Crossover
      Mutation
      Non-Fitness Single Crossover
      Hoist
      Create

Design - Individual
Actions
  –Non-Terminals: LOOP, DUAL
  –Terminals: ORDER, SWAP
Expressions
  –Non-Terminals: +, -, *, /, %, LARGER, SMALLER
  –Terminals: SIZE, I(0-3), CONST(0-10)

Design - Fitness
Every individual
All individuals use the same input
Individual execution is terminated if the number of operations exceeds a limit
Weighted sum of:
  –Compares in individual and “cleanup” sort
  –Swaps
  –Arithmetic operations executed

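The weighted sum and the operation limit might look like the sketch below. The weights, the `OpCounts` and `Budget` types, and the exception name are all assumptions made for illustration; the slides give no concrete values. Since the score adds up operation counts, it is presumably a cost, with lower being better.

```python
from dataclasses import dataclass


@dataclass
class OpCounts:
    compares: int = 0    # compares in the individual plus the cleanup sort
    swaps: int = 0
    arithmetic: int = 0  # arithmetic operations executed by the individual


class OperationLimitExceeded(Exception):
    """Raised to terminate an individual that runs too long."""


class Budget:
    """Counts operations and aborts execution once a limit is passed."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts = OpCounts()

    def charge(self, kind: str) -> None:
        # kind is one of "compares", "swaps", "arithmetic"
        setattr(self.counts, kind, getattr(self.counts, kind) + 1)
        total = self.counts.compares + self.counts.swaps + self.counts.arithmetic
        if total > self.limit:
            raise OperationLimitExceeded(f"more than {self.limit} operations")


def fitness(
    counts: OpCounts,
    w_compare: float = 1.0,  # assumed weight
    w_swap: float = 1.0,     # assumed weight
    w_arith: float = 0.5,    # assumed weight
) -> float:
    """Weighted sum of operation counts (assumed: lower is better)."""
    return (
        w_compare * counts.compares
        + w_swap * counts.swaps
        + w_arith * counts.arithmetic
    )
```

An interpreter for the evolved trees would call `charge` on every compare, swap, and arithmetic node it executes, and the caller would catch `OperationLimitExceeded` and assign a penalty score of its choosing.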
Design - Fitness (2)
Input
  –Array of random size n
  –Initially either in-order or reverse-order
  –c*n random swaps performed

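The input recipe can be sketched in a few lines. The function name `make_input` and its parameters are hypothetical; the caller is assumed to pick the random size `n` and pass a seeded generator, so that all individuals can be scored on the same array.

```python
import random
from typing import List


def make_input(n: int, c: float, reverse: bool, rng: random.Random) -> List[int]:
    """Build a test array: ordered values disturbed by c*n random swaps.

    Starts from 0..n-1 in order (or reversed), then swaps int(c * n)
    randomly chosen pairs of positions. A pair may pick the same
    position twice, in which case that swap is a no-op.
    """
    data = list(range(n))
    if reverse:
        data.reverse()
    for _ in range(int(c * n)):
        i = rng.randrange(n)
        j = rng.randrange(n)
        data[i], data[j] = data[j], data[i]
    return data


# Example: one shared input with a random size between 10 and 50.
rng = random.Random(42)
shared_input = make_input(rng.randint(10, 50), c=0.1, reverse=False, rng=rng)
```

Under this reading, a small `c` leaves the array nearly sorted or nearly reverse-sorted, while a large `c` approaches a random permutation.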
Design - Evolution
Selection: Blocked-rank selection, only mature individuals
Crossover
  –Make a list of sub-trees of each parent within a [min,max] “expected” depth
  –Choose a tree uniformly from each list to swap
Mutation
  –Create a new sub-tree whose max. depth is that of the sub-tree being replaced
N-F Crossover, Hoist

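The two crossover steps can be sketched as follows. Two assumptions are made here: "depth" is taken to mean the distance of a sub-tree's root from the tree root, and node types are ignored, whereas a real implementation would presumably only swap an action with an action and an expression with an expression. The `Node` class and function names are hypothetical.

```python
import copy
import random
from typing import List, Optional, Tuple


class Node:
    def __init__(self, label: str, children: Optional[List["Node"]] = None) -> None:
        self.label = label
        self.children = children or []


def size(node: Node) -> int:
    """Number of nodes in the tree rooted at node."""
    return 1 + sum(size(c) for c in node.children)


def slots_in_depth_range(root: Node, lo: int, hi: int) -> List[Tuple[Node, int]]:
    """List (parent, child_index) for every sub-tree at depth lo..hi.

    The root is at depth 0 and is never listed, since it has no parent.
    """
    found: List[Tuple[Node, int]] = []

    def walk(node: Node, depth: int) -> None:
        for i, child in enumerate(node.children):
            if lo <= depth + 1 <= hi:
                found.append((node, i))
            walk(child, depth + 1)

    walk(root, 0)
    return found


def crossover(
    a: Node, b: Node, lo: int, hi: int, rng: random.Random
) -> Tuple[Node, Node]:
    """Swap one uniformly chosen sub-tree from each parent's list.

    Parents are copied first, so the originals are left untouched.
    If either list is empty, the copies are returned unchanged.
    """
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    slots_a = slots_in_depth_range(a, lo, hi)
    slots_b = slots_in_depth_range(b, lo, hi)
    if not slots_a or not slots_b:
        return a, b
    pa, ia = rng.choice(slots_a)
    pb, ib = rng.choice(slots_b)
    pa.children[ia], pb.children[ib] = pb.children[ib], pa.children[ia]
    return a, b
```

Restricting the candidate lists to a depth window is one way to keep crossover from always swapping leaves or always swapping near-whole trees.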
Design - Population
Initialization
  –Random trees
  –Size of action and expression trees different
Maturity: >=30 steps
Competition
  –Elitist (except immature individuals)

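Random initialization with separate size limits for actions and expressions might be sketched like this, using the node names from the Design - Individual slide. The slides do not give arities or semantics, so the child signatures below (for example, that SWAP takes two index expressions and LOOP takes two expressions and a body) are assumptions, as are the function names and the 0.3 early-stop probability.

```python
import random

# Assumed child signatures; the slides list only the node names.
ACTION_NONTERMINALS = {
    "LOOP": ("expr", "expr", "action"),
    "DUAL": ("action", "action"),
}
ACTION_TERMINALS = {
    "ORDER": ("expr", "expr"),
    "SWAP": ("expr", "expr"),
}
EXPR_NONTERMINALS = ["+", "-", "*", "/", "%", "LARGER", "SMALLER"]  # assumed binary


def random_expr(depth: int, rng: random.Random) -> tuple:
    """Grow a random expression tree of at most `depth` operator levels."""
    if depth <= 0 or rng.random() < 0.3:
        kind = rng.choice(["SIZE", "I", "CONST"])
        if kind == "I":
            return ("I", rng.randint(0, 3))        # I(0-3)
        if kind == "CONST":
            return ("CONST", rng.randint(0, 10))   # CONST(0-10)
        return ("SIZE",)
    op = rng.choice(EXPR_NONTERMINALS)
    return (op, random_expr(depth - 1, rng), random_expr(depth - 1, rng))


def random_action(action_depth: int, expr_depth: int, rng: random.Random) -> tuple:
    """Grow a random action tree with its own, separate depth limit."""
    if action_depth <= 0:
        table = ACTION_TERMINALS
    else:
        table = {**ACTION_NONTERMINALS, **ACTION_TERMINALS}
    name = rng.choice(sorted(table))
    children = []
    for kind in table[name]:
        if kind == "action":
            children.append(random_action(action_depth - 1, expr_depth, rng))
        else:
            children.append(random_expr(expr_depth, rng))
    return (name, *children)


# Example: a small population with deeper action trees than expression trees.
rng = random.Random(7)
population = [random_action(4, 2, rng) for _ in range(10)]
```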
Design - Other
No termination
Signal causes program state and summary to be dumped to file
State can be reloaded to continue run

Experiment
Bubble sort (in-order best)
Data distributions
  –Nearly sorted
  –Nearly reverse-sorted
  –Very random

Future Work
Different sorts
Node types to match sorts
Run-time compilation of trees to speed execution

Questions?