Evolving Hyper-Heuristics using Genetic Programming
Supervisor: Moshe Sipper
Achiya Elyasaf


2 Overview
 Introduction: Searching Games State-Graphs, Uninformed Search, Heuristics, Informed Search
 Evolving Heuristics
 Previous Work: Rush Hour, FreeCell

3 Representing Games as State-Graphs
 Every puzzle/game can be represented as a state graph:
In puzzles, board games, etc., every piece move leads to a different state
In computer war games, etc., the positions of the player and the enemy, together with all the parameters (health, shield, …), define a state

4 Rush Hour as a State-Graph

5 Searching Games State-Graphs: Uninformed Search
 BFS – exponential in the search depth
 DFS – linear in the length of the current search path. BUT: we might “never” track down the right path, since games usually contain cycles
 Iterative Deepening – a combination of BFS and DFS: each iteration performs a DFS with a depth limit, and the limit grows from one iteration to the next
 Worst case: traverse the entire graph
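The combination above can be sketched in a few lines. This is an illustrative implementation, not code from the work itself: a depth-limited DFS that avoids cycles along the current path, restarted with a growing limit so the first path found is also a shortest one.

```python
# Sketch of iterative deepening: DFS memory use, BFS-like optimality.
def depth_limited_dfs(state, goal, neighbors, limit, path=None):
    """Return a path to `goal` within `limit` moves, or None."""
    if path is None:
        path = [state]
    if state == goal:
        return path
    if limit == 0:
        return None
    for nxt in neighbors(state):
        if nxt in path:  # avoid cycles along the current search path
            continue
        found = depth_limited_dfs(nxt, goal, neighbors, limit - 1, path + [nxt])
        if found:
            return found
    return None

def iterative_deepening(start, goal, neighbors, max_depth=50):
    for limit in range(max_depth + 1):  # the limit grows each iteration
        found = depth_limited_dfs(start, goal, neighbors, limit)
        if found:
            return found
    return None
```

As the slide notes, in the worst case each iteration re-traverses the graph, but since the tree grows exponentially with depth, the repeated shallow work is cheap relative to the last iteration.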

6 Searching Games State-Graphs: Uninformed Search
 Most game domains are PSPACE-complete!
 Worst case: traverse the entire graph
 We need an informed search!

7 Searching Games State-Graphs: Heuristics
 h: states → ℝ. For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution
If h is perfect, an informed search that tries the best-scoring states first will walk straight to a solution
For hard problems, finding a good h is hard
A bad heuristic means the search might never track down the solution
 We need a good heuristic function to guide the informed search
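A classic concrete example of such an h (not from the slides, added for illustration) is the Manhattan-distance heuristic for the 8-puzzle: for each tile, sum the row and column distances from its goal position. It never overestimates the true number of moves, so it is admissible.

```python
# Hypothetical example: Manhattan-distance heuristic for the 8-puzzle.
# A state is a tuple of 9 tiles read row by row; 0 denotes the blank.
def manhattan(state, goal=(1, 2, 3, 4, 5, 6, 7, 8, 0)):
    """Sum over tiles of |row - goal_row| + |col - goal_col| (admissible)."""
    dist = 0
    for idx, tile in enumerate(state):
        if tile == 0:  # the blank tile does not count
            continue
        g = goal.index(tile)
        dist += abs(idx // 3 - g // 3) + abs(idx % 3 - g % 3)
    return dist
```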

8 Searching Games State-Graphs: Informed Search
 Best-First Search: like DFS, but expands the node with the best heuristic value first
Not necessarily optimal
Might enter cycles (local extrema)
 A*: holds closed and open lists, the open list sorted by heuristic value; the best of all open nodes is selected at each step
The maintenance and size of the open and closed lists can be prohibitive
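A minimal A* sketch (illustrative only, not the authors' code): the open list is a priority queue ordered by f = g + h, and the closed set prevents re-expansion. The memory the two lists consume is exactly the cost the slide points out.

```python
import heapq

# Minimal A* sketch: `h` estimates cost-to-go, `neighbors` yields
# (next_state, step_cost) pairs.
def a_star(start, goal, neighbors, h):
    open_heap = [(h(start), 0, start, [start])]  # (f, g, state, path)
    closed = set()
    while open_heap:
        f, g, state, path = heapq.heappop(open_heap)
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)
        for nxt, cost in neighbors(state):
            if nxt not in closed:
                heapq.heappush(open_heap,
                               (g + cost + h(nxt), g + cost, nxt, path + [nxt]))
    return None
```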

9 Searching Games State-Graphs: Informed Search (cont.)
 IDA*: Iterative Deepening with A*
The expanded nodes are pushed to the DFS stack by descending heuristic values
Let g(s) be the minimal depth of state s: only nodes with f(s) = g(s) + h(s) < depth-limit are visited
 Near-optimal solution (depends on the depth limit)
 The heuristic needs to be admissible
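The f-bounded DFS can be sketched as follows (an illustrative implementation, not the one used in the work): each iteration runs a DFS that cuts off any node whose f = g + h exceeds the current bound, and the next bound is the smallest f-value that overflowed.

```python
# IDA* sketch: DFS bounded by f = g + h; the bound grows to the smallest
# f-value that exceeded it in the previous iteration.
def ida_star(start, goal, neighbors, h):
    def search(path, g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f  # report the overflowing f-value upward
        if state == goal:
            return path
        minimum = float('inf')
        for nxt, cost in neighbors(state):
            if nxt in path:  # avoid cycles along the current path
                continue
            result = search(path + [nxt], g + cost, bound)
            if isinstance(result, list):
                return result
            minimum = min(minimum, result)
        return minimum

    bound = h(start)
    while True:
        result = search([start], 0, bound)
        if isinstance(result, list):
            return result
        if result == float('inf'):
            return None  # the whole graph was exhausted
        bound = result
```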

10 Iterative Deepening

11 Best-First Search

12 A*

13 Overview
 Introduction: Searching Games State-Graphs, Uninformed Search, Heuristics, Informed Search
 Evolving Heuristics
 Previous Work: Rush Hour, FreeCell

14 Evolving Heuristics
 Given building blocks H1, …, Hn (not necessarily admissible or on the same scale), how should we choose the fittest heuristic? Minimum? Maximum? A linear combination?
 GA/GP may be used for:
Building new heuristics from existing building blocks
Finding weights for each heuristic (for applying a linear combination)
Finding conditions for applying each heuristic: H should probably fit the stage of the search, e.g., “goal” heuristics when we assume we are close
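The three combination options above can be made concrete with a small sketch (illustrative, not from the work): each combinator takes the building-block heuristics and returns a single heuristic function.

```python
# Three ways to combine building-block heuristics H1..Hn into one guide.
def combine_max(heuristics):
    return lambda s: max(h(s) for h in heuristics)

def combine_min(heuristics):
    return lambda s: min(h(s) for h in heuristics)

def combine_linear(heuristics, weights):
    return lambda s: sum(w * h(s) for h, w in zip(heuristics, weights))
```

The interesting question, which evolution answers, is which combinator (and which weights or conditions) performs best on a given domain.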

15 Evolving Heuristics: GA
W1 = 0.3 | W2 = 0.01 | W3 = 0.2 | … | Wn = 0.1
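A GA individual here is simply such a weight vector. The sketch below is a minimal, assumed setup (population size, mutation strength, and selection scheme are illustrative choices, not the authors' parameters): Gaussian mutation, one-point crossover, and truncation selection over an externally supplied fitness function.

```python
import random

# Illustrative GA over weight vectors W1..Wn (parameters are assumptions).
def mutate(weights, sigma=0.05):
    """Gaussian mutation, clipped to [0, 1]."""
    return [min(1.0, max(0.0, w + random.gauss(0, sigma))) for w in weights]

def crossover(a, b):
    """One-point crossover of two weight vectors."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def evolve(fitness, n_weights=5, pop_size=20, generations=30):
    pop = [[random.random() for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)  # higher fitness is better
        survivors = pop[:pop_size // 2]      # truncation selection
        children = [mutate(crossover(random.choice(survivors),
                                     random.choice(survivors)))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=fitness)
```

In the actual work, fitness would be measured by running a heuristic search guided by the weighted combination on a set of training boards.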

16 Evolving Heuristics: GP
[GP-tree figure: an If node whose condition combines ≤ and ≥ tests on heuristic values (H1, H2) against constants (0.4, 0.7), and whose branches multiply and divide heuristic values (H1, H2, H5) by constants (0.1, 0.3)]
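Such a GP tree can be represented and evaluated with a small interpreter. This is an illustrative sketch: the node encoding (nested tuples) and the example tree are assumptions loosely mirroring the figure, not the actual genome used in the work.

```python
# Evaluate a GP tree over current heuristic values `hvals` (a dict
# mapping terminal names like 'H1' to numbers).
def eval_tree(node, hvals):
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):  # a heuristic terminal, e.g. 'H1'
        return hvals[node]
    op, *args = node
    if op == 'if':
        cond, then_branch, else_branch = args
        return eval_tree(then_branch if eval_tree(cond, hvals)
                         else else_branch, hvals)
    a, b = (eval_tree(x, hvals) for x in args)
    return {'and': lambda: a and b, 'or': lambda: a or b,
            '<=': lambda: a <= b, '>=': lambda: a >= b,
            '+': lambda: a + b, '*': lambda: a * b}[op]()

# A hypothetical tree in the spirit of the figure:
# if (H1 <= 0.4 and H2 >= 0.7) then 0.3*H2 else H1*0.1
tree = ('if', ('and', ('<=', 'H1', 0.4), ('>=', 'H2', 0.7)),
        ('*', 0.3, 'H2'), ('*', 'H1', 0.1))
```

Crossover and mutation then operate on subtrees, so evolution can discover both the conditions and the arithmetic combinations.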

17 Evolving Heuristics: Policies

Condition   | Result
Condition 1 | Heuristics Weights 1
Condition 2 | Heuristics Weights 2
…           | …
Condition n | Heuristics Weights n
Default     | Heuristics Weights
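Applying such a policy is straightforward: scan the rows in order, use the weight vector of the first condition that holds, and fall back to the default row. A minimal sketch (the data layout is an assumption):

```python
# policy: ordered list of (condition, weights) rows, as in the table above.
def apply_policy(policy, default_weights, state, heuristics):
    """Return the h-value of `state` under the first matching rule."""
    for condition, weights in policy:
        if condition(state):
            break
    else:  # no condition matched: use the default row
        weights = default_weights
    return sum(w * h(state) for w, h in zip(weights, heuristics))
```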

18 Evolving Heuristics: Fitness Function

19 Overview
 Introduction: Searching Games State-Graphs, Uninformed Search, Heuristics, Informed Search
 Evolving Heuristics
 Previous Work: Rush Hour, FreeCell

20 Rush Hour
GP-Rush [Hauptman et al., 2009]
Bronze Humies award

21 Domain-Specific Heuristics
 Hand-crafted heuristics/guides:
Blockers estimation – a lower bound (admissible)
Goal distance – Manhattan distance
Hybrid blockers distance – a combination of the two above
Is Move To Secluded – did the car enter a secluded area?
Is Releasing Move

22 Blockers Estimation
 A lower bound on the number of steps to the goal
 Computed by counting the moves needed to free the blocking cars
Example: O is blocking RED; we need at least: free A; move A; move B; move C; move O. h = 4

23 Goal Distance
 Deduce the goal
 Use the “Manhattan Distance” from the goal as the h measure (16 in the pictured board)

24 Hybrid
 “Manhattan Distance” + Blockers Estimation (here: 16 + 8 = 24)

25 Is Move To Secluded
 Moving C and A into the secluded areas (CL2, AR1) is always a good move!

26 Policy “Ingredients”
 Functions & Terminals:

          | Conditions                                  | Results
Terminals | IsMoveToSecluded, isReleasingMove, g, PhaseByDistance, PhaseByBlockers, NumberOfSyblings, DifficultyLevel, BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, …, 0.9, 1 | BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, …, 0.9, 1
Sets      | If, AND, OR, ≤, ≥                           | +, *

27 Coevolving (Hard) 8x8 Boards
[figure: an 8x8 board with the RED car and cars H, F, G, M, P, I, S, K]

28 Results
 Average reduction of nodes required to solve test problems, with respect to the number of nodes scanned by a blind search:

Problem | ID   | H1  | H2  | H3  | Hc  | Policy
6x6     | 100% | 28% | 6%  | -2% | 30% | 60%
8x8     | 100% | 31% | 25% | 30% | 50% | 90%

29 Results (cont’d)
 Time (in seconds) required to solve problems JAM01 … JAM40:

30 FreeCell
 FreeCell remained relatively obscure until it was included in Windows 95
 Of the 32,000 deals known as the Microsoft 32K, all are solvable except game #11982, which has been proven to be unsolvable
Evolving hyper-heuristic-based solvers for Rush Hour and FreeCell [Hauptman et al., SOCS 2010]
GA-FreeCell: Evolving Solvers for the Game of FreeCell [Elyasaf et al., GECCO 2011]

31 FreeCell (cont’d)
 As opposed to Rush Hour, blind search fails miserably
 The best published solver to date solves 96% of the Microsoft 32K
 Reasons: a high branching factor, and it is hard to craft a good heuristic

32 Learning Methods: Random Deals
 Which deals should we use for training?
 First method tested: random deals
This is what we did for Rush Hour
Here it yielded poor results, since this is a very hard domain

33 Learning Methods: Gradual Difficulty
 Second method tested: gradual difficulty
Sort the problems by difficulty
Each generation, test solvers against 5 deals from the current difficulty level plus 1 random deal

34 Learning Methods: Hillis-Style Coevolution
 Third method tested: Hillis-style coevolution using a “Hall of Fame”:
The deal population is composed of 40 deals (= 40 individuals) plus 10 hall-of-fame deals
Each hyper-heuristic is tested against 4 deal individuals and 2 hall-of-fame deals
 The evolved hyper-heuristics failed to solve almost all of the Microsoft 32K! Why?

35 Learning Methods: Rosin-Style Coevolution
 Fourth method tested: Rosin-style coevolution
Each deal individual consists of 6 deals
[figure: mutation and crossover operating on deal individuals p1 and p2]
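The genetic operators on such deal individuals can be sketched as follows. This is a hedged illustration: the exact operators in the work are not specified here, so one-point crossover and single-deal replacement mutation (drawing from the 32,000-deal pool) are assumptions.

```python
import random

# Illustrative operators on deal individuals (lists of 6 deal numbers).
def crossover_deals(p1, p2):
    """One-point crossover: swap the tails of two parents."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate_deals(individual, deal_pool_size=32000):
    """Replace one deal with a fresh random deal from the pool."""
    child = list(individual)
    child[random.randrange(len(child))] = random.randint(1, deal_pool_size)
    return child
```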

36 Results

Learning Method         | Run    | Node Reduction | Time Reduction | Length Reduction | Solved
-                       | HSD    | 100%           |                |                  | 96%
Gradual Difficulty      | GA-1   | 23%            | 31%            | 1%               | 71%
Gradual Difficulty      | GA-2   | 27%            | 30%            | -3%              | 70%
Gradual Difficulty      | GP     | -              | -              | -                | -
Gradual Difficulty      | Policy | 28%            | 36%            | 6%               | 36%
Rosin-style coevolution | GA     | 87%            | 93%            | 41%              | 98%
Rosin-style coevolution | Policy | 89%            | 90%            | 40%              | 99%

37 Additional Proposed Research: Search Aspects of Our Method
 In the literature:
Generally, in non-optimal search the objective is to find solutions that are as short as possible
Admissible heuristics and memory-based heuristics are commonly used
Advisors are hardly used, so critical domain knowledge is lost

38 Additional Proposed Research: Search Aspects of Our Method (cont’d)
 Our objective is to reduce search resources
 We use a wide range of domain knowledge, including Boolean advisors and highly underestimating advisors
 We believe that our method can outperform previous ones
 We wish to introduce our acquired knowledge to the search community (in addition to the EA community)

39 Additional Proposed Research: Search Aspects of Our Method (cont’d)
 To achieve this goal we plan multiple sets of experiments comparing several search algorithms with different types of heuristics:
Admissible and non-admissible heuristics
Underestimating advisors
Overestimating advisors
Boolean advisors

40 Additional Proposed Research: Algorithmic Framework
 We aim to create an algorithmic framework based on evolutionary algorithms
 Using this framework, one could create solvers for different models and domains
 For each model or domain, an automatic strategy will be evolved, allowing efficient solutions

41 Additional Proposed Research: Domain-Independent Planner
 An immediate extension would be altering the method to evolve a planner that solves problems from different domains, without knowing the domains a priori
 Algorithms for generating and maintaining agendas, policies, interfering sub-goals, relaxed problems, and other methodologies are readily available, provided we encode problems (e.g., Rush Hour, FreeCell) as planning domains
 However, using evolution in conjunction with these techniques is non-trivial

42 Additional Proposed Research: Domain-Independent Planner (cont’d)
 To achieve this goal we need to define domain-independent heuristics to be used as building blocks for the evolutionary process
 There are several state-of-the-art heuristics for domain-independent planners, such as FF and HSP
 Problem: using these heuristics as building blocks would take an unreasonable amount of time, since all of them must be calculated
 Possible solution: find easy-to-calculate heuristics and advisors for independent domains
 A possible direction might be using the taxonomic syntax database described by Yoon et al. [Learning control knowledge for forward search planning]

43 Learning control knowledge for forward search planning (Yoon et al.)

44 Additional Proposed Research: Probabilistic Models: MDP & POMDP
 Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) are used to describe more general problems:
In an MDP we don’t know how our action will end up (what the resulting state will be)
In addition, the agent in a POMDP world doesn’t even know for sure where it is
 We believe that our model can be altered to support MDPs and POMDPs as well

45 Thank You for Listening
Any questions?