CSCI 4310 Lecture 3: Environments and Algorithms
Task Environments: Fully Observable vs. Partially Observable
- Fully observable: sensors detect all aspects of the environment relevant to choosing an action.
- Partially observable: some aspects of the environment are unknown to the agent.
Task Environments: Deterministic vs. Stochastic
- Deterministic: the next state of the environment is completely determined by the current state and the agent's action.
- Stochastic (non-deterministic): identical world states and agent actions may lead to different next states each time.
- Strategic: deterministic except for the actions of other agents.
Task Environments: Episodic vs. Sequential
- Episodic: the agent perceives and acts in distinct episodes; the next episode does not depend on previous ones. Example: a part-picking robot.
- Sequential: the current decision could affect all future decisions. More difficult, since the agent must look ahead.
Task Environments: Dynamic vs. Static
- Dynamic: the environment can change while the algorithm is deciding on a course of action; a changing environment can lead to algorithm inaction.
- Static: the environment does not change while the agent deliberates.
Task Environments: Discrete vs. Continuous
- Discrete: a finite number of states. Example: the chess environment.
- Continuous: a range of continuous values. Example: driving sensors.
- The distinction can be applied to the environment state, the sensors, and the actions.
- See the article on these categories in relation to games.
Task Environments: Single-Agent vs. Multi-Agent
- Single-agent: Example: a Roomba.
- Multi-agent, competitive: zero-sum games such as chess.
- Multi-agent, cooperative: co-op game play, where working together maximizes individual performance measures.
Task Environments
If you have a partially observable, stochastic, sequential, dynamic, continuous, multi-agent task: just surrender. However, this is the real world.
Broad Algorithmic Categories
Alternatives to Optimization
- Heuristics: finding "good" (not necessarily optimal) methods to apply to problems.
- Algorithm design: humans are remarkably adept at some of these problems, such as the traveling salesman problem (TSP); see the sketch below.
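Humans can eyeball decent TSP tours; the simplest machine heuristic in the same spirit is nearest-neighbor: always hop to the closest unvisited city. A minimal sketch (the 4-city distance matrix is made up for illustration):

```python
def nearest_neighbor_tour(dist, start=0):
    """Nearest-neighbor TSP heuristic. dist[i][j] = distance between
    cities i and j (symmetric). Returns a 'good', not optimal, tour."""
    n = len(dist)
    unvisited = set(range(n)) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        # Greedily hop to the closest unvisited city.
        nxt = min(unvisited, key=lambda c: dist[last][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Toy 4-city instance (distances are hypothetical).
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
print(nearest_neighbor_tour(dist))  # [0, 1, 3, 2]
```

The tour it returns is usually good but rarely optimal, which is exactly the trade heuristics make.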
Greedy Algorithms
- Used for optimization problems.
- Make the most promising decision at any given time; never reconsider or reverse a decision.
- Does this always yield the optimal solution? No, as the sketch below shows.
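A standard counterexample is coin change with a non-canonical coin system. In this minimal sketch the coin set {1, 3, 4} is chosen just to break greedy: for amount 6, greedy takes 4 + 1 + 1 (three coins), while the optimum is 3 + 3 (two coins).

```python
def greedy_change(coins, amount):
    """Always take the largest coin that fits; never reconsider."""
    picked = []
    for c in sorted(coins, reverse=True):
        while amount >= c:
            picked.append(c)
            amount -= c
    return picked

print(greedy_change([1, 3, 4], 6))  # [4, 1, 1] -- optimal is [3, 3]
```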
Greedy Algorithms: What problems does this work well on?
- Finding directions in a GPS mapping service, as sketched below.
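Route finding is one case where greedy succeeds: Dijkstra's algorithm greedily settles the closest unsettled node, and with non-negative edge weights that choice is never wrong. A minimal sketch, with a hypothetical four-node road graph:

```python
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, weight), ...]} with non-negative weights.
    Returns the shortest distance from source to every reachable node."""
    dist = {source: 0}
    pq = [(0, source)]  # priority queue of (distance-so-far, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale entry; node already settled at a shorter distance
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Hypothetical road graph.
roads = {"A": [("B", 4), ("C", 1)],
         "B": [("D", 1)],
         "C": [("B", 2), ("D", 5)],
         "D": []}
print(dijkstra(roads, "A"))  # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```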
Divide and Conquer Algorithms
- Top down: start with the entire problem.
- Decompose the problem into a number of smaller instances of the same problem.
- Merge the solutions to obtain the solution to the original instance.
- Have a base case that stops the recursion.
Divide and Conquer Algorithms
Examples: recursion in general; merge sort, sketched below.
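A minimal merge sort sketch, labeling the base case, the divide step, and the merge step from the previous slide:

```python
def merge_sort(a):
    if len(a) <= 1:             # base case: already sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])  # divide: two smaller instances
    right = merge_sort(a[mid:])
    # merge: combine the two sorted halves
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```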
Dynamic Programming Algorithms
- Bottom up: start by obtaining solutions to the smallest sub-instances.
- Combine these solutions to get solutions to larger instances.
- Benefit: use tables to store the results calculated so far.
Dynamic Programming Algorithms
Avoids the duplicated work that can hurt the performance of divide-and-conquer strategies, which often re-compute the same sub-solution many times. The Fibonacci sketch below makes the contrast concrete.
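The divide-and-conquer version of Fibonacci recomputes the same subproblems exponentially often; the dynamic-programming version fills a table from the smallest cases up, computing each value once. A minimal sketch:

```python
def fib_recursive(n):
    """Divide and conquer: recomputes each fib(k) many times; exponential."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_table(n):
    """Dynamic programming: fill a table from the smallest cases up; O(n)."""
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]

print(fib_table(40))  # 102334155, instantly
# fib_recursive(40) returns the same answer, after hundreds of millions of calls
```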
Backtracking Algorithms
- "Just start solving and hope for the best."
- Usually involves a depth-first search.
- Analogous to how humans find their way around a research park: just start walking and hope to find the lab; drop a crumb at each turn; if you dead-end, return to the most recently dropped crumb and try a different direction.
Backtracking Algorithms
Example: the Eight Queens problem, sketched below.
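A minimal backtracking sketch for the queens puzzle: place one queen per row (a depth-first search over columns), and when a row has no safe column, return to the previous row's "crumb" and try its next column.

```python
def solve_queens(n, cols=()):
    """cols[r] = column of the queen in row r; returns one solution or None."""
    row = len(cols)
    if row == n:                  # all rows filled: solved
        return cols
    for c in range(n):
        # Safe if no placed queen shares this column or a diagonal.
        if all(c != pc and abs(c - pc) != row - pr
               for pr, pc in enumerate(cols)):
            result = solve_queens(n, cols + (c,))
            if result is not None:
                return result     # a full solution was found below
        # Otherwise this column dead-ends; try the next one.
    return None                   # no safe column in this row: backtrack

print(solve_queens(8))  # (0, 4, 7, 5, 2, 6, 1, 3)
```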
Branch-and-Bound Algorithms
- Similar to backtracking.
- Do not keep exploring a path that you already know is worse than the best answer found so far (pruning).
- Computing the bounds adds overhead, which can slow things down.
Branch-and-Bound Algorithms
Example: many game-tree searches. Estimate the upper and lower bounds of all choices in a given tree branch. If maximizing the function value and branch A's upper bound is below branch B's lower bound (A_upper < B_lower), prune branch A.