Department of Computer Science Ben-Gurion University


1 Decision Process with Non-Markovian Reward
Benjamin Berend & Amihay Elboher. Supervisor: Prof. Ronen Brafman. Department of Computer Science, Ben-Gurion University

2 Graphical infrastructure for running, planning and learning algorithms
Goal: clean the stains and collect the fruits into the basket.

3 RL – reinforcement learning
Reinforcement learning: optimizing behavior by learning from rewards and penalties. The slide's diagram splits RL into Known Environment and Unknown Environment, and lists the algorithms Policy Iteration, Q-Learning, R-Max and Automata Learning.

4 Known Environment MDP – Markov Decision Process
In a known environment, the MDP is fully known to the agent: its states, actions, transitions and rewards.
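
When the model is known, a planning method such as policy iteration can compute a policy without any trial and error. The following is a minimal sketch on a hypothetical three-state chain; the states, actions, rewards and discount factor are illustrative assumptions, not taken from the presentation.

```python
# Hypothetical 3-state chain MDP; every name and number here is illustrative.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {"stay": [(1.0, 2, 0.0)], "go": [(1.0, 2, 0.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(s, a, V):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def policy_iteration(tol=1e-9):
    policy = {s: "stay" for s in P}  # arbitrary starting policy
    V = {s: 0.0 for s in P}
    while True:
        # Policy evaluation: sweep until the values stop changing.
        while True:
            delta = 0.0
            for s in P:
                v = q_value(s, policy[s], V)
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: switch to a strictly better action if one exists.
        stable = True
        for s in P:
            best = max(P[s], key=lambda a: q_value(s, a, V))
            if q_value(s, best, V) > q_value(s, policy[s], V) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


policy, V = policy_iteration()
```

In this toy chain the resulting policy moves toward state 2, where the only reward is collected.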

5 Q-Learning
Learning is experience based: the agent starts from an arbitrary policy and adjusts its behavior according to the rewards it receives.
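
As a rough illustration of this experience-based update, here is a minimal tabular Q-learning sketch. The chain environment, the `step` function and all hyperparameters (`alpha`, `gamma`, `epsilon`, the episode count) are assumptions made for the example and do not come from the presentation.

```python
import random

ACTIONS = ("stay", "go")
TERMINAL = 2


def step(state, action):
    """Hypothetical deterministic chain 0 -> 1 -> 2; entering state 2 pays 1."""
    if action == "go":
        nxt = state + 1
        return nxt, (1.0 if nxt == TERMINAL else 0.0)
    return state, 0.0


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # One Q-value per (non-terminal state, action) pair, all starting at zero.
    Q = {s: {a: 0.0 for a in ACTIONS} for s in range(TERMINAL)}
    for _ in range(episodes):
        s = 0
        for _ in range(20):  # cap the episode length
            # Epsilon-greedy: explore sometimes, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[s][act])
            s2, r = step(s, a)
            # A terminal state has no future value to bootstrap from.
            future = 0.0 if s2 == TERMINAL else max(Q[s2].values())
            Q[s][a] += alpha * (r + gamma * future - Q[s][a])
            s = s2
            if s == TERMINAL:
                break
    return Q


Q = q_learning()
greedy = {s: max(ACTIONS, key=lambda a: Q[s][a]) for s in Q}
```

The agent never reads the transition model directly; it only sees the `(next_state, reward)` pairs returned by `step`, which is what distinguishes this setting from planning in a known environment.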

6 Automata Learning
In order to learn a non-Markovian reward, we construct an automaton that accepts all “words” that lead to the reward. The algorithm finds the “most logical” automaton and combines it with the state. Sample: 1, 10, 100
[Figure: automaton diagram for the sample words 1, 10, 100]
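
To make the idea of combining an automaton with the state concrete, the sketch below hand-writes one automaton consistent with the sample words 1, 10, 100 and pairs its state with an environment state. It does not implement the learning algorithm itself. The generalization "a single 1 followed by any number of 0s", the state names, and the `product_step` and `env_step` helpers are assumptions for illustration only.

```python
# Hypothetical DFA over the alphabet {"0", "1"}, consistent with the sample
# words 1, 10, 100: it accepts a single "1" followed by any number of "0"s.
# This generalization is an assumption, not the automaton from the slides.
START, ACCEPT, DEAD = "q0", "q1", "dead"
DELTA = {
    (START, "1"): ACCEPT,
    (START, "0"): DEAD,
    (ACCEPT, "0"): ACCEPT,
    (ACCEPT, "1"): DEAD,
    (DEAD, "0"): DEAD,
    (DEAD, "1"): DEAD,
}


def accepts(word):
    """Run the DFA over the word and report whether it ends in ACCEPT."""
    q = START
    for symbol in word:
        q = DELTA[(q, symbol)]
    return q == ACCEPT


def product_step(env_state, dfa_state, symbol, env_step):
    """Advance the environment and the automaton together.

    The reward depends only on the new automaton state, so it is Markovian
    with respect to the pair (env_state, dfa_state).
    """
    next_env = env_step(env_state, symbol)
    next_dfa = DELTA[(dfa_state, symbol)]
    reward = 1.0 if next_dfa == ACCEPT else 0.0
    return (next_env, next_dfa), reward


# Usage with a placeholder environment that just counts steps.
pair, reward = product_step(0, START, "1", lambda s, a: s + 1)
```

Because the automaton state summarizes the relevant history, an agent acting on the pair can use ordinary MDP methods even though the reward is non-Markovian in the environment state alone.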

