Model-Free Episodic Control

Name Ze Liu Data

CONTENTS
01 INTRODUCTION
02 INSPIRATION AND WHAT TO SOLVE
03 ALGORITHMS
04 EXPERIMENTAL RESULTS
05 REFERENTIAL VALUE

Model-Based or Model-Free Episodic Control INTRODUCTION PART 01

PART ONE INTRODUCTION
MDP
S: a set of states
A: a set of actions
$P^{s'}_{s,a}$: the probability that action a in state s will lead to state s'
$R_{s,a}$: the immediate reward received after transitioning from state s to state s', due to action a
$\gamma$: the discount factor
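The MDP components listed on this slide can be sketched as a small data structure. This is only an illustration: the class name `MDP`, the field names, the toy two-state example, and the helper `discounted_return` are assumptions made here and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = str
Action = str


@dataclass
class MDP:
    """Hypothetical container for the tuple (S, A, P, R, gamma)."""
    states: List[State]
    actions: List[Action]
    # transition[(s, a)] maps each next state s' to its probability
    transition: Dict[Tuple[State, Action], Dict[State, float]]
    # reward[(s, a)] is the immediate reward for taking action a in state s
    reward: Dict[Tuple[State, Action], float]
    gamma: float  # discount factor


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Return sum over t of gamma**t * rewards[t] for one episode."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Toy example (assumed values): two states, one action.
toy = MDP(
    states=["s0", "s1"],
    actions=["go"],
    transition={("s0", "go"): {"s0": 0.2, "s1": 0.8},
                ("s1", "go"): {"s1": 1.0}},
    reward={("s0", "go"): 0.0, ("s1", "go"): 1.0},
    gamma=0.5,
)
```

The discount factor weights a reward received t steps in the future by gamma to the power t, which is what `discounted_return` computes by folding the reward list from the end.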

PART ONE INTRODUCTION Model-Free Model-based

PART ONE INTRODUCTION Episodic Control

INSPIRATION AND WHAT TO SOLVE PART 02

PART TWO INSPIRATION AND WHAT TO SOLVE
A. Traditional RL is data inefficient and too slow to train
Traditional RL algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery.
B. Research on the brain
Studies find that the hippocampal system may be used to guide sequential decision-making by co-representing environment states with the returns achieved from the various possible actions.

ALGORITHMS PART 03

PART THREE ALGORITHMS

PART THREE ALGORITHMS Writing Look-up
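The slide names only the two table operations, writing and look-up. A minimal sketch follows, assuming the rule described in the published Model-Free Episodic Control paper: on writing, keep the highest return ever observed for a state-action pair; on look-up, return the stored value if the pair exists, otherwise average the values of the k nearest stored states for that action. The class name `EpisodicTable`, the method names, the Euclidean distance, and the default of 0.0 for an action with no entries are assumptions made here. The size limit and eviction of old entries are left out.

```python
import math


class EpisodicTable:
    """Hypothetical per-action table of the best return seen from each state."""

    def __init__(self, k=2):
        self.k = k
        self.values = {}  # action -> {state tuple: best return seen so far}

    def write(self, state, action, episode_return):
        """Store the return, keeping the maximum if the pair already exists."""
        table = self.values.setdefault(action, {})
        key = tuple(state)
        table[key] = max(table.get(key, -math.inf), episode_return)

    def lookup(self, state, action):
        """Exact match if present, else the mean over the k nearest states."""
        table = self.values.get(action, {})
        key = tuple(state)
        if key in table:
            return table[key]
        if not table:
            return 0.0  # assumption: value for an action never tried
        nearest = sorted(table, key=lambda s: math.dist(s, key))[: self.k]
        return sum(table[s] for s in nearest) / len(nearest)


table = EpisodicTable(k=2)
table.write((0.0, 0.0), "left", 1.0)
table.write((0.0, 0.0), "left", 0.5)   # lower return, so 1.0 is kept
table.write((1.0, 0.0), "left", 3.0)
table.write((10.0, 0.0), "left", 100.0)
estimate = table.lookup((0.4, 0.0), "left")  # averages the two nearest states
```

Acting then amounts to calling `lookup` for every action in the current state and picking the action with the largest estimate.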


PART THREE ALGORITHMS RP
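"RP" here presumably stands for random projection, a way to map a high-dimensional observation to a short vector that can serve as a table key. A minimal sketch, assuming a fixed matrix with entries drawn from a standard Gaussian; the function names, the seed, and the dimensions are illustrative choices and not taken from the slides.

```python
import random


def make_projection(in_dim, out_dim, seed=0):
    """Build a fixed out_dim x in_dim matrix with standard Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, x):
    """Compute the matrix-vector product, one output value per matrix row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


# Assumed sizes: a 16-dimensional observation reduced to 4 dimensions.
matrix = make_projection(in_dim=16, out_dim=4, seed=0)
observation = [float(i) for i in range(16)]
embedding = project(matrix, observation)
```

The matrix is drawn once and then reused for every observation, so identical observations always map to the same embedding and nearby observations tend to stay nearby.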

PART THREE ALGORITHMS VAE

EXPERIMENTAL RESULTS PART 04

PART FOUR EXPERIMENTAL RESULTS

REFERENTIAL VALUE PART 05

THANK YOU FOR WATCHING
