Using Adalines to Approximate Q-functions in Reinforcement Learning

1 Using Adalines to Approximate Q-functions in Reinforcement Learning
Steven Wyckoff December 6, 2006

2 The Problem
Timing traffic lights for optimal traffic flow is hard
It would be very useful if the traffic lights could learn the best timing themselves

3 Green Light District
“Intelligent Traffic Light Control” (Wiering, van Veenen, Vreeken, Koopman)
Built a test-bed for traffic light controller algorithms
Based on Reinforcement Learning

4 Green Light District
TLController fills out a table with the ‘gains’ for each lane
SimModel picks the best legal light configuration
Cars are allowed to move (or not) and the TLController gets to listen in on their movement
Repeat
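A minimal Python sketch of that loop's decision step, using simplified stand-in data structures (the actual Green Light District test-bed is written in Java, and these names are illustrative, not its real API):

    # Illustrative sketch: gains are filled in per lane, then the legal light
    # configuration with the highest total gain is chosen.

    def pick_configuration(gains, legal_configs):
        """Return the legal configuration whose green lanes have the largest total gain."""
        return max(legal_configs, key=lambda cfg: sum(gains[lane] for lane in cfg))

    # Example: per-lane gains as the TLController might report them,
    # and two mutually exclusive legal configurations.
    gains = {"north": 2.5, "south": 1.0, "east": 3.0, "west": 1.5}
    legal_configs = [("north", "south"), ("east", "west")]
    print(pick_configuration(gains, legal_configs))  # ('east', 'west')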

5 Existing Algorithms
Random: totally random gains
Most Cars: based on presence of at least one car
TC-1: Real-Time Dynamic Programming, based on probabilities of progress / reward
GenNeural: genetically evolve a 3-layer network, uses only traffic densities
(And more)

6 My Algorithm
Use a neural network instead of dynamic programming
Good:
  Network can deal with continuous input
  Might be able to recognize traffic patterns that are not available using a table lookup
Bad:
  Hard to tell what the network will learn
  Hard to figure out useful input
  Hard to tell what the ‘right’ output is for training
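For reference, an adaline is a single linear unit trained with the Widrow-Hoff (LMS) delta rule, which is why it handles continuous input directly. A minimal Python sketch (not the project's actual code):

    import numpy as np

    class Adaline:
        """A single linear unit trained with the Widrow-Hoff (LMS) delta rule."""

        def __init__(self, n_inputs, learning_rate=0.01):
            self.w = np.zeros(n_inputs)
            self.b = 0.0
            self.lr = learning_rate

        def predict(self, x):
            # Linear output, no squashing: continuous inputs and targets work directly.
            return float(np.dot(self.w, x) + self.b)

        def train(self, x, target):
            # Delta rule: adjust weights in proportion to the prediction error.
            error = target - self.predict(x)
            self.w += self.lr * error * np.asarray(x, dtype=float)
            self.b += self.lr * error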

7 Pitfalls / Solutions
Don’t know if the light will be red or green:
  Two adalines predict the reward if the light is red or green; the gain is the difference
Input (for each lane):
  Number of cars, traffic density, whether the lane is full
Rewards:
  Reward for cars moving and passing through intersections
  Shared reward for the other lanes in the intersection
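Putting slide 7 together, a hedged sketch of how the per-lane gain and the learning update might look, reusing the Adaline class sketched above (the feature names and reward bookkeeping are illustrative assumptions, not the project's exact code):

    # Hypothetical per-lane feature vector: number of cars, traffic density, lane-full flag.
    def lane_features(num_cars, density, is_full):
        return [float(num_cars), float(density), 1.0 if is_full else 0.0]

    # One adaline predicts the reward if the lane's light is green, the other if it is red.
    green_net = Adaline(n_inputs=3)
    red_net = Adaline(n_inputs=3)

    def gain(features):
        # The gain reported to the gain table is the predicted advantage of being green.
        return green_net.predict(features) - red_net.predict(features)

    def learn(features, was_green, observed_reward):
        # Train only the adaline matching the light colour actually shown, using the
        # reward observed from car movement (plus any shared reward) as the target.
        (green_net if was_green else red_net).train(features, observed_reward)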

8 Results: “Split”
“Adaline” did slightly better than “Most Cars”
“TC-1” did the best

9 Results: “Complex”
“Adaline” did the worst
“TC-1” did the best

10 What I Wish Was Different
Infrastructure:
  Inputs and rewards are all discrete
  Seems like the network would do better with access to the light configurations
Rewards:
  It would be nice to give rewards for no waiting
Network:
  Arguably a multi-layer network could perform better

11 Demo Time

