Multi Armed Bandits chalpert@meetup.com.

Name: Multi Armed Bandits chalpert@meetup.com.
Uploaded: 2017-10-04T00:09:32+00:00
Duration: PTM13S36
Channel: Ashly Corbell
Description: Multi Armed Bandits chalpert@meetup.com.

Multi Armed Bandits

Survey

Click Here

Click-through Rate (Clicks / Impressions)
Click Here: 20%

Click Here (A) vs. Click Here (B)
Click-through Rate: 20% vs. ?

AB Test: Randomized Controlled Experiment
Show each button to 50% of users
Click Here (A) vs. Click Here (B)
Click-through Rate: 20% vs. ?

AB Test Timeline
Time: Before Test, then AB Test, then After Test (show winner)
Exploration Phase (Testing): AB Test
Exploitation Phase (Show Winner): After Test

Click Here (A) vs. Click Here (B)
Click-through Rate: 20% vs. 30%

10,000 impressions/month
Need 4,000 clicks by EOM
30% CTR won’t be enough
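The arithmetic behind this slide can be checked in a few lines. This is only a sketch; the variable names are illustrative and not from the talk.

```python
# Figures taken from the slide; names are illustrative.
impressions_per_month = 10_000
clicks_needed = 4_000
best_known_ctr = 0.30

# Clicks we can expect if every impression shows the 30% button.
expected_clicks = impressions_per_month * best_known_ctr

# Click-through rate that would be needed to reach the target.
required_ctr = clicks_needed / impressions_per_month

print(f"Expected clicks at 30% CTR: {expected_clicks:.0f}")
print(f"CTR required to hit the target: {required_ctr:.0%}")
```

Even the better button yields about 3,000 clicks, short of the 4,000 target, which would need a 40% CTR. That gap is why testing has to continue.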

Need to keep testing (Exploration)

ABCDEFG... Test
Each variant would be assigned with probability 1/N
N = # of variants

Not everyone is a winner

Need to keep testing (Exploration)
Need to minimize regret (Exploitation)

Multi Armed Bandit
Balance of Exploitation & Exploration

Bandit Algorithm Balances Exploitation & Exploration
AB Test: Discrete Exploitation & Exploration Phases (Before Test, AB Test, After Test)
Multi Armed Bandit: Continuous Exploitation & Exploration (Before Test, Multi Armed Bandit)
Bandit Favors Winning Arm

Bandit Algorithm Reduces Risk of Testing
AB Test:
- Best arm exploited with probability 1/N
- More Arms: Less exploitation
Bandit:
- Best arm exploited with determined probability
- Reduced exposure to suboptimal arms

Demo
Borrowed from Probabilistic Programming & Bayesian Methods for Hackers

Split Test: Still sending losers
Bandit: Winner Breaks Away!
AB test would have cost 4.3 percentage points

How it works: Epsilon Greedy Algorithm
ε = Probability of Exploration
Start of round:
- Exploration (probability ε): show a random arm, each with probability 1/N (ε/N overall)
- Exploitation (probability 1 - ε): show best arm
Epsilon Greedy with ε = 1 = AB Test
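One round of epsilon greedy might be sketched as follows. The function name, its arguments, and the choice to treat arms with no impressions as having a rate of 0.0 are assumptions made for illustration; the talk does not specify them.

```python
import random

def epsilon_greedy_choice(clicks, impressions, epsilon, rng=random):
    """Pick an arm index: explore with probability epsilon, otherwise exploit."""
    n_arms = len(clicks)
    if rng.random() < epsilon:
        # Exploration: every arm is equally likely (1/N each).
        return rng.randrange(n_arms)
    # Exploitation: show the arm with the best observed click-through rate.
    # Arms with no impressions yet count as rate 0.0 (an assumption).
    rates = [c / n if n else 0.0 for c, n in zip(clicks, impressions)]
    return max(range(n_arms), key=rates.__getitem__)
```

With `epsilon=1.0` every round is a uniform random draw, which matches the slide's point that epsilon greedy with ε = 1 is an AB test. With `epsilon=0.0` the current leader is always shown.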

Epsilon Greedy Issues
Constant Epsilon:
- Initially under exploring
- Later over exploring
- Better if probability of exploration decreases with sample size (annealing)
No prior knowledge
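Annealing can be illustrated with a schedule that shrinks ε as pulls accumulate. The 1/sqrt(t + 1) form and the `floor` parameter below are one possible choice picked for illustration, not something the talk prescribes.

```python
import math

def annealed_epsilon(total_pulls, floor=0.01):
    """Exploration probability that decreases as the sample size grows."""
    # Starts at 1.0 (pure exploration) and decays with the square root of
    # the number of pulls; `floor` keeps a small amount of exploration forever.
    return max(floor, 1.0 / math.sqrt(total_pulls + 1))
```

The returned value can be passed as the exploration probability on each round, so early rounds explore heavily and later rounds mostly exploit.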

Some Alternatives
- Epsilon-First
- Epsilon-Decreasing
- Softmax
- UCB (UCB1, UCB2)
- Bayesian-UCB
- Thompson Sampling (Bayesian Bandits)
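As a point of comparison, here is a minimal sketch of arm selection under UCB1, one of the alternatives listed. It uses the standard index of observed mean reward plus sqrt(2 ln t / n). The function and argument names are illustrative.

```python
import math

def ucb1_choice(rewards, pulls):
    """Return the arm index chosen by UCB1.

    rewards[i] is the total reward (e.g. clicks) seen on arm i,
    pulls[i] is how many times arm i has been shown.
    """
    # Play every arm once before the confidence bound is defined.
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    total = sum(pulls)
    # Observed mean plus an exploration bonus that shrinks as an arm is pulled more.
    scores = [
        r / n + math.sqrt(2.0 * math.log(total) / n)
        for r, n in zip(rewards, pulls)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Unlike epsilon greedy, the exploration here is deterministic: rarely shown arms get a larger bonus and are revisited until their estimate is trustworthy.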

Bandit Algorithm Comparison
[Chart: Regret]

Thompson Sampling Setup:
Assign each arm a Beta distribution with parameters (α,β), which track (# Success, # Failures) on top of the prior
Click Here: Beta(α,β) | Click Here: Beta(α,β) | Click Here: Beta(α,β)

Thompson Sampling Setup:
Initialize priors with ignorant state of Beta(1,1) (Uniform distribution)
- Or initialize with an informed prior to aid convergence
Click Here: Beta(1,1) | Click Here: Beta(1,1) | Click Here: Beta(1,1)

Thompson Sampling
For each round:
1: Sample random variable X from each arm’s Beta Distribution
2: Select the arm with largest X
3: Observe the result of selected arm
4: Update prior Beta distribution for selected arm

Round 1: X = 0.7, 0.2, 0.4. First arm selected. Success!
Beta(1,1) Beta(1,1) Beta(1,1) becomes Beta(2,1) Beta(1,1) Beta(1,1)

Round 2: X = 0.4, 0.8, 0.2. Second arm selected. Failure!
Beta(2,1) Beta(1,1) Beta(1,1) becomes Beta(2,1) Beta(1,2) Beta(1,1)
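The four steps translate fairly directly into code. The sketch below uses Python's standard `random.betavariate`. The function name, the `pull_arm` callback, and the simulated click-through rates are hypothetical and exist only to make the example runnable.

```python
import random

def thompson_round(params, pull_arm, rng=random):
    """Run one round; params is a list of [alpha, beta] per arm, updated in place."""
    # 1: Sample X from each arm's Beta distribution.
    samples = [rng.betavariate(a, b) for a, b in params]
    # 2: Select the arm with the largest X.
    arm = max(range(len(params)), key=samples.__getitem__)
    # 3: Observe the result of the selected arm (True means success).
    success = pull_arm(arm)
    # 4: Update the Beta distribution of the selected arm only.
    if success:
        params[arm][0] += 1
    else:
        params[arm][1] += 1
    return arm, success

# Simulated usage: three arms with made-up click-through rates.
true_ctr = [0.20, 0.30, 0.25]  # hypothetical, used only by the simulation
rng = random.Random(42)
params = [[1, 1] for _ in true_ctr]  # Beta(1,1) priors, as in the setup
for _ in range(5000):
    thompson_round(params, lambda arm: rng.random() < true_ctr[arm], rng)
```

After the loop, `params` holds each arm's posterior. Arms whose samples rarely come out on top get few pulls, so their distributions stay wide while the leading arm's distribution narrows.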

Posterior after 100k pulls (30 arms)

Bandits at Meetup

Meetup’s First Bandit

76 Arms
Control: Welcome To Meetup! - 60% Open Rate
Winner: Hi - 75% Open Rate (+25%)

Avoid Linkbaity Subject Lines

Coupon
16 Arms
Control: Save 50%, start your Meetup Group - 42% Open Rate
Winner: Here is a coupon - 53% Open Rate (+26%)

398 Arms

210% Click-through Difference:
Best: Looking to start the perfect Meetup for you? We’ll help you find just the right people Start the perfect Meetup for you!
Worst: Launch your own Meetup in January and save 50% Start the perfect Meetup for you 50% off promotion ends February 1st.

Choose the Right Metric of Success
- Success tied to click in last experiment
- Sale end & discount messaging had bad results
- Perhaps people don’t know that hosting a Meetup costs $$$?
- Better to tie success to group creation

More Issues
- Email open & click delay
- New subject line effect
- Problem when testing notifications
- Monitor success trends to detect weirdness

Seasonality
- Thompson Sampling should naturally adapt to seasonal changes
- Learning rate can be added for faster adaptation
Click Here (winner all other times) vs. Click Here
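The talk does not say how a learning rate would be added. One possible reading, sketched here purely as an assumption, is to decay each arm's accumulated evidence toward the Beta(1,1) prior before every update, so that recent results weigh more than old ones. The function name and the `decay` parameter are hypothetical.

```python
def discounted_update(alpha, beta, success, decay=0.99):
    """Update one arm's Beta(alpha, beta) after first fading old evidence.

    decay=1.0 gives the ordinary Thompson Sampling update; smaller values
    forget faster. This is an illustrative assumption, not the talk's method.
    """
    # Shrink the evidence collected so far toward the Beta(1,1) prior.
    alpha = 1.0 + decay * (alpha - 1.0)
    beta = 1.0 + decay * (beta - 1.0)
    # Then record the newest observation at full weight.
    if success:
        alpha += 1.0
    else:
        beta += 1.0
    return alpha, beta
```

Under this scheme an arm's effective sample size settles at roughly 1 / (1 - decay), so its distribution never becomes so narrow that a seasonal shift goes unnoticed.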

Bandit or Split Test?
AB Test good for:
- Biased Tests
- Complicated Tests
Bandit good for:
- Unbiased Tests
- Many Variants
- Time Constraints
- Set It And Forget It

Thanks!
