Ad Click Prediction: a View from the Trenches

Google paper 2013 윤철환

Google Ad

System Overview

FTRL-Proximal Algorithm
Combines Online Gradient Descent (OGD) with Regularized Dual Averaging (RDA). Key ingredients: gradient, learning rate.
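A minimal sketch of a per-coordinate FTRL-Proximal update for logistic regression, following the update rule as given in the paper's pseudocode. The class name `FTRLProximal`, the dictionary-based feature representation, and the default hyperparameters (`alpha`, `beta`, `l1`, `l2`) are assumptions chosen for illustration, not values from the slides.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Illustrative per-coordinate FTRL-Proximal for logistic regression."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.1, l2=1.0):
        # Hyperparameter defaults are assumptions for this sketch.
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # per-coordinate adjusted gradient sums
        self.n = defaultdict(float)  # per-coordinate sums of squared gradients

    def weight(self, i):
        """Closed-form weight; exactly zero while |z_i| <= l1 (sparsity)."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        n = self.n.get(i, 0.0)
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        """x maps feature name -> value; returns the predicted click probability."""
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example (x, y) with label y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for coordinate i
            w = self.weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p
```

Only `z` and `n` are stored per coordinate; weights are derived on demand, so coordinates whose `|z|` stays below `l1` cost nothing at prediction time.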

Per Coordinate Learning Rates
N: number of negative events. P: number of positive events. p = P / (N + P)
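In the paper, these counts stand in for the per-coordinate sum of squared gradients: if every event is assumed to be predicted with probability p = P / (N + P), the sum reduces to P·N / (N + P), so only two counters are needed per coordinate. The sketch below illustrates that approximation; the function names and the `alpha` and `beta` defaults are assumptions.

```python
import math


def approx_sum_sq_gradients(n_neg, n_pos):
    """Approximate sum of squared log-loss gradients from event counts.

    Assumes each event was predicted with p = P / (N + P), which gives
    P * (1 - p)**2 + N * p**2 = P * N / (N + P).
    """
    total = n_neg + n_pos
    if total == 0:
        return 0.0
    return n_pos * n_neg / total


def learning_rate(n_neg, n_pos, alpha=0.1, beta=1.0):
    """Per-coordinate learning rate alpha / (beta + sqrt(sum of g^2))."""
    return alpha / (beta + math.sqrt(approx_sum_sq_gradients(n_neg, n_pos)))
```

For example, a coordinate seen in 90 negative and 10 positive events gets an approximate squared-gradient sum of 9, so its rate shrinks as evidence accumulates while rarely seen coordinates keep a larger rate.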

Memory-Saving Techniques
Probabilistic feature inclusion
Subsampling training data
Encoding values with fewer bits

Probabilistic Feature Inclusion
Poisson Inclusion: new features are inserted into the model with probability p.
Bloom Filter Inclusion: once a feature has occurred more than n times (according to the filter), it is added to the model.
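Both inclusion rules can be sketched in a few lines. The classes below are hypothetical illustrations, not the production design: `CountingBloomFilter` is a simple counting filter built on SHA-256, and the filter size, hash count, and `observe` interface are assumptions.

```python
import hashlib
import random


class CountingBloomFilter:
    """Small counting Bloom filter; may overestimate counts, never underestimates."""

    def __init__(self, num_counters=1024, num_hashes=3):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, key):
        # A set avoids incrementing one counter twice for the same key.
        idx = set()
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            idx.add(int.from_bytes(digest[:8], "big") % len(self.counters))
        return idx

    def add(self, key):
        for i in self._indexes(key):
            self.counters[i] += 1

    def count(self, key):
        return min(self.counters[i] for i in self._indexes(key))


class BloomInclusion:
    """Admit a feature once the filter reports more than n occurrences."""

    def __init__(self, n, bloom=None):
        self.n = n
        self.bloom = CountingBloomFilter() if bloom is None else bloom
        self.model_features = set()

    def observe(self, feature):
        if feature in self.model_features:
            return True
        self.bloom.add(feature)
        if self.bloom.count(feature) > self.n:
            self.model_features.add(feature)
            return True
        return False


class PoissonInclusion:
    """Admit each not-yet-included feature with probability p per occurrence."""

    def __init__(self, p, rng=None):
        self.p = p
        self.rng = random.Random() if rng is None else rng
        self.model_features = set()

    def observe(self, feature):
        if feature in self.model_features:
            return True
        if self.rng.random() < self.p:
            self.model_features.add(feature)
            return True
        return False
```

In both variants `observe` returns whether the feature is in the model after this occurrence. Hash collisions in the filter can admit a feature slightly early, which is the accepted cost of not storing exact counts for rare features.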

Subsampling Training Data
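The training sample consists of:
Any query for which at least one of the ads was clicked.
A fraction r ∈ (0, 1] of the queries where none of the ads were clicked.
To correct the sampling bias, each event t receives an importance weight ω_t = 1 if its query had a click and ω_t = 1/r otherwise. With this weighting, the expected contribution of a randomly chosen event t in the unsampled data to the sub-sampled objective function equals its original loss.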
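The sampling rule can be sketched as follows. The `subsample` function and the `"clicked"` field on each query are assumptions made for this example. Each kept query is paired with a weight of 1 if it had a click and 1/r otherwise, so an unclicked query is kept with probability r and counted 1/r times, leaving its expected contribution unchanged.

```python
import random


def subsample(queries, r, rng=None):
    """Keep all clicked queries and a fraction r of unclicked ones.

    Returns (query, importance_weight) pairs. Unclicked queries are kept
    with probability r and weighted 1 / r, so r * (1 / r) = 1 in expectation.
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("r must be in (0, 1]")
    rng = random.Random() if rng is None else rng
    sample = []
    for q in queries:
        if q["clicked"]:
            sample.append((q, 1.0))
        elif rng.random() < r:
            sample.append((q, 1.0 / r))
    return sample
```

During training, the weight would scale the loss (and therefore the gradient) of every event in that query.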

Encoding Values with Fewer Bits
Naive implementations of Online Gradient Descent store coefficients as 32- or 64-bit floating point values. For regularized logistic regression models, such encodings waste memory. Use a 16-bit fixed-point encoding (q2.13) instead: no measurable loss in precision, and 75% RAM savings compared with 64-bit floating point.
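A sketch of q2.13 encoding: one sign bit, 2 integer bits, and 13 fractional bits packed into a signed 16-bit integer. The randomized rounding step follows the approach described in the paper, where it keeps rounding errors from accumulating in one direction; the function names and the clamping behaviour at the range limits are assumptions.

```python
import math
import random

FRACTION_BITS = 13
SCALE = 1 << FRACTION_BITS                   # 2**13 steps per unit
Q_MIN, Q_MAX = -(1 << 15), (1 << 15) - 1     # signed 16-bit range


def encode_q2_13(value, rng=None):
    """Encode a float as a q2.13 integer using randomized rounding.

    Adding a uniform draw from [0, 1) before flooring rounds up or down
    with probability proportional to the remainder, so the rounding is
    unbiased in expectation. Out-of-range values are clamped (assumption).
    """
    rng = random.Random() if rng is None else rng
    q = math.floor(value * SCALE + rng.random())
    return max(Q_MIN, min(Q_MAX, q))


def decode_q2_13(q):
    """Convert a q2.13 integer back to a float."""
    return q / SCALE
```

Values exactly representable in 13 fractional bits round-trip without error; all other in-range values are off by less than 2^-13 after decoding.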

GridViz
