1 Estimating Rates of Rare Events at Multiple Resolutions Deepak Agarwal Andrei Broder Deepayan Chakrabarti Dejan Diklic Vanja Josifovski Mayssam Sayyadian.


1 Estimating Rates of Rare Events at Multiple Resolutions
Deepak Agarwal, Andrei Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, Mayssam Sayyadian

2 Estimation in the “tail”
Contextual advertising
  - Show an ad on a webpage (an “impression”)
  - Revenue is generated if a user clicks
  - Problem: estimate the click-through rate (CTR) of an ad on a page
Most (ad, page) pairs have very few impressions, if any, and even fewer clicks
  → Severe data sparsity

3 Estimation in the “tail”
Use an existing, well-understood hierarchy
  - Categorize ads and webpages to leaves of the hierarchy
  - CTR estimates of siblings are correlated
  - The hierarchy allows us to aggregate data
Coarser resolutions
  - provide reliable estimates for rare events
  - which then influence estimation at finer resolutions

4 System overview
[Figure: processing pipeline]
  - Retrospective data [URL, ad, isClicked]
  - Crawl URLs (a sample of URLs)
  - Classify pages and ads
  - Impute impressions, fix sampling bias
  - Rare event estimation using hierarchy

5 Sampling of webpages
Naïve strategy: sample at random from the set of URLs
  - Sampling errors in impression volume AND click volume
Instead, we propose:
  - Crawling all URLs with at least one click, and
  - a sample of the remaining URLs
  → Variability is only in impression volume

6 Imputation of impression volume
[Figure: matrix of page classes (rows) by ad classes (columns)]
Each cell ij: #impressions = n_ij + m_ij + x_ij
  - n_ij: clicked pool
  - m_ij: sampled non-clicked pool
  - x_ij: excess impressions (to be imputed)
Constraints
  - Column constraint: each column sums to #impressions on ads of this ad class
  - Row constraint: each row sums to ∑ n_ij + K·∑ m_ij
  - The whole matrix sums to total impressions (known)

7 Imputation of impression volume
Region = (page node, ad node)
Region hierarchy
  - A cross-product of the page hierarchy and the ad hierarchy
[Figure: page hierarchy and ad hierarchy from Level 0 to Level i; a region is one cell of the page classes × ad classes grid]
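
The cross-product construction can be sketched in a few lines of Python. The node names and both toy hierarchies below are hypothetical, and the sketch assumes that a region pairs a page node and an ad node from the same level, and that the parent of a region is the pair of the two parents. The slides suggest this but do not state it outright.

```python
# Hypothetical toy hierarchies: each node maps to its parent (roots map to None).
PAGE_PARENT = {"pages": None, "sports": "pages", "news": "pages", "golf": "sports"}
AD_PARENT = {"ads": None, "apparel": "ads", "travel": "ads", "shoes": "apparel"}


def depth(node, parent):
    """Number of edges between a node and the root of its hierarchy."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d


def regions_at_level(level, page_parent=PAGE_PARENT, ad_parent=AD_PARENT):
    """All (page node, ad node) pairs whose nodes both sit at the given level."""
    pages = sorted(p for p in page_parent if depth(p, page_parent) == level)
    ads = sorted(a for a in ad_parent if depth(a, ad_parent) == level)
    return [(p, a) for p in pages for a in ads]


def region_parent(region, page_parent=PAGE_PARENT, ad_parent=AD_PARENT):
    """Parent region: (parent of page node, parent of ad node), assumed here."""
    page, ad = region
    pp, ap = page_parent[page], ad_parent[ad]
    if pp is None or ap is None:
        return None  # the root region has no parent
    return (pp, ap)
```

Under these assumptions, `regions_at_level(1)` enumerates the four level-1 regions of the toy hierarchies, and `region_parent(("golf", "shoes"))` maps a level-2 region to its level-1 parent.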

8 Imputation of impression volume
Block constraint: the cells of each block at Level i+1 sum to their parent cell at Level i
[Figure: Level i and Level i+1]

9 Imputing x_ij
Iterative Proportional Fitting [Darroch+/1972]
  - Initialize x_ij = n_ij + m_ij
  - Iteratively scale x_ij values to match row/col/block constraints
  - Ordering of constraints: top-down, then bottom-up, and repeat
[Figure: a block of the page classes × ad classes grid at Level i and Level i+1]
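
As a rough illustration of the scaling idea, here is a minimal sketch of iterative proportional fitting for a single level with row and column targets only. It leaves out the block constraint and the top-down/bottom-up ordering, and the function name, tolerance, and toy numbers are assumptions rather than anything taken from the talk.

```python
def ipf(seed, row_targets, col_targets, iters=200, tol=1e-9):
    """Scale a seed matrix until its row and column sums match the targets.

    seed plays the role of the initial values (n_ij + m_ij on the slide);
    the targets must share the same grand total for the fit to converge.
    """
    x = [row[:] for row in seed]  # work on a copy
    for _ in range(iters):
        # Match the row constraint: rescale every row to its target sum.
        for i, target in enumerate(row_targets):
            s = sum(x[i])
            if s > 0:
                x[i] = [v * target / s for v in x[i]]
        # Match the column constraint: rescale every column to its target sum.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in x)
            if s > 0:
                for row in x:
                    row[j] *= target / s
        # Columns are exact after the step above, so check the rows.
        err = max(abs(sum(x[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return x


fitted = ipf([[1.0, 2.0], [3.0, 4.0]], row_targets=[6.0, 14.0], col_targets=[8.0, 12.0])
```

Each pass keeps the ratios inside a row (or column) fixed and only changes the overall scale, which is why the seed values matter: cells that start at zero stay at zero.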

10 Imputation: Summary
Given
  - n_ij (impressions in clicked pool)
  - m_ij (impressions in sampled non-clicked pool)
  - # impressions on ads of each ad class in the ad hierarchy
We get
  - Estimated impression volume Ñ_ij = n_ij + m_ij + x_ij in each region ij of every level

11 System overview
[Figure: processing pipeline]
  - Retrospective data [page, ad, isclicked]
  - Crawl pages (a sample of pages)
  - Classify pages and ads
  - Impute impressions, fix sampling bias
  - Rare event estimation using hierarchy

12 Rare rate modeling
1. Freeman-Tukey transform:
  - y_ij = F-T(clicks and impressions at ij) ≈ transformed-CTR
  - Variance stabilizing transformation: Var(y) is independent of E[y] → needed in further modeling
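
The slide does not spell out the transform, so the sketch below uses one common form of the Freeman-Tukey transform for rates, the average of two square roots. The exact scaling is an assumption, and the function name is hypothetical.

```python
import math


def freeman_tukey(clicks, impressions):
    """One common Freeman-Tukey form for a rate (assumed, not taken from the slide).

    Roughly the square root of the CTR, but still informative when clicks == 0.
    """
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 0.5 * (math.sqrt(clicks / impressions)
                  + math.sqrt((clicks + 1) / impressions))


# Two zero-click regions are no longer tied: fewer impressions gives a larger value.
y_small = freeman_tukey(0, 100)
y_large = freeman_tukey(0, 10000)
```

With this form, a zero-click region with 100 impressions gets a larger transformed value than one with 10,000 impressions, which reflects that the second region gives much stronger evidence of a low CTR.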

13 Rare rate modeling
2. Generative model (tree-structured Markov model)
[Figure: graphical model linking region ij to parent(ij)]
  - y_ij, y_parent(ij)
  - S_ij, S_parent(ij): unobserved “state”
  - β_ij, β_parent(ij): covariates
  - V_ij, V_parent(ij): variance
  - W_ij, W_parent(ij): variance

14 Rare rate modeling
Model fitting with a 2-pass Kalman filter:
  - Filtering: leaf to root
  - Smoothing: root to leaf
Linear in the number of regions
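
To make the two passes concrete, here is a minimal sketch for a simplified scalar model. It assumes S_root ~ N(0, W0), S_r = S_parent(r) + N(0, W), and y_r = S_r + N(0, V_r), with no covariates and a single shared W. It is written in information (precision) form, which yields the same posterior means as Kalman filtering followed by smoothing for this model. It is not the authors' implementation, and all names are hypothetical.

```python
def two_pass_tree_smoother(children, root, y, V, W, W0):
    """Posterior mean of the state S_r at every node of a tree.

    children: dict node -> list of child nodes
    y, V: observation and its variance for each observed node
    W: variance of the parent-to-child step; W0: prior variance at the root
    """
    # Order nodes so that every parent comes before its children.
    order, stack = [], [root]
    while stack:
        r = stack.pop()
        order.append(r)
        stack.extend(children.get(r, []))

    J_up, h_up, msg = {}, {}, {}
    # Filtering pass (leaves to root): aggregate evidence from each subtree.
    for r in reversed(order):
        J = 1.0 / V[r] if r in y else 0.0   # precision of local evidence
        h = y[r] / V[r] if r in y else 0.0  # precision-weighted observation
        for c in children.get(r, []):
            J += msg[c][0]
            h += msg[c][1]
        J_up[r], h_up[r] = J, h
        shrink = 1.0 + J * W                # passing through an edge adds variance W
        msg[r] = (J / shrink, h / shrink)

    # Smoothing pass (root to leaves): push information from the rest of the tree down.
    J_post = {root: J_up[root] + 1.0 / W0}
    h_post = {root: h_up[root]}
    for r in order:
        for c in children.get(r, []):
            J_rest = J_post[r] - msg[c][0]  # parent's belief without c's own evidence
            h_rest = h_post[r] - msg[c][1]
            shrink = 1.0 + J_rest * W
            J_post[c] = J_up[c] + J_rest / shrink
            h_post[c] = h_up[c] + h_rest / shrink
    return {r: h_post[r] / J_post[r] for r in order}


means = two_pass_tree_smoother({"root": ["a", "b"]}, "root",
                               y={"a": 2.0}, V={"a": 1.0}, W=1.0, W0=1.0)
```

Each node is visited once per pass, so the cost is linear in the number of nodes. In the toy call, node `b` has no data of its own, so its estimate comes entirely from what its sibling `a` tells the shared parent.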

15 Experiments
503M impressions
7-level hierarchy, of which the top 3 levels were used
Zero clicks in
  - 76% of regions in level 2
  - 95% of regions in level 3
Full dataset DFULL, and a 2/3 sample DSAMPLE

16 Experiments
Estimate CTRs for all regions R in level 3 with zero clicks in DSAMPLE
Some of these regions, R_>0, get clicks in DFULL
A good model should predict higher CTRs for the regions in R_>0 than for the other regions in R
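
The slides do not name the metric used for this comparison. One way to quantify "higher CTRs for R_>0" is pairwise concordance: the fraction of (region in R_>0, other region) pairs in which the region that later received clicks was given the higher predicted CTR. The sketch below is an assumption about how such a check could look, with hypothetical region names and values.

```python
def pairwise_concordance(predicted_ctr, got_clicks):
    """Fraction of (clicked, non-clicked) region pairs ranked correctly.

    predicted_ctr: dict region -> predicted CTR, for every region in R
    got_clicks: set of regions in R that received clicks in the full data
    Ties count as half a win; 1.0 is a perfect ranking, 0.5 is chance level.
    """
    pos = [v for r, v in predicted_ctr.items() if r in got_clicks]
    neg = [v for r, v in predicted_ctr.items() if r not in got_clicks]
    if not pos or not neg:
        raise ValueError("need at least one region in each group")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for three zero-click regions.
score = pairwise_concordance({"r1": 0.003, "r2": 0.001, "r3": 0.002}, {"r1"})
```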

17 Experiments
We compared 4 models
  - TS: our tree-structured model
  - LM (level-mean): each level smoothed independently
  - NS (no smoothing): CTR proportional to 1/Ñ
  - Random: assuming |R_>0| is given, randomly predict the membership of R_>0 out of R

18 Experiments
[Figure: comparison of TS, Random, LM, and NS]

19 Experiments
Enough impressions → little “borrowing” from siblings
Few impressions → estimates depend more on siblings

20 Related Work
Multi-resolution modeling
  - studied in time series modeling and spatial statistics [Openshaw+/79, Cressie/90, Chou+/94]
Imputation
  - studied in statistics [Darroch+/1972]
Application of such models to estimation of such rare events (rates of ~10^-3) is novel

21 Conclusions
We presented a method to estimate
  - rates of extremely rare events
  - at multiple resolutions
  - under severe sparsity constraints
Our method has two parts
  - Imputation → incorporates hierarchy, fixes sampling bias
  - Tree-structured generative model → extremely fast parameter fitting

22 Rare rate modeling
1. Freeman-Tukey transform
  - Distinguishes between regions with zero clicks based on the number of impressions
  - Variance stabilizing transformation: Var(y) is independent of E[y] → needed in further modeling
[Formula: defined in terms of # clicks in region r and # impressions in region r]

23 Rare rate modeling
Generative model
  - S_ij values can be quickly estimated using a Kalman filtering algorithm
  - Kalman filter requires knowledge of β, V, and W → EM wrapped around the Kalman filter
[Figure: filtering and smoothing passes]

24 Rare rate modeling
Fitting using a Kalman filtering algorithm
  - Filtering: recursively aggregate data from leaves to root
  - Smoothing: propagate information from root to leaves
Complexity: linear in the number of regions, for both time and space
[Figure: filtering and smoothing passes]

25 Rare rate modeling
Fitting using a Kalman filtering algorithm
  - Filtering: recursively aggregate data from leaves to root
  - Smoothing: propagate information from root to leaves
Kalman filter requires knowledge of β, V, and W → EM wrapped around the Kalman filter
[Figure: filtering and smoothing passes]

26 Imputing x_ij
Iterative Proportional Fitting [Darroch+/1972]
Initialize x_ij = n_ij + m_ij
Top-down:
  - Scale all x_ij in every block in Z(i+1) to sum to its parent in Z(i)
  - Scale all x_ij in Z(i+1) to sum to the row totals
  - Scale all x_ij in Z(i+1) to sum to the column totals
  - Repeat for every level Z(i)
Bottom-up: similar
[Figure: a block of the page classes × ad classes grid in Z(i) and Z(i+1)]
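
The first top-down step, scaling every block in Z(i+1) to sum to its parent in Z(i), can be sketched as below. Cells are keyed by hypothetical names, and `parent_of` is an assumed lookup from each child cell to its parent cell. Blocks whose cells are all zero are left unchanged, which is a choice made for this sketch only.

```python
def scale_blocks_to_parents(child_x, parent_of, parent_x):
    """Rescale child cells so that each block sums to its parent cell's value.

    child_x: dict child cell -> current value at level i+1
    parent_of: dict child cell -> parent cell at level i (assumed lookup)
    parent_x: dict parent cell -> value at level i
    """
    # Current sum of every block, grouped by parent cell.
    block_sums = {}
    for cell, value in child_x.items():
        p = parent_of[cell]
        block_sums[p] = block_sums.get(p, 0.0) + value

    # Multiply each cell by (parent value / block sum); ratios inside a block are kept.
    scaled = {}
    for cell, value in child_x.items():
        p = parent_of[cell]
        s = block_sums[p]
        scaled[cell] = value * parent_x[p] / s if s > 0 else value
    return scaled


scaled = scale_blocks_to_parents(
    child_x={"a1": 1.0, "a2": 3.0, "b1": 2.0},
    parent_of={"a1": "A", "a2": "A", "b1": "B"},
    parent_x={"A": 8.0, "B": 5.0},
)
```

In the toy call, block A is scaled by a factor of 2 and block B by a factor of 2.5, after which each block sums to its parent value. The row and column scaling steps would then follow at the same level.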

