Computational Advertising and

Slides:



Advertisements
Similar presentations
Data Mining and Text Analytics Advertising Laura Quinn.
Advertisements

Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
Lecture 24 Coping with NPC and Unsolvable problems. When a problem is unsolvable, that's generally very bad news: it means there is no general algorithm.
Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Note to other teachers and users of these.
Combinatorial Algorithms
DISTRIBUTED COMPUTING & MAP REDUCE CS16: Introduction to Data Structures & Algorithms Thursday, April 17,
Peer-to-Peer Distributed Search. Peer-to-Peer Networks A pure peer-to-peer network is a collection of nodes or peers that: 1.Are autonomous: participants.
Online Social Networks and Media. Graph partitioning The general problem – Input: a graph G=(V,E) edge (u,v) denotes similarity between u and v weighted.
APACHE GIRAPH ON YARN Chuan Lei and Mohammad Islam.
Junction Trees: Motivation Standard algorithms (e.g., variable elimination) are inefficient if the undirected graph underlying the Bayes Net contains cycles.
1 Maximal Independent Set. 2 Independent Set (IS): In a graph, any set of nodes that are not adjacent.
1 Complexity of Network Synchronization Raeda Naamnieh.
Using Structure Indices for Efficient Approximation of Network Properties Matthew J. Rattigan, Marc Maier, and David Jensen University of Massachusetts.
CS 345 Data Mining Online algorithms Search advertising.
Recent Development on Elimination Ordering Group 1.
Approximation Algorithms
THE QUERY COMPILER 16.6 CHOOSING AN ORDER FOR JOINS By: Nitin Mathur Id: 110 CS: 257 Sec-1.
Chain: Operator Scheduling for Memory Minimization in Data Stream Systems Authors: Brian Babcock, Shivnath Babu, Mayur Datar, and Rajeev Motwani (Dept.
CS 345 Data Mining Online algorithms Search advertising.
Online Matching for Internet Auctions DIMACS REU Presentation 20 June 2006 Slide 1 Online Matching for Internet Auctions Ben Sowell Advisor: S. Muthukrishnan.
ASC Program Example Part 3 of Associative Computing Examining the MST code in ASC Primer.
Fundamental Techniques
Choosing an Order for Joins (16.6) Neha Saxena (214) Instructor: T.Y.Lin.
Distributed Constraint Optimization * some slides courtesy of P. Modi
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
HAL R VARIAN FEBRUARY 16, 2009 PRESENTED BY : SANKET SABNIS Online Ad Auctions 1.
Introduction to Hadoop and HDFS
CSE 486/586 CSE 486/586 Distributed Systems Graph Processing Steve Ko Computer Sciences and Engineering University at Buffalo.
1 Online algorithms Typically when we solve problems and design algorithms we assume that we know all the data a priori. However in many practical situations.
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
Data Structures and Algorithms in Parallel Computing Lecture 4.
CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original.
Vertex Coloring Distributed Algorithms for Multi-Agent Networks
Data Structures and Algorithms in Parallel Computing
© The McGraw-Hill Companies, Inc., Chapter 12 On-Line Algorithms.
11 -1 Chapter 12 On-Line Algorithms On-Line Algorithms On-line algorithms are used to solve on-line problems. The disk scheduling problem The requests.
Models of Greedy Algorithms for Graph Problems Sashka Davis, UCSD Russell Impagliazzo, UCSD SIAM SODA 2004.
1 VLDB, Background What is important for the user.
Analysis of Massive Data Sets Prof. dr. sc. Siniša Srbljić Doc. dr. sc. Dejan Škvorc Doc. dr. sc. Ante Đerek Faculty of Electrical Engineering and Computing.
On the Ability of Graph Coloring Heuristics to Find Substructures in Social Networks David Chalupa By, Tejaswini Nallagatla.
PPC Tutorial For Beginners. Content  PPC  Paid & Organic Advertisement  What is Search Engine?  How to set up account in Google Adwords? Target Your.
Search Engine Optimization
Unsupervised Learning
Computational Advertising
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
Greedy Method 6/22/2018 6:57 PM Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015.
Parallel Programming By J. H. Wang May 2, 2017.
Data Virtualization Tutorial… Semijoin Optimization
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
PREGEL Data Management in the Cloud
AdWords and Generalized On-line Matching
13 Text Processing Hongfei Yan June 1, 2016.
What is Search Engine optimization
MapReduce Computing Paradigm Basics Fall 2013 Elke A. Rundensteiner
Cloud Computing Group 7.
Data Structures and Algorithms in Parallel Computing
Review of Bulk-Synchronous Communication Costs Problem of Semijoin
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
CPU Scheduling G.Anuradha
Online algorithms Search advertising
Use of Electronic and Internet advertising options
Brian Babcock, Shivnath Babu, Mayur Datar, and Rajeev Motwani
Advertising on the web.
Agenda What is SEO ? How Do Search Engines Work? Measuring SEO success ? On Page SEO – Basic Practices? Technical SEO - Source Code. Off Page SEO – Social.
Chapter 6 Network Flow Models.
Review of Bulk-Synchronous Communication Costs Problem of Semijoin
Approximation Algorithms
Algorithm Course Algorithms Lecture 3 Sorting Algorithm-1
Unsupervised Learning
Presentation transcript:

Computational Advertising and Comparison between Mapreduce and Bulk Synchronous systems

Competitive Algorithms Picking the Best Advertisement Computational Advertising Find the most relevant ads matching a particular context on the Web. The context depends on the type of advertising and could mean the content where the ad is shown, the user who is viewing the ad, or the social network of the user. Greedy Algorithms Competitive Algorithms Picking the Best Advertisement The Balance Algorithm

Data can be obtained when the algorithm runs. Online Algorithms Offline Algorithms Data can be obtained when the algorithm runs. Decision making does not take place until all the information are obtained Data needed is provided to algorithm initially Data access can be of any order, algorithm provides an answer once executed

Greedy Algorithms ● These algorithms make their decision to an object one step at a time, at each step choosing the locally best option. ● Problem solving heuristic to optimize choice. ● In some cases, greedy algorithms construct the globally best choice by repeatedly choosing the locally best option.

Advantages and Disadvantages ● Pros ○ Simplicity -- greedy algorithms are easier to describe and code than others ○ Efficiency -- greedy algorithms can be implemented more efficiently than others ● Cons ○ Simplicity -- it works to construct local optimal steps and not the global optimal result (short sighted optimization) ○ Efficiency -- when its wrong its severely wrong

Question Give some advantages of Greedy Algorithms

Competitive Algorithms When comparing online and offline algorithms, we can expect that there will be some constant c less than 1 such that on any input, the result of an on-line algorithm is at least c times the result of the optimal offline algorithm. Comparing the performance of online algorithm with corresponding offline algorithm for same problem instance. Competitive ratio = min over all possible inputs sequences( | M online | / | M opt| ) where | M online | = cardinality of online algorithm where | M opt| = cardinality of optimal offline algorithm

Real time example: Google Adwords Online Advertising Service where advertisers bid for queries on the search engine. Search engine selects and displays ads for a small subset of the bidders in a purpose of achieving the goal of maximizing search engine’s revenue Advertisers: A and B; A bids on query x and B bids on queries x and y Sequence of queries: xxyy Query assigned to advertiser who has the maximum remaining budget Competitive Ratio = ¾

Could you name some examples for competitive algorithms?

Picking the Best Ad History Advertisers bid on search keywords. Advertiser pays only when ad is clicked Adwords Problem Many Advertisers bid on each query Search engine must pick a subset of advertisers whose ads are shown

Analysing Balance Example: two advertisers, A1 and A2, each with budget B greater than 1, an even number. optimal solution exhausts both advertisers’ budgets. I.e, revenue to search engine =2B Balance must exhaust at least one advertiser’s budget, If not, we can allocate more queries

Question What does the Search Engine guarantee its advertisers ?

Balance Algorithm This is a simple improvement to greedy algorithm •Improved competitive ratio than greedy algorithm •The competitive ratio is 75% for the case of two advertisers •The competitive ratio is about 63% for a greater number of advertisers •For each query, the advertiser with the largest remaining budget will be picked •Break the ties arbitrarily

Balance Algorithm Example •Example: Consider two advertisers A1 and A2, each bidding on query q. •A1: bid = 1, balance = 110. •A2: bid = 10, balance = 100. •First 10 occurrences of query q, all go to A1, and A1 then gets 10 q’s for every one that A2 gets. •What if there are only 10 occurrences of q? •Optimal yields $100 whereas Balance yields only $10.

Question: Which is best algorithm : Greedy Algorithm or Balanced Algorithm.

Comparison of MapReduce with Bulk-Synchronous Systems Review of Bulk-Synchronous Communication Costs Problem of Semijoin

What is a Graph? Graph is a non linear data structure Collection of vertices and edges Examples of Graphs Hyperlink structure of the Web Physical structure of computers on the Internet Interstate highway system Social networks

The Graph Model Views all computation as a recursion on some graph. Graph nodes send messages to one another. Messages bunched into supersteps, where each graph node processes all data received. Sending individual messages would result in far too much overhead. Due to the size of graph data, graphs often require usage of parallel processing for efficient computation

Bulk Synchronous Parallel Solution to Graph Model Create a graph node for every tuple in R and S Create a graph node for every B-value. Tuples from S send a message to its corresponding B-value Similarly, Tuples from R send a message to its corresponding b in B-value. Now, the node for 'b' sends message to R, if it has received at least a single request from S Atlast, the traditional map-reduce operation is performed without considering the non-dangling tuples in S

Applications of Bulk-Synchronous Some examples of Bulk-Synchronous Systems are: Pregel - Google Giraph - opensource Pregel build on Hadoop GraphX - for sparks GraphLab - is built to handle higher degree nodes

QUESTION?? Name one application of Bulk Synchronous technique.

SemiJoin What is Semijoin? When a dataset R is large then many records in R may not be actually referenced by any records in a said table L. Therefore semi-join can be used to avoid sending the records in R that will not join with L. Although semi-join avoids sending the records in R that will not join with L, it does this at the cost of an extra scan of said table L.

Problem with Semijoin Consider a example of Join of T(P,Q) and S(Q,R), where : A is really large field - a video ; Q is the video ID ; S(Q,R)is small number of streaming requests, where R is the destination. In semjoin - Find all values of Q in S and filter those (p,q) in T that are dangling -- will not join anything in S Then Map need not move dangling tuples to any reducer. The problem with semijoin is that it also requires that very T-tuple be sent to its mapper to some reducer.

Mention one problem of Semijoin? QUESTION?? Mention one problem of Semijoin?