Web Science & Technologies University of Koblenz ▪ Landau, Germany Online Advertising Steffen Staab.

2 Introduction to Web ScienceSlide 2 of 71 Topics  Introduction to online advertisement  Understanding the participants and their roles.  Targeted advertising.  Privacy Issues  Solutions  User based solutions  Collaborative solutions  Conclusions

3 Introduction to Web ScienceSlide 3 of 71 Introduction  Online advertising plays a critically important role on the Internet.  Advertising is the main way of profiting from the Internet, and its history developed alongside the growth of the medium itself.

4 Introduction to Web ScienceSlide 4 of 71 Facts and short history  First Internet banner ad: 1994, AT&T.  Also in 1994: the first commercial spam, a "Green Card Lottery".  The first ad server was developed by FocaLink Media Services and introduced in 1995.  In March 2008, Google acquired DoubleClick for US$3.1 billion in cash.

5 Introduction to Web ScienceSlide 5 of 71 Parties  Advertiser  Got money, wants publicity  e.g., Coca-Cola  Publisher  Got content, wants money   Ad-network  Got advertising infrastructure, wants money  e.g., Google AdSense, Yahoo  Consumer  Wants free content

6 Introduction to Web ScienceSlide 6 of 71 Ad embedding

7 Introduction to Web ScienceSlide 7 of 71 Business Model  CPM = cost per thousand impressions  Impression: the user just sees the ad.  Rates vary from $0.25 to $100  CPC = cost per click  This is the cost charged to an advertiser every time their ad is clicked  Rates are around $0.30 per click
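
The two pricing models can be compared with a little arithmetic. The sketch below is illustrative only: the function names and the campaign figures (200,000 impressions, a $2.50 CPM, a 1% click-through rate) are assumptions chosen to fall inside the ranges quoted on the slides, not data from the deck.

```python
def cpm_cost(impressions: int, cpm_rate: float) -> float:
    """Cost of a campaign billed per thousand impressions (CPM)."""
    return impressions / 1000 * cpm_rate


def cpc_cost(clicks: float, cpc_rate: float) -> float:
    """Cost of a campaign billed per click (CPC)."""
    return clicks * cpc_rate


def effective_cpm(ctr: float, cpc_rate: float) -> float:
    """CPM-equivalent price of a CPC deal, given the click-through rate."""
    return ctr * cpc_rate * 1000


# Hypothetical campaign: all numbers below are assumptions for illustration.
impressions = 200_000
print(round(cpm_cost(impressions, 2.50), 2))          # billed per impression
print(round(cpc_cost(impressions * 0.01, 0.30), 2))   # billed per click at 1% CTR
print(round(effective_cpm(0.01, 0.30), 2))            # CPC deal expressed as a CPM
```

Expressing a CPC price as an effective CPM makes the two models comparable: the publisher's revenue per thousand impressions under CPC depends on how often the ad is actually clicked.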

8 Introduction to Web ScienceSlide 8 of 71 Click fraud  Clicking on an ad to generate a per-click charge without any actual interest in the ad.  The perpetrator might be:  The publisher  The advertiser’s competitor  The publisher’s competitor  Ad-networks deal with it by trying to identify who clicks on the ads.

9 Introduction to Web ScienceSlide 9 of 71 Online Advertising and Ad Auctions at Google Vahab Mirrokni Google Research, New York

10 Introduction to Web ScienceSlide 10 of 71 Traditional Advertising  At the beginning: traditional ads  Posters, magazines, newspapers, billboards.  What is being sold:  Pay-per-impression: price depends on how many people your ad is shown to (whether or not they look at it)  Pricing:  Complicated negotiations (with high monthly premiums...)  These form a barrier to entry for small advertisers

11 Introduction to Web ScienceSlide 11 of 71 Advertising on the Web  Online ads:  Banner ads, sponsored search ads, pay-per-sale ads.  Targeting:  Show ads to a particular set of viewers.  Measurement:  Accurate metrics: clicks, tracked purchases.  What is being sold:  Pay-per-click, pay-per-action, pay-per-impression.  Pricing:  Auctions.

12 Introduction to Web ScienceSlide 12 of 71 History of Online Advertising  1994: Banner ads, pay-per-impression. Banner ads for Zima and AT&T appear online.  1996: Affiliate marketing, pay-per-acquisition. Amazon/EPage/CDNow pay hosts for sales generated through ads on their sites.  1998: Sponsored search, pay-per-click first-price auction. GoTo develops keyword-based advertising with pay-per-click sales.  2002: Sponsored search, pay-per-click second-price auction. Google introduces AdWords, a second-price keyword auction with a number of innovations.

13 Introduction to Web ScienceSlide 13 of 71 Banner Ads

14 Introduction to Web ScienceSlide 14 of 71 Pay-Per-Impression  Pay-per-1000 impressions (PPM): advertiser pays each time the ad is displayed  Models existing standards from magazine, radio, television  Main business model for banner ads to date  Corresponds to the inventory the host sells  Exposes advertiser to risk of fluctuations in market  Banner blindness: effectiveness drops with user experience  Barrier to entry for small advertisers  Contracts negotiated on a case-by-case basis with large minimums (typically, a few thousand dollars per month)

15 Introduction to Web ScienceSlide 15 of 71 Pay-Per-Click  Pay-per-click (PPC): advertiser pays only when a user clicks on the ad  Common in search advertising  Middle ground between PPM and PPA (pay-per-action)  Does not require host to trust advertiser  Provides incentives for host to improve ad displays

16 Introduction to Web ScienceSlide 16 of 71 Auction Mechanism  Advertisements sold automatically through auctions: advertisers submit bids indicating their value for clicks on particular keywords  Low barrier to entry  Increased transparency of mechanism  Keyword bidding allows increased targeting opportunities

17 Introduction to Web ScienceSlide 17 of 71 Auction Mechanism  Initial GoTo model: first-price auction  Advertisers displayed in order of decreasing bids  Upon a click, advertiser is charged a price equal to his bid  Used first by Overture/Yahoo!  Google model: stylized second-price auction  Advertisers ranked according to bid and click-through rate (CTR), the probability that a user clicks on the ad  Upon a click, advertiser is charged the minimum amount required to maintain position in ranking

18 Introduction to Web ScienceSlide 18 of 71 Graph from [Zhang 2006] Bidding history in Yahoo! First-Price Auction: Bidding Patterns

19 Introduction to Web ScienceSlide 19 of 71 Graph from [Zhang 2006] Bidding Patterns

20 Introduction to Web ScienceSlide 20 of 71 Bidding Process  1. Targeting Populations  2. Advert Creation  3. Keyword Selection  4. Bids and Budget  “You don’t get it, Daddy, because they’re not targeting you.”

25 Introduction to Web ScienceSlide 25 of 71 Bidding Process  1. Targeting Populations  2. Advert Creation  3. Keyword Selection  4. Bids and Budget  “Here it is – the plain unvarnished truth. Varnish it.”

26 Introduction to Web ScienceSlide 26 of 71 Ad title Ad text Display url

27 Introduction to Web ScienceSlide 27 of 71 Bidding Process  1. Targeting Populations  2. Advert Creation  3. Keyword Selection  4. Bids and Budget  “Now, that’s product placement!”

31 Introduction to Web ScienceSlide 31 of 71 Bidding Process  1. Targeting Populations  2. Advert Creation  3. Keyword Selection  4. Bids and Budget

32 Introduction to Web ScienceSlide 32 of 71 Daily Budget

34 Introduction to Web ScienceSlide 34 of 71 Auction Mechanism  A repeated mechanism! Upon each search:  Interested advertisers are selected from the database using a keyword matching algorithm  A budget allocation algorithm retains interested advertisers with sufficient budget  Advertisers compete for ad slots in the allocation mechanism  Upon a click, the advertiser is charged according to the pricing scheme  CTR is updated according to the CTR learning algorithm for future auctions

35 Introduction to Web ScienceSlide 35 of 71 Click-Through Rates  Click-through rate (CTR): a parameter estimating the probability that a user clicks on an ad  A separate parameter for each ad/keyword pair  Assumption: the CTR of an ad in a slot equals the CTR of the ad in slot 1 times a scaling parameter that depends only on the slot, not on the ad  The CTR learning algorithm uses a weighted average of the ad’s past performance to estimate CTR
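
A minimal sketch of the separability assumption and of one possible "weighted averaging" update follows. The slide does not say which weighting is used; an exponentially weighted moving average is assumed here, and the slot scaling factors, the `weight` parameter, and the function names are all hypothetical.

```python
# Hypothetical slot scaling parameters: they depend only on the slot, not the ad.
SLOT_SCALE = {1: 1.0, 2: 0.6, 3: 0.4}


def ctr_in_slot(ctr_slot1: float, slot: int) -> float:
    """Separability assumption: CTR in a slot = slot-1 CTR times the slot's scale."""
    return ctr_slot1 * SLOT_SCALE[slot]


def update_ctr(ctr_slot1: float, slot: int, clicked: bool, weight: float = 0.01) -> float:
    """One weighted-average update of the slot-1 CTR estimate after an impression.

    The observation (1 for a click, 0 otherwise) is divided by the slot's scale
    so that impressions shown in lower slots are comparable to slot 1.
    """
    observation = (1.0 if clicked else 0.0) / SLOT_SCALE[slot]
    return (1.0 - weight) * ctr_slot1 + weight * observation


estimate = 0.10
estimate = update_ctr(estimate, slot=2, clicked=False)  # estimate drifts down
estimate = update_ctr(estimate, slot=2, clicked=True)   # estimate jumps up
print(estimate, ctr_in_slot(estimate, 2))
```

Keeping one estimate per ad/keyword pair in slot-1 units means the auction only needs the per-slot scale to predict clicks in any position.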

36 Introduction to Web ScienceSlide 36 of 71 (Old) Yahoo! 2nd-Price Auction  Page layout: keyword query, ad slot 1, ad slot 2, algorithmic search results.
Advertiser  Bid  Allocation  Price (per click)
A           $10  slot 2      $5
B           $5   none (X)    $0
C           $50  slot 1      $10
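
The allocation and prices on this slide follow from ranking by bid and charging each winner the next-highest bid. The sketch below is one way to express that rule; the function name is hypothetical, a reserve price of zero is assumed, and B's $5 bid is an assumption inferred from A's $5 price, since it is not legible in the slide text.

```python
def second_price_by_bid(bids: dict, num_slots: int) -> dict:
    """Rank advertisers by bid; each winner pays the bid ranked just below it.

    Returns {advertiser: (slot or None, price per click)}.
    Assumes a reserve price of 0 when there is no lower bidder.
    """
    ranked = sorted(bids, key=bids.get, reverse=True)
    outcome = {}
    for position, advertiser in enumerate(ranked):
        if position >= num_slots:
            outcome[advertiser] = (None, 0.0)  # no slot, pays nothing
            continue
        has_lower_bidder = position + 1 < len(ranked)
        price = float(bids[ranked[position + 1]]) if has_lower_bidder else 0.0
        outcome[advertiser] = (position + 1, price)
    return outcome


# Bids from the slide; B's bid is inferred (see the note above).
print(second_price_by_bid({"A": 10, "B": 5, "C": 50}, num_slots=2))
```

C wins slot 1 but pays only $10 per click, A's bid, rather than its own $50 bid.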

37 Introduction to Web ScienceSlide 37 of 71 Google Single-Shot Auction  Page layout: keyword query, ad slot 1, ad slot 2, algorithmic search results.  Bid x CTR is the expected bid per impression.
Advertiser  Bid  CTR   Bid x CTR  Allocation  Price (per click)
A           $10  0.10  1.0        slot 2      $5
B           $5   0.50  2.5        slot 1      $2
C           $50  0.01  0.5        none (X)    $0
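
The numbers on this slide can be reproduced by ranking on bid x CTR and charging each winner the smallest per-click price that keeps its score at or above the next advertiser's score, that is, the next score divided by the winner's own CTR. This is a sketch under that reading of "minimum amount required to maintain position"; the function name is hypothetical, and B's $5 bid is derived from its score of 2.5 and CTR of 0.50.

```python
def rank_by_bid_times_ctr(ads: dict, num_slots: int) -> dict:
    """ads maps advertiser -> (bid, ctr). Returns advertiser -> (slot or None, price).

    Winners are ordered by bid * ctr; a winner pays next_score / own_ctr per click.
    Assumes a reserve price of 0 when no lower-ranked advertiser exists.
    """
    score = {name: bid * ctr for name, (bid, ctr) in ads.items()}
    ranked = sorted(score, key=score.get, reverse=True)
    outcome = {}
    for position, name in enumerate(ranked):
        if position >= num_slots:
            outcome[name] = (None, 0.0)
            continue
        has_lower = position + 1 < len(ranked)
        next_score = score[ranked[position + 1]] if has_lower else 0.0
        own_ctr = ads[name][1]
        outcome[name] = (position + 1, next_score / own_ctr)
    return outcome


slide_ads = {"A": (10, 0.10), "B": (5, 0.50), "C": (50, 0.01)}
for name, (slot, price) in rank_by_bid_times_ctr(slide_ads, num_slots=2).items():
    print(name, slot, round(price, 2))
```

Compared with ranking by bid alone, the highest bidder C now gets no slot, because its low CTR makes its expected bid per impression the smallest of the three.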

38 Introduction to Web ScienceSlide 38 of 71 Keyword Matching  Exact match: keyword phrase equals search phrase  Phrase match: keyword phrase appears in search (“red roses” matches “red roses for valentines”)  Broad match: each word of keyword phrase appears in search (“red roses” matches “red and white roses”)  Issues:  Tradeoff between relevance and competition  How to handle spelling mistakes
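
The three match types can be sketched as simple token comparisons. This is an illustration of the definitions on the slide only: the function names are hypothetical, tokenization is plain lowercase whitespace splitting, and spelling mistakes (listed as an open issue) are not handled.

```python
def _tokens(text: str) -> list:
    """Lowercase whitespace tokenization (an assumption; no stemming or spell-fixing)."""
    return text.lower().split()


def exact_match(keyword: str, query: str) -> bool:
    """Keyword phrase equals the search phrase."""
    return _tokens(keyword) == _tokens(query)


def phrase_match(keyword: str, query: str) -> bool:
    """Keyword phrase appears in the search as a contiguous run of words."""
    k, q = _tokens(keyword), _tokens(query)
    return any(q[i:i + len(k)] == k for i in range(len(q) - len(k) + 1))


def broad_match(keyword: str, query: str) -> bool:
    """Each word of the keyword phrase appears somewhere in the search."""
    return set(_tokens(keyword)) <= set(_tokens(query))


print(phrase_match("red roses", "red roses for valentines"))
print(phrase_match("red roses", "red and white roses"))
print(broad_match("red roses", "red and white roses"))
```

Each match type is strictly more permissive than the previous one, which is the relevance versus competition tradeoff: broad match enters more auctions but on less relevant searches.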

39 Introduction to Web ScienceSlide 39 of 71 Budget Allocation  Basic algorithm:  Spread monthly budget evenly over each day  If budget is left over at end of day, allocate it to the next day  When an advertiser runs out of budget, eliminate it from auctions  Issues:  Need to smooth allocation throughout the day  Allocation of budget across keywords
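
The basic algorithm can be sketched as a small bookkeeping class. The class and method names are hypothetical, and a 30-day month is assumed; the sketch deliberately leaves out the open issues from the slide (smoothing within the day and splitting budget across keywords).

```python
class DailyBudget:
    """Spread a monthly budget evenly over days and roll leftovers forward."""

    def __init__(self, monthly_budget: float, days_in_month: int = 30):
        self.daily_share = monthly_budget / days_in_month
        self.remaining_today = self.daily_share

    def can_afford(self, price: float) -> bool:
        """An advertiser that cannot afford the price is left out of the auction."""
        return self.remaining_today >= price

    def charge(self, price: float) -> None:
        """Deduct the price of a click from today's budget."""
        if not self.can_afford(price):
            raise ValueError("budget exhausted for today")
        self.remaining_today -= price

    def start_new_day(self) -> None:
        """Carry any leftover into the next day's allocation."""
        self.remaining_today += self.daily_share


budget = DailyBudget(monthly_budget=300.0)  # 10.0 per day under the 30-day assumption
budget.charge(4.0)
budget.start_new_day()
print(budget.remaining_today)
```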

40 Introduction to Web ScienceSlide 40 of 71 Typical Parameters  PPC of most popular searches in Google, 4/06
Keyword price in 3rd slot  # of keywords
$20 - $50                  2
$10.00 - $19.99            22
$5.00 - $9.99              206
$3.00 - $4.99              635
$1.00 - $2.99              3,566
$0.50 - $0.99              4,946
$0.25 - $0.49              5,501
$0.11 - $0.24              5,269

41 Introduction to Web ScienceSlide 41 of 71 Typical Parameters  Bids on some valuable keywords
Keyword                     Top bid  2nd bid
mesothelioma                $100
structured settlement       $100     $52
vioxx attorney              $38
student loan consolidation  $29      $9
CTRs are typically around 1%

42 Introduction to Web ScienceSlide 42 of 71 Avoiding click fraud Bidding with budget constraints Externalities between advertisers User search models Typical Parameters

43 Introduction to Web ScienceSlide 43 of 71 Measurement: Information  AdWords FrontEnd: bid simulations  Clicks and cost for other bids.  Google Analytics:  Traffic patterns, site visitors.  Search insights:  Search patterns.  Interest-based advertising:  Indicate your interests so that you get more relevant ads.

44 Introduction to Web ScienceSlide 44 of 71 AdWords FrontEnd

46 Introduction to Web ScienceSlide 46 of 71 Web Analytics

47 Introduction to Web ScienceSlide 47 of 71 Reacting to Metrics  Distinguish causality and correlation.  Experimentation:  Ad rotation: 3 different creatives  Website Optimizer  E.g. 6000 search quality experiments, 500 of which were launched.  Repeated experimentation:  Continuous improvement (multi-armed bandit)
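
Rotating a few creatives while continuously improving is a multi-armed bandit problem. The slide does not name a specific bandit algorithm; the sketch below assumes epsilon-greedy, one of the simplest choices, and all class, method, and creative names are hypothetical.

```python
import random


class EpsilonGreedyRotation:
    """Show the best-performing creative most of the time, explore the rest occasionally."""

    def __init__(self, creatives, epsilon: float = 0.1, seed=None):
        self.creatives = list(creatives)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {c: 0 for c in self.creatives}
        self.clicks = {c: 0 for c in self.creatives}

    def estimated_ctr(self, creative) -> float:
        shown = self.shows[creative]
        return self.clicks[creative] / shown if shown else 0.0

    def choose(self):
        # Show every creative at least once before comparing estimates.
        untried = [c for c in self.creatives if self.shows[c] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.creatives)      # explore
        return max(self.creatives, key=self.estimated_ctr)  # exploit

    def record(self, creative, clicked: bool) -> None:
        self.shows[creative] += 1
        self.clicks[creative] += int(clicked)


rotation = EpsilonGreedyRotation(["creative_1", "creative_2", "creative_3"], seed=0)
shown = rotation.choose()
rotation.record(shown, clicked=False)
```

The `epsilon` parameter sets the exploration versus exploitation balance: a higher value gathers evidence about weaker creatives faster at the cost of showing them more often.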

48 Introduction to Web ScienceSlide 48 of 71 Other Online Advertising Aspects  Google ad systems:  Sponsored search: AdWords auctions.  Contextual ads (AdSense) & display ads (DoubleClick)  Ad Exchange  Social ads, YouTube, TV ads.  Bid management & campaign optimization for advertisers  Short-term vs. long-term effect of ads.  Planning: ad auctions & ad reservations.  Stochastic/dynamic inventory planning  Pricing: auctions vs. contracts  Ad serving  Online stochastic assignment problems

49 Introduction to Web ScienceSlide 49 of 71 Ad Serving  Efficiency, fairness, smoothness.  Sponsored search: repeated auctions, budget constraints, throttling, dynamics(?)  Display ads: online stochastic allocation  Impressions arrive online and should be assigned to advertisers (with established contracts)  Online primal-dual algorithms.  Offline optimization for online stochastic optimization: power of two choices.  Learning + optimization: exploration vs. exploitation?  Ad Exchange ad serving: bandwidth constraints.  Social ads: ad serving over social networks

50 Introduction to Web ScienceSlide 50 of 71 Itay Gonshorovitz Foundation of privacy TARGETED ONLINE ADVERTISING

51 Introduction to Web ScienceSlide 51 of 71 Online behavioral advertising  Online behavioral advertising refers to the practice of ad-networks tracking users across web sites in order to learn user interests and preferences.  Benefits  Advertisers target a more focused audience, which increases effectiveness.  Consumers are “bothered” by more relevant and interesting ads.

52 Introduction to Web ScienceSlide 52 of 71 How ad-networks match ads  Most behavioral targeting systems work by categorizing users into one or more audience segments.  Profiling users based on collected data  Search history – analyzing search keywords  Browse history - analyzing content of visited pages  Purchase history  Social networks  Geography

53 Introduction to Web ScienceSlide 53 of 71 How Ad-Networks track users  Cookies  3rd-party cookies  Flash cookies  Web bugs  IP address  User-agent headers  Browser + OS  More than 24,000 signatures

54 Introduction to Web ScienceSlide 54 of 71 case study

55 Introduction to Web ScienceSlide 55 of 71 case study

56 Introduction to Web ScienceSlide 56 of 71 Privacy  Tracking and categorizing of users by the ad-networks tends to violate users’ privacy.  The gathered information, linked with the user’s real identity, forms a violation of privacy in its most basic form.  For example, if a person is searching the web for information on a serious genetic disease, that information can be collected and stored along with that consumer's other information, including information that can uniquely identify the consumer.

57 Introduction to Web ScienceSlide 57 of 71 So, what do we have so far?  User:  Wants to preserve privacy  Ad-Network & Publisher:  Want to maintain targeting and preserve their effectiveness and income  Still want to be able to fight click fraud  Questions:  Do the two goals necessarily conflict?  Or can they both be achieved?

58 Introduction to Web ScienceSlide 58 of 71 Naive (paranoid) solution  Surf only through anonymizing proxies.  TOR  Surf in private mode  Advantages  Effective from the user’s perspective.  Disadvantages  Are proxies really anonymizing?  Very awkward  Slower  Damages targeted advertising

59 Introduction to Web ScienceSlide 59 of 71 TrackMeNot (Howe, Nissenbaum, 2005)  Implemented as a Firefox plugin.  Achieves privacy through obfuscation.  Generates noisy queries.  Starts with a fixed seed query list and evolves queries based on previous results.  Mimics user behavior so that fake queries are indistinguishable from real ones:  Query timing  Click-through behavior

60 Introduction to Web ScienceSlide 60 of 71 TrackMeNot  Advantages  Simple  Disadvantages  The real queries can still be connected to the user’s real identity.  Might have problems with offensive content.  Again, damages targeted advertising

61 Introduction to Web ScienceSlide 61 of 71 Privad (Guha, Reznichenko, Tang, et al., 2009)  Requires client software that:  Saves a local database of ads (served by the ad-network)  Learns user interests in order to match ads.  Matches ads from the local database according to the user’s interests.

62 Introduction to Web ScienceSlide 62 of 71 Privad  Introduces a new party, the Dealer, which:  Anonymously proxies all communication between the user and the ad-network.  Might be a government regulatory agency.  Hides the user’s identity from the ad-network, but does not itself learn any profile information about the user, since all messages between the user and the ad-network are encrypted.

64 Introduction to Web ScienceSlide 64 of 71 Privad  Advantages  Ad-networks can still target ads without violating user privacy.  Disadvantages  Complicated to add the new party.  The ad-network has to trust the dealer in order to fight click fraud, which might discourage it from cooperating.

65 Introduction to Web ScienceSlide 65 of 71 Adnostic (Toubiana, Narayanan, Boneh, et al., 2009)  Two-party solution:  Client side: implemented as a Firefox plugin.  Server side: requires ad-network support  The user’s preferences and interests are stored locally by the plugin, instead of at the ad-network.  The targeted ad is selected by the plugin locally on the user’s computer, instead of on the ad-network’s servers.
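
Client-side ad selection can be pictured as scoring candidate ads against a profile that never leaves the browser. The sketch below is only an illustration of that idea, not Adnostic's actual matching logic: the profile format, the ad records, the category names, and the scoring rule (sum of interest weights over an ad's categories) are all assumptions.

```python
def choose_ad_locally(profile: dict, candidate_ads: list) -> dict:
    """Pick the candidate ad whose categories best match the locally stored profile.

    profile maps interest category -> weight and stays on the user's machine;
    candidate_ads is the list of ads sent by the ad-network.
    """
    def score(ad: dict) -> float:
        return sum(profile.get(category, 0.0) for category in ad["categories"])

    return max(candidate_ads, key=score)


# Hypothetical local profile and candidate list.
local_profile = {"travel": 0.7, "sports": 0.2}
ads_from_network = [
    {"id": "ad-1", "categories": ["finance"]},
    {"id": "ad-2", "categories": ["travel", "sports"]},
    {"id": "ad-3", "categories": ["sports"]},
]
print(choose_ad_locally(local_profile, ads_from_network)["id"])
```

The ad-network only learns which list of candidates it sent, not which interests drove the choice.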

66 Introduction to Web ScienceSlide 66 of 71 Adnostic - Accounting  The “charge per click” model remains unchanged.  “Charge per impression” is harder.  It uses a homomorphic encryption scheme:  Given the public key and ciphertexts E(x) and E(y), anyone can calculate E(x + y).  Given the public key, a ciphertext E(x), and a scalar c, E(c·x) can be calculated.
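
The two homomorphic properties can be tried out with a toy implementation. The slide does not name the encryption scheme; the sketch below assumes Paillier, a standard additively homomorphic scheme, with deliberately tiny hard-coded primes that are not secure. All function names are hypothetical, and the code needs Python 3.8+ for the modular inverse via `pow`.

```python
import math
import random


def keygen(p: int = 1789, q: int = 1861):
    """Toy Paillier key pair from two small primes (illustration only, NOT secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse of phi mod n
    return n, (phi, mu)   # public key n (generator fixed to n + 1), private key


def encrypt(n: int, m: int) -> int:
    """E(m) = (1 + n)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n: int, private_key, c: int) -> int:
    phi, mu = private_key
    u = pow(c, phi, n * n)        # equals 1 + m * phi * n (mod n^2)
    return (u - 1) // n * mu % n


def add(n: int, c1: int, c2: int) -> int:
    """From E(x) and E(y), compute E(x + y) using only the public key."""
    return c1 * c2 % (n * n)


def scale(n: int, c: int, k: int) -> int:
    """From E(x) and a scalar k, compute E(k * x) using only the public key."""
    return pow(c, k, n * n)


n, private_key = keygen()
print(decrypt(n, private_key, add(n, encrypt(n, 3), encrypt(n, 4))))
print(decrypt(n, private_key, scale(n, encrypt(n, 3), 5)))
```

Neither `add` nor `scale` touches the private key, which is what lets a server accumulate charges without learning the underlying values.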

67 Introduction to Web ScienceSlide 67 of 71 Adnostic - charge per impression protocol  Client: tracks user activity and maintains the data locally.  The user visits an ad-supported website.  Server: sends a list of n ad ids along with a public key.  The browser chooses an ad to display to the user. It then creates a vector (v_1, ..., v_n) that matches the selected ad (v_i = 1 for the chosen ad, 0 otherwise) and sends the encryptions E(v_1), ..., E(v_n), along with a zero-knowledge proof that the v_i sum to 1 and each v_i is 0 or 1.

68 Introduction to Web ScienceSlide 68 of 71 Adnostic - charge per impression protocol  The server validates the proof. If the proof is valid, it uses the homomorphic encryption to calculate E(c·v_i) for each ad i, where c is the price of viewing the ad.  The server keeps an encrypted counter for each ad and adds the new value to the previous one. Only one counter’s real value changes.  At the end of the billing period, say a month, each counter is decrypted (this should be done by a trusted authority) and the advertisers pay the ad-network.
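
The encrypted-counter accounting can be sketched end to end with a toy additively homomorphic scheme. Several assumptions apply: Paillier is used because the slides do not name a scheme, the primes are tiny and insecure, prices are whole cents, the zero-knowledge proof step is omitted entirely, and every name in the code is hypothetical.

```python
import math
import random

# Toy Paillier parameters (illustration only, NOT secure).
P, Q = 1789, 1861
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)       # private: held by the trusted authority
MU = pow(PHI, -1, N)          # private: modular inverse, Python 3.8+


def encrypt(m: int) -> int:
    """Public-key encryption: E(m) = (1 + N)^m * r^N mod N^2."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Private-key decryption, done only at the end of the billing period."""
    return (pow(c, PHI, N2) - 1) // N * MU % N


def client_report(num_ads: int, chosen: int) -> list:
    """Browser side: encrypt a 0/1 vector with a single 1 at the displayed ad."""
    return [encrypt(1 if i == chosen else 0) for i in range(num_ads)]


def server_update(counters: list, report: list, prices: list) -> list:
    """Server side: counter_i <- counter_i + price_i * v_i, all under encryption."""
    return [
        counter * pow(enc_bit, price, N2) % N2
        for counter, enc_bit, price in zip(counters, report, prices)
    ]


prices_in_cents = [2, 3, 5]                      # hypothetical per-view prices
counters = [encrypt(0) for _ in prices_in_cents]
for displayed_ad in [1, 0, 1]:                   # three page views
    counters = server_update(counters, client_report(3, displayed_ad), prices_in_cents)
print([decrypt(c) for c in counters])
```

The server updates every counter on every view, so it cannot tell from the update pattern which ad was shown; only the decrypted totals reveal how much each advertiser owes.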

69 Introduction to Web ScienceSlide 69 of 71 Adnostic  Advantages  Ad-networks can still target ads without violating user privacy.  Ad-networks can still detect click fraud, though it will be difficult without gathering information on IP addresses, even for a short time.  Disadvantages  Ad-networks become weaker.  Ad-networks can still track users if they want to; the protocol is built on trust.

70 Introduction to Web ScienceSlide 70 of 71 Future of Online Advertising  Measurements  Pricing  Experimentation  Other forms of advertising:  TV ads  Ad exchanges  Social ads

71 Introduction to Web ScienceSlide 71 of 71 Conclusions  In my opinion, it is hard to believe that ad-networks will give up the power of tracking users without legislation.  Nevertheless, there are reasonable solutions that still support targeted advertising without violating users’ privacy.

