1
1 Web Search: From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research
2
2 What is web search? Access to “heterogeneous”, distributed information –Heterogeneous in creation –Heterogeneous in accuracy –Heterogeneous in motives Multi-billion dollar business –Source of new opportunities in marketing Strains the boundaries of trademark and intellectual property laws A source of unending technical challenges
3
Yahoo! Research 3 The coarse-level dynamics Content creators Content aggregators Feeds Crawls Content consumers Advertisement Editorial Subscription Transaction
4
Yahoo! Research 4
5
5 Brief (non-technical) history Early keyword-based engines –AltaVista, Excite, Infoseek, Inktomi, Lycos, ca. 1995-1997 Paid placement ranking: Goto (morphed into Overture, then Yahoo!) –Your search ranking depended on how much you paid –Auction for keywords: casino was expensive!
6
Yahoo! Research 6 Brief (non-technical) history 1998+: Link-based ranking pioneered by Google –Blew away all early engines except Inktomi –Great user experience in search of a business model –Meanwhile Goto/Overture’s annual revenues were nearing $1 billion
7
Yahoo! Research 7 Brief (non-technical) history Result: Google added “paid-placement” ads to the side, separate from search results 2003: Yahoo follows suit, acquiring Overture (for paid placement) and Inktomi (for search)
8
Yahoo! Research 8 Algorithmic results CPC Advertisements
9
Yahoo! Research 9 Editorial User reviews Ads
10
Yahoo! Research 10 Types of content Editorial content: books, music, professionally-produced websites User-generated content: blogs, reviews, bulletin boards, groups, etc. Total web growth: 1-3M pages/day –Not “real” growth –Think text content… Courtesy: Andrew Tomkins
11
Yahoo! Research 11 Total text content 6B people type 4 hrs/day at 100 wpm Storage: 52PB/yr; cost: $25M/yr In another 5 years, this looks about like the cost of having 10 people on your payroll Conclusion 1: any company with tens of people can store every bit of text produced by every human on the planet Conclusion 2: no scale-based differentiation around text content (of course, not all content is text…) Courtesy: Andrew Tomkins
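The storage estimate on this slide can be checked with a few lines of arithmetic. The sketch below uses the slide's own inputs (6B people, 4 hrs/day, 100 wpm); the figure of one byte per stored word is an assumption not stated on the slide, chosen only because it reproduces roughly 52PB/yr.

```python
PEOPLE = 6e9            # from the slide
HOURS_PER_DAY = 4       # from the slide
WORDS_PER_MINUTE = 100  # from the slide
BYTES_PER_WORD = 1      # assumption (not on the slide): reproduces ~52 PB/yr


def words_per_year(people=PEOPLE, hours=HOURS_PER_DAY, wpm=WORDS_PER_MINUTE, days=365):
    """Total words typed per year by everyone."""
    return people * hours * 60 * wpm * days


def petabytes_per_year(bytes_per_word=BYTES_PER_WORD):
    """Yearly storage in petabytes (1 PB = 1e15 bytes)."""
    return words_per_year() * bytes_per_word / 1e15


def dollars_per_gigabyte(cost_per_year=25e6, pb_per_year=52):
    """Storage price implied by the slide's $25M/yr for 52 PB/yr."""
    return cost_per_year / (pb_per_year * 1e6)  # 1 PB = 1e6 GB


print(petabytes_per_year())    # about 52.56
print(dollars_per_gigabyte())  # about 0.48
```

Under that assumption the slide's cost figure implies a storage price of roughly half a dollar per gigabyte per year.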
12
Yahoo! Research 12 User generated content (UGC) New content –2 billion pages of editorial content –Tiny number of songs, etc –5-10 billion pages UGC exist (already ~10% of consumption), growing Note: UGC did not exist as a category a couple of years ago! –Rapidly becoming a key growth area of consumed web content – but we don’t know how to process it! Courtesy: Andrew Tomkins
13
13 Tags The simplest form of UGC Is the Turing test always the right question?
14
Yahoo! Research 14
15
Yahoo! Research 15 The power of social tagging Flickr – community phenomenon Millions of users share and tag each other’s photographs (why???) The wisdom of the crowd can be used to search The principle is not new – anchor text used in “standard” search Don’t try to pass the Turing test?
16
Yahoo! Research 16 Anchor text When indexing a document D, include anchor text from links pointing to D. Example: pages linking to www.ibm.com, with link text in passages such as: “Armonk, NY-based computer giant IBM announced today”; “Joe’s computer hardware links: Compaq, HP, IBM”; “Big Blue today announced record profits for the quarter”
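A minimal sketch of anchor-text indexing, assuming a toy inverted index from terms to URLs. The function name `build_index`, the source URLs, and the page body are hypothetical; only the target www.ibm.com and the anchor words come from the slide.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"\w+", text.lower())


def build_index(pages, links):
    """pages: {url: body text}; links: list of (source_url, target_url, anchor_text).

    Returns a dict mapping each term to the set of URLs it is indexed under.
    """
    index = defaultdict(set)
    for url, body in pages.items():
        for term in tokenize(body):
            index[term].add(url)
    # Anchor text is credited to the page the link points TO, not the source.
    for _source, target, anchor in links:
        for term in tokenize(anchor):
            index[term].add(target)
    return index


pages = {"www.ibm.com": "products services support"}  # hypothetical body text
links = [
    ("news.example", "www.ibm.com", "IBM"),          # hypothetical source pages
    ("joe.example", "www.ibm.com", "IBM"),
    ("finance.example", "www.ibm.com", "Big Blue"),
]
index = build_index(pages, links)
print(sorted(index["blue"]))  # ['www.ibm.com']
```

The page never uses the phrase "Big Blue" in its own body here, yet a query for it still finds www.ibm.com, which is the point of the slide.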
17
Yahoo! Research 17 Challenges in tag-based search How do we use these tags for better search? How do you cope with spam? What’s the ratings and reputation system? The bigger challenge: where else can you exploit the power of the people? What are the incentive mechanisms? –Luis von Ahn (CMU): The ESP Game
18
Yahoo! Research 18
19
19 More UGC: Social search Indexing the knowledge in people’s heads
20
Yahoo! Research 20
21
Yahoo! Research 21
22
Yahoo! Research 22 Social content Social capital
23
Yahoo! Research 23 Incentives
24
Yahoo! Research 24 Incentives What assignment of incentives leads to good user behavior? –What’s “good” user behavior? –Good questions, good answers, new questions? Whom do you trust and why? –Propagation of trust and distrust.
25
Yahoo! Research 25 Ratings and reputation Node reputation: Given a DAG with –a subset of nodes called GOOD – another subset called BAD –Find a measure of goodness for all other nodes. Node pair reputation: Given a DAG with a real-valued trust on the edges –Predict a real-valued trust for ordered node pairs not joined by an edge Metric labelling
26
26 CPC advertisements What pays the bills
27
Yahoo! Research 27 Ads
28
Yahoo! Research 28 Generic questions Of the various advertisers for a keyword, which one(s) get shown? What do they pay on a click through? The answers turn out to draw on insights from microeconomics
29
Yahoo! Research 29 Ads go in slots like this one and this one.
30
Yahoo! Research 30 Advertisers generally prefer this slot to this one to this one.
31
Yahoo! Research 31 Click-through rate r_1 = 200 per hour, r_2 = 150 per hour, r_3 = 100 per hour, etc.
32
Yahoo! Research 32 Why did witbeckappliance win over ristenbatt?
33
Yahoo! Research 33 First-cut assumption Click-through rate depends only on the slot, not on the advertisement In fact not true; more on this later.
34
Yahoo! Research 34 Advertiser’s value We assume that an advertiser j has a value v_j per click-through –Some measure of downstream profit Say, a click-through is followed by: no purchase, 96% of the time; 0.7% buy a dishwasher, profit $500; 1.2% buy a vacuum cleaner, profit $200; 2.1% buy cleaning agents, profit $1. Expected value per click: 0.007 x $500 + 0.012 x $200 + 0.021 x $1 = $5.921
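The value per click on this slide is an expected profit over the possible outcomes of a click. A short sketch of that calculation, using the slide's probabilities and profits (the names `outcomes` and `value_per_click` are illustrative only):

```python
# (probability, profit in dollars) for each outcome after a click, from the slide
outcomes = [
    (0.960, 0.0),    # no purchase
    (0.007, 500.0),  # dishwasher
    (0.012, 200.0),  # vacuum cleaner
    (0.021, 1.0),    # cleaning agents
]


def value_per_click(outcomes):
    """Expected profit per click: sum of probability times profit."""
    return sum(p * profit for p, profit in outcomes)


print(round(value_per_click(outcomes), 3))  # 5.921
```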
35
Yahoo! Research 35 Example For the keyword miele, say an advertiser has a value of $10 per click. How much should he bid? How much should he be charged? The value of a slot for an advertiser, what he bids and what he is charged, may all be different.
36
Yahoo! Research 36 Advertiser’s payoff in ad slot i: (Click-through rate) x (Value per click) – (Payment to search engine) = r_i v_j – (Payment to engine) = r_i v_j – p_ij, where p_ij is the payment of advertiser j in slot i, a function of all other bids.
37
Yahoo! Research 37 Two auction pricing mechanisms First price: The winner of the auction is the highest bidder, and pays his bid. Second price: The winner is the highest bidder, but pays the second-highest bid. Engine decides and announces pricing. What should an advertiser bid? Not truthful.
38
Yahoo! Research 38 Second-price = Vickrey auction Consider first a single ad slot Winner pays the second-highest bid Vickrey: Truth-telling is a dominant strategy for each player (advertiser) –No incentive to “game” or fake bids
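The dominant-strategy claim can be spot-checked numerically. The sketch below is not a proof: it fixes a value of $10 per click (as in the miele example) and a few hypothetical sets of competing bids, then confirms that no alternative bid on a grid beats bidding the true value. Breaking ties against the bidder is an assumption.

```python
def second_price_payoff(my_bid, my_value, other_bids):
    """Per-click payoff in a single-slot second-price auction.

    The bidder wins only with a strictly higher bid (tie rule is an assumption)
    and then pays the highest competing bid.
    """
    highest_other = max(other_bids)
    if my_bid > highest_other:
        return my_value - highest_other
    return 0.0


def truthful_is_best(my_value, other_bids, candidate_bids):
    """True if no candidate bid earns more than bidding my_value."""
    truthful = second_price_payoff(my_value, my_value, other_bids)
    return all(
        second_price_payoff(bid, my_value, other_bids) <= truthful
        for bid in candidate_bids
    )


grid = [x / 2 for x in range(0, 41)]                 # bids 0.0, 0.5, ..., 20.0
for others in ([4.0, 2.0], [12.0, 2.0], [9.5, 9.0]):  # hypothetical competitors
    print(others, truthful_is_best(10.0, others, grid))  # True each time
```

Overbidding only matters when it wins at a price above the value (a loss), and underbidding only matters when it forfeits a profitable win, which is why the truthful bid is never beaten.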
39
Yahoo! Research 39 Auctions and pricing: multiple slots Overture’s model: –Ads displayed in order of decreasing bid –E.g., if advertiser A bids 10, B bids 2, C bids 4 – order ACB How do you price slots? Generalized Vickrey? –Generalized second-price (GSP) –Vickrey-Clarke-Groves (VCG): each advertiser pays the externality he imposes on others
40
Yahoo! Research 40 Generalized Second Price auction pricing Bidder A, $10: pays 4 Bidder C, $4: pays 2 Bidder B, $2
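A minimal sketch of bid-ordered GSP pricing that reproduces the numbers on this slide. The function name `gsp_prices` is illustrative, and charging 0 when there is no lower bid is an assumption (a real system would typically apply a reserve price).

```python
def gsp_prices(bids, num_slots):
    """bids: {advertiser: bid per click}.

    Returns [(advertiser, price per click), ...] in slot order: each winner
    pays the bid of the advertiser ranked immediately below.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    result = []
    for i in range(min(num_slots, len(ranked))):
        name = ranked[i][0]
        # Assumption: price is 0 if nobody is ranked below (no reserve price).
        price = ranked[i + 1][1] if i + 1 < len(ranked) else 0
        result.append((name, price))
    return result


print(gsp_prices({"A": 10, "B": 2, "C": 4}, num_slots=2))  # [('A', 4), ('C', 2)]
```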
41
Yahoo! Research 41 VCG pricing Suppose click rates are 200 in the top slot, 100 in the second slot. VCG payment of the second player (C) is 2 x 100 = 200 (the externality on third player B). For the first player (A), 4 x (200-100) + 200 = 600 (the externality on C plus the externality on B).
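The VCG payments above can be computed mechanically: each winner pays the value that lower-ranked advertisers lose because they sit one slot lower than they otherwise would. The sketch below assumes bids equal values, click rates that depend only on the slot and are listed in decreasing order, and a hypothetical function name `vcg_payments`.

```python
def vcg_payments(bids, click_rates):
    """Total VCG payment (not per click) for each advertiser who gets a slot.

    bids: {advertiser: bid per click}, treated as true values (assumption).
    click_rates: clicks per slot, top slot first, decreasing.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    # Pad with zero-click "slots" so every ranked advertiser has a rate.
    rates = list(click_rates) + [0.0] * len(ranked)
    payments = {}
    for i in range(min(len(click_rates), len(ranked))):
        total = 0.0
        # Without advertiser i, everyone below would move up exactly one slot.
        for k in range(i + 1, len(ranked)):
            total += ranked[k][1] * (rates[k - 1] - rates[k])
        payments[ranked[i][0]] = total
    return payments


print(vcg_payments({"A": 10, "B": 2, "C": 4}, [200, 100]))
# {'A': 600.0, 'C': 200.0}
```

For comparison, GSP with the same bids charges A 4 x 200 = 800 and C 2 x 100 = 200 in total.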
42
Yahoo! Research 42 VCG and GSP Truth-telling is a dominant strategy under VCG … Truth-telling not dominant under GSP! Edelman, Ostrovsky, Schwarz
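A small numeric illustration of why truth-telling is not dominant under GSP. The values 10, 4, 2 match the running example; the click rates 200 and 199 are hypothetical, chosen so that the second slot is almost as good as the first. Distinct bids are assumed, so no tie rule is needed.

```python
def gsp_payoff(my_bid, my_value, other_bids, click_rates):
    """Total payoff of one advertiser under bid-ordered GSP, others' bids fixed."""
    all_bids = sorted(other_bids + [my_bid], reverse=True)
    slot = all_bids.index(my_bid)  # assumes my_bid differs from all other bids
    if slot >= len(click_rates):
        return 0.0                 # no slot won
    # Pay the next-highest bid per click (0 if there is none: an assumption).
    price = all_bids[slot + 1] if slot + 1 < len(all_bids) else 0.0
    return click_rates[slot] * (my_value - price)


rates = [200, 199]                              # hypothetical click rates
truthful = gsp_payoff(10, 10, [4, 2], rates)    # top slot at price 4
shaded = gsp_payoff(3, 10, [4, 2], rates)       # second slot at price 2
print(truthful, shaded)                         # 1200 1592
```

Bidding 3 instead of the true value 10 earns more here, so truthful bidding is not a dominant strategy. With click rates of 200 and 100 the same deviation would earn only 800, so whether shading pays depends on how close the slots are in quality.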
43
Yahoo! Research 43 VCG and GSP Static equilibrium of GSP is locally envy-free: no advertiser can improve his payoff by exchanging bids with the advertiser in the slot above. Depending on the mechanism, revenue varies: GSP ≥ VCG. Edelman, Ostrovsky, Schwarz Locally envy-free mechanisms correspond to Stable Marriage solutions.
44
Yahoo! Research 44 GSP for bid-ordering What’s good about bid-ordering and GSP? –Advertisers like transparency What’s wrong with bid-ordering?
45
Yahoo! Research 45 Brand advertising? Bid ordering (former Yahoo! order)
46
Yahoo! Research 46
47
Yahoo! Research 47 Revenue ordering Simplified version of Google’s ordering –Each ad j has an expected click-through rate denoted CTR_j –Advertiser j’s bid is denoted b_j Then, expected revenue from this advertiser is R_j = b_j x CTR_j Order advertisers by R_j –Payment by GSP
48
Yahoo! Research 48
49
Yahoo! Research 49 “current” Yahoo! ordering
50
Yahoo! Research 50 “Squashed” ordering Overture/Old Yahoo! scheme –Order ads by bid Google (purportedly) –Order by bid x click-through rate (CTR) Squashing (Lahaie/Pennock) –Order by bid x (CTR)^s: s=0 gives the Overture ordering, s=1 the (purported) Google ordering Key – advertisers react to mechanism!
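A sketch of the squashed ranking rule, ordering ads by bid x (CTR)^s. The bids follow the running A/B/C example; the click-through rates are hypothetical and chosen so that the two extremes give different orders.

```python
def squashed_order(ads, s):
    """ads: {name: (bid, ctr)}. Returns names sorted by bid * ctr**s, best first.

    s = 0 ignores CTR (pure bid ordering); s = 1 is bid times CTR.
    """
    return sorted(ads, key=lambda name: ads[name][0] * ads[name][1] ** s, reverse=True)


# Bids from the running example; CTRs are hypothetical.
ads = {"A": (10.0, 0.01), "B": (2.0, 0.10), "C": (4.0, 0.04)}

print(squashed_order(ads, 0))  # ['A', 'C', 'B']  bid ordering
print(squashed_order(ads, 1))  # ['B', 'C', 'A']  bid x CTR ordering
```

Intermediate values of s interpolate between the two rules, which is where the slide's remark about advertisers reacting to the mechanism becomes relevant: changing s changes the bids advertisers want to submit.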
51
Yahoo! Research 51
52
Yahoo! Research 52 Where do we go next? Premise: –People don’t want to search –People want to get tasks done I want to book a vacation in Tuscany. StartFinish
53
Yahoo! Research 53 What is missing? Information integration –Information extraction –Schema normalization Mining social structure –Tags, UGC Example page text: “Welcome to The Savoy. Located on The Strand in the heart of the West End theatre district”; example search: “hotel near leicester square”
54
Yahoo! Research 54 Computational microeconomics Reputation and incentive mechanisms Matching marketplaces –Jobs, dates, … –Online matching everywhere Hardest part is estimating the payoffs, not the matching algorithm “Network effects” –Are 500 million users 500 times as valuable as a million users? 5000 times?
55
Yahoo! Research 55 A new convergence Monetization and economic value an intrinsic part of system design –Not an afterthought –Mistakes are costly! Computing meets humanities like never before – sociology, economics, anthropology …
56
56 Thank you. Questions? pragh@yahoo-inc.com http://research.yahoo.com