Web Search: From Information Retrieval to Microeconomic Modeling. Prabhakar Raghavan, Yahoo! Research.

1 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

2 2 What is web search? Access to “heterogeneous”, distributed information –Heterogeneous in creation –Heterogeneous in accuracy –Heterogeneous in motives Multi-billion dollar business –Source of new opportunities in marketing Strains the boundaries of trademark and intellectual property laws A source of unending technical challenges

3 Yahoo! Research 3 The coarse-level dynamics Content creators Content aggregators Feeds Crawls Content consumers Advertisement Editorial Subscription Transaction

4 Yahoo! Research 4

5 5 Brief (non-technical) history Early keyword-based engines –Altavista, Excite, Infoseek, Inktomi, Lycos, ca. 1995-1997 Paid placement ranking: Goto (morphed into Overture → Yahoo!) –Your search ranking depended on how much you paid –Auction for keywords: casino was expensive!

6 Yahoo! Research 6 Brief (non-technical) history 1998+: Link-based ranking pioneered by Google –Blew away all early engines except Inktomi –Great user experience in search of a business model –Meanwhile Goto/Overture’s annual revenues were nearing $1 billion

7 Yahoo! Research 7 Brief (non-technical) history Result: Google added “paid-placement” ads to the side, separate from search results 2003: Yahoo follows suit, acquiring Overture (for paid placement) and Inktomi (for search)

8 Yahoo! Research 8 Algorithmic results CPC Advertisements

9 Yahoo! Research 9 Editorial User reviews Ads

10 Yahoo! Research 10 Types of content Editorial content: books, music, professionally-produced websites User-generated content: blogs, reviews, bulletin boards, groups, etc. Total web growth: 1-3M pages/day –Not “real” growth –Think text content… Courtesy: Andrew Tomkins

11 Yahoo! Research 11 Total text content 6B people type 4 hrs/day at 100 wpm Storage: 52 PB/yr; cost: $25M/yr In another 5 years, this looks about like the cost of having 10 people on your payroll Conclusion 1: any company with tens of people can store every bit of text produced by every human on the planet Conclusion 2: no scale-based differentiation around text content (of course, not all content is text…) Courtesy: Andrew Tomkins
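
The storage figure can be checked with a short calculation. This is only a sketch: the slide does not say how many bytes a word occupies, and the value of one byte per word used below is an assumption chosen because it roughly reproduces the 52 PB/yr figure.

```python
# Back-of-the-envelope check of the slide's text-volume estimate.
PEOPLE = 6_000_000_000      # 6B people (from the slide)
HOURS_PER_DAY = 4           # typing time per person per day (from the slide)
WORDS_PER_MINUTE = 100      # typing speed (from the slide)
DAYS_PER_YEAR = 365
BYTES_PER_WORD = 1          # ASSUMPTION: not stated on the slide


def text_bytes_per_year(bytes_per_word=BYTES_PER_WORD):
    """Bytes of typed text produced per year under the slide's premises."""
    words_per_day = PEOPLE * HOURS_PER_DAY * 60 * WORDS_PER_MINUTE
    return words_per_day * DAYS_PER_YEAR * bytes_per_word


petabytes = text_bytes_per_year() / 1e15
print(f"{petabytes:.1f} PB/yr")
```

With a different bytes-per-word assumption the total scales linearly, which does not change the slide's qualitative conclusion.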

12 Yahoo! Research 12 User generated content (UGC) New content –2 billion pages of editorial content –Tiny number of songs, etc –5-10 billion pages UGC exist (already ~10% of consumption), growing Note: UGC did not exist as a category a couple of years ago! –Rapidly becoming a key growth area of consumed web content – but we don’t know how to process it! Courtesy: Andrew Tomkins

13 13 Tags The simplest form of UGC Is the Turing test always the right question?

14 Yahoo! Research 14

15 Yahoo! Research 15 The power of social tagging Flickr – community phenomenon Millions of users share and tag each other’s photographs (why???) The wisdom of the crowd can be used to search The principle is not new – anchor text used in “standard” search Don’t try to pass the Turing test?

16 Yahoo! Research 16 Anchor text When indexing a document D, include anchor text from links pointing to D. Example: three pages link to www.ibm.com: “Armonk, NY-based computer giant IBM announced today”; “Joe’s computer hardware links: Compaq, HP, IBM”; “Big Blue today announced record profits for the quarter”
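
A minimal sketch of the idea, assuming a toy inverted index held in a dictionary. The function name `build_index`, the source URLs, the page body, and the choice of which words form each anchor are all hypothetical; only the target www.ibm.com and the phrases come from the slide.

```python
from collections import defaultdict


def build_index(pages, links):
    """Build a term -> set-of-URLs index.

    pages: {url: body text}
    links: list of (source_url, target_url, anchor_text)
    """
    index = defaultdict(set)
    # Index each page under the terms of its own body text.
    for url, body in pages.items():
        for term in body.lower().split():
            index[term].add(url)
    # Also index each link TARGET under the terms of the anchor text.
    for _source, target, anchor in links:
        for term in anchor.lower().split():
            index[term].add(target)
    return index


# Hypothetical data: the body text and source URLs are made up.
pages = {"www.ibm.com": "products services support"}
links = [
    ("news.example", "www.ibm.com", "computer giant IBM"),
    ("joes.example", "www.ibm.com", "IBM"),
    ("finance.example", "www.ibm.com", "Big Blue"),
]
index = build_index(pages, links)
print(sorted(index["blue"]))
```

In this toy data the page never mentions "Big Blue" itself, yet a query for "blue" still finds it, because other people's descriptions are indexed with the target.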

17 Yahoo! Research 17 Challenges in tag-based search How do we use these tags for better search? How do you cope with spam? What’s the ratings and reputation system? The bigger challenge: where else can you exploit the power of the people? What are the incentive mechanisms? –Luis von Ahn (CMU): The ESP Game

18 Yahoo! Research 18

19 19 More UGC: Social search Indexing the knowledge in people’s heads

20 Yahoo! Research 20

21 Yahoo! Research 21

22 Yahoo! Research 22 Social content Social capital

23 Yahoo! Research 23 Incentives

24 Yahoo! Research 24 Incentives What assignment of incentives leads to good user behavior? –What’s “good” user behavior? –Good questions, good answers, new questions? Whom do you trust and why? –Propagation of trust and distrust.

25 Yahoo! Research 25 Ratings and reputation Node reputation: Given a DAG with –a subset of nodes called GOOD – another subset called BAD –Find a measure of goodness for all other nodes. Node pair reputation: Given a DAG with a real-valued trust on the edges –Predict a real-valued trust for ordered node pairs not joined by an edge Metric labelling

26 26 CPC advertisements What pays the bills

27 Yahoo! Research 27 Ads

28 Yahoo! Research 28 Generic questions Of the various advertisers for a keyword, which one(s) get shown? What do they pay on a click through? The answers turn out to draw on insights from microeconomics

29 Yahoo! Research 29 Ads go in slots like this one and this one.

30 Yahoo! Research 30 Advertisers generally prefer this slot to this one to this one.

31 Yahoo! Research 31 Click-through rate: r_1 = 200 per hour, r_2 = 150 per hour, r_3 = 100 per hour, etc.

32 Yahoo! Research 32 Why did witbeckappliance win over ristenbatt?

33 Yahoo! Research 33 First-cut assumption Click-through rate depends only on the slot, not on the advertisement In fact not true; more on this later.

34 Yahoo! Research 34 Advertiser’s value We assume that an advertiser j has a value v_j per click-through –Some measure of downstream profit Say, a click-through is followed by: no purchase, 96% of the time; 0.7% buy a dishwasher, profit $500; 1.2% buy a vacuum cleaner, profit $200; 2.1% buy cleaning agents, profit $1. Expected value per click: $5.921
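
The $5.921 figure is the expected profit per click over the four outcomes listed. A short sketch of that arithmetic (the function name is illustrative only):

```python
# (probability, profit in dollars) for each outcome after a click, from the slide.
outcomes = [
    (0.960, 0.0),    # no purchase
    (0.007, 500.0),  # dishwasher
    (0.012, 200.0),  # vacuum cleaner
    (0.021, 1.0),    # cleaning agents
]


def expected_value_per_click(outcomes):
    """Probability-weighted average profit per click."""
    return sum(prob * profit for prob, profit in outcomes)


v_j = expected_value_per_click(outcomes)
print(f"${v_j:.3f} per click")
```

The contributions are 3.50 + 2.40 + 0.021 dollars, so the value is dominated by the two rare, high-profit purchases.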

35 Yahoo! Research 35 Example For the keyword miele, say an advertiser has a value of $10 per click. How much should he bid? How much should he be charged? The value of a slot for an advertiser, what he bids and what he is charged, may all be different.

36 Yahoo! Research 36 Advertiser’s payoff in ad slot i: (Click-through rate) × (Value per click) – (Payment to search engine) = r_i v_j – p_ij, where p_ij is the payment of advertiser j in slot i, a function of all other bids.

37 Yahoo! Research 37 Two auction pricing mechanisms First price: The winner of the auction is the highest bidder, and pays his bid. Second price: The winner is the highest bidder, but pays the second-highest bid. Engine decides and announces pricing. What should an advertiser bid? Not truthful.

38 Yahoo! Research 38 Second-price = Vickrey auction Consider first a single advertisement slot Winner pays the second-highest bid Vickrey: Truth-telling is a dominant strategy for each player (advertiser) –No incentive to “game” or fake bids
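
The truth-telling property can be spot-checked numerically. The sketch below is not a proof: it only compares a truthful bid against a grid of alternative bids, and it assumes that ties are lost and that payoff is measured per click. The function names and the value of 10 are illustrative.

```python
def second_price_payoff(my_bid, my_value, rival_bids):
    """Per-click payoff in a single-slot second-price auction (ties lose)."""
    top_rival = max(rival_bids)
    if my_bid > top_rival:
        return my_value - top_rival  # winner pays the second-highest bid
    return 0.0


def truthful_is_best(my_value, candidate_bids, rival_profiles):
    """Check that bidding my_value is never beaten by any candidate bid."""
    return all(
        second_price_payoff(my_value, my_value, rivals)
        >= second_price_payoff(bid, my_value, rivals)
        for rivals in rival_profiles
        for bid in candidate_bids
    )


# One rival whose bid ranges over 0..20; our own candidate bids range over 0..20.
rival_profiles = [[r] for r in range(0, 21)]
print(truthful_is_best(10, range(0, 21), rival_profiles))
```

Overbidding only changes the outcome when the rival bids above the true value, in which case winning yields a loss; underbidding only changes it by forfeiting a profitable win.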

39 Yahoo! Research 39 Auctions and pricing: multiple slots Overture’s model: –Ads displayed in order of decreasing bid –E.g., if advertiser A bids 10, B bids 2, C bids 4 – order ACB How do you price slots? Generalized Vickrey? –Generalized second-price (GSP) –Vickrey-Clarke-Groves (VCG): each advertiser pays the externality he imposes on others

40 Yahoo! Research 40 Generalized Second Price auction pricing: Bidder A, $10 (pays 4); Bidder C, $4 (pays 2); Bidder B, $2
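
A minimal sketch of bid-ordered GSP pricing for this example, assuming two slots and a reserve price of zero for a bidder with nobody below him (neither is stated on the slide). The function name `gsp_prices` is illustrative.

```python
def gsp_prices(bids, num_slots):
    """Rank bidders by bid; each slot winner pays the next-highest bid per click.

    bids: {bidder name: bid per click}
    Returns a list of (bidder, price per click), top slot first.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    result = []
    for pos in range(min(num_slots, len(ranked))):
        name, _own_bid = ranked[pos]
        # ASSUMPTION: if nobody bids below this bidder, the price is 0.
        price = ranked[pos + 1][1] if pos + 1 < len(ranked) else 0
        result.append((name, price))
    return result


print(gsp_prices({"A": 10, "B": 2, "C": 4}, num_slots=2))
```

Each winner's price depends only on the bid immediately below, not on his own bid, which is the sense in which GSP generalizes the single-slot second-price rule.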

41 Yahoo! Research 41 VCG pricing Suppose click rates are 200 in the top slot, 100 in the second slot. VCG payment of the second player (C) is 2 × 100 = 200 (the externality on third player B). For the first player (A), 4 × (200 − 100) + 200 = 600 (the externality on C plus the externality on B).
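
These payments can be reproduced directly from the definition of externality: what the other advertisers would earn if a bidder were absent, minus what they earn with him present. The sketch assumes truthful bids of 10, 4, and 2 (the values from the GSP example), distinct bids, and payments expressed as totals over the same period as the click rates. Function names are illustrative.

```python
def assignment_value(bids, rates):
    """Total value when bids, sorted high to low, fill the slots in order."""
    ranked = sorted(bids, reverse=True)
    return sum(bid * rate for bid, rate in zip(ranked, rates))


def vcg_payments(bids, rates):
    """bids: {name: value per click}; rates: clicks per slot, best slot first."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    total = assignment_value(list(bids.values()), rates)
    payments = {}
    for pos, (name, bid) in enumerate(ranked):
        others = [b for n, b in bids.items() if n != name]
        without_me = assignment_value(others, rates)
        my_rate = rates[pos] if pos < len(rates) else 0
        with_me = total - bid * my_rate  # value the others get while I am present
        payments[name] = without_me - with_me
    return payments


print(vcg_payments({"A": 10, "C": 4, "B": 2}, rates=[200, 100]))
```

Bidder B wins no slot and displaces nobody, so his payment is zero.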

42 Yahoo! Research 42 VCG and GSP Truth-telling is a dominant strategy under VCG … Truth-telling not dominant under GSP! Edelman, Ostrovsky, Schwarz

43 Yahoo! Research 43 VCG and GSP Static equilibrium of GSP is locally envy-free: no advertiser can improve his payoff by exchanging bids with the advertiser in the slot above. Depending on the mechanism, revenue varies: GSP ≥ VCG. Edelman, Ostrovsky, Schwarz Locally envy-free mechanisms correspond to Stable Marriage solutions.

44 Yahoo! Research 44 GSP for bid-ordering What’s good about bid-ordering and GSP? –Advertisers like transparency What’s wrong with bid-ordering?

45 Yahoo! Research 45 Brand advertising? Bid ordering (former Yahoo! order)

46 Yahoo! Research 46

47 Yahoo! Research 47 Revenue ordering Simplified version of Google’s ordering –Each ad j has an expected click-through denoted CTR_j –Advertiser j’s bid is denoted b_j Then, expected revenue from this advertiser is R_j = b_j × CTR_j Order advertisers by R_j –Payment by GSP

48 Yahoo! Research 48

49 Yahoo! Research 49 “current” Yahoo! ordering

50 Yahoo! Research 50 “Squashed” ordering Overture/Old Yahoo! scheme –Order ads by bid Google (purportedly) –Order by bid × click-through rate (CTR) Squashing (Lahaie/Pennock) –Order by bid × CTR^s: s = 0 gives the Overture ordering, s = 1 the (purported) Google ordering Key – advertisers react to mechanism!
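
A small sketch of how the exponent s interpolates between the two orderings. The three ads, their bids, and their click-through rates are made-up numbers for illustration, and `squashed_order` is a hypothetical helper name.

```python
def squashed_order(ads, s):
    """Rank ads by bid * CTR**s, highest score first.

    ads: {name: (bid per click, click-through rate)}
    s = 0 ranks by bid alone; s = 1 ranks by bid * CTR.
    """
    return sorted(ads, key=lambda name: ads[name][0] * ads[name][1] ** s, reverse=True)


# Hypothetical ads: X bids high but is rarely clicked; Y bids less but is clicked often.
ads = {"X": (10.0, 0.01), "Y": (4.0, 0.05), "Z": (2.0, 0.04)}

print(squashed_order(ads, s=0))    # bid ordering
print(squashed_order(ads, s=1))    # bid x CTR ordering
print(squashed_order(ads, s=0.5))  # an intermediate squashing
```

In this toy data, X leads under pure bid ordering and Y leads under bid × CTR; intermediate values of s trade off the two.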

51 Yahoo! Research 51

52 Yahoo! Research 52 Where do we go next? Premise: –People don’t want to search –People want to get tasks done I want to book a vacation in Tuscany. Start → Finish

53 Yahoo! Research 53 What is missing? Information integration –Information extraction –Schema normalization Mining social structure –Tags, UGC Example: search query “hotel near leicester square”; page text: “Welcome to The Savoy. Located on The Strand in the heart of the West End theatre district”

54 Yahoo! Research 54 Computational microeconomics Reputation and incentive mechanisms Matching marketplaces –Jobs, dates, … –Online matching everywhere Hardest part is estimating the payoffs, not the matching algorithm “Network effects” –Are 500 million users 500 times as valuable as a million users? 5000 times?

55 Yahoo! Research 55 A new convergence Monetization and economic value an intrinsic part of system design –Not an afterthought –Mistakes are costly! Computing meets humanities like never before – sociology, economics, anthropology …

56 56 Thank you. Questions? pragh@yahoo-inc.com http://research.yahoo.com

