INFORMATION TECHNOLOGY IN BUSINESS AND SOCIETY SESSION 9 – SEARCH AND ADVERTISING SEAN J. TAYLOR.


1 INFORMATION TECHNOLOGY IN BUSINESS AND SOCIETY SESSION 9 – SEARCH AND ADVERTISING SEAN J. TAYLOR

2 ADMINISTRIVIA
– Assignment 2 online, due Saturday 2/25 at 1am
– Assignment 2 resources
– Assignment 3 preview
– Guest speaker on Tuesday 2/28: Chrys Wu, discussing IT and journalism
– Substitute on Thursday 3/1: Professor Dylan Walker

3 LEARNING OBJECTIVES
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search

4 SEARCH ENGINES AND WEB DIRECTORIES
Resources on the Web that help you find sites with the information and/or services you want.
– Directory search engine: organizes listings of Web sites into hierarchical lists.
– Search engine: uses software agent technologies (“spiders” or “bots”) to search the Web for key words and place them into indexes.

5 WEB DIRECTORIES EXAMPLE Advantages? Disadvantages?

6 SEARCH ENGINE EXAMPLES Advantages? Disadvantages?

7 SEARCH ENGINES DRIVE ECOMMERCE!

8 WHERE IS CONSUMERS’ ATTENTION?

9

10 EYETRACKING STUDY OF GOOGLE RESULTS

11 HOW SEARCH ENGINES WORK
– Search engines discover new pages by following links
– They keep track of the words that appear in each page; when you enter a query, the search engine returns a ranked list
– Text content is important, but it is not enough! (Why?)
– How do search engines rank pages? (Why does this matter?)
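The three steps on this slide (follow links, record words, return a ranked list) can be sketched in a few lines of Python. This is only an illustration: the `PAGES` data, the page names, and the count-the-matching-words score are all made up for the example and are not how any real search engine is implemented.

```python
from collections import defaultdict

# Hypothetical toy "web": page name -> (page text, pages it links to).
PAGES = {
    "home": ("search engines rank pages", ["ads", "rank"]),
    "ads": ("keyword auctions fund search engines", ["home"]),
    "rank": ("links help rank pages", ["home", "hidden"]),
    "hidden": ("a page found only by following links", []),
}

def crawl(start):
    """Discover pages by following links outward from a start page."""
    seen, queue = [start], [start]
    while queue:
        page = queue.pop(0)
        for target in PAGES[page][1]:
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for page in pages:
        for word in PAGES[page][0].lower().split():
            index[word].add(page)
    return index

def search(index, query):
    """Rank pages by how many query words they contain (ties by name)."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for page in index.get(word, ()):
            scores[page] += 1
    return sorted(scores, key=lambda p: (-scores[p], p))

index = build_index(crawl("home"))
print(search(index, "rank pages"))
```

Here "home" and "rank" both contain both query words, so text alone cannot separate them. That tie is one way to see why text content is not enough and why link-based ranking matters.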

12 PAGERANK IS REALLY A “RANDOM SURFER” MODEL Random Surfer Model: What about getting stuck in loops? Let’s count the surfers that pass through each point:
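A small simulation makes the "count the surfers" idea concrete. The sketch below assumes a hypothetical four-page link graph and a surfer who, with probability `jump_prob`, gets bored and jumps to a random page instead of following a link; that random jump is the usual answer to the "stuck in loops" question, but the 0.15 value and the graph are assumptions chosen for illustration.

```python
import random
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {"A": ["B", "C", "D"], "B": ["A", "C"], "C": ["A"], "D": ["C"]}

def random_surfer(links, steps=100_000, jump_prob=0.15, seed=0):
    """Return the fraction of steps the surfer spends on each page."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    pages = sorted(links)
    visits = Counter()
    page = rng.choice(pages)
    for _ in range(steps):
        visits[page] += 1
        # Jump to a random page occasionally, or when there is no link to follow.
        if rng.random() < jump_prob or not links[page]:
            page = rng.choice(pages)
        else:
            page = rng.choice(links[page])
    return {p: visits[p] / steps for p in pages}

shares = random_surfer(LINKS)
for page, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(page, round(share, 3))
```

Pages that the surfer passes through most often are the ones this model treats as most important.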

13 MEASURING IMPORTANCE OF LINKING
PageRank Algorithm
– Idea: important pages are pointed to by other important pages
– Method: each link from one page to another is counted as a “vote” for the destination page
– The number of incoming links is important, but it is not enough: each “vote” is different!
– PageRank places more importance on votes that come from pages with a large number of votes (and so on, and so on)
– Compare, for example, the circled page in cases A and B

14 COMPUTING PAGERANK (ignoring damping factor for illustration)
Each book page has a “People who bought this also bought…” list that links to other books:
– BOOK A links to book B, book C, book D
– BOOK B links to book A, book C
– BOOK C links to book A
– BOOK D links to book C

15 COMPUTING PAGERANK (ignoring damping factor for illustration)
(Same four books and links as the previous slide.)

16 PAGERANK (ignoring damping factor for illustration)
Start: every book has the same score, .250.

17 PAGERANK (ignoring damping factor for illustration)
Each book splits its current score evenly among the books it links to:
– BOOK A (.250) sends .250/3 each to book B, book C, book D
– BOOK B (.250) sends .250/2 each to book A, book C
– BOOK C (.250) sends .250 to book A
– BOOK D (.250) sends .250 to book C

18 PAGERANK (ignoring damping factor for illustration)
Each book’s new score is the sum of the shares it receives:
– BOOK A: .250 + .250/2 = .375
– BOOK B: .250/3 = .083
– BOOK C: .250/3 + .250/2 + .250 = .458
– BOOK D: .250/3 = .083

19 PAGERANK (ignoring damping factor for illustration)
The new scores are split among the links again:
– BOOK A (.375) sends .375/3 each to book B, book C, book D
– BOOK B (.083) sends .083/2 each to book A, book C
– BOOK C (.458) sends .458 to book A
– BOOK D (.083) sends .083 to book C

20 PAGERANK (ignoring damping factor for illustration)
Scores after the second round:
– BOOK A: .500
– BOOK B: .125
– BOOK C: .250
– BOOK D: .125

21 PAGERANK (ignoring damping factor for illustration)
Scores once they stop changing:
– BOOK A: .400 (sends .400/3 to each of its three links)
– BOOK B: .133 (sends .133/2 to each of its two links)
– BOOK C: .333
– BOOK D: .133
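The book example can be reproduced with a short loop. The sketch below uses the link structure from these slides (A links to B, C, D; B links to A, C; C links to A; D links to C) and, like the slides, ignores the damping factor; the function names are invented for this example.

```python
# Link structure from the book example: page -> pages it links to.
LINKS = {"A": ["B", "C", "D"], "B": ["A", "C"], "C": ["A"], "D": ["C"]}

def pagerank_step(ranks, links):
    """One round: every page splits its score evenly among its links."""
    new_ranks = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks

def pagerank(links, iterations=100):
    """Start from equal scores and repeat the round many times (no damping)."""
    ranks = {page: 1 / len(links) for page in links}
    for _ in range(iterations):
        ranks = pagerank_step(ranks, links)
    return ranks

start = {page: 0.25 for page in LINKS}
first = pagerank_step(start, LINKS)    # A=.375, B=.083, C=.458, D=.083
second = pagerank_step(first, LINKS)   # A=.500, B=.125, C=.250, D=.125
final = pagerank(LINKS)                # A=.400, B=.133, C=.333, D=.133
print({page: round(score, 3) for page, score in final.items()})
```

Dropping the damping factor works here because every book has at least one outgoing link and the graph has no trap the score can get stuck in; on a real web graph the damping factor is needed.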

22 GAMING PAGERANK AND TRUST
TrustRank Algorithm
– Initial votes come only from trusted pages
– Compare, for example, the circled page in cases A and B
[Figure labels: A, B, trusted page, links from untrusted sources]
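One simplified way to picture "initial votes come only from trusted pages" is a PageRank-style computation in which score enters the system only at a hand-picked set of trusted pages. The sketch below is an assumption-laden illustration, not the published TrustRank procedure: the graph, the page names, the 0.85 damping value, and the function `biased_rank` are all hypothetical.

```python
# Hypothetical graph: a trusted pair of pages plus a small link farm.
LINKS = {
    "trusted": ["good"],
    "good": ["trusted"],
    "spam1": ["target"],
    "spam2": ["target"],
    "spam3": ["target"],
    "target": ["spam1", "spam2", "spam3"],
}

def biased_rank(links, seeds, damping=0.85, iterations=100):
    """PageRank-style scores where fresh score is injected only at seed pages.

    Assumes every page has at least one outgoing link.
    """
    pages = list(links)
    jump = {p: (1 / len(seeds) if p in seeds else 0.0) for p in pages}
    ranks = dict(jump)
    for _ in range(iterations):
        new_ranks = {p: (1 - damping) * jump[p] for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += damping * ranks[page] / len(targets)
        ranks = new_ranks
    return ranks

uniform = biased_rank(LINKS, seeds=set(LINKS))   # every page can start votes
trust = biased_rank(LINKS, seeds={"trusted"})    # only the trusted page can

print("uniform:", round(uniform["target"], 3), round(uniform["good"], 3))
print("trusted:", round(trust["target"], 3), round(trust["good"], 3))
```

With votes starting everywhere, the link farm pushes "target" above "good"; with votes starting only at the trusted page, the farm receives nothing because no trusted page links into it.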

23 SIMULATING CHANGES IN PAGERANK
Same four books and links as before (BOOK A links to book B, book C, book D; BOOK B links to book A, book C; BOOK C links to book A; BOOK D links to book C), with scores .400, .133, .333 shown on the diagram.

Change               PR of A   PR of C
C cuts link to A     0.18      0.50
C links to B         0.38      0.33
C links to D         0.24      0.40
C links to B & D     0.22      0.38

24 IMPORTANCE OF ANCHOR TEXT
– Example: INFOSYS 141, “A terrific course on search engines”
– The anchor text summarizes what the website is about

25 OTHER RANKING FACTORS
Location, Location, Location... and Frequency
– Query words in the title, or in the first few sentences
– The more frequent the query words, the better
Click-through measurement
– How often users click on your URL when they see it
– How long they stay (measured using toolbars!)

26 OUTLINE
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search

27 ACHIEVING HIGHER RESULTS RANKINGS
– Position your keywords (title, headings, early on page)
– Make text visible (no tiny fonts, no white-on-white)
– Frames can kill
– Have relevant content
– Do not change topics
– Just say no to search engine spamming
– Submit your key pages
– Verify your listing often

28 MANIPULATING RANKINGS
Motives
– Commercial, political, religious, lobbies
– Promotion funded by advertising budget
Operators
– Contractors (Search Engine Optimizers) for lobbies, companies
– Web masters
– Hosting services
What are the techniques used by rankings manipulators?

29 MANIPULATION TECHNOLOGIES
Cloaking
– Serve fake content to search engine robot
– DNS cloaking: switch IP address, impersonate
Doorway pages
– Pages optimized for a single keyword that redirect to the real target page
Keyword spam
– Misleading meta-keywords, excessive repetition of a term, fake “anchor text”
– Hidden text with colors, CSS tricks, etc.
– Example: Meta-Keywords = “… London hotels, hotel, holiday inn, hilton, discount, booking, reservation, sex, mp3, britney spears, viagra, …”
Link spamming
– Mutual admiration societies, hidden links, awards
– Domain flooding: numerous domains that point or redirect to a target page
Robots
– Fake click stream
– Fake query stream
[Cloaking diagram labels: Is this a Search Engine spider? N / Y; SPAM; Fake Doc]
Risky to use any of these, as search engines are getting better at detecting and punishing them

30 OUTLINE
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search

31 PAID RANKING
Promoting without manipulation: paid placement
– Keyword bidding for targeted ads
– Pay-per-click
– Higher bids result in higher ranks for the ad
– A higher percentage of clicks on the ad increases the rank as well (why?)
– Google’s AdWords is the biggest player
– Google’s 2007 revenue was more than $16 billion, and its 2008 revenue about $22 billion, mostly from such ads
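One way to approach the "(why?)" is to note that with pay-per-click the search engine earns money only when an ad is clicked, so an ad's expected revenue per showing is roughly its bid times its click rate. The sketch below ranks ads by that product. It is a deliberate simplification: the `Ad` class, the advertiser names, and the numbers are invented, and real ad systems use more signals than these two.

```python
from dataclasses import dataclass

@dataclass
class Ad:
    advertiser: str
    bid: float         # maximum price per click, in dollars (hypothetical)
    click_rate: float  # estimated fraction of showings that get clicked

def expected_revenue(ad):
    """Expected revenue per showing under pay-per-click: bid x click rate."""
    return ad.bid * ad.click_rate

def rank_ads(ads):
    """Order ads from highest to lowest expected revenue per showing."""
    return sorted(ads, key=expected_revenue, reverse=True)

ads = [
    Ad("HighBid", bid=2.00, click_rate=0.01),
    Ad("Relevant", bid=1.00, click_rate=0.05),
    Ad("Middle", bid=1.50, click_rate=0.02),
]

for position, ad in enumerate(rank_ads(ads), start=1):
    print(position, ad.advertiser, round(expected_revenue(ad), 3))
```

In this made-up example the ad with the lowest bid takes the top position because users click it five times as often as the highest bidder's ad.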

32 EXAMPLE AdWords Placement AdWords Placement Most relevant sites

33

34 FUND YOUR WEBSITE: ADSENSE
– Google also delivers ads to other websites
– Sign up for Google AdSense, and Google delivers ads to your website (a common source of income for “professional” bloggers)
How ads are delivered:
– If the website is the best match for the targeted keywords
– If users of the website click on results
Strategies for successful ads:
– Place the ads on top
– Blend with the rest of the website
– Ads at the bottom are consistently ignored

35 EXAMPLE: WASHINGTON POST WEBSITE

36 Analysis of Washington Post Website

37 TARGETING BANNER ADS
Request for ad from ad server
– IP address
– Country, domain, company
– Browser, operating system
– Surfing behavior from cookies
– Demographic data?
Targeted ad is delivered to user
– Context: movie reviews
– User profile: NYU user, New York

38 CLOSED LOOP MARKETING
– User visits publisher sites
– Ads delivered by DART for Advertisers
– Boomerang captures user action data
– Data analysis databank
– Boomerang compiles and reports response for future targeting
– User clicks and visits advertiser’s site
Source: Doubleclick, Inc.

39 FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search

40 INFORMATION EXTRACTION
Information extraction applications extract structured relations from unstructured text.

Example text: “May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, which is in the front line of the world's response to the deadly Ebola epidemic in Zaire, is finding itself hard pressed to cope with the crisis…”

Information Extraction System (e.g., NYU’s Proteus)

Disease Outbreaks in The New York Times
Date        Disease Name      Location
Jan. 1995   Malaria           Ethiopia
July 1995   Mad Cow Disease   U.K.
Feb. 1995   Pneumonia         U.S.
May 1995    Ebola             Zaire
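A toy version of this step can be written with a regular expression and two word lists. The sketch below turns the example sentence into one table row. It is far simpler than a real system such as Proteus: the `DISEASES` and `LOCATIONS` lists and the function `extract_outbreak` are assumptions made up for this illustration.

```python
import re

TEXT = (
    "May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, "
    "which is in the front line of the world's response to the deadly Ebola "
    "epidemic in Zaire, is finding itself hard pressed to cope with the crisis"
)

# Hypothetical lookup lists; a real system would learn or curate far larger ones.
DISEASES = ["Ebola", "Malaria", "Pneumonia", "Mad Cow Disease"]
LOCATIONS = ["Zaire", "Ethiopia", "U.K.", "U.S."]
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

def extract_outbreak(text):
    """Pull a (date, disease, location) record out of free text."""
    # Matches dates such as "May 19 1995" or "Jan. 3, 1995".
    date = re.search(rf"\b({MONTHS})[a-z]*\.? \d{{1,2}},? (\d{{4}})", text)
    disease = next((d for d in DISEASES if d in text), None)
    location = next((place for place in LOCATIONS if place in text), None)
    return {
        "date": f"{date.group(1)} {date.group(2)}" if date else None,
        "disease": disease,
        "location": location,
    }

print(extract_outbreak(TEXT))
```

Simple matching like this breaks easily (for example, it would not know that the Atlanta dateline is not the outbreak location unless Atlanta is kept off the list), which is part of why information extraction is hard.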

41 RETURN STRUCTURED ANSWERS, NOT WEBPAGES

42 FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search

43 Y! ANSWERS
– Launched in second half of 2005
– Incentive system based on points and voting for best answers
– Questions grouped by category
– Some statistics: over 60 million users; over 120 million answers; available in 18 countries and in 6 languages

44

45 Y! ANSWERS

46

47 LONG-TERM PROSPECTS
– Questions follow a power law: a large number of questions will be asked by many people (20% of questions account for 80% of requests)
– We only need one answer for each question
– Quickly acquire high-quality answers for 80% of queries
– …in time, people will take care of the “long tail” of the remaining questions
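The 20/80 claim can be checked against a simple model. The sketch below assumes that the question ranked k-th in popularity is asked in proportion to 1/k (a Zipf-style power law) and computes what share of all requests goes to the most popular 20% of questions. The 1,000-question catalog and the exponent of 1 are assumptions picked for illustration, not figures from Y! Answers.

```python
def head_share(num_questions=1000, head_fraction=0.2, exponent=1.0):
    """Share of all requests that go to the most popular questions.

    Assumes the k-th most popular question is asked in proportion
    to 1 / k**exponent.
    """
    weights = [1 / rank**exponent for rank in range(1, num_questions + 1)]
    head_size = int(num_questions * head_fraction)
    return sum(weights[:head_size]) / sum(weights)

print(f"Top 20% of questions: {head_share():.0%} of requests")
print(f"Top 5% of questions:  {head_share(head_fraction=0.05):.0%} of requests")
```

Under these assumptions the top fifth of questions draws close to four fifths of the requests, so answering that head well covers most of what people ask.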

48 FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search

49 PRIVACY PRESERVING SEARCH

50 NEXT CLASS: SOCIAL NETWORKS Work on Assignment 2

