Published by Nigel Manning. Modified over 9 years ago.
1
INFORMATION TECHNOLOGY IN BUSINESS AND SOCIETY
SESSION 9 – SEARCH AND ADVERTISING
SEAN J. TAYLOR
2
ADMINISTRIVIA
– Assignment 2 online, due Saturday 2/25 at 1am
– Assignment 2 resources
– Assignment 3 preview
– Guest speaker on Tuesday 2/28: Chrys Wu discussing IT and Journalism
– Substitute on Thursday 3/1: Professor Dylan Walker
3
LEARNING OBJECTIVES
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search
4
SEARCH ENGINES AND WEB DIRECTORIES
Resources on the Web that help you find sites with the information and/or services you want.
– Directory search engine: organizes listings of Web sites into hierarchical lists.
– Search engine: uses software agent technologies (“spiders” or “bots”) to search the Web for key words and place them into indexes.
5
WEB DIRECTORIES EXAMPLE Advantages? Disadvantages?
6
SEARCH ENGINE EXAMPLES Advantages? Disadvantages?
7
SEARCH ENGINES DRIVE ECOMMERCE!
8
WHERE IS CONSUMERS’ ATTENTION?
10
EYETRACKING STUDY OF GOOGLE RESULTS
11
HOW SEARCH ENGINES WORK
– Search engines discover new pages by following links
– They keep track of the words that appear in pages; when you enter a query, the search engine returns a ranked list
– Text content is important! But it is not enough! (Why?)
How do search engines rank pages? (Why does this matter?)
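The "keep track of words that appear in pages" step is usually called an inverted index. The following is a minimal sketch of the idea, not how any real search engine is implemented; the page names and texts in `PAGES` and the helper names `build_index` and `search` are invented for illustration.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word of the query, sorted by name."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical pages, invented for illustration.
PAGES = {
    "hotels.html": "cheap london hotels and hotel booking",
    "course.html": "a course on search engines",
    "travel.html": "london travel guide",
}
INDEX = build_index(PAGES)

print(search(INDEX, "london"))         # ['hotels.html', 'travel.html']
print(search(INDEX, "london hotels"))  # ['hotels.html']
```

Note that the result is only a set of matching pages ordered by name. Word matching alone says nothing about which match is best, which is why text content "is not enough" and a separate ranking step is needed.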
12
PAGERANK IS REALLY A “RANDOM SURFER” MODEL
Random Surfer Model: What about getting stuck in loops?
Let’s count the surfers that pass through each point:
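A rough simulation of the random surfer idea might look like the sketch below. The four-page graph in `LINKS`, the `teleport` probability of 0.15, and the function name `random_surfer` are assumptions made for illustration; the slide does not specify them. The occasional random jump is one common answer to the "stuck in loops" question.

```python
import random

# Hypothetical link graph (page -> pages it links to), chosen for illustration.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A"],
    "D": ["C"],
}


def random_surfer(links, steps, teleport=0.15, seed=0):
    """Return the fraction of steps the surfer spends on each page."""
    rng = random.Random(seed)
    pages = sorted(links)
    visits = {page: 0 for page in pages}
    current = pages[0]
    for _ in range(steps):
        visits[current] += 1
        # With probability `teleport` (or at a dead end), jump to a random page;
        # this keeps the surfer from being trapped in a loop forever.
        if not links[current] or rng.random() < teleport:
            current = rng.choice(pages)
        else:
            current = rng.choice(links[current])
    return {page: count / steps for page, count in visits.items()}


shares = random_surfer(LINKS, steps=200_000)
for page, share in sorted(shares.items()):
    print(page, round(share, 3))
```

Pages where the surfer spends more time are the ones the model treats as more important. Setting `teleport=0.0` gives the pure link-following surfer.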
13
MEASURING IMPORTANCE OF LINKING
PageRank Algorithm
Idea: important pages are pointed to by other important pages
Method: each link from one page to another is counted as a “vote” for the destination page
– The number of incoming links is important! But it is not enough!
– Each “vote” carries a different weight: PageRank gives more weight to votes that come from pages that themselves have a large number of votes (and so on, and so on)
Compare, for example, the circled page in cases A and B
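Written as a formula, the voting rule (ignoring the damping factor) can be sketched as follows. The notation is an assumption introduced here, not taken from the slides: In(p) is the set of pages linking to p, and Out(q) is the set of pages that q links to.

```latex
PR(p) \;=\; \sum_{q \in \mathrm{In}(p)} \frac{PR(q)}{|\mathrm{Out}(q)|}
```

Each page q splits its own score equally among its outgoing links, so a vote from a high-scoring page with few links is worth more than a vote from a low-scoring page with many links.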
14
COMPUTING PAGERANK (ignoring damping factor for illustration)
“People who bought this also bought…” lists act as links between four books:
– BOOK A: book B, book C, book D
– BOOK B: book A, book C
– BOOK C: book A
– BOOK D: book C
16
PAGERANK (ignoring damping factor for illustration)
Links: A → B, C, D; B → A, C; C → A; D → C
Start: each book has PageRank .250
17
PAGERANK (ignoring damping factor for illustration)
Links: A → B, C, D; B → A, C; C → A; D → C
Each book starts at .250 and splits its score equally among its links:
– A sends .250/3 to each of B, C, D
– B sends .250/2 to each of A, C
– C sends .250 to A
– D sends .250 to C
18
PAGERANK (ignoring damping factor for illustration)
Links: A → B, C, D; B → A, C; C → A; D → C
Votes sent: A sends .250/3 per link; B sends .250/2 per link; C sends .250; D sends .250
PageRank after the first update: A = .375, B = .083, C = .458, D = .083
19
PAGERANK (ignoring damping factor for illustration)
Links: A → B, C, D; B → A, C; C → A; D → C
Current PageRank: A = .375, B = .083, C = .458, D = .083
Votes sent in the next round: A sends .375/3 per link; B sends .083/2 per link; C sends .458; D sends .083
20
PAGERANK (ignoring damping factor for illustration)
Links: A → B, C, D; B → A, C; C → A; D → C
Votes sent: A sends .375/3 per link; B sends .083/2 per link; C sends .458; D sends .083
PageRank after the second update: A = .500, B = .125, C = .250, D = .125
21
PAGERANK (ignoring damping factor for illustration)
Links: A → B, C, D; B → A, C; C → A; D → C
PageRank after repeated updates: A = .400, B = .133, C = .333, D = .133
Votes sent at this point: A sends .400/3 per link; B sends .133/2 per link; C sends .333; D sends .133
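The repeated updates on the four-book example can be reproduced with a few lines of Python. This is a minimal sketch that, like the slides, ignores the damping factor; the function names `pagerank_step` and `pagerank` are chosen here for illustration.

```python
# Link graph from the slides: each book's "also bought" list is its outgoing links.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank_step(links, ranks):
    """One update: every page splits its current score equally among its links."""
    new_ranks = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks


def pagerank(links, iterations):
    """Start from equal scores and apply `iterations` updates (no damping factor)."""
    ranks = {page: 1.0 / len(links) for page in links}
    for _ in range(iterations):
        ranks = pagerank_step(links, ranks)
    return ranks


for n in (1, 2, 100):
    print(n, {page: round(score, 3) for page, score in pagerank(LINKS, n).items()})
# 1 {'A': 0.375, 'B': 0.083, 'C': 0.458, 'D': 0.083}
# 2 {'A': 0.5, 'B': 0.125, 'C': 0.25, 'D': 0.125}
# 100 {'A': 0.4, 'B': 0.133, 'C': 0.333, 'D': 0.133}
```

The scores stop changing once every page receives exactly as much as it gives away, which is the .400 / .133 / .333 / .133 state shown on the last slide of the sequence.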
22
GAMING PAGERANK AND TRUST
TrustRank Algorithm: initial votes come only from trusted pages
Compare, for example, the circled page in cases A and B:
– A: links from untrusted sources
– B: link from a trusted page
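One way to sketch "initial votes come only from trusted pages" is to run the same vote-passing update but inject score only at a set of trusted seed pages. The slide gives no formula, so everything below is an assumption for illustration: the `damping` value of 0.85, the function name `trust_scores`, and the five-page graph are all hypothetical.

```python
def trust_scores(links, trusted, damping=0.85, iterations=50):
    """Propagate trust along links, injecting fresh trust only at trusted pages."""
    pages = list(links)
    seed = {p: (1.0 / len(trusted) if p in trusted else 0.0) for p in pages}
    scores = dict(seed)
    for _ in range(iterations):
        # Every round, only the trusted seed pages receive "free" score.
        new_scores = {p: (1.0 - damping) * seed[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue  # dead end: its trust is simply dropped in this sketch
            share = damping * scores[page] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical graph: one page is endorsed by a trusted page (like case B),
# another is "boosted" only by untrusted spam pages (like case A).
LINKS = {
    "trusted": ["endorsed"],
    "endorsed": [],
    "spam1": ["boosted"],
    "spam2": ["boosted"],
    "boosted": [],
}

scores = trust_scores(LINKS, trusted={"trusted"})
print(scores["endorsed"] > 0.0)  # True
print(scores["boosted"])         # 0.0
```

The page with two incoming links from untrusted sources ends up with no trust at all, while the page with a single link from a trusted page does, which is the contrast the slide draws between cases A and B.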
23
SIMULATING CHANGES IN PAGERANK
Links: A → B, C, D; B → A, C; C → A; D → C
Baseline PageRank: A = .400, B = .133, C = .333
Change              PR of A   PR of C
C cuts link to A    0.18      0.50
C links to B        0.38      0.33
C links to D        0.24      0.40
C links to B & D    0.22      0.38
24
IMPORTANCE OF ANCHOR TEXT
INFOSYS 141
A terrific course on search engines
The anchor text summarizes what the website is about.
25
OTHER RANKING FACTORS
Location, Location, Location... and Frequency
– Query words in the title, or in the first few sentences
– The more frequent the query words, the better
Click-through measurement
– How often users click on your URL when they see it
– How long they stay (measured using toolbars!)
26
OUTLINE
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search
27
ACHIEVING HIGHER RESULTS RANKINGS
– Position your keywords (title, headings, early on page)
– Make text visible (no tiny fonts, no white-on-white)
– Frames can kill
– Have relevant content
– Do not change topics
– Just say no to search engine spamming
– Submit your key pages
– Verify your listing often
28
MANIPULATING RANKINGS
Motives
– Commercial, political, religious, lobbies
– Promotion funded by advertising budget
Operators
– Contractors (Search Engine Optimizers) for lobbies, companies
– Web masters
– Hosting services
What are the techniques used by rankings manipulators?
29
MANIPULATION TECHNOLOGIES
Cloaking
– Serve fake content to the search engine robot
– DNS cloaking: switch IP address, impersonate
– [Diagram: “Is this a Search Engine spider?” with Y/N branches; labels: SPAM, Fake Doc, Cloaking]
Doorway pages
– Pages optimized for a single keyword that redirect to the real target page
Keyword spam
– Misleading meta-keywords, excessive repetition of a term, fake “anchor text”
– Hidden text with colors, CSS tricks, etc.
– Example: Meta-Keywords = “… London hotels, hotel, holiday inn, hilton, discount, booking, reservation, sex, mp3, britney spears, viagra, …”
Link spamming
– Mutual admiration societies, hidden links, awards
– Domain flooding: numerous domains that point or redirect to a target page
Robots
– Fake click stream
– Fake query stream
Risky to use any of these, as search engines are getting better at detecting and punishing them
30
OUTLINE
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search
31
PAID RANKING
Promoting without manipulation: paid placement
– Keyword bidding for targeted ads
– Pay-per-click
– Higher bids result in higher ranks for the ad
– A higher percentage of clicks on the ad increases the rank as well (why?)
– Google’s AdWords is the biggest player
– Google’s 2007 revenue was more than $16 billion and its 2008 revenue about $22 billion, mostly from such ads
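A simple way to see why both the bid and the click percentage matter: under pay-per-click, the search engine earns money only when an ad is actually clicked, so an ad's expected revenue per showing is roughly bid × click rate. The sketch below ranks ads on that product. It is a deliberate simplification and not Google's actual auction, whose details the slides do not give; the `Ad` class, the `rank_ads` function, and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float         # maximum the advertiser pays per click, in dollars
    click_rate: float  # estimated fraction of showings that get clicked


def rank_ads(ads):
    """Order ads by bid x click rate, i.e. expected revenue per showing."""
    return sorted(ads, key=lambda ad: ad.bid * ad.click_rate, reverse=True)


# Hypothetical ads, invented for illustration.
ADS = [
    Ad("high_bidder", bid=2.00, click_rate=0.01),  # expected $0.02 per showing
    Ad("popular", bid=1.00, click_rate=0.05),      # expected $0.05 per showing
    Ad("middle", bid=1.50, click_rate=0.02),       # expected $0.03 per showing
]

print([ad.advertiser for ad in rank_ads(ADS)])
# ['popular', 'middle', 'high_bidder']
```

Here the lowest bidder takes the top slot because users click its ad far more often, which is one plausible answer to the "(why?)" on the slide.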
32
EXAMPLE
[Screenshot labels: AdWords placement (two regions); most relevant sites]
34
FUND YOUR WEBSITE: ADSENSE
Google also delivers ads to other websites
– Sign up for Google AdSense, and Google delivers ads to your website (a common source of income for “professional” bloggers)
How ads are delivered:
– If the website is the best match for the targeted keywords
– If users of the website click on results
Strategies for successful ads:
– Place the ads on top
– Blend with the rest of the website
– Ads at the bottom are consistently ignored
35
EXAMPLE: WASHINGTON POST WEBSITE
36
Analysis of Washington Post Website
37
TARGETING BANNER ADS
Request for ad from ad server
– IP address: country, domain, company
– Browser, operating system
– Surfing behavior from cookies
– Demographic data?
Targeted ad is delivered to user
– Context: movie reviews
– User profile: NYU user, New York
38
CLOSED LOOP MARKETING
– User visits publisher sites
– Ads delivered by DART for Advertisers
– Boomerang captures user action data
– Data analysis, databank
– Boomerang compiles & reports response for future targeting
– User clicks & visits advertiser’s site
Source: Doubleclick, Inc.
39
FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search
40
INFORMATION EXTRACTION
Information extraction applications extract structured relations from unstructured text
Example text: May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, which is in the front line of the world's response to the deadly Ebola epidemic in Zaire, is finding itself hard pressed to cope with the crisis…
Information Extraction System (e.g., NYU’s Proteus)
Disease Outbreaks in The New York Times
Date        Disease Name      Location
Jan. 1995   Malaria           Ethiopia
July 1995   Mad Cow Disease   U.K.
Feb. 1995   Pneumonia         U.S.
May 1995    Ebola             Zaire
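To make the text-to-table step concrete, here is a toy sketch that pulls a (date, disease, location) row out of the example sentence using a date pattern and two small word lists. Real systems such as Proteus rely on much richer linguistic analysis; the word lists `DISEASES` and `LOCATIONS`, the `DATE` pattern, and the function name `extract_outbreak` are all assumptions made for illustration.

```python
import re

# Tiny hand-made vocabularies, invented for illustration.
DISEASES = ["Ebola", "Malaria", "Pneumonia", "Mad Cow Disease"]
LOCATIONS = ["Zaire", "Ethiopia", "U.K.", "U.S."]

# Month name, an optional day of the month, then a four-digit year.
DATE = re.compile(
    r"\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? "
    r"(?:\d{1,2},? )?(\d{4})\b"
)


def extract_outbreak(text):
    """Return one (date, disease, location) record found in the text."""
    date = DATE.search(text)
    disease = next((d for d in DISEASES if d in text), None)
    location = next((loc for loc in LOCATIONS if loc in text), None)
    return {
        "date": f"{date.group(1)} {date.group(2)}" if date else None,
        "disease": disease,
        "location": location,
    }


TEXT = (
    "May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, "
    "which is in the front line of the world's response to the deadly Ebola "
    "epidemic in Zaire, is finding itself hard pressed to cope with the crisis"
)

print(extract_outbreak(TEXT))
# {'date': 'May 1995', 'disease': 'Ebola', 'location': 'Zaire'}
```

The output corresponds to the "May 1995, Ebola, Zaire" row of the table. Once text is reduced to rows like this, a search engine can answer questions over the table instead of returning whole pages.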
41
RETURN STRUCTURED ANSWERS, NOT WEBPAGES
42
FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search
43
Y! ANSWERS
– Launched in second half of 2005
– Incentive system based on points and voting for best answers
– Questions grouped by category
Some statistics:
– Over 60 million users
– Over 120 million answers
– Available in 18 countries and in 6 languages
45
Y! ANSWERS
47
LONG-TERM PROSPECTS
Questions follow a power law:
– A large number of questions will be asked by many people (20% of questions account for 80% of requests)
– We only need one answer for each question
– Quickly acquire high-quality answers for 80% of queries
– …in time, people will take care of the “long tail” of the remaining questions
48
FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search
49
PRIVACY PRESERVING SEARCH
50
NEXT CLASS: SOCIAL NETWORKS
Work on Assignment 2