CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google.

WWW Digraph More than 1 Billion Nodes (Pages) Average Degree (links/Page) is (Hard to Compute!) Massive, Distributed, Explicit Digraph (Not Like Call Graphs)

“Hot” Research Area Graph Representation Duplicate Elimination Clustering Ranking Query Results A. Broder & M. Henzinger

“Abundance” Problem ….. kleinber.html Given a query, find: –Good Content (“Authorities”) –Good Sources of Links (“Hubs”). The two are Mutually Reinforcing. Simple (Core) Algorithm. [Figure: nodes labeled A (authority) and H (hub)]

T = {n Pages}, A = {Links}. $X_p \ge 0$, $p \in T$: non-negative “Authority Weights”. $Y_p \ge 0$, $p \in T$: non-negative “Hub Weights”. I operation (Update Authority Weights): $X_p \leftarrow \sum_{(q,p) \in A} Y_q$. O operation (Update Hub Weights): $Y_p \leftarrow \sum_{(p,q) \in A} X_q$. Normalize: $\sum_{p \in T} X_p^2 = \sum_{p \in T} Y_p^2 = 1$.

Core Algorithm: $Z \leftarrow (1,1,\ldots,1)$; $X \leftarrow Y \leftarrow Z$. Repeat until Convergence: Apply I /* Update Authority Weights */; Apply O /* Update Hub Weights */; Normalize. Return Limit $(X^*, Y^*)$.
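The core algorithm can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function names `hits` and `normalize`, the tolerance `tol`, the cap `max_iter`, and the five-page example graph are all assumptions made for the example. Pages are numbered 0..n-1 and each link is a pair (source, target).

```python
import math

def normalize(v):
    """Scale v so that the sum of squares is 1 (left unchanged if all zero)."""
    norm = math.sqrt(sum(w * w for w in v))
    return [w / norm for w in v] if norm > 0 else list(v)

def hits(n, links, tol=1e-12, max_iter=1000):
    """Iterate the I and O operations until the weights stop changing.

    tol and max_iter are illustrative stopping parameters (assumptions).
    """
    x = [1.0] * n  # authority weights, X <- Z = (1, ..., 1)
    y = [1.0] * n  # hub weights,       Y <- Z
    for _ in range(max_iter):
        # I operation: X_p <- sum of Y_q over links (q, p)
        new_x = [0.0] * n
        for q, p in links:
            new_x[p] += y[q]
        # O operation: Y_p <- sum of X_q over links (p, q), using the updated X
        new_y = [0.0] * n
        for p, q in links:
            new_y[p] += new_x[q]
        new_x, new_y = normalize(new_x), normalize(new_y)
        change = max(abs(a - b) for a, b in zip(new_x + new_y, x + y))
        x, y = new_x, new_y
        if change < tol:
            break
    return x, y

# Hypothetical graph: pages 0 and 1 act as hubs, pages 2, 3, 4 as authorities.
example_links = [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3)]
authority, hub = hits(5, example_links)
```

In this example pages 2 and 3 are linked from both hubs and so end up with equal authority weights above page 4, while page 0 links to more authorities than page 1 and gets the larger hub weight.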

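Convergence of $(X_i, Y_i) = (OI)^i (Z, Z)$. A = n x n “Adjacency Matrix”. Rewrite I and O: $X \leftarrow A^T Y$; $Y \leftarrow AX$. Then $X_i = (A^T A)^{i-1} A^T Z$; $Y_i = (A A^T)^i Z$. $A A^T$ Symm., Non-negative and $Z = (1,1,\ldots,1)$ $\Rightarrow$ $X^* = \lim_{i \to \infty} X_i = \omega_1(A^T A)$, $Y^* = \lim_{i \to \infty} Y_i = \omega_1(A A^T)$, where $\omega_1(M)$ denotes the principal eigenvector of $M$ and the iterates are normalized at each step.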
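The eigenvector claim can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the 5 x 5 adjacency matrix and the helper names (`matvec`, `matmul`, `transpose`, `unit`) are made up for the example, and 100 steps is an arbitrary iteration count. It starts from $A^T Z$, repeatedly multiplies by $A^T A$, and then measures how far the result is from satisfying $A^T A x = \lambda x$.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def unit(v):
    norm = math.sqrt(sum(w * w for w in v))
    return [w / norm for w in v]

# Hypothetical adjacency matrix: A[p][q] = 1 if page p links to page q.
A = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
At = transpose(A)
AtA = matmul(At, A)

# X_i = (A^T A)^(i-1) A^T Z, normalized at every step.
x_star = unit(matvec(At, [1.0] * 5))
for _ in range(100):
    x_star = unit(matvec(AtA, x_star))

# Rayleigh quotient as the eigenvalue estimate, then the eigen-equation residual.
image = matvec(AtA, x_star)
lam = sum(a * b for a, b in zip(x_star, image))
residual = max(abs(a - lam * b) for a, b in zip(image, x_star))
```

For this particular matrix the nonzero block of $A^T A$ has largest eigenvalue $(5 + \sqrt{17})/2$, so `lam` should agree with that value and `residual` should be close to zero.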

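Whole Algorithm (k, d, c): Send the query q to a Search Engine to obtain a set S of result pages with |S| < k. Base Set T: the pages in S, the pages that S links to, and the pages that link to S, with < d links/page. Remove “Internal Links”. Run Core Algorithm on T. From the Result (X*, Y*), select the c pages with max X* values and the c pages with max Y* values.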
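The construction of the base set T can be sketched as follows. Everything here is an illustrative assumption rather than the lecture's implementation: the function names, the dictionaries `out_links` and `in_links` standing in for a crawl or link index, the made-up page names, and the `domain_of` helper. In particular, the sketch assumes that d caps the number of in-linking pages added per page of S, and that an “internal link” is one between two pages on the same domain.

```python
def build_base_set(root, out_links, in_links, d):
    """Expand the result set S (root) into the base set T."""
    base = set(root)
    for page in root:
        # Pages that a page in S links to.
        base.update(out_links.get(page, []))
        # Pages that link to a page in S, at most d per page (assumed meaning of d).
        base.update(in_links.get(page, [])[:d])
    return base

def external_links(base, out_links, domain_of):
    """Keep links between pages of T, dropping same-domain ('internal') links."""
    return [
        (p, q)
        for p in sorted(base)
        for q in out_links.get(p, [])
        if q in base and domain_of(p) != domain_of(q)
    ]

# Hypothetical link data; page names are made up for the example.
out_links = {
    "a.com/1": ["a.com/2", "b.com/1"],
    "c.com/1": ["a.com/1"],
    "d.com/1": ["a.com/1"],
    "e.com/1": ["a.com/1"],
}
in_links = {"a.com/1": ["c.com/1", "d.com/1", "e.com/1"]}

def domain_of(url):
    return url.split("/")[0]

T = build_base_set(["a.com/1"], out_links, in_links, d=2)
links = external_links(T, out_links, domain_of)
```

The resulting `links` list is what would be handed to the core algorithm; the c pages with the largest authority weights and the c pages with the largest hub weights are then reported.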

Examples (k= 200, d=5) q = censorship + net q = Gates [Compares well with Yahoo, Galaxy, etc.]

Approach to “Massiveness”: Throw Out Most of G!! Non-principal Eigenvectors correspond to “Non-principal Communities” Open (?): Objective Performance Criteria Dependence on Search Engine Nondeterministic Choice of S and T

Google History Founded in 1998 by Larry Page and Sergey Brin, two Stanford Ph.D. candidates. Privately held company, whose backers include Kleiner Perkins Caufield & Byers and Sequoia Capital. Continues to win top awards for Search Engines. Computer Scientists love it!!!

Major Partners Yahoo! Palm Nextel Netscape Cisco Systems Virgin Net Netease.com RedHat Virgilio Washingtonpost.com

Business Model The company delivers services through its own web site at and by licensing its search technology to commercial sites Advertising: –Premium Sponsorship – Purchase a keyword –AdWords – Manage your Ad text

I’d like to buy a Keyword The advertiser’s text-based ad will appear at the top of a Google results page whenever the keyword they have purchased is included in a user's search. The ads appear adjacent to, but are distinguished from, the results listings.

Category purchase Google uses a classification system to create an ongoing "Virtual Directory" of categories an advertiser can purchase. Advertisers can select the categories most appropriate to their business and Google will match the most relevant category ads to each user's search. The advantage of this approach is that it covers a broader audience that might be missed through the purchase of keywords alone.