How to hit in Google: The anatomy of a modern web search engine
Gregor Gisler-Merz, 23.07.2003


2 Content:
- Why do we need search engines (slide 3)
- Design goals of a search engine (slide 3)
- What are the benefits of a basic Web Search Engine knowledge? (slide 4)
- System Anatomy: Google Architecture Overview (slide 5)
- Searching (slide 6)
- How do I practically benefit from the new insights: Search tips (slide 7)
- How do I get listed in Google (slide 7)
- References (slide 8)

3 Why do we need search engines:
- The amount of information is growing rapidly: over 3 billion documents indexed so far and over 150 million queries per day.
- Human-maintained indices do not cover every topic and are expensive to build and maintain.
- Automated search engines that rely on keyword matching usually return too many low-quality matches.
- Many advertisers take measures to mislead automated search engines.

Design goals of a search engine:
- Improve search quality
- Ease of use
- Novel research activities on large-scale web data

4 What are the benefits of basic web search engine knowledge?
- Know what you can expect from your searches.
- Get your own web site listed.
- Build a reasonable intranet search engine.
- Improve the search infrastructure in your own applications.

5 Google Architecture Overview:
- Most of Google is implemented in C/C++.
- Web pages are downloaded by several distributed web crawlers.
- Every stored web page has an associated ID (docID).
- The indexer reads the repository, uncompresses the documents, and parses them.
- Parsing/scanning is done by a lexical analyzer (generated with flex).
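The indexing step above can be illustrated with a small sketch. It is not Google's implementation (which the slide describes as C/C++ with a flex-generated scanner). It is a minimal Python illustration that assumes a regular-expression tokenizer in place of the lexical analyzer and keeps everything in memory. The names `Indexer`, `tokenize`, `lexicon` and `inverted` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Stand-in for the lexical analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


class Indexer:
    """Assigns docIDs and wordIDs and records word positions per document."""

    def __init__(self):
        self.lexicon = {}                   # word -> wordID
        self.inverted = defaultdict(dict)   # wordID -> {docID: [positions]}
        self.next_doc_id = 0

    def word_id(self, word):
        # A new word receives the next free wordID.
        return self.lexicon.setdefault(word, len(self.lexicon))

    def add_document(self, text):
        # Every stored page gets its own docID.
        doc_id = self.next_doc_id
        self.next_doc_id += 1
        for position, word in enumerate(tokenize(text)):
            hits = self.inverted[self.word_id(word)]
            hits.setdefault(doc_id, []).append(position)
        return doc_id


indexer = Indexer()
first = indexer.add_document("Google is a search engine")
second = indexer.add_document("A search engine crawls the web")
```

After these two calls, looking up the wordID of "search" in `indexer.inverted` gives the documents that contain the word together with the positions where it occurs.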

6 Searching:
The Google query evaluation:
1. Parse the query.
2. Convert words into wordIDs.
3. Seek to the start of the doclist in the short barrel for every word.
4. Scan through the doclists until there is a document that matches all the search terms.
5. Compute the rank of that document for the query.
6. If we are in the short barrels and at the end of any doclist, seek to the start of the doclist in the full barrel for every word and go to step 4.
7. If we are not at the end of any doclist, go to step 4.
8. Sort the documents that have matched by rank and return the top k.

The ranking system includes hitlists, anchor text, and PageRank. Google always tries to balance these factors. PageRank is backed by a lot of mathematics (graph theory, linear algebra, and so on).
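The query evaluation steps can be sketched roughly as follows. This is a simplified Python illustration, not Google's code. It assumes that each barrel is a dictionary from wordID to a sorted list of docIDs and that a caller-supplied `rank` function stands in for the real ranking (hitlists, anchor text, PageRank). All names (`matching_docs`, `evaluate_query`, `short_barrel`, `full_barrel`) are hypothetical.

```python
import heapq


def matching_docs(doclists):
    """Yield docIDs that occur in every sorted doclist (step 4)."""
    if not doclists:
        return
    pos = [0] * len(doclists)
    # Stop as soon as any doclist is exhausted (steps 6 and 7).
    while all(p < len(d) for p, d in zip(pos, doclists)):
        current = [d[p] for p, d in zip(pos, doclists)]
        target = max(current)
        if all(c == target for c in current):
            yield target
            pos = [p + 1 for p in pos]
        else:
            # Advance only the lists that are behind the largest docID.
            pos = [p + 1 if c < target else p for p, c in zip(pos, current)]


def evaluate_query(query, lexicon, short_barrel, full_barrel, rank, k=10):
    words = query.lower().split()                 # step 1: parse the query
    if not words or any(w not in lexicon for w in words):
        return []                                 # an unknown word matches nothing
    word_ids = [lexicon[w] for w in words]        # step 2: words -> wordIDs
    scores = {}
    for barrel in (short_barrel, full_barrel):    # steps 3 and 6
        doclists = [barrel.get(w, []) for w in word_ids]
        for doc_id in matching_docs(doclists):
            if doc_id not in scores:
                scores[doc_id] = rank(doc_id, word_ids)   # step 5
    # Step 8: sort the matches by rank and return the top k.
    return heapq.nlargest(k, scores, key=scores.get)


lexicon = {"web": 0, "search": 1}
short_barrel = {0: [1, 4], 1: [4, 7]}
full_barrel = {0: [1, 2, 4, 9], 1: [2, 4, 7, 9]}
page_scores = {2: 0.5, 4: 0.9, 9: 0.1}            # made-up rank values
top = evaluate_query("web search", lexicon, short_barrel, full_barrel,
                     lambda doc_id, word_ids: page_scores[doc_id], k=2)
```

The sketch always continues into the full barrel after the short barrel, as the steps on the slide read. The rank values in the usage example are invented for illustration only.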

7 Search tips:
- Make your search as specific as you can.
- Use exact phrases: “Säuliämtler Seifenkistenrennen”
- Force a term to be included, stop words as well, with the + operator: +Zürich
- Exclude unwanted words with the - operator.

How do I get listed in Google?
- Choose the correct keywords for your site and raise the keyword density.
- Place your most important keyword phrase toward the beginning of the title tag.
- Use description and keyword meta tags.
- Use header tags.
- Incorporate keywords in the alt tag of your images and place keywords in page links.
- Create a site map and a contact page.
- Put only quality content on your site (250-300 words per page).
- Create only one doorway page per keyword.
- Do not use hidden text.
- Repair broken links.
- Be careful with FRAMES: add plenty of keyword-rich text to the NOFRAMES tag.
- Get reciprocal links and cross-link your site (if possible).

Now get your web site listed in the major search engines and get a good ranking!
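The slide does not define keyword density. A common working definition, assumed here, is the share of a page's words that belong to occurrences of the keyword phrase. The function name `keyword_density` and the sample text are hypothetical. This Python sketch only shows how such a figure could be checked for a page.

```python
import re


def keyword_density(text, phrase):
    """Fraction of the words in text that belong to occurrences of phrase."""
    words = re.findall(r"\w+", text.lower())
    target = re.findall(r"\w+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase starts.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


page_text = "Soap box race in Zürich: the soap box race for everyone"
density = keyword_density(page_text, "soap box race")
```

The same word list also gives the word count of a page, which can be compared against the 250-300 words per page suggested above.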

8 References:
Submission pages:
- google: http://www.google.com/addurl.html
- altavista: http://www.altavista.com/addurl.html
- alltheweb: http://www.alltheweb.com/add_url.php

Tips for getting listed:
- http://www.totalsubmission.co.uk
- http://www.amigos.org

Further reading:
- PageRank Uncovered: http://www.supportforums.org/PageRank.pdf
- PageRank Computation and the Structure of the Web: Experiments and Algorithms: http://www2002.org/CDROM/poster/173.pdf
- The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm
- flex scanner generator: http://www.gnu.org/software/flex/flex.html

