Published by Logan Hawkins. Modified over 9 years ago.
1
Computer Information Technology – Section 3-2
2
The Internet
Objectives: The student will:
1. Understand search engines and how they work.
2. Understand the pros and cons of various popular search engines.
3. Understand the definitions of terms associated with search engines.
4. Perform a basic search and compare results from different search engines.
3
How does a search work?
Google gives a quick tour of how a search works: http://www.google.com/intl/en/insidesearch/howsearchworks/thestory/index.html
4
Search Engines
Search Engine: A program that searches documents for specified keywords and returns a list of the documents where the keywords were found. Without search engines, you would have a very hard time finding anything on the web.
Typically, a search engine works by sending out a spider to fetch as many documents as possible.
Spider: A program that automatically fetches Web pages. Spiders are used to feed pages to search engines. It's called a spider because it crawls over the Web. Another term for these programs is web crawler.
Because most Web pages contain links to other pages, a spider can start almost anywhere. As soon as it sees a link to another page, it goes off and fetches it.
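The link-following behavior of a spider can be sketched in a few lines of Python. This is only an illustration: the tiny "web" below is a made-up dictionary (the page names and `TOY_WEB` are assumptions, not real sites), and a real spider would download pages over the network instead of looking them up.

```python
from collections import deque

# Hypothetical miniature "web": each page name maps to the pages it links to.
TOY_WEB = {
    "home": ["news", "sports"],
    "news": ["home", "weather"],
    "sports": ["news"],
    "weather": [],
}

def crawl(start, web):
    """Fetch pages one at a time, following every new link that is seen."""
    fetched = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        fetched.append(page)            # stands in for downloading the page
        for link in web.get(page, []):
            if link not in seen:        # do not fetch the same page twice
                seen.add(link)
                queue.append(link)
    return fetched

print(crawl("home", TOY_WEB))  # ['home', 'news', 'sports', 'weather']
```

Starting from "sports" instead of "home" still reaches all four pages, which is the sense in which a spider "can start almost anywhere."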
5
Search Engines
Spiders or crawlers visit a Web site, read the information on the site itself, read the site's meta tags, and follow the links the site connects to, indexing all the linked Web sites as well.
Meta tag: A special HTML tag that provides information about a Web page. You can’t see meta tags on the web page itself. They provide information such as who created the page, how often it is updated, what the page is about, and which keywords represent the page's content.
The crawler returns all that information to a central depository, where the data is indexed. This index is the data the search engine searches! Because the index is a stored copy rather than the live Web, search engines sometimes return links that are no longer valid.
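To see what a crawler might read from meta tags, here is a small sketch using Python's built-in `html.parser` module. The sample page, the author name, and the `MetaTagReader` class are made up for illustration; real crawlers are far more involved.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Collect the name/content pairs found in <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            name = attrs.get("name")
            content = attrs.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

# Hypothetical page source; visitors see only the text inside <body>.
SAMPLE_PAGE = """
<html><head>
<title>Hancock High School</title>
<meta name="author" content="Example Author">
<meta name="description" content="A sample school home page">
<meta name="keywords" content="school, sports, calendar">
</head><body><p>Welcome!</p></body></html>
"""

reader = MetaTagReader()
reader.feed(SAMPLE_PAGE)
print(reader.meta["keywords"])  # school, sports, calendar
```

The browser shows only "Welcome!", while the reader also picks up the author, description, and keywords.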
6
Search Engines
Crawlers rely entirely on links from other web pages, so if no other page ever links to a web page, search engine spiders cannot find it.
Crawlers return to web pages periodically to update the database.
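The "never linked to" problem can be shown with a short sketch. The link table below is made up (`LINKS` and its page names are assumptions): the page "secret" links out to "home", but nothing links in to it, so a crawler that starts at "home" never sees it.

```python
# Hypothetical link table: page -> pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["about"],
    "secret": ["home"],   # links out, but no page links to it
}

def reachable_from(start, links):
    """Return every page a link-following crawler can reach from start."""
    found, todo = set(), [start]
    while todo:
        page = todo.pop()
        if page not in found:
            found.add(page)
            todo.extend(links.get(page, []))
    return found

found = reachable_from("home", LINKS)
missed = sorted(set(LINKS) - found)
print(missed)  # ['secret']
```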
7
Search Engines – Why they give different results
Not all indices are exactly the same. An index depends on what the spiders find (or what humans submitted).
Not every search engine uses the same algorithm to search through its index. The algorithm is what a search engine uses to determine how relevant the information in the index is to what the user is searching for.
Algorithm: A formula or set of steps for solving a particular problem.
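A small sketch can show how two algorithms give different results from the same index. Both scoring rules below are invented for illustration and are not the method of any real search engine; the three pages in `INDEX` are made up as well.

```python
# Hypothetical index: page name -> the text stored for that page.
INDEX = {
    "page_a": "school sports school calendar school",
    "page_b": "school sports",
    "page_c": "sports scores and sports news",
}

def rank_by_term_count(query, index):
    """Algorithm 1: score = how many times the query words appear."""
    words = query.lower().split()
    scores = {page: sum(text.split().count(w) for w in words)
              for page, text in index.items()}
    hits = [page for page in scores if scores[page] > 0]
    return sorted(hits, key=lambda p: (-scores[p], p))

def rank_by_term_share(query, index):
    """Algorithm 2: score = fraction of the page's words that match."""
    words = set(query.lower().split())
    scores = {}
    for page, text in index.items():
        tokens = text.split()
        scores[page] = sum(t in words for t in tokens) / len(tokens)
    hits = [page for page in scores if scores[page] > 0]
    return sorted(hits, key=lambda p: (-scores[p], p))

print(rank_by_term_count("school sports", INDEX))  # ['page_a', 'page_b', 'page_c']
print(rank_by_term_share("school sports", INDEX))  # ['page_b', 'page_a', 'page_c']
```

Same index, same query, different first result: the first rule rewards pages that repeat the words, the second rewards pages that are mostly about them.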
8
Search Engines – Why they give different results
Google has one of the largest databases, but studies indicate that less than ½ of the searchable web is searchable in Google. Studies also show that more than 80% of the pages in a major search engine's database exist only in that database.
When doing research, try different search engines!
9
Search Engines – Comparison
Search engines compared: Google, Yahoo, Ask.com
Size, type
  Google: HUGE. Size not disclosed in any way that allows comparison. Probably the biggest.
  Yahoo: HUGE. Claims over 20 billion total "web objects."
  Ask.com: LARGE. Claims to have 2 billion fully indexed, searchable pages.
Noteworthy features
  Google: Many additional databases including Book Search, Scholar (journal articles), Blog Search, Patents, Images, etc.
  Yahoo: Shortcuts give quick access to dictionary, synonyms, patents, traffic, stocks, encyclopedia, and more.
  Ask.com: (no entry)
Boolean logic
  Google: Partial. AND assumed between words. Capitalize OR. ( ) accepted but not required. In Advanced Search, partial Boolean available in boxes.
  Yahoo: Accepts AND, OR, NOT or AND NOT. Must be capitalized. ( ) accepted but not required.
  Ask.com: Partial. AND assumed between words. Capitalize OR. - excludes. No ( ) or nesting.
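The "partial Boolean" rules in the comparison (AND assumed between words, capitalized OR, - excludes) can be tried out on a toy index. This is a simplified sketch under those stated rules only, not how any of these engines is actually implemented; the pages in `PAGES` and the helper names are made up.

```python
# Hypothetical pages: name -> text.
PAGES = {
    "p1": "cheap airline tickets",
    "p2": "airlines and hotels",
    "p3": "cheap hotels",
    "p4": "cheap airlines tickets refund",
}

def build_index(pages):
    """Build an inverted index: word -> set of pages containing it."""
    index = {}
    for page, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(page)
    return index

def search(query, index):
    """AND is assumed between words, capitalized OR joins two words, -word excludes."""
    all_pages = set().union(*index.values())
    required_groups = []      # each group is a set of pages; all groups must match
    excluded = set()
    join_with_previous = False
    for token in query.split():
        if token == "OR":     # only the capitalized form counts as an operator
            join_with_previous = True
            continue
        if token.startswith("-") and len(token) > 1:
            excluded |= index.get(token[1:].lower(), set())
        else:
            matches = index.get(token.lower(), set())
            if join_with_previous and required_groups:
                required_groups[-1] |= matches
            else:
                required_groups.append(set(matches))
        join_with_previous = False
    result = all_pages
    for group in required_groups:
        result = result & group
    return sorted(result - excluded)

index = build_index(PAGES)
print(search("cheap airline OR airlines", index))  # ['p1', 'p4']
print(search("cheap -tickets", index))             # ['p3']
```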
10
Search Engines – Comparison
Search engines compared: Google, Yahoo, Ask.com
+ Requires / - Excludes
  Google: - excludes. + will allow you to retrieve “stop words” (e.g., +in).
  Yahoo: - excludes. + will allow you to search common words: "+in truth".
  Ask.com: - excludes. + will allow you to retrieve “stop words” (e.g., +in).
Sub-Searching
  The search box at the top of the results page shows your current search. Modify this (e.g., add more terms at the end).
Results Ranking
  Google: Based on page popularity measured in links to it from other pages: high rank if a lot of other pages link to it. Matching and ranking based on "cached" version of pages that may not be the most recent version.
  Yahoo: Documents with all terms are ranked first, followed by documents containing any terms. The farther down, the fewer the terms, although at least one should always be present.
  Ask.com: Based on Subject-Specific Popularity™, links to a page by related pages.
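Ranking by "page popularity measured in links to it from other pages" can be approximated by simply counting inbound links. The sketch below does only that; it is a deliberate simplification and not Google's actual ranking method, and the link table `LINKS` is made up.

```python
from collections import Counter

# Hypothetical link table: page -> pages it links to.
LINKS = {
    "a": ["c", "d"],
    "b": ["c"],
    "c": ["d"],
    "d": ["c"],
    "e": ["c", "a"],
}

def inbound_link_counts(links):
    """Count how many other pages link to each page."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):      # count each linking page once
            if target != source:         # ignore links from a page to itself
                counts[target] += 1
    return counts

def rank_by_popularity(links):
    """Most-linked-to pages first; ties are broken alphabetically."""
    counts = inbound_link_counts(links)
    return sorted(counts, key=lambda page: (-counts[page], page))

print(rank_by_popularity(LINKS))  # ['c', 'd', 'a', 'b', 'e']
```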
11
Search Engines – Comparison
Search engines compared: Google, Yahoo, Ask.com
Truncation, Stemming
  Google: No truncation. Stems some words. Search variant endings and synonyms separately, separating with OR (capitalized): airline OR airlines
  Yahoo: Neither. Search with OR as in Google.
  Ask.com: (no entry)
12
Search Engines – Search Results
Searching for “Hancock High School”:
Google: About 32,000,000 results
Yahoo: 4,250,000 results
Ask.com: Doesn’t tell you.
Bing.com: 4,120,000 results
13
Search Engines – Search Results
17
Search Engines – Wrap-Up
Terms you should know:
1. Search Engine: A program that searches documents for specified keywords.
2. Spider or Crawler: A program that automatically fetches Web pages.
3. Meta tag: A special HTML tag that provides information about a Web page.
4. Algorithm: A formula or set of steps for solving a particular problem.
18
Search Engines – Assignment
Before you leave today…
Pick a topic of interest to you. IT MUST BE APPROPRIATE FOR SCHOOL!
Pick 3 search engines (Google, Yahoo, Altavista.com, Ask.com, www.alltheweb.com, bing.com, www.askjeeves.com, lycos.com).
Do a search on your topic.
On the paper put:
1. Your name and the period.
2. Your topic.
3. How many web sites each search engine finds.
4. Whether any of the top 10 sites are the same between the different search engines (circle the sites that are on all 3 lists).
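Step 4 of the assignment, finding the sites that appear on all three lists, is a set intersection. The sketch below shows the idea with made-up placeholder results (the engine labels and `.example` site names are assumptions, shortened to three sites each); the same function works on full top-10 lists.

```python
def shared_by_all(result_lists):
    """Return the sites found in every engine's list, in the first list's order."""
    lists = list(result_lists.values())
    common = set(lists[0]).intersection(*lists[1:])
    return [site for site in lists[0] if site in common]

# Hypothetical top results from three engines.
RESULTS = {
    "Engine A": ["site1.example", "site2.example", "site3.example"],
    "Engine B": ["site3.example", "site1.example", "site4.example"],
    "Engine C": ["site1.example", "site5.example", "site3.example"],
}

print(shared_by_all(RESULTS))  # ['site1.example', 'site3.example']
```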