WWW Search Engines CSC1720 – Introduction to Internet Essential Materials.


CSC1720 – Introduction to Internet. All copyrights reserved by C.C. Cheung.

Outline
Introduction
Directories, Search Engines, Metasearch Engines
Search Fundamentals
Search Strategies
How does a search engine work?
Searching Tips
Your site's ranking?
Summary

Introduction
You have probably been using search engines, but perhaps not as effectively as possible.
A lot of information is available online, but not all of it is accurate.
Web page addresses change constantly, so a page may be available only for a short time.

Search Engine History
In 1990, before the WWW, Alan Emtage created Archie, the first search tool for finding files on FTP sites.
In 1993, Veronica was developed, followed by Jughead, Wandex, …
In 1994, Galaxy, WebCrawler, Yahoo! and Lycos debuted.
In 1995 and afterwards: Excite, Infoseek, AltaVista, MetaCrawler, …
Next generation: specialized hybrids

Directories
A Web Directory or Web Guide is a hierarchical representation of hyperlinks.
The top level typically covers a wide range of very general topics.
Each topic contains hyperlinks to more specialized sub-topics.
Very easy to use.

Hierarchical Representation

Popular Directories
AOL anywhere – search.aol.com
CNET Search.com –
Excite –
E-Wild life –
Lycos –
Yahoo! –
Google –

Some Figures

Search Engines
A search engine is a computer program that does the following:
  – Allows the user to submit a query consisting of a word or phrase
  – Searches the database
  – Returns a list of suitable URLs that match the query
  – Allows the user to revise and resubmit the query

Where to submit a query?
Submit your query

Popular Search Engines
AOL anywhere – search.aol.com
AltaVista – altavista.digital.com
Excite –
HotBot –
Magellan –
Google –

Metasearch Engines
A metasearch (or all-in-one) search engine performs a search by passing the query to more than one other search engine.
Duplicate results are eliminated.
The results are ranked according to how well they match the query.
Advantage:
  – A single query can reach many search engines.
Disadvantage:
  – A high noise-to-signal ratio: many of the matches will not be suitable for you.
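The merge step of a metasearch engine can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "engines" are stand-in functions with canned results, the URLs are made up, and keeping the best score per URL is one possible ranking rule, not necessarily what any real metasearch engine does.

```python
def metasearch(query, engines):
    """Send one query to several engines and merge the results.

    `engines` maps a (hypothetical) engine name to a function that
    returns a list of (url, score) pairs, with scores from 1 to 100.
    """
    best = {}
    for name, search in engines.items():
        for url, score in search(query):
            # Eliminate duplicates: keep one entry per URL, with its best score.
            if url not in best or score > best[url]:
                best[url] = score
    # Rank by score, highest first; ties are broken alphabetically by URL.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Two stand-in engines with canned results (illustrative data only).
def engine_a(query):
    return [("http://example.com/kayak", 90), ("http://example.com/alaska", 70)]


def engine_b(query):
    return [("http://example.com/alaska", 85), ("http://example.org/rental", 40)]


results = metasearch("kayaking alaska", {"A": engine_a, "B": engine_b})
for url, score in results:
    print(score, url)
```

The page returned by both engines appears only once in the merged list, with the higher of its two scores.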

Popular Metasearch Engines
Metasearch –
Metacrawler –
MetaFind –
Dogpile –

Some Figures

White Pages / Yellow Pages
White pages allow users to look up information about individuals.
We can use white pages to track down telephone numbers and addresses.
People can abuse white pages.
Some people think that white pages are an invasion of their privacy.
Yellow pages contain information about businesses.

Popular White Pages & Yellow Pages
Bigfoot –
Yahoo! People Search – people.yahoo.com
WhoWhere –
Yahoo! Yellow Page – yp.yahoo.com
SuperPages –

Some Figures – White Pages

Some Figures – Yellow Pages

Comparison
Directory: A directory lets you explore until you eventually find what you want.
Search Engine: A search engine takes you directly to pages that contain the words or phrases you are looking for.
Directory: Use a directory to find cooking-related websites.
Search Engine: Use a search engine to find a specific recipe by providing the names of the ingredients.
Directory: Use a directory to find travel guides for a country.
Search Engine: Use a search engine to find the train schedule in South Africa.

Search Fundamentals
Example:
Header: Yahoo logo and some advertising.
Information bar: contains other hyperlinks.
Search form area: consists of a form that allows you to type a query.
Directory area: a large number of categories and channels.
Yahoo links: links to other Yahoo sites.
Footer: contains information about Yahoo, copyright and a disclaimer.

Search Fundamentals

Search Terminology
Search tool: Any means of locating information on the Internet.
Query: Information typed into the form on the search engine.
Query syntax: Rules for constructing a valid query.
Query semantics: Rules for defining the meaning of a query.
Hit/Match: A URL that the search engine returns for a specific query.
Relevancy score: A value from 1 to 100 that indicates how closely the URL matches the query.

Pattern Matching Queries
Also called fuzzy queries.
You can enter ungrammatical sentences, incomplete sentence fragments, disjoint phrases, or nonsense words.
The search engine treats the input as a collection of keywords.
Required keyword: mark with "+" before the keyword.
Prohibited keyword: mark with "-" before the keyword.
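How an engine might interpret the "+" and "-" markers can be sketched as follows. This is a rough illustration, not the parser of any real search engine: the function names are made up, and matching is done by simple substring test, so "art" would also match inside "party".

```python
import re

# An optional +/- sign followed by either a "quoted phrase" or a single word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into (required, prohibited, optional) keyword lists."""
    groups = {"+": [], "-": [], "": []}
    for sign, phrase, word in TOKEN.findall(query):
        groups[sign].append((phrase or word).lower())
    return groups["+"], groups["-"], groups[""]


def matches(text, query):
    """Return True if the page text satisfies the query."""
    required, prohibited, optional = parse_query(query)
    text = text.lower()
    if any(term not in text for term in required):
        return False  # a required keyword is missing
    if any(term in text for term in prohibited):
        return False  # a prohibited keyword is present
    # If nothing is required, at least one optional keyword must appear.
    return bool(required) or not optional or any(t in text for t in optional)


query = '+kayaking +"Prince William Sound" +Alaska -cruise'
print(parse_query(query))
print(matches("Kayaking trips in Prince William Sound, Alaska", query))
print(matches("Kayaking cruise in Prince William Sound, Alaska", query))
```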

Pattern Matching Queries

Boolean Queries
A Boolean query is a query that consists of keywords combined with logical operators (AND, OR, NOT).
X AND Y – returns URLs that contain both X and Y.
X OR Y – returns URLs that contain either X or Y.
X AND NOT Y – returns URLs that contain X and do not contain Y.
Symbols: AND - &, OR - |, NOT - !, NEAR - ~

Boolean Queries
AND is used for narrowing a query
  – If you know that your target documents will contain a group of keywords, list them using the AND operator
OR is used for broadening a query
  – If you can think of related words for a topic, list them using the OR operator
NOT is used to redirect a query
  – If you find that a keyword or phrase is leading to irrelevant hits, then represent it in your query as AND NOT keyword
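The three operators map naturally onto set operations. The sketch below assumes a toy collection of three pages (the page names and texts are invented) and a keyword-to-pages lookup; Python happens to use & and | for set intersection and union, which echoes the symbols above.

```python
# Toy page collection (illustrative data only).
docs = {
    "page1": "kayak rental in alaska",
    "page2": "alaska cruise deals",
    "page3": "kayak lessons in hawaii",
}

# Map every word to the set of pages that contain it.
index = {}
for url, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(url)


def urls(word):
    """Pages containing `word` (empty set if the word is unknown)."""
    return index.get(word, set())


both = urls("kayak") & urls("alaska")      # kayak AND alaska: narrows
either = urls("kayak") | urls("alaska")    # kayak OR alaska: broadens
without = urls("alaska") - urls("cruise")  # alaska AND NOT cruise: redirects

print(sorted(both), sorted(either), sorted(without))
```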

Boolean Queries

Using Wildcards
Wildcards are useful for retrieving variations of a word
For example, art* will search for art, artwork, artist, artistry, and so forth
An excellent way to broaden a search
Different wildcard characters are used by different search engines
The most common characters are: *, #, and ?
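Wildcard expansion can be tried out with Python's standard fnmatch module, which understands * (any run of characters) and ? (exactly one character). It does not treat # as a wildcard, so that symbol is not covered here. The vocabulary list is made up for the example.

```python
from fnmatch import fnmatchcase

# A small, invented vocabulary of indexed words.
vocabulary = ["art", "artist", "artistry", "artwork", "cart", "smart"]


def expand(pattern, words):
    """Return the words that match a wildcard pattern."""
    return [w for w in words if fnmatchcase(w, pattern)]


print(expand("art*", vocabulary))  # words that begin with "art"
print(expand("?art", vocabulary))  # exactly one character, then "art"
```

Note that art* broadens the search only to words that start with "art"; "cart" and "smart" are not matched.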

Advanced Search Options

Break Time – 10 minutes

Search Strategies
You should find a search engine that meets the following conditions:
  – A user-friendly interface
  – Easy-to-understand documentation
  – Convenient to access
  – A large indexed database
  – Good relevancy scores
Learn the syntax of this one search engine well, rather than the syntax of several different engines.

Search Generalization
Too few hits?
  – You need to generalize your search query.
Pattern matching query: eliminate one of the more specific keywords of the query.
Boolean query: remove keywords joined with the AND operator, delete the NOT item, or use the OR operator.
Use a directory or metasearch engine if you still cannot locate a matching URL.

Search Specialization
Too many hits?
  – You need to specialize your search query.
Pattern matching query: add more keywords.
Boolean query: use AND with another keyword, or add the NOT operator to exclude unwanted pages.
Try capitalizing proper nouns or names.
Use a directory to locate your information.

Sample Searches
Queries about kayaking in Alaska
Example: using Infoseek

Query                                                 No. of Hits
alaska                                                176,954
Alaska                                                176,064
+"Prince William Sound" +Alaska                       778
+kayak +"Prince William Sound" +Alaska                44
+kayaking +"Prince William Sound" +Alaska             60
+kayaking +"Prince William Sound" +Alaska +rental     20

How does it work?
User Interface – Allows you to type a query and displays the results.
Searcher – Searches the database for matches to your query.
Evaluator – Assigns scores to the retrieved information.
Gatherer – Travels the Web and collects information.
Indexer – Categorizes the data collected by the gatherer.

User Interface
Provides a mechanism for a user to submit queries to the search engine.
Uses forms, very user friendly.
The user interface displays the search results in a convenient way.
A summary of each matched page is shown.

Searcher
It is a program that uses the search engine's database to locate the matches for a specific query.
The database of a search engine holds an extremely large number of indexed pages.
A highly efficient search algorithm is necessary.
Computer scientists have spent years developing searching and sorting methods.
You can refer to computer science books for details.

Evaluator
The searcher returns a set of URLs that match your query.
Not all of the hits match your query equally well.
The more references there are to a page, the higher its ranking will be.
How is the relevancy score calculated?
  – It varies from one engine to another.
  – The number of times the word appears?
  – Do the query words appear in the title?
  – Do the query words appear in the META tag?
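To make the questions above concrete, here is one made-up scoring rule that combines word frequency, the title and the META keywords. The weights (1, 5 and 3) are arbitrary assumptions chosen for illustration; real engines use their own formulas, and the result here is a raw score rather than a value scaled to 1 to 100.

```python
def relevancy_score(query_words, title, meta_keywords, body):
    """Score a page against a query using three simple signals."""
    body_words = body.lower().split()
    title_words = title.lower().split()
    meta = [k.lower() for k in meta_keywords]
    score = 0
    for word in (w.lower() for w in query_words):
        score += body_words.count(word)           # 1 point per occurrence in the body
        score += 5 if word in title_words else 0  # bonus: word appears in the title
        score += 3 if word in meta else 0         # bonus: word appears in the META keywords
    return score


query = ["kayak", "alaska"]
score_a = relevancy_score(
    query,
    title="Alaska Kayak Rentals",
    meta_keywords=["kayak", "rental"],
    body="rent a kayak in alaska kayak tours daily",
)
score_b = relevancy_score(
    query,
    title="Travel Blog",
    meta_keywords=[],
    body="we saw a kayak in alaska",
)
print(score_a, score_b)
```

The first page scores higher because the query words appear in its title and META keywords as well as in its body.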

Link Popularity reference

Gatherer
It is a program that traverses the Web and gathers information about Web documents.
It runs at short, regular intervals.
The information it returns is indexed into the database.
Alternate names: Bot, Crawler, Robot, Spider and Worm.
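The traversal can be sketched without touching the network by using a tiny in-memory "Web". Everything in the WEB dictionary is invented for the example, and breadth-first order is just one reasonable choice; a real gatherer would fetch pages over HTTP and extract the links from the HTML.

```python
from collections import deque

# A tiny in-memory "Web": each page maps to (text, outgoing links).
WEB = {
    "a.html": ("home page", ["b.html", "c.html"]),
    "b.html": ("kayak rental", ["c.html"]),
    "c.html": ("alaska travel", ["a.html"]),
    "orphan.html": ("nobody links here", []),
}


def gather(start):
    """Follow links breadth-first from `start` and collect page text."""
    seen = {start}
    queue = deque([start])
    collected = {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        collected[url] = text
        for link in links:
            # Visit each page once, and skip links that lead nowhere.
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return collected


pages = gather("a.html")
print(sorted(pages))
```

Because no page links to orphan.html, the gatherer never reaches it.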

Spiderlist

Indexer
It organizes the data by creating a set of keys or an index.
E.g. libraries index by author, title, ISBN, etc.
Indexes need to be rebuilt frequently to ensure that the returned URLs are not out of date.
A search engine is very complex and needs to be broken down into different components.
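One common way to build such a set of keys is an inverted index, which maps each word to the pages (and word positions) where it occurs. The sketch below is a minimal version with invented page data; it is not claimed to be how any particular engine stores its index.

```python
def build_index(pages):
    """Map each word to {url: [positions of the word in that page]}."""
    index = {}
    for url, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index.setdefault(word, {}).setdefault(url, []).append(position)
    return index


# Data as it might come back from the gatherer (illustrative only).
pages = {
    "page1": "kayak rental in alaska",
    "page3": "sea kayak lessons",
}
index = build_index(pages)
print(index["kayak"])

# When a page changes or disappears, the simplest fix is to rebuild.
del pages["page3"]
index = build_index(pages)
print(index["kayak"])
```

After the rebuild, the deleted page no longer appears under any keyword, which is why stale indexes return out-of-date URLs.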

Case Study - AltaVista
AltaVista sends out crawlers (robot programs) that capture information from the Web and bring it back.
The main crawler, "Scooter", sends out many HTTP requests simultaneously, like blind users on the Web.
All of this information is stored in the indexing engine.
Scooter's cousins help to remove "dead" links.
On a typical day, Scooter visits over 10 million pages.
Web pages that no links reference will never be found.
You can also submit your URL to AltaVista.

Case Study - AltaVista
META tags – special keywords embedded in the header of the web page.
Full-text index – every word on every page is included in the search.
AltaVista uses full-text indexing.

META tag Example

Case Study - AltaVista
Limit a search to a domain
E.g. searching the "edu" domain:
+domain:edu +"molecular biophysics"
The above query searches for molecular biophysics only at educational institutions.
Here is a list of top-level Internet domains
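The effect of a domain restriction can be imitated by filtering a list of hits on the last part of the host name. This is only a sketch of the idea, with invented URLs and a hypothetical helper name; it says nothing about how AltaVista implemented domain: internally.

```python
from urllib.parse import urlparse


def in_domain(url, tld):
    """True if the URL's host name ends in the given top-level domain."""
    host = urlparse(url).hostname or ""
    return host == tld or host.endswith("." + tld)


# Invented hits for the phrase "molecular biophysics".
hits = [
    "http://www.example.edu/biophysics/",
    "http://education.example.com/molecular",
    "http://lab.example.edu/research",
]
edu_hits = [url for url in hits if in_domain(url, "edu")]
print(edu_hits)
```

Checking for ".edu" at the end of the host name, rather than anywhere in the URL, keeps a host such as education.example.com out of the results.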

Searching Tips
Be natural
  – Is cell phone harmful?
  – Ask the search engine: "Cell phone" AND harmful
Capitalize
  – Always use lowercase.
  – star will match "Star", "STAR", "stAr", …
  – Type "Star" only if you really want to search for exactly "Star".

Searching Tips
Use uncommon keywords
  – More specific results will be returned to you.
  – Think of a valid, uncommon keyword.
Require words
  – Add a "+" before the keyword.
  – It will appear in every match.
Exclude words
  – Use "-" before the keyword.
  – In what situation should we use this?

Searching Tips
Correct spelling
  – Beware of the differences between British and American spellings (colour, color) → (color OR colour)
Stop words
  – Search engines ignore the most common words ("the", "is", …).
  – For the query "searching the web", the search engine will ignore "the web".
  – Add more relevant keywords.

Searching Tips
Use wildcards
  – Use "*" in some search engines.
  – "funk*" → funk, funky, funkiest, …
Solve dead links
  – If the search engine returns a URL which is a dead link.
  – You can try
  – Or …

Factors that affect your site's ranking
Keyword prominence
Keyword frequency
Keyword weight
Keyword proximity
Keyword placement
Click popularity & Stickiness

Keyword Prominence
How early in a web page do the keywords first appear?
  – The title tag is among the first elements in an HTML page
  – What happens if your title is:
This is my homepage
Welcome to my company's homepage
Include the keywords in the head, the META tag, early in the body, …

Keyword Frequency
A search engine may determine your site's popularity by checking how frequently the keyword or phrase appears on the page.
What is the problem if you put the same keyword too many times on a single page?

Keyword Weight
It is also called keyword density.
It is measured by comparing the number of times the keyword appears on the web page with the total number of words on the page.
In most cases, we try not to exceed a keyword weight of 3 to 10 percent.
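Read as a formula, keyword weight is 100 × (occurrences of the keyword) ÷ (total words on the page). A short sketch of that calculation follows; the way the text is split into words is an assumption, and real tools may count words differently.

```python
import re


def keyword_weight(text, keyword):
    """Keyword weight (density) of a single keyword, as a percentage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)


text = "We have been selling puppy food for over fifty years"
print(keyword_weight(text, "puppy"))  # 1 occurrence in 10 words
```

Here the keyword makes up 10 percent of the page, which sits at the upper end of the range given above.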

Keyword Density reference

Keyword Proximity
"Keyword proximity" measures the placement of keywords on a web page in relation to each other.
A citation about "home loans" will outrank a citation about "home mortgage loans".
E.g.
  – Smith Brothers Inc has been selling puppy food for over 50 years.
  – Smith Brothers Inc has been selling food for your puppies for over 50 years.
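One simple way to put a number on proximity is to count how many words apart the two keywords are. The sketch below uses the two example sentences; matching on word stems ("pupp" for puppy and puppies) is a shortcut assumed here so that both sentences can be compared, and digits are ignored when splitting into words.

```python
import re


def keyword_distance(text, stem_a, stem_b):
    """Smallest distance, in words, between words starting with the two stems."""
    words = re.findall(r"[a-z']+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w.startswith(stem_a)]
    pos_b = [i for i, w in enumerate(words) if w.startswith(stem_b)]
    if not pos_a or not pos_b:
        return None  # one of the keywords does not appear at all
    return min(abs(i - j) for i in pos_a for j in pos_b)


close = "Smith Brothers Inc has been selling puppy food for over 50 years."
far = "Smith Brothers Inc has been selling food for your puppies for over 50 years."
print(keyword_distance(close, "pupp", "food"))
print(keyword_distance(far, "pupp", "food"))
```

A smaller distance means the keywords sit closer together, so the first sentence would be favored for the query "puppy food".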

Keyword Placement
Search engines favor web sites that contain keywords in:
  – The title tag
  – The keyword META tag
  – The headline tag …
  – The first 25 words of body
  – Hyperlinks
  – Image tags
  – Text near the end of the document

Click popularity & Stickiness
Click popularity is a measure of the number of clicks received by each site in a search engine's results page.
Stickiness is a measure of the amount of time a user spends at a site. It's calculated according to the time that elapses between each of the user's clicks on the search engine's results page.
Reference:

Submit your site to search engines
Google – 5 pages/day, Excite – 25 pages/week

Summary
Use different resources to search for different kinds of information.
Use successive query refinement to arrive at effective search queries.
Think carefully about the keywords you type into the search engine.
Use Boolean queries when you need combinations of keywords.
Think carefully when you create your own homepage: can it be easily indexed by search engines?

References
searchenginewatch.com
Information retrieval
Search Engine Positioning – Fredrick Marckini (Wordware Publishing Inc.)

The End.
Thank you for your patience!