Search Engine Basics Mr. Shaw

Presentation transcript:

1 Search Engine Basics Mr. Shaw

2 Search Engine Basics
The following is a simplified tutorial on search engine basics.
Not technically detailed or precise.
Intended for general students, not computer science majors.
Intended as an “electronic tutorial,” not as “presentation-ready” material.

3 How search engines work
The basics: “bots” and “indexing.”
Computers using sophisticated software (“bots” or “spiders”) automatically seek out and “read” webpages.
For each webpage, an “index” is created (somewhat like the index in the back of a book, but more complete).
This process is described as “indexing the webpage.”
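To make the idea of a “bot” concrete, here is a minimal sketch of how one might seek out pages by following links. It is an illustration only: the `FAKE_WEB` dictionary, its addresses, and the `crawl` function are hypothetical stand-ins, and a real bot would fetch pages over the network rather than from memory.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": address -> page HTML (an assumption for this sketch).
FAKE_WEB = {
    "a.example/home": '<a href="a.example/about">About</a> <a href="b.example/">B</a>',
    "a.example/about": '<a href="a.example/home">Home</a>',
    "b.example/": "No links here.",
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, web):
    """Visit pages breadth-first, following links, and return them in visit order."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = web.get(url)
        if html is None:  # the link points to a page we cannot "read"
            continue
        visited.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("a.example/home", FAKE_WEB))
```

Each page the bot visits would then be handed to the indexing step.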

4 Indexing a book (conceptually)
Review page 3 of Friedman’s The World is Flat (R. 3.0).
Note that “IBM” appears on p. 3. (So do “Columbus,” “Texas Instruments,” etc., so note these as well.)
Keep reviewing each page; eventually get to p. 59.
Note that “IBM” appears on that page as well.
Page 3: “Columbus,” “IBM,” “Texas Instruments,” etc.
Page 59: “Osama Bin Laden,” “Afghanistan,” “IBM,” etc.

5 Indexing a book (conceptually)
Continue the process until you get through the entire book.
Then you “flip,” or reverse, your index.
Readers don’t want to know “what’s on p. 3 and p. 59,” but, for example, “on what pages do I find ‘IBM,’ or ‘Columbus,’ or ‘Texas Instruments,’ etc.?”
While this is tedious work, human beings (or at least interns) can do it.
Most books are under 500 pages in length.

6 Indexing a book (conceptually)
When I am done going through the whole book, I then take each important term (e.g. “IBM”) and determine what pages this term is on.
This information is included in the index, found at the back of most books.
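The “flip” described for the book can be sketched in a few lines. This is a minimal illustration using the page 3 and page 59 terms from the example; the names `terms_by_page` and `flip_index` are made up for this sketch.

```python
from collections import defaultdict

# Terms noted while reviewing each page (page number -> terms), from the book example.
terms_by_page = {
    3: ["Columbus", "IBM", "Texas Instruments"],
    59: ["Osama Bin Laden", "Afghanistan", "IBM"],
}


def flip_index(terms_by_page):
    """Reverse a page -> terms mapping into a term -> pages mapping."""
    pages_by_term = defaultdict(list)
    for page in sorted(terms_by_page):  # go through the pages in order
        for term in terms_by_page[page]:
            if page not in pages_by_term[term]:  # list each page once per term
                pages_by_term[term].append(page)
    return dict(pages_by_term)


book_index = flip_index(terms_by_page)
print(book_index["IBM"])       # [3, 59]
print(book_index["Columbus"])  # [3]
```

The flipped mapping answers the reader’s actual question: “on what pages do I find ‘IBM’?”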

7 Indexing a website (conceptually)
Pages are reviewed by computers, not human beings, but the essential process is very similar.
Review a webpage. Note that “IBM” appears on this webpage.
Review another webpage. Note that “IBM” appears on this webpage.
Review a third webpage. Note that “IBM” appears on this webpage.
Eventually, a high percentage of all webpages are “read” by computers; pages where “IBM” appears are identified.

8 Indexing a website (conceptually)
Computers can “read” webpages much, much faster than human beings can.
Computers collect much more data from a webpage than a human being can.
Not just “‘IBM’ was on this page,” but also:
“IBM” appeared on this page immediately adjacent to the word “software.”
“IBM” appeared within 5 words of the word “Microsoft.”

9 Indexing a website (conceptually)
After the computers have “read” every page they can find, the index is “flipped,” much like a book index is.
The result is a “database” mapping specific terms and other information to webpages.
“Term ‘IBM’ is found on the following webpages …”
“Term ‘IBM’ is within 3 words of term ‘software’ on the following webpages …”
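One way such a “database” could be organized is sketched below: an index that records not only which pages contain a term but also where on the page it appears, so that “within 3 words” questions can be answered. The page addresses and text are invented for illustration, and “within N words” is taken here to mean word positions that differ by at most N; real search engines are far more elaborate.

```python
import re
from collections import defaultdict

# Hypothetical pages (address -> text), invented for this sketch.
pages = {
    "page1.example": "IBM software powers many banks",
    "page2.example": "Microsoft and IBM both sell software",
    "page3.example": "IBM was founded in 1911",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_positional_index(pages):
    """Map each term to the pages it appears on and its word positions there."""
    index = defaultdict(lambda: defaultdict(list))
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word][url].append(position)
    return index


def pages_with_terms_near(index, term_a, term_b, max_distance):
    """Return pages where term_a occurs within max_distance words of term_b."""
    postings_a = index.get(term_a, {})
    postings_b = index.get(term_b, {})
    hits = []
    for url in postings_a:
        if url in postings_b and any(
            abs(a - b) <= max_distance
            for a in postings_a[url]
            for b in postings_b[url]
        ):
            hits.append(url)
    return sorted(hits)


index = build_positional_index(pages)
print(sorted(index["ibm"]))                               # pages containing "IBM"
print(pages_with_terms_near(index, "ibm", "software", 3))  # "IBM" within 3 words of "software"
```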

10 Indexing
The term “indexing” is used frequently; it refers to a computer “reading” specified content and building an index.
Indexing can convert a mass of largely useless “stuff” into a very useful resource.
Example: “I would like to index all the transcripts of the CBS Evening News since its inception.”
Now I can identify every broadcast where “Watergate” was mentioned since
Example: “Google’s Desktop Search tool can index the thousands of files sitting on my hard drive.”
Now I can find that article I wrote 10 years ago.

11 How search engines work
In some cases, user-generated metadata (“data about data”) is also utilized.
E.g. “keywords” and “description” fields, which are easily added to a webpage when it is created.
E.g. of a description: “This webpage describes how search engines work.”
Metadata can be extremely useful, but it is also misused to manipulate search engines.
Example: Scammers add common search terms (“Britney Spears,” etc.) to their metadata, even if their webpages have nothing to do with Britney Spears.
The value of metadata may be greatest in controlled environments, e.g. intranets, where webpage creators can be trusted not to include misleading metadata.
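As a rough illustration of how a computer might read the “keywords” and “description” fields, the sketch below pulls them out of a sample page with Python’s standard `html.parser` module. The sample page and the `MetaTagReader` class are assumptions made for this example; only the description text comes from the tutorial.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collects the content of the "keywords" and "description" meta tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("keywords", "description"):
            self.metadata[name] = attributes.get("content") or ""


# Hypothetical webpage, invented for this sketch.
SAMPLE_PAGE = """<html><head>
<title>Search Engine Basics</title>
<meta name="keywords" content="search engines, indexing, bots">
<meta name="description" content="This webpage describes how search engines work.">
</head><body>...</body></html>"""

reader = MetaTagReader()
reader.feed(SAMPLE_PAGE)
print(reader.metadata["keywords"])
print(reader.metadata["description"])
```

Nothing in this step checks whether the keywords actually match the page’s content, which is why metadata is easy to misuse.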

12 How search engines work
When a user submits a query, the query is matched to the previously created index.
The most basic approach is to just look for similarity between the index and the search terms (keywords) contained in the query.
This was common in the early days of search.
It often fails to provide useful, relevant search results.
Modern search engines use “secret sauce” to improve results.
“Secret sauce” = sophisticated algorithms.
Google’s “secret sauce” is known as PageRank.
One trick: knowing when to ignore user-generated metadata.
Search engine optimization vs. search engine manipulation.
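The “most basic approach” can be sketched as follows: build an index of terms, then rank pages by how many of the query’s terms they contain. This is only an illustration of simple keyword matching, not of PageRank or any real engine’s algorithm; the page addresses, page text, and function names are made up.

```python
import re
from collections import defaultdict

# Hypothetical pages (address -> text), invented for this sketch.
pages = {
    "page1.example": "IBM software for business",
    "page2.example": "History of IBM",
    "page3.example": "Free software downloads",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each term to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Rank pages by the number of distinct query terms they contain."""
    matches = defaultdict(int)
    for term in set(tokenize(query)):
        for url in index.get(term, ()):
            matches[url] += 1
    # Most matching terms first; ties are broken alphabetically by address.
    return sorted(matches, key=lambda url: (-matches[url], url))


index = build_index(pages)
results = search(index, "IBM software")
print(results)  # ['page1.example', 'page2.example', 'page3.example']
```

Because this ranking only counts matching words, a page stuffed with popular terms would score well regardless of its real content, which is one reason the approach often gives poor results.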