CS 115: COMPUTING FOR THE SOCIO-TECHNO WEB FINDING INFORMATION WITH SEARCH ENGINES.



SEARCHING VS SURFING Search = employing a search engine to find information. Surfing (or navigating) = employing a link-following strategy to find information. The web encourages a combination of search and navigation.

SURFING THE WEB

THE WEB IS A DIRECTED GRAPH Like a map of a country with cities and one-way roads. Directed graph of nodes and arcs (one-way connections). Nodes = web pages. Arcs = hyperlinks from one page to another. Why is this cool? Because it can be explored and it can be indexed.
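One way to make the directed-graph picture concrete is an adjacency table. The sketch below is only illustrative: the five page names and their links are invented, and `reachable` is a hypothetical helper showing what "it can be explored" means when links are one-way.

```python
# Hypothetical five-page web: each key is a page (node) and each list holds
# the pages it links to (arcs). Links are one-way, like one-way roads.
web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
    "E": [],
}


def reachable(graph, start):
    """Return the set of pages reachable from start by following links."""
    seen = {start}
    stack = [start]
    while stack:
        page = stack.pop()
        for target in graph.get(page, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


print(sorted(reachable(web, "A")))  # pages you can surf to from A
print(sorted(reachable(web, "D")))  # D links into the A-B-C cycle
```

In this toy graph no page links to D, so D can reach A, B and C while none of them can reach D. That asymmetry is what distinguishes a directed graph from an ordinary road map.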

GOOGLE

HOW SEARCH ENGINES WORK [Figure: the Web (pages A, B, C, D, E), Web spider, Indexer, Indexes]

HOW SEARCH ENGINES WORK [Figure: the Web (pages A, B, C, D, E), Web spider, Indexer, Indexes, Ad indexes, Search, User]

BINARY SEARCH Halve things each time
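A minimal sketch of "halve things each time", assuming the items are already sorted. The function name `binary_search` and the sample term list are illustrative and not taken from the slides.

```python
def binary_search(sorted_items, target):
    """Return (index, steps); index is -1 when target is absent.

    Each comparison discards half of the remaining range.
    """
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1, steps


# Hypothetical sorted list of index terms.
terms = ["boston", "camera", "mars", "nikon", "rental", "weather", "wellesley"]
print(binary_search(terms, "mars"))
print(binary_search(terms, "google"))
```

Because the range halves on every step, a sorted list of about a thousand entries needs at most 11 comparisons, and doubling the list adds only one more.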

MECHANICS OF A TYPICAL SEARCH

WEB DIRECTORIES ORGANIZE INFORMATION IN CATEGORIES WITH HUMAN HELP

WHAT ARE YOU TRYING TO FIND? Types of queries: Informational (want to learn about something). Navigational (want to go to that page). Transactional (want to do something web-mediated): access a service, downloads, shop. Gray areas: find a good hub (resource collection); exploratory search, “see what’s there”. Example queries: Peripheral neuropathy; Wellesley College; Wellesley weather; Mars surface images; Nikon SLR camera; car rental Boston; morality of abortion.

HOW FAR DO YOU LOOK FOR RESULTS?

DIVERSITY IN CONTENT Languages: hundreds of languages (2001). Home pages (1997): English 82%, next 15 languages 13%. Google’s index (mid 2001): English 53%, JGCFSKRIP 30%. This trend is expected to continue.
Popular query topics (from 1 million Google queries, Apr 2000):
1.8%  Regional: Europe            7.2%  Business
…                                 …
2.3%  Business: Industries        7.3%  Recreation
3.2%  Computers: Internet         8%    Adult
3.4%  Computers: Software         8.7%  Society
4.4%  Adult: Image Galleries      10.3% Regional
5.3%  Regional: North America     13.8% Computers
6.1%  Arts: Music                 14.6% Arts

QUESTIONS ABOUT THE WEB How big is the Web? How many people use the Web? How many people use search engines? How hard is it to go from one page to another through clicks? What is the shape of the Web?

HOW BIG IS THE WEB? Number of accessible web pages (the visible web): Google claims to have encountered 1 trillion unique URLs (though in the past it claimed to have indexed 26.6 billion pages). Yahoo claims to have indexed 55 billion pages. Cuil claims to have indexed 120 billion pages. The deep web (or hidden or invisible web) “contains times more information”. Coverage (i.e. the proportion of the web indexed) is crucial for search engines. Today, fewer than 15% of pages are indexed!

CAN YOU MEASURE THE SIZE OF THE WEB? How do you count fish in a lake? Lincoln-Petersen method, aka the capture-mark-recapture method. M = # of fish captured, marked, and released in the first visit. C = # of fish captured in the second visit. R = # of marked fish among those captured in the second visit. N = total # of fish in the lake (the unknown). Estimate: M/N = R/C => N = (M x C) / R
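The estimate can be checked with a few lines of arithmetic. This is a sketch only: the function name `lincoln_petersen` and the fish counts are made up for illustration.

```python
def lincoln_petersen(marked, caught, recaptured):
    """Estimate population size N from M/N = R/C, i.e. N = (M x C) / R."""
    if recaptured <= 0:
        raise ValueError("need at least one marked fish in the second catch")
    return marked * caught / recaptured


# Hypothetical counts: mark 100 fish, later catch 80, of which 20 are marked.
# A quarter of the second catch is marked, so the 100 marked fish should be
# about a quarter of the lake.
print(lincoln_petersen(marked=100, caught=80, recaptured=20))  # 400.0
```

The estimate assumes marked fish mix evenly with the rest between visits; if they do not, the ratio R/C no longer reflects M/N.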

CAN YOU MEASURE THE SIZE OF THE WEB? Capture-mark-recapture method. SE1 = # of pages indexed by search engine 1. QSE2 = # of pages returned by search engine 2 for typical queries. OVR = # of pages returned by both search engines for typical queries. WWW = total # of pages on the web (the unknown). Estimate: SE1 / WWW = OVR / QSE2 => WWW = (SE1 x QSE2) / OVR
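The same ratio can be sketched with sets of page identifiers standing in for the two search engines. Everything here is hypothetical: real engines do not expose their indexes this way, and the page names are invented so the arithmetic is easy to follow.

```python
def estimate_web_size(se1_index, se2_results):
    """Estimate WWW = (SE1 x QSE2) / OVR from two sets of page identifiers."""
    overlap = se1_index & se2_results      # OVR: pages known to both engines
    if not overlap:
        raise ValueError("no overlap, so the ratio cannot be formed")
    return len(se1_index) * len(se2_results) / len(overlap)


# Hypothetical data: engine 1 indexes pages p0..p39; typical queries on
# engine 2 return pages p30..p49. Ten pages (p30..p39) appear in both.
se1_index = {f"p{i}" for i in range(40)}
se2_results = {f"p{i}" for i in range(30, 50)}

print(estimate_web_size(se1_index, se2_results))  # 40 * 20 / 10 = 80.0
```

Here engine 1 plays the role of the marked fish and engine 2's results play the role of the second catch.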

HOW MANY PEOPLE USE THE WEB? 87% of Americans are online. 70% of Americans use broadband at home. 68.0% of Americans access the internet on a cell phone, tablet, or other mobile device. 39% of the world had internet access in 2013. What does this tell you about the importance of the Web?

HOW MANY PEOPLE USE SEARCH ENGINES? 49% of all internet users use a search engine on a daily basis. 622 million queries per day (18.6 billion searches in April 2014). Search engine usage as of June 2004: Google (41.6%), Yahoo! (31.5%), MSN (27.4%), AOL (13.6%), Ask (7%). Search engine usage as of April 2014: Google (67.6%), Yahoo! (10%), MSN (18.7%), AOL (1.3%), Ask (2.4%). What does this tell you about the importance of search engines?

HOW HARD IS IT TO SURF FROM ONE PAGE TO ANOTHER? Over 75% of the time there is no directed path from one random web page to another. When a directed path exists its average length is 16 clicks. Short average path between pairs of nodes is characteristic of a small-world network.
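Both observations, that a directed path often does not exist and that it is short when it does, can be explored on a toy graph with breadth-first search. The graph and the helper name `clicks_between` are assumptions made for this sketch.

```python
from collections import deque


def clicks_between(graph, source, target):
    """Return the fewest clicks from source to target, or None if no path."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, []):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                if nxt == target:
                    return dist[nxt]
                queue.append(nxt)
    return None  # target is not reachable by following links


# Hypothetical web: A -> B -> C -> A form a cycle, D links in, E is isolated.
toy_web = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"], "E": []}

print(clicks_between(toy_web, "A", "C"))  # 2
print(clicks_between(toy_web, "D", "C"))  # 3
print(clicks_between(toy_web, "C", "D"))  # None: nothing links to D
```

Running the helper over many random page pairs and averaging the non-None results is one way to measure an average path length like the one quoted above.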

WHAT IS THE SHAPE OF THE WEB? “Map of the Internet” (1998)

WHAT DOES THE WEB LOOK LIKE? BOW-TIE SHAPE OF THE WEB

A CONSTRUCTIVE ALGORITHM TO PROVE THAT THE WEB IS A BOWTIE Start with disconnected Web pages. Examine the shape after 1 link per page is considered. The bowtie appears after the 2nd link per page is considered. After that, the bowtie shape gets stronger.

AFTER ONE LINK IS CONSIDERED A collection of pseudo-trees

AFTER A SECOND LINK IS CONSIDERED A collection of bowties

WHEN MORE LINKS ARE INCLUDED… Consider the combinations of links: within the same bowtie, and between bowties.

CORRECT THE SHAPE OF THE WEB Bowties are everywhere!

EXERCISES 1-Draw a web graph of the course class website. 2-Handout

HOW ABOUT THE CLASS WEB? [Figure label: crawling starting point]

CAN SE’S COVER ALL THE WEB? [Figure label: crawling starting point] Put a starting Web page in a queue Q and repeat: pick up a page P from the queue, crawl P, and put on the queue each page reachable from P.
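The queue-based loop above can be sketched as follows. Two things are assumptions added for this sketch: the `seen` set, which the slide does not mention but without which pages that link to each other would be queued forever, and the in-memory `class_web` table, which stands in for actually fetching pages over the network.

```python
from collections import deque


def crawl(start, get_links):
    """Crawl from start, returning pages in the order they were visited.

    get_links(page) must return the pages that `page` links to.
    """
    queue = deque([start])
    seen = {start}          # assumption: skip pages already queued
    visited = []
    while queue:
        page = queue.popleft()          # pick up a page P from the queue
        visited.append(page)            # "crawl" P
        for link in get_links(page):    # enqueue each page reachable from P
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical class website; "orphan" links to home but nothing links to it.
class_web = {
    "home": ["syllabus", "labs"],
    "syllabus": ["home"],
    "labs": ["lab1"],
    "lab1": [],
    "orphan": ["home"],
}

print(crawl("home", lambda page: class_web.get(page, [])))
```

The crawl never discovers `orphan`, which suggests one answer to the slide's question: a spider only covers pages reachable from its starting points.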