How does Google search for everything? Searching For and Organizing Data Prof. Susan Rodger Computer Science Dept Duke University Oct. 31, 2014.

Presentation transcript:

How does Google search for everything? Searching For and Organizing Data Prof. Susan Rodger Computer Science Dept Duke University Oct. 31, 2014

Lots of names. How do we find someone?
Anderson, Mary: 203 Main St., Durham, NC
Williams, Fred: 14 Union Circle, Cary, NC
Wu, Xin: 57 Wilson Court, Raleigh, NC
Smith, Doug: 18 Pine Cone Lane, Durham, NC
Pratt, Sarah: 6 White Lane, Hillsborough, NC
Chase, Angela: 34 Dogwood Road, Durham, NC
Brooks, Bolton: 10 Time St., Durham, NC
French, Melvin: 42 Starship Circle, Durham, NC
Gao, Bo: 134 Brookside Lane, Durham, NC

Put the names in alphabetical order
Anderson, Mary: 203 Main St., Durham, NC
Brooks, Bolton: 10 Time St., Durham, NC
Chase, Angela: 34 Dogwood Road, Durham, NC
French, Melvin: 42 Starship Circle, Durham, NC
Gao, Bo: 134 Brookside Lane, Durham, NC
Pratt, Sarah: 6 White Lane, Hillsborough, NC
Smith, Doug: 18 Pine Cone Lane, Durham, NC
Williams, Fred: 14 Union Circle, Cary, NC
Wu, Xin: 57 Wilson Court, Raleigh, NC

Find Narten
Anderson, Applegate, Bethune, Brooks, Carter, Edwards, Foggle, Griffin, Holhouser, Jefferson, Klatchy, Morgan, Munson, Narten, Oliven, Parken, Rivers, Roberts, Stevenson, Thomas, Wilson, Woodrow, Yarbrow
Three guesses miss (X X X), then: Found!
How many words did we look at?
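The search on this slide can be sketched as a binary search that counts how many names it looks at. This is a minimal sketch, not the slide's own procedure: the function name `binary_search`, the `looks` counter, and the choice of midpoint (the middle of a half-open range) are assumptions made for illustration.

```python
def binary_search(names, target):
    """Return (index, looks) for target in a sorted list, or (-1, looks)."""
    lo, hi = 0, len(names)          # search the half-open range [lo, hi)
    looks = 0
    while lo < hi:
        mid = (lo + hi) // 2
        looks += 1                  # we looked at names[mid]
        if names[mid] == target:
            return mid, looks
        if names[mid] < target:
            lo = mid + 1            # target can only come after mid
        else:
            hi = mid                # target can only come before mid
    return -1, looks


# The 23 names from the slide, already in alphabetical order.
NAMES = [
    "Anderson", "Applegate", "Bethune", "Brooks", "Carter", "Edwards",
    "Foggle", "Griffin", "Holhouser", "Jefferson", "Klatchy", "Morgan",
    "Munson", "Narten", "Oliven", "Parken", "Rivers", "Roberts",
    "Stevenson", "Thomas", "Wilson", "Woodrow", "Yarbrow",
]

index, looks = binary_search(NAMES, "Narten")
print(index, looks)  # 13 4
```

With this midpoint rule the search looks at Morgan, Roberts, and Oliven before landing on Narten: three misses and one hit, four words in all.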

Searching for words
If we had a million words in alphabetical order, how many would we need to look at, worst case, to find a word? About 20.
1,000,000 → 500,000 → 250,000 → 125,000 → 62,500 → 31,250 → 15,625 → ...
If you are clever, cut the number of words to look at in half over and over again. Then there are only about 20 words to look at, worst case.
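The halving argument can be checked with a few lines of code. This sketch assumes that each look rules out the middle word plus the larger remaining half; the function name `worst_case_looks` is made up for illustration.

```python
def worst_case_looks(n):
    """Count the looks needed, worst case, to search n sorted words by halving."""
    looks = 0
    while n > 0:
        looks += 1      # look at the middle word of what is left
        n //= 2         # at most half of the words remain afterwards
    return looks


print(worst_case_looks(1_000_000))  # 20
print(worst_case_looks(23))         # 5
```

Counted this way, one million words need 20 looks in the worst case, because 2^19 is less than 1,000,000 and 2^20 is more.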

How does one search for an item?
Data must be organized in some way.
Sorting alphabetically (or numerically) is one way.
There are other ways to organize data!

Google Search Query

Computer Science at work behind the scenes!
Googlebot web crawler
– Finds and retrieves pages
– Gives pages to the Google indexer

“how” “google” “search” “works”
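One common way for an indexer to organize pages by word is an inverted index: a table from each word to the pages that contain it. The sketch below illustrates that idea only and makes no claim about how Google's indexer works; the page names and texts in `PAGES` and the helpers `build_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of pages whose text contains it."""
    index = defaultdict(set)
    for page, text in pages.items():
        for word in text.lower().split():
            index[word].add(page)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())   # keep pages that also have this word
    return results


# Hypothetical pages, invented for this example.
PAGES = {
    "page-a": "how google search works",
    "page-b": "how to sort names",
    "page-c": "google search tips",
}

index = build_index(PAGES)
print(sorted(search(index, "google search")))  # ['page-a', 'page-c']
```

A query then needs only one table lookup per word instead of a scan of every page.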

PageRank Algorithm
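The slide only names the algorithm, so the following is a simplified textbook version of the idea (pages that are linked to by important pages are themselves important), not Google's production system. The link graph `LINKS`, the damping value 0.85, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page ranks by repeatedly passing rank along outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}        # start with equal rank
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share       # pass rank along each link
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}

ranks = pagerank(LINKS)
print(max(ranks, key=ranks.get))  # c
```

Page "c" ends up on top because three of the four pages link to it.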

Correction Algorithms

Google is all about problem solving and writing algorithms.
Algorithms must run fast!
Can Google put all the web pages it finds in alphabetical order to search?
Want efficient, fast algorithms!
No one wants to wait on a search query!

Activities
Given numbers – sort yourselves
Redistribute numbers – sort using selection sort
Parallel sort
Hashing with buckets
– Hash function is the last digit (the remainder when you divide by 10)
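Two of these activities translate directly into code. The sketch below shows selection sort and hashing into ten buckets by last digit; the function names and the sample numbers are made up for illustration, and the classroom versions may differ in detail.

```python
def selection_sort(numbers):
    """Return a sorted copy: repeatedly pick the smallest remaining number."""
    items = list(numbers)
    for i in range(len(items)):
        smallest = i
        for j in range(i + 1, len(items)):
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items


def hash_into_buckets(numbers, bucket_count=10):
    """Place each number in the bucket given by its remainder (last digit for 10)."""
    buckets = [[] for _ in range(bucket_count)]
    for number in numbers:
        buckets[number % bucket_count].append(number)
    return buckets


def in_buckets(buckets, number):
    """Look for a number by checking only the one bucket it could be in."""
    return number in buckets[number % len(buckets)]


print(selection_sort([42, 7, 19, 3, 88]))   # [3, 7, 19, 42, 88]

buckets = hash_into_buckets([42, 7, 19, 3, 88, 12])
print(buckets[2])                           # [42, 12]
print(in_buckets(buckets, 19))              # True
```

Sorting organizes the data so it can be searched by halving. Hashing organizes it so a search goes straight to one bucket.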

Sorting Network

Sort numbers (largest at bottom) using comparators in parallel

Sorting Network: a different setup for comparators

Sort numbers (largest at bottom)
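A sorting network can be sketched as a fixed list of comparators, grouped into layers whose comparators could all run at the same time. The four-wire layout in `NETWORK` is a standard small example and an assumption here; it is not the layout drawn on the slides.

```python
from itertools import permutations

# Each pair (top, bottom) is a comparator between two wires.
# Comparators in the same layer touch different wires, so they can run in parallel.
NETWORK = [
    [(0, 1), (2, 3)],
    [(0, 2), (1, 3)],
    [(1, 2)],
]


def run_network(values, network=NETWORK):
    """Send values through the comparators; the larger value moves to the bottom wire."""
    wires = list(values)
    for layer in network:
        for top, bottom in layer:
            if wires[top] > wires[bottom]:
                wires[top], wires[bottom] = wires[bottom], wires[top]
    return wires


print(run_network([7, 2, 9, 4]))  # [2, 4, 7, 9]

# Check every ordering of four distinct numbers.
print(all(run_network(p) == sorted(p) for p in permutations([1, 2, 3, 4])))  # True
```

The comparators never change, whatever numbers go in. That fixed layout is what allows people (or processors) to carry out a layer's comparisons simultaneously.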

My research: making theoretical concepts come alive. Visualize and interact with them!