Prof. Susan Rodger Computer Science Dept Duke University Oct. 31, 2014

Slides:



Advertisements
Similar presentations
How does Google search for everything? Searching For and Organizing Data Prof. Susan Rodger Computer Science Dept Duke University Oct. 31, 2014.
Advertisements

Searching and Sorting Topics  Sequential Search on an Unordered File  Sequential Search on an Ordered File  Binary Search  Bubble Sort  Insertion.
Algorithms (Contd.). How do we describe algorithms? Pseudocode –Combines English, simple code constructs –Works with various types of primitives Could.
1 CS 502: Computing Methods for Digital Libraries Lecture 16 Web search engines.
Chapter 3: The Efficiency of Algorithms
Starting Out with C++: Early Objects 5/e © 2006 Pearson Education. All Rights Reserved Starting Out with C++: Early Objects 5 th Edition Chapter 9 Searching.
Search Engines and their Public Interfaces: Which APIs are the Most Synchronized? Frank McCown and Michael L. Nelson Department of Computer Science, Old.
Information Retrieval CENG 555 Spring Course Web Page Authoritative source of administrivia In-class announcements generally reflected on Web.
04/30/13 Last class: summary, goggles, ices Discrete Structures (CS 173) Derek Hoiem, University of Illinois 1 Image: wordpress.com/2011/11/22/lig.
Data Structures & Algorithms and The Internet: A different way of thinking.
Search Engine Interfaces search engine modus operandi.
Chapter 8 Searching and Sorting Arrays Csc 125 Introduction to C++ Fall 2005.
CIS3023: Programming Fundamentals for CIS Majors II Summer 2010 Ganesh Viswanathan Searching Course Lecture Slides 28 May 2010 “ Some things Man was never.
Compsci 101.2, Fall Plan For the Day l Discuss Algorithms and Programming at a high level, examples with cooperative/group work  Connect to reading.
McLean HIGHER COMPUTER NETWORKING Lesson 7 Search engines Description of search engine methods.
How to Read Code Benfeard Williams 6/11/2015 Susie’s lecture notes are in the presenter’s notes, below the slides Disclaimer: Susie may have made errors.
IT-522: Web Databases And Information Retrieval By Dr. Syed Noman Hasany.
CompSci 102 Discrete Math for Computer Science April 17, 2012 Prof. Rodger.
Computer Science 101 Introduction to Sorting. Sorting One of the most common activities of a computer is sorting data Arrange data into numerical or alphabetical.
CS261 Data Structures Ordered Bag Dynamic Array Implementation.
GIS Data Models GEOG 370 Christine Erlien, Instructor.
Lecture on Binary Search and Sorting. Another Algorithm Example SEARCHING: a common problem in computer science involves storing and maintaining large.
CompSci 101 Introduction to Computer Science January 19, 2016 Prof. Rodger compsci 101 spring
Computer Science Unplugged Dr. Tom Cortina Carnegie Mellon University.
CompSci 101 Introduction to Computer Science January 15, 2015 Prof. Rodger 1.
CPS120: Introduction to Computer Science Sorting.
CMSC 104, Version 8/061L24Searching&Sorting.ppt Searching and Sorting Topics Sequential Search on an Unordered File Sequential Search on an Ordered File.
CompSci 101 Introduction to Computer Science March 31, 2016 Prof. Rodger.
Crawling When the Google visit your website for the purpose of tracking, Google does this with help of machine, known as web crawler, spider, Google bot,
Searching and Sorting Arrays
CompSci 101 Introduction to Computer Science
CompSci 101 Introduction to Computer Science
Algorithmic Efficency
Introduction to Search Algorithms
IGCSE 6 Cambridge Effectiveness of algorithms Computer Science
Lists and Sorting Algorithms
CompSci 101 Introduction to Computer Science
CompSci 101 Introduction to Computer Science
COMP 103 SORTING Lindsay Groves 2016-T2 Lecture 26
CompSci 101 Introduction to Computer Science
CompSci 101 Introduction to Computer Science
Punch Card Sorting: Binary Radix Sort
CompSci 101 Introduction to Computer Science
structures and their relationships." - Linus Torvalds
Searching CSCE 121 J. Michael Moore.
File and Database Concepts
Searching and Sorting Topics Sequential Search on an Unordered File
CompSci 101 Introduction to Computer Science
Chapter 5: Algorithms Computer Science: An Overview Tenth Edition
Searching and Sorting Topics Sequential Search on an Unordered File
How does Google search for everything? Computer Science at Work
Scratch Where Are You Now?
Searching and Sorting Arrays
Manipulating lists of data
Standard Version of Starting Out with C++, 4th Edition
BOOSTING IMAGE RETRIEVAL
ITEC 2620M Introduction to Data Structures
UMBC CMSC 104 – Section 01, Fall 2016
Searching and Sorting Arrays
Intro to Computer Science CS1510 Dr. Sarah Diesburg
Hash Tables By JJ Shepherd.
Searching and Sorting Arrays
CPS120: Introduction to Computer Science
CPS120: Introduction to Computer Science
Indexing, Access and Database System Architecture
Information Retrieval
structures and their relationships." - Linus Torvalds
Computer Science: An Overview Tenth Edition
Presentation transcript:

Prof. Susan Rodger Computer Science Dept Duke University Oct. 31, 2014 How does Google search for everything? Searching For and Organizing Data Prof. Susan Rodger Computer Science Dept Duke University Oct. 31, 2014

Lots of names. How do we find someone? Anderson, Mary : 203 Main St. Durham NC Williams, Fred : 14 Union Circle, Cary, NC Wu, Xin : 57 Wilson Court, Raleigh, NC Smith, Doug : 18 Pine Cone Lane, Durham, NC Pratt, Sarah: 6 White Lane, Hillsborough, NC Chase, Angela: 34 Dogwood Road, Durham, NC Brooks, Bolton : 10 Time St., Durham, NC French, Melvin : 42 Starship Circle, Durham, NC Gao, Bo : 134 Brookside Lane, Durham, NC

Put the names in alphabetical order Anderson, Mary : 203 Main St. Durham NC Brooks, Bolton : 10 Time St., Durham, NC Chase, Angela: 34 Dogwood Road, Durham, NC French, Melvin : 42 Starship Circle, Durham, NC Gao, Bo : 134 Brookside Lane, Durham, NC Pratt, Sarah: 6 White Lane, Hillsborough, NC Smith, Doug : 18 Pine Cone Lane, Durham, NC Williams, Fred : 14 Union Circle, Cary, NC Wu, Xin : 57 Wilson Court, Raleigh, NC

X X X Find Narten Found! How many words did we look at? Anderson Applegate Bethune Brooks Carter Edwards Foggle Griffin Holhouser Jefferson Klatchy Morgan Munson Narten Oliven Parken Rivers Roberts Stevenson Thomas Wilson Woodrow Yarbrow Found! How many words did we look at? X X

Searching for words If we had a million words in alphabetical order, how many would we need to look at worst case to find a word? 18 1,000,000 50,000 25,000 12,500 6,250 3,125 1,562 781 390 195 97 48 24 12 6 3 2 1 If you are clever, Cut the number of numbers to look at in half over and over again, Then only 18 numbers to look at worst case If you are clever, you only have to look at a few word!

How does one search for an item? Data must be organized in some way Sorting alphabetically (or numerically) is one way There are other ways to organize data!

Google Search Query

Computer Science at work behind the scenes! Googlebot web crawler Finds and retrieves pages Gives pages to google indexer

“how” “google” “search” “works”

Page Rank Algorithm

Correction Algorithms

Google is all about problem solving and writing algorithms Algorithms must happen fast! Can Google put all the web pages it finds in alphabetical order to search? Want efficient, fast algorithms! No one wants to wait on a search query!

Activities Given numbers – sort yourselves Redistribute numbers – sort using selection sort Parallel Sort Hashing with buckets Hash function is last digit – remainder when you divide by 10

Sorting Network

Sort numbers (largest at bottom) using comparators in parallel

Sorting Network different setup for comparators

Sort numbers (largest at bottom)

My research - Making theoretical concepts come alive – visualize and interact with!