9 Algorithms: Indexing. Now where did I put that?
Algorithms Algorithm: step-by-step instructions used to solve a problem
Algorithms Algorithm 1: Add the numbers
Algorithms Algorithm 1: Add the numbers. Algorithm 2: Add the digits in each column, carry if a column totals 10 or more
Algorithms Algorithm 3:
1. Start with the rightmost column
2. Add the numbers in the column
3. If the total is less than ten, write it below that column
4. If the total is ten or more, write the ones digit below that column and write a 1 above the next column to the left
5. Move to the next column to the left
6. If there are any numbers in it, go to step 2
7. If there are no numbers in the column, you are done
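Algorithm 3 is precise enough to hand to a machine. The following is a minimal Python sketch of it, not a definitive version: the function name `column_add` and the choice to pass numbers as decimal strings are assumptions made for illustration. It also generalizes slightly, carrying whatever is left over rather than always a 1, so it still works when more than two numbers are added.

```python
def column_add(numbers):
    """Add non-negative integers given as decimal strings, column by column."""
    width = max(len(n) for n in numbers)
    # Pad with leading zeros so every number has a digit in every column.
    padded = [n.zfill(width) for n in numbers]
    carry = 0
    result_digits = []
    # Start with the rightmost column and move one column left each time.
    for col in range(width - 1, -1, -1):
        total = carry + sum(int(n[col]) for n in padded)
        result_digits.append(str(total % 10))  # ones digit goes below the column
        carry = total // 10                    # the rest is carried to the left
    if carry:
        # No columns remain, so write down any leftover carry.
        result_digits.append(str(carry))
    return "".join(reversed(result_digits))


print(column_add(["478", "594"]))
```

Nothing in the function "knows" what addition means; it only follows the steps, which is the sense in which an algorithm is mechanically executable.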
Algorithms Need to be mechanically executable No intuition or guesswork Ability to use an algorithm does not require or imply knowledge
Algorithms Need to be mechanically executable Need to be precise & detailed What is precise? Do I have to explain how to add two single-digit numbers? Do I have to explain columns? Do I have to explain left and right?
Finding Information How do I find a webpage? Catalog based navigation: Hard to scale
Finding Information Keyword search
Finding Information 2 part problem Identifying matches Identifying best matches
Indexing the Web Web "spiders": Programs that explore the web
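As a rough illustration of what a spider does, here is a sketch that explores a made-up, in-memory "web" by following links breadth-first. The page names in `TINY_WEB` and the function `crawl` are hypothetical; a real spider would fetch pages over the network and extract links from them, which this sketch leaves out.

```python
from collections import deque

# Hypothetical miniature web: each page name maps to the pages it links to.
TINY_WEB = {
    "home": ["cats", "dogs"],
    "cats": ["home", "mats"],
    "dogs": ["mats"],
    "mats": [],
    "orphan": ["home"],  # no page links here
}


def crawl(web, start):
    """Return pages in the order a breadth-first spider would visit them."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        page = queue.popleft()
        visited.append(page)
        for link in web.get(page, []):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


print(crawl(TINY_WEB, "home"))
```

In this toy web the page "orphan" is never visited from "home", because a spider can only discover pages that something it has already seen links to.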
Web Page Contents Three simple web pages: What algorithm could answer the query "cat"?
How Do We Find Things? How do I find eBay in the 9 Algorithms book?
Indexing A book index:
Web Page Contents Three simple web pages: A basic index:
What can we do? What algorithm could answer the query "cat"?
What can we do? What algorithm could answer the query "cat"? Why is ordering important?
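A basic index can be sketched as a dictionary from each word to the pages that contain it. The three page texts below are assumptions standing in for the slide's pages, which are not reproduced in the text here, and `build_index` is a hypothetical name.

```python
# Assumed contents of three simple pages (stand-ins for the slide's examples).
PAGES = {
    1: "the cat sat on the mat",
    2: "the dog stood on the mat",
    3: "the cat stood while a dog sat",
}


def build_index(pages):
    """Map each word to the sorted list of page numbers containing it."""
    index = {}
    # Visiting pages in increasing order keeps every list sorted.
    for page_number in sorted(pages):
        for word in set(pages[page_number].split()):
            index.setdefault(word, []).append(page_number)
    return index


index = build_index(PAGES)
print(index.get("cat", []))  # answering the query "cat" is a single lookup
```

Answering a one-word query becomes a lookup instead of a scan of every page. Keeping each page list sorted costs nothing at build time and makes it cheap to combine lists later.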
What can we do? What algorithm could answer the query cat dog?
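One way to answer a two-word query such as cat dog is to intersect the two page lists. The sketch below assumes each list is sorted and walks both in a single pass; the index entries in `INDEX` are hypothetical values for illustration.

```python
# Hypothetical index entries: word -> sorted list of page numbers.
INDEX = {
    "cat": [1, 3],
    "dog": [2, 3],
    "mat": [1, 2],
}


def intersect_sorted(a, b):
    """Return the items common to two sorted lists, in one pass over each."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] is too small to appear later in b
        else:
            j += 1  # b[j] is too small to appear later in a
    return common


print(intersect_sorted(INDEX["cat"], INDEX["dog"]))
```

The single pass only works because both lists are in order, which is one reason ordering matters in an index.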
What can we do? Can we answer the query "cat sat"? "cat sat" (exact phrase) vs. cat sat (both words, anywhere on the page)
Word Indexes Web pages with numbered words:
Word Indexes Web pages with numbered words: Page-Word index
Word Indexes How can we now answer these queries? "cat sat" "the mat" "a cat"
Word Indexes How can we now answer these queries? "cat sat": 1-2, 1-3. "the mat": 1-5, 1-6 and 2-5, 2-6. "a cat": no match.
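A page-word index and a phrase lookup over it can be sketched as follows. The page texts are assumptions, chosen to be consistent with the page-word answers on the slide, and the function names are hypothetical. Positions are counted from 1, and a pair such as (1, 2) corresponds to the slide's 1-2.

```python
# Assumed page contents (consistent with the answers listed on the slide).
PAGES = {
    1: "the cat sat on the mat",
    2: "the dog stood on the mat",
    3: "the cat stood while a dog sat",
}


def build_word_location_index(pages):
    """Map each word to a list of (page, position) pairs, positions from 1."""
    index = {}
    for page_number in sorted(pages):
        words = pages[page_number].split()
        for position, word in enumerate(words, start=1):
            index.setdefault(word, []).append((page_number, position))
    return index


def phrase_query(index, phrase):
    """Return one list of (page, position) hits per occurrence of the phrase."""
    words = phrase.split()
    if not words:
        return []
    matches = []
    for page, pos in index.get(words[0], []):
        # The k-th word of the phrase must sit k positions after the first.
        hits = [(page, pos + offset) for offset in range(len(words))]
        if all(hit in index.get(word, []) for hit, word in zip(hits, words)):
            matches.append(hits)
    return matches


word_index = build_word_location_index(PAGES)
for query in ["cat sat", "the mat", "a cat"]:
    print(query, phrase_query(word_index, query))
```

The page-only index cannot tell "cat sat" apart from a page where the two words are far apart; storing positions is what makes the adjacency check possible.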
Ranking Which page is the best source for malaria?
Ranking Absolute word location can guide ranking Which page is the best source for malaria?
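One very simple way to use absolute location is to prefer pages where the query word appears earliest. This is only a sketch under that assumption, not how any particular search engine ranks; the page texts and the name `rank_by_first_position` are hypothetical.

```python
# Hypothetical pages: one is about malaria, one mentions it in passing.
PAGES = {
    "page_a": "malaria is a disease spread by mosquitoes",
    "page_b": "my holiday diary mentions mosquitoes rain and a malaria warning",
}


def rank_by_first_position(pages, word):
    """Rank matching pages by how early the word first appears."""
    scored = []
    for name, text in pages.items():
        words = text.split()
        if word in words:
            # list.index gives the position of the first occurrence.
            scored.append((words.index(word), name))
    return [name for _, name in sorted(scored)]


print(rank_by_first_position(PAGES, "malaria"))
```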
Ranking Which page is the best source for causes of malaria?
Ranking Relative word location can guide ranking Which page is the best source for causes of malaria?
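Relative location can be sketched as the smallest distance between the two query words on a page, with closer pairs ranked higher. The page texts and function names below are hypothetical, and the scoring rule is an assumption for illustration.

```python
# Hypothetical pages: both contain "causes" and "malaria".
PAGES = {
    "page_a": "the common causes of malaria include mosquito bites",
    "page_b": "malaria was in the news while the strike causes delays",
}


def min_distance(words, first, second):
    """Smallest gap in word positions between the two words, or None."""
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    if not first_positions or not second_positions:
        return None
    return min(abs(i - j) for i in first_positions for j in second_positions)


def rank_by_proximity(pages, first, second):
    """Rank pages containing both words, closest pair first."""
    scored = []
    for name, text in pages.items():
        distance = min_distance(text.split(), first, second)
        if distance is not None:
            scored.append((distance, name))
    return [name for _, name in sorted(scored)]


print(rank_by_proximity(PAGES, "causes", "malaria"))
```

Both pages match the two-word query, but only on the first are the words close enough to suggest the page is about the causes of malaria.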
Metadata Metadata : data about the data
Structured Information Simplified HTML (web) page
Structured Information Simplified HTML (web) page What pages have "cat" in the title?
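If structural markers are indexed like ordinary words, a title query reduces to a position check: the word must fall between the title markers. In the sketch below the marker names `<titleStart>` and `<titleEnd>`, the page texts, and the function name are all assumptions; the slide's actual markup is not reproduced in the text here.

```python
# Hypothetical simplified pages with assumed structural markers.
PAGES = {
    1: "<titleStart> my cat <titleEnd> <bodyStart> the cat sat on the mat <bodyEnd>",
    2: "<titleStart> my dog <titleEnd> <bodyStart> the dog chased a cat <bodyEnd>",
}


def pages_with_word_in_title(pages, word):
    """Return page numbers where the word appears between the title markers."""
    result = []
    for page_number in sorted(pages):
        tokens = pages[page_number].split()
        # Assumes every page has exactly one well-formed title.
        start = tokens.index("<titleStart>")
        end = tokens.index("<titleEnd>")
        if word in tokens[start + 1:end]:
            result.append(page_number)
    return result


print(pages_with_word_in_title(PAGES, "cat"))
```

Page 2 contains "cat" but only in its body, so it is not returned.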
Not just for the web Operating systems, databases, etc…
Search as Metaphor Search displacing/complementing "desktop" as interface metaphor
Changes Significantly large changes in quantity produce a change in quality