How Google and Microsoft taught search to “understand” the Web
Austin Granger and Chris Hesemann


Knowledge of the Web
- String searching does not always capture the true meaning of content.
- Goal: search by knowledge, not by substring matching.
- Extracting and categorizing concepts makes knowledge-based searching possible.
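A minimal sketch of the difference, not taken from the slides: the toy documents, the concept tags, and the function names `substring_search` and `concept_search` are all invented for illustration, and the tags stand in for whatever a real concept-extraction step would produce.

```python
# Hypothetical toy corpus (invented for this example).
DOCS = {
    "d1": "Pineapple upside-down cake recipe",
    "d2": "The iPhone maker reported quarterly earnings",
    "d3": "Apple opened a new store downtown",
}

# Assumed output of a concept-extraction step: document id -> concepts.
CONCEPTS = {
    "d1": {"Recipe"},
    "d2": {"Apple Inc."},
    "d3": {"Apple Inc."},
}


def substring_search(query):
    """Return ids of documents whose text contains the query string."""
    q = query.lower()
    return sorted(doc_id for doc_id, text in DOCS.items() if q in text.lower())


def concept_search(concept):
    """Return ids of documents tagged with the given concept."""
    return sorted(doc_id for doc_id, tags in CONCEPTS.items() if concept in tags)


print(substring_search("apple"))     # ['d1', 'd3']
print(concept_search("Apple Inc."))  # ['d2', 'd3']
```

The substring search returns the pineapple recipe and misses the earnings story, while the concept lookup returns both documents about the company.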

“Web of Concepts”
- Extract raw data (phone numbers, addresses, prices, etc.).
- Link related entities together (e.g., link an actor to a movie).
- Categorize information about each entity (what does this store sell, what has this author written, how highly are they reviewed?).
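The extraction step can be sketched with regular expressions. This is only an illustration under simplifying assumptions: the patterns cover US-style phone numbers and dollar prices only, the sample text is invented, and `extract_facts` is a hypothetical helper, not how any search engine actually does it.

```python
import re

# Simplified patterns (assumptions): US-style phone numbers and dollar prices.
PHONE_RE = re.compile(r"\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}")
PRICE_RE = re.compile(r"\$\d+(?:\.\d{2})?")


def extract_facts(text):
    """Pull raw phone numbers and prices out of free text."""
    return {
        "phones": PHONE_RE.findall(text),
        "prices": PRICE_RE.findall(text),
    }


# Invented sample page text.
page = "Blue Door Books, call (555) 123-4567. Paperbacks from $7.99, hardcovers $24."
print(extract_facts(page))
# {'phones': ['(555) 123-4567'], 'prices': ['$7.99', '$24']}
```

Linking and categorizing would then attach these raw values to an entity, here the store the page describes.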

Search engines discover webpages, parse them into objects and data, process and store that data, and update existing entries as needed.
The “concept web” is stored in vast databases.
- Not traditional databases.
- Based on graph theory, not the relational model.
- The database consists of nodes and links.
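A nodes-and-links store can be sketched in a few lines. The class `ConceptGraph`, its method names, and the actor and movie entries below are hypothetical; real systems are far larger and are not described in detail in the slides.

```python
from collections import defaultdict


class ConceptGraph:
    """Toy graph store: nodes carry attributes, links carry a relation name."""

    def __init__(self):
        self.nodes = {}                 # node id -> attribute dict
        self.links = defaultdict(list)  # node id -> [(relation, target id)]

    def add_node(self, node_id, **attrs):
        # Creates the node, or updates the existing entry if already present.
        self.nodes.setdefault(node_id, {}).update(attrs)

    def add_link(self, source, relation, target):
        # Make sure both endpoints exist before linking them.
        for node_id in (source, target):
            self.nodes.setdefault(node_id, {})
        self.links[source].append((relation, target))

    def related(self, source, relation):
        """Return targets reachable from source over the given relation."""
        return [t for r, t in self.links.get(source, []) if r == relation]


graph = ConceptGraph()
graph.add_node("actor:jane_doe", name="Jane Doe")
graph.add_node("movie:example_film", title="Example Film", year=2001)
graph.add_link("actor:jane_doe", "acted_in", "movie:example_film")
graph.add_node("movie:example_film", rating=4.5)  # update an existing entry

print(graph.related("actor:jane_doe", "acted_in"))  # ['movie:example_film']
```

Unlike a relational table, a query here follows links from a node instead of joining rows on a key.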

Memory Cloud
- To make this efficient, the entire graph must be traversable in milliseconds.
- One solution: a “memory cloud”.
  - Store the entire database in memory at all times.
- Example: Google search for “blowfish”
  - Results: show the company, the encryption algorithm, and sushi.
  - New results: suggest “pufferfish”.
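The “blowfish” example can be sketched as a breadth-first traversal over a graph held entirely in memory. The adjacency data and the function `within_hops` are assumptions made for illustration; in particular, routing the “pufferfish” suggestion through the sushi sense is a guess, not something the slides state.

```python
from collections import deque

# Toy in-memory graph (assumed structure): node -> linked nodes.
GRAPH = {
    "blowfish": [
        "Blowfish (company)",
        "Blowfish (encryption algorithm)",
        "blowfish (sushi)",
    ],
    "blowfish (sushi)": ["pufferfish"],
    "pufferfish": ["blowfish (sushi)"],
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: nodes reachable from start in at most max_hops links."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, hops + 1))
    return order


print(within_hops(GRAPH, "blowfish", 1))  # the three senses of the query
print(within_hops(GRAPH, "blowfish", 2))  # the senses plus 'pufferfish'
```

Because every lookup is a dictionary access in memory, the traversal never waits on disk, which is the point of keeping the whole database resident.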

Limitations
- Currently only works in English.
- Including other languages increases the complexity exponentially; there is a long way to go.
- Dissecting language to understand searches written in natural language, not just keywords, is still ahead.

The Future of Knowledge Searching