WIRED Week 2 Syllabus Update Readings Overview.

Why IR?
- IR originally mostly for systems, not people
- IR in the last 25 years:
  - classification and categorization
  - systems and languages
  - user interfaces and visualization
- A small world of concern
- The Web changed everything
  - Huge amount of accessible information
  - Varied information sources
  - Relatively easy to look for information
- Improving IR means improving learning
- Digital technology changes everything (again)
  - We cut out the middle man and pass the savings on to you!

WIRED Focus
- Information Retrieval: representation, storage, organization of, and access to information items
- Focus is on the user information need
- User information need: find all docs containing information on Austin which:
  - Are hosted by utexas.edu
  - Discuss restaurants
- Emphasis is on the retrieval of information (not data, not just a keyword match)
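The example information need above can be approximated, very roughly, as a filter over a document collection. The sketch below is only an illustration: the `docs` records, their `url` and `text` fields, and the `matches_need` helper are all hypothetical. It is deliberately a naive keyword match, which is exactly what the slide says IR should go beyond.

```python
from urllib.parse import urlparse

def matches_need(doc):
    """Naive check of the example need: utexas.edu host, Austin, restaurants."""
    host = urlparse(doc["url"]).hostname or ""
    # Accept utexas.edu itself and any of its subdomains.
    on_utexas = host == "utexas.edu" or host.endswith(".utexas.edu")
    text = doc["text"].lower()
    # Plain substring tests: no ranking, no synonyms, no notion of relevance.
    return on_utexas and "austin" in text and "restaurant" in text

# Hypothetical documents, invented for illustration only.
docs = [
    {"url": "https://www.utexas.edu/dining", "text": "Restaurants near campus in Austin"},
    {"url": "https://www.utexas.edu/music", "text": "Live music in Austin"},
    {"url": "https://example.com/utexas.edu", "text": "Austin restaurant guide"},
]

hits = [doc["url"] for doc in docs if matches_need(doc)]
```

A page that discusses "places to eat in the capital of Texas" would be missed by this filter, which is the gap between data retrieval and information retrieval that the slide points to.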

The Search
- Who is John Battelle?
  - Magazine editor: WIRED, The Industry Standard
  - Web 2.0 conference organizer
  - Business 2.0 magazine columnist
  - Federated Media Publishing
  - Boingboing.net “manager”

Database of Intentions
- What do you think the database of intentions is?
  - Is it more than Google’s Zeitgeist?
- What we’re thinking about and interested in.
- Everything we want to know and when we want to know it.
- “the aggregate results of every search ever entered, every result list ever tendered, and every path taken as a result” (Battelle, p 6)
- “a real time history of post-Web culture” (p 6)
- What other databases like this are there?
- How is this possible?

Searchiness?
- The “tasking” of search?
  - Everything could be a search task?
  - Every task has an ad associated with it?
- Our expectations are met and made with search.
- How would the Web work without search?
  - Yahoo and email links, LOTS of email links
- You are your clickstream?
  - Products & services based on it
- “marketing, media, technology, pop culture, international law, and civil liberties” (p 13)

Elements of Search
- Crawl
- Index
- Runtime system (query processor)
  - Segments the data
  - Analyzes the crawl
  - Optimizes everything
- Interface
  - Query
  - Results
- Users
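As a rough illustration of how the crawl, index, and query processor fit together, here is a minimal sketch of an inverted index with boolean AND retrieval. It assumes the crawl has already produced a small dictionary of page texts; the `pages` data and the function names `build_index` and `search` are made up for this example and do not come from the readings.

```python
from collections import defaultdict

def build_index(pages):
    """Index step: map each lowercase term to the set of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index

def search(index, query):
    """Query step: return ids of pages that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical output of a crawl: page id -> page text.
pages = {
    "a": "Austin restaurants and barbecue",
    "b": "Austin live music",
    "c": "restaurants in Dallas",
}
index = build_index(pages)
```

Calling `search(index, "austin restaurants")` intersects the posting sets for both terms. A real engine adds ranking, tokenization, and a distributed index on top of this basic shape.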

Search before Google
- Traditional systems: SMART (Salton)
  - Strongly typed information (traditional databases)
  - Not always interactive or easy to use
- Library catalogs online
  - Controlled vocabulary & limited records
- Internet: Archie & Veronica
  - Titles only (mostly) over text
- Web: WWW Wanderer, WebCrawler
  - Full text, HTML & links

AltaVista gets serious
- Web now large enough to be a challenge
  - Now enough content that you’d want to search it
- Costs of hardware & bandwidth falling
- Parallel crawlers
- Significant CPU resources
- 1995 = 16 million documents
- Why didn’t people get it?

The Web goes Pro: Lycos, Yahoo, AOL, Excite
- Lycos
  - Anchor text & content location context
- Yahoo
  - Directory & clean interface for browsing links
  - Advertising & user (logs) analysis
- AOL
  - Gateway to the internet for many
- Excite
  - Consumer-driven, word relationships
  - Acquisitions of Magellan, WebCrawler ++
  - MyExcite - the Portal
  - @Home (compete with AOL)

Google is Born
- Larry Page & Sergey Brin
- Links are the key (Bibliometrics)
  - Impact factor (“link it if you like it”)
  - Patterns of citation (links) expand the text
  - Defending & setting the context of your work by associating it with others
- BackRub
  - Crawl pages, store links, analyze them, publish
  - Large computing challenges
- PageRank
  - Link counts with a recipe for deriving (relative) value
  - Value depends on who links to you, and on their rank too
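The idea that a page's value depends on who links to it, and on the rank of those linkers, can be sketched as a simple power iteration. This is a minimal sketch of the commonly taught normalized variant, not Google's actual implementation. The damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages with no outgoing links, and the toy `links` graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a graph given as {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: a, b and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

In this toy graph, `c` ends up with the highest score because three pages link to it, and `a` outranks `b` and `d` because its single inbound link comes from the highly ranked `c`.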

Google goes Pro
- More resources for more data
- Help with (significant) analysis design
- Lack of commercial approach may have been a strength
  - Not ads, but just good search
- Simple (non-existent) design of interface had an impact
- More people getting online
  - Broadband adoption & stabilizing browsers
- Growing content (to say the least)

Assignments
- Read weekly Primary Readings & participate in class discussions: 10%
- Re-design Search Results interface: 10%
- Web (log) analytics: 25%
- “Google 2010” (5 page paper): 10%
- Class Topic Presentation: 15%
- Main Project: 30%

Projects and/or Papers Overview
- How can (Web) IR be better?
  - Better IR models
  - Better user interfaces
- More to find vs. easier to find
- Scriptable applications
- New interfaces for applications
- New datasets for applications