Avi Rappoport, Search Tools Consulting Search and Discovery Tools A View into the Future.

Presentation transcript:

Avi Rappoport, Search Tools Consulting Search and Discovery Tools A View into the Future

2 Intranets 2004 / © Avi Rappoport, Search Tools Consulting
Defining Intranet Search
Searching internal network
–Intranet and file servers – archives, Lotus Notes
–External sites or feeds
Using Internet-developed search tools
–Protocols such as TCP/IP and HTTP
–Thin client = Web browser
–Search engine functionality and interface
Like Google, Yahoo, AskJeeves

3 Present vs. Future
80/20 rule
–Solve the easy problems now
–Simple search
“Information needs” -- a non-trivial question
Technology is not a panacea
Complex Research

4 Three Parts of Usable Search
content
search functionality
user interface
Like an iceberg, search is mostly invisible

5 Discovery: Finding What You Have
Core Intranet
–Varies with intranet history
–HR and Communications
–Facilities
Support
International
Public sites
Partner and Extranet Sites

6 Discovery: Good, Bad and Ugly
Some items should be there but aren't
–Problem links: bad syntax, JavaScript, etc.
–Wrongly configured robots.txt
–Graphical text, funky PDFs
Some items shouldn't be there at all
–Confidential information
–Early versions of documents
–Very local content (4,000 tech support cases)
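
A wrongly configured robots.txt is one of the easier discovery problems to test for. The following is a minimal sketch using Python's standard `urllib.robotparser`; the rules, the crawler name `intranet-search-bot` and the example URLs are all hypothetical, not taken from the talk.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given robots.txt text forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules: the whole HR area is blocked, perhaps by mistake.
ROBOTS_TXT = """\
User-agent: *
Disallow: /hr/
Disallow: /drafts/
"""

# Hypothetical pages that people expect to find through search.
URLS = [
    "http://intranet.example.com/hr/benefits.html",
    "http://intranet.example.com/facilities/menus.html",
    "http://intranet.example.com/drafts/plan-v1.html",
]

if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, "intranet-search-bot", URLS):
        print("not indexable:", url)
```

Running a check like this against the pages people actually ask for separates intended exclusions (drafts) from accidental ones (the HR area).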

7 Discovery: What to Look For
Documents with and without metadata
–Title tag is the most important
Frequency of updates
–Dynamic servers don't show mod date
Incoming and outgoing links
Languages and character sets
Errors
–Bad links
–Access control
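
Since the title tag matters most, a discovery pass can start by listing pages that lack one. This sketch uses the standard `html.parser` module; the `PAGES` mapping and its paths are made-up sample data, and a real audit would read pages from a crawl instead.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html):
    """Return the stripped title text, or an empty string if there is none."""
    parser = TitleExtractor()
    parser.feed(html)
    return parser.title.strip()


def pages_missing_titles(pages):
    """Return the paths whose HTML has no usable <title>."""
    return [path for path, html in pages.items() if not page_title(html)]


# Hypothetical sample pages.
PAGES = {
    "/a.html": "<html><head><title>Benefits Overview</title></head><body>x</body></html>",
    "/b.html": "<html><body><h1>No title here</h1></body></html>",
    "/c.html": "<html><head><title>  </title></head></html>",
}

if __name__ == "__main__":
    print(pages_missing_titles(PAGES))
```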

8 Search: Intranet Information Needs
Don’t assume you know - invest in asking
–Wide target for surveys
–Outlying offices
–Key audiences
Data mining
–Intranet user feedback
–Search log analysis
–Phone and trends

9 Common Intranet Searches
Employee and departmental contacts
HR issues
–Holidays, benefits, evaluations, surveys
Office functions
–Heating & cooling, training, menus
Technical information
–Product data, support, services
Topical research (less frequent)

10 Real Intranet Usage Example
3 business cards
7 fedex
8 webex
9 expense report
11 training
12 401k
13 pto
14 accounts payable
15 holiday party
17 bereavement
18 payroll
20 holiday

11 Most Frequent Search Problems
Useful content not indexed
Confusing interfaces
Complicated query languages
Mysterious relevance ranking
Not enough human judgment
Excess complexity
Lack of user testing and log analysis

12 Defining Search Priorities
Identify pain points
–Common information needs
–Frequently-changing content
–Confusing interfaces
Define audiences
–Self-selected search users
–People who have significant problems
Work with content creators
Do the easy stuff first

13 Discovery and Indexing
Index almost everything
–Invest in understanding content
–Find new valuable data
–Avoid duplication
Work with content creators
–Encourage focused pages with titles
Keep the index current
–Update quickly in times of change
Hide old stuff in archives

14 Improve Basic Searching
Offer a search field in all navigation bars
–Long search fields are best
–Minimize complexity
Default to keyword matching
Simplify search results pages
–Show intranet navigation
–Provide a filled-in search box
–Show match pages with context
–Avoid clutter

15 Keep Search Metrics
Number of searches per day / week / month
–Correlate with corporate trends
Percentage of frequent queries
–Should go down if navigation improves
Problems
–No-matches
–Server errors
Audience information
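
The metrics on this slide can be computed from a very simple search log. A minimal sketch, assuming each log record is a `(date, query, number_of_matches)` tuple; that record layout and the sample rows are assumptions made for illustration, not a format described in the talk.

```python
from collections import Counter

# Hypothetical log records: (date, query, number_of_matches).
LOG = [
    ("2004-03-01", "expense report", 12),
    ("2004-03-01", "401k", 4),
    ("2004-03-01", "Expense Report", 12),
    ("2004-03-02", "pto", 7),
    ("2004-03-02", "berevement", 0),
    ("2004-03-02", "expense report", 12),
]


def search_metrics(log, top_n=1):
    """Summarize volume, frequent-query share and no-match rate for a log."""
    total = len(log)
    per_day = Counter(day for day, _, _ in log)
    # Fold case and surrounding whitespace so variants count as one query.
    queries = Counter(query.strip().lower() for _, query, _ in log)
    top = queries.most_common(top_n)
    no_matches = sum(1 for _, _, hits in log if hits == 0)
    return {
        "searches_per_day": dict(per_day),
        "top_queries": top,
        "top_query_share": sum(n for _, n in top) / total if total else 0.0,
        "no_match_rate": no_matches / total if total else 0.0,
    }


if __name__ == "__main__":
    for name, value in search_metrics(LOG).items():
        print(name, value)
```

Tracking `top_query_share` over time is one way to watch whether frequent queries decline as navigation improves.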

16 Search Log Analysis
What people are looking for
–What words do they use?
–Are they getting good results?
What they click on
–Candidates for search suggestions (best bets)
Improve taxonomy & controlled vocabulary
Analyze search and information architecture
–Search default to "match all words"?
–Add high-level navigation link?
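
Click data can nominate best-bet candidates for a person to review. This sketch assumes a click log of `(query, clicked_url)` pairs; the pairs, the URLs and the `min_clicks` threshold are hypothetical, and the output is a list of suggestions for human judgment rather than an automatic ranking change.

```python
from collections import Counter, defaultdict


def best_bet_candidates(clicks, min_clicks=2):
    """Map each query to its most-clicked URL, if clicked at least min_clicks times."""
    by_query = defaultdict(Counter)
    for query, url in clicks:
        by_query[query.strip().lower()][url] += 1
    candidates = {}
    for query, urls in by_query.items():
        url, count = urls.most_common(1)[0]
        if count >= min_clicks:
            candidates[query] = url
    return candidates


# Hypothetical click log: (query, clicked_url).
CLICKS = [
    ("401k", "/hr/benefits/401k.html"),
    ("401K", "/hr/benefits/401k.html"),
    ("401k", "/finance/forms.html"),
    ("pto", "/hr/time-off.html"),
]

if __name__ == "__main__":
    print(best_bet_candidates(CLICKS))
```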

17 Continue Intranet Discovery
Track new content
Use APIs, including Web Services
–Index CMSs and other data stores
Deal with date problems
Linguistics
–Character set recognition and correct tokenization
–Language recognition
Document attributes
Stemming

18 Security and Access Control
Be careful what you index
–Reverse-engineering via search
HTTPS for showing SSL results
Access control & authentication
Search security design
–Entire engine / index
–Collection security
–Hit-level (document) access control
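
Hit-level access control means filtering each result against the searcher's permissions before it is displayed. A minimal sketch, assuming every indexed document carries a set of group names allowed to see it; the `Hit` class, the group names and the URLs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hit:
    """One search result plus the groups allowed to see it (assumed model)."""
    url: str
    title: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def visible_hits(hits, user_groups):
    """Keep only hits that share at least one group with the user.

    A hit with no allowed groups is hidden from everyone (fail closed).
    """
    groups = set(user_groups)
    return [hit for hit in hits if hit.allowed_groups & groups]


# Hypothetical result list.
HITS = [
    Hit("/hr/holidays.html", "Holiday schedule", frozenset({"all-staff"})),
    Hit("/hr/salary-bands.html", "Salary bands", frozenset({"hr"})),
    Hit("/tmp/untagged.html", "Untagged draft"),
]

if __name__ == "__main__":
    for hit in visible_hits(HITS, {"all-staff"}):
        print(hit.title)
```

Filtering the title and snippet along with the link matters, because a visible snippet from a protected document is exactly the reverse-engineering risk the slide warns about.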

19 Simplify Searching
Minimal query expansion
–Stemming (light pluralization)
–Explain anything
Offer options, don't force them
–Search suggestions (Best Bets)
–Synonyms (can get 20% usage)
–Spell-checking (can get 15% usage)
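
One way to read "light pluralization" plus "offer options" is: strip only simple plural endings from the query, then return synonyms and a best bet as optional suggestions alongside the normal results. The rules, the `SYNONYMS` and `BEST_BETS` tables and their entries below are illustrative assumptions, not the speaker's configuration.

```python
def light_stem(word):
    """Strip a simple English plural ending; deliberately a very light rule set."""
    w = word.lower()
    if len(w) > 4 and w.endswith("ies"):
        return w[:-3] + "y"
    if len(w) > 3 and w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]
    return w


# Hypothetical, hand-maintained tables.
SYNONYMS = {"vacation": ["pto", "holiday"]}
BEST_BETS = {"401k": "/hr/benefits/401k.html"}


def search_options(query):
    """Return the stemmed terms plus optional suggestions to display."""
    terms = [light_stem(term) for term in query.split()]
    return {
        "terms": terms,
        "best_bet": BEST_BETS.get(" ".join(terms)),
        "synonym_suggestions": [s for t in terms for s in SYNONYMS.get(t, [])],
    }


if __name__ == "__main__":
    print(search_options("Vacations"))
    print(search_options("401k"))
```

The suggestions come back as data for the results page to show as links, so the user chooses whether to follow them.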

20 Sometimes, Advanced Search Works

21 Relevance Ranking: KISS
Keep It Simple
–No complex algorithms
–Start with basic query word matches
Use Heuristics
–Exact phrase match in title is usually best
–Phrase matches are good
–Metadata matches are good
–Take advantage of intranet IA, taxonomies
–Leverage human judgment
Transparency: mark match terms in context
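
The heuristics on this slide can be sketched as a small additive score. The weights (100, 20, 10, 1), the document fields (`title`, `body`, `keywords`) and the sample documents are arbitrary assumptions chosen only to preserve the slide's ordering: exact phrase in title first, then phrase matches, then metadata matches, then plain word matches.

```python
import re


def words(text):
    """Lowercased word set of a text."""
    return set(re.findall(r"\w+", text.lower()))


def score(query, doc):
    """Additive heuristic score; the weights are illustrative assumptions."""
    phrase = " ".join(query.lower().split())
    terms = phrase.split()
    title, body = doc.get("title", "").lower(), doc.get("body", "").lower()
    total = 0
    if phrase and phrase in title:
        total += 100  # exact phrase in title is usually best
    if phrase and phrase in body:
        total += 20   # phrase match in text
    total += 10 * len(set(terms) & words(doc.get("keywords", "")))  # metadata
    total += len(set(terms) & (words(title) | words(body)))         # word matches
    return total


def rank(query, docs):
    """Return the documents ordered from highest to lowest score."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)


def highlight(query, text, marker="*"):
    """Mark query terms in context so users can see why a page matched."""
    terms = query.split()
    if not terms:
        return text
    pattern = "|".join(re.escape(term) for term in terms)
    return re.sub(rf"\b({pattern})\b", rf"{marker}\1{marker}", text,
                  flags=re.IGNORECASE)


# Hypothetical documents.
DOCS = [
    {"title": "Travel Policy", "body": "Every report needs an expense code.",
     "keywords": "expense"},
    {"title": "Finance FAQ", "body": "How to file an expense report on time.",
     "keywords": ""},
    {"title": "Expense Report Form", "body": "Download the form.",
     "keywords": "finance"},
]

if __name__ == "__main__":
    for doc in rank("expense report", DOCS):
        print(score("expense report", doc), highlight("expense report", doc["title"]))
```

Because each rule is a visible line with a visible weight, an administrator can explain or adjust any ranking, which is the point of keeping it simple.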

22 Improve Results Page Layout
Should fit with look and feel of intranet
Navigation
Search Results Header
–Search field
–Number and type of matches
–Results navigation
Search Results Items
–Use whatever content you have
–Provide context for result

23 Problem Results Page

24 Better Results Page

25 Why Searches Fail
Vocabulary mismatch
Spelling errors
Wrong scope
Empty search
Query requirements not met
Software problems

26 Dealing with Search Failure
Improve the no-matches page
–Standard design and navigation links
–Display a search field
–Describe contents covered on site and search
–Link to specialized search engines
Log analysis
–Track frequent failures
–Add synonyms, suggestions or intranet content
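
Tracking frequent failures can be as simple as counting the queries that returned nothing. A minimal sketch, assuming log records of `(query, number_of_matches)`; the sample rows and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def frequent_failures(log, min_count=2):
    """Return (query, count) pairs for no-match queries seen at least min_count times."""
    failures = Counter(query.strip().lower() for query, hits in log if hits == 0)
    return [(query, n) for query, n in failures.most_common() if n >= min_count]


# Hypothetical log records: (query, number_of_matches).
FAILURE_LOG = [
    ("berevement", 0),
    ("Berevement", 0),
    ("fedex", 5),
    ("holiday shedule", 0),
    ("cafeteria menu", 0),
    ("cafeteria menu", 0),
    ("cafeteria menu", 0),
]

if __name__ == "__main__":
    for query, count in frequent_failures(FAILURE_LOG):
        print(count, query)
```

Each entry in the output suggests a different remedy: a misspelling points to a synonym or spelling suggestion, while a well-formed query with no matches points to content that is missing from the index or the intranet.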

27 Unhelpful No-Matches Pages

28 Better No-Matches Page

29 Search Engine Software Requirements
Flexible and configurable indexer
–Integration and import modules for data sources
–Current file formats (e.g. Acrobat 6)
Good defaults for interface, retrieval and relevance
Override default settings
Security & access control
Admin interface
Logging and analysis tools
Scalable

30 Search and Information Architecture
IA: the art and science of organizing and labeling information
Search provides ad-hoc access, reduces the need to organize everything perfectly
Search can take advantage of IA
–Less duplication and overlap
–Fewer gaping holes in coverage
–Controlled vocabulary
–Labels can explain search results

31 Search and Taxonomies
AKA ontologies, cataloging, categorization, classification, directories, hierarchies
Taxonomy: organizing information into levels of named categories, like Yahoo!
Vital to navigate within large data sets
No such thing as a finished taxonomy
–A resource-intensive challenge
–Language and requirements change
Multiple topic areas, multiple taxonomies

32 Search & Taxonomy Work Together
Search
–Crosses categories
–Supplements drill-down
–Handles non-standard vocabulary
Taxonomy Categories
–Create subset for precise search
–Provide valuable context in search results
Refer to the same controlled vocabulary

33 Future Discovery & Indexing Tools
Integration with CMS and DMS
Metadata
–Entity Extraction
–Date extraction and tracking
–Other facets
Automatic Chunking
–Topical sections of long documents

34 New Tools for Better Search
Grouping results by location
Faceted Metadata Search / Browse
–Expose available structure
–Allow users to drill down intelligently
Federated search
–Search across multiple engines
–“Best Source” problem
Personalization - user control
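
Faceted search "exposes available structure" by counting the metadata values present in the current results and letting the user narrow by any of them. A minimal sketch, assuming results are dictionaries with `department` and `format` fields; those field names and the sample records are hypothetical.

```python
from collections import Counter


def facet_counts(results, facets):
    """Count the values of each facet field across the current results."""
    return {
        facet: dict(Counter(r[facet] for r in results if facet in r))
        for facet in facets
    }


def drill_down(results, **selected):
    """Keep only results whose fields equal every selected facet value."""
    return [
        r for r in results
        if all(r.get(name) == value for name, value in selected.items())
    ]


# Hypothetical search results with metadata.
RESULTS = [
    {"title": "Benefits overview", "department": "HR", "format": "html"},
    {"title": "401k enrollment form", "department": "HR", "format": "pdf"},
    {"title": "Expense report form", "department": "Finance", "format": "pdf"},
]

if __name__ == "__main__":
    print(facet_counts(RESULTS, ["department", "format"]))
    print(drill_down(RESULTS, department="HR", format="pdf"))
```

Showing the counts next to each value tells users how many results a choice will leave before they click.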

35 In-Depth Research
Medical diagnosis
Scientific articles & experiments
Investment
Business intelligence
Market research
Patent searches
Journalism, sociology, history
Politics and current events

36 Research Requirements
Full recall - everything on a topic
Organize results
Save searches
Understand topic within context
Find the experts
Revise and extend queries
Share knowledge
Get alerts for new information

37 Tools for Better Research
Federated searching
–Research and purchased reports
–Databases and archives
–News, RSS and other information streams
Complex query-building
Visualization
Networking
Collaboration

38 Checklist for Intranet Search
Keep researching user needs
Provide wide coverage in the index
Make the search field ubiquitous
Keep it simple and fast
Tune relevance ranking
Take advantage of IA and taxonomies
Offer suggestions
Usable results and no-matches pages
Search log analysis for continuous improvement

39 Apply the Right Tools
Simple search for the wide intranet
–Rich indexing
–Leverage metadata
–Solve common problems
–Tune for employee needs
Research tools when appropriate
–Concepts and topics
–Visualization
–Networking