Search and Discovery Tools: A View into the Future
Avi Rappoport, Search Tools Consulting
Intranets 2004 / © Avi Rappoport, Search Tools Consulting

Defining Intranet Search
Searching internal network
  – Intranet and file servers – archives, Lotus Notes
  – External sites or feeds
Using Internet-developed search tools
  – Protocols such as TCP/IP and HTTP
  – Thin client = Web browser
  – Search engine functionality and interface
Like Google, Yahoo, AskJeeves

Present vs. Future
80/20 rule
  – Solve the easy problems now
  – Simple search
“Information needs”: a non-trivial question
Technology is not a panacea
Complex Research

Three Parts of Usable Search
Content
Search functionality
User interface
Like an iceberg, search is mostly invisible

Discovery: Finding What You Have
Core Intranet
  – Varies with intranet history
  – HR and Communications
  – Facilities
Support
International
Public sites
Partner and Extranet Sites

Discovery: Good, Bad and Ugly
Some items should be there but aren't
  – Problem links: bad syntax, JavaScript, etc.
  – Wrongly configured robots.txt
  – Graphical text, funky PDFs
Some items shouldn't be there at all
  – Confidential information
  – Early versions of documents
  – Very local content (4,000 tech support cases)
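
One way to catch a wrongly configured robots.txt before it hides content from the indexer is to test known-important URLs against it. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt text, the crawler name `intranet-indexer`, and the `intranet.example.com` URLs are all made-up placeholders, not anything from the talk.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the blanket "Disallow: /" hides the whole
# server from every crawler, including the intranet search indexer.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def blocked_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given crawler is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Pages that should be searchable (assumed examples).
urls = [
    "http://intranet.example.com/hr/benefits.html",
    "http://intranet.example.com/facilities/menus.html",
]

for url in blocked_urls(ROBOTS_TXT, "intranet-indexer", urls):
    print("blocked by robots.txt:", url)
```

Running a check like this against each server's real robots.txt is a cheap way to find content that "should be there but isn't".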

Discovery: What to Look For
Documents with and without metadata
  – Title tag is the most important
Frequency of updates
  – Dynamic servers don't show mod date
Incoming and outgoing links
Languages and character sets
Errors
  – Bad links
  – Access control
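
Since the title tag matters most, a discovery pass can flag pages that lack one. This is a minimal sketch using the standard-library `html.parser`; the function names and the sample pages are illustrative assumptions, and a real audit would read pages from the crawler rather than from a dictionary.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text inside the <title> element of an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html):
    """Return the page title with surrounding whitespace removed ('' if none)."""
    finder = TitleFinder()
    finder.feed(html)
    return finder.title.strip()


def untitled_pages(pages):
    """Given {url: html}, return the URLs whose pages have no usable title."""
    return [url for url, html in pages.items() if not page_title(html)]


# Hypothetical sample pages.
pages = {
    "/hr/expenses.html": "<html><head><title> Expense Reports </title></head></html>",
    "/tmp/draft.html": "<html><body><p>Work in progress</p></body></html>",
}
print(untitled_pages(pages))
```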

Search: Intranet Information Needs
Don’t assume you know: invest in asking
  – Wide target for surveys
  – Outlying offices
  – Key audiences
Data mining
  – Intranet user feedback
  – Search log analysis
  – Phone and trends

Common Intranet Searches
Employee and departmental contacts
HR issues
  – Holidays, benefits, evaluations, surveys
Office functions
  – Heating & cooling, training, menus
Technical information
  – Product data, support, services
Topical research (less frequent)

Real Intranet Usage Example
 3  business cards
 7  fedex
 8  webex
 9  expense report
11  training
12  401k
13  pto
14  accounts payable
15  holiday party
17  bereavement
18  payroll
20  holiday

Most Frequent Search Problems
Useful content not indexed
Confusing interfaces
Complicated query languages
Mysterious relevance ranking
Not enough human judgment
Excess complexity
Lack of user testing and log analysis

Defining Search Priorities
Identify pain points
  – Common information needs
  – Frequently-changing content
  – Confusing interfaces
Define audiences
  – Self-selected search users
  – People who have significant problems
Work with content creators
Do the easy stuff first

Discovery and Indexing
Index almost everything
  – Invest in understanding content
  – Find new valuable data
  – Avoid duplication
Work with content creators
  – Encourage focused pages with titles
Keep the index current
  – Update quickly in times of change
Hide old stuff in archives

Improve Basic Searching
Offer a search field in all navigation bars
  – Long search fields are best
  – Minimize complexity
Default to keyword matching
Simplify search results pages
  – Show intranet navigation
  – Provide a filled-in search box
  – Show match pages with context
  – Avoid clutter

Keep Search Metrics
Number of searches per day / week / month
  – Correlate with corporate trends
Percentage of frequent queries
  – Should go down if navigation improves
Problems
  – No-matches
  – Server errors
Audience information
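
The metrics above can be computed from a very plain log. The sketch below assumes each log entry is a `(query, match_count)` pair, which is an invented format for illustration; the sample queries echo the usage example, but the counts are made up.

```python
from collections import Counter


def search_metrics(log, top_n=3):
    """Summarize a search log given as (query, match_count) pairs."""
    queries = [query.strip().lower() for query, _ in log]
    total = len(queries)
    top = Counter(queries).most_common(top_n)
    # Queries that returned nothing are the "no-matches" problems.
    failed = Counter(
        query.strip().lower() for query, matches in log if matches == 0
    )
    return {
        "total_searches": total,
        "top_queries": top,
        # Share of all searches taken by the most frequent queries.
        "top_query_share": sum(n for _, n in top) / total if total else 0.0,
        "no_match_rate": sum(failed.values()) / total if total else 0.0,
        "frequent_failures": failed.most_common(top_n),
    }


# Hypothetical log; "holliday" is a misspelling that finds nothing.
log = [
    ("fedex", 12), ("FedEx", 12), ("fedex", 12), ("fedex", 12),
    ("pto", 4), ("pto", 4), ("pto", 4),
    ("holliday", 0), ("holliday", 0),
    ("401k", 7),
]

for name, value in search_metrics(log).items():
    print(name, value)
```

Tracking `top_query_share` over time gives a rough way to see whether navigation improvements are reducing the need to search for the same few things.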

Search Log Analysis
What people are looking for
  – What words do they use?
  – Are they getting good results?
What they click on
  – Candidates for search suggestions (best bets)
Improve taxonomy & controlled vocabulary
Analyze search and information architecture
  – Search default to "match all words"?
  – Add high-level navigation link?
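
Click data can nominate best-bet candidates for a human to review. This sketch assumes a click log of `(query, clicked_url)` pairs and an arbitrary threshold of two clicks; both are illustrative assumptions, and the output is a list of candidates, not a finished set of suggestions.

```python
from collections import Counter, defaultdict


def best_bet_candidates(click_log, min_clicks=2):
    """Map each query to its most-clicked URL, if it was clicked often enough."""
    clicks = defaultdict(Counter)
    for query, url in click_log:
        clicks[query.strip().lower()][url] += 1

    candidates = {}
    for query, url_counts in clicks.items():
        url, count = url_counts.most_common(1)[0]
        if count >= min_clicks:
            candidates[query] = url
    return candidates


# Hypothetical click log.
click_log = [
    ("pto", "/hr/time-off.html"),
    ("PTO", "/hr/time-off.html"),
    ("pto", "/hr/holidays.html"),
    ("webex", "/it/webex.html"),
]
print(best_bet_candidates(click_log))
```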

Continue Intranet Discovery
Track new content
Use APIs, including Web Services
  – Index CMSs and other data stores
Deal with date problems
Linguistics
  – Character set recognition and correct tokenization
  – Language recognition
Document attributes
Stemming

Security and Access Control
Be careful what you index
  – Reverse-engineering via search
HTTPS for showing SSL results
Access control & authentication
Search security design
  – Entire engine / index
  – Collection security
  – Hit-level (document) access control
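
Hit-level access control means filtering each result against the searcher's permissions before it is displayed. A minimal sketch, assuming every indexed document carries the set of groups allowed to read it; the `Hit` structure, group names, and URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    url: str
    title: str
    allowed_groups: frozenset  # groups permitted to read the document


def visible_hits(hits, user_groups):
    """Keep only the hits that at least one of the user's groups may read."""
    user_groups = set(user_groups)
    return [hit for hit in hits if hit.allowed_groups & user_groups]


# Hypothetical result list before filtering.
hits = [
    Hit("/hr/holidays.html", "Holiday Schedule", frozenset({"all-staff"})),
    Hit("/hr/salaries.html", "Salary Bands", frozenset({"hr-managers"})),
]

for hit in visible_hits(hits, {"all-staff"}):
    print(hit.title)
```

Filtering the whole hit, title and snippet included, matters because even a result title can leak confidential information.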

Simplify Searching
Minimal query expansion
  – Stemming (light pluralization)
  – Explain anything
Offer options, don't force them
  – Search suggestions (Best Bets)
  – Synonyms (can get 20% usage)
  – Spell-checking (can get 15% usage)
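
Offering options without forcing them can be sketched as a function that returns suggestions alongside the search instead of silently rewriting the query. The example uses `difflib.get_close_matches` from the standard library as a stand-in for a real spell-checker; the vocabulary, the synonym table, and the 0.8 cutoff are assumptions chosen for illustration.

```python
from difflib import get_close_matches

# Hypothetical index vocabulary and synonym table.
VOCABULARY = ["holiday", "payroll", "benefits", "training", "bereavement"]
SYNONYMS = {"pto": ["vacation", "paid time off"]}


def search_options(query, vocabulary=VOCABULARY, synonyms=SYNONYMS):
    """Return optional suggestions; the original query is left untouched."""
    q = query.strip().lower()
    options = {"did_you_mean": [], "related": synonyms.get(q, [])}
    # Only suggest spellings for terms the index does not already know.
    if q and q not in vocabulary:
        options["did_you_mean"] = get_close_matches(q, vocabulary, n=3, cutoff=0.8)
    return options


print(search_options("holliday"))
print(search_options("pto"))
```

The results page can then show "Did you mean ..." and "Related searches" links, leaving the choice with the user.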

Sometimes, Advanced Search Works

Relevance Ranking: KISS
Keep It Simple
  – No complex algorithms
  – Start with basic query word matches
Use Heuristics
  – Exact phrase match in title is usually best
  – Phrase matches are good
  – Metadata matches are good
  – Take advantage of intranet IA, taxonomies
  – Leverage human judgment
Transparency: mark match terms in context
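
The heuristics above can be expressed as a simple additive score. This is only a sketch: the weights (100, 40, 30, 5, 1) are arbitrary assumptions chosen so that a title phrase match outranks everything else, and the document fields `title`, `body`, and `keywords` are hypothetical.

```python
def heuristic_score(query, title, body, keywords=()):
    """Score one document using simple, explainable matching rules."""
    phrase = " ".join(query.lower().split())
    words = phrase.split()
    if not words:
        return 0

    title_l, body_l = title.lower(), body.lower()
    score = 0
    if phrase in title_l:  # exact phrase in title: usually best
        score += 100
    if any(phrase == k.lower() for k in keywords):  # metadata match
        score += 40
    if phrase in body_l:  # phrase match in text
        score += 30
    # Basic query word matches, weighted more heavily in the title.
    title_words, body_words = set(title_l.split()), set(body_l.split())
    score += 5 * sum(w in title_words for w in words)
    score += sum(w in body_words for w in words)
    return score


def rank(query, docs):
    """Sort documents from highest to lowest heuristic score."""
    return sorted(
        docs,
        key=lambda d: heuristic_score(
            query, d["title"], d["body"], d.get("keywords", ())
        ),
        reverse=True,
    )


# Hypothetical documents.
docs = [
    {"title": "Travel Policy", "body": "File an expense report after each trip."},
    {"title": "Expense Report Form", "body": "Use this form to claim costs."},
]
print([d["title"] for d in rank("expense report", docs)])
```

Because every point in the score comes from a named rule, it is easy to explain why one page outranks another, which is the point of keeping ranking simple.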

Improve Results Page Layout
Should fit with look and feel of intranet
Navigation
Search Results Header
  – Search field
  – Number and type of matches
  – Results navigation
Search Results Items
  – Use whatever content you have
  – Provide context for result
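
Providing context for a result usually means showing the text around the first match with the query terms marked. A minimal sketch using the standard `re` module; the square-bracket markers and the 40-character window are assumptions, and a real results page would emit HTML markup instead.

```python
import re


def snippet(text, query, width=40):
    """Return text around the first match, with query terms in [brackets]."""
    words = [re.escape(word) for word in query.split()]
    if not words:
        return text[: 2 * width]
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    match = pattern.search(text)
    if match is None:
        # No term found: fall back to the start of the text.
        return text[: 2 * width]
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text[start:end])


print(snippet("Submit your expense report by Friday.", "expense"))
```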

Problem Results Page

Better Results Page

Why Searches Fail
Vocabulary mismatch
Spelling errors
Wrong scope
Empty search
Query requirements not met
Software problems

Dealing with Search Failure
Improve the no-matches page
  – Standard design and navigation links
  – Display a search field
  – Describe contents covered on site and search
  – Link to specialized search engines
Log analysis
  – Track frequent failures
  – Add synonyms, suggestions or intranet content

Unhelpful No-Matches Pages

Better No-Matches Page

Search Engine Software Requirements
Flexible and configurable indexer
  – Integration and import modules for data sources
  – Current file formats (e.g. Acrobat 6)
Good defaults for interface, retrieval and relevance
Override default settings
Security & access control
Admin interface
Logging and analysis tools
Scalable

Search and Information Architecture
IA: the art and science of organizing and labeling information
Search provides ad-hoc access, reduces the need to organize everything perfectly
Search can take advantage of IA
  – Less duplication and overlap
  – Fewer gaping holes in coverage
  – Controlled vocabulary
  – Labels can explain search results

Search and Taxonomies
AKA ontologies, cataloging, categorization, classification, directories, hierarchies
Taxonomy: organizing information into levels of named categories, like Yahoo!
Vital to navigate within large data sets
No such thing as a finished taxonomy
  – A resource-intensive challenge
  – Language and requirements change
Multiple topic areas, multiple taxonomies

Search & Taxonomy Work Together
Search
  – Crosses categories
  – Supplements drill-down
  – Handles non-standard vocabulary
Taxonomy Categories
  – Create subset for precise search
  – Provide valuable context in search results
Refer to the same controlled vocabulary

Future Discovery & Indexing Tools
Integration with CMS and DMS
Metadata
  – Entity Extraction
  – Date extraction and tracking
  – Other facets
Automatic Chunking
  – Topical sections of long documents

New Tools for Better Search
Grouping results by location
Faceted Metadata Search / Browse
  – Expose available structure
  – Allow users to drill down intelligently
Federated search
  – Search across multiple engines
  – “Best Source” problem
Personalization: user control
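
Faceted browsing comes down to two operations: counting the metadata values present in a result set (exposing the available structure) and narrowing the set when the user picks a value (drilling down). The sketch below assumes results are plain dictionaries; the facet names `department` and `type` and the sample documents are invented for illustration.

```python
from collections import Counter


def facet_counts(results, facet_names):
    """Count how many results carry each value of each facet."""
    counts = {name: Counter() for name in facet_names}
    for doc in results:
        for name in facet_names:
            value = doc.get(name)
            if value is not None:
                counts[name][value] += 1
    return counts


def drill_down(results, **selected):
    """Keep only the results that match every selected facet value."""
    return [
        doc for doc in results
        if all(doc.get(name) == value for name, value in selected.items())
    ]


# Hypothetical result set with two metadata facets.
results = [
    {"title": "Benefits FAQ", "department": "HR", "type": "faq"},
    {"title": "Holiday List", "department": "HR", "type": "policy"},
    {"title": "VPN Setup", "department": "IT", "type": "faq"},
]

print(facet_counts(results, ["department", "type"]))
print(drill_down(results, department="HR", type="faq"))
```

Showing the counts next to each facet value lets users see where a click will lead before they narrow the results.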

In-Depth Research
Medical diagnosis
Scientific articles & experiments
Investment
Business intelligence
Market research
Patent searches
Journalism, sociology, history
Politics and current events

Research Requirements
Full recall: everything on a topic
Organize results
Save searches
Understand topic within context
Find the experts
Revise and extend queries
Share knowledge
Get alerts for new information

Tools for Better Research
Federated searching
  – Research and purchased reports
  – Databases and archives
  – News, RSS and other information streams
Complex query-building
Visualization
Networking
Collaboration
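
At its simplest, federated searching sends one query to several sources and merges the answers. The sketch below models each source as a function that returns a ranked list of URLs, then interleaves the lists round-robin and drops duplicates. The source names and results are hypothetical, and round-robin merging is a deliberately naive choice: it treats every source as equally trustworthy and does not decide which source is best for a given query.

```python
from itertools import zip_longest


def federated_search(query, engines, limit=10):
    """Query every engine and interleave their ranked results, deduplicated."""
    result_lists = [search(query) for search in engines.values()]
    merged, seen = [], set()
    # Take the first hit from each engine, then the second from each, etc.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged[:limit]


# Hypothetical sources; real ones would call a search engine or database.
engines = {
    "reports": lambda q: ["/reports/market-2004", "/reports/competitors"],
    "news": lambda q: ["/reports/competitors", "/news/today", "/news/archive"],
}

print(federated_search("competitors", engines))
```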

Checklist for Intranet Search
Keep researching user needs
Provide wide coverage in the index
Make the search field ubiquitous
Keep it simple and fast
Tune relevance ranking
Take advantage of IA and taxonomies
Offer suggestions
Usable results and no-matches pages
Search log analysis for continuous improvement

Apply the Right Tools
Simple search for the wide intranet
  – Rich indexing
  – Leverage metadata
  – Solve common problems
  – Tune for employee needs
Research tools when appropriate
  – Concepts and topics
  – Visualization
  – Networking