Language Recognition… Searching with Precision Santa Clara, CA October 31, 2001 Julian Henkin Vice President, Worldwide Customer Services LexiQuest, Inc. Booth # 523
Topics for Discussion Critical Nature of Search Importance of Linguistics Language Recognition Case Studies
Critical Nature of Search “At least one-third of your visitors are going to use the search function as soon as they enter your site.” - Improving Your Site’s Search, The Information Standard, August 11, 2000 “On average, professional users spend 11 hours per week looking for information. 71% said they could not find what they were looking for.” - “Information Management Software,” Lazard Freres & Co. LLC, February 2001 “Ultimately, the return on investment (ROI) of corporate information systems cannot be solely derived from the cost of building populating and maintaining these systems. True ROI also reflects the ability of all classes of users to effectively use the information.” - “Looking for a Lifesaver?”, KM Magazine, August 1999
Challenges with Today’s Search Traditional and advanced methods (keyword and Boolean searches, statistical and probability algorithms, concept agents, neural networks and pattern recognition) are limited in their ability to retrieve accurate results. They are not intuitive for the typical user, so their full breadth of capability is rarely utilized. They do not provide any level of “understanding” of the text or of the concepts represented by the queries. Search is based solely or largely on comparing the character strings in queries and text. Results often include a lot of “noise” (irrelevant results) and “silence” (relevant results that are not found). What if you don’t know what you are looking for?
Importance of Linguistics Linguistic-based systems are knowledge-sensitive: the more information there is in their “dictionaries”, the better the quality. A natural language interface is very intuitive for users and lets the system do the work. Up to a 400% improvement in performance over traditional search engines (greater relevance and precision). Can deliver multilingual and cross-lingual access.
How Does Language Recognition Work? CONCEPTS: Organizes concepts regardless of their language, e.g., Table (Fr), Table (Eng), Mesa (Sp), Tavola (It). SEMANTIC: Understands the meanings of words, e.g., book = to register for a future activity vs. book = set of bound sheets of paper. SYNTAX: Understands a sentence’s or phrase’s structure and the “roles” of words, e.g., subjects, verbs, objects; “to book” vs. “a book”. MORPHOLOGY: Word structure. Recognizes words (simple and compound), e.g., “to buy”, “bought”. The Ladder of Language
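The four rungs of the ladder can be illustrated with a toy sketch. This is only an illustration built from the slide's own examples: the lookup tables, function names, and the concept id `C_TABLE` are hypothetical and say nothing about how LexiQuest products are actually implemented.

```python
# Toy illustration of the four rungs. All tables are hypothetical,
# hand-made examples, not LexiQuest data or APIs.

# MORPHOLOGY: map inflected forms to a base form (lemma).
LEMMAS = {"bought": "buy", "buys": "buy", "books": "book", "booked": "book"}

# SEMANTIC: one lemma can carry several senses, keyed here by part of speech.
SENSES = {
    ("book", "verb"): "to register for a future activity",
    ("book", "noun"): "set of bound sheets of paper",
}

# CONCEPTS: words from different languages point at one shared concept id.
CONCEPTS = {
    ("table", "fr"): "C_TABLE",
    ("table", "en"): "C_TABLE",
    ("mesa", "es"): "C_TABLE",
    ("tavola", "it"): "C_TABLE",
}


def lemma(word):
    """Return the base form of a word, or the lowercased word if unknown."""
    return LEMMAS.get(word.lower(), word.lower())


def part_of_speech(previous_word):
    """SYNTAX (very crude): 'to' marks a verb, an article marks a noun."""
    prev = previous_word.lower()
    if prev == "to":
        return "verb"
    if prev in {"a", "an", "the"}:
        return "noun"
    return "unknown"


def sense(phrase):
    """Climb the ladder for a two-word phrase such as 'to book'."""
    previous_word, word = phrase.split()
    return SENSES.get((lemma(word), part_of_speech(previous_word)), "unknown")


def concept(word, language):
    """Return the language-independent concept id for a word, if known."""
    return CONCEPTS.get((word.lower(), language), "unknown")
```

With these tables, `sense("to book")` and `sense("a book")` resolve to different meanings, while `concept("mesa", "es")` and `concept("tavola", "it")` resolve to the same concept id.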
Five KM Activities 1. Personalization (Sharing) 2. Codification (Capture, Structured Storage) 3. Discovery (Search, Retrieval) 4. Creation, Innovation 5. Capture, Monitor. LexiQuest Mine, LexiQuest Categorize, LexiQuest Guide, LexiQuest Respond. “Knowledge Management is the collection of processes that govern the creation, dissemination, and utilization of knowledge.” “Knowledge is one, if not THE, principal factor that makes personal, organizational, and societal intelligent behavior possible.” “Organizations that have adopted this position (Chief Knowledge Officer) include Hoffman-LaRoche, GE Lighting, Xerox PARC, and several consultancies, including Ernst & Young, Gemini, and McKinsey.”
Suite of Capabilities Enterprise Document Databases, Web sites or Repositories. Domain 1: Limited amount of content. Domain 2: Limited amount of content. Domain 3: Significant amount and depth of content. Users who browse via a directory structure/taxonomy (many search engines now leverage a taxonomy for improved accuracy). Users who know what they are looking for and prefer using a search engine. Users who collectively ask the same narrow set of questions over and over again. Users who don’t know what they are looking for and need concepts illuminated (research). LexiQuest Mine, LexiQuest Respond, LexiQuest Categorize, LexiQuest Guide.
User Experience “Who are the main ISP’s in the Far East?” Linguistic Analysis Accurate Results: Taiwanese Access Service Provider
Mine: A Research Tool Electronic Commerce NEAR Fraud
Mine’s Native Search 89 Documents
Guide’s Linguistic Expansion “consumers’ fraud protection online” – 21 documents “Swindle” returns this relevant document
Pharmaceutical Example “Cipro” expands to ciprofloxacin hydrochloride
Pharmaceutical Example “What antibiotic treats anthrax?” Antibiotic expansions include ciprofloxacin, Cipro, ciprofloxacin hydrochloride
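The expansion idea in the fraud and pharmaceutical examples can be sketched as a simple lookup before matching. This is a minimal sketch under stated assumptions: the `EXPANSIONS` table is hand-made from the terms on these slides, and `expand_term` and `search` are hypothetical names, not LexiQuest Guide functions.

```python
# Hypothetical expansion table built from the terms shown on the slides.
EXPANSIONS = {
    "cipro": ["ciprofloxacin hydrochloride"],
    "antibiotic": ["ciprofloxacin", "cipro", "ciprofloxacin hydrochloride"],
    "fraud": ["swindle"],
}


def expand_term(term):
    """Return the term itself plus any related terms from the table."""
    key = term.lower()
    return [key] + EXPANSIONS.get(key, [])


def search(documents, term):
    """Return documents that contain the term or any of its expansions."""
    variants = expand_term(term)
    return [
        doc for doc in documents
        if any(variant in doc.lower() for variant in variants)
    ]
```

A plain string match for "fraud" would miss a document that only says "swindle"; with the expansion step, `search(docs, "fraud")` returns it.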
Quantitative Results 400% more accurate than current solutions. [Chart, 0% to 60% scale; labels: number of answers retrieved; % of correct answers of all answers retrieved; Custom; LexiQuest Guide; Search Engine] Ensures all relevant information is retrieved. Reduces “noise” from irrelevant results.
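The measure "% of correct answers of all answers retrieved" is what information retrieval usually calls precision, and the earlier "noise" and "silence" map onto set differences between retrieved and relevant results. A minimal sketch follows; the function names and document ids are illustrative assumptions, and no figures from the chart are reproduced.

```python
def precision(retrieved, relevant):
    """Share of retrieved answers that are correct."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Share of correct answers that were actually retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def noise(retrieved, relevant):
    """Irrelevant results that were returned anyway."""
    return retrieved - relevant


def silence(retrieved, relevant):
    """Relevant results that were not found."""
    return relevant - retrieved
```

For example, with `retrieved = {"d1", "d2", "d3", "d4"}` and `relevant = {"d1", "d2", "d5"}`, precision is 0.5, the noise is `{"d3", "d4"}`, and the silence is `{"d5"}`.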
Julian Henkin Vice President, Worldwide Customer Services 641 Lexington Ave, 30th Floor New York, NY x19 Booth # 523