Text Analytics World San Francisco – March 31, :15-4:45pm
Speaker: Bryan Bell, Executive Vice President, Expert System USA
What is in Your Business Requirement: Searching or Finding?
Enterprise Search Product Demonstration: The Google Search Appliance (GSA) integrated with a semantic technology platform.
1. Internal and external information comes at us faster than we can keep up with.
2. Business expectations for deploying solutions that use enterprise search and content navigation systems to capture the hidden value of strategic information.
3. CONTEXT: Exploiting deep linguistic analysis combined with semantics offers the ability to create contextually correct metadata.
4. Dynamically enrich content with contextually relevant metadata and deploy it as the heart of knowledge management applications and the Google Search Appliance.
1. Internal and external information comes at us faster than we can keep up with. 80 – 90% is unstructured text.
Zettabyte: 1,000,000,000,000,000,000,000 bytes
The Internet
The Google crawler visits 20 billion web sites a day. The search engine has located more than 30 trillion unique URLs. It processes 100 billion searches every month, which is 3.3 billion searches per day, or over 38,000 searches per second. A single Google query uses 1,000 computers to retrieve an answer.
This volume, combined with the PageRank algorithm,
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
is why Google is so good on the internet.
16% to 20% of queries that get asked every day have never been asked before.
Amit Singhal, Senior Vice President of development, Google Search, August 2012
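The PageRank formula on this slide can be sketched in a few lines. This is a minimal illustration, not Google's production system: it assumes the standard reading of the symbols (d is the damping factor, T1..Tn are the pages linking to A, C(T) is the number of outbound links on T), a hypothetical three-page link graph, and that every page has at least one outbound link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to.
    Assumption: every page has at least one outbound link.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    # Start every page at 1.0, as in the non-normalized form of the formula.
    pr = {page: 1.0 for page in pages}
    # inbound[p] lists the pages T that link to p.
    inbound = {page: [] for page in pages}
    for source, targets in links.items():
        for target in targets:
            inbound[target].append(source)
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[t] / len(links[t]) for t in inbound[page])
            for page in pages
        }
    return pr


# Hypothetical link graph: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page C ranks highest because it is linked from both A and B. The formula depends on a dense link graph, which the public web supplies.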
2. Deploying an internal enterprise search engine / content navigation system to capture and share the hidden value of the information that is available to the company.
The intranet / corporate portal
“Our search stinks! I want it to work like Google.”
Zettabyte: 1,000,000,000,000,000,000,000 bytes
Good news:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Don’t have 3.3 billion searches per day.
Don’t have 38,000 searches per second.
Don’t have 1,000 computers to retrieve an answer.
Zettabyte: 1,000,000,000,000,000,000,000 bytes
Key words / No metadata / Poor metadata / Inconsistent
= POOR CONTENT FINDABILITY
Key words vs. context: language ambiguity
stock, apple, Apple
People are able to disambiguate “on the fly”, but machines cannot.
Context is King
stock, apple, Apple
“I bought 10,000 shares of stock in Apple.”
“I have 10,000 apples in stock.”
People are able to disambiguate “on the fly”, but machines cannot.
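The ambiguity on this slide is easy to reproduce with plain keyword matching. The sketch below is a hypothetical, deliberately naive keyword matcher (lowercasing plus crude plural stripping), not the behavior of any particular search product. It shows that both sentences look identical to a query for "apple stock".

```python
import re


def normalize(text):
    """Lowercase, split into words, and strip a trailing plural 's' (naive)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}


def keyword_match(query, document):
    """True when every query keyword appears in the document."""
    return normalize(query) <= normalize(document)


sentences = [
    "I bought 10,000 shares of stock in Apple.",
    "I have 10,000 apples in stock.",
]

for sentence in sentences:
    print(keyword_match("apple stock", sentence), "-", sentence)
```

Both sentences match, although one is about equities and the other about fruit inventory. Keywords alone carry no context to tell them apart.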
3. Exploiting deep linguistic analysis, combined with semantics. 4. Dynamically enrich content with contextually relevant metadata. How is word context established?
Deep linguistic analysis combined with semantics.
It helps computers read information in much the same way as people.
Semantics is driven by words. Semantics is DNA for language.
How is word context established?
Deep linguistic analysis of words to achieve word disambiguation:
Morphological analysis | word forms | dog, dog-catcher, doggy bag
Grammatical analysis | parts of speech | "There are 40 rows in the table." (noun); "She rows 5 times a week." (verb)
Logical analysis | word relationships | "The car I bought, to replace my Chrysler, stinks."
Semantic analysis | word context | "I bought 10,000 shares of stock in Apple."; "I have 10,000 apples in stock."; "I used chicken broth for my soup stock."
How is word context established and deployed with the GSA?
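The semantic analysis step can be illustrated with a toy word-sense picker for "stock". This is a simplified context-overlap heuristic with hand-picked, hypothetical cue words. It is not the semantic engine demonstrated in this talk, which the slides describe as deep linguistic analysis rather than cue-word counting.

```python
import re

# Hypothetical cue words per sense of "stock", chosen by hand for this example.
SENSE_CUES = {
    "finance": {"bought", "sold", "shares", "market", "investor"},
    "inventory": {"have", "warehouse", "supply", "shelf"},
    "cooking": {"broth", "soup", "chicken", "recipe"},
}


def disambiguate_stock(sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(cues & tokens) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # Return None when no cue word at all appears in the sentence.
    return best if scores[best] > 0 else None


for sentence in [
    "I bought 10,000 shares of stock in Apple.",
    "I have 10,000 apples in stock.",
    "I used chicken broth for my soup stock.",
]:
    print(disambiguate_stock(sentence), "-", sentence)
```

A hand-built cue list breaks down quickly on real text, which is the case for a full linguistic and semantic analysis engine.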
Linguistic and semantic analysis engine
Linguistic analysis
Linguistic analysis: to bare arms
Case Study: GSA – Google Search Appliance
What is in Your Business Requirement? Searching or Finding?
Enhancing ROI from existing applications
Integrations and connectors will be used to bridge the gap. Semantics will be used to strengthen analytic capabilities and increase the ROI of platforms such as: Microsoft SharePoint, Google, i2 Analyst’s Notebook, …and more.
GSA with semantic metadata
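One way to picture "semantic metadata" reaching a search engine is as HTML meta tags written into a document before it is crawled. The sketch below rests on assumptions: the field names (`company`, `topic`) are hypothetical, the metadata is taken to come from an upstream semantic analysis step, and the search appliance is assumed to index meta tags found in HTML content. It does not show the integration used in the demonstration.

```python
from html import escape


def meta_tags(metadata):
    """Render {field: [values]} as HTML <meta> tags, one per value."""
    lines = []
    for name, values in metadata.items():
        for value in values:
            lines.append(
                '<meta name="{}" content="{}">'.format(
                    escape(name, quote=True), escape(value, quote=True)
                )
            )
    return "\n".join(lines)


# Hypothetical output of semantic analysis for
# "I bought 10,000 shares of stock in Apple."
enriched = {"company": ["Apple"], "topic": ["equities", "investing"]}
print(meta_tags(enriched))
```

With tags like these in place, a query can be restricted by field value rather than relying on the keywords in the body text alone.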
Contacts Thank you Bryan