Published by Ella Norman. Modified over 9 years ago.
1
Building a Foundation for Info Apps
Tom Reamy
Chief Knowledge Architect, KAPS Group
Program Chair – Text Analytics World
Knowledge Architecture Professional Services
http://www.kapsgroup.com
2
Agenda
Introduction
A Semantic Platform – What and Why
Text Analytics – What and Why
  – Getting Started with Text Analytics
Building on the Platform:
  – Search
  – Range of Apps
Conclusion
3
Introduction: KAPS Group
Knowledge Architecture Professional Services – Network of Consultants
Applied Theory – Faceted taxonomies, complexity theory, natural categories, emotion taxonomies
Services:
  – Strategy – IM & KM - Text Analytics, Social Media, Integration
  – Taxonomy/Text Analytics development, consulting, customization
  – Text Analytics Quick Start – Audit, Evaluation, Pilot
  – Social Media: Text-based applications – design & development
Partners
  – Smart Logic, Expert Systems, SAS, SAP, IBM, FAST, Concept Searching, Attensity, Clarabridge, Lexalytics
Clients:
  – Genentech, Novartis, Northwestern Mutual Life, Financial Times, Hyatt, Home Depot, Harvard Business Library, British Parliament, Battelle, Amdocs, FDA, GAO, World Bank, etc.
Presentations, Articles, White Papers – www.kapsgroup.com
4
Building a Foundation for Info Apps
What is a Semantic Platform?
Semantic Layer = Taxonomies, Metadata, Vocabularies + Text Analytics
  – adding cognitive science
Technology Layer – Search, Content Management, SharePoint, Intranets
Publishing process, multiple users & info needs
  – Hybrid human/automatic structure (tagging)
Infrastructure – Not an Application
  – Business / Library / KM / EA and IT
Building on the Foundation – Info Apps (Search-based Applications)
Foundation of foundation – Text Analytics
5
Building a Foundation for Info Apps
Why a Semantic Platform
Search Failed – lack of semantics
  – Results of Findwise survey – deep dissatisfaction
  – Ten years of development = ?
Content Management under-performing – lack of semantics
Taxonomy and Metadata – a solution, but failed
  – Taxonomy – formal model of a domain
  – Library science good for some things – indexing, etc.
Semantics is about language, meaning, information
  – And structure = taxonomy
Plus – Need cognitive science – how people think
  – Text Analytics
Solution = Strategic Vision + Quick Start
6
Building a Foundation for Info Apps
Text Analytics Features
Noun Phrase Extraction / Fact Extraction
  – Catalogs with variants, rule-based, dynamic
  – Relationships of entities
  – Ontologies of people-organizations, etc.
Sentiment Analysis
  – Products and Phrases
  – Statistics, Dictionaries, & rules
  – Positive and Negative
Summarization – replace snippets
Auto-categorization – built on a taxonomy
  – Training sets, Terms, Semantic Networks
  – Rules: AND, OR, NOT, DIST, PARAGRAPH, SENTENCE
  – Foundation – subjects, disambiguation, add intelligence to all
Ontologies – fact extraction + reasoning about relationships
Text Mining – NLP, machine learning, predictive analytics
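The rule operators listed for auto-categorization (AND, OR, NOT, DIST, SENTENCE) can be sketched in a few lines of Python. This is an illustrative sketch only, not any vendor's rule language: the helper names (`has_any`, `within_dist`, `same_sentence`), the `is_finance` rule, and its trigger terms are all hypothetical, chosen to show how NOT and DIST help disambiguate a word like "bank".

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def has_any(tokens, *terms):
    """OR: true if at least one term occurs in the token list."""
    return any(term in tokens for term in terms)

def within_dist(tokens, a, b, max_gap):
    """DIST: true if a and b occur within max_gap token positions of each other."""
    positions_a = [i for i, tok in enumerate(tokens) if tok == a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == b]
    return any(abs(i - j) <= max_gap for i in positions_a for j in positions_b)

def same_sentence(text, a, b):
    """SENTENCE: true if a and b occur together in one sentence (naive split)."""
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        if a in tokens and b in tokens:
            return True
    return False

def is_finance(text):
    """Hypothetical rule: (bank OR loan) AND NOT (river OR fishing)
    AND DIST(interest, rate, 2)."""
    tokens = tokenize(text)
    return (
        has_any(tokens, "bank", "loan")
        and not has_any(tokens, "river", "fishing")
        and within_dist(tokens, "interest", "rate", 2)
    )

print(is_finance("The bank raised its interest rate on every loan."))
print(is_finance("We went fishing on the river bank."))
```

A production rule language would add stemming, phrase matching, and weighting; the point here is only that each operator is a simple predicate that can be combined into a category rule.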
7
Building a Foundation for Info Apps
Adding Structure to Unstructured Content
How do you bridge the gap – taxonomy to documents?
Tagging documents with taxonomy nodes is tough
  – And expensive – central or distributed
Library staff – experts in categorization, not subject matter
  – Too limited, narrow bottleneck
  – Often don’t understand business processes and business uses
Authors – experts in the subject matter, terrible at categorization
  – Intra and Inter inconsistency, “intertwingleness”
  – Choosing tags from taxonomy – complex task
  – Folksonomy – almost as complex, wildly inconsistent
  – Resistance – not their job, cognitively difficult = non-compliance
Text Analytics is the answer(s)!
8
Building a Foundation for Info Apps
Adding Structure to Unstructured Content
Text Analytics and Taxonomy Together – Platform
  – Text Analytics provides the power to apply the taxonomy
  – And metadata of all kinds
  – Consistent in every dimension, powerful and economic
Hybrid Model
  – Publish Document -> Text Analytics analysis -> suggestions for categorization, entities, metadata -> present to author
  – Cognitive task is simple -> react to a suggestion instead of selecting from one's head or from a complex taxonomy
  – Feedback – if author overrides -> suggestion for new category
  – Facets – Requires a lot of Metadata - Entity Extraction feeds facets
Hybrid – Automatic is really a spectrum – depends on context
  – Automatic – adding structure at search results
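The hybrid model's flow (publish, analyze, suggest, author reacts, feedback) can be sketched as two small functions. This is a minimal sketch under stated assumptions: the taxonomy is modeled as a plain dictionary of category names to trigger terms, and `suggest_categories` and `apply_author_review` are hypothetical names, not part of any product mentioned in the deck.

```python
import re

# Hypothetical taxonomy: category name -> trigger terms.
TAXONOMY = {
    "finance": {"bank", "loan", "interest"},
    "legal": {"contract", "court", "liability"},
}

def suggest_categories(text, taxonomy):
    """Text analytics step: propose every category with a matching term."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(cat for cat, terms in taxonomy.items() if words & terms)

def apply_author_review(suggestions, rejected, added, taxonomy):
    """Author step: react to suggestions instead of tagging from scratch.

    Returns the final tags plus any author-supplied tags that are not in
    the taxonomy, which become candidates for new categories (feedback).
    """
    final = [s for s in suggestions if s not in rejected]
    final += [a for a in added if a not in final]
    new_category_candidates = [a for a in added if a not in taxonomy]
    return final, new_category_candidates

suggestions = suggest_categories("The bank signed a contract.", TAXONOMY)
tags, candidates = apply_author_review(
    suggestions, rejected={"legal"}, added=["compliance"], taxonomy=TAXONOMY
)
print(suggestions, tags, candidates)
```

The design choice worth noting is that the author only accepts, rejects, or adds; overrides are captured rather than discarded, so the taxonomy team can review them later.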
9
Quick Start for Text Analytics
Step 1: Start with Self Knowledge
Ideas – Content and Content Structure
  – Map of Content – Tribal language silos
  – Structure – articulate and integrate
  – Taxonomic resources
People – Producers & Consumers
  – Communities, Users, Central Team
Activities – Business processes and procedures
  – Semantics, information needs and behaviors
  – Information Governance Policy
Technology
  – CMS, Search, portals, text analytics
  – Applications – BI, CI, Semantic Web, Text Mining
10
Quick Start for Text Analytics
Step 2: Software Evaluation: Different Type of Evaluation
Traditional Software Evaluation - Start
  – Filter One - Ask Experts - reputation, research – Gartner, etc.
      Market strength of vendor, platforms, etc.
      Feature scorecard – minimum, must have, filter to top 6
  – Filter Two – Technology Filter – match to your overall scope and capabilities – Filter not a focus
  – Filter Three – In-Depth Demo – 3-6 vendors
Reduce to 1-3 vendors
Vendors have different strengths in multiple environments
  – Millions of short, badly typed documents, Build application
  – Library 200-page PDF, enterprise & public search
Essential Step – POC or Pilot – search or first Info App
11
Quick Start for Text Analytics
Step 3: Proof of Concept / Quick Start
POC – understand how text analytics can work in your environment
Learn the software – internal resources trained by doing
Learn the language – syntax (Advanced Boolean)
Learn categorization and extraction
Good categorization rules
  – Balance of general and specific
  – Balance of recall and precision
Develop or refine taxonomies for categorization
POC – can be the Quick Start or the First Application
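The balance of recall and precision can be made concrete by scoring a rule against a small hand-tagged test set. The sketch below uses the standard definitions (precision = correct hits / documents the rule returned; recall = correct hits / documents that truly belong). The document IDs and the two example rules are made up for illustration.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for a rule's output against hand-tagged truth."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical test set: documents a human tagged as belonging to the category.
relevant_docs = {"d1", "d2", "d5"}

# A general rule returns many documents; a specific rule returns few.
general_rule_hits = {"d1", "d2", "d3", "d4"}
specific_rule_hits = {"d1"}

print(precision_recall(general_rule_hits, relevant_docs))
print(precision_recall(specific_rule_hits, relevant_docs))
```

In this made-up example the general rule finds more of the relevant documents but also returns wrong ones, while the specific rule is always right but misses most of them, which is the trade-off a POC is meant to tune.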
12
Development, Implementation
Quick Start – First Application: Search and TA
Simple Subject Taxonomy structure
  – Easy to develop and maintain
Combined with categorization capabilities
  – Added power and intelligence
Combined with people tagging, refining tags
Combined with Faceted Metadata
  – Dynamic selection of simple categories
  – Allow multiple user perspectives
      Can’t predict all the ways people think
      Monkey, Banana, Panda
Combined with ontologies and semantic data
  – Multiple applications – Text mining to Search
  – Combine search and browse
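Faceted metadata with dynamic selection can be sketched as filtering on any combination of simple fields and then counting the values that remain. This is a toy illustration: the documents, the facet names (`subject`, `doc_type`, `region`), and the function names are assumptions, not taken from the deck.

```python
from collections import Counter

# Hypothetical tagged documents; each facet is one simple metadata field.
DOCS = [
    {"title": "Q3 loan review", "subject": "finance", "doc_type": "report", "region": "EU"},
    {"title": "Vendor contract", "subject": "legal", "doc_type": "contract", "region": "US"},
    {"title": "AML policy", "subject": "finance", "doc_type": "policy", "region": "US"},
]

def filter_by_facets(docs, **selected):
    """Keep documents that match every selected facet value."""
    return [d for d in docs if all(d.get(f) == v for f, v in selected.items())]

def facet_counts(docs, facet):
    """Count the values of one facet across the current result set."""
    return Counter(d[facet] for d in docs if facet in d)

finance_docs = filter_by_facets(DOCS, subject="finance")
print(facet_counts(finance_docs, "region"))
print([d["title"] for d in filter_by_facets(DOCS, subject="finance", region="US")])
```

Because users can start from any facet and narrow in any order, several simple facets support multiple perspectives without requiring one deep taxonomy that anticipates every path.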
13
Building a Foundation for Info Apps
What are Info Apps?
Search-based Applications Plus E-Discovery, Behavior Prediction, document duplication, BI & CI, etc.
Legal Review
  – Significant trend – computer-assisted review (manual = too many)
  – TA – categorize and filter to smaller, more relevant set
  – Payoff is big – One firm with 1.6 M docs – saved $2M
Expertise Location
  – Data (HR, project) plus text – authored documents – subject & level
Financial Services
  – Combine unstructured text (why) and structured data (what)
  – Anti-Money Laundering
14
Building a Foundation for Info Apps
Pronoun Analysis: Fraud Detection - Enron Emails
Patterns of “Function” words reveal wide range of insights
Function words = pronouns, articles, prepositions, conjunctions, etc.
  – Used at a high rate, short and hard to detect, very social, processed in the brain differently than content words
Areas: sex, age, power-status, personality – individuals and groups
Lying / Fraud detection: Documents with lies have
  – Fewer and shorter words, fewer conjunctions, more positive emotion words
  – More use of “if, any, those, he, she, they, you”, less “I”
  – More social and causal words, more discrepancy words
Current research – 76% accuracy in some contexts
Text Analytics can improve accuracy and utilize new sources
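The surface features named on this slide (word count, word length, rates of specific function words) are straightforward to compute. The sketch below only extracts those features; it does not classify anything as deceptive, and it is not the method behind the 76% figure. The marker list is copied from the slide, and the function name and output layout are assumptions.

```python
import re
from collections import Counter

# Function words named on the slide as more or less frequent in deceptive text.
MARKER_WORDS = ("i", "you", "he", "she", "they", "if", "any", "those")

def function_word_profile(text, markers=MARKER_WORDS):
    """Return token count, mean word length, and per-token rate of each marker."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {"token_count": 0, "mean_word_length": 0.0,
                "rates": {w: 0.0 for w in markers}}
    counts = Counter(tokens)
    return {
        "token_count": total,
        "mean_word_length": sum(len(t) for t in tokens) / total,
        "rates": {w: counts[w] / total for w in markers},
    }

profile = function_word_profile("I think they knew. If they did, I did not.")
print(profile["token_count"], profile["mean_word_length"], profile["rates"]["they"])
```

Features like these would normally be fed to a statistical model trained on labeled examples; on their own they are descriptive, not diagnostic.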
15
Conclusions
Info Apps are based on search, and search needs help
Text analytics with taxonomy & metadata = semantic platform
  – Formal and informal language and cognition
Semantic Infrastructure
  – Knowledge Audit -> Content, People, Technology, Processes
Strategic Vision
  – Integration of text analytics, search, content management
  – Hybrid Model of tagging – best of human & machine
  – Build integrated Info Apps
Platform vs. Apps = Yes
Think Big (Semantics), Build Small, Build Integrated
16
Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com