Building a Foundation for Info Apps Tom Reamy Chief Knowledge Architect KAPS Group Program Chair – Text Analytics World Knowledge Architecture Professional.


1 Building a Foundation for Info Apps Tom Reamy Chief Knowledge Architect KAPS Group Program Chair – Text Analytics World Knowledge Architecture Professional Services http://www.kapsgroup.com

2 Agenda
• Introduction
• A Semantic Platform – What and Why
• Text Analytics – What and Why
  – Getting Started with Text Analytics
• Building on the Platform:
  – Search
  – Range of Apps
• Conclusion

3 Introduction: KAPS Group
• Knowledge Architecture Professional Services – Network of Consultants
• Applied Theory – Faceted taxonomies, complexity theory, natural categories, emotion taxonomies
• Services:
  – Strategy – IM & KM - Text Analytics, Social Media, Integration
  – Taxonomy/Text Analytics development, consulting, customization
  – Text Analytics Quick Start – Audit, Evaluation, Pilot
  – Social Media: Text based applications – design & development
• Partners – Smart Logic, Expert Systems, SAS, SAP, IBM, FAST, Concept Searching, Attensity, Clarabridge, Lexalytics
• Clients:
  – Genentech, Novartis, Northwestern Mutual Life, Financial Times, Hyatt, Home Depot, Harvard Business Library, British Parliament, Battelle, Amdocs, FDA, GAO, World Bank, etc.
• Presentations, Articles, White Papers – www.kapsgroup.com

4 Building a Foundation for Info Apps
What is a Semantic Platform?
• Semantic Layer = Taxonomies, Metadata, Vocabularies + Text Analytics – adding cognitive science
• Technology Layer – Search, Content Management, SharePoint, Intranets
• Publishing process, multiple users & info needs
  – Hybrid human automatic structure (tagging)
• Infrastructure – Not an Application
  – Business / Library / KM / EA and IT
• Building on the Foundation – Info Apps (Search-based Applications)
• Foundation of foundation – Text Analytics

5 Building a Foundation for Info Apps
Why a Semantic Platform
• Search Failed – lack of semantics
  – Results of Findwise survey – deep dissatisfaction
  – Ten years of development = ?
• Content Management under-performing – lack of semantics
• Taxonomy and Metadata – a solution, but it failed
  – Taxonomy – formal model of a domain
  – Library science good for some things – indexing, etc.
• Semantics is about language, meaning, information
  – And structure = taxonomy Plus
  – Need cognitive science – how people think
  – Text Analytics
• Solution = Strategic Vision + Quick Start

6 Building a Foundation for Info Apps
Text Analytics Features
• Noun Phrase Extraction / Fact Extraction
  – Catalogs with variants, rule based dynamic
  – Relationships of entities – Ontologies of people-organizations, etc.
• Sentiment Analysis – Products and Phrases
  – Statistics, Dictionaries, & rules
  – Positive and Negative
• Summarization – replace snippets
• Auto-categorization – built on a taxonomy
  – Training sets, Terms, Semantic Networks
  – Rules: AND, OR, NOT, DIST, PARAGRAPH, SENTENCE
  – Foundation – subjects, disambiguation, add intelligence to all
• Ontologies – fact extraction + reasoning about relationships
• Text Mining – NLP, machine learning, predictive analytics
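
Editor's illustration for slide 6: the rule operators listed under auto-categorization (AND, OR, NOT, DIST) can be pictured as small composable tests over a document's tokens. The sketch below is a minimal, assumed implementation in Python and is not the syntax of any vendor named in the deck; the taxonomy node names, the example rules, and the helper names (`tokenize`, `has`, `categorize`) are hypothetical. PARAGRAPH and SENTENCE scoping are left out for brevity.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def has(term):
    """Rule: the term occurs somewhere in the token list."""
    return lambda tokens: term in tokens


def AND(*rules):
    """Rule: every sub-rule matches."""
    return lambda tokens: all(rule(tokens) for rule in rules)


def OR(*rules):
    """Rule: at least one sub-rule matches."""
    return lambda tokens: any(rule(tokens) for rule in rules)


def NOT(rule):
    """Rule: the sub-rule does not match."""
    return lambda tokens: not rule(tokens)


def DIST(n, a, b):
    """Rule: terms a and b occur within n tokens of each other."""
    def check(tokens):
        pos_a = [i for i, t in enumerate(tokens) if t == a]
        pos_b = [i for i, t in enumerate(tokens) if t == b]
        return any(abs(i - j) <= n for i in pos_a for j in pos_b)
    return check


# Hypothetical taxonomy nodes mapped to hypothetical rules.
RULES = {
    "Finance/Money Laundering": AND(
        DIST(3, "money", "laundering"), NOT(has("fiction"))
    ),
    "Legal/E-Discovery": OR(
        has("ediscovery"), AND(has("legal"), has("review"))
    ),
}


def categorize(text):
    """Return the taxonomy nodes whose rule matches the text."""
    tokens = tokenize(text)
    return sorted(node for node, rule in RULES.items() if rule(tokens))
```

For example, `categorize("Legal review of documents")` matches only the hypothetical "Legal/E-Discovery" node. The NOT clause shows how a rule can trade some recall for precision by excluding an unwanted context.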

7 Building a Foundation for Info Apps
Adding Structure to Unstructured Content
• How do you bridge the gap – taxonomy to documents?
• Tagging documents with taxonomy nodes is tough
  – And expensive – central or distributed
• Library staff – experts in categorization, not subject matter
  – Too limited, narrow bottleneck
  – Often don’t understand business processes and business uses
• Authors – experts in the subject matter, terrible at categorization
  – Intra and Inter inconsistency, “intertwingleness”
  – Choosing tags from taxonomy – complex task
  – Folksonomy – almost as complex, wildly inconsistent
  – Resistance – not their job, cognitively difficult = non-compliance
• Text Analytics is the answer(s)!

8 Building a Foundation for Info Apps
Adding Structure to Unstructured Content
• Text Analytics and Taxonomy Together – Platform
  – Text Analytics provides the power to apply the taxonomy
  – And metadata of all kinds
  – Consistent in every dimension, powerful and economic
• Hybrid Model
  – Publish Document -> Text Analytics analysis -> suggestions for categorization, entities, metadata -> present to author
  – Cognitive task is simple -> react to a suggestion instead of select from head or a complex taxonomy
  – Feedback – if author overrides -> suggestion for new category
  – Facets – Requires a lot of Metadata - Entity Extraction feeds facets
• Hybrid – Automatic is really a spectrum – depends on context
  – Automatic – adding structure at search results
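
Editor's illustration for slide 8: the hybrid model (publish, analyze, suggest, author reacts, overrides feed back) can be sketched as a tiny workflow. This is a minimal Python sketch under stated assumptions: `suggest_tags` is a keyword-matching stand-in for a real text analytics engine, and `TaggingSession`, `review`, and the sample category names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TaggingSession:
    """State for one published document moving through the hybrid workflow."""
    suggestions: list
    accepted: list = field(default_factory=list)
    new_category_candidates: list = field(default_factory=list)


def suggest_tags(text, keyword_map):
    """Stand-in for the text analytics step: suggest a category
    when any of its keywords appears in the text (substring match)."""
    lowered = text.lower()
    return sorted(
        category
        for category, keywords in keyword_map.items()
        if any(keyword in lowered for keyword in keywords)
    )


def review(session, author_choices):
    """Record the author's final tags. Any tag the author adds that was
    not suggested is logged as a candidate for a new category."""
    for tag in author_choices:
        session.accepted.append(tag)
        if tag not in session.suggestions:
            session.new_category_candidates.append(tag)
    return session


# Hypothetical keyword map standing in for a taxonomy with rules.
KEYWORD_MAP = {"Search": ["search"], "Taxonomy": ["taxonomy", "facet"]}

session = TaggingSession(suggest_tags("Faceted search design", KEYWORD_MAP))
review(session, ["Search", "Governance"])
```

Here the author keeps one suggestion and adds "Governance", which was not suggested, so it lands in `new_category_candidates` for the taxonomy team to review. The author only reacts to a short list rather than browsing the whole taxonomy.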

9 Quick Start for Text Analytics
Step 1: Start with Self Knowledge
• Ideas – Content and Content Structure
  – Map of Content – Tribal language silos
  – Structure – articulate and integrate
  – Taxonomic resources
• People – Producers & Consumers
  – Communities, Users, Central Team
• Activities – Business processes and procedures
  – Semantics, information needs and behaviors
  – Information Governance Policy
• Technology
  – CMS, Search, portals, text analytics
  – Applications – BI, CI, Semantic Web, Text Mining

10 Quick Start for Text Analytics
Step 2: Software Evaluation: Different Type of Evaluation
• Traditional Software Evaluation - Start
  – Filter One – Ask Experts - reputation, research – Gartner, etc.
    • Market strength of vendor, platforms, etc.
    • Feature scorecard – minimum, must have, filter to top 6
  – Filter Two – Technology Filter – match to your overall scope and capabilities – Filter not a focus
  – Filter Three – In-Depth Demo – 3-6 vendors
• Reduce to 1-3 vendors
• Vendors have different strengths in multiple environments
  – Millions of short, badly typed documents, Build application
  – Library 200 page PDF, enterprise & public search
• Essential Step – POC or Pilot – search or first Info App

11 Quick Start for Text Analytics
Step 3: Proof of Concept / Quick Start
• POC – understand how text analytics can work in your environment
• Learn the software – internal resources trained by doing
• Learn the language – syntax (Advanced Boolean)
• Learn categorization and extraction
• Good categorization rules
  – Balance of general and specific
  – Balance of recall and precision
• Develop or refine taxonomies for categorization
• POC – can be the Quick Start or the First Application
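
Editor's illustration for slide 11: "balance of recall and precision" can be made concrete by scoring a categorization rule against a hand-tagged test set during the POC. Precision is the share of documents the rule tagged that were actually relevant; recall is the share of relevant documents the rule found. The Python sketch below assumes documents are identified by IDs; the function name and the sample IDs are hypothetical.

```python
def precision_recall(predicted, relevant):
    """Score one categorization rule against a hand-tagged test set.

    predicted: IDs of documents the rule assigned to the category
    relevant:  IDs of documents a human reviewer assigned to it
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical test set: the rule tags four documents, two of them correctly,
# and misses one document the reviewer considered relevant.
precision, recall = precision_recall(
    predicted={"d1", "d2", "d3", "d4"},
    relevant={"d1", "d2", "d5"},
)
```

In this made-up case precision is 2/4 and recall is 2/3. Broadening the rule (more general terms) would tend to raise recall and lower precision, while tightening it does the reverse, which is the balance the slide refers to.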

12 Development, Implementation
Quick Start – First Application: Search and TA
• Simple Subject Taxonomy structure
  – Easy to develop and maintain
• Combined with categorization capabilities
  – Added power and intelligence
• Combined with people tagging, refining tags
• Combined with Faceted Metadata
  – Dynamic selection of simple categories
  – Allow multiple user perspectives
    • Can’t predict all the ways people think
    • Monkey, Banana, Panda
• Combined with ontologies and semantic data
  – Multiple applications – Text mining to Search
  – Combine search and browse
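
Editor's illustration for slide 12: "dynamic selection of simple categories" is what faceted search does when each document carries several independent metadata fields. The Python sketch below is a minimal, assumed example; the documents, the facet names (`subject`, `doc_type`, `audience`), and the helper functions are all hypothetical and only show the mechanics.

```python
from collections import Counter

# Hypothetical documents, each tagged on three independent facets.
DOCS = [
    {"id": 1, "subject": "Search", "doc_type": "White paper", "audience": "IT"},
    {"id": 2, "subject": "Search", "doc_type": "Presentation", "audience": "Business"},
    {"id": 3, "subject": "Taxonomy", "doc_type": "Presentation", "audience": "Library"},
]


def filter_docs(docs, **selected):
    """Keep documents matching every selected facet value."""
    return [
        doc for doc in docs
        if all(doc.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(docs, facet):
    """Count how many documents fall under each value of one facet."""
    return Counter(doc[facet] for doc in docs if facet in doc)


# One user starts from subject, another from document type; both are valid paths.
by_subject = filter_docs(DOCS, subject="Search")
by_type_then_subject = filter_docs(DOCS, doc_type="Presentation", subject="Search")
remaining_subjects = facet_counts(filter_docs(DOCS, doc_type="Presentation"), "subject")
```

Because the facets are independent, users can combine them in any order, which is one way to support multiple perspectives without predicting them in a single hierarchy.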

13 Building a Foundation for Info Apps
What are Info Apps?
• Search-based Applications Plus
• E-Discovery, Behavior Prediction, document duplication, BI & CI, etc.
• Legal Review
  – Significant trend – computer-assisted review (manual = too many)
  – TA – categorize and filter to smaller, more relevant set
  – Payoff is big – One firm with 1.6 M docs – saved $2M
• Expertise Location
  – Data (HR, project) plus text – authored documents – subject & level
• Financial Services
  – Combine unstructured text (why) and structured data (what)
  – Anti-Money Laundering

14 Building a Foundation for Info Apps
Pronoun Analysis: Fraud Detection - Enron Emails
• Patterns of “Function” words reveal wide range of insights
• Function words = pronouns, articles, prepositions, conjunctions, etc.
  – Used at a high rate, short and hard to detect, very social, processed in the brain differently than content words
• Areas: sex, age, power-status, personality – individuals and groups
• Lying / Fraud detection: Documents with lies have
  – Fewer and shorter words, fewer conjunctions, more positive emotion words
  – More use of “if, any, those, he, she, they, you”, less “I”
  – More social and causal words, more discrepancy words
• Current research – 76% accuracy in some contexts
• Text Analytics can improve accuracy and utilize new sources
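
Editor's illustration for slide 14: the signals listed for lying and fraud detection (word count, word length, rate of words such as "if, any, those, he, she, they, you", and rate of "I") are simple to compute as features. The Python sketch below only extracts those features; it is not a fraud detector, it does not reproduce the 76% figure, and the function name, feature names, and sample sentence are hypothetical. Any thresholds or classifier built on these features would need real labeled data.

```python
import re
from collections import Counter

# Words the slide reports as used more often in documents containing lies.
MARKER_WORDS = {"if", "any", "those", "he", "she", "they", "you"}


def function_word_features(text):
    """Compute per-document features based on the patterns listed on the slide."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {
            "word_count": 0,
            "avg_word_length": 0.0,
            "marker_rate": 0.0,
            "first_person_rate": 0.0,
        }
    counts = Counter(tokens)
    marker_total = sum(counts[word] for word in MARKER_WORDS)
    return {
        "word_count": total,
        "avg_word_length": sum(len(t) for t in tokens) / total,
        "marker_rate": marker_total / total,   # share of "if, any, those, ..."
        "first_person_rate": counts["i"] / total,  # share of "I"
    }


features = function_word_features("I think they said you could go")
```

The sample sentence has seven tokens, two of which ("they", "you") are marker words and one of which is "I". Comparing such rates across a large set of emails is the kind of analysis the slide describes.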

15 Conclusions
• Info Apps based on search and search needs help
• Text analytics with taxonomy & metadata = semantic platform
  – Formal and informal language and cognition
• Semantic Infrastructure
  – Knowledge Audit -> Content, People, Technology, Processes
    • Strategic Vision
  – Integration of text analytics, search, content management
  – Hybrid Model of tagging – best of human & machine
  – Build integrated Info Apps
• Platform vs. Apps = Yes
• Think Big (Semantics), Build Small, Build Integrated

16 Questions? Tom Reamy tomr@kapsgroup.com KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com

