Text Analytics And Text Mining Best of Text and Data

Text Analytics And Text Mining
Best of Text and Data
  Tom Reamy
  Chief Knowledge Architect
  KAPS Group
  Knowledge Architecture Professional Services
  http://www.kapsgroup.com

Agenda
  Text Analytics Capabilities
  Text Analytics Applications
  Text Mining and Text Analytics
  Data and Unstructured Content
  Case Study – Text Mining for Taxonomy Development
  Conclusion

KAPS Group: General
  Knowledge Architecture Professional Services
  Virtual Company: Network of consultants – 8-10
  Partners – SAS, Smart Logic, Microsoft-FAST, Concept Searching, etc.
  Consulting, Strategy, Knowledge architecture audit
  Services:
    Text Analytics evaluation, development, consulting, customization
    Knowledge Representation – taxonomy, ontology, Prototype
    Metadata standards and implementation
    Knowledge Management: Collaboration, Expertise, e-learning
  Applied Theory – Faceted taxonomies, complexity theory, natural categories

Introduction to Text Analytics
Text Analytics Features
  Noun Phrase Extraction
    Catalogs with variants, rule based dynamic
    Multiple types, custom classes – entities, concepts, events
    Feeds facets
  Summarization
    Customizable rules, map to different content
  Fact Extraction
    Relationships of entities – people-organizations-activities
    Ontologies – triples, RDF, etc.
  Sentiment Analysis
    Statistical, rules – full categorization set of operators
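
To make the "catalogs with variants" idea concrete, here is a minimal Python sketch of catalog-based entity extraction. It is an illustration only, not how any product named in this deck works: the catalog entries, the function names (`build_pattern`, `extract_entities`) and the choice of a single case-insensitive regex are all assumptions made for the example.

```python
import re

# Hypothetical catalog: canonical entity -> (entity type, surface variants).
CATALOG = {
    "International Business Machines": (
        "organization",
        ["IBM", "I.B.M.", "International Business Machines"],
    ),
    "SAS Institute": ("organization", ["SAS", "SAS Institute"]),
}


def build_pattern(catalog):
    """Compile one regex that matches any variant, longest variants first."""
    lookup = {}
    for canonical, (_etype, variants) in catalog.items():
        for variant in variants:
            lookup[variant.lower()] = canonical
    # Longest-first ordering makes "SAS Institute" win over plain "SAS".
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(a) for a in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )
    return pattern, lookup


def extract_entities(text, catalog=CATALOG):
    """Return (canonical name, entity type, character offset) per match."""
    pattern, lookup = build_pattern(catalog)
    found = []
    for match in pattern.finditer(text):
        canonical = lookup[match.group(1).lower()]
        found.append((canonical, catalog[canonical][0], match.start()))
    return found
```

Grouping the returned tuples by entity type is one simple way such output could feed facets.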

Introduction to Text Analytics
Text Analytics Features
  Auto-categorization
    Training sets – Bayesian, Vector space
    Terms – literal strings, stemming, dictionary of related terms
    Rules – simple – position in text (Title, body, url)
    Semantic Network – Predefined relationships, sets of rules
    Boolean – Full search syntax – AND, OR, NOT
    Advanced – NEAR (#), PARAGRAPH, SENTENCE
  This is the most difficult to develop
  Build on a Taxonomy
  Combine with Extraction, Sentiment
  Foundation for best text analytics & combination
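
The rule-based side of auto-categorization (position in text, Boolean operators, NEAR) can be sketched in a few lines of Python. This is a toy under stated assumptions: the two categories, their rules and the helper names (`tokenize`, `near`, `categorize`) are invented for illustration and do not reproduce the rule syntax of any particular product.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near(tokens, a, b, distance):
    """NEAR operator: True if a and b occur within `distance` tokens."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


# Hypothetical rules; each receives the tokenized title and body.
RULES = {
    # NEAR combined with NOT, applied to the body.
    "Vaccine Safety": lambda doc: near(doc["body"], "vaccine", "reaction", 5)
    and "veterinary" not in doc["body"],
    # Position in text: "search" must be in the title (AND / OR on the rest).
    "Enterprise Search": lambda doc: "search" in doc["title"]
    and ("enterprise" in doc["title"] or "enterprise" in doc["body"]),
}


def categorize(title, body):
    """Return every category whose rule fires for this document."""
    doc = {"title": tokenize(title), "body": tokenize(body)}
    return [category for category, rule in RULES.items() if rule(doc)]
```

Even this toy shows why rule development is laborious: each category needs its own terms, exclusions and distances, tuned against real documents.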

Varieties of Taxonomy / Text Analytics Software
  Taxonomy Management
    Synaptica, SchemaLogic
  Full Platform
    SAS-Teragram, SAP-Inxight, Smart Logic, Data Harmony, Concept Searching, Expert System, IBM, GATE
  Content Management – embedded
  Embedded – Search
    FAST, Autonomy, Endeca, Exalead, etc.
  Specialty
    Sentiment Analysis, VOC – Lexalytics, Attensity / Reports
    Ontology – extraction, plus ontology

Text Analytics Applications
Platform for Multiple Applications
  Content Aggregation, Duplicate Documents – save millions!
  Business intelligence, Customer Intelligence
  Social Media – sentiment analysis, Voice of the Customer
  Social – Hybrid folksonomy / taxonomy / auto-metadata
  Social – expertise, categorize tweets and blogs, reputation
  Ontology – travel assistant, semantic web, etc.
  eDiscovery, Reputation management, Customer Experience
  Expertise Location, Crowd sourcing
  Technical support

Text Analytics Applications: Enterprise Search – Elements
  Text Analytics can “solve” enterprise search
  Multiple Knowledge Structures
    Facet – orthogonal dimension of metadata
    Taxonomy – Subject matter / aboutness
  Software – Search, ECM, auto-categorization, entity extraction, Text Analytics and Text Mining
  People – tagging, evaluating tags, fine-tuning rules and taxonomy
  Rich Search Results – context and conversation
  Platform for search-based applications

Text Analytics and Text Mining
Data and Unstructured Content
  80% of content is unstructured – adding to semantic web is major
  Text Analytics – content into data
  Big Data meets Big Content
  Real integration of text and ontology
    Beyond “hasDescription”
    Improve accuracy of extracted entities, facts – disambiguation
      Pipeline – oil & gas OR research / Ford
    Add Concepts, not just “Things” – 68% want this
  Semantic Web + Text Analytics = real world value
  Linked Data + Text Analytics – best of both worlds
  Build superior foundation elements – taxonomies, categorization
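
The "pipeline" example above (oil & gas versus research) is a word-sense disambiguation problem. One very simple way to approach it, sketched here in Python, is to score each sense by how many of its cue words appear near the ambiguous term. The cue lists in `SENSES`, the window size and the function name `disambiguate` are assumptions for illustration; a production system would use richer rules or a trained model.

```python
import re

# Hypothetical cue words suggesting each sense of "pipeline".
SENSES = {
    "oil_and_gas": {"oil", "gas", "crude", "drilling", "barrel", "refinery"},
    "research": {"drug", "clinical", "trial", "compound", "candidate", "research"},
}


def disambiguate(text, target="pipeline", window=8):
    """For each occurrence of `target`, pick the sense with most nearby cues.

    Returns one entry per occurrence: a sense name, or None when no cue
    word appears within `window` tokens on either side.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    results = []
    for i, token in enumerate(tokens):
        if token != target:
            continue
        context = set(tokens[max(0, i - window): i + window + 1])
        scores = {sense: len(context & cues) for sense, cues in SENSES.items()}
        best = max(scores, key=scores.get)  # ties go to the first sense listed
        results.append(best if scores[best] > 0 else None)
    return results
```

Note that the sketch does no stemming, so "trials" does not match the cue "trial"; that is one of many refinements a real implementation would need.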

Text Analytics and Text Mining and Data Mining
Vaccine Adverse Reaction
  Combine with Data Mining
  New sources of information
    News stories, medical records
    Blogs, social
  Find new connections, sources of knowledge
  Vaccine Adverse Effects – disease, symptoms, variables
    Unstructured text into a data source
    Some preliminary analysis, content structure
    Find unknown adverse effects and prevalence
  Drug Discovery + search / research – 5 year story

Text Analytics Applications Example – Vaccine Adverse Effects

Text Analytics and Text Mining
Case Study – Taxonomy Development
  Problem – 200,000 new uncategorized documents
  Old taxonomy – need one that reflects change in corpus
  Text mining, entity extraction, categorization
  Bottom Up – terms in documents – frequency, date
  Clustering – suggested categories
  Clustering – chunking for editors
  Time savings – only feasible way to scan documents
  Quality – important terms, co-occurring terms
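
The bottom-up step (term frequency and co-occurring terms) can be sketched as follows. This is a minimal Python illustration, not the tooling used in the case study: the stopword list, the regex tokenizer and the function names `terms` and `term_stats` are assumptions, and real work at the 200,000-document scale would add stemming, phrase detection and clustering on top.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "is", "on", "with"}


def terms(doc):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", doc.lower()) if t not in STOPWORDS]


def term_stats(docs):
    """Return (document frequency per term, co-occurrence count per term pair).

    Each term and each pair is counted at most once per document, and pairs
    are stored as alphabetically ordered tuples.
    """
    doc_freq = Counter()
    cooccur = Counter()
    for doc in docs:
        unique = sorted(set(terms(doc)))
        doc_freq.update(unique)
        cooccur.update(combinations(unique, 2))
    return doc_freq, cooccur
```

Calling `doc_freq.most_common(50)` and `cooccur.most_common(50)` on the result gives editors a ranked list of candidate taxonomy terms and term pairs to review.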

Text Analytics and Text Mining
Case Study – Taxonomy Development
  Text into Data:
    Article, Abstract, Title, Subtitle – fields & source of terms
    Add Data: PubDate, journalTitle, Taxonomy Node
  Terms – Map to frequency, date, date ranges, Taxonomy Node
    New Terms, Trends
  Relevance – frequency, Abstract, Title, human judgment
  Entity Extraction – Authors, Organizations, Products
  Categorization – build on clusters & taxonomy
  Combination – reports, visualizations, interactive explorations
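
Mapping terms to dates and spotting new terms can be illustrated with a short Python sketch. It assumes each record has been reduced to a publication year (standing in for the PubDate field) plus a text field such as the title or abstract; the function names `terms_by_year` and `new_terms` and the cutoff approach are hypothetical, chosen only to show the idea.

```python
import re
from collections import Counter, defaultdict


def terms_by_year(records):
    """Count, per year, how many documents mention each term.

    `records` is an iterable of (pub_year, text) pairs.
    """
    by_year = defaultdict(Counter)
    for year, text in records:
        by_year[year].update(set(re.findall(r"[a-z]+", text.lower())))
    return by_year


def new_terms(by_year, cutoff_year):
    """Terms seen in cutoff_year or later but never before it.

    Returns {term: number of recent documents containing it}.
    """
    old = set()
    recent = Counter()
    for year, counts in by_year.items():
        if year < cutoff_year:
            old.update(counts)
        else:
            recent.update(counts)
    return {term: n for term, n in recent.items() if term not in old}
```

Terms that appear only after the cutoff are candidates for new taxonomy nodes; the same per-year counts can be plotted to show trends.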

Case Study – Taxonomy Development

Conclusion
  The best is yet to come!
  Text Analytics impact is huge – solve information overload
  Enterprise Search and Search Based Applications: Save millions and enhance productivity
  Combination of Text Analytics & Text Mining – unlimited range of applications
  Mutual Enrichment – more data, add structure to unstructured
  Add Ontology = Richer Text Analytics – smarter, more useful
  Text Analytics + Text Mining + Semantic Web
  Move from theory to new practical applications

Questions?
  Tom Reamy
  tomr@kapsgroup.com
  KAPS Group
  Knowledge Architecture Professional Services
  http://www.kapsgroup.com