Published by Ada Harris. Modified over 9 years ago.
1
Text Analytics and Text Mining: Best of Text and Data
Tom Reamy, Chief Knowledge Architect
KAPS Group (Knowledge Architecture Professional Services)
2
Agenda
Text Analytics Capabilities
Text Analytics Applications
Text Mining and Text Analytics
Data and Unstructured Content
Case Study – Text Mining for Taxonomy Development
Conclusion
3
KAPS Group: General
Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 8-10
Partners – SAS, Smart Logic, Microsoft-FAST, Concept Searching, etc.
Consulting, Strategy, Knowledge architecture audit
Services:
  Text Analytics evaluation, development, consulting, customization
  Knowledge Representation – taxonomy, ontology, Prototype
  Metadata standards and implementation
  Knowledge Management: Collaboration, Expertise, e-learning
Applied Theory – Faceted taxonomies, complexity theory, natural categories
4
Introduction to Text Analytics: Text Analytics Features
Noun Phrase Extraction
  Catalogs with variants, rule based dynamic
  Multiple types, custom classes – entities, concepts, events
  Feeds facets
Summarization
  Customizable rules, map to different content
Fact Extraction
  Relationships of entities – people-organizations-activities
  Ontologies – triples, RDF, etc.
Sentiment Analysis
  Statistical, rules – full categorization set of operators
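As an illustration of the "catalogs with variants" idea above, here is a minimal Python sketch of catalog-based entity extraction. The catalog contents, the entity class labels, and the function names are all hypothetical; commercial platforms use far richer rule sets, so treat this only as one simple way the idea could work.

```python
import re

# Hypothetical catalog: canonical entity -> (entity class, known variants).
CATALOG = {
    "International Business Machines": ("organization", ["IBM", "I.B.M."]),
    "SAS Institute": ("organization", ["SAS"]),
}

def build_patterns(catalog):
    """Compile one case-sensitive, whole-word pattern per canonical entity."""
    patterns = {}
    for canonical, (entity_class, variants) in catalog.items():
        # Try longer names first so "SAS Institute" wins over "SAS".
        names = sorted([canonical] + variants, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in names)
        regex = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")
        patterns[canonical] = (entity_class, regex)
    return patterns

def extract_entities(text, patterns):
    """Return (canonical name, class, matched variant, offset) in text order."""
    hits = []
    for canonical, (entity_class, regex) in patterns.items():
        for match in regex.finditer(text):
            hits.append((canonical, entity_class, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[3])

PATTERNS = build_patterns(CATALOG)
sample = "IBM and SAS both sell text analytics platforms."
entities = extract_entities(sample, PATTERNS)
```

Mapping every variant back to one canonical name is what lets the extracted entities feed a facet: all mentions of an organization roll up to a single facet value.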
5
Introduction to Text Analytics: Text Analytics Features
Auto-categorization
  Training sets – Bayesian, Vector space
  Terms – literal strings, stemming, dictionary of related terms
  Rules – simple – position in text (Title, body, url)
  Semantic Network – Predefined relationships, sets of rules
  Boolean – Full search syntax – AND, OR, NOT
  Advanced – NEAR (#), PARAGRAPH, SENTENCE
This is the most difficult to develop
Build on a Taxonomy
Combine with Extraction, Sentiment
Foundation for best text analytics & combination
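The Boolean and proximity operators listed above (AND, OR, NOT, NEAR, SENTENCE) can be sketched in a few lines of Python. This is not any vendor's rule syntax; the helper names and the "product recall" category rule are assumptions made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(tokens, term_a, term_b, distance):
    """True if term_a and term_b occur within `distance` tokens of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(a - b) <= distance for a in pos_a for b in pos_b)

def same_sentence(text, term_a, term_b):
    """True if both terms occur in one sentence (naive split on . ! ?)."""
    for sentence in re.split(r"[.!?]+", text):
        tokens = set(tokenize(sentence))
        if term_a in tokens and term_b in tokens:
            return True
    return False

def is_product_recall(text):
    """Hypothetical rule: (recall NEAR/5 product) AND (safety OR defect) AND NOT memory."""
    tokens = tokenize(text)
    return (
        near(tokens, "recall", "product", 5)
        and ("safety" in tokens or "defect" in tokens)
        and "memory" not in tokens
    )

doc_hit = "The company announced a recall of the product after a safety defect was found."
doc_miss = "A memory test measures recall of product safety slogans."
```

Here `is_product_recall(doc_hit)` is true and `is_product_recall(doc_miss)` is false: the NOT clause rejects the second document even though its proximity and OR clauses match. Real rule sets grow much larger than this, which is one reason this approach is the hardest to develop.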
12
Varieties of Taxonomy / Text Analytics Software
Taxonomy Management
  Synaptica, SchemaLogic
Full Platform
  SAS-Teragram, SAP-Inxight, Smart Logic, Data Harmony, Concept Searching, Expert System, IBM, GATE
Content Management – embedded
Embedded – Search
  FAST, Autonomy, Endeca, Exalead, etc.
Specialty
  Sentiment Analysis, VOC – Lexalytics, Attensity / Reports
  Ontology – extraction, plus ontology
13
Text Analytics Applications: Platform for Multiple Applications
Content Aggregation, Duplicate Documents – save millions!
Business intelligence, Customer Intelligence
Social Media – sentiment analysis, Voice of the Customer
Social – Hybrid folksonomy / taxonomy / auto-metadata
Social – expertise, categorize tweets and blogs, reputation
Ontology – travel assistant, semantic web, etc.
eDiscovery, Reputation management, Customer Experience
Expertise Location, Crowd sourcing
Technical support
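The slide does not say how duplicate documents are found. One common technique, shown here purely as an assumed example, compares sets of word shingles with Jaccard similarity. The function names, the shingle size, and the 0.8 threshold are illustrative choices, not values from the presentation.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def find_near_duplicates(docs, threshold=0.8, size=3):
    """docs maps id -> text. Return (id_a, id_b, score) for pairs at or above threshold."""
    ids = sorted(docs)
    sets = {doc_id: shingles(docs[doc_id], size) for doc_id in ids}
    pairs = []
    for i, id_a in enumerate(ids):
        for id_b in ids[i + 1:]:
            score = jaccard(sets[id_a], sets[id_b])
            if score >= threshold:
                pairs.append((id_a, id_b, score))
    return pairs

docs = {
    "a": "The quarterly report is due on Friday morning.",
    "b": "The quarterly report is due on Friday morning!",
    "c": "Lunch menu for the cafeteria this week.",
}
duplicates = find_near_duplicates(docs)
```

The pairwise loop is quadratic in the number of documents, so a large collection would need an indexing scheme such as MinHash; that is outside the scope of this sketch.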
14
Text Analytics Applications: Enterprise Search – Elements
Text Analytics can “solve” enterprise search
Multiple Knowledge Structures
  Facet – orthogonal dimension of metadata
  Taxonomy – Subject matter / aboutness
Software – Search, ECM, auto-categorization, entity extraction, Text Analytics and Text Mining
People – tagging, evaluating tags, fine tune rules and taxonomy
Rich Search Results – context and conversation
Platform for search based applications
17
Text Analytics and Text Mining: Data and Unstructured Content
80% of content is unstructured – adding to semantic web is major
Text Analytics – content into data
  Big Data meets Big Content
Real integration of text and ontology
  Beyond “hasDescription”
Improve accuracy of extracted entities, facts – disambiguation
  Pipeline – oil & gas OR research / Ford
Add Concepts, not just “Things” – 68% want this
Semantic Web + Text Analytics = real world value
Linked Data + Text Analytics – best of both worlds
Build superior foundation elements – taxonomies, categorization
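The "pipeline" example above, an oil and gas sense versus a research sense, can be illustrated with a small context-window scorer in Python. The cue vocabularies, sense labels, and window size are assumptions invented for this sketch; a production system would more likely rely on categorization rules or a trained model.

```python
import re

# Hypothetical context vocabularies for two senses of "pipeline".
SENSE_CUES = {
    "oil_and_gas": {"oil", "gas", "crude", "barrel", "drilling", "refinery"},
    "research": {"drug", "clinical", "trial", "candidate", "research", "compound"},
}

def disambiguate(text, target="pipeline", window=8):
    """Pick the sense whose cue words occur most often near the target word.

    Returns None when the target is absent or no cue word is nearby.
    On a tie, the sense listed first in SENSE_CUES wins.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {sense: 0 for sense in SENSE_CUES}
    for i, token in enumerate(tokens):
        if token != target:
            continue
        # Look at up to `window` tokens on each side of the target.
        context = tokens[max(0, i - window): i + window + 1]
        for sense, cues in SENSE_CUES.items():
            scores[sense] += sum(1 for word in context if word in cues)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For instance, `disambiguate("The pipeline carries crude oil to the refinery.")` selects the oil and gas sense, while a sentence about drug candidates entering a clinical trial selects the research sense.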
18
Text Analytics and Text Mining and Data Mining: Vaccine Adverse Reaction
Combine with Data Mining
New sources of information
  News stories, medical records
  Blogs, social
Find new connections, sources of knowledge
Vaccine Adverse Effects – disease, symptoms, variables
Unstructured text into a data source
Some preliminary analysis, content structure
Find unknown adverse effects and prevalence
Drug Discovery + search / research – 5 year story
19
Text Analytics Applications Example – Vaccine Adverse Effects
22
Text Analytics and Text Mining: Case Study – Taxonomy Development
Problem – 200,000 new uncategorized documents
Old taxonomy – need one that reflects change in corpus
Text mining, entity extraction, categorization
Bottom Up – terms in documents – frequency, date
Clustering – suggested categories
Clustering – chunking for editors
Time savings – only feasible way to scan documents
Quality – important terms, co-occurring terms
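The bottom-up step above (term frequency and co-occurring terms) can be approximated with a short Python sketch. The stopword list, the sample documents, and the function names are assumptions for illustration; the case study itself used a text mining platform whose internals the slide does not describe.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "on", "with"}

def terms(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def document_frequency(docs):
    """Count how many documents each term appears in."""
    counts = Counter()
    for text in docs:
        counts.update(set(terms(text)))
    return counts

def cooccurrence(docs):
    """Count how many documents each unordered term pair appears in together."""
    pairs = Counter()
    for text in docs:
        # Sorting makes each pair a canonical (alphabetical) tuple.
        pairs.update(combinations(sorted(set(terms(text))), 2))
    return pairs

docs = [
    "Solar panel efficiency in cold climates",
    "Solar panel recycling",
    "Wind turbine efficiency",
]
df = document_frequency(docs)
co = cooccurrence(docs)
```

Frequent terms that also co-occur often, such as "solar" and "panel" here, are natural candidates for an editor to review as a suggested taxonomy node.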
23
Text Analytics and Text Mining: Case Study – Taxonomy Development
Text into Data:
  Article, Abstract, Title, Subtitle – fields & source of terms
Add Data: PubDate, journalTitle, Taxonomy Node
Terms – Map to frequency, date, date ranges, Taxonomy Node
New Terms, Trends
Relevance – frequency, Abstract, Title, human judgment
Entity Extraction – Authors, Organizations, Products
Categorization – build on clusters & taxonomy
Combination – reports, visualizations, interactive explorations
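To make "map terms to frequency and date" and "new terms, trends" concrete, the following Python sketch counts terms per publication year and lists terms that first appear in a given year. The field names `Title`, `Abstract`, and `PubDate` come from the slide; the record layout, the ISO date format, and the sample records are assumptions.

```python
import re
from collections import Counter, defaultdict

# Hypothetical records; PubDate is assumed to be an ISO "YYYY-MM-DD" string.
records = [
    {"Title": "Taxonomy design basics",
     "Abstract": "A taxonomy organizes content.",
     "PubDate": "2010-05-01"},
    {"Title": "Taxonomy and ontology",
     "Abstract": "An ontology adds relationships.",
     "PubDate": "2011-02-15"},
]

def term_counts_by_year(records, fields=("Title", "Abstract")):
    """Return {year: Counter of terms} built from the chosen text fields."""
    by_year = defaultdict(Counter)
    for record in records:
        year = int(record["PubDate"][:4])
        for field in fields:
            words = re.findall(r"[a-z]+", record.get(field, "").lower())
            by_year[year].update(words)
    return by_year

def new_terms(by_year, year):
    """Terms that occur in `year` but in no earlier year."""
    earlier = set()
    for other_year, counts in by_year.items():
        if other_year < year:
            earlier.update(counts)
    return sorted(set(by_year[year]) - earlier)

by_year = term_counts_by_year(records)
emerging = new_terms(by_year, 2011)
```

In this toy data "ontology" is new in 2011 while "taxonomy" is not. Over a large corpus, the same comparison highlights emerging vocabulary that an older taxonomy may be missing.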
24
Case Study – Taxonomy Development
29
Conclusion
Text Analytics impact is huge – solve information overload
Enterprise Search and Search Based Applications: Save millions and enhance productivity
Combination of Text Analytics & Text Mining – unlimited range of applications
Mutual Enrichment – more data, add structure to unstructured
Add Ontology = Richer Text Analytics – smarter, more useful
Text Analytics + Text Mining + Semantic Web
Move from theory to new practical applications
The best is yet to come!
30
Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services