Text Analytics Workshop Tom Reamy Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services


1 Text Analytics Workshop
Tom Reamy, Chief Knowledge Architect
KAPS Group: Knowledge Architecture Professional Services
http://www.kapsgroup.com

2 Agenda
• Introduction
  - Elements & Infrastructure Platform
  - Semantics not technology
  - Infrastructure not project
  - Value of Text Analytics
• Evaluating Software
  - Two Phase Process
  - Designing the Team and Content Structures
• Development
  - Taxonomy, Categorization, Faceted Metadata
• Text Analytics Applications
  - Integration with Search and ECM
  - Platform for Information Applications

3 KAPS Group: General
• Knowledge Architecture Professional Services
• Virtual Company: Network of consultants (8-10)
• Partners: SAS, SAP, Microsoft-FAST, Concept Searching, etc.
• Consulting, Strategy, Knowledge architecture audit
• Services:
  - Taxonomy/Text Analytics development, consulting, customization
  - Technology Consulting: Search, CMS, Portals, etc.
  - Evaluation of Enterprise Search, Text Analytics
  - Metadata standards and implementation
  - Knowledge Management: Collaboration, Expertise, e-learning
  - Applied Theory: Faceted taxonomies, complexity theory, natural categories

4 Introduction to Text Analytics: Semantic Infrastructure - Elements
• Taxonomy
  - Thesauri, Controlled Vocabulary
• Metadata
  - Standard (Dublin Core) and Facets
• Basic Text Analytics
  - Categorization: Document Topics, Aboutness
  - Entity Extraction: noun phrases, feed facets
  - Summarization: beyond snippets
• Advanced Text Analytics
  - Fact extraction: ontologies
  - Sentiment Analysis: good, bad, and ugly
• What is in a Name
  - text analytics or ?

5 Introduction to Text Analytics: Taxonomy
• Thesauri, Controlled Vocabulary
  - Resources to build on
  - Indexing not categorization
• Taxonomy
  - Foundation for Categorization
  - Browse: classification scheme
  - Formal: Is-Child-Of, Is-Part-Of
  - Large taxonomies (MeSH): indexing all topics
  - Small is better: for categorization and faceted navigation
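The formal Is-Child-Of relation can be pictured with a small data structure. The following is a minimal sketch, not taken from the workshop: the term labels and the helper names `path_to_root` and `children_of` are hypothetical, and a production taxonomy tool would add Is-Part-Of links, synonyms, and persistence.

```python
# Is-Child-Of stored as a child -> parent mapping.
# All labels are hypothetical examples, not from the presentation.
IS_CHILD_OF = {
    "Products": None,  # root term has no parent
    "Software": "Products",
    "Search Software": "Software",
    "Content Management Software": "Software",
}


def path_to_root(term):
    """Return the browse path from the root down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = IS_CHILD_OF[term]
    return list(reversed(path))


def children_of(term):
    """Return the direct children of `term`, sorted by label."""
    return sorted(child for child, parent in IS_CHILD_OF.items() if parent == term)


if __name__ == "__main__":
    print(" > ".join(path_to_root("Search Software")))
    print(children_of("Software"))
```

A small, shallow structure like this is easy to use as a browse path or as the set of target categories for categorization rules, which is one way to read the slide's "small is better" point.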

6 Introduction to Text Analytics: Metadata
• Metadata standards
  - Dublin Core: mostly syntactic, not semantic
  - Description: static or dynamic (summarization)
  - Semantic: keywords, very poor performance
• Best Bets: high level categorization-search
  - Human judgments
• Audience: mixed results
  - Role, function, expertise, information behaviors
• Facets: classes of metadata
  - Standard: People, Organization, Document type-purpose
  - Specialized: methods, materials, products
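To make the split between syntactic Dublin Core fields and semantic facets concrete, here is a minimal sketch. The record layout, the sample documents, and the functions `filter_by_facets` and `facet_counts` are assumptions made for illustration; only the Dublin Core element names (`title`, `creator`, `date`, `format`) and the facet names come from the standard and the slide.

```python
from collections import Counter

# Hypothetical records: "dc" holds Dublin Core elements, "facets" holds
# the classes of metadata named on the slide. All values are invented.
DOCUMENTS = [
    {
        "dc": {"title": "Search Evaluation Report", "creator": "J. Smith",
               "date": "2010-03-01", "format": "application/pdf"},
        "facets": {"organization": ["Acme Corporation"],
                   "document_type": ["report"], "methods": ["survey"]},
    },
    {
        "dc": {"title": "Taxonomy Style Guide", "creator": "A. Jones",
               "date": "2010-04-15", "format": "text/html"},
        "facets": {"organization": ["Acme Corporation"],
                   "document_type": ["guideline"], "methods": []},
    },
    {
        "dc": {"title": "Customer Interview Notes", "creator": "J. Smith",
               "date": "2010-05-20", "format": "text/plain"},
        "facets": {"organization": ["Example Inc"],
                   "document_type": ["report"], "methods": ["interview"]},
    },
]


def filter_by_facets(docs, **wanted):
    """Return titles of documents that carry every requested facet value."""
    return [
        doc["dc"]["title"]
        for doc in docs
        if all(value in doc["facets"].get(facet, [])
               for facet, value in wanted.items())
    ]


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    counts = Counter()
    for doc in docs:
        counts.update(doc["facets"].get(facet, []))
    return counts


if __name__ == "__main__":
    print(filter_by_facets(DOCUMENTS, document_type="report"))
    print(facet_counts(DOCUMENTS, "document_type"))
```

The facet counts are what a faceted navigation interface would typically show next to each filter value.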

7 Introduction to Text Analytics: Text Analytics
• Categorization
  - Multiple techniques: examples, terms, Boolean
  - Built on a taxonomy
• Entity Extraction
  - Catalogs with variants, rule based dynamic
• Summarization
  - Rules: find sentences in a document
• Fact Extraction
  - Relationships of entities: people-organizations-activities
• Sentiment Analysis
  - Rules: adjectives & adverbs, not nouns
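Three of the techniques listed above can be sketched in a few lines each: Boolean categorization rules, catalog-based entity extraction with variants, and a word-list sentiment score. This is a toy illustration under stated assumptions, not how any particular vendor product works. The rule set, the catalog, the polarity list, and all function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word tokens as a set, for Boolean rule matching."""
    return set(re.findall(r"[a-z']+", text.lower()))


# Hypothetical Boolean rules: a category fires when every "all" term and
# at least one "any" term is present, and no "none" term is present.
RULES = {
    "Enterprise Search": {"all": {"search"},
                          "any": {"enterprise", "intranet"},
                          "none": {"job"}},
    "Content Management": {"all": {"content"},
                           "any": {"cms", "ecm", "management"},
                           "none": set()},
}


def categorize(text):
    words = tokens(text)
    return sorted(
        category for category, rule in RULES.items()
        if rule["all"] <= words
        and rule["any"] & words
        and not rule["none"] & words
    )


# Hypothetical entity catalog: canonical name -> known variants.
CATALOG = {
    "Acme Corporation": ["acme corporation", "acme corp", "acme"],
    "Example Inc": ["example incorporated", "example inc"],
}


def extract_entities(text):
    """Return canonical names whose variants appear as whole words."""
    found = set()
    for canonical, variants in CATALOG.items():
        for variant in variants:
            if re.search(r"\b" + re.escape(variant) + r"\b", text, re.IGNORECASE):
                found.add(canonical)
                break
    return sorted(found)


# Hypothetical polarity list of adjectives and adverbs (no nouns).
POLARITY = {"good": 1, "great": 1, "smoothly": 1,
            "bad": -1, "ugly": -1, "poorly": -1}


def sentiment(text):
    """Sum of word polarities; positive, negative, or zero."""
    return sum(POLARITY.get(word, 0) for word in re.findall(r"[a-z']+", text.lower()))


if __name__ == "__main__":
    sample = ("Acme Corp rolled out enterprise search, but the content "
              "management system works poorly.")
    print(categorize(sample))
    print(extract_entities(sample))
    print(sentiment(sample))
```

Real rule languages add proximity operators, weights, and negation handling; the point here is only that each technique is a set of editable rules layered on a taxonomy or catalog.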

8 Introduction to Text Analytics: Text Analytics
• Why Text Analytics?
  - Enterprise search has failed to live up to its potential
  - Enterprise content management has failed to live up to its potential
  - Taxonomy has failed to live up to its potential
  - Adding metadata, especially keywords, has not worked
• What is missing?
  - Intelligence: human level categorization, conceptualization
  - Infrastructure: integrated solutions, not technology or software
• Text Analytics can be the foundation that (finally) drives success
  - Search, content management, and much more

9 Text Analytics Platform: 4 Basic Contexts
• Ideas: Content Structure
  - Language and Mind of your organization
  - Applications: exchange meaning, not data
• People: Company Structure
  - Communities, Users
  - Central team: establish standards, facilitate
• Activities: Business processes and procedures
• Technology: CMS, Search, portals, taxonomy tools
  - Applications: BI, CI, Text Mining

10 Text Analytics Platform: The start and foundation - Knowledge Architecture Audit
• Knowledge Map: understand what you have, what you are, what you want
  - The foundation of the foundation
• Contextual interviews, content analysis, surveys, focus groups, ethnographic studies
• Category modeling
  - "Intertwingledness": learning new categories is influenced by other, related categories
• Natural level categories mapped to communities, activities
  - Novices prefer higher levels
  - Balance of informativeness and distinctiveness
• Living, breathing, evolving foundation is the goal

11 Text Analytics Platform - Benefits: IDC White Paper
• Time Wasted
  - Reformatting information: $5.7 million per 1,000 per year
  - Not finding information: $5.3 million per 1,000
  - Recreating content: $4.5 million per 1,000
• Small Percent Gain = large savings
  - 1%: $10 million
  - 5%: $50 million
  - 10%: $100 million
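The arithmetic behind these figures can be checked with a short calculation. The slide does not state the base to which the percentages apply; the $10, $50, and $100 million savings are consistent with a total cost of $1 billion, which is treated here as an assumption. Reading "per 1,000" as per 1,000 workers is also an assumption, and the function names are hypothetical.

```python
# Annual cost of wasted time per 1,000 people, in dollars (from the slide).
COST_PER_1000 = {
    "reformatting information": 5_700_000,
    "not finding information": 5_300_000,
    "recreating content": 4_500_000,
}

# Assumption: the base implied by "1% = $10 million" is $1 billion.
ASSUMED_BASE = 1_000_000_000


def total_per_1000():
    """Combined annual cost of the three categories per 1,000 people."""
    return sum(COST_PER_1000.values())


def savings(base, percent):
    """Dollar savings from recovering `percent` of `base`."""
    return base * percent // 100


def implied_headcount(base):
    """Headcount at which the per-1,000 costs add up to `base`."""
    return round(base * 1000 / total_per_1000())


if __name__ == "__main__":
    print(f"Total per 1,000: ${total_per_1000():,}")
    for percent in (1, 5, 10):
        print(f"{percent}% gain: ${savings(ASSUMED_BASE, percent):,}")
    print(f"Implied headcount: {implied_headcount(ASSUMED_BASE):,}")
```

Under these assumptions the three categories sum to $15.5 million per 1,000 people, so a $1 billion base corresponds to roughly 64,500 people.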

12 Text Analytics Platform - Benefits
• Findability within and outside the enterprise
  - Savings per year: $millions
• Rescue enterprise search and ECM projects
  - Add semantics to search
• Clean up enterprise content
  - Duplication and accurate categorization
• Improve the quality of information access
  - Finding the right information can save millions
• Build smarter applications
  - Social networking, locate expertise within the enterprise

13 Text Analytics Platform - Benefits
• Understand your customers
  - What they are talking about and how they feel about it
• Empower your employees
  - Not only more time, but they work smarter
• Understand your competitors
  - What they are working on, talking about
  - Combine unstructured content and rich data sources: more intelligent analysis

14 Text Analytics Platform - Dangers
• Text Analytics as a software project
• Not enough resources: to develop, to maintain and refine
• Wrong resources: SMEs, IT, Library
  - Need all of the above and taxonomists+
• Bad Design:
  - Start with bad taxonomy
  - Wrong taxonomy: too big or too flat
• Bad Categorization / Entity Extraction
  - Right kind of experience

15 Resources
• Books
  - Women, Fire, and Dangerous Things, George Lakoff
  - Knowledge, Concepts, and Categories, Koen Lamberts and David Shanks
  - The Stuff of Thought, Steven Pinker
• Web Sites
  - Text Analytics News: http://social.textanalyticsnews.com/index.php
  - Text Analytics Wiki: http://textanalytics.wikidot.com/

16 Resources
• Blogs
  - SAS, Manya Mayes, Chief Strategist: http://blogs.sas.com/text-mining/
• Web Sites
  - Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/
  - Whitepaper, CM and Text Analytics: http://www.textanalyticsnews.com/usa/contentmanagementmeetstextanalytics.pdf

17 Questions?
Tom Reamy, tomr@kapsgroup.com
KAPS Group: Knowledge Architecture Professional Services
http://www.kapsgroup.com

