Published by Vanessa Perkins. Modified over 9 years ago.
1
Content Categorization Tools
Taxonomies & Technologies for Infrastructure Solutions
Tom Reamy, Chief Knowledge Architect
KAPS Group, Knowledge Architecture Professional Services
http://www.kapsgroup.com
2
Agenda
KAPS Group & Categorization Research
The Answer is Taxonomy. What is the Problem?
Machine Categorization – Companies, Methods, Directions
The Place of Taxonomy in the Enterprise
  – Taxonomy as an infrastructure activity
  – Foundation for Content Management, Search, Portals, Smart Applications
3
KAPS Group
KAPS Background
  – Knowledge Architecture Consultants
  – Organize and contextualize content, communities, and tasks
  – Professional Services partner to Categorization Companies
Categorization research
  – Evaluated 20+ companies
  – More companies, more new technologies
  – The answer is categorization, not Google
4
The Answer is Taxonomy. What is the Problem?
Professionals spend more time looking for information than using it
Professionals spend up to 2 hours a day searching
Corporate Intranets Survey
  – Can’t find anything
  – Search Stinks - Can’t find good content
  – No good content
5
The Answer is Taxonomy. What is the Problem?
Infoglut: More information is being generated every day in modern companies than our entire corpus from the Athenian golden age
Quantity of information overwhelms our ability to present and classify it.
Search is not enough.
  – Humans search concepts, not strings
6
A Modest Proposal: A Solution to Infoglut
Bury all new content for 2,500 years
Lose most new content in a library fire
Unless you can convince a group of monks that your content is worth copying, it gets tossed
Dark Ages Solution: Stop writing for a thousand years
7
Infoglut: A Really Radical Solution
Hire librarians, editors, information architects to categorize your content
OR
Develop technologies that:
  – support and enhance the ability of authors and editors to characterize content
  – enhance the ability of users to find content
AND
Create a hybrid human/automatic solution
8
New Technologies: Categorization Explosion
Autonomy, Semio, Verity, Inxight, Topical Net, Mohomine, LingoMotors, H5Technologies, YellowBrix, Entopia, Bridgewell, MetaTagger, Applied Semantics, Sageware, SmartLogik, Inktomi/Quiver, Stratify, Vivisimo, Textology
Other - Tacit
9
Auto-Categorization: Methods
  – Semi-Automatic: Rules, If-Then
      Maximum precision & flexibility
  – Catalog by Example: Bayesian, SVM, Neural
      Training Sets (5-500)
      Speed, Learning
  – Statistical Clustering
      Set of Documents & Taxonomy Level
  – Semantic Analysis & World Knowledge
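The first two methods on this slide can be contrasted in a short sketch. The code below is only an illustration, not any vendor's implementation: the category names, rule keywords, tokenizer, and class names are all hypothetical. It shows an if-then rule categorizer next to a "catalog by example" categorizer, here assumed to be multinomial naive Bayes with add-one smoothing trained on a handful of labeled documents.

```python
import math
from collections import Counter, defaultdict

# Hypothetical taxonomy nodes and rule keywords, invented for illustration.
RULES = {
    "Legal/Policy": {"policy", "compliance", "contract"},
    "Product/Features": {"feature", "release", "specification"},
}


def tokenize(text):
    """Lowercase and split on anything that is not a letter (crude on purpose)."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def categorize_by_rules(text):
    """If-then rules: assign every category whose keywords appear in the text."""
    words = set(tokenize(text))
    return sorted(cat for cat, keywords in RULES.items() if words & keywords)


class NaiveBayesCategorizer:
    """Catalog by example: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()              # category -> training docs
        self.vocab = set()

    def train(self, text, category):
        tokens = tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        """Return the category with the highest log-probability, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cat in self.doc_counts:
            score = math.log(self.doc_counts[cat] / total_docs)
            denom = sum(self.word_counts[cat].values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((self.word_counts[cat][tok] + 1) / denom)
            if score > best_score:
                best, best_score = cat, score
        return best


nb = NaiveBayesCategorizer()
nb.train("travel expense reimbursement policy and compliance rules", "Legal/Policy")
nb.train("contract terms and legal policy review", "Legal/Policy")
nb.train("new feature list for the product release", "Product/Features")
nb.train("product specification and feature comparison", "Product/Features")

print(categorize_by_rules("The new compliance policy covers every contract."))
print(nb.classify("policy for contract review"))
```

The rule version is transparent and easy for an editor to adjust, which matches the slide's "maximum precision & flexibility"; the trained version needs example documents instead of hand-written rules.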
10
Origins of Auto-Categorization
News Feeds and Content providers
  – Uniform content, size and structure
  – Professional writers
  – Simple or standard vocabulary
Corporate intranet
  – Wildly varied content
  – Mix of good, bad, and ugly writers
  – Tower of Babel: Acronyms, special meanings
11
New Technologies: The Human Element
Automatic Categorization is Not
Humans are better, but not as consistent
  – Bring outside contexts to the document
      Purpose, similar documents, common sense
  – Understandable mistakes
Computers are faster and cheaper
  – Faster yes, Cheaper?
  – Cost of poorer quality categorization
      Intranet: 20,000 users taking 60 seconds longer = $20,000 a week
The Best Answer is Hybrid or Cyborg Categorization
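The $20,000-a-week figure can be reproduced with simple arithmetic, though the slide does not state its inputs. The sketch below assumes each of the 20,000 users loses 60 seconds once per week and that a loaded labor cost of $60 per hour applies; both values are assumptions chosen because they reproduce the slide's number, and other combinations would give the same total. The function name is hypothetical.

```python
def weekly_cost_of_slower_search(users, extra_seconds_per_week, hourly_cost):
    """Cost of time lost when every user spends extra seconds searching each week."""
    hours_lost = users * extra_seconds_per_week / 3600  # seconds -> hours
    return hours_lost * hourly_cost


# Assumed inputs (not stated on the slide): 60 s per user per week, $60/hour.
cost = weekly_cost_of_slower_search(users=20_000, extra_seconds_per_week=60, hourly_cost=60.0)
print(f"${cost:,.0f} per week")
```

If the extra 60 seconds were lost on every working day instead of once a week, the same function with `extra_seconds_per_week=300` would give five times the cost.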
12
Summary
No clear leader in categorization
No one has it all. Immature industry and pent-up demand
No out-of-the-box solutions: Support Distributed Hybrid
Look for Advanced Algorithms
  – Clustering, Auto-Summarization, noun phrase extraction
  – World Knowledge, import public & custom taxonomies
Integration – rules, metadata, components & product
  – CM, Search, Portals, Expertise, Collaboration, Applications
13
Location of Taxonomy in the Enterprise: An Infrastructure Activity
3 Infrastructures: Technological, Organizational, Intellectual
Technology
  – $Millions and 1,000’s of people
Organizational
  – Recognized Value fundamental to business activity
Intellectual
  – A couple of librarians
  – No budget
  – First to be laid off
15
Creating an Intellectual Infrastructure
Knowledge Audit / Knowledge Map
Knowledge Creating
  – Innovation, Content Management, E-learning
Knowledge Sharing / Transmission
  – Collaboration, Retrieval - content, experts
Knowledge Using
  – Smart Applications, CRM, Portals
Knowledge Architecture
People
16
Content Management and Taxonomy
Taxonomic Publishing Model
  – Publish by Category, not web site
  – Web Site the wrong unit of organization
Distributed Work Flow
  – Collaborative Categorization and keywords by Subject Matter Experts, aided by software
Content Re-Organization
  – Rich Web of Related Content
      Basic information + contexts
Content Re-Organization: Next Steps
  – Document can be wrong unit of organization
17
Taxonomy and Search
Knowledge Retrieval: Information + Contexts
Information Retrieval: ProductName
  – List of Documents, ranked by frequency of keyword
Knowledge Retrieval: ProductName
  – Personal & Community & Historical Filters
  – List of Documents – about product
  – Categorized list:
      Features of Product
      Comparisons of Products
      Legal / Policy documents
      Activities associated with product
  – Background Resources
      Glossaries, Communities
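The contrast between the two retrieval styles can be sketched in a few lines. This is a minimal illustration under stated assumptions: the mini-corpus, its titles, and the function names are invented, and each document is assumed to already carry one taxonomy category. It covers only the flat keyword-frequency ranking and the categorized list, not the personal, community, or historical filters.

```python
from collections import defaultdict

# Hypothetical mini-corpus; titles, text, and category labels are invented.
DOCS = [
    {"title": "Widget datasheet", "category": "Features of Product",
     "text": "widget widget widget specifications"},
    {"title": "Widget vs Gadget", "category": "Comparisons of Products",
     "text": "widget compared with gadget"},
    {"title": "Widget warranty terms", "category": "Legal / Policy documents",
     "text": "warranty terms for the widget widget"},
    {"title": "Cafeteria menu", "category": "Background Resources",
     "text": "soup and salad"},
]


def keyword_frequency(doc, keyword):
    """Count exact whole-word occurrences of the keyword in the document text."""
    return doc["text"].lower().split().count(keyword.lower())


def information_retrieval(docs, keyword):
    """Flat list of matching titles, ranked by keyword frequency."""
    hits = [d for d in docs if keyword_frequency(d, keyword) > 0]
    hits.sort(key=lambda d: keyword_frequency(d, keyword), reverse=True)
    return [d["title"] for d in hits]


def knowledge_retrieval(docs, keyword):
    """The same matches, grouped under their taxonomy category."""
    grouped = defaultdict(list)
    for d in docs:
        if keyword_frequency(d, keyword) > 0:
            grouped[d["category"]].append(d["title"])
    return dict(grouped)


print(information_retrieval(DOCS, "widget"))
for category, titles in knowledge_retrieval(DOCS, "widget").items():
    print(f"{category}: {titles}")
```

Both functions find the same documents; the difference is that the second presents them in the context of the taxonomy, so a user looking for policy documents does not have to scan the whole ranked list.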