Taxonomy Development Knowledge Structures

Presentation transcript:

Taxonomy Development Knowledge Structures Tom Reamy Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com

Agenda
Introduction
Knowledge Structures
Taxonomy Management Software
Exercises
Conclusion

Knowledge Structures
Lists of Keywords (Folksonomies)
Controlled Vocabularies, Glossaries
Thesaurus
Browse Taxonomies (Classification)
Formal Taxonomies
Faceted Classifications
Semantic Networks / Ontologies
Topic Maps
Knowledge Maps

Knowledge Structures
Lists of Keywords (Folksonomies)
Folksonomy (also known as collaborative tagging, social classification, social indexing, and social tagging) is the practice and method of collaboratively creating and managing tags to annotate and categorize content. Folksonomy describes the bottom-up classification systems that emerge from social tagging.[1] In contrast to traditional subject indexing, metadata is generated not only by experts but also by creators and consumers of the content. Usually, freely chosen keywords are used instead of a controlled vocabulary.[2] A folksonomy (from folk + taxonomy) is a user-generated taxonomy.

Knowledge Structures
Controlled Vocabularies, Glossaries
Lists with minimum structure
Easy to develop
Difficult to get value from
Simple reference resource
Thesaurus
Taxonomy-like
Less formal
BT (broader term), NT (narrower term) – also RT (related term)
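
The thesaurus relationships on this slide can be sketched as a small data structure. This is only an illustrative sketch: the `Thesaurus` class, its method names, and the sample terms are assumptions made for the example, not part of any taxonomy product. It assumes that recording a broader term (BT) also records the reciprocal narrower term (NT), and that related-term (RT) links are symmetric.

```python
from collections import defaultdict


class Thesaurus:
    """Hypothetical minimal thesaurus holding BT, NT and RT links."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)
        self.related = defaultdict(set)   # term -> its related terms (RT)

    def add_broader(self, term, broader_term):
        # Storing a BT link also stores the reciprocal NT link.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, term_a, term_b):
        # RT links are treated as symmetric.
        self.related[term_a].add(term_b)
        self.related[term_b].add(term_a)

    def entry(self, term):
        # .get() avoids creating empty entries for unknown terms.
        return {
            "BT": sorted(self.broader.get(term, set())),
            "NT": sorted(self.narrower.get(term, set())),
            "RT": sorted(self.related.get(term, set())),
        }


# Sample terms chosen only for illustration.
thesaurus = Thesaurus()
thesaurus.add_broader("Violins", "Instruments")
thesaurus.add_related("Violins", "Violinists")
violins_entry = thesaurus.entry("Violins")
instruments_entry = thesaurus.entry("Instruments")
```

Keeping the reciprocal links in step is the main thing a thesaurus adds over a flat glossary: a lookup on either term finds the other.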

Two Types of Taxonomies: Browse and Formal
Browse Taxonomy – Yahoo

Two Types of Taxonomies: Formal

Facets and Dynamic Classification
Facets are not categories
Entities or concepts belong to a category
Entities have facets
Facets are metadata – properties or attributes
Entities or concepts fit into one category
All entities have all facets – defined by set of values
Facets are orthogonal – mutually exclusive – dimensions
An event is not a person is not a document is not a place
Facets – variety – of units, of structure
Date or price – numerical range
Location – big to small (partonomy)
Winery – alphabetical
Hierarchical – taxonomic
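
A minimal sketch of how facets work as filters, assuming each entity is a record with one value per facet. The wine records, facet names, and function names below are invented for illustration. An exact-match facet is selected with a single value, and a numerical-range facet (such as price) with a `(low, high)` tuple.

```python
# Hypothetical catalog: every entity carries a value for every facet.
WINES = [
    {"name": "Hillside Red", "color": "red", "price": 18, "region": "Napa"},
    {"name": "Coastal White", "color": "white", "price": 15, "region": "Sonoma"},
    {"name": "Reserve Red", "color": "red", "price": 45, "region": "Napa"},
    {"name": "Valley Red", "color": "red", "price": 22, "region": "Sonoma"},
]


def matches(item, selections):
    """Return True if the item satisfies every selected facet value."""
    for facet, wanted in selections.items():
        value = item[facet]
        if isinstance(wanted, tuple):
            # Numerical range facet, inclusive on both ends.
            low, high = wanted
            if not (low <= value <= high):
                return False
        elif value != wanted:
            return False
    return True


def facet_search(items, selections):
    """Keep only the items that match all facet selections."""
    return [item for item in items if matches(item, selections)]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts


affordable_reds = facet_search(WINES, {"color": "red", "price": (10, 25)})
remaining_regions = facet_counts(affordable_reds, "region")
```

Because the facets are independent dimensions, selections can be applied in any order, and the counts for the remaining facets can be recomputed after each selection.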

Knowledge Structures
Semantic Networks / Ontologies
Ontology – more formal
XML standards – OWL, DAML
Semantic Web – machine understanding
RDF – triples of subject – predicate – object (roughly noun – verb – object)
Vice President is Officer
Build implications – from properties of Officer
Semantic Network – less formal
Represent large ontologies
Synonyms and variety of relationships
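
The "Vice President is Officer" example can be sketched with plain tuples standing in for triples. This is not an RDF or OWL implementation. The predicate names and the properties attached to Officer and Employee are assumptions added for illustration, and only simple inheritance along "is_a" links is modeled.

```python
# Each triple is (subject, predicate, object). Only the first triple comes
# from the slide; the others are hypothetical.
TRIPLES = [
    ("VicePresident", "is_a", "Officer"),
    ("Officer", "is_a", "Employee"),
    ("Officer", "can", "sign_contracts"),
    ("Employee", "has", "employee_id"),
]


def superclasses(term, triples):
    """Follow is_a links upward and collect every ancestor of the term."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is_a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def inferred_properties(term, triples):
    """Collect properties stated for the term or for any of its ancestors."""
    classes = [term] + superclasses(term, triples)
    return {
        (predicate, obj)
        for subject, predicate, obj in triples
        if subject in classes and predicate != "is_a"
    }


vp_ancestors = superclasses("VicePresident", TRIPLES)
vp_properties = inferred_properties("VicePresident", TRIPLES)
```

Nothing is stated directly about VicePresident except its class, yet both properties are derived for it, which is the kind of implication the slide refers to.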

Knowledge Structures: Ontology
[Diagram: nodes Instruments, Music, Bluegrass, Violins, Musicians, Violinists, connected by edges labeled "is a", "uses", and "create"]

Knowledge Structures
Topic Maps
ISO Standard
See www.topicmaps.org
Topic Maps represent subjects (topics) and associations and occurrences
Similar to semantic networks
Ontology defines the types of subjects and types of relationships
Combination of semantic network and other formal structures (taxonomy or ontology)

Knowledge Structure: Topic Maps

Knowledge Structures
Knowledge Maps
No standards – applied at high level
Ontologies plus / applied to specific environment
Map of Groups – Content Stores – Purpose – Technology
Add structure to each element
Facet Structure – filter by group – content – purpose
Strategic resource

Knowledge Structures: Which one to use?
Level 1 – keywords, glossaries, acronym lists, search logs: resources, inputs into upper levels
Level 2 – thesaurus, taxonomies: semantic resource – foundation for applications, metadata
Level 3 – facets, ontologies, semantic networks, topic maps: applications
Level 4 – knowledge maps: strategic resource

Web 2.0 – No need for Taxonomies etc.?
“Tags are great because you throw caution to the wind, forget about whittling down everything into a distinct set of categories and instead let folks loose categorizing their own stuff on their own terms.” – Matt Haughey, MetaFilter
Tyranny of the majority – worst type of central authority
More Madness of Crowds than Wisdom of Crowds
“Things fall apart; the centre cannot hold; Mere anarchy is loosed upon the world, … The best lack all conviction, while the worst Are full of passionate intensity.” – The Second Coming, W.B. Yeats

Advantages of Folksonomies
Simple (no complex structure to learn)
No need to learn difficult formal classification system
Lower cost of categorization
Distributes cost of tagging over large population
Open ended – can respond quickly to changes
Relevance – user’s own terms
Support serendipitous form of browsing
Easy to tag any object – photo, document, bookmark
Better than no tags at all
Getting people excited about metadata!

Folksonomies – Problems and Limits
Folksonomies don’t compare with taxonomies or ontologies
Serendipity browsing is small part of search
Limited areas of success – popular sites are popular
Quality content – finance, science, etc. – not good candidates
No mechanism for improving folksonomies
Scale – too big (million hits) – too little (200 items) – Amazon and LibraryThing
Need intrinsic value of tagging – not tagging for better tags
Bad tags – idiosyncratic or too broad, errors, limited reach
Most people can’t tag very well – learned skill

Del.icio.us Tags
Design blog software music tools reference art video programming webdesign web2.0 mac howto linux tutorial web free news photography shopping blogs css imported education travel javascript food games Development inspiration politics flash apple tips java google osx business windows iphone science productivity books toread health funny internet wordpress ajax ruby research humor fun technology search opensource Photoshop media recipes cool work article marketing security mobile jobs rails lifehacks tutorials resources php social download diy ubuntu freeware portfolio photo movies writing graphics youtube audio online

Del.icio.us – Folksonomy Findability
Too many hits (where have we heard that before?)
Design – 1 million, software – 931,259, sex – 129,468
No handling of plurals, no stemming (singular preferred)
Folksonomy – 14,073, folksonomies – 3,843, both – 1,891
Blog – 1.7M, blogs – 516,340, weblog – 155,917, weblogs – 36,434, blogging – 157,922, bloging – 697
Taxonomy – 9,683, taxonomies – 1,574
Personal tags – cool, fun, funny, etc.
Good for social research, not for finding documents or sites
How good for personal use? Funny is time dependent
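
The plural and spelling variants above are what a controlled vocabulary would fold together. The following is a rough sketch under stated assumptions: the `PREFERRED` map is hand-built for this example (it is not a real stemmer, and nothing like it is claimed to exist in Del.icio.us), and the function names are invented. The arithmetic uses the folksonomy counts quoted on the slide: items tagged with either variant equal the two counts minus the overlap.

```python
# Hypothetical variant map: tag as typed -> preferred (singular) form.
PREFERRED = {
    "folksonomies": "folksonomy",
    "blogs": "blog",
    "weblog": "blog",
    "weblogs": "blog",
    "blogging": "blog",
    "bloging": "blog",  # misspelling that appears among the slide's counts
}


def normalize(tag):
    """Lowercase a tag and map known variants to the preferred form."""
    key = tag.strip().lower()
    return PREFERRED.get(key, key)


def group_variants(tags):
    """Group raw tags under their preferred form, keeping input order."""
    groups = {}
    for tag in tags:
        groups.setdefault(normalize(tag), []).append(tag)
    return groups


def union_count(count_a, count_b, count_both):
    """Items carrying tag A or tag B, by inclusion-exclusion."""
    return count_a + count_b - count_both


grouped = group_variants(["blog", "Blogs", "bloging", "folksonomy"])
# Counts from the slide: folksonomy 14,073, folksonomies 3,843, both 1,891.
folksonomy_total = union_count(14073, 3843, 1891)
```

A search on "folksonomy" alone would miss the items tagged only with the plural, which is the findability gap the counts illustrate.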

LibraryThing
Book people aren’t much better at tagging
High-level concepts – psychology (55,000), religion (120,000), science (101,000)
Issue – variety of terms – cognitive science – need at least 40 other tags to cover the actual field of cognitive science
Strange tags – book (19,000) – it’s a book site?
Combination of facets and topics
Facets – Date (16th century, 1950’s, 2007) // Function (owned, not read) // Type (graphic novel, novel) // Genre (horror, mystery)
Topics – majority like Del.icio.us

LibraryThing – Book on Neuroscience
Tags (with counts): 1) (Location: dining room)(1) biological(1) biology(8) box74(1) Brain(1) brain research(1) brains(1) cognitive neuroscience(1) cognitive science(1) consciousness(1) currently reading(1) HelixHealth(1) kognitionswissenschaft(1) medical(1) medicine(1) neuroscience(19) non-fiction(5) partread(1) Psychology(4) Science(10) textbook(10) theory(1)
Too general: Science, Psychology, biology, textbook
Too specific: Location: dining room, box74
Facets: currently reading, partread

Better Folksonomies: Will social networking make tags better?
Not so far – example of Del.icio.us – same tags
Quality and popularity are very different things
Most people don’t tag, don’t re-tag
Study – folksonomies follow NISO guidelines – nouns, etc. – but do they actually work? – see analysis
Most tags deal with computers and are created by people who love to do this stuff – not regular users and infrequent users – Beware true believers!

Browse Taxonomies: Strengths and Weaknesses
Strengths:
Browse is better than search
Context and discovery
Browse by task, type, etc.
Weaknesses:
Mix of organization
Catalogs, alphabetical listings, inventories
Subject matter, functional, publisher, document type
Vocabulary and nomenclature issues
Problems with maintenance, new material
Poor granularity and little relationship between parts
Web site unit of organization
No foundation for standards

Formal Taxonomies: Strengths and Weaknesses
Strengths:
Fixed resource – little or no maintenance
Communication platform – share ideas, standards
Infrastructure resource
Controlled vocabulary and keywords
More depth, finer granularity
Weaknesses:
Difficult to develop and customize
Don’t reflect users’ perspectives
Users have to adapt to language

Faceted Navigation: Strengths and Weaknesses
Strengths:
More intuitive – easy to guess what is behind each door
20 questions – we know and use
Dynamic selection of categories
Allow multiple perspectives
Trick users into “using” Advanced Search: wine where color = red, price = x-y, etc.
Weaknesses:
Difficulty of expressing complex relationships
Simplicity of internal organization
Loss of browse context
Difficult to grasp scope and relationships
Limited domain applicability – type and size
Entities, not concepts, documents, web sites

Dynamic Classification / Faceted Navigation
Search and browse better than either alone
Categorized search – context
Browse as an advanced search
Dynamic search and browse is best
Can’t predict all the ways people think
Advanced cognitive differences
Panda, Monkey, Banana
Can’t predict all the questions and activities
Intersections of what users are looking for and what documents are often about
China and Biotech
Economics and Regulatory

Varieties of Taxonomy / Text Analytics Software
Taxonomy Management
Text Analytics: Auto-Categorization, Entity Extraction, Sentiment Analysis
Software Platforms: Content Management, Search
Application Specific: Business Intelligence

Vendors of Taxonomy / Text Analytics Software
Attensity, Business Objects – Inxight, Clarabridge, ClearForest, Data Harmony / Access Innovations, Lexalytics, Multi-Tes, Nstein, SchemaLogic, Teragram, Wikionomy, Wordmap, lots more

Why Taxonomy Software?
If you have to ask, you can’t afford it
Spreadsheets: good for calculations, but their days as a taxonomy development tool are over (almost)
Ease of use – more productive
Increase speed of taxonomy development
Better quality – synonyms, related terms, etc.
Distributed development – lower cost, user input (good and bad)

Text Analytics Software – Features
Taxonomy management functions
Entity extraction: multiple types, custom classes
Auto-categorization – taxonomy structure
Training sets – Bayesian, vector space
Terms – literal strings, stemming, dictionary of related terms
Rules – simple – position in text (title, body, URL)
Boolean – full search syntax – AND, OR, NOT
Advanced – NEAR (#), PARAGRAPH, SENTENCE
Advanced features: facts / ontologies / Semantic Web – RDF +
Sentiment analysis
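
The rule types listed above (position in text, Boolean, NEAR) can be illustrated with a toy categorizer. This sketch does not reproduce the rule syntax of any vendor product. The category names, the rule terms, and the helper functions are all hypothetical, and NEAR is assumed here to mean "within a given number of tokens".

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near(tokens, term_a, term_b, distance):
    """True if term_a and term_b occur within `distance` tokens of each other."""
    positions_a = [i for i, token in enumerate(tokens) if token == term_a]
    positions_b = [i for i, token in enumerate(tokens) if token == term_b]
    return any(abs(i - j) <= distance for i in positions_a for j in positions_b)


def categorize(title, body):
    """Apply three hypothetical rules and return the matching categories."""
    title_tokens = tokenize(title)
    body_tokens = tokenize(body)
    categories = []
    # Position rule: the term must appear in the title.
    if "taxonomy" in title_tokens:
        categories.append("Taxonomy")
    # Boolean rule: wine AND red AND NOT vinegar, evaluated on the body.
    if "wine" in body_tokens and "red" in body_tokens and "vinegar" not in body_tokens:
        categories.append("Red Wine")
    # Proximity rule: text NEAR(3) analytics, evaluated on the body.
    if near(body_tokens, "text", "analytics", 3):
        categories.append("Text Analytics")
    return categories


example = categorize(
    "Taxonomy software",
    "Text and content analytics for red wine reviews",
)
```

Rules of this kind are explicit and easy to audit, in contrast to training-set approaches, which learn category boundaries from example documents.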

Conclusion
Variety of information and knowledge structures
Important to know what will solve what
Taxonomies and facets are foundation elements
Build higher levels based on lower levels
Glossaries to taxonomies
Taxonomy to ontology / faceted navigation
Important to have good taxonomy and text analytics software (spreadsheets are OK for first draft)
Web 2.0 / folksonomies are not the answer

Resources
Books
Women, Fire, and Dangerous Things – George Lakoff
Knowledge, Concepts, and Categories – Koen Lamberts and David Shanks
The Stuff of Thought – Steven Pinker
Software
Tools & Techniques (Taxonomy Boot Camp)
Web Sites
Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/

Questions? Tom Reamy tomr@kapsgroup.com KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com