Published by Jared Warner. Modified over 9 years ago.
1
Implementing a Taxonomy in a Content Management Portal
Content Week 2005, Miami, Florida
Monday, January 31, 2005
Workshop H, 2:45 pm – 4:45 pm
Marjorie M.K. Hlava
Access Innovations, Inc.
505-998-0800
mhlava@accessinn.com
www.accessinn.com
2
Introductions
Name
Project
Expectations for these two short hours
Please fill in the sign-up sheet
Would you like
– 1. Copy of this presentation?
– 2. Sample software?
– 3. Other information?
3
Copyright © 2005 Access Innovations, Inc.
What will we talk about this afternoon?
1. Definitions
2. Where taxonomy fits in the Information Circle
3. Where to use a taxonomy
4. Taxonomies for Communities of Practice
5. Surrounding theories and applications
6. How to build and maintain
7. How it is used in enterprise information
4
[Diagram: Implementing a Taxonomy in a Content Management Portal. Components shown: Data Feed, Thesaurus Master, MAI to add Metadata, Database Management System (Add Metadata using MAI), Search, Inverted File]
5
1. Definitions
6
What is a taxonomy?
A hierarchical thesaurus with authority terms applied at the final node
A browse-able web interface
A Linnaean System
A browse-able list with the term instance at the final leaf
7
Types of Taxonomies
Naming and organizing things into groups that share similar characteristics
1. Flat – just a list
2. Hierarchical – Taxonomic view
3. Faceted
– Sorted by a single characteristic
– Metadata - Dublin Core
– COSATI - GILS
4. Thesaurus
– Term records
– Database backend
– Easier to modify and maintain
8
Taxonomy in meta data
Definition
– Taxonomy is a thesaurus in its hierarchical view with the authority files applied at the final nodes
– It allows the browse-able front end to a portal
– It provides keyword and name access to the content in the portal
9
Taxonomy definition
A taxonomy is a thesaurus in hierarchical view with authority file terms added at the final nodes
Thesaurus
Authority file
Hierarchical form
Final nodes
10
Thesaurus
Concepts
Methods
Procedures
Cognitive approach
The knowledge capture piece
The topics or subjects
11
Authority file
People
Places
Things
The tangible approach
Concrete
Entities
12
Hierarchical view
Gives the Portal view
The view of all the preferred terms in categorized order
An outline of the thesaurus
13
Final Nodes
The last position on the hierarchical tree
– Taxonomy concept
– narrower terms
» final node - people, place or thing term
» document instance
» Letter to George Wiesman Dec 12, 2003
» Technical report number TR-1039
» Museum artifact 1706 wooden wagon wheel
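As a rough illustration of final nodes, the Python sketch below stores a small hierarchy as nested (label, children) tuples, prints it as an outline, and collects the final nodes. The concept names (Documents, Letters, Reports) are hypothetical; the two document instances are the examples given on the slide.

```python
def outline(node, depth=0):
    """Return the hierarchical view as indented lines, one line per node."""
    label, children = node
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines


def final_nodes(node):
    """Return the labels of all nodes that have no children."""
    label, children = node
    if not children:
        return [label]
    found = []
    for child in children:
        found.extend(final_nodes(child))
    return found


# Hypothetical concepts; the final nodes are document instances.
taxonomy = ("Documents", [
    ("Letters", [
        ("Letter to George Wiesman Dec 12, 2003", []),
    ]),
    ("Reports", [
        ("Technical report number TR-1039", []),
    ]),
])

print("\n".join(outline(taxonomy)))
print(final_nodes(taxonomy))
```

The outline is the "portal view"; the final-node list is what a browse interface would show at the last level.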
14
Term Records – the Database Part
Associative terms
– Related terms
Equivalence terms
– Preferred and non preferred
– Use and used for
– Synonyms
Hierarchical terms
– Broader narrower terms
– Parent Child
15
Other term record fields
Scope notes
Cross references
History
Term Status
Category
User defined
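A term record can be pictured as a small data structure. The sketch below is one possible layout, not the schema of any particular thesaurus product: the class name, field names, and sample terms are all assumptions. It also checks a common integrity rule, that broader/narrower links are reciprocal and related-term links are symmetric.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One thesaurus term record (field names are illustrative)."""
    term: str
    broader: list = field(default_factory=list)   # hierarchical: parent terms
    narrower: list = field(default_factory=list)  # hierarchical: child terms
    related: list = field(default_factory=list)   # associative terms
    used_for: list = field(default_factory=list)  # equivalence: non-preferred synonyms
    scope_note: str = ""
    status: str = "approved"


def missing_reciprocals(records):
    """List (term, relation, target) links whose reverse link is absent."""
    by_term = {r.term: r for r in records}
    problems = []
    for rec in records:
        for parent in rec.broader:
            if parent not in by_term or rec.term not in by_term[parent].narrower:
                problems.append((rec.term, "BT", parent))
        for child in rec.narrower:
            if child not in by_term or rec.term not in by_term[child].broader:
                problems.append((rec.term, "NT", child))
        for other in rec.related:
            if other not in by_term or rec.term not in by_term[other].related:
                problems.append((rec.term, "RT", other))
    return problems


# Hypothetical sample records; Highways lacks the reverse related-term link.
records = [
    TermRecord("Vehicles", narrower=["Automobiles"]),
    TermRecord("Automobiles", broader=["Vehicles"],
               used_for=["Cars"], related=["Highways"]),
    TermRecord("Highways"),
]
print(missing_reciprocals(records))  # [('Automobiles', 'RT', 'Highways')]
```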
16
2. Where does a taxonomy fit in the information circle?
17
Information Circle - Overview
[Diagram: a circle linking Taxonomy, User, Content, and Output]
18
Content
Web Pages
White Papers
Research Reports
Licensed Data Feeds
Intranet
Internal Reports
Lotus Notes files
Databases
Public Relations Documents/Press Releases
Market Research Reports
Customer Relationship Management (CRM)
HR Files
Accounting/Financial Records
Legal Documents
Patents
Museum artifacts
19
Content – cont’d
Content Creation:
HTML – Meta name / Keywords
DB – Field / Meta tag / Element
XML – Entity table for valid values
20
Taxonomy is applied to new and existing content:
Meta Tags
Thesaurus Terms
Authority Terms
Date
Author
Description
etc.
Rule Base
Taxonomy
21
Taxonomy – cont’d
Index data
- Manually
- Automatically
Suggest new candidate terms
Review
22
Output
Searchable Data
- Internal Data
- External Data
23
User
Web Browsing/Searching
Database Browsing/Searching
Query Resolution
24
User – cont’d
User Input
- Suggested Candidate Terms
- New Documents
Reports Based on User Search
- Search Logs
- Null Hits
(These will also suggest new candidate terms)
25
New Content
[Diagram: a circle linking Taxonomy, User, New Content, and Output]
The cycle begins again
26
Information Circle - Overview
[Diagram: a circle linking Taxonomy, User, Content, and Output]
27
3. Where to use a taxonomy
Link the Taxonomy and Indexing
Always in sync with the industry
Keep up to date with terminology
Automatically index the old data
Filter newsfeeds
Search using the taxonomy
File using the taxonomy
Spell check using the taxonomy
Link to translation system
Catalog using the taxonomy
Index a book
31
Thesaurus Master
33
Database Management System - Add Metadata using MAI
Portal Searching
Search
Inverted File: Aardvark, Alligator, Apple, Advantage, …. Zebra
Record locator: Accessinn.com/12345/demofile/recid15
Database records, each with many elements
34
Portal Searching
Search
Inverted File: Aardvark, Alligator, Apple, Advantage, …. Zebra
Record locator: Accessinn.com/12345/demofile/recid15
Database records, each with many elements
Many databases can be reached
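The search path on these slides (word list, then record locator, then database record) is the classic inverted file. A minimal Python sketch follows, assuming whitespace tokenization and made-up locators and record text; a production search engine would do far more.

```python
from collections import defaultdict


def build_inverted_file(records):
    """Map each lowercase word to the sorted locators of records containing it."""
    postings = defaultdict(set)
    for locator, text in records.items():
        for word in text.lower().split():
            postings[word].add(locator)
    # Sort words alphabetically, as in the Aardvark ... Zebra list.
    return {word: sorted(locs) for word, locs in sorted(postings.items())}


# Locators and record text are invented for illustration.
records = {
    "demofile/recid15": "zebra and aardvark habitats",
    "demofile/recid16": "apple orchards",
    "demofile/recid17": "alligator and zebra",
}

inverted = build_inverted_file(records)
print(inverted["zebra"])  # ['demofile/recid15', 'demofile/recid17']
```

A lookup returns record locators only; the portal then fetches the full database records, each with many elements, from wherever they live.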
35
4. Taxonomies for Communities of Practice
36
Taxonomies in a Community of Practice
Nature of Communities of Practice (CoP)
Taxonomies in context
Value of taxonomies
Creating a taxonomy
Applying the taxonomy
37
Nature of CoPs
Free flowing, loosely structured
Simple, ad hoc categorization
Active CoPs need organization
Search tends to be hit-or-miss
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA
38
Taxonomies in Context
A taxonomy aspires to be:
– a correlation of the different functional, regional and (possibly) national languages used by a community of practice
– a support mechanism for navigation
– a support tool for search engines and knowledge maps
– an authority for tagging documents and other information objects
– a knowledge base in its own right
Reference: “Taxonomies: the vital tool of information architecture”, www.tfpl.com
39
Value of Taxonomies
Improves organization & structure
Facilitates navigation
Facilitates knowledge discovery
Reduces effort
Saves time
“Taxonomies are better created by professional indexers or librarians than by domain experts.”
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA
40
Naval Postgraduate School’s Homeland Security Taxonomy (1)
41
Naval Postgraduate School’s Homeland Security Taxonomy (2)
42
IBM Insight graphical view
44
Applying a Taxonomy (1)
Manually
– Add terms into meta data fields
– Design navigation & site indexes with taxonomy hierarchy
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA
45
Incorporating Hierarchical Classification from a Taxonomy Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA
46
Applying a Taxonomy (2)
System integration
– Search & retrieval systems
– Auto-assignment of metadata
– Categorization systems
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA
47
Applying the Taxonomy to a Digital Library
[Diagram labels: Web portal; Locally held documents; Public repositories; Commercial data sources; Agency data sources; INTERNET (public); spiders; Meta-Search Tool; Filtered content; Search engine; Automated categorization; Library catalogs]
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA
48
5. Surrounding theories and applications
49
Other Vocabulary types
Uncontrolled lists
Classification System
Subject headings
Controlled vocabulary – usually synonyms and spelling
Authority files
Thesaurus
Taxonomy
50
Uncontrolled list - defined
Add terms as they occur
No cross reference
Simple flat structure
51
Controlled term lists - defined
State the preferred terms
Provide allowed term entry
Heavily cross referenced
Not generally hierarchical
Popular
Easy to create
52
Controlled term list - format
Cars – use Automobiles
Personal Computer – use Microcomputer
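In code, a controlled term list of this kind is a lookup from entry term to preferred term. The sketch below uses the two entries from the slide; the function name and the case-insensitive matching are assumptions made for illustration.

```python
# Entry term (lowercase) -> preferred term, taken from the slide.
USE = {
    "cars": "Automobiles",
    "personal computer": "Microcomputer",
}


def to_preferred(term):
    """Return the preferred term for a known entry term, else the term unchanged."""
    cleaned = term.strip()
    return USE.get(cleaned.lower(), cleaned)


print(to_preferred("Cars"))               # Automobiles
print(to_preferred("personal computer"))  # Microcomputer
print(to_preferred("Bicycles"))           # Bicycles
```

Applying the same lookup at indexing time and at search time is what lets users type either form and still reach the same records.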
53
Classification vs Subject Headings
Classification
– single spot or placement
– browse physical list
– often a numbering system
– clear hierarchy
– no or few cross references
54
Classification vs Subject Headings
Subject headings
– generic search
– hidden classification system
– related terms and cross references in heavy use
– Usually the inverted form, e.g. cells, electric
– Alphabetic access
55
Authority systems - defined
Lists of terms in the preferred format for use
Frequently have cross references
Widely available
Frequently coded lists
Brand names
56
Authority lists - examples
ISO Country Name and Code
– International Organization for Standardization (ISO)
ISO Language list
NAICS (SIC)
– Standard Industrial Classification Code (SIC)
– Replaced by the North American Industry Classification System (NAICS)
57
What is a thesaurus?
“For writers, it is a tool like Roget’s, one with words grouped and classified to help select the best word to convey a specific nuance of meaning. For indexers and searchers, it is an information storage and retrieval tool: a listing of words and phrases authorized for use in an indexing system, together with relationships, variants and synonyms, and aids to navigation through the thesaurus”
Jessica L. Milstead. All Rights Reserved. www.jelem.com
58
Thesaurus - defined
For information retrieval
1960s
– indexing, either intellectual or automatic
– in searching
– searching but not indexing
– indexing but not searching
– hierarchical view for searching
59
Thesaurus - defined
Monolingual - standard
– British English - ISO 2788
– American English - ANSI/NISO Z39.19
Multilingual – standard ISO 5964
– concept mapping
– Eurovoc
Discipline or Mission based - ad hoc
60
Thesaurus - standard format
Main Entries
Top Terms - TT
Broader Terms - BT
Narrower Terms - NT
Related Terms - RT
Scope Notes - SN
History - HI
Date term added/changed - DA
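To make the standard format concrete, the sketch below prints one main entry with its relationship codes in the order listed on the slide. The sample term, its relationships, and the exact spacing are assumptions; real thesaurus displays vary by publisher.

```python
# Display order of the field codes, following the slide.
CODES = ("TT", "BT", "NT", "RT", "SN", "HI", "DA")


def format_entry(term, fields):
    """Return a main entry followed by one indented line per coded value."""
    lines = [term]
    for code in CODES:
        for value in fields.get(code, []):
            lines.append(f"  {code}  {value}")
    return "\n".join(lines)


# Hypothetical entry for illustration.
entry = format_entry("Automobiles", {
    "NT": ["Sports cars"],
    "BT": ["Vehicles"],
    "RT": ["Highways"],
    "SN": ["Passenger road vehicles"],
})
print(entry)
```

Fields are emitted in code order regardless of the order they were supplied in, so every entry in a printed thesaurus reads the same way.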
61
Standards
Monolingual
– NISO / ANSI – Z39.19
– ISO 2788
Multilingual
– ISO 5964
62
ISO Standards
Set up already - easy to adopt
Multiple broader terms
The standards outline procedures
– ISO - better for implementation
– NISO - much better reading
63
Why do we index?
Improve precision – define scope of terms
Improve recall – different terms for same concept
Guide to a field of expertise
Learning tool
Richer expression
64
Uses?
Indexing*
– …process by which subject terms or classification symbols are assigned to concepts in documents
– A thesaurus is also known as an indexing language
– * not the building of the inverted file in the computer sense of indexing
65
What are we controlling?
Synonyms – different terms, same concept
Polysemes or Homonyms – same word, different meanings
– Lead
– Reading
66
How?
Meaning – delineation of scope of a term
Term equivalence – linking of synonyms
Disambiguation of homonyms
– lead (metal)
– lead (element)
– lead (management)
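One simple way to picture homonym disambiguation is a rule that picks a qualified term from cue words in the surrounding text. The Python sketch below is only an illustration of that idea: the cue words are invented, and it does not represent how MAI or any other indexing product actually works.

```python
# Ambiguous word -> list of (qualified term, cue words). Cue words are hypothetical.
RULES = {
    "lead": [
        ("lead (metal)", {"pipe", "solder", "alloy"}),
        ("lead (management)", {"team", "project", "manager"}),
    ],
}


def disambiguate(word, text):
    """Return the first qualified term whose cue words occur in the text, else None."""
    context = set(text.lower().split())
    for qualified, cues in RULES.get(word.lower(), []):
        if context & cues:
            return qualified
    return None


print(disambiguate("lead", "She will lead the project team"))  # lead (management)
print(disambiguate("lead", "lead solder in old pipe joints"))  # lead (metal)
```

When no cue matches, the function returns None, which is a natural point to route the document to a human indexer for review.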
67
Precision options
Language specificity
Coordination
Compound terms - level of precoordination
Homographs and scope notes
Word distance indication
68
Precision options
Structural relationships
Links and roles
Treatment and aspect codes
Weighting
69
Disambiguation
Bill – Invoice
Bill – Legislative
Bill – Sport
Bill – Person
70
Disambiguation
Bills – Invoices
Bills – Legislation
Bill – Animal
Bill – Person
[Diagram relationship labels: PT, NT, BT, RT, RT, BT, NT]
71
6. How to build and maintain a taxonomy
72
How to build a taxonomy
Collect the terms
Pull out authority terms
Organize into arrays
Choose top terms
Organize hierarchically
Flesh out term records
Test, review, and edit
73
Or said another way …
Define scope
Collect terms and relationships
Identify existing taxonomies
Identify resources
Create & refine taxonomy
Apply taxonomy
Review and update
74
Maintain
Steady stream of terms
– Web logs
– Null sets
– New announcements
– Indexing team
– Library
– Records managers
– Etc.
Candidate terms
An out-of-date taxonomy is nearly useless
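Search logs are one of the cheapest sources of candidate terms. The sketch below assumes a log of (query, hit count) pairs, which is an invented format, and proposes repeated null-hit queries that are not already in the vocabulary. The threshold of two occurrences is an arbitrary choice.

```python
from collections import Counter


def candidate_terms(search_log, known_terms, min_count=2):
    """Return null-hit queries seen at least min_count times, most frequent first."""
    known = {t.lower() for t in known_terms}
    counts = Counter(
        query.strip().lower()
        for query, hit_count in search_log
        if hit_count == 0 and query.strip().lower() not in known
    )
    return [q for q, n in counts.most_common() if n >= min_count]


# Hypothetical log entries: (query text, number of hits returned).
log = [
    ("hybrid cars", 0),
    ("Hybrid Cars", 0),
    ("automobiles", 12),
    ("segway", 0),
]
print(candidate_terms(log, ["Automobiles"]))  # ['hybrid cars']
```

The output is a review queue, not a list of approved terms; an editor still decides whether each candidate becomes a preferred term, a synonym, or nothing.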
75
Best Results Measures
Accuracy
Productivity
Hits, Misses and Noise
Precision (Recall)
Relevance
Ease of set up
Time to production
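Hits, misses, and noise translate directly into precision and recall. The sketch below assumes the usual reading: a hit is a correctly assigned term, a miss is an expected term that was not assigned, and noise is an assigned term that was not expected. The sample terms are hypothetical.

```python
def hits_misses_noise(assigned, expected):
    """Compare automatically assigned terms with a human-approved set."""
    assigned, expected = set(assigned), set(expected)
    hits = assigned & expected     # correct terms
    misses = expected - assigned   # terms that should have been assigned
    noise = assigned - expected    # terms that should not have been assigned
    return hits, misses, noise


def precision_recall(assigned, expected):
    """Precision = hits / (hits + noise); recall = hits / (hits + misses)."""
    hits, misses, noise = hits_misses_noise(assigned, expected)
    precision = len(hits) / (len(hits) + len(noise)) if assigned else 0.0
    recall = len(hits) / (len(hits) + len(misses)) if expected else 0.0
    return precision, recall


# Hypothetical indexing result for one document.
assigned = ["Automobiles", "Highways", "Taxes"]
expected = ["Automobiles", "Highways", "Bridges", "Tunnels"]
precision, recall = precision_recall(assigned, expected)
print(round(precision, 2), recall)  # 0.67 0.5
```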
76
Integration
Thesaurus
– full featured
– multiple views
– multiple versions
– multiple languages
Automatic indexing
– filtering
– assisted
Data Harmony MAI and Thesaurus Master
77
Visual Taxonomy
Ways to look
– Hierarchical
– Alphabetic
– by term
– Ring diagrams
– Topic maps
– Related terms
82
Content Management System
83
API to Many Systems for CMS
84
Apply to the meta data
Automatic application?
Spider setting internally
External web crawls – use all aliases
Filter data
Enhance search experience
85
Meta data
The fields
The elements
– Class codes
– Title
– Author
– Plaintiff
– Product
– subject / topic
Meta Name Keywords in HTML
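For web pages, the simplest place to carry taxonomy terms is the HTML keywords meta element. The sketch below builds that tag from a list of assigned terms using Python's standard `html.escape`; the function name and the sample terms are assumptions.

```python
from html import escape


def keywords_meta_tag(terms):
    """Return an HTML meta element listing the terms as comma-separated keywords."""
    content = ", ".join(terms)
    # Escape &, <, > and quotes so the attribute value stays well-formed.
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


print(keywords_meta_tag(["Automobiles", "Research & development"]))
# <meta name="keywords" content="Automobiles, Research &amp; development">
```

The same list of terms could equally be written to a database field or an XML element, as the earlier content-creation slide notes.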
87
7. How Taxonomies are used in Enterprise Information
88
Brand is repeated in several spots and tied to search as well
91
Another way of listing brands
92
Category list from taxonomy is tied to brand list and product list
93
Category code from the taxonomy is tied to the brand list and the product list
94
Enterprise Taxonomy Management
Consistent application across entire site
Synonyms are used interchangeably
User doesn’t need to know the taxonomy
Pop up view is helpful
Site map for construction and browsing
Allows hidden sections for internal use
95
Taxonomies
Form the basis for knowledge sharing
Add value to discussion
Allow deeper retrieval
Are straightforward to create
Require on-going maintenance
96
Your Taxonomy
There is too much information to pile it on the floor.
A taxonomy fits in many places in the information flow.
98
[Diagram: Implementing a Taxonomy in a Content Management Portal. Components shown: Data Feed, Thesaurus Master, MAI to add Metadata, Database Management System (Add Metadata using MAI), Search, Inverted File]
99
Thank you for your time!
Questions?
Marjorie M.K. Hlava
Access Innovations, Inc.
505-998-0800
mhlava@accessinn.com
www.accessinn.com