Taxonomy Strategies LLC
Sept. 28, 2005
Copyright 2005 Taxonomy Strategies LLC. All rights reserved.
Ron Daniel, Taxonomy Strategies LLC

Presentation transcript:

Frequently Asked Questions about Taxonomies and Metadata
Ron Daniel, Taxonomy Strategies LLC

2 Taxonomy Strategies LLC The business of organized information
Pop Quiz
On a blank piece of paper:
• What questions did you want to have answered by coming to today’s talks?
• What new questions do you have, based on what you’ve learned from the previous presentations?
• Flag one question to be answered later.
You do NOT have to provide your name. Please DO provide your job title, division, and either company or company type.

3 Agenda
• Pop Quiz
• FAQs – Frequently Asked Questions
• SAQs – Seldom Asked Questions
• Today’s Questions

4 What is a taxonomy – just a folder structure or something else?
Irony in action – there is no agreed definition of what a “taxonomy” is.
• When talking with someone about taxonomy, make sure you are talking about the same things.
We look at taxonomies and metadata together.
• The metadata specification will call for several fields that take pre-defined lists of values.
• Those lists, flat or hierarchical, are “facets” within the overall taxonomy.

5 Other things sometimes called taxonomy
Type and remarks:
Synonym Ring
• Connects a series of terms together
• Treats them as equivalent for search purposes, e.g. (Dog, Canine, Pooch, Mutt), (Cat, Feline, Kitty), …
Authority File
• Used to control variant names with a preferred term
• Typically used for names of countries, individuals, organizations, e.g. (IBM, Big Blue, International Business Machines Inc.)
Classification Scheme
• A hierarchical arrangement of terms
• May or may not follow strict “is-a” hierarchy rules
• Usually enumerated, e.g. LC or Dewey
Thesaurus
• Expresses semantic relationships of: Hierarchy (broader & narrower terms), Equivalence (synonyms), Associative (related terms)
• May include definitions
Ontology
• Resembles faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules
• A model of reality
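The first two structures in this list are simple enough to sketch as data. The following is a minimal illustration only, assuming an in-memory representation; the function names `ring_for` and `preferred_name` are hypothetical, and the sample terms are the ones given on the slide.

```python
# Hypothetical in-memory representation; sample terms come from the slide.
synonym_rings = [
    {"dog", "canine", "pooch", "mutt"},
    {"cat", "feline", "kitty"},
]

authority_file = {
    # preferred term -> variant names it controls
    "IBM": ["Big Blue", "International Business Machines Inc."],
}


def ring_for(term, rings):
    """Return every term treated as equivalent to `term`, including itself."""
    needle = term.lower()
    for ring in rings:
        if needle in ring:
            return set(ring)
    return {needle}  # a term outside every ring is only equivalent to itself


def preferred_name(name, authority):
    """Map a variant name to its preferred term; unknown names pass through."""
    for preferred, variants in authority.items():
        if name == preferred or name in variants:
            return preferred
    return name
```

The difference the sketch tries to show: a synonym ring has no preferred member, while an authority file always resolves to one preferred term.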

6 How do taxonomies actually improve search?
Input (Query) Side
• “Search” using a small set of pre-defined values instead of trying to guess what word or words might have been used in the content.
• Have synonyms mapped together so searches for “car” and “automobile” return the same things.
Output (Results) Side
• Organize search results into groups of related items.
• Sorting and filtering
• Refinement
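Both sides can be illustrated with a toy example. This is a rough sketch under stated assumptions, not a description of any search product: the documents, the `category` field, and the helper names `canonical`, `search`, and `group_by` are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical synonym map: variant -> canonical term.
SYNONYMS = {"car": "automobile", "automobile": "automobile"}

# Hypothetical tagged content.
DOCS = [
    {"id": 1, "text": "used car for sale", "category": "Classifieds"},
    {"id": 2, "text": "automobile safety ratings", "category": "Reviews"},
    {"id": 3, "text": "bicycle repair guide", "category": "Reviews"},
]


def canonical(word):
    """Input side: collapse synonyms to one canonical term."""
    return SYNONYMS.get(word.lower(), word.lower())


def search(query, docs):
    """Return documents containing the query word or any mapped synonym."""
    wanted = canonical(query)
    return [d for d in docs
            if any(canonical(w) == wanted for w in d["text"].split())]


def group_by(results, facet):
    """Output side: organize results into groups by a metadata field."""
    groups = defaultdict(list)
    for doc in results:
        groups[doc[facet]].append(doc["id"])
    return dict(groups)
```

With this data, `search("car", DOCS)` and `search("automobile", DOCS)` return the same two documents, and `group_by(..., "category")` splits them into one group per category.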

7 Taxonomy in action on the results side
[Figure: results-side example. Labels shown: Position, Category, Company, City, State, Salary]

8 Who should build the taxonomy?
The taxonomy (and metadata specification) should be produced by a cross-functional team which includes business, technical, information management, and content creation stakeholders.
The team should plan on maintaining the taxonomy as well as building it.
• Maintenance will not (usually) be anyone’s full-time job.
• Exact mix of people on team will change.
It should be built in an iterative fashion, with more content and broader review for each iteration.

9 How big should the taxonomy be?
Consultant’s answer – “It depends”
• How much content do you need to organize?
• How fine-grained does the categorization need to be?
Overly-simplistic method:
• Nterms = # items / desired bucket size
• (1 M documents, 100 documents / bucket => 10k buckets)
• Bad method – documents don’t distribute evenly
Second method:
• # facets ≈ Log(# items) ± 2
• (1 M items => 5..7 facets)
• Sum of terms across all facets < 1200 in most cases
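The two rules of thumb above can be worked through in a few lines. This sketch assumes a base-10 logarithm, which the slide does not state but which matches its 1 M example. The function names are hypothetical. The spread is left as a parameter because the formula is written with ± 2 while the worked example (5..7) corresponds to ± 1.

```python
import math


def terms_by_bucket(n_items, bucket_size):
    """Overly-simplistic method: number of terms = items / desired bucket size."""
    return math.ceil(n_items / bucket_size)


def facet_range(n_items, spread=2):
    """Second method: facets ~ log10(items), plus or minus `spread`.

    Base 10 is an assumption; the slide writes only "Log".
    """
    center = round(math.log10(n_items))
    return max(1, center - spread), center + spread


buckets = terms_by_bucket(1_000_000, 100)      # 10000, the slide's "10k buckets"
narrow = facet_range(1_000_000, spread=1)      # (5, 7), the slide's worked example
wide = facet_range(1_000_000)                  # (4, 8) with the formula's +/- 2
```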

10 How do we know we have a good taxonomy?
Method: Walk-thru
• Process: Show & explain
• Who: Taxonomist, SME, Team
• Requires: Rough taxonomy
• Validation: Approach; Appropriateness to task
Method: Walk-thru
• Process: Check conformance to editorial rules
• Who: Taxonomist
• Requires: Draft taxonomy; Editorial Rules
• Validation: Consistent look and feel
Method: Usability Testing
• Process: Contextual analysis (card sorting, scenario testing, etc.)
• Who: Users
• Requires: Rough taxonomy; Tasks & Answers
• Validation: Tasks are completed successfully; Time to complete task is reduced
Method: User Satisfaction
• Process: Survey
• Who: Users
• Requires: Rough Taxonomy; UI Mockup; Search prototype
• Validation: Reaction to taxonomy; Reaction to new interface; Reaction to search results
Method: Tagging Samples
• Process: Tag sample content with taxonomy
• Who: Taxonomist, Team, Indexers
• Requires: Sample content; Rough taxonomy (or better)
• Validation: Content ‘fit’; Fills out content inventory; Training materials for people & algorithms; Basis for quantitative methods

11 Taxonomy validation: Tagging content
The best way to validate a taxonomy is to use it to tag some content.
How many items?
Goal: Illustrate metadata schema
• Number of items: 1-3
• Criteria: Random (excluding junk)
Goal: Develop training documentation
• Number of items: 10-20
• Criteria: Show typical & unusual cases
Goal: Qualitative test of small vocabulary (<100 categories)
• Number of items: 25-50
• Criteria: Random (excluding junk)
Goal: Quantitative test of vocabularies
• Number of items: 3-10X number of categories
• Criteria: Use computer-assisted methods when more than categories. Pre-existing metadata is the most meaningful.

12 Taxonomy validation: Closed card sorting
Useful to validate whether the terms in a taxonomy are organized in a way that is commonly understood.
• Ask people to sort narrower terms in a taxonomy into the broad categories or facets.
• The card sort is considered closed if you provide the names of those broad categories.
• Ask people if there are facets that they think should be added and why.
• users are sufficient to get useful feedback.

13 Taxonomy validation: Quantitative Method
How evenly does it divide the content?
• Documents will not distribute uniformly across categories
• Zipf (1/x) distribution is expected behavior
• 80/20 rule in action (actually 70/20 rule)
[Chart annotations: Leading candidate for splitting; Leading candidates for merging; Above the curve is better than expected]
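One way to make the comparison against a Zipf curve concrete is sketched below. It assumes the expected count for the category at rank r is proportional to 1/r, normalized so the expected counts sum to the actual total. The function names and the ratio column are illustrative choices, not part of the original method, and deciding which ratios mark a split or merge candidate remains a judgment call.

```python
def zipf_expected(total, n_categories):
    """Expected document counts by rank under a 1/rank distribution."""
    harmonic = sum(1.0 / r for r in range(1, n_categories + 1))
    return [total / (r * harmonic) for r in range(1, n_categories + 1)]


def compare_to_zipf(counts):
    """Rank categories by size and compare each to its Zipf expectation.

    counts: dict of category name -> number of tagged documents.
    Returns rows of (name, actual, expected, actual / expected).
    """
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(counts.values())
    expected = zipf_expected(total, len(ranked))
    return [(name, n, exp, n / exp) for (name, n), exp in zip(ranked, expected)]
```

A ratio well above 1 means a category holds more than its rank would predict, and a ratio well below 1 means it holds less.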

14 What if I have to do it solo?
Realize:
• It’s not totally solo – IT help, Graphics & UI help, Business Goals help, Funding help, Review & QA help…
• You are the general contractor
• It needs to be part of your objectives
• Limit the objectives to what can be achieved by you, and by your organization
Concentrate:
• Resource allocation
    • (i.e. manage your time)
• Fundamental processes
    • Query log examination
    • Error correction procedure
• Communications!!!
Cherry-pick from roles on a larger team:
• Business Lead – align with organization goals, get needed resources, make cost/benefit decisions, report upstairs
• IT Liaison – work with IT specialists to get software installed, logs gathered, content harvested, etc. Consider impact of changes on tools and data
• Taxonomy / Search Specialist – analyze behavior and suggest changes. Implement changes which pass cost/benefit muster
• Website/User Representative – consider impact of changes on users and job performance

15 Where do the benefits come from?
Common taxonomy ROI scenarios
Catalog site – ROI based on increased sales through improved:
• Product findability
• Product cross-sells and up-sells
• Customer loyalty
Call center – ROI based on cutting costs through:
• Fewer customer calls due to improved website self-service
• Faster, more accurate CSR responses through better information access
Compliance – ROI based on:
• Avoiding penalties for breaching regulations
• Following required procedures (e.g. medical claims)
Knowledge worker productivity – ROI based on cutting costs through:
• Less time searching for things
• Less time recreating existing materials, with knock-on benefits of less confusion and reduced storage and backup costs
Executive mandate
• No ROI at the start, just someone with a vision and the budget to make it happen

16 Agenda
• Pop Quiz
• FAQs – Frequently Asked Questions
• SAQs – Seldom Asked Questions
• Your Questions

17 What should I be thinking about at the start of a taxonomy project?
Taxonomy development and maintenance is the LEAST of three problems:
• The Taxonomy Problem: How are we going to build and maintain the lists of pre-defined values that can go into some of the metadata elements?
• The Tagging Problem: How are we going to populate metadata elements with complete and consistent values?
    • What can we expect to get from automatic classifiers? What kind of error detection and error correction procedures do we need? What fields do we need?
• The ROI (Return On Investment) Problem: How are we going to use content, metadata, and vocabularies in applications to obtain business benefits?
    • More sales? Lower support costs? Greater productivity? Risk avoidance?
    • How much content? How big an operating budget? How to expose to users?
Business Goals and Cultural Factors are major influences on tagging and taxonomy. These must be acknowledged at the start to avoid rework.

18 What must change when the Taxonomy changes?
There’s more to maintaining the Taxonomy than maintaining just the taxonomy.
• The master copy of the taxonomy.
• Announcements for stakeholders!
• The information sent to downstream users of the taxonomy.
• The versions and formats of the taxonomy distributed to others.
• The list of changes.
• The data tagged with the taxonomy?
• The user interface which uses the taxonomy?
• Backend system software which uses the taxonomy?
• The training set for automatic classifiers?
• The educational material for users, catalogers, programmers, etc.?
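The "list of changes" item is the easiest of these to automate. A minimal sketch follows, assuming each version of the taxonomy can be exported as a mapping from a stable term identifier to its label; the identifiers, labels, and the function name `diff_taxonomy` are hypothetical.

```python
def diff_taxonomy(old, new):
    """Summarize changes between two versions of a taxonomy.

    old, new: dict of stable term id -> label (an assumed export format).
    """
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    relabeled = {k: (old[k], new[k])
                 for k in old.keys() & new.keys() if old[k] != new[k]}
    return {"added": added, "removed": removed, "relabeled": relabeled}


# Hypothetical versions of a small vocabulary.
v1 = {"t1": "Cars", "t2": "Trucks", "t3": "Boats"}
v2 = {"t1": "Automobiles", "t2": "Trucks", "t4": "Bicycles"}
changes = diff_taxonomy(v1, v2)
```

A change list in this form can feed the stakeholder announcements and indicate which tagged data, interfaces, and classifier training sets may need review. Removed and relabeled terms are the ones most likely to affect downstream users.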

19 Agenda
• Pop Quiz
• FAQs – Frequently Asked Questions
• SAQs – Seldom Asked Questions
• Your Questions

20 Backup Slides

21 Why do we usually recommend faceted taxonomies?
Categorize in multiple, independent categories. Allow combinations of categories to narrow the choice of items.
4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4):
• Easier to maintain
• Easier to reuse existing material
• Can be easier to navigate, if software supports it
Example facets:
• Main Ingredients: Chocolate, Dairy, Fruits, Grains, Meat & Seafood, Nuts, Olives, Pasta, Spices & Seasonings, Vegetables
• Meal Type: Breakfast, Brunch, Lunch, Supper, Dinner, Snack
• Cuisines: African, American, Asian, Caribbean, Continental, Eclectic/Fusion/International, Jewish, Latin American, Mediterranean, Middle Eastern, Vegetarian
• Cooking Methods: Advanced, Bake, Broil, Fry, Grill, Marinade, Microwave, No Cooking, Poach, Quick, Roast, Sauté, Slow Cooking, Steam, Stir-fry
42 values to maintain (10+6+11+15)
9900 combinations (10x6x11x15)
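The arithmetic behind the recipe example can be checked directly. The sketch below uses the facet values listed on the slide; the variable names are illustrative, and `math.prod` assumes Python 3.8 or later.

```python
from math import prod

# Facet values as listed on the slide.
facets = {
    "Main Ingredients": ["Chocolate", "Dairy", "Fruits", "Grains",
                         "Meat & Seafood", "Nuts", "Olives", "Pasta",
                         "Spices & Seasonings", "Vegetables"],
    "Meal Type": ["Breakfast", "Brunch", "Lunch", "Supper", "Dinner", "Snack"],
    "Cuisines": ["African", "American", "Asian", "Caribbean", "Continental",
                 "Eclectic/Fusion/International", "Jewish", "Latin American",
                 "Mediterranean", "Middle Eastern", "Vegetarian"],
    "Cooking Methods": ["Advanced", "Bake", "Broil", "Fry", "Grill",
                        "Marinade", "Microwave", "No Cooking", "Poach",
                        "Quick", "Roast", "Sauté", "Slow Cooking", "Steam",
                        "Stir-fry"],
}

# Maintenance cost grows with the sum of the facet sizes...
values_to_maintain = sum(len(values) for values in facets.values())

# ...while the number of distinguishable combinations grows with the product.
combinations = prod(len(values) for values in facets.values())
```

Here `values_to_maintain` is 42 and `combinations` is 9900, matching the figures on the slide.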

22 What could possibly go wrong with a little edit?
• ERP (Enterprise Resource Planning) team made a change to the product line data element in the product hierarchy.
• They did not know this data was used by downstream applications outside of ERP.
• An item data standards council discovered the error.
• If the error had not been identified and fixed, the company’s sales force would not be correctly compensated.
“Lack of the enterprise data standards process in the item subject area has cost us at least 30 person days of just ‘category’ rework.”
Source: Danette McGilvray, Granite Falls Consulting, Inc.

23 When should we NOT use facets?
When you have to work with software that can’t handle them.
• Remember, software is replaced but data is migrated.
When you need to use an existing standard taxonomy.
Facets can help you build a useful hierarchy. This one is a mix of content type and organization.
…
By Content Type
• Calendars & Events
    • Top Links…: Holidays, Upcoming Events
    • Federal Reserve System…: Beige Book, Board of Governors, FOMC
    • More Calendars & Events…: ERAC, Officer Availability, Staff Conference, Toastmasters, Tours
• Directories
• Documentation
• Forms
• News
• Policies & Procedures
By Organization
• Federal Reserve System
• FRB Atlanta
• Board of Directors
• Executive Office
• Management Committee
• Research Division
• S&R Division

24 What are facets I might think about?
[Figure: example facets. Labels shown: E&P Lifecycle; Hydrocarbon System; Geologic Age; Process Mgmt; Lease Mgmt; Other Orgs; Basins, Reservoirs & Fields; Facilities; Wells; Disciplines; Countries & Regions; Reserves; Human Resources; Content Types; Production Locations; Org Chart]

Questions?
Ron Daniel