Taxonomy 1-2-3: Enterprise Search Summit 2007 Tutorial. Taxonomy Strategies LLC, May 14, 2007. Copyright 2007 Taxonomy Strategies LLC. All rights reserved.

1 Taxonomy 1-2-3: Enterprise Search Summit 2007 Tutorial
Taxonomy Strategies LLC, May 14, 2007
Copyright 2007 Taxonomy Strategies LLC. All rights reserved.

2 Taxonomy Strategies LLC The business of organized information
Today’s agenda
9:00-9:05 | 5 min | Introduction
9:05-9:10 | 5 min | Warm-up exercise
9:10-9:35 | 25 min | Building taxonomies
9:35-9:45 | 10 min | Taxonomy exercise
9:45-10:05 | 20 min | Taxonomy business case
10:05-10:20 | 15 min | Taxonomy & search
10:20-10:35 | 15 min | Coffee Break
10:35-11:05 | 30 min | Taxonomy ROI
11:05-11:15 | 10 min | ROI exercise
11:15-11:45 | 30 min | Taxonomy governance
11:45-12:00 | 15 min | Q&A

3 My taxonomy questions
Columns: Priority (1-5) | Questions
Your title or role:
Your org or industry:
Your dept:
Your name (optional):

4 Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

5 The taxonomy problem: How to pick from > 5,000 faucets?
By:
• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?

6 The main issue: What goes here?
• When do the things in the list change?
• How do we maintain the list?
• What rules do we follow?

7 What's involved in creating a taxonomy?
• Metadata Scheme. Data fields for describing content so that it can be found and used.
• Vocabularies. Collections of terms that are used to specify some of the metadata properties.
  - Relationships between content, fields or terms (hierarchical, equivalence, & associative)
  - Some vocabularies are big & hierarchical, some are small and flat.
• Application Profile. Formal representation of metadata & vocabularies.

8 Seven phases of taxonomy development
[Gantt chart, weeks 1-12]
1 Identify Objectives: Conduct interviews
2 Inventory Resources: Identify, gather & review resources
3 Specify Metadata: Define fields & purpose
4 Model Content: Define content chunks & XML DTDs
5 Specify Vocabularies: Compile controlled vocabularies
6 Specify Procedures: Develop workflow, rules & procedures
7 Test & Train: Manually tag small sample

9 Taxonomy design phases need to be iterated
Phases: 1 Identify Objectives; 2 Inventory Resources; 3 Specify Metadata; 4 Model Content; 5 Specify Vocabularies; 6 Specify Procedures; 7 Test & Train
Plan & Prototype
• Interview core team and stakeholders
• Identify, gather & review resources
• Define fields & purpose
• Define content chunks & XML DTDs
• Compile controlled vocabularies
• Develop workflow rules & procedures
• Manually tag small sample
Alpha Dev & Test
• Gather additional resources, if any
• Revise if needed, bake into alpha CMS
• Revise, use in alpha CMS
• Alpha workflows in CMS
• Review tagged samples, default procedures
• Use alpha CMS to tag larger sample
Beta D&T
• Modify CMS for beta
• Revise, use in beta CMS
• Modify & extend workflows
• Gather additional sources, if any
• Interview alpha users
• Use beta CMS to tag larger sample
Final D&T
• Finalize training materials & train staff
• Modify for 1.0
• Revise using team procedure
• Finalize procedure materials
• Interview beta users

10 Licensing an existing taxonomy
See Factiva’s taxonomy; www.taxonomywarehouse.com
• There are usually license fees, but these will be less than the effort to develop an equivalent taxonomy.
• But pre-existing taxonomies rarely fit an organization’s needs and may require extensive customization.
Recommendation
• Adopt a faceted approach.
• Reuse existing (especially internal) vocabularies for as many of the facets as possible.
• Plan on doing full-custom “Content Type” and “Topic” taxonomies.

11 Free sources for 8 common taxonomies
Taxonomy | Definition | Potential Sources
Organization | Organizational structure. | SP 800-87, U.S. Government Manual, Your organizational structure, etc.
Content Type | Structured list of the various types of content being managed or used. | Dublin Core Type Vocabulary, AGLS Document Type, Your records management policy, etc.
Industry | Broad market categories such as lines of business, life events, or industry codes. | SIC, NAICS, Your market segments, etc.
Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.
Function | Functions and processes performed to accomplish mission and goals. | Federal Enterprise Architecture Business Reference Model, Enterprise ontology, Your business functions, etc.
Topic | Business topics relevant to your mission & goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, Your research areas, etc.
Audience | Subset of constituents to whom a piece of content is directed or intended to be used. | GEM, ERIC Thesaurus, IEEE LOM, Your psychographics or personas, etc.
Products & Services | Names of products/programs & services. | ERP system, Your products and services, etc.

12 12 Taxonomy Strategies LLC The business of organized information Typical product catalog: A-Z, then idiosyncratic categories

13 How to analyze existing product catalog categories: Principles and priorities
Preparing a product catalog for facet browsing (aka Guided Navigation) requires a category hierarchy and additional attributes.
Principles
1. Categories and subcategories that could be swapped are candidates for conversion to attributes.
2. Repeated lists of subcategories signal a possible need for an attribute.
3. The number of attributes should not exceed six or seven, so not all attribute candidates should be used. Avoid selecting strongly correlated attributes, such as “Weight” and “Shipping Weight”.
Priorities
1. Choose Categories that apply to many products, over those with few products.
2. Choose Attributes that apply to many Categories over those that apply only to very few categories.
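
Principle 2 on slide 13 (repeated subcategory lists signal an attribute) can be checked mechanically. The sketch below is only an illustration: the catalog data, the function name `attribute_candidates`, and the 50% threshold are all assumptions, not part of the tutorial.

```python
from collections import Counter

# Hypothetical catalog: category -> subcategories (illustrative data only).
catalog = {
    "Kitchen Faucets": ["Chrome", "Stainless Steel", "Bronze"],
    "Bathroom Faucets": ["Chrome", "Stainless Steel", "Bronze"],
    "Shower Heads": ["Chrome", "Bronze", "Handheld"],
}


def attribute_candidates(catalog, min_share=0.5):
    """Return subcategory names that repeat under at least min_share of all categories."""
    # Count each name once per category, even if a category lists it twice.
    counts = Counter(name for subs in catalog.values() for name in set(subs))
    threshold = min_share * len(catalog)
    return sorted(name for name, n in counts.items() if n >= threshold)


candidates = attribute_candidates(catalog)
print(candidates)
```

Names that recur across most categories (here, finishes) are the ones worth reviewing as a single "Finish" attribute rather than as repeated subcategories.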

14 Product categories example: Wireless carrier
Products
• Accessories: Batteries, Cases, Chargers, Data, Hands-Free, Headsets, Miscellaneous
• Content: Purchased, Subscription
• Phones: Versatile Phones, Smart Devices, Basic Phones, Prepaid Phones, International Only Phones, Mobile Broadband Cards
• Services: Conferencing, Internet / Data, Landline Phone, Network & Roaming, Relay Services, Solutions, Wireless Data

15 Product attributes example: Digital cameras in an electronics catalog
• Types of attributes
  - Generic attributes
    – Brand/Product Family/Model
    – Price Range
    – Usually Ships
  - Merchandising attributes
    – Usage (E-mail, Internet Browsing, Programming, …)
    – Segment (Home, Business, Education, Government …)
    – Region & Country
    – Most Popular
    – New
    – Related Products
  - Specialized attributes
    – Capacity (Battery; Memory; MB; GB; BPS, …)
    – Resolution (DPI; Megapixels; XGA, XGA, UXGA, …)
    – Size (Display; Screen; ...)
    – Standard (a, b, g, n, …; scsi, ata, sata, eide, …; dimm, simm, …)
    – Type (Camera; Battery; Display; Printer; Server; Storage; Switch; …)
Example facet counts:
Resolution: 3 Megapixels (4), 4 Megapixels (5), 5 Megapixels (27), 6-8 Megapixels (21)
Brand: Canon (15), Fuji (10), Kodak (17), Nikon (8), Olympus (9)
Type: Point & Shoot (25), Digital SLR (10), Packages (5)
Price Range: $100-250 (5), $250-500 (16), $500-1000 (19), More than $1000 (3)

16 Faceted taxonomy theory & practice
• How many terms are needed to provide sufficient granularity? Not as many as you think!
• Post-coordinate indexing allows several simple controlled vocabularies to be combined, rather than using a single large pre-coordinated vocabulary.

17 The power of faceted taxonomy
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
  - Easier to maintain
  - Easier to tag by content authors
  - Can be easier to navigate
• It’s more effective to increase the number of facets than to increase the number of terms per facet.
Example facets:
Audience: Advocacy, Contractors & Grantees, Environmental Professionals, Federal Facilities, General Public, Industry, Kids, Researchers & Scientists, Small Business, Students
Health: Advisory, Exposure, Food Safety, Health Assessment, Health Effect, Health Risk, Occupational Health, Pesticide Effects, Sun Protection, Toxicity
Substance: Allergen, Biological Contaminant, Carcinogen, Chemical, Explosive, Liquid Waste, Microorganism, Ozone, Pesticide, Radioactive Waste
Industry: Agriculture & Cattle, Automobile Repair, Chemical, Dry Cleaning, Electronics & Computer, Energy, Extractive Industries, Food Processing, Leather Tanning & Finishing, Metal Finishing
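
The 10^4 claim on slide 17 can be verified with a few lines of Python. This is a minimal sketch; the grouping of the slide's terms under the labels Audience, Health, Substance and Industry is an assumption recovered from the slide text.

```python
from itertools import product
from math import prod

# Four facets of ten terms each, as listed on slide 17 (facet grouping assumed).
facets = {
    "Audience": ["Advocacy", "Contractors & Grantees", "Environmental Professionals",
                 "Federal Facilities", "General Public", "Industry", "Kids",
                 "Researchers & Scientists", "Small Business", "Students"],
    "Health": ["Advisory", "Exposure", "Food Safety", "Health Assessment",
               "Health Effect", "Health Risk", "Occupational Health",
               "Pesticide Effects", "Sun Protection", "Toxicity"],
    "Substance": ["Allergen", "Biological Contaminant", "Carcinogen", "Chemical",
                  "Explosive", "Liquid Waste", "Microorganism", "Ozone",
                  "Pesticide", "Radioactive Waste"],
    "Industry": ["Agriculture & Cattle", "Automobile Repair", "Chemical",
                 "Dry Cleaning", "Electronics & Computer", "Energy",
                 "Extractive Industries", "Food Processing",
                 "Leather Tanning & Finishing", "Metal Finishing"],
}

# Terms someone has to maintain: the sum of the facet sizes.
terms_to_maintain = sum(len(terms) for terms in facets.values())

# Distinct classifications available: the product of the facet sizes.
combinations = prod(len(terms) for terms in facets.values())

# One concrete post-coordinated combination, picking the first term of each facet.
first_combination = next(product(*facets.values()))

print(terms_to_maintain, combinations, first_combination)
```

Maintaining 40 terms yields 10,000 possible combinations. Adding a fifth facet of 10 terms would multiply that by 10, while adding 10 terms to one existing facet would only double it.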

18 Automatically created taxonomies
• Documents can be ‘clustered’ based on similarities and differences.
• Problems:
  - Typically only a single hierarchy
  - No overall plan
  - Results hard for people to navigate
What does “North” mean on this map?

19 Automatic taxonomy construction software
• Software can scan large quantities of content and extract statistically significant words and phrases.
• Example:
  - Archive of 10 publications analyzed for topics related to “copyright.”
• Software does a poor job of
  - De-duplication.
  - Turning significant words and phrases into a larger structure.
  - Discriminating between “gold” and “garbage.”
• Software is good for
  - Getting an understanding of the key noun phrases in a large collection.
  - Providing test cases for evaluating a taxonomy.
Source: Sample data courtesy of nStein.

20 Most popular flickr tags on 20 Feb 2007
http://www.flickr.com/photos/tags/
Sort flickr categories into 5 or fewer groups. Then label each group.

21 Taxonomy exercise: Facet grouping
• Universal taxonomy facets
  - By location (spatially)
  - By time (chronologically)
  - By type (genre)
  - By physical properties (size, color, shape, etc.)
  - By subject (topic)
Richard Saul Wurman. Information Architects (1996)

22 Taxonomy exercise: Facet grouping
Groups: Location | Time | Type | Color | Subject
Sort flickr categories into 5 or fewer groups. Then label each group.

23 Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

24 Business case and motivations for taxonomies
• How are we going to use content, metadata, and taxonomies in applications to obtain business benefits?

25 What technology analysts have said: Add metadata to search on!
• “Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.”
• “Enriching content with structured metadata is critical for supporting search and personalized content delivery.”
• “Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.”
• “Better structure equals better access: Taxonomy serves as a framework for organizing the ever-growing and changing information within a company. The many dimensions of taxonomy can greatly facilitate Web site design, content management, and search engineering. If well done, taxonomy will allow for structured Web content, leading to improved information access.”

26 Fundamentals of taxonomy ROI
• Tagging content using a taxonomy is a cost, not a benefit.
• There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
• Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
• You need to determine those changes, and their costs, as part of the ROI.

27 Product utilization: Taxonomy compared to search
• Conversion rate increases.
  - HomeDepot.com – Double digit increase.
  - 1-800-Flowers.com – More than a 10% increase.
  - Otto Group (Kaleidoscope, Freemans, Grattan, and lookagain catalogs) – 130% increase.
• Lift in average order size.

28 Product catalog: Taxonomy compared to search
Benefit: Increased conversion rate & revenue lift
Web sales net income | | $ 80,000,000
Increased conversion rate | 30% | $ 24,000,000
Order size lift | 10% | $ 8,000,000
Potential revenue increase per year | | $ 32,000,000
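
The revenue-lift arithmetic on slide 28 can be reproduced as follows. This is a sketch using the slide's figures; the helper `pct` and the variable names are illustrative, and the two effects are simply added, as the slide does, rather than compounded.

```python
def pct(amount: int, percent: int) -> int:
    """Integer percentage of a dollar amount (avoids float rounding noise)."""
    return amount * percent // 100


web_sales = 80_000_000          # web sales figure from the slide
conversion_increase_pct = 30    # increased conversion rate
order_size_lift_pct = 10        # lift in average order size

conversion_gain = pct(web_sales, conversion_increase_pct)
order_size_gain = pct(web_sales, order_size_lift_pct)
potential_increase = conversion_gain + order_size_gain

print(f"${potential_increase:,} potential revenue increase per year")
```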

29 Usability research: Taxonomy compared to search
• “We found that users preferred a browsing oriented interface for a browsing task, and a direct search interface when they knew precisely what they wanted.” Marti Hearst (and others)
• “The category interface is superior to the list interface in both subjective and objective measures.” Hao Chen & Susan Dumais

30 Usability research: Taxonomy compared to search
[Chart: Median Search Time in Seconds. Conditions: In top 20 results; Not in top 20 results. Callouts: Category is 36% faster; Category is 48% faster.]
Source: Chen & Dumais

31 Time saved: Taxonomy compared to search
1 hour per day searching x 36% faster = 22 minutes each day
22 minutes x 250 working days per year = 5,500 minutes or 92 hours per year
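
The time-saved estimate on slide 31 works out as below. The sketch follows the slide's own rounding (21.6 minutes is rounded to 22 before multiplying); the variable names are illustrative.

```python
minutes_searching_per_day = 60   # 1 hour per day spent searching
category_speedup_pct = 36        # category interface is 36% faster
working_days_per_year = 250

# 60 * 0.36 = 21.6, which the slide rounds to 22 minutes per day.
minutes_saved_per_day = round(minutes_searching_per_day * category_speedup_pct / 100)

minutes_saved_per_year = minutes_saved_per_day * working_days_per_year
hours_saved_per_year = round(minutes_saved_per_year / 60)

print(minutes_saved_per_day, minutes_saved_per_year, hours_saved_per_year)
```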

32 Time saved: Taxonomy compared to search
Benefit: Increase service efficiency
Number of call center calls per month | 50,000
Average cost per call | $ 20
Call response costs per month | $ 1,000,000
Total call response costs per year | $ 12,000,000
Percentage of self-serviced calls due to improved information browsing | 30%
Service costs savings per year | $ 3,600,000
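
The call-center savings on slide 32 follow from three inputs. A minimal sketch with the slide's figures; the variable names are illustrative.

```python
calls_per_month = 50_000
cost_per_call = 20            # dollars
self_service_pct = 30         # share of calls deflected by better browsing

monthly_call_cost = calls_per_month * cost_per_call
yearly_call_cost = monthly_call_cost * 12
yearly_savings = yearly_call_cost * self_service_pct // 100

print(f"${yearly_savings:,} service cost savings per year")
```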

33 Trusted advisers: Taxonomy avoids costs
• “The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs …” Sue Feldman
• Sun’s usability experts calculated that 21,000 employees were wasting an average of six minutes per day due to inconsistent intranet navigation structures. When lost time was multiplied by staff salaries, the estimated productivity loss exceeded $10M per year, about $500 per employee per year. Jakob Nielsen, useit.com

34 34 Taxonomy Strategies LLC The business of organized information Knowledge workers spend up to 2.5 hours each day looking for information … … But find what they are looking for only 40% of the time. Source: Kit Sims Taylor

35 Knowledge workers spend more time re-creating existing content than creating new content
[Chart values: 25%, 8%]
Source: Kit Sims Taylor (cited by Sue Feldman in her original article)

36 Cost saved by not recreating content
Benefit: Increase in productivity
Number of employees | 100
Average employee salary | $ 80,000
Employee costs per year | $ 8,000,000
Increase in productivity from not re-creating content | 25%
Employee cost savings per year | $ 2,000,000
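
The productivity saving on slide 36 is a two-step multiplication. A minimal sketch with the slide's figures; the variable names are illustrative.

```python
employees = 100
average_salary = 80_000        # dollars per year
productivity_gain_pct = 25     # from not re-creating existing content

employee_costs_per_year = employees * average_salary
savings_per_year = employee_costs_per_year * productivity_gain_pct // 100

print(f"${savings_per_year:,} employee cost savings per year")
```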

37 Business case summary
1. Classifications and classification-like schemes are being used to facilitate information seeking in the workplace, and on the web.
2. Users take advantage of (and prefer) this type of scheme (faceted navigation) when it is made available in the user interface.
3. Hierarchical or facet navigation can be guided by the User Interface.
4. Facet navigation is best combined with keyword searching. E.g., keyword search followed by faceted navigation of results.

38 Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

39 Do taxonomies actually improve search?
• Input (Query) Side
  - “Search” using a small set of pre-defined values instead of trying to guess what word or words might have been used in the content.
  - Have synonyms mapped together so searches for “car” and “automobile” return the same things.
• Output (Results) Side
  - Organize search results into groups of related items.
  - Sorting and filtering
  - Refining search results
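
The synonym mapping on the query side of slide 39 can be sketched as a small lookup that expands a query term before matching. Everything here is hypothetical: the synonym ring, the three sample documents, and the function names are assumptions for illustration, and real search engines handle this through their own synonym configuration.

```python
# Hypothetical synonym rings: every term in a ring is treated as equivalent.
SYNONYM_RINGS = [
    {"car", "automobile"},
]

# Hypothetical mini-collection of documents.
documents = {
    1: "Used automobile listings",
    2: "Car maintenance tips",
    3: "Bicycle repair guide",
}


def expand_query(term: str) -> set[str]:
    """Return the term plus any synonyms mapped to it."""
    normalized = term.lower().strip()
    for ring in SYNONYM_RINGS:
        if normalized in ring:
            return set(ring)
    return {normalized}


def search(term: str) -> list[int]:
    """Return ids of documents containing the term or one of its synonyms."""
    wanted = expand_query(term)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if wanted & set(text.lower().split())
    )


print(search("car"), search("automobile"))
```

Both queries return the same two documents, which is the behavior the slide asks for.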

40 40 Taxonomy Strategies LLC The business of organized information Finding information should not be about “Feeling Lucky”

41 41 Taxonomy Strategies LLC The business of organized information Google search on “pcb” – Returns > 28M items Taxonomy could suggest “polychlorinated biphenyls”

42 42 Taxonomy Strategies LLC The business of organized information 169,169 items Categorized results Refine search by clicking on categories

43 Taxonomy in action on the results side: www.CareerBuilder.com search on IT positions
Results grouped: By Category | By Company | By City | By State

44 Typical search on “database”: List of ranked hits on www.oracle.com/prNavigator.jsp
Select item

45 45 Taxonomy Strategies LLC The business of organized information Faceted search on “database”: Categorized results + Ranked list Select item, or Refine search by clicking on categories

46 Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

47 Key Factors in ROI (Return on Investment)
Breadth
• “How many people will metadata affect?”
Repeatability
• “How many times a day will they use it?”
Cost/Benefit
• “Is this a costly effort with little or no benefits?”
Source: Todd Stephens, Dublin Core Global Corporate Circle

48 Some common taxonomy ROI scenarios
Product catalog
• Increased conversions
• Increased self-service & use
• Increased productivity
Customer support
• Cutting requests for information costs
• Increased web statistics (page hits)
• Higher ACSI (American Customer Satisfaction Index) score
Knowledge worker productivity
• Less time searching, more time working
• Avoiding re-creating information that already exists
Compliance
• Improved regulatory compliance
• Improved enforcement
• Higher PARS (Performance & Accountability Reports)
• FDIC, SOX, HIPAA, etc. compliance

49 How to estimate costs: Tagging
Taxonomy Facet | Hier? | Typical CV Size | Time/Value (min) | Avg # values/Item | $/Min | Cost/Element
Audience | N | 10 | 0.25 | 2 | $ 0.42 | $ 0.21
Content Type | N | 20 | 0.25 | 1 | $ 0.42 | $ 0.11
Organizational Unit | Y | 50 | 0.5 | 2 | $ 0.42 | $ 0.42
Products & Services | Y | 500 | 1.5 | 4 | $ 0.42 | $ 2.52
Geographic Region | Y | 100 | 0.5 | 2 | $ 0.42 | $ 0.42
Broad Topics | Y | 400 | 2 | 4 | $ 0.42 | $ 3.36
TOTALS | | 1080 | 5 | 15 | | $ 7.04
Inspired by: Ray Luoma, BAU Solutions
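
The per-item tagging cost on slide 49 appears to be computed as time per value x average values per item x cost per minute, summed over facets. The sketch below reproduces the slide's total under that assumption; `Decimal` is used so the cents round predictably, and the variable names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

COST_PER_MINUTE = Decimal("0.42")  # dollars per minute, from the slide

# (facet, minutes per value, average number of values per item), from the slide
facets = [
    ("Audience", Decimal("0.25"), 2),
    ("Content Type", Decimal("0.25"), 1),
    ("Organizational Unit", Decimal("0.5"), 2),
    ("Products & Services", Decimal("1.5"), 4),
    ("Geographic Region", Decimal("0.5"), 2),
    ("Broad Topics", Decimal("2"), 4),
]

# Cost per element = minutes per value * values per item * cost per minute.
cost_per_facet = {
    name: minutes * values * COST_PER_MINUTE for name, minutes, values in facets
}

cost_per_item = sum(cost_per_facet.values())  # exact, unrounded total
cost_per_item_rounded = cost_per_item.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

for name, cost in cost_per_facet.items():
    print(f"{name}: ${cost.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)}")
print(f"Total per item: ${cost_per_item_rounded}")
```

The unrounded total is $7.035 per item, which the slide shows as $7.04; the legacy tagging figure of $703,500 for 100,000 items on slide 51 is consistent with the unrounded value.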

50 How to estimate costs: Assumptions
ASSUMPTIONS
Enterprise SW License | $ 100,000
Maintenance/Support | 15%
SW Implementation | x 200%
Legacy Content Items | 100,000
Content Growth Rate | 15%
Tagging/Item | $ 7.04
Enterprise Taxonomy | $ 100,000

51 How to estimate costs: Total cost of ownership (TCO)
Description | Year 1 | Year 2 | Year 3 | Year 4 | Year 5
SW: Licenses | $ 100,000 | | | |
SW: Maintenance | | $ 15,000 | $ 15,000 | $ 15,000 | $ 15,000
Implementation | $ 200,000 | | | |
App Tech Support | | $ 30,000 | $ 30,000 | $ 30,000 | $ 30,000
Tagging: Legacy Content | $ 703,500 | | | |
Tagging: Ongoing | | $ 105,525 | $ 121,354 | $ 139,557 | $ 160,490
Taxonomy: Creation | $ 100,000 | | | |
Taxonomy: Maintenance | | $ 15,000 | $ 15,000 | $ 15,000 | $ 15,000
TOTAL | $ 1,103,500 | $ 165,525 | $ 181,354 | $ 199,557 | $ 220,490
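
The five-year totals on slide 51 can be rebuilt from the assumptions on slide 50. The model below is inferred from the numbers rather than stated in the deck: maintenance and support are taken as 15% of the related one-time cost, ongoing tagging covers only each year's new items, and new content grows 15% per year. Exact fractions are used so the rounding matches the slide.

```python
from fractions import Fraction

license_cost = 100_000
implementation = license_cost * 2            # "x 200%" of the license
taxonomy_creation = 100_000
maintenance_rate = Fraction(15, 100)         # maintenance/support rate
growth_rate = Fraction(15, 100)              # content growth rate
legacy_items = 100_000
tagging_per_item = Fraction(7035, 1000)      # unrounded $7.035 per item

# Year 1: one-time costs plus tagging of all legacy content.
legacy_tagging = round(legacy_items * tagging_per_item)
totals = [license_cost + implementation + legacy_tagging + taxonomy_creation]

# Years 2-5: recurring maintenance plus tagging of that year's new content.
recurring = round(
    (license_cost + implementation + taxonomy_creation) * maintenance_rate
)
ongoing_tagging = []
new_items = legacy_items * growth_rate       # new items added in year 2
for _ in range(4):
    ongoing_tagging.append(round(new_items * tagging_per_item))
    totals.append(recurring + ongoing_tagging[-1])
    new_items *= 1 + growth_rate             # new content itself grows 15% per year

for year, total in enumerate(totals, start=1):
    print(f"Year {year}: ${total:,}")
```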

52 Benefits Assumptions
Productivity Assumptions
Employee costs per year (100 employees, $75,000 per year) | $ 7,500,000
Increase in productivity (from not recreating content) | 25%
Cost savings | $ 1,875,000
Percentage realized in first year | 10%
Service Efficiency Assumptions
Customer service calls cost/year | $ 12,000,000
Efficiency (from customer self-service) | 30%
Cost savings | $ 3,600,000
Percentage realized in first year | 10%

53 Sample ROI Calculations
Description | Year 1 | Year 2 | Year 3 | Year 4 | Year 5
Costs
Software Licenses/Maintenance | $ 100,000 | $ 15,000 | $ 15,000 | $ 15,000 | $ 15,000
Implementation/Support | $ 200,000 | $ 30,000 | $ 30,000 | $ 30,000 | $ 30,000
Taxonomy Creation/Maintenance | $ 100,000 | $ 15,000 | $ 15,000 | $ 15,000 | $ 15,000
Legacy/Ongoing Tagging | $ 703,500 | $ 105,525 | $ 121,354 | $ 139,557 | $ 160,490
Benefits
Productivity increases | $ - | $ 187,500 | $ 1,875,000 | $ 1,875,000 | $ 1,875,000
Service efficiency gains | $ - | $ 360,000 | $ 3,600,000 | $ 3,600,000 | $ 3,600,000
Yearly Net Benefits | $ (1,103,500) | $ 381,975 | $ 5,293,646 | $ 5,275,443 | $ 5,254,510
Payback period: 1.1 years until Benefits = Costs
Inspired by: Todd Stephens, Dublin Core Global Corporate Circle
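
The yearly net benefits on slide 53 are benefits minus costs for each year. The sketch below takes the yearly cost totals from slide 51 and the benefit assumptions from slide 52, and assumes (as the slide's figures suggest) no benefit in year 1, 10% of the full benefit in year 2, and the full benefit from year 3 on. It does not attempt to reproduce the slide's payback-period figure.

```python
# Yearly total costs, years 1-5, from the TCO slide.
yearly_costs = [1_103_500, 165_525, 181_354, 199_557, 220_490]

# Full annual benefits, from the benefits assumptions slide.
productivity_savings = 7_500_000 * 25 // 100         # $1,875,000
service_savings = 12_000_000 * 30 // 100             # $3,600,000
full_benefit = productivity_savings + service_savings

# Assumed ramp-up in percent: none in year 1, 10% in year 2, full afterwards.
realized_pct = [0, 10, 100, 100, 100]

yearly_benefits = [full_benefit * p // 100 for p in realized_pct]
net_benefits = [b - c for b, c in zip(yearly_benefits, yearly_costs)]

for year, net in enumerate(net_benefits, start=1):
    print(f"Year {year}: {net:,}")
```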

54 ROI exercise: Why tag?
• Tagging content using a taxonomy is a cost, not a benefit.
• There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
• Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
• You need to determine those changes, and their costs, as part of the ROI.
• List the top 5 benefits from tagging content. Then, rank the benefits by priority.
Columns: Priority (1-5) | Questions

55 ROI exercise: Benefits from tagging content
Columns: Priority (1-5) | Questions
• List the top 5 benefits from tagging content. Then, rank the benefits by priority.
Potential benefits from tagging content
1. Reduce information requests
2. Reduce cost per UU (unique user)
3. Expand to new audiences
4. Improve customer satisfaction
5. Improve performance & accountability
6. Increase number of successful website searches
7. Increase number of links (internal cross-cutting & external)
8. Reduce time to build websites
9. Increase metadata consistency & quality
10. Decrease time to create & publish marketing information
11. Improve e-commerce
12. Decrease product development lifecycle

56 Why implement a taxonomy?
• Find relevant information quicker.
• Discover information you didn’t know you had.
• Avoid duplicate efforts to “reinvent the wheel.”
• Learn from mistakes.
• Create better quality work product.
• Provide overview as well as details about a subject.
• Demonstrate relationships between content.
• Reduce complexity.
Taxonomy & Content Classification

57 Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

58 Taxonomy requires a business process
• Taxonomies must change, gradually, over time if they are to remain relevant.
• Maintenance processes need to be specified so that the changes are based on rational cost/benefit decisions.

59 Taxonomy change process overview
[Diagram: Taxonomy Governance Environment. CV = Controlled Vocabulary]
CV Sources: Subject Codes, Expertise, Other Internal, External Standard, Internally Created
Taxonomy Tool: working copies of CVs (Taxonomy Facets) are maintained in the Taxonomy Tool
CV Consumers: Site Search Tool, Portal, Working Papers, Web CMS, DAM, Tagging Tool, Search UI
Process:
1: External controlled vocabularies (CVs) change on their own schedule
2: Taxonomy Team decides when to update CV snapshots
3: Team adds value via definitions, synonyms, classification rules, training materials, etc.
4: Updated versions of CVs published to consumers

60 Who should maintain the taxonomy?
• The taxonomy (and metadata specification) should be produced by a cross-functional team which includes business, technical, information management, and content creation stakeholders.
• The team should plan on maintaining the taxonomy as well as building it.
  - Maintenance will not (usually) be anyone’s full-time job.
  - Exact mix of people on team will change.
• It should be built in an iterative fashion, with more content and broader review for each iteration.

61 Taxonomy maintenance: Generic team charter
• Taxonomy Team is responsible for maintaining:
  - The Taxonomy, a multi-faceted classification scheme.
  - Associated taxonomy materials, such as:
    – Editorial Style Guides.
    – Taxonomy Training Materials.
    – Metadata Standard.
  - Team rules and procedures for change management.
• Taxonomy Team will consider costs and benefits of suggested changes.
• Taxonomy Team will:
  - Manage relationship between providers of source vocabularies and consumers of the Taxonomy.
  - Identify new opportunities for use of the Taxonomy across the enterprise to improve information management practices.
  - Promote awareness and use of the Taxonomy.

62 Taxonomy team: Generic roles
• Keeps committee on track with larger business objectives.
• Balances cost/benefit issues to decide appropriate levels of effort.
• Obtains needed resources if those on committee can’t accomplish a particular task.
• Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.
• Helps obtain data from various systems.
• Committee’s liaison to content creators.
• Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.
• Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.
• Makes edits to taxonomy, installs into system with aid of IT specialist.
• Reality check on process change suggestions.

63 Where taxonomy changes come from
[Diagram components: End User (experience), Firewall, Application UI, Application Logic, Content, Taxonomy, Tagging Logic, Tagging UI, Tagging Staff, Taxonomy Editor, Taxonomy Team]
Sources of change
• Staff notes ‘missing’ concepts
• Query log analysis
• Requests from other parts of the organization
Recommendations by Editor
1. Small taxonomy changes (labels, synonyms)
2. Large taxonomy changes (retagging, application changes)
3. New “best bets” content.
Team Considerations
1. Business goals.
2. Changes in user experience.
3. Retagging cost.

64 Taxonomy maintenance processes
• Different organizations will need to consider their own change processes.
  - Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.
  - Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency.
  - Organization 3: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.
• Change process MUST also consider cost of implementing the change
  - Retagging data.
  - Reconfiguring auto-classifier.
  - Retraining staff.
  - Changes in user expectations.

65 Taxonomy maintenance workflow
[Workflow diagram]
Roles: Analyst, Editor, Copywriter, Sys Admin, Taxonomy Tool
Steps: Suggest new name/category; Review new name (Problem? Yes / No); Copy edit new name; Add to enterprise Taxonomy

66 66 Taxonomy Strategies LLC The business of organized information Sample taxonomy editor: Data Harmony Hierarchy Browser Standard Term Info

67 Taxonomy editing tools vendors
[Quadrant chart: Ability to Execute (low to high) vs. Completeness of Vision; quadrants labeled Niche Players and Visionaries]
• Most popular taxonomy editor is MS Excel
• An immature area: no vendors are in the upper-right quadrant!
• MultiTes is widely used, cheap with functionality
• High functionality / high cost products ($100K+)

68 Taxonomy maturity model
• Taxonomy governance processes must fit the organization.
• As consultants, we notice different levels of maturity in the business processes around content management, taxonomy, and metadata.
• Honestly assess your organization’s metadata maturity in order to design appropriate governance processes.
• The following slides present results from a survey of metadata and taxonomy practices at 87 organizations. How does your organization compare?

69 2005 Maturity survey: Search practices (n=87)
Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Search Box in standard place on all web pages. | 20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3)
Search engine indexes multiple repositories in addition to web sites. | 25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5)
Spell Checking. | 31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8)
Synonym Searching. | 41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4)
Search results grouped by date, location, or other factors in addition to simple relevance score. | 37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4)
Queries are logged and the logs are regularly examined. | 31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5)
Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top. (Best Bets) | 46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5)
Advanced computation of relevance based on data in addition to the text of the document. | 43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10)
A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search. | 68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9)
A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal. | 57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7)

70 2005 Maturity survey: Metadata practices (n=87)
Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Metadata standards are developed for the needs of each system with no overall attempt to unify them. | 22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6)
An Organization-wide metadata standard exists and new systems consider it during development. | 37% (22) | | 20% (12) | 0% (0) | 7% (4)
The Organization-wide metadata standard is based on the Dublin Core. | 52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7)
Multiple repositories comply with metadata standard. | 52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7)
A Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard. | 48% (29) | 20% (12) | | 0% (0) | 12% (7)
The Cataloging Policy document is revised periodically. | 48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12)
A centralized metadata repository exists to aggregate and unify metadata from disparate sources. | 57% (34) | 17% (10) | | 0% (0) | 10% (6)
Metadata is manually entered into web forms. | 15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5)
Metadata is generated automatically by software. | 38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9)
Metadata is generated automatically, then reviewed manually for correction. | 48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9)

71 2005 Maturity survey: Taxonomy practices (n=87)
Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Org Chart Taxonomy: one based primarily on the structure of the organization. | 36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9)
Products Taxonomy: one based primarily on the products and/or services offered by the organization. | 37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9)
Content Types Taxonomy: one based primarily on the different types of documents. | 28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4)
Topical Taxonomy: one based primarily on topics of interest to the site users. | 20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4)
Faceted Taxonomy: one which uses several of the approaches above. | 32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3)
The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor. | 75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5)
The Taxonomy follows a written 'style guide' to ensure its consistency over time. | 47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6)
The Taxonomy is maintained using a taxonomy editing tool other than MS Excel. | 35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4)
The Taxonomy was validated on a representative sample of content during its development. | 28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8)
A Roadmap for the future evolution of the Taxonomy has been developed. | 38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5)

72 Questions?
Mike Lauruhn, 415-378-2747, mlauruhn@taxonomystrategies.com
Donna Fritzsche, 312-804-5629, dfritzsche@taxonomystrategies.com
Joseph A. Busch, 415-377-7912, jbusch@taxonomystrategies.com
Ron Daniel Jr, 925-368-8371, rdaniel@taxonomystrategies.com

73 Taxonomy 1-2-3: Webography (1)
H. Chen, S. Dumais. “Bringing order to the web: automatically categorizing search results.” Proceedings of CHI 2000. pp. 145-152. http://research.microsoft.com/copyright/accept.asp?path=http://research.microsoft.com/~sdumais/chi2001.pdf&pub=ACM
Sue Feldman. “The high cost of not finding information.” 13:3 KM World (March 2004) http://www.kmworld.com/publications/magazine/index.cfm?action=readarticle&Article_ID=1725&Publication_ID=108
P.R. Hagen. Must search stink? Forrester Research, June 2000.
K. Hall. Content tagging strategies. Giga Information Group, February 2001.
M. Hearst, A. Elliott, J. English, R. Sinha, K. Swearingen & K. Yee. “Finding the flow in website search.” 45 Communications of the ACM (Sept 2002) http://www.ischool.berkeley.edu/~hearst/papers/cacm02.pdf
J. Morrison. “How to create effective taxonomy.” ZDNet Asia, August 18 2004. http://www.zdnetasia.com/builder/program/dev/0,39045513,39190441,00.htm

74 Taxonomy 1-2-3: Webography (2)
Jakob Nielsen. Web Design and Development.
Eric T. Peterson. “Home Depot uses Endeca to consolidate search and navigation, dramatically increasing conversion: case study.” Jupiter Research (July 11, 2005) http://www.jupiterresearch.com/bin/item.pl/research:casestudy/79/id=96483/
S. Phillips, E. Maguire, C. Shilakes. Content management: The new data infrastructure–Convergence and divergence out of chaos. Merrill Lynch, June 2001.
K.S. Taylor. “The brief reign of the knowledge worker,” 1998. http://online.bcc.ctc.edu/econ/kst/BriefReign/BRwebversion.htm
Taxonomy & content classification: market milestone report. Delphi Group, 2002. http://www.delphiweb.com/knowledgebase/documents/upload/pdf/2176.pdf?session=%5Bg_sid%5D
Taxonomy Warehouse. www.taxonomywarehouse.com
Richard Saul Wurman. Information Architects (1996)

75 Vendors
Taxonomy Editing Tool | URL
Knowledge Workbench | www.convera.com/solutions/retrievalware/KnowledgeWorkbench.aspx
Cuadra STAR/Thesaurus | www.cuadra.com/products/thesaurus.html
Thesaurus Master | www.dataharmony.com/products/tm.htm
Knowledge Engineering Workbench | www.entrieva.com/entrieva/html_site/knowworkbench.htm
MetaTagger | www.interwoven.com/products/content_intelligence/index.html
SmartDiscovery | www.inxight.com/pdfs/Taxonomy_FinalWeb.pdf
MS Excel |
Intelligent Topic Manager | www.mondeca.com
MultiTes Pro | www.multites.com
Taxonomy/Authority File Manager | www.nstein.com/epub/ncm-taxonomy.asp
Protégé | http://protege.stanford.edu/
SchemaServer | www.schemalogic.com
Synaptica | www.factiva.com/products/taxonomy/synaptica.asp?node=menuElem1511
Taxonomy Manager | www.teragram.com/solutions/taxonomy.htm
Term Tree | www.termtree.com.au
Enterprise Vocabulary Server | www.webchoir.com/products/wvs.html
Designer | www.wordmap.com/Enterprise/Taxonomy_and_metadata_management.html

