ROI & Impact: Quantitative & Qualitative Measures for Taxonomies Wednesday, 11 February 2009 12:00 – 12:30 PM MST Presented by Jay Ven Eman, Ph.D., CEO.


ROI & Impact: Quantitative & Qualitative Measures for Taxonomies
Wednesday, 11 February 2009, 12:00 – 12:30 PM MST
Presented by Jay Ven Eman, Ph.D., CEO, Access Innovations, Inc. / Data Harmony
DHUG 2009

First, some questions

- Do you know what a taxonomy is?
- Does your boss's boss know? Care?
- What are YOU trying to accomplish?
- What are your objectives?
- What isn't working? What is? How badly? How much? Who? Where?

Copyright © 2007 Access Innovations, Inc.

First, some questions - 2

- Who are your searchers? Internal? Intranet? External? Web? Fee-based (commercial)? How many? What do they do? How do they do it?
- What are they seeking?
- Why?

Copyright © 2007 Access Innovations, Inc.

First, some questions - 3

- Where are they looking?
- How many searching environments? Physical? Internal resources? External resources? Search interfaces?
- And so on…

Copyright © 2007 Access Innovations, Inc.

“Meaning” starts with a knowledge organization system (KOS)

From not complex ($) to highly complex ($$$$), with LOTS OF OVERLAP:

- Uncontrolled list
- Name authority file
- Synonym set/ring
- Controlled vocabulary
- Taxonomy
- Thesaurus
- Topic map
- Ontology
- SKOS

The Pain of Search

A mission-critical organization of 1,000 employees, assuming a 10% reduction in search time. (The original slide also gave percent of employees, search and analysis hours per week, and average loaded hourly salary for each row; those values were lost in transcription.)

Search use    Annual cost of looking   After 10% reduction   Difference
High          $8,736,000               $7,862,400            $873,600
Medium        $44,928,000              $40,435,200           $4,492,800
Low           $3,120,000               $2,808,000            $312,000
Total         $56,784,000              $51,105,600           $5,678,400

Copyright © 2007 Access Innovations, Inc.
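The cost model behind the table above reduces to a few lines of arithmetic. The sketch below is an illustrative reconstruction, not the slide's actual spreadsheet; the staffing inputs in the example call are hypothetical, since the per-row hours and salary figures did not survive transcription.

```python
def annual_search_cost(employees, search_hours_per_week, loaded_hourly_rate,
                       weeks_per_year=52):
    """Annual cost of employee time spent searching."""
    return employees * search_hours_per_week * loaded_hourly_rate * weeks_per_year

def savings_from_reduction(annual_cost, reduction=0.10):
    """Dollars recovered by cutting search time by `reduction`."""
    return annual_cost * reduction

# Hypothetical staffing inputs (chosen to reproduce the Medium row's total):
cost = annual_search_cost(employees=600, search_hours_per_week=24,
                          loaded_hourly_rate=60)
print(cost)  # 44928000

# The slide's bottom line: 10% of the $56,784,000 total is $5,678,400.
print(savings_from_reduction(56_784_000))  # 5678400.0
```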

ROI - Segments

- Cost of taxonomy system
- Indexing costs
- Cost of getting system ready
- Ongoing maintenance
- Increased efficiency
- Increased quality of retrieval
- Cost of legacy system maintenance

Copyright © 2005 Access Innovations, Inc.

Taxonomy construction

Process                        Terms/hr   # of terms      Cost/hr   Cost
From scratch                   4          5,000           $75       $93,750
License                        —          —               —         …K
License & customize            —          …,500 + 5,000   —         —
Auto-generate/cleanup + tool   —          —               —         …,000
Mapping                        —          —               —         …,875

(Only fragments of the last four rows survived transcription; missing cells are marked "—".)

Indexing & Search Metrics

- Hit, Miss, Noise
- Subjective: Relevance, Aboutness
- Statistical: Precision, Recall, Level of effort

Hit, Miss, Noise

- Hit – exactly what a human indexer would use
- Miss – a human indexer would use it, but the system did not assign it
- Noise – the system assigned it, but a human would not
  - Relevant noise – could have been assigned
  - Irrelevant noise – just plain wrong

Subjective  Relevance Reflects how akin it is to the users request  Aboutness Reflects the topical match between the document content and the term How well the topic describes what the document is about  Varies with level of conceptual terms vs. factual terms in the thesaurus

Subjective  “There is now a 92% accuracy rating accuracy on accounting and regulatory document search based on hit, miss and noise or relevance, precision and recall statistics…Access Innovations.” USGAO  “IEEE had their system up and running in three days, in full production in less than two weeks.” Institute of Electrical and Electronics Engineers (IEEE)  “The American Economic Association said its editors think using it is fun and makes time fly!” American Economic Association (AEA)  “ ProQuest CSA have achieved a 7 fold increase in productivity – thus they have four licenses.” ProQuest CSA  “Weather Channel finds things 50% faster using Data Harmony. A significant saving in time.” The Weather Channel

Statistical

- Precision = correct retrievals / total retrieved = Hits / (Hits + Noise)
- Recall = correct retrievals / total correct in system = Hits / (Hits + Misses)
- Level of effort = Hits / (Hits + Misses + Noise)
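Once hits, misses, and noise have been counted against a human-indexed gold standard, the three ratios above are straightforward to compute. A minimal sketch with hypothetical counts:

```python
def precision(hits, noise):
    """Fraction of assigned terms that were correct: Hits / (Hits + Noise)."""
    return hits / (hits + noise)

def recall(hits, misses):
    """Fraction of correct terms that were assigned: Hits / (Hits + Misses)."""
    return hits / (hits + misses)

def level_of_effort(hits, misses, noise):
    """Hits / (Hits + Misses + Noise): penalizes both what was missed
    and the noise that must be cleaned up."""
    return hits / (hits + misses + noise)

# Example: 85 hits, 10 misses, 15 noise terms on a test batch
print(round(precision(85, 15), 2))            # 0.85
print(round(recall(85, 10), 2))               # 0.89
print(round(level_of_effort(85, 10, 15), 2))  # 0.77
```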

Cost Goals

- Cost Savings
  - Software/hardware
  - More efficient delivery systems
  - Retirement of legacy systems
- Cost Avoidance
  - Additional staff not needed to scale
  - Lower training costs

Productivity Goals

- Productivity gains
  - Employee productivity – fourfold
  - Get up to speed faster
  - Learn the vocabulary faster
  - Able to capture people's knowledge in the rule base
- Staff savings / redeployment
- Elimination of new hires

Additional Benefits

- Revenue Generation
  - Higher hit rates → more purchases off the site
- Competitive advantage
  - Shorter product / sales cycles
  - Faster implementation
  - Better search experience
- Ability to meet regulatory requirements

Go – No Go

- Reach 85% precision to launch for productivity (assisted indexing)
- Reach 85% for filtering or categorization (sorting for production)
- Level of effort to get to 85%
- Integration into the workflow is efficient

Benchmarks

- 15 – 20% irrelevant returns / noise
- Amount of work needed to achieve the 85% level
- How good is good enough?
  - Satisfice = satisfaction + suffice
  - How much error can you put up with?

Example ROI Calculation

- Assume a 5,000-term thesaurus
  - 1.5 synonyms per term
  - 7,500 terms total
- Assume 85% accuracy
  - Use assisted for indexing
  - Use automatically for filtering
- Assume $75 per hour for staff
- Assume 10,000 records for the test batch

Indexing costs with Data Harmony

- 80% of rules built automatically: 7,500 × 0.8 = 6,000
- 20% require complex rules
  - Average rule takes 5 minutes (actually MUCH faster using the M.A.I. GUI)
  - 5 × 1,500 = 7,500 minutes
  - 125 hours × $75 = $9,375

Indexing Costs

- Base cost of MAIstro EE – $60,000
- Cost of getting system ready
  - Programming support and integration: estimated at 2 weeks of programming at $125/hour = $10,000
  - Rule building: estimated at 125 hours at $75/hour = $9,375
  - Possible need to re-run the training set several times
- Ongoing maintenance
  - Estimated at 15% of purchase price for the license = $9,000
  - Rule building for new terms: 50 terms per quarter
    - 200 terms × 0.8 = 160 automatic
    - 40 at 5 minutes per term = 200 minutes / 60 = 3.33 hours × $75 = $250
- Targeted initial accuracy at 85%
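The year-one and ongoing figures above reduce to simple arithmetic. A sketch using the slide's own rates (20% of rules need hand-building at 5 minutes each, $75/hour staff):

```python
COMPLEX_FRACTION = 0.2  # share of rules that must be hand-built
RULE_MINUTES = 5        # average time to hand-build one complex rule
STAFF_RATE = 75         # staff cost, $ per hour

def rule_building_cost(term_count):
    """Cost of hand-building the complex rules for `term_count` terms."""
    complex_rules = term_count * COMPLEX_FRACTION
    return complex_rules * RULE_MINUTES * STAFF_RATE / 60

# Year one: license + integration + initial rule building
year_one = 60_000 + 10_000 + rule_building_cost(7_500)
print(year_one)  # 79375.0

# Ongoing: 15% license maintenance + rules for 200 new terms/year
ongoing = 0.15 * 60_000 + rule_building_cost(200)
print(ongoing)  # 9250.0
```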

Indexing costs

- Year one: $60,000 + $10,000 + $9,375 = $79,375
- Years thereafter: $9,250
- 85% accuracy

ROI

- Taxonomy costs = $67,500
- Indexing costs = $79,375
- Pain of search – difference = $5,678,400
- If off by a factor of 4, still a positive ROI of 241%

Copyright © 2007 Access Innovations, Inc.
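The final comparison is first-year cost versus the estimated annual savings from reduced search time. A minimal sketch of the formula; the slide's 241% figure evidently builds in additional conservatism beyond the raw numbers shown, so the example simply demonstrates the calculation.

```python
def roi_percent(annual_savings, total_cost):
    """Simple ROI: net gain as a percentage of cost."""
    return (annual_savings - total_cost) / total_cost * 100

total_cost = 67_500 + 79_375   # taxonomy + year-one indexing = $146,875
savings = 5_678_400            # "pain of search" difference from the earlier slide

# Even if the savings estimate is off by a factor of 4,
# ROI stays strongly positive on these raw numbers:
print(round(roi_percent(savings / 4, total_cost)))  # 867
```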

ROI & Impact: Quantitative & Qualitative Measures for Taxonomies
Wednesday, 11 February 2009, 12:00 – 12:30 PM MST
Presented by Jay Ven Eman, Ph.D., CEO, Access Innovations, Inc. / Data Harmony
Thank you!