Taxonomies in Electronic Records Management Systems May 21, 2002.

2 Terms
• Controlled Vocabulary
  » A collection of terms that indicates which are preferred and which are variants of the preferred terms.
• Thesaurus
  » A type of controlled vocabulary that shows the hierarchical (parent-child), associative (related terms) and equivalent (synonymous) relationships among terms.
• Taxonomy
  » Hierarchical classification of elements within a domain. One type of taxonomy is a File Plan.
• Ontology
  » A classification that is more complex and subtle than a taxonomy. It explains how objects relate by mapping relationships such as "part of" or "located in". Also called knowledge mapping.
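The definitions above can be made concrete with a small data structure. This is a minimal sketch, not any vendor's model: the `Term` and `Thesaurus` classes and the sample terms ("Records", "Contracts", "Agreements", "Vendors") are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term and its three kinds of thesaurus relationship."""
    preferred: str
    variants: set[str] = field(default_factory=set)   # equivalent (synonyms)
    broader: str | None = None                        # hierarchical (parent)
    related: set[str] = field(default_factory=set)    # associative


class Thesaurus:
    def __init__(self) -> None:
        self._terms: dict[str, Term] = {}
        self._lookup: dict[str, str] = {}  # any label -> preferred term

    def add(self, term: Term) -> None:
        self._terms[term.preferred] = term
        for label in {term.preferred, *term.variants}:
            self._lookup[label.lower()] = term.preferred

    def preferred(self, label: str) -> str | None:
        """Controlled-vocabulary view: map a variant to its preferred term."""
        return self._lookup.get(label.lower())

    def path(self, label: str) -> list[str]:
        """Taxonomy view: follow parent links from the term up to the root."""
        name = self.preferred(label)
        seen: list[str] = []
        while name is not None and name in self._terms and name not in seen:
            seen.append(name)
            name = self._terms[name].broader
        return seen[::-1]


# Hypothetical sample data.
thesaurus = Thesaurus()
thesaurus.add(Term("Records"))
thesaurus.add(Term("Contracts", variants={"Agreements"},
                   broader="Records", related={"Vendors"}))

print(thesaurus.preferred("agreements"))
print(thesaurus.path("Agreements"))
```

Keeping only the `broader` links gives a taxonomy such as a file plan; an ontology would add typed relationships such as "part of", which this sketch does not model.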

3 Why Use a Taxonomy
• Management of Records
  » Structure for Classification
  » Navigational Tool
  » Reduced Burden on Users
• More Consistent Than Humans
• Sheer Volume of Information
  » Document Level Vs Folder Level
  » High Speed Processing
• More Than 80% of All Information Is Unstructured

4 Example: FirstGov.gov

5 Example: File Plan

6 Example: Visual Map

7 How Do Taxonomy Tools Work?
• General
  » Understand Relevancy to Categories
  » Create Knowledge Clusters
  » Enable Types to Be Combined
• Training Based
  » Require Representative Samples
  » Identify Patterns
  » Create Statistical Models
• Rule Based
  » Process Rules Devised and Hand-coded by Humans
  » Contain Keywords and Logical Relationships
• Linguistics Based
  » Use Algorithms
  » Understand Linguistic and Semantic Elements
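As an illustration of the rule-based approach only, the sketch below hand-codes rules that combine keywords with logical relationships (all of, any of, none of). The `Rule` class, the `categorize` function, and the sample rules are assumptions made for this example and do not describe how any particular product works.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hand-coded rule: keywords joined by AND / OR / NOT."""
    category: str
    all_of: frozenset = frozenset()   # every keyword must appear (AND)
    any_of: frozenset = frozenset()   # at least one must appear (OR)
    none_of: frozenset = frozenset()  # none may appear (NOT)

    def matches(self, words: set) -> bool:
        return (
            self.all_of <= words
            and (not self.any_of or bool(self.any_of & words))
            and not (self.none_of & words)
        )


def tokenize(text: str) -> set:
    """Lower-case the text and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def categorize(text: str, rules: list) -> list:
    """Return every category whose rule matches the text."""
    words = tokenize(text)
    return [rule.category for rule in rules if rule.matches(words)]


# Hypothetical rules for two file plan categories.
RULES = [
    Rule("Invoices",
         all_of=frozenset({"invoice"}),
         any_of=frozenset({"payment", "due"}),
         none_of=frozenset({"draft"})),
    Rule("Contracts", any_of=frozenset({"contract", "agreement"})),
]

print(categorize("Invoice 17: payment due in 30 days", RULES))
print(categorize("Draft invoice, payment pending", RULES))
```

A training-based tool would replace the hand-written `RULES` with a statistical model learned from representative samples, which is why it needs a human-selected training set.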

8 Taxonomy Uses in Electronic Recordkeeping Systems
• Auto Categorization
• Searching and Browsing
• File Plan Creation and Maintenance

9 Auto Categorization

10 Auto Categorization Case Studies
• National Archives and Records Administration
  » 12,000 Documents
  » Granular File Plan
  » Single Repository
• University of Nevada for Department of Energy
  » 150,000 Documents
  » 99.5% Accuracy in Identifying Non Records
  » Less Than 1 in 20 Documents Required Human Intervention
• Department of Education
  » 90,000 Documents
  » Accuracy Enhanced by Narrowing Categories
  » 100% Accuracy Categorizing to Retention Periods

11 Auto Categorization Anecdotes
• Factiva
  » 1500 Topics
  » Target of 45% Accuracy
  » Achieving 60-80% Accuracy
• Gartner Group Findings
  » Typical Accuracy Is 80-95% When Broad Non-overlapping Categories Are Used
• One Vendor's Literature
  » 75-80% Accuracy Is Typical

12 Common Themes
• Mutually Exclusive Categories Increase Accuracy
• Big Bucket Theory
• Easy Retrieval Vs Easy Filing
• Stove Piping Vs Open System
• Human Effort Necessary
  » Select Training Set
  » Quality Control
  » Fine Tune

13 Comments on Accuracy
• No Case Study Achieved 100% in Categorization
• Accuracy Rises With Fewer Categories
• Short Documents Can Have Too Little Content
• Long Documents Can Cover Too Many Topics
• Fly in Ointment
  » Accuracy Diminishes at Each Level Down in the File Plan
  » In a System Where Auto Categorization Is 80% Accurate, the Expected Accuracy for the Proper Assignment of a Document at the Third Level Down Would Be About 51%
• Critical Element - Records Management
  » Control of File Plan
  » Understanding of Technology
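The 51% figure is consistent with multiplying the per-level accuracy once for each level of the file plan: 0.80 × 0.80 × 0.80 ≈ 0.51. That reading assumes each level's decision is equally accurate and independent of the others, which the slide does not state. A minimal sketch under that assumption (the function name `expected_accuracy` is hypothetical):

```python
def expected_accuracy(per_level_accuracy: float, depth: int) -> float:
    """Chance that all `depth` level decisions are right, assuming each
    level is independent and has the same accuracy (an assumption)."""
    if not 0.0 <= per_level_accuracy <= 1.0:
        raise ValueError("per_level_accuracy must be between 0 and 1")
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return per_level_accuracy ** depth


for depth in range(1, 4):
    print(f"level {depth}: {expected_accuracy(0.80, depth):.1%}")
# level 1: 80.0%
# level 2: 64.0%
# level 3: 51.2%
```

The same arithmetic shows why fewer, broader categories help: a shallower file plan compounds fewer decisions.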

14 Searching and Browsing

15 Searching and Browsing
The only thing harder than finding something is finding it again.
• Searching
  » Looking For Something You Know About
  » Generally Easy in Electronic Documents
  » The Document Comes to You
• Browsing
  » Looking Through a Collection to See What Is There
  » Generally Difficult in Electronic Documents
  » You Go to the Document(s)
• Contextual Browsing
  » Accessing Other Relevant Content Related to the Content Being Viewed
  » Other Objects May Not Have Been Grouped Together
  » Prospective Navigation

16 The Beauty of a Taxonomy Tool
• Delivers Information You Did Not Know You Had
• Identifies Unknown Associations Between Documents
• Summarizes or Abstracts Content
• Uses Visual Maps
• Does Not Require User to Know Location of the Information

17 Visual Map

18 Visual Map Drilled to Document Level

19 File Plan Creation

20 File Plan Creation Using a Taxonomy Tool
• Information Architecture Based on Content
• Electronically Generated File Plan
"It is possible to produce affinities through automatic categorization without a pre-existing taxonomy. These categories can then be edited and renamed. Once categories have been created by humans, documents and other information objects can be automatically assigned to those categories." (Gartner Group)

21 Feasibility of Using Taxonomy Software for File Plan Creation
• Feasible to Develop a True Records Management File Plan Using Software
• Feasible to Populate an RMA With Electronically Generated File Plan
• Feasible to Compile a Quantity of Quality Documents to Mine for Creating the Taxonomy

22 Then Why Hasn't It Been Done?
• Existing Retention Schedules Not Built This Way
  » Map Required File Plan Elements to Appropriate Retention Classification
  OR
  » Re-Engineer Retention Schedules
• Usability for File Plan Development Untested
  » Statistically Correct
  BUT
  » May Not Appear Natural to Users

23 Scenario
• Humans Create Top Level of File Plan
• Software Mines Data - Free Categorization
• Software Forms Category Patterns
• Humans Use Results to Create One Subsidiary Level in File Plan
• Humans Associate Retention Schedules at Secondary Level of File Plan
• Software Auto Categorizes Documents Into File Plan
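One consequence of the last two steps can be sketched in code: if retention schedules are attached at the second level of the file plan, a document's retention follows from the first two elements of the category path the software assigns. The category names, retention periods, and the `retention_for` function below are all hypothetical and serve only to illustrate the idea.

```python
from typing import Optional, Sequence

# Hypothetical schedule: humans attach a retention period (in years)
# to each (top level, second level) pair in the file plan.
RETENTION_YEARS = {
    ("Finance", "Invoices"): 7,
    ("Finance", "Budgets"): 3,
    ("Personnel", "Training"): 5,
}


def retention_for(path: Sequence[str]) -> Optional[int]:
    """Look up retention from the first two levels of a category path.

    Returns None when no schedule is attached, which in this sketch
    means the document should go to a person for review.
    """
    if len(path) < 2:
        raise ValueError("path must reach the second level of the file plan")
    return RETENTION_YEARS.get((path[0], path[1]))


print(retention_for(["Finance", "Invoices", "2002"]))  # schedule found
print(retention_for(["Legal", "Contracts"]))           # no schedule attached
```

Because the lookup needs only two levels to be correct, it is less exposed to the accuracy loss that compounds at deeper levels of the file plan.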

24 Hybrid Solution

25 Conclusion
• Use for Support – Not Full Automation
• Ongoing Human Commitment to Plan, Create, and Maintain
• Consider Portfolio Approach – Mixing Products
• Very Effective for Searching and Browsing
• Capture and Search Legacy Documents That Otherwise Would Be Too Costly to Process
• Integrate With Document Imaging System
• Potential Is Huge

26 Resources

27 Web Sites With Energy Glossaries/Thesauri

28 Cool Stuff
• Thesaurus Management Tools
• Books
  » Content Management Bible, Bob Boiko
  » Information Architecture for the World Wide Web, Louis Rosenfeld & Peter Morville
• Free Search Engine for Your Web Site

29 More Cool Stuff
• DOE Related Use of Taxonomy Tool for Searching and Browsing
• Controlled Vocabularies, Thesauri and Classification Systems Available on the Web
• Information Architecture White Papers and Publications
• Virtual Library

30 THANK YOU!
Angela Tayfun, CRM
AT&T Government Solutions, Inc
Gallows Road
Vienna, VA
Ph: