Beth Golden Manager, Editorial Services Factiva Intelligent Indexing SLA 2004.

Agenda
- Factiva Intelligent Indexing
- Application of Factiva Intelligent Indexing
- Pros and Cons
- Quality Control

Factiva Intelligent Indexing
Factiva Taxonomy
- 320,000 companies
- 760+ industries
- 450+ news subjects
- 370+ regions
- 22 languages

FII Structure
- One universal taxonomy
- Building blocks
- Inclusive hierarchy
- Polyarchy
- Synonyms and alias names
- Full descriptions
- Variable depth and breadth

Polyarchy
- Internet/Online services
  - E-commerce
  - Internet browsers
  - Internet portals
  - Internet search engines
  - Internet service providers
  - etc.
- Computers
  - Computer hardware
  - Computer services
  - Computer stores
  - Networking
  - Semiconductors
  - Software
    - Applications software
    - GroupWare
    - Intelligent agents
    - Internet browsers
    - etc.
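The polyarchy on this slide, where "Internet browsers" sits under more than one parent, can be sketched as a simple parent-to-children table. This is an illustrative sketch only: the term names come from the slide, while the dictionary layout and the helper functions `parents_of` and `ancestors_of` are assumptions, not Factiva's implementation.

```python
# Parent -> children edges taken from the slide. The nesting under
# "Software" and the dict layout itself are assumptions for illustration.
CHILDREN = {
    "Internet/Online services": [
        "E-commerce", "Internet browsers", "Internet portals",
        "Internet search engines", "Internet service providers",
    ],
    "Computers": [
        "Computer hardware", "Computer services", "Computer stores",
        "Networking", "Semiconductors", "Software",
    ],
    "Software": [
        "Applications software", "GroupWare", "Intelligent agents",
        "Internet browsers",
    ],
}


def parents_of(term):
    """Return every direct parent of a term (a polyarchy allows several)."""
    return sorted(p for p, kids in CHILDREN.items() if term in kids)


def ancestors_of(term):
    """Return all ancestors of a term, following every parent path."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents_of(stack.pop()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)
```

Under this sketch, a search on either "Computers" or "Internet/Online services" would be able to reach documents coded "Internet browsers", because the term has ancestors in both branches.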

Factiva Intelligent Indexing
- Company Codes
- Industry Codes
- Subject Codes
- Region Codes
- Codes on documents
- Search

FII Application
- Code mapping
- Entity extraction
- Rule-based system
- Linguistic analysis software
- Manual review

Code Mapping
Most information providers provide some form of metadata. This is matched to relevant Factiva indexing terms.
Advantages:
- Easy and quick
- Efficient use of existing data
Disadvantages:
- Mismatches between coding schemes
- Different interpretations of same concepts
- Variable quality – which sources do you trust?
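Code mapping as described above amounts to a lookup from a provider's own metadata to internal indexing terms. The sketch below is a minimal illustration under stated assumptions: the provider names, provider codes, the `CODE_MAP` table and the `map_codes` function are all hypothetical, and only the target term names are borrowed from the taxonomy slide.

```python
# Hypothetical mapping table: (provider, provider_code) -> indexing term.
# None of these provider names or codes come from the presentation.
CODE_MAP = {
    ("provider_a", "TECH"): "Computers",
    ("provider_a", "WEB"): "Internet/Online services",
    ("provider_b", "software"): "Software",
}


def map_codes(provider, provider_codes):
    """Translate provider metadata into indexing terms.

    Returns (mapped_terms, unmapped_codes). Unmapped codes are kept
    so that a mismatch between coding schemes is visible, not silent.
    """
    mapped, unmapped = [], []
    for code in provider_codes:
        term = CODE_MAP.get((provider, code))
        if term is None:
            unmapped.append(code)
        elif term not in mapped:
            mapped.append(term)
    return mapped, unmapped
```

Returning the unmapped codes separately is one way to surface the "mismatches between coding schemes" listed as a disadvantage, for example by routing them to an editor.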

Entity extraction
This tool finds company names, which are then compared to our controlled vocabulary.
Advantages:
- Consistent
- Precise
Disadvantages:
- Ambiguous names
- High maintenance costs
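The comparison of found names against a controlled vocabulary can be sketched as an alias lookup. This is a simplified stand-in, not the tool the presentation refers to: the alias table, the company codes and the function `extract_companies` are hypothetical, and real extraction would need far more than whole-word matching.

```python
import re

# Hypothetical controlled vocabulary: alias -> company codes.
# An alias shared by several companies is treated as ambiguous.
ALIASES = {
    "acme corp": ["ACME01"],
    "acme": ["ACME01", "ACME02"],
    "globex": ["GLBX01"],
}


def extract_companies(text):
    """Return (resolved_codes, ambiguous_aliases) found in the text."""
    lowered = text.lower()
    resolved, ambiguous = set(), set()
    # Try longer aliases first so "acme corp" wins over plain "acme".
    for alias in sorted(ALIASES, key=len, reverse=True):
        pattern = r"\b" + re.escape(alias) + r"\b"
        if re.search(pattern, lowered):
            codes = ALIASES[alias]
            if len(codes) == 1:
                resolved.add(codes[0])
            else:
                ambiguous.add(alias)
            # Blank out the match so shorter aliases do not re-match it.
            lowered = re.sub(pattern, " ", lowered)
    return sorted(resolved), sorted(ambiguous)
```

The ambiguous list illustrates the first disadvantage on the slide: a bare name such as "acme" cannot be coded without more context, and every new alias adds to the maintenance burden.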

Symbology Snapshot

Rule-based system
Sets of IF-THEN statements established by editors, information architects, or subject-matter experts.
Advantages:
- Good at highly formulaic content
- Precise
Disadvantages:
- Need thousands of rules for a complete system
- Maintenance of the rules themselves becomes VERY expensive!
- Only captures explicit concepts
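A set of IF-THEN statements of this kind can be sketched as a list of trigger phrases paired with codes. The rules, code names and the function `apply_rules` below are hypothetical examples chosen for illustration, not rules from the Factiva system.

```python
# Hypothetical rules: IF every trigger phrase appears in the text,
# THEN assign the code. Real rule sets would run to thousands of entries.
RULES = [
    ({"agreed to acquire"}, "Acquisitions"),
    ({"quarterly earnings"}, "Earnings"),
]


def apply_rules(text):
    """Return the codes of all rules whose trigger phrases all appear."""
    lowered = text.lower()
    return [code for phrases, code in RULES
            if all(phrase in lowered for phrase in phrases)]
```

For example, `apply_rules("The company reported quarterly earnings today.")` yields the "Earnings" code, while a sentence such as "Profits climbed in the last three months." yields nothing, which is the "only captures explicit concepts" limitation in miniature.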

Example

Linguistics-based categorization
This tool is currently employed across all English, French, German and Spanish language publications. A combination of linguistic analysis and statistical algorithms allows new content to be compared to example data and coded appropriately.
Advantages:
- Scales to millions of documents, thousands of categories, multiple languages
- Copes well with change
- Fits editorial workflow
- Good fine-tuning tools – editorial control
- Codes implicit as well as explicit concepts
Disadvantages:
- Training time and cost

Editorial Control
- Set relevance levels
- Maintain training set
- Stop words - correlation and multiple meanings
Example: "Chechnya" was added as a stop word to the industries model, as it was triggering the freelance journalist code (because so many of them were dying there).
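The idea of comparing new content to example data, together with the editorial controls above (relevance levels, training set, stop words), can be sketched with a very small word-count classifier. This is a swapped-in toy technique (cosine similarity against per-code word profiles), not the linguistic analysis software the presentation describes; the function names, the `min_score` threshold and any training sentences passed in are assumptions.

```python
import math
import re
from collections import Counter


def tokens(text, stop_words=frozenset()):
    """Lowercase word tokens, minus any editor-supplied stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in stop_words]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def train(examples, stop_words=frozenset()):
    """Build one word profile per code from (text, code) example pairs."""
    model = {}
    for text, code in examples:
        model.setdefault(code, Counter()).update(tokens(text, stop_words))
    return model


def categorize(text, model, min_score=0.25, stop_words=frozenset()):
    """Return codes whose profile similarity meets the relevance level."""
    doc = Counter(tokens(text, stop_words))
    scores = {code: cosine(doc, profile) for code, profile in model.items()}
    return sorted(code for code, score in scores.items()
                  if score >= min_score)
```

In this sketch, if the examples for a journalist code frequently mention a place name, unrelated stories about that place can score above `min_score`; passing the place name in `stop_words` at both training and categorization time removes that correlation, which mirrors the stop-word control on the slide.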

Manual coding
About 200 editors spread across main time zones.
Advantages:
- Humans easily grasp the gist of the story
- Cope well with exceptions
- Visible/Controllable
Disadvantages:
- Very resource-intensive = Expensive
- Slow
- Inconsistent (subjective and temporal)
- Not scalable

Review process
- Lists reviewed every three months: redefinition, new codes, expansion changes
- Market research/customer feedback and behavior
- Changes to parent schemes/standards
- Editorial/Quality control feedback
- Internal coding forum
- 45-day notice period

Quality control
- Sampling by editors
- Scoring for precision and recall
- Analysis by source, language, code, editor etc.
- Feedback to editors and systems
- Corrective action
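Scoring a sampled document for precision and recall can be sketched as a comparison between the codes a system assigned and the codes a reviewing editor considers correct. The function name `precision_recall` and the choice to return 0.0 when a set is empty are assumptions made for this illustration.

```python
def precision_recall(assigned, expected):
    """Score one document's codes against an editor's reference codes.

    precision = correct assigned codes / all assigned codes
    recall    = correct assigned codes / all expected codes
    Returning 0.0 for an empty set is a convention chosen for this sketch.
    """
    assigned, expected = set(assigned), set(expected)
    correct = len(assigned & expected)
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall
```

For instance, a document assigned four codes of which the editor accepts two, with no codes missing, scores a precision of 0.5 and a recall of 1.0. Averaging such scores by source, language, code or editor would give the kind of breakdown the slide lists.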

Results
- Three million articles coded a month
- All receive a level of autocoding
- Seventy-nine percent automation: more than two million articles are auto-coded with no further manual review

Recap
- Factiva's taxonomy is Factiva Intelligent Indexing
- Factiva uses a hybrid methodology for application
- Factiva has a coding team for governance and maintenance
End result: Factiva Intelligent Indexing leverages our editorial strengths, combining human experience and expertise with the latest automation software to implement a completely flexible and granular indexing system across all of our content.