GTRI.ppt-1 NLP Technology Applied to e-discovery Bill Underwood Principal Research Scientist “The Current Status and.

Presentation transcript:

GTRI.ppt-1 NLP Technology Applied to e-discovery
Bill Underwood, Principal Research Scientist
“The Current Status and Future of Search and Retrieval Technology”
WG1 Mid-Year Meeting, Cambridge, Maryland, April 21-22, 2002

GTRI.ppt-2 Research Sponsored by ERA Program of NARA
Application of Natural Language Processing technology to effectively:
- Summarize series of Presidential e-records
- Identify FOIA exemptions and PRA restrictions in Presidential e-records
- Search for e-records relevant to a FOIA request
- Search for e-records in massive collections in support of legal discovery

GTRI.ppt-3 NLP Methods in Document Retrieval
- Morphological processing
- Identifying words
- Parsing: linguistic representation
- Word sense disambiguation
- Represent, identify and exploit semantic relationships
- Conceptual indexing
- Matching concepts in query to conceptual index
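The last two items on this slide (conceptual indexing, then matching query concepts against that index) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the presentation: the tiny concept lexicon, the crude suffix stripper that stands in for morphological processing, and the sample documents. Parsing and word sense disambiguation are skipped entirely.

```python
import re
from collections import defaultdict

# Hypothetical concept lexicon: surface term -> concept label.
CONCEPTS = {
    "nominate": "APPOINTMENT",
    "appoint": "APPOINTMENT",
    "memo": "CORRESPONDENCE",
    "letter": "CORRESPONDENCE",
}

def tokenize(text):
    """Identify words: lowercase alphabetic runs."""
    return re.findall(r"[a-z]+", text.lower())

def stem(word):
    """Crude suffix stripping; a stand-in for real morphological analysis."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

# Key the lexicon by stem so inflected forms map to the same concept.
CONCEPT_BY_STEM = {stem(term): concept for term, concept in CONCEPTS.items()}

def concepts_in(text):
    """Return the set of concept labels mentioned in a text."""
    found = set()
    for token in tokenize(text):
        concept = CONCEPT_BY_STEM.get(stem(token))
        if concept:
            found.add(concept)
    return found

def build_concept_index(docs):
    """Conceptual index: concept label -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in concepts_in(text):
            index[concept].add(doc_id)
    return index

def search(query, index):
    """Return documents that contain every concept found in the query."""
    concepts = concepts_in(query)
    if not concepts:
        return set()
    return set.intersection(*(index.get(c, set()) for c in concepts))

# Made-up sample documents.
DOCS = {
    "d1": "The President nominated a new ambassador.",
    "d2": "Letters appointing the committee were sent.",
    "d3": "A memo about the budget.",
}
INDEX = build_concept_index(DOCS)

print(sorted(search("nominates", INDEX)))
print(sorted(search("appoint letter", INDEX)))
```

The query "nominates" also retrieves d2, which never uses that word, because both "nominated" and "appointing" map to the same concept. That is the benefit conceptual indexing is meant to provide over plain keyword matching.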

GTRI.ppt-4 Current Weaknesses of NLP in Information Retrieval
NLP methods of document retrieval have failed to perform better than Boolean and statistical methods. Why?
- Broad nature of retrieval tasks
- Lack of weighting scheme for compound terms
- Poor word sense disambiguation for documents and queries
- Need to handle verbs as well as nouns and noun phrases
- Poor POS tagging
- Need better parsing algorithms and grammars
- Inadequate handling of negation

GTRI.ppt-5 Advanced NLP Methods Applied to PERPOS Research Tasks
- Morphological analysis
- Word sense disambiguation
- Larger lexicon
- Domain-dependent lexicons
- Information extraction to identify classes of words
- Template filling to identify communication acts of records (nominate, request information, provide information)
- Learning and identification of document types
- Method of reasoning with negation in NL
- Conceptual taxonomy
- Rule-based reasoning
- Question answering technology
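As a rough illustration of template filling for communication acts, the sketch below labels a sentence with one of the three acts named on the slide and records the phrase that triggered the label. The cue patterns, the `CommunicationAct` template, and the sample sentences are all hypothetical; the slides do not describe how PERPOS actually performs this step, and a real system would presumably rely on parsing and information extraction instead of surface patterns.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class CommunicationAct:
    """A minimal filled template: the act label and the cue that triggered it."""
    act: str
    cue: str

# Hypothetical surface cues for the three acts named on the slide.
# Patterns are tried in order, so the first matching act wins.
CUE_PATTERNS = [
    ("nominate", re.compile(r"\bnominat\w*", re.I)),
    ("request information",
     re.compile(r"\bplease (?:send|provide|advise)\b", re.I)),
    ("provide information",
     re.compile(r"\b(?:attached is|enclosed is|for your information)\b", re.I)),
]

def classify_act(text: str) -> Optional[CommunicationAct]:
    """Return the first communication act whose cue appears in the text."""
    for act, pattern in CUE_PATTERNS:
        match = pattern.search(text)
        if match:
            return CommunicationAct(act=act, cue=match.group(0))
    return None

# Made-up sample sentences.
for sentence in [
    "I intend to nominate Jane Doe to be Ambassador.",
    "Please provide the budget figures by Friday.",
    "Attached is the report you asked for.",
    "Lunch is at noon.",
]:
    print(sentence, "->", classify_act(sentence))
```

Recording the cue alongside the label keeps each decision inspectable, which matters when a reviewer has to confirm why a record was classified a certain way.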

GTRI.ppt-6 Plausible, Hybrid Approach to Investigating e-discovery
- Formulate the e-discovery task not just in search terms but also in terms of the complaint itself, including the parties and laws involved.
- Express the kinds of evidence that would enable one to prove the case as a series of questions or if-then rules drawn from precedent cases and experience.
- Use a COTS text retrieval system with Boolean queries and statistical methods to retrieve documents using key terms related to the case.
- Use contextual knowledge with questions and NLP methods (e.g., question answering) to review the retrieved documents to determine more precisely those relevant to the case, i.e., those that would represent evidence.
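The two-stage idea on this slide can be sketched in a few lines: a Boolean query first narrows the collection, and if-then rules then review only the retrieved documents. This is a toy under stated assumptions, not the approach's implementation. The inverted index stands in for the COTS retrieval system, the single regular-expression rule stands in for rules drawn from precedent, and the party name "Acme" and all documents are made up. The statistical ranking and question answering steps are omitted.

```python
import re
from collections import defaultdict

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def build_inverted_index(docs):
    """Term -> set of document ids (stand-in for a COTS retrieval system)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def boolean_and_of_ors(index, groups):
    """Stage 1: evaluate (a OR b) AND (c OR d) ... over the index."""
    result = None
    for group in groups:
        hits = set()
        for term in group:
            hits |= index.get(term, set())
        result = hits if result is None else result & hits
    return result or set()

# Hypothetical if-then rule: IF the document mentions the party AND a delay,
# THEN flag it as possible evidence. Each rule is (name, required patterns).
RULES = [
    ("late delivery by party", [r"\bacme\b", r"\b(?:late|delay\w*)\b"]),
]

def review(docs, candidate_ids, rules):
    """Stage 2: apply the rules only to documents retrieved in stage 1."""
    findings = {}
    for doc_id in sorted(candidate_ids):
        matched = [
            name for name, patterns in rules
            if all(re.search(p, docs[doc_id], re.I) for p in patterns)
        ]
        if matched:
            findings[doc_id] = matched
    return findings

# Made-up sample collection.
DOCS = {
    "d1": "Acme confirmed the shipment will be delayed until March.",
    "d2": "The shipment arrived on schedule.",
    "d3": "Acme holiday party is on Friday.",
    "d4": "Delivery from Acme was late again.",
}

INDEX = build_inverted_index(DOCS)
CANDIDATES = boolean_and_of_ors(INDEX, [{"shipment", "delivery"}])
FINDINGS = review(DOCS, CANDIDATES, RULES)
print(sorted(CANDIDATES))
print(FINDINGS)
```

Stage 1 is deliberately broad (it also returns d2, which is not evidence), and stage 2 applies the more expensive, more precise check to that smaller set. This mirrors the division of labor the slide proposes between Boolean or statistical retrieval and NLP-based review.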