An Evaluation Tool for Natural Language Processing Systems
Audrey N. Mbeje
Department of Computer Science
Ball State University
November 09, 2000
Contents
I. Introduction
   Problem Description
   Significance of the Study
II. Definition of Terms
   Computational Linguistics
   Context
III. Literature Review
IV. Methodology
V. Anticipated Results
VI. Time Schedule
VII. Deliverables
VIII. Future Research & Conclusion
Problem Description
Problem Background: Human interactive discourse poses many challenges for natural language processing (NLP) systems. One of the main challenges is representing the speaker’s intended meaning in its context. Thus the focus of current NLP research has been to develop technology that enables the computer to understand news events in the context in which they occur in the real world. The evolving technology, however, is linguistically oriented and pays less attention to the quality of the software. Additionally, it does not reflect uniform principles of software evaluation.
Goal: The goal of the proposed study is to improve the quality of natural language processing technology by assessing proposed NLP systems for linguistic and technical quality before they are implemented. We propose an NLP system evaluation tool that provides both linguistic and software quality assurance. The proposed study is based on the assumption that progress in developing NLP technology depends on using evaluation methods that better model both speakers’ natural discourse and software quality.
Significance of the Study
The study will benefit the theory of natural language processing, particularly the research area concerned with context in NLP systems. The study proposes integrating linguistic principles and software design principles in NLP system evaluation, which would be a contribution to current progress in NLP technology. The proposed tool will improve NLP system usability by offering quality assurance for the reliability and validity of the software, both technically and linguistically.
Definition of Terms
1. Computational Linguistics:
-A discipline between linguistics and computer science that is concerned with the computational aspects of the human language faculty.
-Belongs to the cognitive sciences, specifically artificial intelligence (AI).
-Has two components: applied and theoretical.
Definition of Terms (cont’d)
-With the applied component, the interest is in the practical outcome of modeling human language use. The goal is to create software products that have some knowledge of human language.
-The theoretical component deals with formal theories about the linguistic knowledge that a human needs for generating and understanding language.
(The proposed evaluation tool is intended for the applied component of CL.)
Definition of Terms (cont’d)
2. Context:
-A rough definition of the term:
-We say that an utterance x presupposes a fact y if uttering x only makes sense when the context (e.g., world knowledge or an earlier utterance in the same conversation) provides enough information to conclude that y is the case. Consider example 2a:
2a. Mary’s husband is out of town.
The noun phrase “Mary’s husband” presupposes that Mary is married. Computational linguists are concerned with making NLP systems understand such contextual information.
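The definition of presupposition above can be illustrated with a toy sketch. It is not part of the proposed tool: the trigger table, the function names, and the choice to model context as a plain set of fact strings are all assumptions made for illustration, and real presupposition handling is far richer than a lookup on possessive nouns.

```python
import re

# Hypothetical trigger table: possessive nouns and the fact each one presupposes.
TRIGGERS = {
    "husband": "{owner} is married",
    "wife": "{owner} is married",
}

# Matches possessive noun phrases such as "Mary's husband" (straight or curly apostrophe).
POSSESSIVE = re.compile(r"\b([A-Z][a-z]+)['’]s\s+([a-z]+)")


def presuppositions(utterance):
    """Return the facts presupposed by possessive noun phrases in the utterance."""
    facts = []
    for owner, noun in POSSESSIVE.findall(utterance):
        template = TRIGGERS.get(noun)
        if template is not None:
            facts.append(template.format(owner=owner))
    return facts


def makes_sense(utterance, context):
    """True if the context (a set of known facts) supplies every presupposed fact."""
    return all(fact in context for fact in presuppositions(utterance))


print(presuppositions("Mary’s husband is out of town."))  # ['Mary is married']
```

With this sketch, `makes_sense("Mary’s husband is out of town.", {"Mary is married"})` holds, while the same utterance against an empty context does not.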
Literature Review
Much research on the problem of in-depth story understanding by computer was performed starting in the 1970s. In the 1990s the interest shifted towards information extraction and word sense disambiguation. The end of the 1990s marked another shift in focus, back to in-depth story understanding by the computer.
McCarthy (1990) discusses the problem of getting the computer to understand the following text from the New York Times: A 61-year old furniture salesman was pushed down the shaft of a freight elevator yesterday in his downtown Brooklyn store by two robbers while a third attempted to crush him with the elevator car because they were dissatisfied with the $1,200 they had forced him to give them. The buffer springs at the bottom of the shaft prevented the car from crushing the salesman John J. Hug, after he was pushed from the first floor to the basement.
The car stopped about 12 inches above him as he flattened himself at the bottom of the pit. (Mueller, 1999)
McCarthy’s concern went beyond mere word sense disambiguation and information extraction. He suggested that the computer should be able to answer such contextual questions as: Who was in the store when the events began? Who had the money at the end? What would have happened if Mr. Hug had not flattened himself at the bottom of the pit?
Literature Review (cont’d)
Current research on contextual understanding is concerned with problems such as the one stated above. Several NLP systems have been suggested whose orientation is mainly linguistic. This study proposes an evaluation tool for such NLP systems that integrates linguistic principles with technical ones, namely speed.
Methodology
-Create an algorithm simulating aspects of the human language faculty, namely speed and the ability to decode contextual discourse.
-Evaluate the NLP systems for context decoding and speed using existing evaluation technology.
Methodology (cont’d)
-Do the same test using the proposed tool.
-Compare the results.
Note: The proposed evaluation tool will itself be evaluated for validity and reliability before its implementation, using an outside researcher’s evaluation tool.
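The methodology steps can be sketched as a small harness that scores one NLP system on context decoding and speed, and then runs the same cases through more than one evaluator for comparison. This is a minimal sketch under stated assumptions, not the proposed tool: the function names, the `(text, question) -> answer` interface, the exact-match scoring, and the toy system are all hypothetical.

```python
import time


def evaluate(system, cases):
    """Score a system on contextual questions and measure its mean answer time.

    system: callable taking (text, question) and returning an answer string
    cases:  list of (text, question, expected_answer) tuples
    """
    if not cases:
        raise ValueError("at least one test case is required")
    correct = 0
    seconds = 0.0
    for text, question, expected in cases:
        start = time.perf_counter()
        answer = system(text, question)
        seconds += time.perf_counter() - start  # time only the system call
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    return {
        "context_accuracy": correct / len(cases),
        "mean_seconds": seconds / len(cases),
    }


def compare(system, cases, evaluators):
    """Run the same system and cases through several evaluators, keyed by name."""
    return {name: run(system, cases) for name, run in evaluators.items()}


def toy_system(text, question):
    """Stand-in NLP system: infers marriage from the word 'husband'."""
    return "yes" if "husband" in text else "unknown"


cases = [("Mary’s husband is out of town.", "Is Mary married?", "yes")]
report = compare(toy_system, cases, {"sketch": evaluate})
print(report["sketch"]["context_accuracy"])  # 1.0
```

In a real comparison, the `evaluators` mapping would hold the existing evaluation technology alongside the proposed tool, so that both are applied to identical cases.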
Anticipated Results
The proposed tool should effectively evaluate NLP systems for context and speed.
Time Schedule
August - November: Proposal Writing & Presentation
November - December: Proposal Review
January - March: Literature Review
April - July: Data Gathering, Evaluation Tool Design, Evaluation Tool Testing
August - November: Thesis Writing & Defense
Deliverables
1. Natural Language Processing Evaluation Tool
2. Research Presentation at a Conference
3. Research Publication
Conclusion and Future Research
Computing the context of natural language discourse is an essential task for a natural language processing system. The proposed evaluation tool for NLP systems will have the potential to be modified to incorporate new design principles for improved usability.
The End