Natural Language in AI.


Natural Language in AI

Outline
- Text-based natural language
- Dialogue-based natural language

Methods in Natural Language Processing
Methods in NLP address two categories of tasks:
- NL generation
- NL understanding

Natural Language problems
- dialogue-based
  - NL interfaces
  - spoken and written communication
  - uses natural language understanding
  - discourse (any string more than one sentence long)
- text-based
  - text categorization, text generation, information extraction, machine translation

Text-Based Natural Language

Text-based NL problems
- story/text understanding
- information extraction: extracting information from text
- translating documents, manuals, communications
- drafting documents
- summarizing texts
- text generation, categorization or clustering, text DB retrieval, text mining, topic identification

Text-based Natural Language Topics
- Information extraction
- Machine translation
- Drafting
- Text summarization

Information Extraction
- Extracting specific types of information from large volumes of unrestricted text
- The IE system must be given domain guidelines that specify what to find and what to extract
- It looks for the portions of the text that might contain the relevant information
- IE systems are not required to understand the source text completely

Types of IE
- Knowledge-based Information Extraction
- Machine learning IE
- Template-based, Wrappers
- Template Mining

Types of IE
- Knowledge-based Information Extraction: use of linguistic patterns to support the interpretation of input texts
- Machine learning IE: an inductive learning mechanism automatically constructs a knowledge base of patterns

Types of IE
- Template-based, Wrappers
  - IE’s output is a populated database, which can be used as a case base
  - The values for the slots are strings from the source text
  - The resulting database works as a template
- Template Mining
  - well suited for areas “where the text is terse and sentences are unambiguous and declarative in nature”
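
The template idea above (slots filled with strings taken directly from the source text, no parsing) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slot names, the regular expressions, and the weather-bulletin domain are hypothetical and do not come from any system named in these slides.

```python
import re

# Hypothetical template for a terse weather bulletin. The slot names and
# patterns are illustrative assumptions, not taken from a real IE system.
TEMPLATE = {
    "city": re.compile(r"\bin ([A-Z][a-z]+)"),
    "high": re.compile(r"\bhigh of (\d+)"),
    "low": re.compile(r"\blow of (\d+)"),
}

def fill_template(text):
    """Return one record; every slot value is a substring of the source text."""
    record = {}
    for slot, pattern in TEMPLATE.items():
        match = pattern.search(text)
        # A slot stays empty (None) when its pattern finds nothing.
        record[slot] = match.group(1) if match else None
    return record
```

Calling fill_template on one bulletin per document yields rows of a populated database. The approach only holds up when the text is terse and declarative, which is the condition the slide gives for template mining.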

Relation between IE and NLP
- Using linguistic patterns:
  - knowledge-based (represents patterns)
  - inductive learning based (learns patterns)
  - template mining (skips parsing)
- NLP is needed whenever disambiguation is required, for example when negation or word order makes a difference in meaning

Examples of applications of IE

References of IE
- Gaizauskas, R. and Wilks, Y. (1998). Information Extraction: Beyond Document Retrieval. Computational Linguistics and Chinese Language Processing, 3(2), 17-60.
- Riloff, E. and Lehnert, W. (1994). Information Extraction as a Basis for High-Precision Text Classification. ACM Transactions on Information Systems, 12(3), 296-333.
- Lehnert, W., McCarthy, J., Soderland, S., Riloff, E., Cardie, C., Peterson, J., Feng, F., Dolan, C., and Goldman, S. (1993). UMASS/HUGHES: Description of the CIRCUS System Used for MUC-5. Proceedings of the Fifth Message Understanding Conference, pp. 277-291. San Mateo, CA: Morgan Kaufmann.
- Soderland, S. and Lehnert, W. (1994). Wrap-Up: A Trainable Discourse Module for Information Extraction. Journal of Artificial Intelligence Research, 2, 131-168.
- Natural Language Processing Laboratory, Online Information Extraction Bibliography: http://www-nlp.cs.umass.edu/ciir-pubs/tepubs.html

Text-based Natural Language Topics
- Information extraction
- Machine translation
- Drafting
- Text summarization

Can you translate this sentence?
Ever since computers were invented, it has been natural to wonder whether they might be able to learn.
By Tom Mitchell

Describe the steps you used to translate the sentence

List the words you used in the translated sentence and match each one to its counterpart in the source sentence

Ever since computers were invented it has been natural to wonder whether they might be able to learn.
Desde que computadores foram inventados tem sido natural imaginar que eles sejam capazes de aprender.

Online translators
- http://babelfish.altavista.com/babelfish/tr
- http://world.altavista.com/tr
- http://www.systransoft.com/
What’s wrong with them?

Can you translate this sentence?
…cursing my head for things that I've said till I finally died, which started the whole world living…

What works?
The KANT project: Knowledge-based, Accurate Translation for technical documentation
- founded in 1989
- large-scale, practical translation systems for technical documentation
- KANT project homepage: http://www.lti.cs.cmu.edu/Research/Kant/

KANT
- uses a controlled vocabulary and grammar for each language
- uses explicit yet focused semantic models for each technical domain
- achieves very high accuracy in translation
- supports multilingual document production
- has been applied to the domains of electric power utility management and heavy equipment technical documentation

Machine Translation
- Unrestricted MT is still inadequate. Will that ever change?
- Why should MT aim to outperform human translation?
- An alternative is to have humans edit the original document into a subset of the original language (a canonical form)
- Cost of MT
  - lexicons of 20,000-100,000 words
  - grammars with 100 to 10,000 rules

Text-based Natural Language Topics
- Information extraction
- Machine translation
- Drafting
- Text summarization

Drafting
- applications in the legal domain
  - drafting of wills
  - petitions for restraining orders
- use of rhetorical structure

Example Rhetorical Structure

Text-based Natural Language Topics
- Information extraction
- Machine translation
- Drafting
- Text summarization

Summarize text

Describe the steps you used to summarize text

Text summarization applications
- Generate a summary of many documents
- Generate a summary of one document only
- Headline generation

Text summarization
- The traditional idea of summarization is to extract sentences and concatenate them.
- Human beings produce summaries of documents by creating new sentences that capture the most salient pieces of information in the original document and that are grammatical, that cohere with one another, and [...].
- Given that large collections of text/abstract pairs are available online, it is now possible to envision algorithms that are trained to mimic this process.
From Knight, K. and Marcu, D. 2000.

Text summarization steps
- Identify most relevant segments
- Apply rules for deleting redundant parts
- Compress/aggregate long sentences
- Assess coherence of segments
- Revise
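
The first step, identifying the most relevant segments, is the core of the traditional extract-and-concatenate approach. A minimal Python sketch follows. It assumes a simple relevance measure (average word frequency within the document) and a tiny stopword list. Both are illustrative choices, not the trained method described by Knight and Marcu, and the later steps (redundancy removal, compression, coherence, revision) are not covered.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that", "are"}

def content_words(text):
    """Lowercased words of the text, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def summarize(text, k=1):
    """Pick the k highest-scoring sentences and return them in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average document frequency of the sentence's content words.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:k])  # restore document order before concatenating
    return " ".join(sentences[i] for i in chosen)
```

Averaging rather than summing the frequencies keeps long sentences from winning on length alone.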

Example

Dialogue-based natural language

Dialogue-based natural language
- NL Understanding
  - Speech recognition: intonation, pronunciation, speed
  - Natural Language Processing: syntactic, semantic, pragmatic analysis
- Natural Language Generation: intention, generation, speech synthesis

Speech recognition
- the analog signal from the voice is digitized
- identify the phonemes produced
- template matching attempts to match phonemes from a library of sounds with the sounds produced
- the outcome is a list of phonemes and probabilities
- find the words using hidden Markov modeling
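
Hidden Markov decoding is usually done with the Viterbi algorithm, which finds the most probable sequence of hidden states for a sequence of observations. The Python sketch below uses a toy model in which the hidden states are two phonemes and the observations are coarse acoustic labels. Every state name, label, and probability is an assumption made up for illustration, not data from a real recognizer.

```python
# Toy model: all names and numbers are illustrative assumptions.
STATES = ("AY", "S")
START_P = {"AY": 0.6, "S": 0.4}
TRANS_P = {"AY": {"AY": 0.5, "S": 0.5}, "S": {"AY": 0.3, "S": 0.7}}
EMIT_P = {"AY": {"vowel": 0.9, "hiss": 0.1}, "S": {"vowel": 0.2, "hiss": 0.8}}

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Start from the most probable final state and follow the back-pointers.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path
```

For the observations vowel, hiss, hiss this toy model decodes to AY, S, S. A real recognizer works with log probabilities to avoid underflow and with far larger state spaces.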

How to recognize speech / How to wreck a nice beach
Ice cream / I scream
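
These pairs sound alike because one phoneme sequence can be split into words in more than one way. The Python sketch below enumerates every split of a phoneme sequence against a small pronunciation lexicon. The lexicon is a hypothetical four-word example written in ARPAbet-style symbols. A real recognizer would also score each candidate with a language model to choose among them.

```python
# Hypothetical pronunciation lexicon (ARPAbet-style symbols), for illustration only.
LEXICON = {
    ("AY",): "I",
    ("AY", "S"): "ice",
    ("K", "R", "IY", "M"): "cream",
    ("S", "K", "R", "IY", "M"): "scream",
}

def segmentations(phonemes):
    """Return every way to split the phoneme sequence into lexicon words."""
    if not phonemes:
        return [[]]  # one way to segment nothing: the empty word list
    results = []
    for end in range(1, len(phonemes) + 1):
        word = LEXICON.get(tuple(phonemes[:end]))
        if word is not None:
            # Prefix is a word; segment the remainder recursively.
            for rest in segmentations(phonemes[end:]):
                results.append([word] + rest)
    return results
```

With the input AY S K R IY M the function returns both readings, "I scream" and "ice cream", which is exactly the ambiguity the slide points at.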

Speech Recognition Methods
- speech recognition can also be implemented with an inductive method such as neural networks
- individual and continuous recognizers
- a controlled vocabulary can increase the chances of success, e.g., Jupiter
- limiting recognition to one speaker helps; when multiple speakers are needed, retraining is often necessary
- speech understanding includes speech recognition and understanding of the recognized utterance

Natural Language Understanding
- Syntactic Analysis
- Parsing
- Semantics
- Pragmatics

Syntactic analysis
- a parser recovers the phrase structure of an utterance, given a grammar (rules of syntax)
- the parser’s outcome is the structure (groups of words and their respective parts of speech)
- phrase structure is represented in a parse tree
- parsing is the first step towards determining the meaning of an utterance

Parsing
- Parsing: method to analyze a sentence to determine its structure according to the grammar
- Grammar: formal specification of the structures allowable in the language

Examples of Symbols in a Grammar
- (S) sentence
- (NP) noun phrase
- (VP) verb phrase
- (PP) prepositional phrase
- (RelClause) relative clause
- (Det) determiner
Determiner (grammar): a word belonging to a group of noun modifiers, which include articles, demonstratives, possessive adjectives, and words such as any, both, or whose, and occupying the first position in a noun phrase or the second or third position after another determiner.

Grammar rules
- S -> NP VP
- S -> VP VP
- S -> VP PP
- S -> NP VP VP
- NP -> Det Adjective N
- NP -> Adjective N
- NP -> Noun
- NP -> Det Noun
- VP -> V Adjective
- VP -> V S
- VP -> V NP
- VP -> V PP
- PP -> P Noun
Dictionary entries:
- V -> ate
- NAME -> John
- Det(art) -> the
- N -> cat
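
To show how rules like these drive a parser, here is a minimal top-down parser in Python. It uses only a small subset of the grammar and the dictionary entries listed above. The rules NP -> NAME and VP -> V are assumptions added so that short sentences parse, and the tuple-based tree format is an illustrative choice. The sketch relies on the grammar having no left-recursive rules, since a top-down parser would loop on them.

```python
# Subset of the slide's grammar. NP -> NAME and VP -> V are added assumptions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["NAME"]],
    "VP": [["V", "NP"], ["V"]],
}
# Dictionary entries from the slide, stored as word -> category.
LEXICON = {"the": "Det", "cat": "N", "ate": "V", "John": "NAME"}

def parse(symbol, words, start):
    """Yield (tree, next_position) for each way symbol can match from start."""
    if symbol not in GRAMMAR:  # lexical category: must match one word
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(rhs, words, start, (symbol,))

def expand(rhs, words, position, partial):
    """Match the remaining right-hand-side symbols, extending the partial tree."""
    if not rhs:
        yield partial, position
        return
    for subtree, next_position in parse(rhs[0], words, position):
        yield from expand(rhs[1:], words, next_position, partial + (subtree,))

def parse_sentence(sentence):
    """Return all parse trees that cover the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]
```

parse_sentence returns a list because a sentence may have several structures. An empty list means the grammar does not accept the sentence.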

Parsing Tree
S
  NP
    Article: The
    Noun: terrain
  VP
    Verb: is
    Adjective: insurmountable

- the outcome of the syntactic analysis can still be a series of alternate structures with respective probabilities
- sometimes grammar rules can disambiguate a sentence, e.g., “John set the set of chairs”
- sometimes they can’t
- …the next step is semantic analysis

Semantic analysis
- semantics provide a partial representation for meaning
- represents the sentence in meaningful parts
- uses possible syntactic structures and meaning
- builds a parse tree with associated semantics
- semantics typically represented with logic

Compositional semantics
- The semantics of a phrase is a function of the semantics of its sub-phrases
- It does not depend on any other phrase
- So, if we know the meaning of sub-phrases, then we know the meaning of the phrases
“A goal of semantic interpretation is to find a way that the meaning of the whole sentence can be put together in a simple way from the meanings of the parts of the sentence.” (Alison, 1997 p. 112)
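
Compositionality can be made concrete with a small Python sketch in which the meaning of each tree node is computed only from the meanings of its children. The word meanings, the logical-form strings, and the tuple tree format are assumptions chosen for illustration. Verbs are modelled as functions: a transitive verb takes an object and then a subject, an intransitive verb takes only a subject.

```python
# Hypothetical word meanings. Verbs are functions awaiting their arguments.
LEXICAL_MEANING = {
    "John": "john",
    "Mary": "mary",
    "loves": lambda obj: lambda subj: f"loves({subj}, {obj})",  # transitive
    "jumps": lambda subj: f"jumps({subj})",                     # intransitive
}

def meaning(tree):
    """Compute a node's meaning from the meanings of its children only."""
    if isinstance(tree, str):  # a word: look its meaning up
        return LEXICAL_MEANING[tree]
    label, *children = tree
    values = [meaning(child) for child in children]
    if label == "S":  # S -> NP VP: apply the predicate to the subject
        subject, predicate = values
        return predicate(subject)
    if label == "VP" and len(values) == 2:  # VP -> V NP: apply the verb to its object
        verb, obj = values
        return verb(obj)
    return values[0]  # unary node: pass the child's meaning up
```

For the tree of "John loves Mary" the result is the string loves(john, mary), built without looking at anything outside each sub-phrase.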

Semantic analysis
- the transitivity of a verb adds to the meaning in a parse tree (e.g., jump is intransitive, love is transitive)
- “John died Mary”: is there a period missing, or is it “John dyed Mary”?

Pragmatic analysis
- uses context
- uses partial representation
- includes purpose and performs disambiguation
- considers where, when, and by whom an utterance was said

Example using Ontology
- Fred saw the plane flying over Zurich.
- Fred saw the mountains flying over Zurich.
Traditional NL systems will have difficulty resolving this syntactic ambiguity, but because CYC knows that planes fly and mountains do not, it will be able to parse these sentences just as easily as a human. It's difficult to see how this could be done without relying on a large database of common sense.
http://www.cyc.com/products2.html

Example using Ontology
Because it includes context, it can interpret a sentence in light of the sentence that follows it:
- The man saw the plane flying over Zurich. It was dark, when he looked up to the sky again the plane was gone.
Another interpretation would be given if the following sentence was:
- The man saw the plane flying over Zurich. He also saw the building where the plane crashed.

Pronoun disambiguation using Ontology
- The police arrested the demonstrators because they feared violence.
- The police arrested the demonstrators because they advocated violence.
- Mary saw the coat in the store window and wanted it.
- Mary saw the coat in the store window and pressed her nose up against it.

Communication and Planning
- Deciding what to say relates to planning
- Understanding relates to plan recognition
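
The common idea in these examples is that background knowledge filters the candidate readings. The Python sketch below is a deliberately tiny stand-in for a common-sense knowledge base: the facts and the string-matching lookup are assumptions for illustration and say nothing about how CYC actually represents or uses knowledge.

```python
# Toy common-sense facts: (kind of thing, what it plausibly does).
# These entries are illustrative assumptions, not taken from CYC.
PLAUSIBLE = {
    ("police", "fear violence"),
    ("demonstrators", "advocate violence"),
    ("plane", "fly"),
}

def resolve(candidates, predicate):
    """Keep the candidate referents for which the predicate is a known fact."""
    return [c for c in candidates if (c, predicate) in PLAUSIBLE]
```

For "they feared violence" the candidates police and demonstrators reduce to police, and for "flying over Zurich" the candidates plane and mountains reduce to plane. An empty or multi-element result means the toy knowledge base cannot decide.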

Current state of NLP
- logic-based NLP is less accurate
- statistical natural language processing increases accuracy to around 98%
- this is still not good enough: given the average length of a sentence in a newspaper, this accuracy can result in one error per sentence
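
The arithmetic behind this claim can be checked with a short Python sketch. It assumes that the 98% figure is a per-word accuracy and that errors on different words are independent. Neither assumption is stated on the slide, and the sentence lengths used below are illustrative.

```python
def expected_errors(per_word_accuracy, sentence_length):
    """Expected number of wrong words in a sentence of the given length."""
    return sentence_length * (1.0 - per_word_accuracy)

def prob_sentence_has_error(per_word_accuracy, sentence_length):
    """Probability that at least one word is wrong, assuming independent errors."""
    return 1.0 - per_word_accuracy ** sentence_length
```

Under these assumptions, one expected error per sentence at 98% accuracy corresponds to sentences of about 50 words. For a 25-word sentence the expected count is 0.5 errors, and the probability of at least one error is a little under 40%.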

Natural Language Generation
Processes in NL communication
Communication involves three steps by the speaker:
- the intention to convey an idea (what to say)
- the mental generation of words (how to say)
- their synthesis (say it)

what to say
- text planning
- result of reasoning (e.g., retrieval)
- utterances that achieve a goal, may include ordering
- a confirmation or thanks (Jupiter sounds a beep)
- question motivated by need of confirmation
- question motivated by need of missing information

how to say
- how to convert a semantic representation into a sentence
- grammatically correct
- proper choice of words
- in limited problem types, templates are helpful
- e.g., JUPITER
  - says “I have no knowledge of that”
  - starts sentences with: In (city) (day of the week), chances…
  - finishes sentences with: “Is there something else?” or “Can I help you with something else?”
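
Template-based generation of this kind fits in a few lines of Python. The opening "In (city) (day of the week), chances…" and the fixed phrases come from the slide. The rest of the template wording and the field names of the input frame are assumptions for illustration, not JUPITER's actual templates.

```python
# The continuation after "chances" is a hypothetical wording.
BODY_TEMPLATE = "In {city} {day}, chances of {event} are {chance} percent."
CLOSING = "Can I help you with something else?"
NO_ANSWER = "I have no knowledge of that."

def realize(frame):
    """Turn a semantic frame (a dict, or None when nothing was found) into text."""
    if frame is None:
        return NO_ANSWER
    return BODY_TEMPLATE.format(**frame) + " " + CLOSING
```

Templates guarantee grammatical output as long as the slots are filled with the right kind of value, which is why they suit limited problem types.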

say it!
- speech synthesis: from words into speech signal
- applications of neural networks
- templates with recordings from humans
  - record every word in a dictionary
  - record every phoneme (worst choice!)
- JUPITER uses a commercial speech synthesizer

Example
Nitrogen is a prototype natural language generation system that combines symbolic rules with linguistic information gathered statistically from large online text corpora.
- http://www.isi.edu/natural-language/mt/nitrogen/
- http://www.mri.mq.edu.au/~peba/MLPeba/system.html
- http://cslu.cse.ogi.edu/HLTsurvey/ch4node3.html#SECTION4

JUPITER
1-888-573-8255
http://www.sls.lcs.mit.edu/sls/whatwedo/applications/jupiter.html
"What will the weather be like in Boston tomorrow?"
Jupiter invokes the following procedure:
- Speech recognition: SUMMIT converts the spoken sentence into text
- Language understanding: TINA parses the text into a semantic frame, a grammatical structure containing the basic terms needed to query the Jupiter database
- Language generation: GENESIS uses the semantic frame's basic terms to build a Structured Query Language (SQL) query for the database
- Information retrieval: Jupiter executes the SQL query and retrieves the requested information from the database
- Language generation: TINA and GENESIS convert the query result into a natural language sentence
- Information delivery: Jupiter delivers the generated sentence to the user via voice (using a speech synthesizer) and/or display
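
The middle of this pipeline (semantic frame to SQL query, retrieval, and sentence generation) can be sketched in Python with the standard sqlite3 module. This is only an illustration of the data flow: the frame fields, the table and column names, and the answer wording are all hypothetical, and the code does not reproduce SUMMIT, TINA, GENESIS, or the real Jupiter database.

```python
import sqlite3

def frame_to_sql(frame):
    """Build a parameterized SQL query from a semantic frame (hypothetical schema)."""
    sql = "SELECT forecast FROM weather WHERE city = ? AND day = ?"
    return sql, (frame["city"], frame["day"])

def answer(frame, connection):
    """Run the query and turn the result into a sentence."""
    sql, params = frame_to_sql(frame)
    row = connection.execute(sql, params).fetchone()
    if row is None:
        return "I have no knowledge of that."
    return f"In {frame['city']} {frame['day']}, expect {row[0]}."

def demo_connection():
    """In-memory database with one made-up forecast, for demonstration only."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE weather (city TEXT, day TEXT, forecast TEXT)")
    connection.execute("INSERT INTO weather VALUES ('Boston', 'tomorrow', 'light rain')")
    return connection
```

The frame for the example question would be {"city": "Boston", "day": "tomorrow"}. Passing the values as query parameters rather than pasting them into the SQL string keeps user-supplied words from altering the query.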