Natural Language in AI.

Presentation on theme: "Natural Language in AI."— Presentation transcript:

1 Natural Language in AI

2 Outline
- Text-based natural language
- Dialogue-based natural language

3 Methods in Natural Language Processing
Methods in NLP address two categories of tasks: NL generation and NL understanding.

4 Natural Language problems
Dialogue-based: NL interfaces; spoken and written communication; uses natural language understanding; discourse (any string more than one sentence long).
Text-based: text categorization, text generation, information extraction, machine translation.

5 Text-Based Natural Language

6 Text-based NL problems
story/text understanding; information extraction: extracting information from text; translating documents, manuals, communications; drafting documents; summarizing texts; text generation, categorization or clustering, text DB retrieval, text mining, topic identification;

7 Text-based Natural Language Topics
Information extraction Machine translation Drafting Text summarization

8 Information Extraction
Information extraction (IE) means extracting specific types of information from large volumes of unrestricted text. An IE system must be given domain guidelines that specify what to find and what to extract; it then searches for the portions of text that might contain the relevant information. IE systems are not required to understand the source text completely.

9 Types of IE
- Knowledge-based information extraction
- Machine learning IE
- Template-based IE, wrappers
- Template mining

10 Knowledge-based Information Extraction
Knowledge-based information extraction: uses linguistic patterns to support the interpretation of input texts.
Machine learning IE: uses an inductive learning mechanism to automatically construct a knowledge base of patterns.

11 Template-based, Wrappers
Template-based IE, wrappers: the output of IE is a populated database, which can be used as a case base. The values for the slots are strings from the source text, and the resulting database works as a template.
Template mining: well suited for areas “where the text is terse and sentences are unambiguous and declarative in nature”.
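The slot-filling idea can be sketched in a few lines. This is a minimal illustration, not any particular IE system: the slot names, the regular expressions, and the sample sentence are all assumptions made up for the example.

```python
import re

# Hypothetical template: each slot name maps to a pattern whose first
# group captures the string that fills the slot.
TEMPLATE = {
    "company": re.compile(r"\b([A-Z][A-Za-z]+ (?:Inc|Corp|Ltd))\b"),
    "amount": re.compile(r"\$(\d+(?:\.\d+)?) (?:million|billion)"),
    "year": re.compile(r"\b((?:19|20)\d{2})\b"),
}


def fill_template(text):
    """Return one record whose slot values are strings copied from the text."""
    record = {}
    for slot, pattern in TEMPLATE.items():
        match = pattern.search(text)
        record[slot] = match.group(1) if match else None
    return record


text = "Acme Corp reported revenue of $12.5 million in 1998."
record = fill_template(text)
print(record)
```

Note that nothing here parses the sentence: the patterns only work because the sample text is terse and declarative, which is the situation the slide describes.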

12 Relation between IE and NLP
IE approaches differ in how they use linguistic patterns:
- knowledge-based (represents patterns)
- inductive-learning-based (learns patterns)
- template mining (skips parsing)
NLP is needed whenever negation must be disambiguated or word order makes a difference in meaning.

13 Examples of applications of IE


15 Text-based Natural Language Topics
Information extraction Machine translation Drafting Text summarization

16 Can you translate this sentence?
Ever since computers were invented, it has been natural to wonder whether they might be able to learn. By Tom Mitchell

17 Describe the steps you used to translate the sentence

18 List the words you used in the translated sentence and associate to the ones in the source sentence

19 Ever since computers were invented it has been natural to wonder whether they might be able to learn.
Portuguese translation: Desde que computadores foram inventados tem sido natural imaginar que eles sejam capazes de aprender.

20 Online translators What’s wrong with them?

21 Can you translate this sentence?
…cursing my head for things that I've said till I finally died, which started the whole world living…

22 What works? The KANT project
KANT stands for Knowledge-based, Accurate Translation for technical documentation. Founded in 1989, the project builds large-scale, practical translation systems for technical documentation. Kant project homepage:

23 KANT uses a controlled vocabulary and grammar for each language
- explicit yet focused semantic models for each technical domain
- achieves very high accuracy in translation
- multilingual document production
- has been applied to the domains of electric power utility management and heavy equipment technical documentation

24 Machine Translation
Unrestricted MT is still inadequate. Will it ever change? Why would MT target outperforming human translation?
An alternative is using humans to edit the original document into a subset of the original language (canonical form).
Cost of MT: lexicons of 20, ,000 words; grammars with 100 to 10,000 rules.

25 Text-based Natural Language Topics
Information extraction Machine translation Drafting Text summarization

26 Drafting applications in the legal domain
- drafting of wills
- petitions for restraining orders
- use of rhetorical structure

27 Example Rhetorical Structure

28 Text-based Natural Language Topics
Information extraction Machine translation Drafting Text summarization

29 Summarize text

30 Describe the steps you used to summarize text

31 Text summarization applications
Generate a summary of many documents; Generate a summary of one document only; Headline generation;

32 Text summarization
The traditional idea of summarization is to extract sentences and concatenate them. Human beings, by contrast, produce summaries of documents by creating new sentences that capture the most salient pieces of information in the original document, that are grammatical, and that cohere with one another. Given that large collections of text/abstract pairs are available online, it is now possible to envision algorithms that are trained to mimic this process. From Knight, K. and Marcu, D.

33 Text summarization steps
Identify most relevant segments; Apply rules for deleting redundant parts; Compress/aggregate long sentences; Assess coherence of segments; Revise.
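The first two steps (pick relevant segments, drop redundant ones) can be sketched as a small extractive summarizer. This is only an illustration under stated assumptions: relevance is approximated by average word frequency, redundancy by word overlap, and the stopword list and thresholds are invented. Compression, coherence assessment, and revision are not attempted.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "it", "that"}


def content_words(sentence):
    """Lowercased words of a sentence, without stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2, overlap_limit=0.6):
    """Pick the highest-scoring sentences, skipping near-duplicates."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        ws = content_words(sentence)
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    # Step 1: rank segments by relevance (average word frequency).
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)

    # Step 2: skip a segment if most of its words are already covered.
    chosen = []
    for i in ranked:
        candidate = set(content_words(sentences[i]))
        redundant = any(
            len(candidate & set(content_words(sentences[j]))) / max(len(candidate), 1)
            > overlap_limit
            for j in chosen
        )
        if not redundant:
            chosen.append(i)
        if len(chosen) == max_sentences:
            break

    # Emit the chosen sentences in their original order.
    return " ".join(sentences[i] for i in sorted(chosen))


demo = "Parsing recovers structure. Parsing recovers structure. Semantics assigns meaning."
print(summarize(demo))
```

The output is a concatenation of extracted sentences, which is exactly the traditional approach; producing new sentences, as humans do, needs much more than this.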

34 Example

35 Dialogue-based natural language

36 Dialogue-based natural language
NL Understanding
- Speech recognition: intonation, pronunciation, speed
- Natural Language Processing: syntactic, semantic, pragmatic analysis
Natural Language Generation
- intention, generation, speech synthesis

37 Speech recognition
- the analog signal from the voice is digitized
- the phonemes produced are identified
- template matching attempts to match phonemes from a library of sounds with the sounds produced
- the outcome is a list of phonemes and probabilities
- the words are found using hidden Markov modeling
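To make the hidden Markov step concrete, here is a minimal sketch of Viterbi decoding, the standard way to find the most probable hidden sequence in an HMM. The two phoneme labels, the two observation labels ("vowel", "hiss"), and every probability are invented for the example; a real recognizer works over acoustic features and far larger models.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state sequence, its probability)."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that precedes s on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs], prev)
                for prev in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Toy model: all numbers are assumptions chosen for illustration.
states = ("AY", "S")
start_p = {"AY": 0.6, "S": 0.4}
trans_p = {"AY": {"AY": 0.3, "S": 0.7}, "S": {"AY": 0.6, "S": 0.4}}
emit_p = {"AY": {"vowel": 0.9, "hiss": 0.1}, "S": {"vowel": 0.2, "hiss": 0.8}}

path, prob = viterbi(["vowel", "hiss", "vowel"], states, start_p, trans_p, emit_p)
print(path, prob)
```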

38 How to recognize speech
“How to recognize speech” can be heard as “How to wreck a nice beach”; “Ice cream” can be heard as “I scream”.

39 Speech Recognition Methods
- speech recognition can also be implemented with an inductive method such as neural networks
- individual and continuous recognizers
- a controlled vocabulary can increase the chances of success (e.g., Jupiter)
- recognizers are often limited to one speaker; when multiple speakers are needed, retraining may often be necessary
- speech understanding includes speech recognition and understanding of the recognized utterance

40 Natural Language Understanding
- Syntactic Analysis - Parsing - Semantics - Pragmatics

41 Syntactic analysis
- a parser recovers the phrase structure of an utterance, given a grammar (rules of syntax)
- the parser’s outcome is the structure (groups of words and their respective parts of speech)
- phrase structure is represented in a parse tree
- parsing is the first step towards determining the meaning of an utterance

42 Parsing
Parsing: a method to analyze a sentence to determine its structure according to the grammar.
Grammar: a formal specification of the structures allowable in the language.

43 Examples of Symbols in a Grammar
(S) sentence; (NP) noun phrase; (VP) verb phrase; (PP) prepositional phrase; (RelClause) relative clause; (Det) determiner.
Determiner (grammar): a word belonging to a group of noun modifiers, which include articles, demonstratives, possessive adjectives, and words such as any, both, or whose, and occupying the first position in a noun phrase or the second or third position after another determiner.

44 Grammar rules
S → NP VP
S → VP VP
S → VP PP
S → NP VP VP
NP → Det Adjective N
NP → Adjective N
NP → Noun
NP → Det Noun
VP → V Adjective
VP → V S
VP → V NP
VP → V PP
PP → P Noun
Dictionary entries:
V → ate
NAME → John
Det(art) → the
N → cat

45 Parsing Tree
[Figure: parse tree for “The terrain is insurmountable”]
(S (NP (Article The) (Noun terrain)) (VP (Verb is) (Adjective insurmountable)))
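The tree for this sentence can be reproduced with a small top-down parser. This is a sketch under simplifying assumptions: the grammar and lexicon contain only the three rules and four words that the example needs, and the parser takes the first rule that matches without full backtracking.

```python
# Minimal grammar and lexicon covering only the example sentence.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Article", "Noun"]],
    "VP": [["Verb", "Adjective"]],
}
LEXICON = {
    "the": "Article",
    "terrain": "Noun",
    "is": "Verb",
    "insurmountable": "Adjective",
}


def parse(symbol, tokens, pos):
    """Try to parse `symbol` at tokens[pos]; return (tree, next_pos) or None."""
    if symbol not in GRAMMAR:
        # Part-of-speech category: consume one word if the lexicon agrees.
        if pos < len(tokens) and LEXICON.get(tokens[pos].lower()) == symbol:
            return (symbol, tokens[pos]), pos + 1
        return None
    for rhs in GRAMMAR[symbol]:
        children, cur = [], pos
        for child in rhs:
            result = parse(child, tokens, cur)
            if result is None:
                break
            subtree, cur = result
            children.append(subtree)
        else:
            # Every symbol on the right-hand side was parsed.
            return (symbol, *children), cur
    return None


def parse_sentence(sentence):
    """Return the parse tree if the whole sentence is covered, else None."""
    tokens = sentence.split()
    result = parse("S", tokens, 0)
    if result is not None and result[1] == len(tokens):
        return result[0]
    return None


tree = parse_sentence("The terrain is insurmountable")
print(tree)
```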

46 The outcome of the syntactic analysis can still be a series of alternate structures with respective probabilities. Sometimes grammar rules can disambiguate a sentence, as in “John set the set of chairs”. Sometimes they can’t. The next step is semantic analysis.

47 Semantic analysis
- semantics provides a partial representation of meaning
- represents the sentence in meaningful parts
- uses possible syntactic structures and meaning
- builds a parse tree with associated semantics
- semantics is typically represented with logic

48 Compositional semantics
The semantics of a phrase is a function of the semantics of its sub-phrases. It does not depend on any other phrase. So, if we know the meaning of the sub-phrases, then we know the meaning of the phrase. “A goal of semantic interpretation is to find a way that the meaning of the whole sentence can be put together in a simple way from the meanings of the parts of the sentence.” (Alison, 1997 p. 112)
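A minimal sketch of compositionality: each word gets a meaning, and the meaning of every phrase is computed only from the meanings of its immediate sub-phrases. The tuple-based tree format, the tiny lexicon, and the logic-style output strings are assumptions made for this illustration.

```python
# Word meanings: names denote themselves; verbs are functions that
# build a logic-style formula from their arguments.
LEXICON = {
    "John": "John",
    "Mary": "Mary",
    "loves": lambda obj: lambda subj: f"loves({subj}, {obj})",
    "sleeps": lambda subj: f"sleeps({subj})",
}


def meaning(tree):
    """Meaning of a tree, computed only from the meanings of its children."""
    if isinstance(tree, str):
        return LEXICON[tree]
    label, *children = tree
    values = [meaning(child) for child in children]
    if label == "S":  # S -> NP VP: apply the VP meaning to the subject
        subject, predicate = values
        return predicate(subject)
    if label == "VP":
        if len(values) == 1:  # VP -> V (intransitive)
            return values[0]
        verb, obj = values  # VP -> V NP (transitive)
        return verb(obj)
    if label in ("NP", "V"):  # unary rules pass the meaning up unchanged
        return values[0]
    raise ValueError(f"unknown phrase label: {label}")


transitive = ("S", ("NP", "John"), ("VP", ("V", "loves"), ("NP", "Mary")))
intransitive = ("S", ("NP", "Mary"), ("VP", ("V", "sleeps")))
print(meaning(transitive))
print(meaning(intransitive))
```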

49 Semantic analysis
The transitivity of a verb enhances the meaning in a parse tree (e.g., jump is intransitive, love is transitive).
- John died Mary
Is there a period missing, or is it:
- John dyed Mary

50 Pragmatic analysis
- uses context
- uses the partial representation
- includes purpose and performs disambiguation
- considers where, when, and by whom an utterance was said

51 Example using Ontology
Fred saw the plane flying over Zurich. Fred saw the mountains flying over Zurich. Traditional NL systems will have difficulty resolving this syntactic ambiguity, but because CYC knows that planes fly and mountains do not, it will be able to parse these sentences just as easily as a human. It's difficult to see how this could be done without relying on a large database of common sense.

52 Example using Ontology
Because it includes context, the system can interpret a sentence in light of the one that follows it: “The man saw the plane flying over Zurich. It was dark; when he looked up to the sky again, the plane was gone.” Another interpretation would be given if the following sentence were: “The man saw the plane flying over Zurich. He also saw the building where the plane crashed.”

53 Pronoun disambiguation using Ontology
The police arrested the demonstrators because they feared violence. The police arrested the demonstrators because they advocated violence. Mary saw the coat in the store window and wanted it. Mary saw the coat in the store window and pressed her nose up against it.
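The idea behind these examples can be sketched as a lookup: a pronoun is resolved to the one candidate that the knowledge base considers plausible for the event. The dictionary below is a hypothetical stand-in for an ontology, written by hand for these four sentences only; it is not how CYC or any real system stores common-sense knowledge.

```python
# Toy "ontology": for each event, the entities that can plausibly be
# the referent of the pronoun. All entries are invented for the example.
PLAUSIBLE = {
    "feared violence": {"police"},
    "advocated violence": {"demonstrators"},
    "wanted": {"coat"},
    "pressed her nose up against": {"store window"},
}


def resolve_pronoun(candidates, event):
    """Return the single candidate the knowledge base allows, else None."""
    allowed = PLAUSIBLE.get(event, set())
    matches = [c for c in candidates if c in allowed]
    return matches[0] if len(matches) == 1 else None


people = ["police", "demonstrators"]
things = ["coat", "store window"]
print(resolve_pronoun(people, "feared violence"))
print(resolve_pronoun(people, "advocated violence"))
print(resolve_pronoun(things, "pressed her nose up against"))
```

When the knowledge base has no entry for an event, the function returns None rather than guessing, which mirrors the point that syntax alone cannot settle the reference.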

54 Communication and Planning
Deciding what to say relates to planning; understanding relates to plan recognition.

55 Currently in NLP
- logic-based NLP is less accurate
- statistical natural language processing increases accuracy to around 98%
- this is still not good enough: given the average length of a sentence in a newspaper, this accuracy can result in 1 error per sentence
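The arithmetic behind this point can be checked directly. The sketch assumes that the 98% figure is a per-word accuracy and that errors are independent; both are assumptions, as are the sentence lengths tried below.

```python
def expected_errors(accuracy, sentence_length):
    """Expected number of wrong words in a sentence of the given length."""
    return (1.0 - accuracy) * sentence_length


def prob_at_least_one_error(accuracy, sentence_length):
    """Chance that a sentence contains one or more errors (independent errors)."""
    return 1.0 - accuracy ** sentence_length


# Hypothetical sentence lengths, in words.
for length in (10, 25, 50):
    print(
        length,
        round(expected_errors(0.98, length), 2),
        round(prob_at_least_one_error(0.98, length), 2),
    )
```

Under these assumptions the expected error count grows linearly with sentence length, reaching one error at 50 words, so long sentences are where a seemingly high per-word accuracy hurts most.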

56 Processes in NL communication
Natural Language Generation
Communication involves three steps by the speaker:
- the intention to convey an idea (what to say)
- the mental generation of words (how to say)
- their synthesis (say it)

57 what to say
- text planning
- utterances that achieve a goal, may include ordering
- result of reasoning (e.g., retrieval)
- a confirmation or thanks (Jupiter sounds a beep)
- a question motivated by need of confirmation
- a question motivated by need of missing information

58 how to say
How to convert a semantic representation into a sentence:
- grammatically correct
- proper choice of words
- in limited problem types, templates are helpful
E.g., JUPITER:
- says “I have no knowledge of that”
- starts sentences with: In (city) (day of the week), chances…
- finishes sentences with: “Is there something else?” or “Can I help you with something else?”
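Template-based generation of this kind can be sketched as string filling. The frame keys (`city`, `day`, `event`, `chance`) and the wording after "chances" are assumptions invented for the example; only the opening pattern, the fallback sentence, and the closing questions come from the slide.

```python
FALLBACK = "I have no knowledge of that."
CLOSING = "Is there something else?"
# Hypothetical template: the part after "chances" is made up.
WEATHER_TEMPLATE = "In {city} {day}, chances of {event} are {chance}."


def generate(frame):
    """Turn a semantic frame (a dict) into a sentence, or fall back."""
    if frame is None:
        return f"{FALLBACK} Can I help you with something else?"
    return f"{WEATHER_TEMPLATE.format(**frame)} {CLOSING}"


frame = {"city": "Boston", "day": "Tuesday", "event": "rain", "chance": "high"}
print(generate(frame))
print(generate(None))
```

Templates guarantee grammatical output and consistent word choice, but only for the narrow set of situations they were written for.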

59 say it!
Speech synthesis: from words into a speech signal.
- applications of neural networks
- templates with recordings from humans
- record every word in a dictionary
- record every phoneme (worst choice!)
- JUPITER uses a commercial speech synthesizer

60 Example
Nitrogen is a prototype natural language generation system that combines symbolic rules with linguistic information gathered statistically from large online text corpora.

61 JUPITER
Asked "What will the weather be like in Boston tomorrow?", Jupiter invokes the following procedure:
- Speech recognition: SUMMIT converts the spoken sentence into text
- Language understanding: TINA parses the text into a semantic frame, a grammatical structure containing the basic terms needed to query the Jupiter database
- Language generation: GENESIS uses the semantic frame's basic terms to build a Structured Query Language (SQL) query for the database
- Information retrieval: Jupiter executes the SQL query and retrieves the requested information from the database
- Language generation: TINA and GENESIS convert the query result into a natural language sentence
- Information delivery: Jupiter delivers the generated sentence to the user via voice (using a speech synthesizer) and/or display
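The middle of this procedure (text to semantic frame, frame to SQL, retrieval, answer sentence) can be sketched end to end. Everything below is a hypothetical stand-in: the functions do not reproduce SUMMIT, TINA, or GENESIS, the regular expression handles only this one question shape, and the `weather` table with its columns and row is invented for the demonstration. Speech recognition and synthesis are left out.

```python
import re
import sqlite3


def understand(text):
    """Stand-in for language understanding: text -> small semantic frame."""
    match = re.search(r"weather .* in (\w+) (today|tomorrow)", text.lower())
    if match is None:
        return None
    return {"topic": "weather", "city": match.group(1), "day": match.group(2)}


def build_query(frame):
    """Stand-in for query generation: frame -> parameterized SQL."""
    sql = "SELECT forecast FROM weather WHERE city = ? AND day = ?"
    return sql, (frame["city"], frame["day"])


def respond(frame, row):
    """Stand-in for answer generation: query result -> sentence."""
    if row is None:
        return "I have no knowledge of that."
    return f"In {frame['city'].title()} {frame['day']}, expect {row[0]}."


def answer(text, db):
    """Run the understand -> query -> retrieve -> respond chain."""
    frame = understand(text)
    if frame is None:
        return "I have no knowledge of that."
    sql, params = build_query(frame)
    row = db.execute(sql, params).fetchone()
    return respond(frame, row)


# Hypothetical in-memory database with a single made-up forecast.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE weather (city TEXT, day TEXT, forecast TEXT)")
db.execute("INSERT INTO weather VALUES ('boston', 'tomorrow', 'light rain')")

reply = answer("What will the weather be like in Boston tomorrow?", db)
print(reply)
```

The query is parameterized rather than assembled by string concatenation, so words taken from the user's utterance are never interpreted as SQL.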

