Natural Language Understanding

Understanding natural language (an infinite language) means determining the meaning of a sentence with respect to the context in which it is used. It requires analysis of the sentence on several different levels: syntactic, semantic, pragmatic, and discourse.

Syntactic: the syntax (grammar) of the sentence is checked. Syntax is a tool for describing the structure of sentences in the language.
Semantics: denotes the 'literal' meaning we ascribe to a sentence.

Pragmatics: refers to the intended meaning of a sentence, that is, how sentences are used in different contexts and how context affects their interpretation.
Discourse: refers to conversation between two or more individuals.

Basic Parsing Techniques

Context-free grammars:

S → NP, VP
VP → verb, NP
NP → det, noun
NP → det, adj*, noun

One can have top-down or bottom-up parsers; the sketch below shows one way to run this fragment as a parser.
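This fragment can be written directly as a Prolog definite clause grammar, which Prolog executes as a top-down, left-to-right parser. A minimal sketch; the sample words are illustrative, not from the text:

s  --> np, vp.
vp --> verb, np.
np --> det, noun.
np --> det, adjs, noun.
adjs --> adj.
adjs --> adj, adjs.      /* adj* : one or more adjectives in this rule */

det  --> [the].  det  --> [an].
adj  --> [old].
noun --> [man].  noun --> [apple].
verb --> [eats].

/* ?- s([the, old, man, eats, an, apple], []).   succeeds */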

Simple Transition Networks

These are more convenient for visualizing a grammar (CFG). A network consists of nodes and labeled arcs, where arcs are labeled by word categories. (Figure: the transition network for NP.) Starting at a given node, we can traverse an arc if the current word in the sentence is in the category on the arc.
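Such a network can be encoded directly as Prolog facts and walked with a small recognizer. A hedged sketch: arc/3, final/1, word_category/2, and the sample words are illustrative names, not from the text.

arc(np,  det,  np1).
arc(np1, adj,  np1).
arc(np1, noun, np2).
final(np2).

word_category(the, det).
word_category(old, adj).
word_category(man, noun).

/* traverse(Node, Words): succeed if Words take us from Node to a final node */
traverse(Node, []) :- final(Node).
traverse(Node, [Word|Rest]) :-
    arc(Node, Cat, Next),
    word_category(Word, Cat),
    traverse(Next, Rest).

/* ?- traverse(np, [the, old, man]).   succeeds */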

This network recognizes the same set of sentences as the following CFG:

NP → det, NP1
NP1 → adj, NP1
NP1 → noun

The simple transition network formalism is not powerful enough to describe all languages that can be described by a CFG; a recursive grammar cannot be defined by a simple transition network.

Recursive Transition Networks (RTN)

To get the full descriptive power of CFGs, we need a notion of recursion in the network grammar. An RTN is like a simple transition network except that it allows arc labels that refer to other networks rather than word categories.

(Figure: the RTN for simple English sentences.) Uppercase labels refer to networks. The arc from S to S1 can be followed only if the NP network can be successfully traversed to a pop arc. An RTN may even have an arc labeled with its own network's name (direct recursion). Any language generated by a CFG can be generated by an RTN, and vice versa; thus they are equivalent in their generative capacity.

Implementation of an RTN in Prolog

The vocabulary for the RTN can be stored as a set of facts:

word_type("an", det).
word_type("the", det).
word_type("man", noun).
word_type("apple", noun).
word_type("eats", verb).

The top-level clause for the RTN is called run and is as follows:

run :-
    set_state(s0),
    writeln("Enter your sentence"),
    readln(Sent),
    analyze(Sent),
    writeln("Your sentence is syntactically correct"),
    clear_state.
run :-
    writeln("Your sentence is syntactically wrong"),
    clear_state.

set_state(S) :- assert(current_state(S)).   /* initialize current state to s0 */

analyze(S) :- S = ".", final_state(_).
analyze(S) :-
    get_state(NS), !,
    transition(NS, S, S1),
    analyze(S1).

get_state(S) :- current_state(S).

/* transitions between states are listed as */
transition(s0, A, B) :- check_np(np, A, B), set_state(s1).
transition(s1, A, B) :- get_token(A, W, B), word_type(W, verb), set_state(s2).

transition(s2, A, B)  :- check_np(np, A, B), set_state(s3).
transition(np, A, B)  :- get_token(A, W, B), word_type(W, det),  set_state(np1).
transition(np1, A, B) :- get_token(A, W, B), word_type(W, noun), set_state(np2).
transition(np1, A, B) :- get_token(A, W, B), word_type(W, adj),  set_state(np1).

check_np(np2, A, B) :- A = B, !.
check_np(St, A, B) :-
    transition(St, A, C),
    get_state(Ns),
    check_np(Ns, C, B), !.

final_state(s3) :- current_state(s3).

/* get a token from the sentence */
get_token(A, W, B) :- ?    /* left open in the original; one possibility is sketched below */

clear_state :- retract(current_state(_)), fail.
clear_state.

Query:
?- run
the man eats an apple.
Yes.
?- run
man eat apple.
Yes. (This is actually wrong, but we have not taken number agreement into account.)
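One possible definition of get_token/3, assuming a Turbo Prolog environment (which the string-based code above suggests): the built-in fronttoken/3 splits the first token off a string.

/* Assumption: Turbo Prolog's fronttoken(Sentence, Word, Rest) built-in */
get_token(A, W, B) :- fronttoken(A, W, B).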

These formalisms are limited in the following way: they can only accept or reject a sentence, rather than produce an analysis of the sentence's structure. Augmenting the RTN formalism, by generalizing the network notation, introducing more information about words, and collecting information and testing features while parsing, yields the Augmented Transition Network (ATN). A similar kind of augmentation and extension can be made to CFGs, called Definite Clause Grammar (DCG); see the sketch below.
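A minimal sketch of the DCG idea (illustrative, not from the text): the CFG is augmented with an argument carrying the number feature, so agreement is tested during parsing.

s       --> np(Num), vp(Num).
np(Num) --> det(Num), noun(Num).
vp(Num) --> verb(Num).

det(_)     --> [the].
noun(sing) --> [man].
noun(plur) --> [men].
verb(sing) --> [eats].
verb(plur) --> [eat].

/* ?- phrase(s, [the, man, eats]).   succeeds
   ?- phrase(s, [the, man, eat]).    fails (number disagreement) */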

Recording Sentence Structure While Parsing in an ATN

We collect the structure of legal sentences in order to analyze them further. For instance, we can identify one particular noun phrase as the syntactic subject (SUBJ) of a sentence and another as the syntactic object of the verb (OBJ). Within a noun phrase we might identify the determiner structure, the adjectives, the head noun, and so on. Thus the sentence "Jack found a bag" might be represented by the structure:

(S SUBJ (NP NAME jack)
   MAIN-V found
   TENSE PAST
   OBJ (NP DET a
           HEAD bag))

Such a structure is created using an RTN parser by allowing each network to have a set of registers.
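In a Prolog implementation, this bracketed structure might be rendered as a nested term, for example (an illustrative encoding, not from the text):

s(subj(np(name(jack))),
  main_v(found),
  tense(past),
  obj(np(det(a), head(bag))))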

Registers are local to each network. Each time a new network is pushed, a new set of empty registers is created; when the network is popped, the registers disappear. Registers can be set to values, and those values can later be retrieved. The NP network has registers named DET, ADJS, HEAD, and NUM. Registers are set by actions that can be specified on the arcs; when an arc is followed, the actions associated with it are executed. The most common action sets a register to a certain value. When a pop arc is followed, all the registers set in the current network are automatically collected to form a structure consisting of the network name followed by a list of the registers with their values, as in the sketch below.
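The register mechanism might be sketched in standard Prolog as follows (set_reg/4, get_reg/3, and pop_network/3 are illustrative names, not from the text): each network invocation carries its own list of Name=Value registers, and popping collects them into a structure labelled with the network name.

/* each push starts from an empty register list [] */
set_reg(Name, Value, Regs, [Name=Value|Regs]).
get_reg(Name, Regs, Value) :- member(Name=Value, Regs).
pop_network(Net, Regs, Structure) :- Structure =.. [Net, Regs].

/* e.g. starting from [], setting det=the then head=dog and popping
   the np network yields np([head=dog, det=the]) */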

When a category arc, such as name or verb, is followed, the current input word is put into a special variable named *. Thus a plausible action on the arc from S1 to S2 would be to set the NAME register to the current word, written as:

NAME ← *

Push arcs, such as NP, must be treated differently. Typically many words will be consumed by the network invoked by the push arc. The pushed network has its own set of registers that capture the structure of the constituent it parsed, and the structure it builds is returned in the value of *. Thus, for the push arc labeled NP from S to S1, the action might be:

SUBJ ← *

Therefore, an RTN with registers, together with tests and actions on those registers, is an Augmented Transition Network (ATN).

Arc     Test                  Actions
NP/1    none                  DET ← * ; NUM ← NUM*
NP/2    none                  NAME ← * ; NUM ← NUM*
NP1/1   NUM ∩ NUM* ≠ ∅        HEAD ← * ; NUM ← NUM ∩ NUM*
NP1/2   none                  ADJS ← Append(ADJS, *)
S/1     none                  SUBJ ← *
S1/1    NUM SUBJ ∩ NUM* ≠ ∅   MAIN_V ← * ; NUM ← NUM SUBJ ∩ NUM*
S2/1    none                  OBJ ← *

(If an arc's test succeeds, its actions are taken; otherwise the arc fails.)

Notation:
NUM* is the NUM register of the structure in *.
NUM SUBJ is the NUM register of the structure in SUBJ.

The values of registers are often viewed as sets, and intersection (∩) and union (∪) are allowed to combine the values of different registers. For registers that may take a list of values, an append function is permitted: Append(ADJS, *) returns the list in register ADJS with the value of * appended at the end. (A Prolog rendering of these operations is sketched below.)

The sentence, with word positions indicated, is as follows:
1 The 2 dogs 3 love 4 john 5
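These set operations have direct counterparts in standard Prolog if the number features are kept as lists (an illustrative sketch: num_test/2 and add_adj/3 are made-up names, while intersection/3 and append/3 are standard list-library predicates):

/* NUM ∩ NUM* ≠ ∅ : the two feature sets must share an element */
num_test(Num1, Num2) :-
    intersection(Num1, Num2, Common),
    Common \= [].

/* Append(ADJS, *) : add the current word to the ADJS list */
add_adj(Adjs, Word, NewAdjs) :- append(Adjs, [Word], NewAdjs).

/* ?- num_test(['3s','3p'], ['3p']).   succeeds */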

A simple lexicon:

Word    Representation
dogs    (NOUN ROOT dog NUM {3p})
dog     (NOUN ROOT dog NUM {3s})
the     (DET ROOT the NUM {3s, 3p})
love    (VERB ROOT love NUM {3p})
john    (NAME ROOT john)
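This lexicon might be stored as Prolog facts of the form word(Surface, Category, Root, NumFeatures); the word/4 functor and the quoted feature atoms are illustrative choices, not from the text:

word(dogs, noun, dog,  ['3p']).
word(dog,  noun, dog,  ['3s']).
word(the,  det,  the,  ['3s','3p']).
word(love, verb, love, ['3p']).
word(john, name, john, []).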

Trace of the S network:

Step  Node  Position  Arc followed                        Registers
1     S     1         S/1                                 -
2     NP    1         NP/1                                DET ← the, NUM ← {3s, 3p}
3     NP1   2         NP1/1 (check {3s, 3p} ∩ {3p} ≠ ∅)   HEAD ← dogs, NUM ← {3p}
4     NP2   3         NP2/1, returns structure            (NP DET the HEAD dogs NUM {3p})
5     S     3         S/1 succeeds                        SUBJ ← (NP DET the HEAD dogs NUM {3p})
      S1    3         S1/1 (check {3p} ∩ {3p} ≠ ∅)        MAIN_V ← love, NUM ← {3p}

6     S2    4         S2/1                                OBJ ← *
7     NP    4         NP/2                                NAME ← john, NUM ← {3p}
8     NP2   5         NP2/1, returns structure            OBJ ← (NP NAME john NUM {3p})
9     S3    5         S3/1, returns structure and succeeds:
      (S SUBJ (NP DET the HEAD dogs NUM {3p})
         MAIN_V love
         NUM {3p}
         OBJ (NP NAME john NUM {3p}))

Implementation of an ATN in Prolog

Database clauses are used to store and read the registers.

run :-
    set_state(s0),    /* initialize state */
    write("ATN analyses your sentence"), nl,
    write("Please type in your sentence"),
    readln(Sent),
    analyse(Sent),
    write("Your sentence is syntactically correct"), nl,
    clear_dbase.
run :-
    write("Your sentence is syntactically wrong"),
    clear_dbase.

clear_dbase :- retract(_), fail.   /* remove all asserted register facts */
clear_dbase.

analyse(S) :- S = ".", final_state(_).
analyse(S) :-
    current_state(N_state), !,
    transition(N_state, S, S1), !,
    analyse(S1).

/* main transitions */
transition(s0, A, B) :-
    get_token(A, W, B),
    word_type(W, verb),
    asserta(type_reg("QUEST")),
    asserta(verb_reg(W)),
    set_state(s2).
transition(s0, A, B) :-
    check_np(np, A, B),
    asserta(type_reg("DECL")),
    build_phrase(np, STR),
    assert(subj_reg(STR)),
    set_state(s1).

/* NP transitions */
check_np(np2, C, B) :- B = C, !.
check_np(np3, C, B) :- B = C, !.
check_np(St, A, B) :-
    transition(St, A, C),
    get_state(N),
    check_np(N, C, B).

/* build the NP phrase from the saved register values */
build_phrase(np, STR) :-
    det_reg(DET),
    adj_reg(ADJ),
    noun_reg(NOUN),
    get_template(np, T),
    fill_template(T, DET, T1),
    fill_template(T1, ADJ, T2),
    fill_template(T2, NOUN, STR).
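get_template/2 and fill_template/3 are not defined in the text. One hedged reading in standard Prolog (an assumption, not the original design): the template is a partial structure with free slots, and each fill_template/3 call binds the next free slot.

/* Assumption: an NP template has three free slots, filled in order */
get_template(np, np(_, _, _)).

fill_template(np(D, A, N), V, np(D, A, N)) :- var(D), !, D = V.
fill_template(np(D, A, N), V, np(D, A, N)) :- var(A), !, A = V.
fill_template(np(D, A, N), V, np(D, A, N)) :- var(N), !, N = V.

/* e.g. get_template(np, T), fill_template(T, the, T1),
   fill_template(T1, old, T2), fill_template(T2, man, STR)
   gives STR = np(the, old, man) */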