BİL711 Natural Language Processing


Representing Meaning

Meaning Representation: capturing the meaning of linguistic utterances using formal notation.
Meaning Representation Languages: frameworks used to specify the syntax and semantics of these meaning representations.
Semantic Analysis: mapping linguistic utterances to these meaning representations.
The meaning representation should be chosen to suit the application. Certain language tasks require some form of semantic processing, e.g. following a recipe, or answering an essay question in an exam.

Meaning Representation Languages

Let us look at four frequently used meaning representation languages:
- First Order Predicate Calculus (FOPC)
- Semantic networks
- Conceptual dependency diagrams
- Frame-based representations
Let us look at the representation of "I have a car" in these four formalisms.

Meaning Representation Example: "I have a car"

First Order Predicate Calculus:
  ∃x,y Having(x) ∧ Haver(Speaker,x) ∧ HadThing(y,x) ∧ Car(y)
Semantic Network: a Having node linked to Speaker by a Haver arc and to Car by a HadThing arc.
Conceptual Dependency Diagram: Speaker and Car linked by a POSS-BY relation.
Frame-Based Representation:
  Having
    Haver: Speaker
    HadThing: Car

What Do We Expect from Meaning Representations?

To be computationally effective, we expect certain properties of meaning representations:
- Verifiability: the ability to determine the truth value of a representation.
- Unambiguous representations: a representation must be unambiguous.
- Canonical form: utterances that mean the same thing should map to the same meaning representation.
- Inference and variables: the ability to draw valid conclusions based on the meaning representations of inputs and the background knowledge.
- Expressiveness: the ability to express a wide range of subject matter.

Verifiability

Verifiability is the ability to determine the truth value of a representation using the information available in the knowledge base (KB).
Example: assume our KB contains the entry serve(Subway,VegetarianFood).
Question: Does Subway serve vegetarian food?
The question is converted into a logical form (a meaning representation), and we should be able to verify the truth value of that logical form against our KB.
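As an illustration, here is a minimal Python sketch of verification against a KB; the tuple encoding of facts is an assumption made for this sketch, not part of the slide's formalism.

KB = {("serve", "Subway", "VegetarianFood")}

def verify(predicate, *args):
    # True iff the ground fact is explicitly present in the KB.
    return (predicate, *args) in KB

# "Does Subway serve vegetarian food?" -> serve(Subway, VegetarianFood)
print(verify("serve", "Subway", "VegetarianFood"))   # True
print(verify("serve", "Subway", "SeaFood"))          # False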

Unambiguous Representations

A meaning representation must be unambiguous.
Example: consider the representation of "I want to eat some place near Bilkent". This sentence has several possible meanings, and we will prefer one of them; but the chosen meaning representation CANNOT be ambiguous.
Vagueness: vagueness can make it difficult to determine the meaning representation, but it does not cause multiple representations. In "I want to eat Turkish food", Turkish food is vague, but it does not cause multiple representations. Meaning representations should be able to maintain a certain level of vagueness.

Canonical Form

Distinct inputs can map to the same meaning representation:
- Does Kirac have vegetarian food?
- Do they have vegetarian food at Kirac?
- Are vegetarian dishes served at Kirac?
We shouldn't map these sentences to different meaning representations.
Canonical form is the notion that inputs that mean the same thing should have the same meaning representation.
To be able to map distinct inputs to the same meaning representation, we must recognize that different phrases, such as vegetarian food and vegetarian dishes, mean the same thing. A word has various word senses, and one of those senses may be synonymous with a sense of another word.
Word sense disambiguation is the process of choosing the right sense in context.
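A small sketch of canonicalization, assuming a hand-built (hypothetical) sense inventory that maps synonymous phrases to a single sense before the representation is built:

CANONICAL_SENSE = {
    "vegetarian food": "VegetarianFood",
    "vegetarian dishes": "VegetarianFood",
}

def meaning(restaurant, phrase):
    # Paraphrases map to the same representation via the shared sense.
    return ("serve", restaurant, CANONICAL_SENSE[phrase])

print(meaning("Kirac", "vegetarian food") ==
      meaning("Kirac", "vegetarian dishes"))   # True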

Inference and Variables

Inference is the ability to draw valid conclusions based on the meaning representations of inputs and the background knowledge. We should be able to determine the truth value of propositions that are not explicitly in the KB; this is inference.
Example: I would like to find a restaurant that serves vegetarian food.
This example is complex, and we need variables in its representation:
  serves(x,VegetarianFood) -- a part of our meaning representation
If some restaurant serves vegetarian food, our inference mechanism should be able to find it by binding the variable x to that restaurant.
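A sketch of variable binding during a KB query; the KB contents and the "?x" variable convention are invented for illustration.

KB = {
    ("serves", "Kirac", "VegetarianFood"),
    ("serves", "Subway", "Sandwiches"),
}

def bindings(query):
    # Match the query against every fact, binding any "?var" argument.
    for fact in KB:
        if len(fact) == len(query):
            env = {}
            if all(q == f or (q.startswith("?") and env.setdefault(q, f) == f)
                   for q, f in zip(query, fact)):
                yield env

# serves(x, VegetarianFood) -- bind x to a matching restaurant
print(list(bindings(("serves", "?x", "VegetarianFood"))))  # [{'?x': 'Kirac'}]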

Expressiveness

Expressiveness is the ability to express a wide range of subject matter. The ideal would be a single meaning representation language that could adequately represent the meaning of any sensible natural language utterance.
Although this ideal may not be attainable, first order predicate calculus (FOPC) is expressive enough to handle a great deal. In fact, it is claimed that anything representable in the other three representation languages is also representable in FOPC.
We will concentrate on FOPC, but other representation languages are also used. For example, the Text Meaning Representation (TMR) used in the machine translation system of NMSU is a frame-based representation.

Predicate-Argument Structure

All natural languages have a form of predicate-argument arrangement at the core of their semantic structure: specific relations hold among the constituent words and phrases of a sentence (a predicate and its arguments).
Our meaning representation should support the predicate-argument structure induced by the language. In fact, there are relations between syntactic frames and semantic frames, and we will try to find these relations.
Example: Want(somebody,something) -- Want is a predicate with two arguments.

Predicate-Argument Structure (cont.)

Syntactic structures:
  I want Turkish food.                      NP want NP
  I want to spend less than five dollars.   NP want InfVP
  I want it to be close by here.            NP want NP InfVP
Verb subcategorization rules allow the arguments of syntactic structures to be linked with the semantic roles of those arguments in the semantic representation of the sentence. The study of the semantic roles associated with verbs is known as the study of thematic roles.
Syntactic structures place restrictions on the categories of their arguments; similarly, there are semantic restrictions on the arguments of predicates. Selectional restrictions specify the semantic restrictions on the arguments of verbs.

Predicate-Argument Structure (cont.)

Objects other than verbs in natural languages may also have predicate-argument structure:
  A Turkish restaurant under fifteen dollars.
    Under(TurkishRestaurant,$15) -- the meaning representation is associated with the preposition under, which can be characterized as a two-argument predicate.
  Make a reservation for this evening for a table for two persons at 8.
    Reservation(Hearer,Today,8PM,2) -- the meaning representation is associated with the noun reservation (not with make).
Our meaning representation should support:
- variable-arity predicate-argument structures
- the semantic labeling of arguments to predicates
- semantic constraints on the fillers of argument roles

First Order Predicate Calculus (FOPC)

FOPC is a flexible, well-understood, and computationally tractable approach, and it satisfies most of what we expect from a meaning representation language. FOPC provides a sound computational basis for the verifiability, inference, and expressiveness requirements.
The most attractive feature of FOPC is that it makes very few specific commitments about how things should be represented.

Structure of FOPC

Formula → AtomicFormula
        | Formula Connective Formula
        | Quantifier Variable,… Formula
        | ¬ Formula
        | (Formula)
AtomicFormula → Predicate(Term,…)
Term → Function(Term,…) | Constant | Variable
Connective → ∧ | ∨ | ⇒
Quantifier → ∀ | ∃
Constant → A | VegetarianFood | TurkishRestaurant | …
Variable → x | y | …
Predicate → Serves | Want | Under | …
Function → LocationOf | CuisineOf | …
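The grammar above can be mirrored directly as data structures; here is a minimal Python sketch (class and field names are my own, chosen for illustration):

from dataclasses import dataclass

@dataclass
class Atomic:          # Predicate(Term, ...)
    predicate: str
    terms: tuple

@dataclass
class Connect:         # Formula ∧ | ∨ | ⇒ Formula
    op: str            # "and" | "or" | "implies"
    left: object
    right: object

@dataclass
class Not:             # ¬ Formula
    body: object

@dataclass
class Quantified:      # ∀ | ∃ Variable Formula
    quantifier: str    # "forall" | "exists"
    variable: str
    body: object

# ∀x VegetarianRestaurant(x) ⇒ Serves(x, VegetarianFood)
rule = Quantified("forall", "x",
                  Connect("implies",
                          Atomic("VegetarianRestaurant", ("x",)),
                          Atomic("Serves", ("x", "VegetarianFood"))))
print(rule)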

FOPC Examples

I only have five dollars and I don't have a lot of time.
  Have(Speaker,FiveDollars) ∧ ¬Have(Speaker,LotOfTime)
A restaurant that serves Turkish food near Bilkent.
  ∃x Restaurant(x) ∧ Serves(x,TurkishFood) ∧ Near(LocationOf(x),LocationOf(Bilkent))
All vegetarian restaurants serve vegetarian food.
  ∀x VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)

Semantics of FOPC

The truth value of each FOPC formula can be computed from the meanings of the elements of FOPC:
- truth tables for ∧, ∨, ¬, ⇒
- the meanings of ∀ and ∃
- the meanings assigned to predicates, constants, and functions in an interpretation
The truth values of our examples:
  Have(Speaker,FiveDollars) ∧ ¬Have(Speaker,LotOfTime)
  ∃x Restaurant(x) ∧ Serves(x,TurkishFood) ∧ Near(LocationOf(x),LocationOf(Bilkent))
  ∀x VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)

Inference

Inference is the ability to determine the truth value of a formula not explicitly contained in a KB. We need inference rules to derive new formulas from the formulas available in a KB.
For example, modus ponens is an inference rule:
  α, α ⇒ β ⊢ β
Example:
  VegetarianRestaurant(Kirac)
  ∀x VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)
  -------------------------------------------------
  Serves(Kirac,VegetarianFood)
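A sketch of this derivation as forward chaining in Python; for brevity the quantified rule is hard-coded as a function.

facts = {("VegetarianRestaurant", "Kirac")}

def rule(fact):
    # ∀x VegetarianRestaurant(x) ⇒ Serves(x, VegetarianFood)
    if fact[0] == "VegetarianRestaurant":
        return ("Serves", fact[1], "VegetarianFood")

derived = set(facts)
changed = True
while changed:                      # repeat until no new fact is added
    changed = False
    for f in list(derived):
        new = rule(f)
        if new is not None and new not in derived:
            derived.add(new)
            changed = True

print(("Serves", "Kirac", "VegetarianFood") in derived)   # True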

Inference (cont.)

We may use forward chaining or backward chaining to implement inference rules. Implementations of certain inference rules for FOPC are not computationally effective; resolution is a computationally effective inference rule. Prolog uses resolution with backward chaining.
Inference rules must be sound and complete:
- Sound: every formula derivable using the inference rules is valid.
- Complete: every valid formula is derivable.

Inference -- Prolog Example

Prolog uses resolution and backward chaining.

father(X,Y) :- parent(X,Y), male(X).   % X is the father of Y if X is a parent of Y and male
parent(john,bill).                     % facts
parent(mary,bill).
male(john).
female(mary).

?- father(F,bill).                     % query: who is bill's father?
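Backward chaining works from the goal: to prove father(F,bill), Prolog tries parent(F,bill), first binding F to john, then checks male(john), which succeeds, so the first answer is F = john. The candidate binding F = mary fails at the male(mary) subgoal.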

Representation of Categories

The semantics of arguments are expressed in the form of selectional restrictions, and these selectional restrictions are expressed in terms of semantically based categories.
The most common way to represent a category is to create a unary predicate:
  VegetarianRestaurant(Kirac)
Here categories are relations, not objects, so it is difficult to make assertions about the categories themselves. We cannot use
  MostPopular(Kirac,VegetarianRestaurant)
because VegetarianRestaurant is not an object: the arguments of formulas must be Terms, and predicates cannot be arguments in FOPC.

Representation of Categories -- Reification

The solution is to make each category an object; this technique is known as reification. We can then define relations between objects and categories, and relations between categories:
- the membership relation ISA between an object and a category:
  ISA(Kirac,VegetarianRestaurant)
- the category inclusion relation AKO between categories:
  AKO(VegetarianRestaurant,Restaurant)
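A sketch of inheriting membership through AKO links (assuming an acyclic hierarchy; the relation encoding is my own):

ISA = {("Kirac", "VegetarianRestaurant")}
AKO = {("VegetarianRestaurant", "Restaurant")}

def subcategory(sub, sup):
    # True if sub = sup or sub reaches sup via AKO links (assumes no cycles).
    return sub == sup or any(a == sub and subcategory(b, sup) for (a, b) in AKO)

def member(obj, category):
    # obj is a member of category if it ISA some (sub)category of it.
    return any(o == obj and subcategory(c, category) for (o, c) in ISA)

print(member("Kirac", "VegetarianRestaurant"))  # True (direct ISA)
print(member("Kirac", "Restaurant"))            # True (via AKO)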

Representations of Events

The simplest approach to the predicate-argument representation of a verb is to give the predicate the same number of arguments as appear in the verb's subcategorization frame. But this simple approach causes some difficulties: determining the correct number of arguments, and ensuring soundness and completeness.
Example:
  I ate.                                         Eating1(Speaker)
  I ate a turkey sandwich.                       Eating2(Speaker,TurkeySandwich)
  I ate a turkey sandwich at my desk.            Eating3(Speaker,TurkeySandwich,Desk)
  I ate at my desk.                              Eating4(Speaker,Desk)
  I ate lunch.                                   Eating5(Speaker,Lunch)
  I ate a turkey sandwich for lunch.             Eating6(Speaker,TurkeySandwich,Lunch)
  I ate a turkey sandwich for lunch at my desk.  Eating7(Speaker,TurkeySandwich,Lunch,Desk)

Representations of Events -- Another Approach

Using the maximum number of arguments together with existential quantifiers does not solve the problem either:
  I ate at my desk.       ∃x,y Eating(Speaker,x,y,Desk)
  I ate lunch.            ∃x,y Eating(Speaker,x,Lunch,y)
  I ate lunch at my desk. ∃x Eating(Speaker,x,Lunch,Desk)
If we knew that the first and second formulas describe the same event, they could be combined into the third. But we cannot do this, because this approach gives us no way to relate events.

Representations of Events -- A Solution

We employ reification to elevate events to objects:
  I ate.                      ∃x ISA(x,Eating) ∧ Eater(x,Speaker)
  I ate a turkey sandwich.    ∃x ISA(x,Eating) ∧ Eater(x,Speaker) ∧ Eaten(x,TurkeySandwich)
  I ate at my desk.           ∃x ISA(x,Eating) ∧ Eater(x,Speaker) ∧ PlaceEaten(x,Desk)
  I ate lunch.                ∃x ISA(x,Eating) ∧ Eater(x,Speaker) ∧ MealEaten(x,Lunch)
With the reified-event approach:
- there is no need to specify a fixed number of arguments;
- roles can be attached as they appear in the input;
- we do not need meaning postulates relating different versions of eating.
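A sketch of how reified events sidestep the arity problem: roles are asserted only when the input supplies them. Role names follow the slide; the Python encoding is my own.

import itertools

_ids = itertools.count()

def eating_event(**roles):
    # ∃x ISA(x, Eating) ∧ Role(x, filler) ∧ ... for the supplied roles.
    x = f"e{next(_ids)}"
    formula = [("ISA", x, "Eating")]
    formula += [(role, x, filler) for role, filler in roles.items()]
    return formula

# "I ate at my desk."
print(eating_event(Eater="Speaker", PlaceEaten="Desk"))
# "I ate a turkey sandwich for lunch at my desk."
print(eating_event(Eater="Speaker", Eaten="TurkeySandwich",
                   MealEaten="Lunch", PlaceEaten="Desk"))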

Representations of Time

Time flows forward, and events are associated with either points or intervals in time. An ordering among events can be obtained by placing them on a timeline. There are different schemes for representing this kind of temporal information (the study of temporal logic).
The tense of a sentence corresponds to an ordering of the events related to that sentence (the study of tense logic).

Representations of Time -- Example

1. I arrived in Ankara.
2. I am arriving in Ankara.
3. I will arrive in Ankara.
All three sentences can be represented by the following formula, without any temporal information:
  ∃w ISA(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,Ankara)
We can add the following temporal information to represent the tenses of these examples:
1. ∃w,i,e ISA(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,Ankara) ∧ IntervalOf(w,i) ∧ EndPoint(i,e) ∧ Precedes(e,Now)
2. ∃w,i ISA(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,Ankara) ∧ IntervalOf(w,i) ∧ MemberOf(i,Now)
3. ∃w,i,e ISA(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,Ankara) ∧ IntervalOf(w,i) ∧ EndPoint(i,e) ∧ Precedes(Now,e)

Representations of Time (cont.)

The relation between simple verb tenses and points in time is not straightforward:
  We fly from Ankara to Istanbul.  -- present tense referring to a future event
  Flight 12 will be at the gate an hour from now.  -- future tense referring to a past event
In some formalisms, the tense of a sentence is expressed by the relation among the times of the events in the sentence, the time of a reference point, and the time of utterance.

Reichenbach's Approach to Representing Tenses

Tenses are characterized by the ordering of the event time (E), the reference point (R), and the utterance time (U):
  Past Perfect     I had eaten.        E < R < U
  Simple Past      I ate.              R,E < U
  Present          I eat.              U,R,E
  Present Perfect  I have eaten.       E < R,U
  Simple Future    I will eat.         U,R < E
  Future Perfect   I will have eaten.  U < E < R

Representation of Aspect

The notion of aspect concerns:
- whether an event has ended or is ongoing;
- whether it is conceptualized as happening at a point in time or over some interval;
- whether a particular state exists because of it.
Event expressions can be divided into four aspectual classes:
- Stative: a participant having a property at a point in time. I know my departure gate.
- Activity: an event associated with some interval, without a clear end point. I drove a Ferrari.
- Accomplishment: an event with a natural end point that results in a particular state. He booked me a reservation.
- Achievement: an event that results in a particular state but is instantaneous. He found the gate.

Representations of Beliefs

We can represent a belief as follows:
  I believe that Mary ate Turkish food.
  ∃u,v ISA(u,Believing) ∧ ISA(v,Eating) ∧ Believer(u,Speaker) ∧ Believed(u,v) ∧ Eater(v,Mary) ∧ Eaten(v,TurkishFood)
But from this we can derive the following, which may not be correct:
  ∃v ISA(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,TurkishFood)
We might try to represent the belief as
  Believing(Speaker,Eating(Mary,TurkishFood))
but this is not a FOPC formula.
A solution is to augment FOPC with operators (modal logic with modal operators):
  Believing(Speaker, ∃v ISA(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,TurkishFood))
Inference becomes more complicated in modal logic.

Frames

We may use other representation languages instead of FOPC, but they will be equivalent to their FOPC representations. For example, we may use frames to represent our believing example:

BELIEVING
  BELIEVER: Speaker
  BELIEVED: EATING
              EATER: Mary
              EATEN: TurkishFood

Semantic Analysis

Semantic analysis: meaning representations are assigned to linguistic inputs. We need static knowledge from the grammar and the lexicon.
How much semantic analysis do we need?
- Deep analysis: thorough syntactic and semantic analysis of the text to capture all pertinent information in it.
- Information extraction: does not require complete syntactic and semantic analysis; a cascade of FSAs can produce a robust (shallow) semantic analyzer.

Syntax-Driven Semantic Analysis

Principle of compositionality: the meaning of a sentence can be composed from the meanings of its parts. Ordering and grouping are important.
  Kirac serves meat.
  ∃e ISA(e,Serving) ∧ Server(e,Kirac) ∧ Served(e,Meat)
Parse tree:
  (S (NP (ProperNoun Kirac))
     (VP (Verb serves)
         (NP (MassNoun meat))))

Semantic Augmentation of CFG Rules

CFG rules are augmented with semantic attachments, which specify how to compute the meaning representation of a construction from the meanings of its constituent parts. A CFG rule with a semantic attachment has the form:
  A → α1 … αn   { f(αj.sem, …, αk.sem) }
The meaning representation of A, A.sem, is calculated by applying the function f to the semantic representations of some of the constituents.
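A sketch of rules carrying their attachments as ordinary functions; the tuple layout (LHS, RHS, attachment) is an assumption made for illustration, and the lambda-based attachments on the next slide plug into exactly this slot.

# Each rule: (LHS, RHS, attachment); the attachment maps the
# constituents' .sem values to the parent's .sem.
RULES = [
    ("NP", ("ProperNoun",), lambda pn_sem: pn_sem),
    ("S",  ("NP", "VP"),    lambda np_sem, vp_sem: vp_sem(np_sem)),
]

def parent_sem(rule, child_sems):
    lhs, rhs, attach = rule
    return attach(*child_sems)

print(parent_sem(RULES[0], ["Kirac"]))   # 'Kirac'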

A Naïve Approach

  ProperNoun → Kirac   { Kirac }
  MassNoun → meat      { Meat }
  NP → ProperNoun      { ProperNoun.sem }
  NP → MassNoun        { MassNoun.sem }
  Verb → serves        { ∃e,x,y ISA(e,Serving) ∧ Server(e,x) ∧ Served(e,y) }
But we cannot propagate this representation to the upper levels.

Using Lambda Notation

  ProperNoun → Kirac   { Kirac }
  MassNoun → meat      { Meat }
  NP → ProperNoun      { ProperNoun.sem }
  NP → MassNoun        { MassNoun.sem }
  Verb → serves        { λx λy ∃e ISA(e,Serving) ∧ Server(e,y) ∧ Served(e,x) }   -- a lambda expression
  VP → Verb NP         { Verb.sem(NP.sem) }   -- application of a lambda expression
  S → NP VP            { VP.sem(NP.sem) }
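A Python sketch of this composition, using closures for λ-abstraction and function calls for application; the result is just a string rendering of the formula.

# Verb -> serves { λx.λy.∃e ISA(e,Serving) ∧ Server(e,y) ∧ Served(e,x) }
serves_sem = lambda x: lambda y: (
    f"∃e ISA(e,Serving) ∧ Server(e,{y}) ∧ Served(e,{x})")

kirac_sem = "Kirac"    # NP -> ProperNoun { Kirac }
meat_sem = "Meat"      # NP -> MassNoun  { Meat }

vp_sem = serves_sem(meat_sem)    # VP -> Verb NP { Verb.sem(NP.sem) }
s_sem = vp_sem(kirac_sem)        # S  -> NP VP   { VP.sem(NP.sem) }

print(s_sem)   # ∃e ISA(e,Serving) ∧ Server(e,Kirac) ∧ Served(e,Meat)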

Quasi-Logical Form

During semantic analysis we may use quantified expressions as terms; in this case the formula is not a FOPC formula. We call such formulas quasi-logical forms. A quasi-logical form is converted into a normal FOPC formula by applying simple syntactic translations:
  Server(e,<∃x ISA(x,Restaurant)>)   -- a quasi-logical formula
  ∃x ISA(x,Restaurant) ∧ Server(e,x) -- a normal FOPC formula

Integrating Semantic Analysis into the Earley Algorithm

The modifications required to integrate semantic analysis into an Earley parser are:
- The rules of the grammar get an extra field to hold semantic attachments.
- The states in the chart get an extra field to hold the meaning representation of the constituent.
- The ENQUEUE function is changed so that when a complete state is entered into the chart, its semantics are computed and stored in the state's semantic field.
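A sketch of the changed ENQUEUE in Python; the state layout is simplified and the field names are my own, with rules encoded as (lhs, rhs, attachment) as in the earlier sketch.

class State:
    def __init__(self, rule, dot, children):
        self.rule = rule            # (lhs, rhs, attachment)
        self.dot = dot              # dot position within rhs
        self.children = children    # completed child states
        self.sem = None             # NEW: the meaning representation field

    def complete(self):
        return self.dot == len(self.rule[1])

def enqueue(state, chart_entry):
    # NEW: compute the semantics of a complete state before storing it.
    if state.complete() and state.sem is None:
        attachment = state.rule[2]
        state.sem = attachment(*(c.sem for c in state.children))
    if state not in chart_entry:
        chart_entry.append(state)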