CPSC 503 Computational Linguistics


1 CPSC 503 Computational Linguistics
Features and Unification Lecture 13 Giuseppe Carenini 2/28/2019 CPSC503 Spring 2004

2 Today 3/3
Representing Syntactic Knowledge
Feature Structures & Unification
- Definitions
- Using F&U in a CFG
- (Implementing Unification)
- Parsing with F&U (key ideas)
Types and Inheritance

3 Representing Syntactic Knowledge (English)
[Diagram: English Grammar; FSA; CFGs (recursion, e.g. NP -> NP PP); Agreement; Sub-categorization]
CFGs appear to be just about what we need to account for a lot of basic syntactic structure in English. But there are problems, such as agreement and sub-categorization, that can be dealt with adequately, although not elegantly, by staying within the CFG framework. There are simpler, more elegant solutions that take us out of the CFG framework (beyond its formal power). We will use feature structures and the constraint-based unification formalism.

4 Agreement
Grammatical: This dog. Those dogs. This dog eats. You have it. Those dogs eat.
Ungrammatical: *This dogs. *Those dog. *This dog eat. *You has it. *Those dogs eats.
Features involved: number, person.

5 Subcategorization
Sneeze: John sneezed
Find: Please find [a flight to NY]NP
Give: Give [me]NP [a cheaper fare]NP
Help: Can you help [me]NP [with a flight]PP
Prefer: I prefer [to leave earlier]TO-VP
Told: I was told [United has a flight]S

6 Features: Object-oriented approach to grammar rules and categories
Go back to the subject-verb agreement case. An alternative is to rethink the terminals and non-terminals as complex objects with associated properties (called features) that can be manipulated. Features take on different values. The application of grammar rules is constrained by testing on these features (constraint-based).

7 Feature Structures
Def.: a set of feature-value pairs, where features are atomic symbols and values are either atomic symbols or feature structures.
General form:
[ Feature1 Value1
  Feature2 Value2
  ...
  Featuren Valuen ]
A value may itself be a feature structure:
[ Feature1 Value1
  ...
  Featurek [ Featurek1 Valuek1
             Featurek2 Valuek2
             ... ] ]

8 Example of Syntactic Feature Structures
First start with a number feature whose value can be either SG or PL:
[Number SG]
Next add a feature to capture person:
[Number SG, Person 3]
Finally encode the grammatical category of the constituent. This structure can be used to represent a third person singular NP:
[Cat NP, Number SG, Person 3]

9 Example of Syntactic Bundles of Features
Feature values can be feature structures themselves. This is useful when certain features commonly co-occur, as number and person do.
[Cat NP, Agreement [Number SG, Person 3]]
Having feature structures as values leads to the notion of a feature path, and hence to an alternative graphical structure for viewing them.

10 Feature Structures as DAGs
Features appear as labeled edges; values as nodes.
[Figure: DAG for [Cat NP, Agreement [Number SG, Person 3]]]

11 Reentrant Structure
We’ll allow multiple features in a feature structure to share the same value. By this we mean that they share the same structure, not just that they have equal values. Numerical indices indicate the shared value.
[Cat S, Head [Agreement (1) [Number SG, Person 3], Subject [Agreement (1)]]]
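The difference between sharing a structure and merely having equal values can be sketched with object identity. This is an illustration only; the dict encoding and the names `reentrant`, `copied` and `shares_agreement` are assumptions, not part of the slides.

```python
# Shared structure: both AGREEMENT features point at the SAME dict object,
# which plays the role of the value tagged (1).
shared_agr = {"NUMBER": "SG", "PERSON": "3"}
reentrant = {
    "CAT": "S",
    "HEAD": {"AGREEMENT": shared_agr, "SUBJECT": {"AGREEMENT": shared_agr}},
}

# Equal values only: two separate dicts that merely look alike.
copied = {
    "CAT": "S",
    "HEAD": {
        "AGREEMENT": {"NUMBER": "SG", "PERSON": "3"},
        "SUBJECT": {"AGREEMENT": {"NUMBER": "SG", "PERSON": "3"}},
    },
}

def shares_agreement(fs):
    """True if HEAD AGREEMENT and HEAD SUBJECT AGREEMENT are one and the same node."""
    head = fs["HEAD"]
    return head["AGREEMENT"] is head["SUBJECT"]["AGREEMENT"]
```

The two structures compare equal with `==`, yet only the first is reentrant: information added to its shared node becomes visible along both paths at once.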

12 Reentrant DAGs
[Figure: DAG for [Cat S, Head [Agreement (1) [Number SG, Person 3], Subject [Agreement (1)]]], in which both Agreement edges lead to the same node (1)]

13 Paths in Feature Structures
It will also be useful to talk about paths through feature structures.
[Cat S, Head [Agreement (1) [Number SG, Person 3], Subject [Agreement (1)]]]
<HEAD AGREEMENT NUMBER> ?=? <HEAD SUBJECT AGREEMENT NUMBER>
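The question posed on this slide can be explored with a small path-following helper. This is a sketch under the assumption that feature structures are nested dicts; `follow` is a hypothetical helper name.

```python
def follow(fs, path):
    """Return the value reached by following a sequence of features, or None."""
    node = fs
    for feature in path:
        if not isinstance(node, dict) or feature not in node:
            return None
        node = node[feature]
    return node

# The reentrant structure from the slide; `agr` is the value tagged (1).
agr = {"NUMBER": "SG", "PERSON": "3"}
s = {"CAT": "S", "HEAD": {"AGREEMENT": agr, "SUBJECT": {"AGREEMENT": agr}}}

left = follow(s, ["HEAD", "AGREEMENT", "NUMBER"])
right = follow(s, ["HEAD", "SUBJECT", "AGREEMENT", "NUMBER"])
```

Because of the reentrancy, the two paths pass through the very same AGREEMENT node, so `left` and `right` are necessarily the same value.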

14 Unification of Feature Structures
Two main operations on feature structures are necessary to use them to enhance CFGs:
- merge the information content of two structures
- check the compatibility of two structures (reject if they are incompatible)
Unification is the computational technique that does both. It is an efficient and powerful operation.
Output: a new feature structure that is more specific (has more information) than, or is identical to, each of the input structures.

15 Unification: Features
We say two feature structures can be unified if the component features that make them up are compatible.
Same value: [number sg] U [number sg] = [number sg]
Different values: [number sg] U [number pl] = fails!
One unspecified value: [number sg] U [number [] ] = [number sg]

16 Unification: Feature Structures
Structures are compatible if they contain no features that are incompatible. If so, unification returns the union of all feature/value pairs.
Simple example: [number sg] U [person 3] = [number sg, person 3]

17 Unification Operation
[Agreement [Number sg], Subject [Agreement [Number sg]]]
U [Subject [Agreement [Person 3]]]
= [Agreement [Number sg], Subject [Agreement [Number sg, Person 3]]]
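The unification cases shown so far (same value, clashing values, unspecified value, union of features, nested structures) can be sketched as a short recursive function. This is a minimal illustration, not the lecture's implementation: it assumes nested dicts, uses `{}` for the unspecified value `[]`, ignores reentrancy, and the names `unify` and `UnificationFailure` are hypothetical.

```python
import copy

class UnificationFailure(Exception):
    """Raised when two feature structures are incompatible."""

def unify(a, b):
    """Return a new structure holding the information of both a and b."""
    if a == {}:                       # [] is compatible with anything
        return copy.deepcopy(b)
    if b == {}:
        return copy.deepcopy(a)
    if isinstance(a, dict) and isinstance(b, dict):
        result = copy.deepcopy(a)
        for feature, value in b.items():
            if feature in result:
                result[feature] = unify(result[feature], value)
            else:
                result[feature] = copy.deepcopy(value)
        return result
    if a == b:                        # identical atomic symbols
        return a
    raise UnificationFailure(f"{a!r} clashes with {b!r}")

# The nested example: the person information is merged under SUBJECT AGREEMENT.
left = {"agreement": {"number": "sg"},
        "subject": {"agreement": {"number": "sg"}}}
right = {"subject": {"agreement": {"person": "3"}}}
merged = unify(left, right)
```

Returning a fresh structure keeps both inputs unchanged, at the cost of copying.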

18 The Unification Operation
[Head [Subject [Agreement [Number PL]]]]
U [Cat S, Head [Agreement (1) [Number SG, Person 3], Subject [Agreement (1)]]]
= ?
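To work through this exercise with reentrancy taken into account, one option is a destructive unifier over graph nodes with forwarding pointers. The sketch below is an assumption-laden illustration rather than the course's algorithm: `Node`, `deref`, `unify`, `value_at`, `build_sentence` and `subject_number` are all hypothetical names, and a failed attempt may leave its inputs partially merged.

```python
class Node:
    """A DAG node: an atomic value, or a dict of labeled arcs to other nodes."""
    def __init__(self, atom=None, arcs=None):
        self.atom = atom
        self.arcs = arcs if arcs is not None else {}
        self.forward = None           # set when this node is merged into another

def deref(node):
    """Follow forwarding pointers to the node that now represents `node`."""
    while node.forward is not None:
        node = node.forward
    return node

def unify(n1, n2):
    """Destructively unify two DAGs; return the merged node, or None on failure."""
    n1, n2 = deref(n1), deref(n2)
    if n1 is n2:
        return n1
    if n1.atom is not None and n2.atom is not None:
        if n1.atom != n2.atom:
            return None               # clashing atomic values
        n1.forward = n2
        return n2
    if n1.atom is not None or n2.atom is not None:
        atomic, other = (n1, n2) if n1.atom is not None else (n2, n1)
        if other.arcs:
            return None               # an atom cannot unify with a complex node
        other.forward = atomic        # unspecified [] takes the atomic value
        return atomic
    n1.forward = n2                   # merge n1's arcs into n2
    for label, child in n1.arcs.items():
        if label in n2.arcs:
            if unify(child, n2.arcs[label]) is None:
                return None
        else:
            n2.arcs[label] = child
    return n2

def value_at(node, path):
    """Atomic value found at the end of a feature path."""
    node = deref(node)
    for label in path:
        node = deref(node.arcs[label])
    return node.atom

def build_sentence():
    """[Cat S, Head [Agreement (1) [Number SG, Person 3], Subject [Agreement (1)]]]"""
    agr = Node(arcs={"NUMBER": Node("SG"), "PERSON": Node("3")})
    head = Node(arcs={"AGREEMENT": agr, "SUBJECT": Node(arcs={"AGREEMENT": agr})})
    return Node(arcs={"CAT": Node("S"), "HEAD": head})

def subject_number(number):
    """[Head [Subject [Agreement [Number <number>]]]]"""
    agr = Node(arcs={"NUMBER": Node(number)})
    return Node(arcs={"HEAD": Node(arcs={"SUBJECT": Node(arcs={"AGREEMENT": agr})})})

plural_result = unify(subject_number("PL"), build_sentence())
singular_result = unify(subject_number("SG"), build_sentence())
```

Under these assumptions `plural_result` is `None`: the subject's agreement is the shared node (1), whose NUMBER is already SG, so PL clashes. The singular variant unifies without conflict.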

19 Properties of Unification
Monotonic: if some description is true of a feature structure, it will still be true after unifying it with another feature structure.
Order independent: given a set of feature structures to unify, we can unify them in any order and we’ll get the same result.

20 Features, Unification, and Grammars
We’ll incorporate all this into our grammars in two ways:
- We’ll assume that constituents (both lexical items and grammatical categories) are complex objects with associated feature structures.
- We’ll associate sets of unification constraints with grammar rules; the constraints must be satisfied for the rule to apply.
The unification constraints serve two purposes:
- to guide the composition of feature structures for larger grammatical constituents, based on the feature structures of their component parts
- to enforce compatibility constraints between specified parts of grammatical constructions

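21 Unification Constraints
Each grammar rule is paired with a { set of constraints }. Each constraint takes one of two forms, where βi and βk are constituents of the rule:
< βi feature path > = atomic value
< βi feature path > = < βk feature path >
In the second form, the values found at the end of the two paths must unify.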

22 Agreement
NP -> Det Nominal
  < Det AGREEMENT > = < Nominal AGREEMENT >
  < NP AGREEMENT > = < Nominal AGREEMENT >
Nominal -> Noun
  < Nominal AGREEMENT > = < Noun AGREEMENT >
Noun -> flight
  < Noun AGREEMENT NUMBER > = SG
Noun -> flights
  < Noun AGREEMENT NUMBER > = PL
Det -> this
  < Det AGREEMENT NUMBER > = SG
This only scratches the surface of the English agreement system.
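The agreement constraints on these rules can be traced by hand for a two-word NP. The following sketch hard-codes the three rules rather than parsing; `LEXICON`, `unify_flat` and `build_np` are hypothetical names, and the entry for "those" is an assumption added for contrast (it is not among the rules listed here).

```python
LEXICON = {
    "this":    {"CAT": "Det",  "AGREEMENT": {"NUMBER": "SG"}},
    "those":   {"CAT": "Det",  "AGREEMENT": {"NUMBER": "PL"}},  # assumed entry
    "flight":  {"CAT": "Noun", "AGREEMENT": {"NUMBER": "SG"}},
    "flights": {"CAT": "Noun", "AGREEMENT": {"NUMBER": "PL"}},
}

def unify_flat(a, b):
    """Unify two flat dicts of atomic values; None signals a clash."""
    merged = dict(a)
    for feature, value in b.items():
        if merged.setdefault(feature, value) != value:
            return None
    return merged

def build_np(det_word, noun_word):
    """Apply NP -> Det Nominal and Nominal -> Noun with their constraints."""
    det, noun = LEXICON[det_word], LEXICON[noun_word]
    if det["CAT"] != "Det" or noun["CAT"] != "Noun":
        return None
    # Nominal -> Noun: <Nominal AGREEMENT> = <Noun AGREEMENT>
    nominal_agreement = noun["AGREEMENT"]
    # NP -> Det Nominal: <Det AGREEMENT> = <Nominal AGREEMENT>
    agreement = unify_flat(det["AGREEMENT"], nominal_agreement)
    if agreement is None:
        return None                   # e.g. *this flights
    # <NP AGREEMENT> = <Nominal AGREEMENT>
    return {"CAT": "NP", "AGREEMENT": agreement}
```

For "this flight" the NP ends up with AGREEMENT [NUMBER SG]; for "this flights" the Det and Nominal agreement values clash and no NP is built.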

23 Example (to be added): syntactic tree for NP “this flight”, showing which feature structures are associated with each constituent

24 Dependency grammars and Lexicalized grammars
Head of a phrase: the features of most grammatical categories are copied from one of the children to the parent. The child that provides the features is called the head of the phrase.
NP -> Det Nominal
  < Det HEAD AGREEMENT > = < Nominal HEAD AGREEMENT >
  < NP HEAD > = < Nominal HEAD >
Nominal -> Noun
  < Nominal HEAD > = < Noun HEAD >
Noun -> flight
  < Noun HEAD AGREEMENT NUMBER > = SG
Noun -> flights
  < Noun HEAD AGREEMENT NUMBER > = PL

25 Sub-categorization
Verb -> lex-item
  [Verb HEAD SUBCAT <sub-cat. frame>]
Verb -> want
  <Verb HEAD SUBCAT> = < [CAT VP, FORM INFINITIVE] >
Verb -> leave
  <Verb HEAD SUBCAT> = < [CAT NP], [CAT PP] >
VP -> Verb NP PP
  <VP HEAD> = <Verb HEAD>
  <Verb HEAD SUBCAT> = < [CAT NP], [CAT PP] >
The features of most grammatical categories are copied from one of the children to its parent. Do this for all VP rules!
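A rough way to see how a SUBCAT frame restricts the complements of a verb is to compare the frame with the categories actually found. The sketch below uses plain feature matching as a simplified stand-in for unification; the `SUBCAT` table and `licenses` are hypothetical names, and the frames assigned to "want" and "leave" follow one reading of the slide.

```python
# Assumed frames: each verb lists the feature structures its complements
# must be compatible with, in order.
SUBCAT = {
    "want":  [{"CAT": "VP", "FORM": "INFINITIVE"}],
    "leave": [{"CAT": "NP"}, {"CAT": "PP"}],
}

def licenses(verb, complements):
    """True if the complements match the verb's subcategorization frame."""
    frame = SUBCAT[verb]
    if len(frame) != len(complements):
        return False
    return all(
        all(found.get(feature) == value for feature, value in required.items())
        for required, found in zip(frame, complements)
    )
```

A VP rule such as VP -> Verb NP PP would then apply only to verbs whose frame is compatible with an NP followed by a PP.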

26 Subcat is complex!
Small set of potential phrases
Large number of potential phrases
Each verb can have several subcats. Some subcats for “ask”:
Quo: asked “what was it like?”
NP: asking a question
NP NP: asked myself a question
Sto: ask him to tell you
…………
Not just verbs! COMLEX (1998) provides comprehensive subcategorization tagsets and subcat frames for verbs, adjectives and nouns.

27 Unification and Parsing
OK, let’s assume we’ve augmented our grammar with sets of path-like unification constraints. What changes do we need to make to a parser to make use of them?
- Building feature structures for grammatical constituents and associating them with a subtree
- Unifying feature structures as subtrees are created
- Blocking ill-formed constituents
Luckily, the order-independent nature of unification allows us to largely ignore the actual search strategy used by the parser.

28 Building Feature Structures
Notational conversion: express unification constraints as a feature structure (that can be passed directly to the unification algorithm).
E.g., S -> NP VP
  <NP HEAD AGREEMENT> = <VP HEAD AGREEMENT>
  <S HEAD> = <VP HEAD>
becomes
[S [head (1)], NP [head [agreement (2)]], VP [head (1) [agreement (2)]]]
So whenever this rule is applied, this feature structure can be used.
Features of most grammatical categories are copied from the head child to the parent (e.g., from V to VP, Nom to NP, N to Nom).

29 Unification and Earley Parsing
With respect to an Earley-style parser…
- Build feature structures (represented as DAGs) and associate them with states in the chart
- Unify feature structures as states are advanced in the chart
- Block ill-formed states from entering the chart

30 Augmenting States with DAGs
We just add a new field to the representation of the states: (dotted-rule, location, backpointers, DAG)
E.g., NP -> Det . Nominal [0,1], [SDet], DAG1
where DAG1 = [np [head (1)], det [head [agreement (2) [number sg]]], nominal [head (1) [agreement (2)]]]
SDet could have been: Det -> this . [0,1], [], [det [head [agreement [number sg]]]]

31 Unifying States and Blocking
Keep much of the Earley algorithm the same. We want to unify the DAGs of existing states as they are combined, as specified by the grammatical constraints.
Alter COMPLETER: when a new state is created (by combining other states),
- make sure the individual DAGs unify;
- add the new DAG (resulting from the unification) to the new state.
Also, don’t add states that have DAGs that are more specific than states already in the chart.
E.g.:
chart[k]: NP -> NP . PP [z,k] [..], dag2
chart[i]: PP -> Prep NP . [k,i] [..], dag1
chart[i]: NP -> NP PP . [z,i] [..], unify(dag1, dag2)

32 Creation of new state by combining existing states
Nothing worthwhile is accomplished by entering into a chart entry a state that is more specific than an “identical” state already in the chart.
Do not enqueue a new state if the chart already holds an identical state whose DAG is equal or more general.
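The "more specific than" test is a subsumption check: a structure subsumes another if everything it says also holds of the other. The sketch below assumes nested dicts with `{}` as the unspecified value and ignores reentrancy; `subsumes`, `should_enqueue` and the `(rule, span, dag)` state tuple are hypothetical simplifications of a real chart state.

```python
def subsumes(general, specific):
    """True if every piece of information in `general` is also in `specific`."""
    if general == {}:                 # the unspecified value subsumes everything
        return True
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            feature in specific and subsumes(value, specific[feature])
            for feature, value in general.items()
        )
    return general == specific        # atomic symbols must be identical

def should_enqueue(new_state, chart_entry):
    """Skip a state if an identical state with an equal or more general DAG exists."""
    rule, span, dag = new_state
    return not any(
        old_rule == rule and old_span == span and subsumes(old_dag, dag)
        for old_rule, old_span, old_dag in chart_entry
    )

entry = [("NP -> Det . Nominal", (0, 1), {"head": {"agreement": {}}})]
specific_state = ("NP -> Det . Nominal", (0, 1),
                  {"head": {"agreement": {"number": "sg"}}})
other_rule_state = ("NP -> . NP PP", (0, 0), {"head": {}})
```

Here `specific_state` would be blocked, since the chart already holds the same dotted rule and span with a more general DAG, while `other_rule_state` would be added.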

33 Types and Inheritance
Two problems with feature structures:
1. No way to place restrictions on feature values, so nothing rules out [NUMBER FEMININE] or [PERSON SG].
2. No way to capture generalizations across feature structures, e.g., sharing information among similar subcategories.
General solution: types.
- Each feature structure is labeled by a type.
- Conversely, each type has appropriateness conditions …
- Types are organized into type hierarchies, where more specific types inherit properties from more abstract ones.
- Unification is modified to also unify the types of feature structures.

34 Simple Types
Simple types (aka atomic types) replace simple atomic values (like SG or PL) in standard feature structures. E.g., a simple type agr can be the value of the AGREE(MENT) feature.
[Figure: a type hierarchy for the subtypes of type agr]

35 Complex Types
Complex types (like verb):
- specify a set of features appropriate for the type
- set restrictions on the values of the features
- specify equality constraints between values
E.g., verb specifies two features, AGREE and VFORM:
[verb AGREE agr, VFORM vform]
VFORM takes values of type vform, with subtypes finite, infinitive, gerund, base, present-participle, past-participle, and passive-participle.
Complex types are also part of type hierarchies. Subtypes of complex types inherit all the features of their parents, along with the constraints on their values.
Unification is modified to work with types:
[verb AGREE 1st, VFORM vform] U [verb-st AGREE sg, VFORM finite] = [... AGREE 1-sg ...]
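Unifying two simple types amounts to finding the most general type that is a subtype of both. The sketch below illustrates this on an assumed hierarchy under agr; the `PARENTS` table is a guess at a plausible hierarchy (the figure is not part of the transcript), and `ancestors` and `unify_types` are hypothetical names.

```python
# Assumed hierarchy: each type maps to its immediate supertypes.
PARENTS = {
    "agr": [],
    "1st": ["agr"], "3rd": ["agr"], "sg": ["agr"], "pl": ["agr"],
    "1-sg": ["1st", "sg"], "3-sg": ["3rd", "sg"],
    "1-pl": ["1st", "pl"], "3-pl": ["3rd", "pl"],
}

def ancestors(type_name):
    """All supertypes of a type, including the type itself."""
    found = {type_name}
    for parent in PARENTS[type_name]:
        found |= ancestors(parent)
    return found

def unify_types(a, b):
    """Most general common subtype of a and b, or None if there is none."""
    common = [t for t in PARENTS if a in ancestors(t) and b in ancestors(t)]
    most_general = [
        t for t in common
        if not any(other != t and other in ancestors(t) for other in common)
    ]
    return most_general[0] if len(most_general) == 1 else None
```

Under this hierarchy, unifying 1st with sg yields 1-sg, matching the AGREE value in the example, while sg and pl have no common subtype and fail to unify.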

36 Representing Syntactic Knowledge (English)
[Diagram: English Grammar; FSA; CFGs + F&U (recursion, e.g. NP -> NP PP); Agreement; Sub-categorization]
CFGs appear to be just about what we need to account for a lot of basic syntactic structure in English. But there are problems, such as agreement and sub-categorization, that can be dealt with adequately, although not elegantly, by staying within the CFG framework. There are simpler, more elegant solutions that take us out of the CFG framework (beyond its formal power). We will use feature structures and the constraint-based unification formalism.

37 Next Time
Project proposal deadline (bring your write-up to class)
Project proposal presentation (5 min):
- For content, follow the instructions at the course project web page
- Bring 15 handouts
- If you want to use PowerPoint, send me your presentation by …

