1
Starting With Complex Primitives Pays Off: Complicate Locally, Simplify Globally ARAVIND K. JOSHI Department of Computer and Information Science and Institute for Research in Cognitive Science
2
2 Outline Introduction Towards CLSG Syntactic description Semantic composition Statistical processing Psycholinguistic properties Applications to other domains Discourse structure Folded structure of biomolecular sequences Summary
3
3 Introduction Formal systems to specify a grammar formalism Start with primitives (basic primitive structures or building blocks) as simple as possible and then introduce various operations for constructing more complex structures Such systems are string rewriting systems, requiring string adjacency of function and argument Alternatively,
4
4 Introduction: CLSG Start with complex (more complicated) primitives which directly capture some crucial linguistic properties and then introduce some general operations for composing them -- Complicate Locally, Simplify Globally (CLSG) CLSG systems are structure rewriting systems, requiring structure adjacency of function and argument CLSG approach is characterized by localizing almost all complexity in the set of primitives, a key property
5
5 Introduction: CLSG – localization of complexity Specification of the set of complex primitives becomes the main task of a linguistic theory. CLSG pushes non-local dependencies to become local, i.e., they arise in the primitive structures to start with.
6
6 CLSG The CLSG approach has led to several new insights into: syntactic description; semantic composition; language generation; statistical processing; psycholinguistic properties; discourse structure
7
7 Context-free Grammars The domain of locality is the one-level tree -- the primitive building blocks. CFG, G: S → NP VP; VP → V NP; VP → VP ADV; NP → DET N; DET → the; N → man | car; V → likes; ADV → passionately [Figure: the one-level trees corresponding to the rules of G]
8
8 Context-free Grammars: Domain of Locality and Lexicalization The arguments of the predicate are not in the same local domain. They can be brought together in the same domain by introducing a rule S → NP V NP. However, the structure is then lost. Further, the local domains of a CFG are not necessarily lexicalized.
9
9 Towards CLSG: Lexicalization Lexical item One or more elementary structures (trees, directed acyclic graphs), which are syntactically and semantically encapsulated. Universal combining operations Grammar Lexicon
10
10 Lexicalized Grammars Context-free grammar (CFG) CFG, G: S → NP VP; VP → V NP; VP → VP ADV (non-lexical); NP → Harry; NP → peanuts; V → likes; ADV → passionately (lexical) [Figure: derived tree for "Harry likes peanuts passionately"]
11
11 Weak Lexicalization Greibach Normal Form (GNF): CFG rules are of the form A → a B_1 B_2 ... B_n or A → a. This lexicalization gives the same set of strings but not the same set of trees, i.e., not the same set of structural descriptions. Hence, it is a weak lexicalization.
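The GNF rule shape can be checked mechanically. The sketch below is only an illustration: the rule encoding (uppercase strings for nonterminals, lowercase strings for terminals) and the function names are assumptions, and the GNF grammar `g_gnf` is a hand-written grammar for the strings a, aa, aaa, ..., not one taken from the slides.

```python
# Assumed encoding (not from the slides): a rule is (lhs, rhs_symbols);
# uppercase strings are nonterminals, lowercase strings are terminals.

def is_gnf_rule(lhs, rhs):
    """Return True if the rule has the shape A -> a B1 ... Bn with n >= 0."""
    if not rhs:
        return False
    first, rest = rhs[0], rhs[1:]
    return first.islower() and all(symbol.isupper() for symbol in rest)

def is_gnf(rules):
    """Return True if every rule of the grammar is in Greibach Normal Form."""
    return all(is_gnf_rule(lhs, rhs) for lhs, rhs in rules)

# G: S -> S S (non-lexical), S -> a (lexical)
g = [("S", ["S", "S"]), ("S", ["a"])]

# A hand-written GNF grammar generating the same strings a, aa, aaa, ...
g_gnf = [("S", ["a", "S"]), ("S", ["a"])]

print(is_gnf(g))      # False: S -> S S does not start with a terminal
print(is_gnf(g_gnf))  # True
```

Every tree of `g_gnf` branches to the right only, while G also has left-branching and balanced trees, which is why this kind of lexicalization is weak.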
12
12 Strong Lexicalization Same set of strings and same set of trees or structural descriptions. Tree substitution grammars (TSG) –Increased domain of locality –Substitution as the only combining operation
13
13 Substitution [Figure: a tree whose root is labeled X is substituted at a frontier node labeled X of another tree]
14
14 Strong Lexicalization Tree substitution grammars (TSG) CFG, G: S → NP VP; VP → V NP; NP → Harry; NP → peanuts; V → likes. TSG, G': tree 1: [S NP [VP [V likes] NP]]; tree 2: [NP Harry]; tree 3: [NP peanuts]
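Substitution on these three elementary trees can be sketched in a few lines. The tree encoding below is an assumption made for illustration: an internal node is a `(label, children)` tuple, a lexical leaf is a plain string, and an open substitution site is `(label, None)`.

```python
def substitute(tree, initial):
    """Fill the leftmost open site whose label matches the root of `initial`.

    Returns (new_tree, done), where done says whether a site was filled.
    """
    if isinstance(tree, str):            # lexical leaf
        return tree, False
    label, children = tree
    if children is None:                 # open substitution site
        if label == initial[0]:
            return initial, True
        return tree, False
    new_children, done = [], False
    for child in children:
        if not done:
            child, done = substitute(child, initial)
        new_children.append(child)
    return (label, new_children), done

def leaves(tree):
    """Left-to-right frontier of a tree (open sites show their label)."""
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    if children is None:
        return [label]
    return [word for child in children for word in leaves(child)]

# Elementary trees of G' (hypothetical variable names)
likes = ("S", [("NP", None), ("VP", [("V", ["likes"]), ("NP", None)])])
harry = ("NP", ["Harry"])
peanuts = ("NP", ["peanuts"])

t, _ = substitute(likes, harry)     # fills the subject NP
t, _ = substitute(t, peanuts)       # fills the object NP
print(" ".join(leaves(t)))          # Harry likes peanuts
```

The tree anchored on likes already contains both argument positions, so the predicate and its arguments share one local domain before any substitution happens.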
15
15 Insufficiency of TSG Formal insufficiency of TSG. CFG, G: S → S S (non-lexical); S → a (lexical). TSG, G': trees 1 and 2 each expand S → S S with one daughter further expanded as S → a; tree 3 is S → a
16
16 Insufficiency of TSG [Figure: the elementary trees 1, 2, and 3 of G', and a tree of G that grows on both sides of the root] G' can generate all strings of G but not all trees of G. CFGs cannot be lexicalized by TSGs, i.e., by substitution alone.
17
17 Adjoining [Figure: an auxiliary tree with root X and foot node X* is adjoined to another tree at the node labeled X in that tree]
18
18 With Adjoining G: S → S S; S → a. G': tree 1: [S S* [S a]]; tree 2: [S [S a] S*]; tree 3: [S a]. Adjoining tree 2 to tree 3 at the S node (the root node), and then adjoining tree 1 to the S node of the derived tree, gives a tree of G with yield a a a. CFGs can be lexicalized by LTAGs. Adjoining is crucial for lexicalization. Adjoining arises out of lexicalization.
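A minimal sketch of adjoining for the grammar S → S S, S → a follows. The encoding is an assumption: a node is `(label, children)`, a terminal is a string, and a foot node carries a trailing `*` in its label. The names `aux_left_foot` and `aux_right_foot` are hypothetical, and the node chosen for the second adjunction is also an assumption, since the figure did not survive extraction.

```python
def plug_foot(aux, subtree):
    """Return a copy of `aux` with its foot node replaced by `subtree`."""
    if isinstance(aux, str):
        return aux
    label, children = aux
    if label.endswith("*"):
        return subtree
    return (label, [plug_foot(child, subtree) for child in children])

def adjoin(tree, path, aux):
    """Adjoin `aux` at the node of `tree` reached via the child indices in `path`."""
    if not path:
        assert aux[0] == tree[0], "aux root must match the adjunction site"
        # The subtree at the site is excised and re-attached at the foot of aux.
        return plug_foot(aux, tree)
    label, children = tree
    i = path[0]
    return (label, children[:i] + [adjoin(children[i], path[1:], aux)] + children[i + 1:])

def yield_of(tree):
    """Left-to-right terminal string of a tree."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in yield_of(child)]

initial = ("S", ["a"])                               # initial tree
aux_right_foot = ("S", [("S", ["a"]), ("S*", [])])   # anchor left, foot right
aux_left_foot = ("S", [("S*", []), ("S", ["a"])])    # foot left, anchor right

step1 = adjoin(initial, (), aux_right_foot)          # adjoin at the root
step2 = adjoin(step1, (1,), aux_left_foot)           # adjoin at the right S daughter
print(step2)
print(" ".join(yield_of(step2)))                     # a a a
```

Because auxiliary trees can be inserted at interior nodes, the derived trees can grow on either side of any S node, which substitution alone cannot achieve with lexicalized trees.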
19
19 Lexicalized TAG (LTAG) Finite set of elementary trees anchored on lexical items -- extended projections of lexical anchors -- which encapsulate syntactic and semantic dependencies. Elementary trees: initial and auxiliary. Operations: substitution and adjoining. Derivation: –Derivation tree: how the elementary trees are put together. –Derived tree
20
20 Localization of Dependencies agreement: person, number, gender; subcategorization: sleeps: null; eats: NP; gives: NP NP; thinks: S; filler-gap: who did John ask Bill to invite e; word order: within and across clauses, as in scrambling and clitic movement; function–argument: all arguments of the lexical anchor are localized
21
21 Localization of Dependencies word-clusters (flexible idioms): non-compositional aspect, e.g., take a walk, give a cold shoulder to; word co-occurrences; lexical semantic aspects; statistical dependencies among heads; anaphoric dependencies
22
22 LTAG: Examples [Figure: two elementary trees for likes: transitive and object extraction] Some other trees for likes: subject extraction, topicalization, subject relative, object relative, passive, etc.
23
23 LTAG: A derivation [Figure: the object-extraction tree for likes, the auxiliary trees for think and does (each with foot node S*), and the NP trees for who, Harry, and Bill]
24
24 LTAG: A Derivation [Figure: the same elementary trees, with arrows marking substitution and adjoining] who does Bill think Harry likes
25
25 LTAG: Derived Tree [Figure: derived tree] who does Bill think Harry likes
26
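26 LTAG: Derivation Tree who does Bill think Harry likes [Figure: derivation tree rooted in likes, with who, think, and Harry attached to likes, and does and Bill attached to think; the line style distinguishes substitution from adjoining] * Compositional semantics on this derivation structure * Related to dependency diagrams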
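The relation between the derivation tree and a dependency diagram can be made concrete by flattening the tree into head-dependent pairs. This is a sketch under assumptions: the tuple encoding is hypothetical, and the substitution/adjoining labels are inferred from the elementary trees on the earlier slides (NP trees substitute, trees with an S* foot node adjoin), since the line styles of the figure were lost.

```python
# Derivation tree as (anchor, operation attaching it to its parent, children).
derivation = ("likes", None, [
    ("who", "substitution", []),
    ("think", "adjoining", [
        ("does", "adjoining", []),
        ("Bill", "substitution", []),
    ]),
    ("Harry", "substitution", []),
])

def dependencies(node):
    """Flatten a derivation tree into (head, dependent, operation) triples."""
    head, _, children = node
    triples = []
    for child in children:
        triples.append((head, child[0], child[1]))
        triples.extend(dependencies(child))
    return triples

for head, dependent, operation in dependencies(derivation):
    print(f"{head} -> {dependent} ({operation})")
```

Unlike the derived tree, this structure records which elementary tree combined with which, so each arc links a lexical anchor to one of its arguments or modifiers.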
27
27 Topology of Elementary Trees: Nested Dependencies [Figure: elementary trees and a derived tree] The topology of the elementary trees determines the nature of the dependencies described by the TAG grammar G, which generates a a a … b b b
28
28 Topology of Elementary Trees: Crossed dependencies [Figure: elementary trees, one with foot node S*] The topology of the elementary trees determines the kinds of dependencies that can be characterized: b is one level below a and to the right of the spine
29
29 Topology of Elementary Trees: Crossed dependencies [Figure: elementary trees and the derived tree] Linear structure: a a b b
30
30 Examples: Nested Dependencies Center embedding of relative clauses in English: The rat_1 the cat_2 chased_2 ate_1 the cheese. Center embedding of complement clauses in German: Hans_1 Peter_2 Marie_3 schwimmen_3 lassen_2 sah_1 (Hans saw Peter make Marie swim)
31
31 Examples: Crossed Dependencies Center embedding of complement clauses in Dutch: Jan_1 Piet_2 Marie_3 zag_1 laten_2 zwemmen_3 (Jan saw Piet make Marie swim). It is possible to obtain a wide range of complex dependencies, i.e., complex combinations of nested and crossed dependencies. Such patterns arise in word order phenomena such as scrambling and clitic climbing, and also due to scope ambiguities.
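The difference between the two patterns is purely one of linear order, which a short sketch can show. The function names are hypothetical; noun i is assumed to depend on verb i, as the indices in the German and Dutch examples indicate.

```python
def nested_order(nouns, verbs):
    """N1 N2 N3 V3 V2 V1: the innermost pair is closed first."""
    return nouns + verbs[::-1]

def crossed_order(nouns, verbs):
    """N1 N2 N3 V1 V2 V3: verbs follow in the same order as their nouns."""
    return nouns + verbs

def dependency_arcs(n, pattern):
    """(noun_position, verb_position) pairs for n noun-verb dependencies."""
    if pattern == "nested":
        return [(i, 2 * n - 1 - i) for i in range(n)]
    if pattern == "crossed":
        return [(i, n + i) for i in range(n)]
    raise ValueError(f"unknown pattern: {pattern}")

def arcs_cross(arc1, arc2):
    """True if the two arcs overlap without one containing the other."""
    (a, b), (c, d) = sorted([arc1, arc2])
    return a < c < b < d

german = nested_order(["Hans", "Peter", "Marie"], ["sah", "lassen", "schwimmen"])
dutch = crossed_order(["Jan", "Piet", "Marie"], ["zag", "laten", "zwemmen"])
print(" ".join(german))   # Hans Peter Marie schwimmen lassen sah
print(" ".join(dutch))    # Jan Piet Marie zag laten zwemmen

nested = dependency_arcs(3, "nested")
crossed = dependency_arcs(3, "crossed")
print(any(arcs_cross(x, y) for x in nested for y in nested))    # False
print(all(arcs_cross(x, y) for x in crossed for y in crossed if x < y))  # True
```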
32
32 LTAG: Some Important Properties Factoring recursion from the domain of dependencies (FRD) and extended domain of locality (EDL). All interesting properties of LTAG follow from FRD and EDL: mathematical, linguistic, and processing. LTAGs belong to the class of so-called mildly context-sensitive grammars. Automaton equivalent of TAG: embedded pushdown automaton (EPDA)
33
33 Processing of crossed and nested dependencies Crossed dependencies (CD): Jan_1 Piet_2 Marie_3 zag_1 laten_2 zwemmen_3 (Jan saw Piet make Marie swim). Nested dependencies (ND): Hans_1 Peter_2 Marie_3 schwimmen_3 lassen_2 sah_1 (Hans saw Peter make Marie swim). CDs are easier to process (about one-half) than NDs (Bach, Brown, and Marslen-Wilson, 1986). Principle of partial interpretation (PPI). The EPDA model correctly predicts the BBM results (Joshi, 1990).
34
34 Some Important Properties of LTAG Extended domain of locality (EDL) –Localizing dependencies –The set of elementary trees is the domain for specifying linguistic constraints. Factoring recursion from the domain of dependencies (FRD). All interesting properties of LTAG follow from EDL and FRD: mathematical, linguistic, and processing. LTAG belongs to the class of mildly context-sensitive grammars.
35
35 A different perspective on LTAG Treat the elementary trees associated with a lexical item as if they were super parts of speech (super-POS or supertags). Local statistical techniques have been remarkably successful in disambiguating standard POS. Apply these techniques to disambiguating supertags -- almost parsing
36
36 Supertag disambiguation -- supertagging Given a corpus parsed by an LTAG grammar –we have statistics of supertags -- unigram, bigram, trigram, etc. –these statistics combine the lexical statistics as well as the statistics of the constructions in which the lexical items appear
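Collecting such n-gram statistics from a supertagged corpus amounts to counting windows over each sentence's supertag sequence. The sketch below assumes a corpus already reduced to lists of supertag labels; the labels `A` and `B` and the start-of-sentence pad `<s>` are placeholders, not supertags from an actual LTAG grammar.

```python
from collections import Counter

def ngram_counts(sequences, n):
    """Count supertag n-grams, padding each sentence start with n-1 '<s>' symbols."""
    counts = Counter()
    for seq in sequences:
        padded = ["<s>"] * (n - 1) + list(seq)
        for i in range(len(seq)):
            counts[tuple(padded[i:i + n])] += 1
    return counts

# Toy corpus: each inner list is the supertag sequence of one parsed sentence.
corpus = [["A", "B", "A"], ["A", "B"]]

unigrams = ngram_counts(corpus, 1)
bigrams = ngram_counts(corpus, 2)
trigrams = ngram_counts(corpus, 3)

print(unigrams[("A",)])            # 3
print(bigrams[("A", "B")])         # 2
print(trigrams[("<s>", "A", "B")]) # 2
```

Since every supertag is an elementary tree, each count is a statistic over a lexical item together with the construction it appears in.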
37
37 Supertagging Example sentence: the purchase price includes two ancillary companies. On average, a lexical item has about 8 to 10 supertags.
38
38 Supertagging the purchase price includes two ancillary companies - Select the correct supertag for each word -- shown in green - Correct supertag for a word means the supertag that corresponds to that word in the correct parse of the sentence
39
39 Supertagging -- performance - Performance of a trigram supertagger on the WSJ corpus, Srinivas (1997), Chen (2002)
Size of the training corpus | Size of the test corpus | # of words correctly supertagged | % correct
Baseline | 47,000 | 35,391 | 75.3%
1 million | 47,000 | 43,334 | 92.2%
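The percentages in the table follow directly from the word counts, as the short check below shows. The variable names are illustrative, and the relative error reduction is a derived quantity that the slide itself does not report.

```python
test_size = 47_000
correct = {
    "baseline": 35_391,
    "trigram, 1 million training words": 43_334,
}

# Accuracy = correctly supertagged words / test-corpus size.
accuracy = {name: 100 * n / test_size for name, n in correct.items()}
for name, value in accuracy.items():
    print(f"{name}: {value:.1f}%")

# Derived figure (not on the slide): share of baseline errors that the trigram model removes.
errors = {name: test_size - n for name, n in correct.items()}
reduction = 1 - errors["trigram, 1 million training words"] / errors["baseline"]
print(f"relative error reduction: {100 * reduction:.1f}%")
```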
40
40 Abstract character of supertagging Complex (richer) descriptions of primitives (anchors) –contrary to the standard mathematical convention, in which descriptions of primitives are simple and complex descriptions are made from simple descriptions. Associate with each primitive all the information associated with it
41
41 Complex descriptions of primitives Making descriptions of primitives more complex –increases the local ambiguity, i.e., there are more descriptions for each primitive –however, these richer descriptions of primitives locally constrain each other –analogy to a jigsaw puzzle -- the richer the description of each primitive the better
42
42 Complex descriptions of primitives Making the descriptions of primitives more complex –allows statistics to be computed over these complex descriptions –these statistics are more meaningful –local statistical computations over these complex descriptions lead to robust and efficient processing
43
43 Flexible Composition: Adjoining as Wrapping [Figure: a tree split at node X into two components, the supertree at X and the subtree at X]
44
44 Flexible Composition: Adjoining as Wrapping [Figure: the two components of the split tree, the supertree at X and the subtree at X, are wrapped around another tree]
45
45 Flexible Composition: Wrapping as substitutions and adjunctions [Figure: the likes tree with NP(wh) and trace e, and the think tree with foot node S*, combined by substitution and adjoining] - We can also view this composition as the likes tree wrapped around the think tree - Non-directional composition
46
46 Adjoining as Wrapping: Wrapping as substitutions and adjunctions [Figure: the likes tree split at S into two components, shown with the think tree] The supertree and the subtree are the two components of the likes tree: the supertree is attached (adjoined) to the root node S of the think tree, and the subtree is attached (substituted) at the foot node S of the think tree
47
47 Multi-component LTAG (MC-LTAG) - The components are used together in one composition step with the individual components being composed with either substitution or adjoining - The representation can be used for both -- predicate argument relationships -- scope information - The two pieces of information are together before the single composition step - However, after the composition there may be intervening material between the components
48
48 Tree-Local Multi-component LTAG (MC-LTAG) - How can the components of MC-LTAG compose while preserving the locality of LTAG? - Tree-Local MC-LTAG -- Components of a set compose only with an elementary tree or an elementary component - Flexible composition - Tree-Local MC-LTAGs are weakly equivalent to LTAGs - However, Tree-Local MC-LTAGs provide structural descriptions not obtainable by LTAGs - Increased strong generative power
49
49 Scope ambiguities: Example [Figure: two-component tree sets for every and some (an S* scope component and an NP component each), the tree for hates, and the N trees for student and course] (every student hates some course)
50
50 Derivation with scope information: Example [Figure: the same elementary trees, with the composition steps marked] (every student hates some course)
51
51 Derivation tree with scope information: Example [Figure: derivation tree rooted in (hates), with nodes (E), (every), (some), (S), (student), and (course) attached at the addresses shown] (every student hates some course) - (E) and (S) are both adjoined at the root of (hates) - They can be adjoined in any order - (E) will outscope (S) if (E) is adjoined before (S) - Scope information is represented in the LTAG system itself
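Because the two scope components can be adjoined at the same root in either order, the derivation leaves the scope reading underspecified, and the readings can be enumerated as orderings. The sketch below is a toy illustration: the function name is hypothetical, and it adopts the slide's stated convention that the component adjoined first outscopes the one adjoined later.

```python
from itertools import permutations

def scope_readings(quantifiers):
    """One reading per adjunction order of the scope components.

    Assumed convention (from the slide): a component adjoined earlier
    outscopes a component adjoined later.
    """
    return [" > ".join(order) for order in permutations(quantifiers)]

# every student hates some course
for reading in scope_readings(["every", "some"]):
    print(reading)
```

With two quantifiers this yields the two readings of the example sentence; with k quantifiers adjoined at the same node it yields k! orderings.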
52
52 Competence/Performance Distinction: A New Twist For a property, P, of language, how does one decide whether P is a competence or a performance property? The answer is not given a priori. It depends on the formal devices (grammars and corresponding machines) available for describing language.
53
53 Competence/Performance Distinction: A New Twist With MC-LTAG and flexible composition, all word order patterns up to two levels of embedding can be described with correct structural descriptions assigned, i.e., with correct semantics. Examples: center embedding of complement clauses, clitic movement, scope ambiguities, etc. Beyond two levels of embedding, although all word order patterns can be described, there is no guarantee that correct semantics can be assigned to all strings. No corresponding result is known so far for center embedding of relative clauses as in English.
54
54 Summary Complex primitive structures (building blocks) CLSG: Complicate Locally, Simplify Globally CLSG makes non-local dependencies become local, i.e., they are encapsulated in the primitive building blocks New insights into Syntactic description Semantic composition Statistical processing Psycholinguistic properties Applications to other domains