Modeling Long-Distance Dependencies in Double R
July 2008
Jerry Ball
Human Effectiveness Directorate
Air Force Research Laboratory

Double R Model
Goal: Model the basic grammatical patterns of English to support development of cognitively plausible and functional language comprehension systems
  – Declaratives – “The man hit the ball”
  – Questions
      Yes-No Questions – “Did the man hit the ball?”
      Wh Questions – “Where did the man hit the ball?”
  – Imperatives – “Hit the ball!”
  – Relative Clauses – “The ball that the man hit”
  – Wh Clauses – “I know where the man hit the ball”
  – Passive constructions – “The ball was hit”

Empirical Evidence
Basic grammatical patterns have been most extensively studied in generative grammar
  – The focus in generative grammar has been on studying the syntactic form of linguistic expressions in isolation from meaning and processing
The “Simpler Syntax” of Culicover and Jackendoff (2005) restores the consideration of meaning, simplifying syntax as a side effect
O’Grady’s “Syntactic Carpentry” (2005) integrates processing as well (see also Hawkins, 2004)
Reference grammars (Huddleston & Pullum, 2002; Quirk et al., 1985) provide a wealth of examples which integrate form, function and meaning

Long-Distance Dependencies
Long-distance dependencies are the sine qua non of modern linguistic theorizing
  – An important motivation for Chomsky’s transformational grammar: deep structures with arguments in place are mapped to surface structures with arguments “moved” by various transformations
      Introduction of traces supported the collapsing of deep and surface structure; traces mark the original location
      Construction-specific transformations were generalized to Move, subject to universal, parameterized constraints
  – Many basic grammatical constructions involve long-distance dependencies
      Wh questions, relative clauses, passive constructions…
  – Require retention of grammatical information for extended stretches of input

Long-Distance Dependencies
Binding of pronouns and anaphors:
  – Anaphors (“himself”) vs. pronouns (“him”)
      John_i kicked himself_i (i = i) (Principle A of GB Theory)
      John_i kicked him_j (i ≠ j) (Principle B of GB Theory)
  – Proper binding often requires use of semantic information (but considered syntactic in generative grammar)
      John_i and Mary_j were talking. She_j told him_i … (gender)
      John_i is reading a book_j. It_j is about… (animacy)
      John_i is reading the comics_j. They_j are… (number)
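The gender, animacy and number examples above can be illustrated with a minimal Python sketch. The `Referent` class, its field names and the `compatible_antecedents` function are hypothetical and are not part of Double R. The sketch only filters candidate antecedents by feature agreement and ignores the structural constraints of Principles A and B.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Referent:
    """Hypothetical feature bundle for a referring expression (not a Double R chunk)."""
    text: str
    number: str                      # "sing" or "plur"
    animate: Optional[bool] = None   # None = unspecified (e.g. "they")
    gender: Optional[str] = None     # None = unspecified or not applicable


def agrees(pronoun: Referent, candidate: Referent) -> bool:
    """True if the candidate matches every feature the pronoun specifies."""
    return (
        candidate.number == pronoun.number
        and (pronoun.animate is None or candidate.animate == pronoun.animate)
        and (pronoun.gender is None or candidate.gender == pronoun.gender)
    )


def compatible_antecedents(pronoun: Referent, candidates: List[Referent]) -> List[str]:
    """Return the text of each candidate that agrees with the pronoun."""
    return [c.text for c in candidates if agrees(pronoun, c)]


john = Referent("John", "sing", animate=True, gender="male")
mary = Referent("Mary", "sing", animate=True, gender="female")
book = Referent("a book", "sing", animate=False)
comics = Referent("the comics", "plur", animate=False)

she = Referent("she", "sing", animate=True, gender="female")
it = Referent("it", "sing", animate=False)
they = Referent("they", "plur")

print(compatible_antecedents(she, [john, mary]))     # gender decides
print(compatible_antecedents(it, [john, book]))      # animacy decides
print(compatible_antecedents(they, [john, comics]))  # number decides
```

In each call exactly one candidate survives, which mirrors the point of the slide: the pronoun is bound using semantic features rather than purely syntactic information.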

Long-Distance Dependencies
Verb Control
  – Object Control: “He_i persuaded me_j PRO_j to go”
      PRO_j is an “implicit” pronoun (a trace without movement)
  – Subject Control: “He_i promised me_j PRO_i to go”
Raising Verbs
  – “He_i seems t_i to like me”
      t_i is a trace of a “raised” argument

Long-Distance Dependencies
Passive Constructions
  – “The ball_i was kicked t_i by the man”
      The object is “raised” out of its normal position and the subject is pushed into an oblique complement position “by the man”
Wh Questions
  – “Who_i did John_j decide PRO_j to see t_i?”
Relative Clauses
  – “The ball_i that the man kicked t_i”

Modeling Long-Distance Dependencies
An ontology of DM (declarative memory) chunk types supports the grammatical distinctions
Productions match buffer elements at the appropriate level of the ontology given the function of the production, e.g.
  – Production matches pronoun “he…” → project nominal and put in subject buffer
  – Production matches predicate specifier (e.g. “…is…”) → project a declarative clause
  – Production matches declarative clause and a nominal in subject buffer (e.g. “he is…”) → integrate the nominal as the subject of the clause
  – Production matches transitive verb (e.g. “hitting”) functioning as clausal head (e.g. “he is hitting…”) and a nominal (e.g. “…the ball”) → integrate the nominal as the object of the verb
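The idea of matching buffer contents at a chosen level of the ontology can be sketched in Python as follows. The `ONTOLOGY` fragment, the `focus` buffer name and the dictionary representation of chunks are all assumptions made for illustration; the actual model is written as ACT-R productions, and its type hierarchy is not given on the slide.

```python
# Hypothetical fragment of a chunk-type ontology: child type -> parent type.
ONTOLOGY = {
    "pronoun": "nominal",
    "nominal": "refer-expr",
    "decl-clause": "clause",
    "clause": "refer-expr",
    "refer-expr": None,
}


def isa(chunk_type, target):
    """True if chunk_type equals target or inherits from it in ONTOLOGY."""
    while chunk_type is not None:
        if chunk_type == target:
            return True
        chunk_type = ONTOLOGY.get(chunk_type)
    return False


def integrate_subject(buffers):
    """Sketch of one production: a clause in focus plus a nominal in the subject buffer.

    The test is written against the general types "clause" and "nominal",
    so it also fires for the more specific "decl-clause" and "pronoun".
    """
    clause = buffers.get("focus")
    subject = buffers.get("subject")
    if (clause and subject
            and isa(clause["type"], "clause")
            and isa(subject["type"], "nominal")):
        clause["subject"] = subject["word"]
        buffers["subject"] = None  # the nominal has now been integrated
        return True
    return False


# "he is…": pronoun already in the subject buffer, declarative clause just projected.
buffers = {
    "subject": {"type": "pronoun", "word": "he"},
    "focus": {"type": "decl-clause", "word": "is"},
}
fired = integrate_subject(buffers)
print(fired, buffers["focus"])
```

Writing the condition against a general type keeps the number of rules small, while more specific rules can still test for a narrower type when the construction requires it.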

Ontology of Situation Referring Expressions
Decl-sit-refer-expr
Yes-no-quest-sit-refer-expr – “Is he going?”
Wh-quest-sit-refer-expr – “Where did he go?”
Imp-sit-refer-expr – “Don’t go!”
Wh-sit-refer-expr – “I know where he went”
Rel-sit-refer-expr – “The book that you like”
Note: Situation Referring Expression corresponds to Clause in other approaches
What are the grammatical cues that trigger recognition of an expression type?
These cues need to be accessible!

Slots in Referring Expressions
Bind-indx (all referring expression types)
  – Identifier for referring expression
Parent (all chunk types)
  – Links child to parent chunk
  – Used to avoid multiply integrating chunk into other chunks
Token (all chunk types)
  – Distinguishes types from tokens (and type-tokens)
Grammatically relevant semantic info
  – Animate (all object referring expression types)
  – Gender (all animate referring expression types)
  – Number (all object referring expression types)
  – Person (all object referring expression types)
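A minimal Python sketch of these slots, and of how the parent slot can guard against integrating a chunk twice, might look like this. The `Chunk` class, the string values used for `token`, the `children` list and the `integrate` function are hypothetical; only the slot names come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for a DM chunk with the slots listed above."""
    name: str
    bind_indx: Optional[str] = None   # identifier for the referring expression
    parent: Optional[str] = None      # name of the chunk this one is integrated into
    token: str = "token"              # assumed values: "type", "token", "type-token"
    animate: Optional[bool] = None
    gender: Optional[str] = None
    number: Optional[str] = None
    person: Optional[str] = None
    children: List[str] = field(default_factory=list)  # not a slide slot; for the demo only


def integrate(child: Chunk, parent: Chunk) -> bool:
    """Attach child to parent unless the child already has a parent."""
    if child.parent is not None:
        return False  # already integrated elsewhere, so refuse
    child.parent = parent.name
    parent.children.append(child.name)
    return True


ball = Chunk("the-ball", bind_indx="j", animate=False, number="sing", person="third")
clause_1 = Chunk("clause-1")
clause_2 = Chunk("clause-2")

first = integrate(ball, clause_1)    # succeeds
second = integrate(ball, clause_2)   # refused: parent slot is already filled
print(first, second, ball.parent)
```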

Recognizing Wh-Quest and Wh-Situation Referring Expressions

“Where did he…?”

    (p cog-process-obj-refer-expr-->project-wh-quest-sit-refer-expr
       =goal>
          isa process-obj-refer-expr
       =wh-focus>
          isa wh-refer-expr              ;; "where"
       =most-recent-child-sre-head>
          isa operator-pred-spec         ;; "did"
       =retrieval-2>
          isa obj-refer-expr             ;; "he"
       =subject>
          isa nothing
       =context>
          isa context
          - sit-context "wh-quest-sit-refer-expr"
    ==>
       ;; project wh-quest-sit-refer-expr
    )

“…where he went”

    (p cog-process-pred-type-->project-wh-sit-refer-expr
       =goal>
          isa process-pred-type
       =wh-focus>
          isa wh-refer-expr              ;; "where"
       =subject>
          isa refer-expr                 ;; "he"
       =retrieval-2>
          isa pred-type                  ;; "went"
       =context>
          isa context
          - sit-context "wh-sit-refer-expr"
          - sit-context "wh-quest-sit-refer-expr"
    ==>
       ;; project wh-sit-refer-expr
    )

Note: the more grammatical cues, the greater the likelihood of being correct!
“Who kicked…?” “Where the heck is...?” “Why is there…?”
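The logic of the two productions can be paraphrased in Python to make the role of each cue explicit. This is a sketch rather than the model's code: the `recognize_sit_refer_expr` function is hypothetical, buffers are reduced to a dictionary of type names, and exact string equality stands in for ACT-R's `isa` test against the ontology.

```python
def recognize_sit_refer_expr(goal, buffers, sit_context=None):
    """Return the situation referring expression type to project, or None."""
    wh_fronted = buffers.get("wh-focus") == "wh-refer-expr"

    # "Where did he…?": fronted wh + operator + nominal, no subject yet.
    if (goal == "process-obj-refer-expr" and wh_fronted
            and buffers.get("most-recent-child-sre-head") == "operator-pred-spec"
            and buffers.get("retrieval-2") == "obj-refer-expr"
            and buffers.get("subject") is None
            and sit_context != "wh-quest-sit-refer-expr"):
        return "wh-quest-sit-refer-expr"

    # "…where he went": fronted wh + subject + predicate, not already in a wh context.
    if (goal == "process-pred-type" and wh_fronted
            and buffers.get("subject") == "refer-expr"
            and buffers.get("retrieval-2") == "pred-type"
            and sit_context not in ("wh-sit-refer-expr", "wh-quest-sit-refer-expr")):
        return "wh-sit-refer-expr"

    return None


where_did_he = {
    "wh-focus": "wh-refer-expr",                         # "where"
    "most-recent-child-sre-head": "operator-pred-spec",  # "did"
    "retrieval-2": "obj-refer-expr",                     # "he"
    "subject": None,
}
where_he_went = {
    "wh-focus": "wh-refer-expr",   # "where"
    "subject": "refer-expr",       # "he"
    "retrieval-2": "pred-type",    # "went"
}

print(recognize_sit_refer_expr("process-obj-refer-expr", where_did_he))
print(recognize_sit_refer_expr("process-pred-type", where_he_went))
```

The negated context tests keep a rule from firing again once the corresponding expression has been projected, which is why passing the already-projected type as `sit_context` yields `None`.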

Modeling Long-Distance Dependencies
Model needs simultaneous access to multiple grammatical elements
  – Serial retrieval from DM is not a viable option
  – Buffers support simultaneous access: buffers on left-hand side of production constitute focus of attention, limited to ~4 (Cowan, 2000) besides goal and context buffers
  – Can’t predict in advance of production selection which grammatical elements will be needed
  – Buffers and productions are functionally motivated: they are needed in the processing of various constructions
A model with fewer buffers (and productions) that handles a similar set of phenomena might be a better model, but a model with fewer buffers that handles fewer phenomena is not comparable (Ball, in preparation)

Double R Buffers – Single Chunk
Subject – stores the subject
Wh-focus – stores the fronted wh expression
Rel-focus – stores the relative clause marker
Context – stores contextual information
Construct – buffer for constructing DM chunks
  – Dual path processing: construct chunk vs. retrieve chunk
Retrieval-2 – buffer for storing retrieved or constructed DM chunks
  – Retrieval buffer only used temporarily; retrieved chunk is copied into retrieval-2 for subsequent processing
Most-recent-loc-refer-expr – just the most recent
  – Supports locative fronting: “On the table is the book”

Double R Buffers – Multiple Chunk
Obj-Refer-Expr buffers
  – Most-recent-child-obj-refer-expr
  – Most-recent-parent-obj-refer-expr
  – Most-recent-grandparent-obj-refer-expr
Obj-Refer-Expr-Head buffers
  – Most-recent-child-obj-refer-expr-head
  – Most-recent-parent-obj-refer-expr-head
  – Most-recent-grandparent-obj-refer-expr-head
Four generic Short-Term Working Memory buffers
  – St-wm-1
  – St-wm-2
  – St-wm-3
  – St-wm-4
Note: object referring expression corresponds to nominal in other approaches

Double R Buffers – Multiple Chunk
Sit-Refer-Expr buffers
  – Most-recent-child-sit-refer-expr
  – Most-recent-parent-sit-refer-expr
  – Most-recent-grandparent-sit-refer-expr
Sit-Refer-Expr-Head buffers
  – Most-recent-child-sit-refer-expr-head
  – Most-recent-parent-sit-refer-expr-head
  – Most-recent-grandparent-sit-refer-expr-head
Note 1: with the introduction of obj-refer-expr and sit-refer-expr specific buffers, the short-term working memory buffers are infrequently used (primarily for conjunctions and adverbs)
Note 2: child, parent and grandparent buffers are all directly accessible, whereas only st-wm-1 is directly accessible
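One way to picture the child, parent and grandparent buffers is as three slots that can each be read directly, in contrast to a stack where only the top element is visible. The Python sketch below is an illustration under a strong assumption: that each newly projected expression becomes the child and pushes earlier ones up one level. The slides do not spell out the update policy, and the `KinBuffers` class is hypothetical.

```python
from collections import deque


class KinBuffers:
    """Three directly accessible slots: child, parent, grandparent (assumed policy)."""

    def __init__(self):
        # Index 0 is the child; with maxlen=3 the oldest entry is dropped when full.
        self._slots = deque(maxlen=3)

    def project(self, expr):
        """Record a newly projected expression as the most recent child."""
        self._slots.appendleft(expr)

    def _get(self, i):
        return self._slots[i] if i < len(self._slots) else None

    @property
    def child(self):
        return self._get(0)

    @property
    def parent(self):
        return self._get(1)

    @property
    def grandparent(self):
        return self._get(2)


sre = KinBuffers()
for name in ["sre-1", "sre-2", "sre-3", "sre-4"]:
    sre.project(name)

# All three levels can be read without popping anything.
print(sre.child, sre.parent, sre.grandparent)
```

After four projections the oldest expression is no longer held, which matches the idea that only a small, fixed number of elements is directly accessible at any time.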

Long-Distance Dependencies
“I want to go”
Infinitive sit-refer-expr has implied subj with trace bound to matrix subj
Combination of “bind-indx” and “trace” needed to indicate long-distance dependency
Traces only occur in argument positions!
Note: entire representation is not accessible at once!

Long-Distance Dependencies
“He promised me to go”
Subject control (verb): matrix clause subject binds to subject of infinitive situation complement; subject must be accessible
Alternative view: antecedent & trace both bind to same object in situation model

Long-Distance Dependencies
“He persuaded me to go”
Object control (verb): matrix clause (indirect) object binds to subject of infinitive situation complement; object must be accessible
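The subject-control and object-control cases on the preceding slides differ only in which matrix argument supplies the binding index for the implied subject. A minimal Python sketch, assuming a hypothetical `CONTROL` lexicon and a dictionary representation of chunks, is:

```python
# Hypothetical lexicon: which matrix argument controls the infinitive's implied subject.
CONTROL = {
    "want": "subject",
    "promised": "subject",
    "persuaded": "object",
}


def bind_implied_subject(verb, subject, obj=None):
    """Return a trace for the infinitive's subject, co-indexed with its controller."""
    controller = subject if CONTROL[verb] == "subject" else obj
    if controller is None:
        raise ValueError(f"'{verb}' needs an accessible controller")
    # A trace plus a shared bind-indx marks the long-distance dependency.
    return {"trace": True, "bind-indx": controller["bind-indx"]}


i_pron = {"word": "I", "bind-indx": "i"}
he = {"word": "he", "bind-indx": "i"}
me = {"word": "me", "bind-indx": "j"}

print(bind_implied_subject("want", i_pron))         # "I want to go"
print(bind_implied_subject("promised", he, me))     # "He promised me to go"
print(bind_implied_subject("persuaded", he, me))    # "He persuaded me to go"
```

The error branch reflects the accessibility requirement: if the controlling argument is not available when the infinitive is processed, the binding cannot be made.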

Long-Distance Dependencies
“Who did he kick the ball to?”
Object of preposition is bound to fronted who-obj-refer-expr; wh-focus must be accessible

Long-Distance Dependencies
“The man that I gave the book”
I-Obj trace to Obj-Refer-Expr with animate or human head
Rel-focus co-indexed with Obj-Refer-Expr
Rel-focus and subject must be accessible (rel-focus is optional)

Long-Distance Dependencies
“The book that I gave the man”
Obj trace to Obj-Refer-Expr with inanimate head
Rel-focus co-indexed with Obj-Refer-Expr
Rel-focus and subject must be accessible (rel-focus is optional)
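The two relative-clause slides suggest that, for “gave”, the animacy of the head noun decides whether the trace fills the indirect-object or the object position. The Python sketch below encodes only that heuristic as read off the two examples; the `resolve_gave_relative` function and the dictionary layout are hypothetical, and real input would need more cues than animacy alone.

```python
def resolve_gave_relative(head, head_animate, subject, overt_object):
    """Assign the relative-clause trace for "<head> that <subject> gave <overt_object>".

    An animate (or human) head is taken as the indirect object,
    an inanimate head as the direct object.
    """
    trace_role = "i-obj" if head_animate else "obj"
    other_role = "obj" if trace_role == "i-obj" else "i-obj"
    return {
        "subject": subject,
        trace_role: {"trace": True, "bound-to": head},
        other_role: overt_object,
    }


# "The man that I gave the book"
animate_head = resolve_gave_relative("the man", True, "I", "the book")
# "The book that I gave the man"
inanimate_head = resolve_gave_relative("the book", False, "I", "the man")

print(animate_head)
print(inanimate_head)
```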

Architectural Constraints
No hard architectural limit on the number of buffers
Buffers provide the context for production selection and execution
  – Highly context sensitive
Productions limited to accessing ~4 buffers on left-hand side (besides goal and context buffers)
  – Focus of attention (Cowan, 2000)
  – “Conscious activity corresponds to the manipulation of the contents of these buffers by production rules” (Anderson, 2007)
Can humans learn to buffer useful information?
  – Fronted Wh-expression buffer very useful in English, but not needed in in situ languages like Chinese
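The ~4-buffer constraint can be expressed as a simple check over the buffers named on a production's left-hand side. The sketch below is illustrative only: `within_focus_limit` is a hypothetical helper, not part of ACT-R or Double R, and the limit is treated as a hard cutoff of 4 even though the slide gives it as approximate.

```python
FOCUS_LIMIT = 4                # approximate limit cited from Cowan (2000)
EXEMPT = {"goal", "context"}   # not counted against the limit, per the slide


def within_focus_limit(lhs_buffers, limit=FOCUS_LIMIT):
    """True if the production tests at most `limit` buffers besides goal and context."""
    counted = set(lhs_buffers) - EXEMPT
    return len(counted) <= limit


# Left-hand side of the wh-question production shown earlier in the deck.
wh_quest_lhs = [
    "goal", "wh-focus", "most-recent-child-sre-head",
    "retrieval-2", "subject", "context",
]
# A made-up left-hand side that tests one buffer too many.
too_many = wh_quest_lhs + ["rel-focus"]

print(within_focus_limit(wh_quest_lhs))
print(within_focus_limit(too_many))
```

The wh-question production tests exactly four buffers besides goal and context, so it sits right at the limit.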

Processing Constraints
A “mildly” deterministic, serial processing mechanism (selection and integration) operating over a parallel, probabilistic substrate (activation)
Interactive and non-autonomous processing (no distinctly syntactic representations exist)
Incremental processing with immediate determination of meaning, word by word
No algorithmic backtracking or lookahead; a mechanism of context accommodation (Ball et al., 2007) is used instead
Forward chaining only
Declarative and explicit linguistic representations generated via implicit execution of productions
Operates in real-time on Marr’s algorithmic level (serial and parallel processing are relevant)
  – No slow down with length of input

Summary
Additions to model are
  – motivated by functional considerations
  – driven by empirical evidence
  – constrained by well-established cognitive constraints on language processing
Goal is a large-scale, functional language comprehension system implemented in the ACT-R cognitive architecture
Model currently handles a fairly wide range of grammatical constructions, including numerous forms of long-distance dependency

Questions?

References
Anderson, J. (2007). How Can the Human Mind Occur in the Physical Universe? Oxford: Oxford University Press.
Ball, J. (in preparation). A Naturalistic Functional Approach to Modeling Language Comprehension.
Ball, J., Heiberg, A. & Silber, R. (2007). Toward a Large-Scale Model of Language Comprehension in ACT-R 6. Proceedings of the 8th International Conference on Cognitive Modeling.
Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24.
Culicover, P. & Jackendoff, R. (2005). Simpler Syntax. Oxford: Oxford University Press.
Hawkins, J. (2004). Efficiency and Complexity in Grammars. Oxford: Oxford University Press.
Huddleston, R. & Pullum, G. (2002). The Cambridge Grammar of the English Language. NY: Cambridge University Press.
O’Grady, W. (2005). Syntactic Carpentry: An Emergentist Approach to Syntax. Mahwah, NJ: LEA.
Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Essex, UK: Pearson Education Limited.

Long-Distance Dependencies
“The ball by the table was kicked by the man”
Passive cue (be + V-ed or V-en)
Subject co-indexed with object
Subject must be accessible
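The passive cue (a form of “be” followed by a past participle) and the resulting co-indexing can be sketched in Python. The word lists and both functions are hypothetical. A small participle lexicon is used instead of testing for the “-ed” or “-en” suffix, since a bare suffix test would misfire on words such as “often” and would miss irregular participles such as “hit”.

```python
# Hypothetical word lists, sufficient only for the examples in this deck.
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
PARTICIPLES = {"kicked", "hit", "given", "taken"}


def find_passive_cue(tokens):
    """Return the index of the participle in a "be + participle" pair, else None."""
    for i in range(len(tokens) - 1):
        if tokens[i].lower() in BE_FORMS and tokens[i + 1].lower() in PARTICIPLES:
            return i + 1
    return None


def passive_object_trace(subject):
    """Build an object trace co-indexed with the (still accessible) subject."""
    return {"trace": True, "bind-indx": subject["bind-indx"]}


passive = "The ball by the table was kicked by the man".split()
active = "The man kicked the ball".split()

cue_index = find_passive_cue(passive)
subject = {"words": "the ball by the table", "bind-indx": "i"}
trace = passive_object_trace(subject) if cue_index is not None else None

print(cue_index, trace)
print(find_passive_cue(active))
```

Because the subject “the ball by the table” is separated from the verb by a modifier, it has to remain accessible in a buffer until the cue is found, which is the point the slide makes.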