A computational study of cross-situational techniques for learning word-to-meaning mappings
Jeffrey Mark Siskind
Presented by David Goss-Grubbs
March 5, 2006
The Problem: Mapping Words to Concepts
► Child hears: John went to school
► Child sees: GO(John, TO(school))
► Child must learn:
   John → John
   went → GO(x, y)
   to → TO(x)
   school → school
Two Problems
► Referential uncertainty: the scene supports many hypotheses besides the correct one, e.g.
   MOVE(John, feet)
   WEAR(John, RED(shirt))
► Determining the correct alignment: even given the right meaning, which of {John, walked, to, school} maps to which of {John, GO(x, y), TO(x), school}?
Helpful Constraints
► Partial knowledge
► Cross-situational inference
► Covering constraints
► Exclusivity
Partial Knowledge
► Child hears: Mary lifted the block
► Child sees:
   CAUSE(Mary, GO(block, UP))
   WANT(Mary, block)
   BE(block, ON(table))
► If the child already knows that lift contains CAUSE, the latter two hypotheses can be ruled out.
Cross-situational inference
► John lifted the ball → CAUSE(John, GO(ball, UP))
► Mary lifted the block → CAUSE(Mary, GO(block, UP))
► Thus lifted → {UP, GO(x, y), GO(x, UP), CAUSE(x, y), CAUSE(x, GO(y, z)), CAUSE(x, GO(y, UP))}
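The inference itself reduces to set intersection. A minimal sketch, with hypothetical candidate sets for lifted (not Siskind's actual data structures):

```python
# A minimal sketch (not Siskind's implementation): cross-situational
# inference keeps only candidate meanings consistent with every
# situation in which the word occurs.

def intersect_candidates(observations):
    """observations: one set of candidate meanings per situation."""
    result = None
    for candidates in observations:
        result = set(candidates) if result is None else result & candidates
    return result

# Hypothetical candidate sets for "lifted" from the two situations above:
sit1 = {"UP", "GO(x, y)", "GO(x, UP)", "CAUSE(x, y)",
        "CAUSE(x, GO(y, z))", "CAUSE(x, GO(y, UP))", "John", "ball"}
sit2 = {"UP", "GO(x, y)", "GO(x, UP)", "CAUSE(x, y)",
        "CAUSE(x, GO(y, z))", "CAUSE(x, GO(y, UP))", "Mary", "block"}

print(intersect_candidates([sit1, sit2]))
# -> only the six candidates shared by both situations remain
```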
Covering constraints
► Assume: all components of an utterance's meaning come from the meanings of words in that utterance.
► If it is known that CAUSE is not part of the meaning of John, the, or ball, it must be part of the meaning of lifted.
► (But what about constructional meaning?)
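A rough sketch of this inference, with illustrative data: a meaning symbol that only one word in the utterance can still supply becomes necessary for that word.

```python
# A rough sketch of the covering constraint (names and data are
# illustrative): a meaning symbol that only one word can still supply
# must be necessary for that word.

def apply_covering(utterance, meaning_symbols, possible, necessary):
    for sym in meaning_symbols:
        suppliers = [w for w in utterance if sym in possible[w]]
        if len(suppliers) == 1:
            necessary[suppliers[0]].add(sym)

possible = {"John": {"John"}, "lifted": {"CAUSE", "GO", "UP"},
            "the": set(), "ball": {"ball"}}
necessary = {w: set() for w in possible}
apply_covering(["John", "lifted", "the", "ball"],
               {"CAUSE", "GO", "UP", "John", "ball"},
               possible, necessary)
print(necessary["lifted"])  # {'CAUSE', 'GO', 'UP'}
```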
Exclusivity
► Assume: any portion of the meaning of an utterance comes from no more than one of its words.
► If John walked → WALK(John) and John → John, then walked can be no more than WALK(x).
Three more problems
► Bootstrapping
► Noisy input
► Homonymy
Bootstrapping
► Lexical acquisition is much easier if some of the language is already known.
► Some of Siskind's strategies (e.g. cross-situational learning) work without such knowledge.
► Others (e.g. exclusivity) require it.
► The algorithm starts off slow, then speeds up.
Noise
► Only a subset of all possible meanings will be available to the algorithm.
► If none of them contains the correct meaning, cross-situational learning would cause those words never to be acquired.
► Some portion of the input must therefore be ignored.
► (A statistical approach is rejected; it is not clear why.)
Homonymy
► As with noisy input, cross-situational techniques would fail to find a consistent mapping for homonymous words.
► When an inconsistency is found, a split is made.
► If the split is corroborated, a new sense is created; otherwise it is treated as noise.
The problem, formally stated
► From: a sequence of utterances
   Each utterance is an unordered collection of words.
   Each utterance is paired with a set of conceptual expressions.
► To: a lexicon
   The lexicon maps each word to a set of conceptual expressions, one for each sense of the word.
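One hypothetical way to encode that input and output in Python (the paper uses structured terms rather than the flat strings used here):

```python
# A hypothetical encoding of the learning problem's input and output.
from typing import Dict, List, Set, Tuple

Utterance = List[str]            # unordered bag of words
Hypotheses = Set[str]            # candidate conceptual expressions
Corpus = List[Tuple[Utterance, Hypotheses]]
Lexicon = Dict[str, Set[str]]    # word -> one expression per sense

corpus: Corpus = [
    (["John", "took", "the", "ball"],
     {"CAUSE(John, GO(ball, TO(John)))", "WANT(John, ball)"}),
]
target: Lexicon = {"took": {"CAUSE(x, GO(y, TO(x)))"}}
```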
Composition
► Select one sense for each word.
► Find all ways of combining these conceptual expressions.
► The meaning of an utterance is derived only from the meanings of its component words.
► Every conceptual expression in the meanings of the words must appear in the final conceptual expression (copies are possible).
The simplified algorithm: no noise or homonymy
► Two learning stages:
   Stage 1: the set of conceptual symbols, e.g. {CAUSE, GO, UP}
   Stage 2: the conceptual expression, e.g. CAUSE(x, GO(y, UP))
Stage 1: Conceptual symbol set
► Maintain sets of necessary and possible conceptual symbols for each word.
► Initialize the former to the empty set and the latter to the universal set.
► Utterances will increase the necessary set and decrease the possible set, until they converge on the actual conceptual symbol set.
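The bookkeeping is tiny in code. A minimal sketch with a toy symbol inventory (UNIVERSE and the data are illustrative; the covering sketch earlier is what grows the necessary set):

```python
# A minimal Stage 1 sketch with a toy symbol inventory. Covering (see
# the earlier sketch) grows `necessary`; each observed meaning shrinks
# `possible` by intersection.

UNIVERSE = {"CAUSE", "GO", "TO", "UP", "WANT",
            "John", "Mary", "ball", "block", "arm"}

def init_entry():
    return {"necessary": set(), "possible": set(UNIVERSE)}

def observe(entry, meaning_symbols):
    entry["possible"] &= meaning_symbols

def converged(entry):
    # Stage 1 is done for a word once the two sets meet.
    return entry["necessary"] == entry["possible"]

took = init_entry()
observe(took, {"CAUSE", "GO", "TO", "John", "ball"})
observe(took, {"CAUSE", "GO", "TO", "Mary", "block"})
print(took["possible"])  # {'CAUSE', 'GO', 'TO'}
```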
Stage 2: Conceptual expression
► Maintain a set of possible conceptual expressions for each word.
► Initialize it to the set of all expressions that can be composed from the actual conceptual symbol set.
► New utterances will decrease the possible conceptual expression set until only one remains.
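A toy version of this filter, under an invented encoding (nested tuples, variables drawn from {x, y, z}): a candidate survives only if it matches some sub-expression of the observed meaning. A full implementation would also enforce consistent variable bindings, which this sketch skips.

```python
# A toy Stage 2 filter (illustrative encoding, not Siskind's): keep a
# candidate expression only if, treating variables as wildcards, it
# matches some sub-expression of the observed utterance meaning.

VARS = {"x", "y", "z"}

def matches(pat, term):
    if pat in VARS:                      # a variable matches anything
        return True
    if isinstance(pat, tuple) and isinstance(term, tuple):
        return len(pat) == len(term) and all(
            matches(p, t) for p, t in zip(pat, term))
    return pat == term

def subterms(expr):
    yield expr
    if isinstance(expr, tuple):
        for arg in expr[1:]:
            yield from subterms(arg)

meaning = ("CAUSE", "John", ("GO", "ball", ("TO", "John")))
candidates = {("CAUSE", "x", ("GO", "y", ("TO", "x"))),
              ("CAUSE", "x", "y"),
              ("WANT", "x", "y")}
kept = {c for c in candidates
        if any(matches(c, t) for t in subterms(meaning))}
print(kept)  # the WANT candidate is eliminated
```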
Example
Word    Necessary    Possible
John    {John}       {John, ball}
took    {CAUSE}      {CAUSE, WANT, GO, TO, arm}
the     {}           {WANT, arm}
ball    {ball}       {ball, arm}
Selecting the meaning
John took the ball
► CAUSE(John, GO(ball, TO(John)))
► WANT(John, ball)
► CAUSE(John, GO(PART-OF(LEFT(arm), John), TO(ball)))
The second is eliminated because it lacks CAUSE, which is necessary for took; the third because no word in the utterance has LEFT or PART-OF among its possible symbols.
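The elimination can be reproduced mechanically from the table above (the data is transcribed from the example; the consistency test itself is an illustrative reconstruction, not Siskind's code):

```python
# An illustrative reconstruction of the filtering step: a hypothesis
# survives only if it contains every word's necessary symbols and uses
# no symbol outside the union of the words' possible sets.

def consistent(hyp_syms, utterance, necessary, possible):
    needed = set().union(*(necessary[w] for w in utterance))
    supplied = set().union(*(possible[w] for w in utterance))
    return needed <= hyp_syms and hyp_syms <= supplied

necessary = {"John": {"John"}, "took": {"CAUSE"},
             "the": set(), "ball": {"ball"}}
possible = {"John": {"John", "ball"},
            "took": {"CAUSE", "WANT", "GO", "TO", "arm"},
            "the": {"WANT", "arm"},
            "ball": {"ball", "arm"}}
utt = ["John", "took", "the", "ball"]

hypotheses = {
    "CAUSE(John, GO(ball, TO(John)))":
        {"CAUSE", "GO", "TO", "John", "ball"},
    "WANT(John, ball)": {"WANT", "John", "ball"},
    "CAUSE(John, GO(PART-OF(LEFT(arm), John), TO(ball)))":
        {"CAUSE", "GO", "TO", "PART-OF", "LEFT", "arm", "John", "ball"},
}
for hyp, syms in hypotheses.items():
    print(consistent(syms, utt, necessary, possible), hyp)
# True for the first hypothesis only
```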
Updated table (necessary and possible have converged)
Word    Necessary          Possible
John    {John}             {John}
took    {CAUSE, GO, TO}    {CAUSE, GO, TO}
the     {}                 {}
ball    {ball}             {ball}
Stage 2
Utterance meaning: CAUSE(John, GO(ball, TO(John)))
Word    Conceptual expression
John    {John}
took    {CAUSE(x, GO(y, TO(x)))}
the     {}
ball    {ball}
Noise and Homonymy
► Noisy or homonymous data can corrupt the lexicon in two ways:
   by adding an incorrect element to the set of necessary elements, or
   by removing a correct element from the set of possible elements.
► Either corruption may or may not create an inconsistent entry.
Extended algorithm
► Necessary and possible conceptual symbols are mapped to senses rather than words.
► Words are mapped to their senses.
► Each sense has a confidence factor.
Sense assignment
► For each utterance, form the cross-product of the senses of its words.
► Choose the "best" consistent sense assignment.
► Update the entries for those senses as before.
► Add to a sense's confidence factor each time it is used in a preferred assignment.
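The preference step is easy to sketch. Everything below is an illustration: best_assignment, the toy senses, and the stubbed consistency test are hypothetical names, not the paper's.

```python
# An illustrative sketch of sense assignment: enumerate the cross-product
# of each word's senses, keep the consistent assignments, and prefer the
# one whose senses carry the highest total confidence.

from itertools import product

def best_assignment(utterance, senses, confidence, consistent):
    options = [senses[w] for w in utterance]
    candidates = [a for a in product(*options) if consistent(utterance, a)]
    if not candidates:
        return None                     # the utterance is inconsistent
    best = max(candidates, key=lambda a: sum(confidence[s] for s in a))
    for s in best:
        confidence[s] += 1              # reward the senses actually used
    return best

# Toy demo with a stubbed consistency test:
senses = {"a": ["a#1"], "bat": ["bat#animal", "bat#club"]}
confidence = {"a#1": 5, "bat#animal": 3, "bat#club": 1}
print(best_assignment(["a", "bat"], senses, confidence,
                      lambda u, a: True))  # ('a#1', 'bat#animal')
```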
Inconsistent utterances
► Add the minimal number of new senses until the utterance is no longer inconsistent. Three possibilities:
   The current utterance is noise, and the new senses are bad (and will eventually be ignored).
   There really are new senses.
   The original senses were bad, and the right senses are only now being added.
► Occasionally, remove senses with low confidence factors.
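A sketch of the split-and-prune bookkeeping under the same toy encoding; the helper names are invented for illustration:

```python
# A sketch of splitting and pruning (names invented for illustration):
# an inconsistent utterance triggers a fresh sense with an empty
# necessary set and a universal possible set; low-confidence senses
# are later removed.

def add_fresh_sense(word, senses, entries, confidence, universe):
    sense_id = f"{word}#{len(senses[word]) + 1}"
    senses[word].append(sense_id)
    entries[sense_id] = {"necessary": set(), "possible": set(universe)}
    confidence[sense_id] = 0
    return sense_id

def prune(senses, entries, confidence, threshold=2):
    for word, ids in senses.items():
        keep = [s for s in ids
                if confidence[s] >= threshold or len(ids) == 1]
        for s in set(ids) - set(keep):
            del entries[s]
            del confidence[s]
        senses[word] = keep

senses = {"bank": ["bank#1"]}
entries = {"bank#1": {"necessary": set(), "possible": {"RIVER", "MONEY"}}}
confidence = {"bank#1": 4}
add_fresh_sense("bank", senses, entries, confidence, {"RIVER", "MONEY"})
print(senses["bank"])  # ['bank#1', 'bank#2']
```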
Four simulations
► Sensitivity analysis: vary the task along five parameters
► Vocabulary growth rate by size of corpus
► Number of required exposures to a word by size of corpus
► How high can it scale?
Method (1 of 2)
► Construct a random lexicon.
► Vary it by three parameters:
   Vocabulary size
   Homonymy rate
   Conceptual-symbol inventory size
Method (2 of 2)
► Construct a series of utterances, each paired with a set of meaning hypotheses.
► Vary this by the following parameters:
   Noise rate
   Degree of referential uncertainty
   Cluster size (5)
   Similarity probability (.75)
Sensitivity analysis
[Plots omitted, one per varied parameter: vocabulary size, degree of referential uncertainty, noise rate, conceptual-symbol inventory size, homonymy rate]
Vocabulary growth
[Plot omitted: vocabulary growth rate by size of corpus]
Number of exposures
[Plot omitted: number of required exposures to a word by size of corpus]