1
A computational study of cross-situational techniques for learning word-to-meaning mappings
Jeffrey Mark Siskind
Presented by David Goss-Grubbs, March 5, 2006
2
The Problem: Mapping Words to Concepts ► Child hears John went to school ► Child sees GO(John, TO(school)) ► Child must learn: John → John, went → GO(x, y), to → TO(x), school → school
3
Two Problems ► Referential uncertainty: the child may also entertain MOVE(John, feet) or WEAR(John, RED(shirt)) ► Determining the correct alignment: which of the fragments John, GO(x, y), TO(x), school goes with which of the words John, walked, to, school?
4
Helpful Constraints ► Partial Knowledge ► Cross-situational inference ► Covering constraints ► Exclusivity
5
Partial Knowledge ► Child hears Mary lifted the block ► Child sees CAUSE(Mary, GO(block, UP)), WANT(Mary, block), or BE(block, ON(table)) ► If the child already knows that lifted contains CAUSE, the latter two hypotheses can be ruled out.
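A minimal sketch of this filtering step, assuming conceptual expressions are written as nested tuples; the representation and the helper names (symbols, filter_by_partial_knowledge) are illustrative, not from the paper.

```python
# symbols() walks a nested tuple such as ('CAUSE', 'Mary', ('GO', 'block', 'UP'))
# and collects every conceptual symbol that appears in it.
def symbols(expr):
    if isinstance(expr, tuple):
        return set().union(*(symbols(e) for e in expr))
    return {expr}

# Keep only the hypotheses that contain every symbol already known to be
# necessary for some word in the utterance.
def filter_by_partial_knowledge(hypotheses, known_necessary):
    return [h for h in hypotheses if known_necessary <= symbols(h)]

hypotheses = [
    ('CAUSE', 'Mary', ('GO', 'block', 'UP')),
    ('WANT', 'Mary', 'block'),
    ('BE', 'block', ('ON', 'table')),
]
# The child already knows that "lifted" contributes CAUSE.
print(filter_by_partial_knowledge(hypotheses, {'CAUSE'}))
# -> [('CAUSE', 'Mary', ('GO', 'block', 'UP'))]
```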
6
Cross-situational inference ► John lifted the ball: CAUSE(John, GO(ball, UP)) ► Mary lifted the block: CAUSE(Mary, GO(block, UP)) ► Thus lifted must mean one of {UP, GO(x, y), GO(x, UP), CAUSE(x, y), CAUSE(x, GO(y, z)), CAUSE(x, GO(y, UP))}
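The same inference can be sketched at the level of conceptual symbols: a word's set of possible symbols is intersected with the symbols available in each utterance it occurs in. The UNIVERSAL sentinel and function name below are my own shorthand.

```python
UNIVERSAL = None  # stands in for "every conceptual symbol is still possible"

# Intersect a word's possible-symbol set with the symbols of each utterance
# meaning it is observed with; whatever survives every observation remains.
def update_possible(possible, utterance_symbols):
    if possible is UNIVERSAL:
        return set(utterance_symbols)
    return possible & utterance_symbols

possible_lifted = UNIVERSAL
# "John lifted the ball" paired with CAUSE(John, GO(ball, UP))
possible_lifted = update_possible(possible_lifted, {'CAUSE', 'GO', 'UP', 'John', 'ball'})
# "Mary lifted the block" paired with CAUSE(Mary, GO(block, UP))
possible_lifted = update_possible(possible_lifted, {'CAUSE', 'GO', 'UP', 'Mary', 'block'})
print(possible_lifted)  # -> {'CAUSE', 'GO', 'UP'}
```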
7
Covering constraints ► Assume: all components of an utterance’s meaning come from the meanings of words in that utterance. ► If it is known that CAUSE is not part of the meaning of John, the, or ball, it must be part of the meaning of lifted. ► (But what about constructional meaning?)
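A sketch of how the covering constraint could be applied, assuming each word carries a set of necessary and a set of possible conceptual symbols (the bookkeeping introduced on the Stage 1 slide later); the data layout and function name are assumptions.

```python
# If a symbol of the utterance meaning is still possible for only one word,
# the covering assumption forces that word to contribute it.
def apply_covering(utterance_words, meaning_symbols, lexicon):
    for sym in meaning_symbols:
        suppliers = [w for w in utterance_words if sym in lexicon[w]['possible']]
        if len(suppliers) == 1:
            lexicon[suppliers[0]]['necessary'].add(sym)

lexicon = {
    'John':   {'necessary': {'John'}, 'possible': {'John'}},
    'lifted': {'necessary': set(),    'possible': {'CAUSE', 'GO', 'UP'}},
    'the':    {'necessary': set(),    'possible': set()},
    'ball':   {'necessary': {'ball'}, 'possible': {'ball'}},
}
apply_covering(['John', 'lifted', 'the', 'ball'],
               {'CAUSE', 'GO', 'UP', 'John', 'ball'}, lexicon)
print(lexicon['lifted']['necessary'])  # -> {'CAUSE', 'GO', 'UP'}
```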
8
Exclusivity ► Assume: any portion of the meaning of an utterance comes from no more than one of its words. ► If John walked means WALK(John) and John means John, then walked can mean no more than WALK(x).
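A corresponding sketch of exclusivity at the symbol level (again an assumed encoding, not Siskind's code): symbols that some other word in the utterance must contribute are removed from this word's possibilities.

```python
# If another word in the utterance necessarily contributes a symbol,
# this word cannot also contribute it.
def apply_exclusivity(utterance_words, lexicon):
    for w in utterance_words:
        claimed_by_others = set().union(
            *(lexicon[o]['necessary'] for o in utterance_words if o != w))
        lexicon[w]['possible'] -= claimed_by_others

lexicon = {
    'John':   {'necessary': {'John'}, 'possible': {'John'}},
    'walked': {'necessary': set(),    'possible': {'WALK', 'John'}},
}
apply_exclusivity(['John', 'walked'], lexicon)
print(lexicon['walked']['possible'])  # -> {'WALK'}
```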
9
Three more problems ► Bootstrapping ► Noisy Input ► Homonymy
10
Bootstrapping ► Lexical acquisition is much easier if some of the language is already known ► Some of Siskind’s strategies (e.g. cross-situational learning) work without such knowledge ► Others (e.g. exclusivity) require it. ► The algorithm starts off slow, then speeds up
11
Noise ► Only a subset of all possible meanings will be available to the algorithm ► If the correct meaning is not among them, cross-situational learning would prevent the words in that utterance from ever being acquired ► Some portion of the input must therefore be ignored. ► (A statistical approach is rejected; it is not clear why)
12
Homonymy ► As with noisy input, cross-situational techniques alone would fail to find a consistent mapping for homonymous words. ► When an inconsistency is found, a split is made. ► If the split is corroborated, a new sense is created; otherwise it is treated as noise.
13
The problem, formally stated ► From: a sequence of utterances, where each utterance is an unordered collection of words paired with a set of conceptual expressions ► To: a lexicon, which maps each word to a set of conceptual expressions, one for each sense of the word
14
Composition ► Select one sense for each word ► Find all ways of combining these conceptual expressions ► The meaning of an utterance is derived only from the meaning of its component words. ► Every conceptual expression in the meanings of the words must appear in the final conceptual expression (copies are possible)
15
The simplified algorithm: no noise or homonymy ► Two learning stages: Stage 1 learns the set of conceptual symbols, e.g. {CAUSE, GO, UP}; Stage 2 learns the conceptual expression, e.g. CAUSE(x, GO(y, UP))
16
Stage 1: Conceptual symbol set ► Maintain sets of necessary and possible conceptual symbols for each word ► Initialize the former to the empty set and the latter to the universal set ► Utterances will increase the necessary set and decrease the possible set, until they converge on the actual conceptual symbol set
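One possible shape for this update, bundling the intersection and covering steps sketched earlier into a single self-contained routine; the initialization and convergence test follow the slide, everything else is assumed.

```python
def stage1_update(lexicon, words, meaning_symbols, universe):
    # Unseen words start with nothing necessary and everything possible.
    for w in words:
        lexicon.setdefault(w, {'necessary': set(), 'possible': set(universe)})
    # Possible sets shrink: a word can only contribute symbols that occur here.
    for w in words:
        lexicon[w]['possible'] &= meaning_symbols
    # Necessary sets grow: a symbol only one word can still supply must come from it.
    for sym in meaning_symbols:
        suppliers = [w for w in words if sym in lexicon[w]['possible']]
        if len(suppliers) == 1:
            lexicon[suppliers[0]]['necessary'].add(sym)

def converged(entry):
    return entry['necessary'] == entry['possible']
```

A word's conceptual symbol set is settled once its necessary and possible sets coincide.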
17
Stage 2: Conceptual expression ► Maintain a set of possible conceptual expressions for each word ► Initialize to the set of all expressions that can be composed from the actual conceptual symbol set ► New utterances will decrease the possible conceptual expression set until only one remains
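A rough sketch of where Stage 2 candidates could come from once a word's symbol set is known: sub-expressions of an observed meaning headed by those symbols, with everything else generalized to fresh variables. Variable coreference (as in CAUSE(x, GO(y, TO(x)))) is not handled here, and the helper names are invented.

```python
from itertools import count

# Enumerate every sub-expression of a nested-tuple meaning.
def subexpressions(expr):
    yield expr
    if isinstance(expr, tuple):
        for arg in expr[1:]:
            yield from subexpressions(arg)

# Replace sub-expressions whose head is not in `keep` with fresh variables.
def generalize(expr, keep, counter=None):
    counter = counter or count(1)
    head = expr[0] if isinstance(expr, tuple) else expr
    if head not in keep:
        return f'x{next(counter)}'
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(generalize(a, keep, counter) for a in expr[1:])
    return expr

meaning = ('CAUSE', 'John', ('GO', 'ball', ('TO', 'John')))
took_symbols = {'CAUSE', 'GO', 'TO'}
candidates = {generalize(e, took_symbols) for e in subexpressions(meaning)
              if isinstance(e, tuple) and e[0] in took_symbols}
print(candidates)
# Three candidates; intersecting such sets across utterances, plus the
# requirement to use every necessary symbol, narrows "took" to the CAUSE(...) form.
```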
18
Example

Word   Necessary   Possible
John   {John}      {John, ball}
took   {CAUSE}     {CAUSE, WANT, GO, TO, arm}
the    {}          {WANT, arm}
ball   {ball}      {ball, arm}
19
Selecting the meaning of John took the ball ► CAUSE(John, GO(ball, TO(John))) ► WANT(John, ball) ► CAUSE(John, GO(PART-OF(LEFT(arm), John), TO(ball))) The second is eliminated because it lacks CAUSE, which took requires; the third because no word in the utterance can contribute LEFT or PART-OF.
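A sketch of the two elimination tests applied here, using the lexicon from the table above; the tuple representation and function names are assumptions.

```python
def symbols(expr):
    if isinstance(expr, tuple):
        return set().union(*(symbols(e) for e in expr))
    return {expr}

# A hypothesis survives only if (a) it contains every symbol some word
# necessarily contributes, and (b) every symbol it contains can be supplied
# by some word in the utterance.
def consistent(hypothesis, words, lexicon):
    syms = symbols(hypothesis)
    required = set().union(*(lexicon[w]['necessary'] for w in words))
    coverable = set().union(*(lexicon[w]['possible'] for w in words))
    return required <= syms and syms <= coverable

lexicon = {
    'John': {'necessary': {'John'},  'possible': {'John', 'ball'}},
    'took': {'necessary': {'CAUSE'}, 'possible': {'CAUSE', 'WANT', 'GO', 'TO', 'arm'}},
    'the':  {'necessary': set(),     'possible': {'WANT', 'arm'}},
    'ball': {'necessary': {'ball'},  'possible': {'ball', 'arm'}},
}
words = ['John', 'took', 'the', 'ball']
h1 = ('CAUSE', 'John', ('GO', 'ball', ('TO', 'John')))
h2 = ('WANT', 'John', 'ball')
h3 = ('CAUSE', 'John', ('GO', ('PART-OF', ('LEFT', 'arm'), 'John'), ('TO', 'ball')))
print([consistent(h, words, lexicon) for h in (h1, h2, h3)])  # -> [True, False, False]
```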
20
Updated table

Word   Necessary         Possible
John   {John}            {John}
took   {CAUSE, GO, TO}   {CAUSE, GO, TO}
the    {}                {}
ball   {ball}            {ball}
21
Stage 2: CAUSE(John, GO(ball, TO(John)))

Word   Conceptual expressions
John   {John}
took   {CAUSE(x, GO(y, TO(x)))}
the    {}
ball   {ball}
22
Noise and Homonymy ► Noisy or homonymous data can corrupt the lexicon ► Adding an incorrect element to the set of necessary elements ► Taking a correct element away from the set of possible elements ► This may or may not create an inconsistent entry
23
Extended algorithm ► Sets of necessary and possible conceptual symbols are associated with senses rather than words ► Words are mapped to their senses ► Each sense has a confidence factor
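One way the extended lexicon could be laid out, purely as an illustration; the field names are not from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Sense:
    necessary: set = field(default_factory=set)    # symbols this sense must contain
    possible: set = field(default_factory=set)     # symbols this sense may contain
    expressions: set = field(default_factory=set)  # Stage-2 candidate expressions
    confidence: float = 0.0                         # raised each time the sense is used

# Words map to lists of senses; a homonymous word simply has more than one.
lexicon = {
    'ball': [Sense(necessary={'ball'}, possible={'ball'})],
}
```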
24
Sense assignment ► For each utterance, find the cross-product of all the senses ► Choose the “best” consistent sense assignment ► Update the entries for those senses as before ► Add to a sense’s confidence factor each time it is used in a preferred assignment
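A sketch of that selection step, assuming a consistency test like the one sketched earlier and the Sense records shown above; "best" is approximated as highest total confidence, which is a reading of the slide rather than the paper's exact scoring.

```python
from itertools import product

def best_assignment(words, lexicon, is_consistent):
    # Cross-product: one sense chosen for every word in the utterance.
    assignments = product(*(lexicon[w] for w in words))
    consistent = [a for a in assignments if is_consistent(words, a)]
    if not consistent:
        return None  # triggers the new-sense / noise handling on the next slide
    best = max(consistent, key=lambda senses: sum(s.confidence for s in senses))
    for sense in best:
        sense.confidence += 1  # reward senses used in the preferred assignment
    return best
```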
25
Inconsistent utterances ► Add the minimal number of new senses until the utterance is no longer inconsistent; there are three possibilities ► The current utterance is noise, and the new senses are bad (and will be ignored) ► There really are new senses ► The original senses were bad, and the right senses are only now being added ► On occasion, remove senses with low confidence factors
26
Four simulations ► Vary the task along five parameters ► Vocabulary growth rate by size of corpus ► Number of required exposures to a word by size of corpus ► How high can it scale?
27
Method (1 of 2) ► Construct a random lexicon ► Vary it by three parameters: vocabulary size, homonymy rate, conceptual-symbol inventory size
28
Method (2 of 2) ► Construct a series of utterances, each paired with a set of meaning hypotheses ► Vary this by the following parameters: noise rate, degree of referential uncertainty, cluster size (5), similarity probability (.75)
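A loose sketch of how one such observation could be generated, under simplifying assumptions of my own (cluster size and similarity probability are ignored, and all names are invented).

```python
import random

def make_observation(true_meaning, distractors, uncertainty, noise_rate, rng=random):
    # Referential uncertainty: several competing meaning hypotheses per utterance.
    hypotheses = rng.sample(distractors, k=min(uncertainty, len(distractors)))
    # Noise: with probability noise_rate the correct meaning is absent entirely.
    if rng.random() >= noise_rate:
        hypotheses.append(true_meaning)
    rng.shuffle(hypotheses)
    return hypotheses
```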
29
Sensitivity analysis
30
Vocabulary size
31
Degree of referential uncertainty
32
Noise rate
33
Conceptual-symbol inventory size
34
Homonymy rate
35
Vocabulary Growth
36
Number of exposures