
1 Analyzing iterated learning
Tom Griffiths (Brown University) and Mike Kalish (University of Louisiana)

2 Cultural transmission
Most knowledge is based on secondhand data
Some things can only be learned from others
–cultural objects transmitted across generations
Studying the cognitive aspects of cultural transmission provides unique insights…

3 Iterated learning (Kirby, 2001)
Each learner sees data, forms a hypothesis, and produces the data given to the next learner
cf. the playground game "telephone"

4 Objects of iterated learning
It's not just about languages…
In the wild:
–religious concepts
–social norms
–myths and legends
–causal theories
In the lab:
–functions and categories

5 Outline
1. Analyzing iterated learning
2. Iterated Bayesian learning
3. Examples
4. Iterated learning with humans
5. Conclusions and open questions

6 Outline
1. Analyzing iterated learning
2. Iterated Bayesian learning
3. Examples
4. Iterated learning with humans
5. Conclusions and open questions

7 Discrete generations of single learners
P_L(h|d): probability of inferring hypothesis h from data d
P_P(d|h): probability of generating data d from hypothesis h
(Diagram: a chain of learners, each applying P_L(h|d) to the previous learner's data and P_P(d|h) to produce data for the next.)

8 Markov chains
Variable x^(t+1) is independent of the history given x^(t)
Transition matrix T = P(x^(t+1) | x^(t))
Converges to a stationary distribution under easily checked conditions for ergodicity

9 Stationary distributions
Stationary distribution: π(x) = Σ_x' P(x^(t+1) = x | x^(t) = x') π(x')
In matrix form, π = Tπ, so π is the first eigenvector of the matrix T
The second eigenvalue sets the rate of convergence
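A minimal sketch of this computation in Python, using a made-up 3-state transition matrix (the states and probabilities are illustrative, not from the talk):

```python
# Stationary distribution and convergence rate of a Markov chain.
import numpy as np

# Column-stochastic convention: T[i, j] = P(x_{t+1} = i | x_t = j)
T = np.array([[0.90, 0.20, 0.10],
              [0.05, 0.70, 0.30],
              [0.05, 0.10, 0.60]])

eigvals, eigvecs = np.linalg.eig(T)
order = np.argsort(-np.abs(eigvals))      # sort by |eigenvalue|, largest first
pi = np.real(eigvecs[:, order[0]])
pi = pi / pi.sum()                        # first eigenvector, normalised: stationary distribution
rate = np.abs(eigvals[order[1]])          # second eigenvalue sets the per-step convergence factor

print("stationary distribution:", pi)
print("|second eigenvalue|:", rate)
```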

10 Analyzing iterated learning
Iterated learning defines three Markov chains:
–a Markov chain on hypotheses, h_1 → h_2 → h_3 → …, with transitions Σ_d P_P(d|h) P_L(h'|d)
–a Markov chain on data, d_0 → d_1 → d_2 → …, with transitions Σ_h P_L(h|d) P_P(d'|h)
–a Markov chain on hypothesis-data pairs, (h_1, d_1) → (h_2, d_2) → (h_3, d_3) → …

11 A Markov chain on hypotheses
Transition probabilities sum out the data: Q(h'|h) = Σ_d P_L(h'|d) P_P(d|h)
Stationary distribution and convergence rate from the eigenvectors and eigenvalues of Q
–can be computed numerically for matrices of reasonable size, and analytically in some cases
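A minimal numerical sketch, with arbitrary made-up matrices standing in for P_L(h|d) and P_P(d|h), showing how Q and its stationary distribution can be computed:

```python
# Build the hypothesis-chain transition matrix Q by summing out the data.
import numpy as np

n_hyp, n_data = 3, 4
rng = np.random.default_rng(0)

# P_prod[d, h] = P_P(d | h): probability hypothesis h produces data d
P_prod = rng.random((n_data, n_hyp))
P_prod /= P_prod.sum(axis=0)

# P_learn[h, d] = P_L(h | d): probability a learner infers h from data d
P_learn = rng.random((n_hyp, n_data))
P_learn /= P_learn.sum(axis=0)

# Q[h_new, h_old] = sum_d P_L(h_new | d) P_P(d | h_old)
Q = P_learn @ P_prod

eigvals, eigvecs = np.linalg.eig(Q)
idx = np.argmax(np.abs(eigvals))
pi = np.real(eigvecs[:, idx])
print("stationary distribution over hypotheses:", pi / pi.sum())
```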

12 Infinite populations in continuous time
"Language dynamical equation" (Nowak, Komarova, & Niyogi, 2001)
"Neutral model": f_j(x) constant (Komarova & Nowak, 2003)
Stable equilibrium at the first eigenvector of Q

13 Outline
1. Analyzing iterated learning
2. Iterated Bayesian learning
3. Examples
4. Iterated learning with humans
5. Conclusions and open questions

14 Bayesian inference
Reverend Thomas Bayes
Rational procedure for updating beliefs
Foundation of many learning algorithms (e.g., MacKay, 2003)
Widely used for language learning (e.g., Charniak, 1993)

15 Bayes' theorem
P(h|d) = P(d|h) P(h) / Σ_h' P(d|h') P(h')
–P(h|d): posterior probability
–P(d|h): likelihood
–P(h): prior probability
–the denominator sums over the space of hypotheses
h: hypothesis, d: data
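A minimal sketch of this computation for a discrete hypothesis space (the numbers are made up for illustration):

```python
# Bayes' theorem over a finite set of hypotheses.
import numpy as np

prior = np.array([0.5, 0.3, 0.2])        # P(h) for three hypotheses
likelihood = np.array([0.1, 0.4, 0.8])   # P(d | h) for one observed datum d

posterior = prior * likelihood
posterior /= posterior.sum()             # divide by the sum over the hypothesis space
print("P(h | d) =", posterior)
```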

16 Iterated Bayesian learning
Learners are Bayesian agents: each learner samples a hypothesis from the posterior, P_L(h|d) = P(h|d), and generates data from the likelihood, P_P(d|h) = P(d|h)

17 Markov chains on h and d
The Markov chain on h has stationary distribution P(h), the prior
The Markov chain on d has stationary distribution Σ_h P(d|h) P(h), the prior predictive distribution

18 Markov chain Monte Carlo
A strategy for sampling from complex probability distributions
Key idea: construct a Markov chain which converges to a particular distribution
–e.g. the Metropolis algorithm
–e.g. Gibbs sampling

19 Gibbs sampling (Geman & Geman, 1984)
For variables x = x_1, x_2, …, x_n
Draw x_i^(t+1) from P(x_i | x_-i), where x_-i = x_1^(t+1), x_2^(t+1), …, x_{i-1}^(t+1), x_{i+1}^(t), …, x_n^(t)
Converges to P(x_1, x_2, …, x_n)
(a.k.a. the heat bath algorithm in statistical physics)
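A minimal sketch of a Gibbs sampler, using a toy bivariate Gaussian target (not an example from the talk):

```python
# Gibbs sampling for a standard bivariate Gaussian with correlation rho.
import numpy as np

rho = 0.8            # correlation between x1 and x2
n_steps = 10_000
rng = np.random.default_rng(1)

x1, x2 = 0.0, 0.0
samples = np.empty((n_steps, 2))
for t in range(n_steps):
    # Draw each variable from its conditional given the current value of the other
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))   # P(x1 | x2)
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))   # P(x2 | x1)
    samples[t] = (x1, x2)

print("sample correlation:", np.corrcoef(samples.T)[0, 1])  # approaches rho
```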

20 Gibbs sampling (MacKay, 2003)

21 Iterated learning is a Gibbs sampler
Iterated Bayesian learning is a Gibbs sampler for the joint distribution P(d|h) P(h)
Implies:
–(h, d) converges to this distribution
–convergence rates are known (Liu, Wong, & Kong, 1995)
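A minimal sketch of this result for a toy discrete model (made-up prior and likelihood): each learner samples a hypothesis from its posterior and generates data for the next learner, and the chain of hypotheses ends up distributed according to the prior:

```python
# Iterated Bayesian learning as a two-step Gibbs sampler.
import numpy as np

rng = np.random.default_rng(2)
prior = np.array([0.6, 0.3, 0.1])             # P(h) over 3 hypotheses
like = np.array([[0.7, 0.2, 0.1],             # like[h, d] = P(d | h), 3 data values
                 [0.2, 0.6, 0.2],
                 [0.1, 0.2, 0.7]])

def sample_posterior(d):
    post = prior * like[:, d]
    return rng.choice(3, p=post / post.sum())

h = 0
counts = np.zeros(3)
for t in range(50_000):
    d = rng.choice(3, p=like[h])              # teacher generates data from P(d | h)
    h = sample_posterior(d)                   # learner samples h from the posterior
    counts[h] += 1

print("empirical distribution over h:", counts / counts.sum())
print("prior:                        ", prior)
```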

22 Outline
1. Analyzing iterated learning
2. Iterated Bayesian learning
3. Examples
4. Iterated learning with humans
5. Conclusions and open questions

23 An example: Gaussians
If we assume…
–data, d, is a single real number, x
–hypotheses, h, are means of a Gaussian, μ
–the prior, p(μ), is Gaussian(μ_0, σ_0^2)
…then p(x_{n+1} | x_n) is Gaussian(μ_n, σ_x^2 + σ_n^2)

24 An example: Gaussians
Under the same assumptions, p(x_n | x_0) is Gaussian(μ_0 + c^n x_0, (σ_x^2 + σ_0^2)(1 - c^(2n))), where c = σ_0^2 / (σ_0^2 + σ_x^2)
i.e. geometric convergence to the prior

25 An example: Gaussians
p(x_n | x_0) is Gaussian(μ_0 + c^n x_0, (σ_x^2 + σ_0^2)(1 - c^(2n)))
(Figure: this distribution plotted over successive iterations.)

26 μ_0 = 0, σ_0^2 = 1, x_0 = 20
Iterated learning results in rapid convergence to the prior
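A minimal simulation sketch using the parameters on this slide; the production noise σ_x^2 = 1 is an assumption, since it is not stated here:

```python
# Iterated learning with Gaussian hypotheses: mu_0 = 0, sigma_0^2 = 1, x_0 = 20.
import numpy as np

mu0, var0 = 0.0, 1.0      # prior over the mean
var_x = 1.0               # assumed production noise
x = 20.0                  # initial data
rng = np.random.default_rng(3)

for n in range(10):
    # Learner: posterior over mu given a single observation x
    post_var = 1.0 / (1.0 / var0 + 1.0 / var_x)
    post_mean = post_var * (mu0 / var0 + x / var_x)
    mu = rng.normal(post_mean, np.sqrt(post_var))   # sample hypothesis from posterior
    # Teacher: produce data for the next learner
    x = rng.normal(mu, np.sqrt(var_x))
    print(f"iteration {n + 1}: x = {x:.2f}")
# x quickly falls back toward the prior predictive distribution (mean 0, variance 2)
```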

27 An example: linear regression
Assume
–data, d, are pairs of real numbers (x, y)
–hypotheses, h, are functions
An example: linear regression
–hypotheses have slope θ and pass through the origin
–p(θ) is Gaussian(θ_0, σ_0^2)
(Figure: a line through the origin; its height y at x = 1.)

28 An example: linear regression
θ_0 = 1, σ_0^2 = 0.1, y_0 = -1
(Figure: iterated learning results for these parameters.)
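A minimal simulation sketch of the regression example with the slide's parameters; the observation noise (0.1) and the probe location x = 1 are assumptions for illustration:

```python
# Iterated learning of a one-parameter linear function (slope through the origin).
import numpy as np

theta0, var0 = 1.0, 0.1   # prior over the slope
var_y = 0.1               # assumed noise on y
x_probe = 1.0             # each learner sees one (x, y) pair at x = 1
y = -1.0                  # initial data
rng = np.random.default_rng(4)

for n in range(10):
    # Posterior over the slope given the single pair (x_probe, y)
    post_var = 1.0 / (1.0 / var0 + x_probe**2 / var_y)
    post_mean = post_var * (theta0 / var0 + x_probe * y / var_y)
    theta = rng.normal(post_mean, np.sqrt(post_var))   # sample slope from posterior
    y = rng.normal(theta * x_probe, np.sqrt(var_y))    # produce data for the next learner
    print(f"iteration {n + 1}: y = {y:.2f}")
# y drifts back toward the prior predictive centred on theta_0 * x_probe = 1
```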

29 An example: compositionality
(Figure: a language is a function from events (x) to utterances (y); in a compositional language, "agents" and "actions" in events map to "nouns" and "verbs" in utterances.)

30 An example: compositionality
Data: m event-utterance pairs
Hypotheses: languages, with error ε
(Figure: the prior P(h) over compositional and holistic languages.)

31 Analysis technique
1. Compute the transition matrix on languages
2. Sample Markov chains
3. Compare language frequencies with the prior
(can also compute eigenvalues etc.)
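A minimal sketch of these three steps, with made-up distributions standing in for the actual language model (the real prior over compositional and holistic languages, ε, and m are not reproduced here):

```python
# Three-step analysis: transition matrix, sampled chain, comparison with the prior.
import numpy as np

rng = np.random.default_rng(5)
n_lang, n_data = 4, 6

prior = np.array([0.4, 0.3, 0.2, 0.1])        # P(h): prior over languages
like = rng.random((n_lang, n_data))           # like[h, d] = P(d | h)
like /= like.sum(axis=1, keepdims=True)

# 1. Compute the transition matrix on languages
joint = prior[:, None] * like                 # joint[h, d] = P(h) P(d | h)
post = joint / joint.sum(axis=0)              # post[h, d] = P(h | d)
Q = post @ like.T                             # Q[h_new, h_old] = sum_d P(h_new | d) P(d | h_old)

# 2. Sample a Markov chain of languages
h, counts = 0, np.zeros(n_lang)
for t in range(50_000):
    d = rng.choice(n_data, p=like[h])         # produce data from the current language
    h = rng.choice(n_lang, p=post[:, d])      # next learner infers a language
    counts[h] += 1

# 3. Compare language frequencies with the prior (eigenvalues also available)
print("chain frequencies:", counts / counts.sum())
print("prior:            ", prior)
print("|eigenvalues| of Q:", np.sort(np.abs(np.linalg.eigvals(Q)))[::-1])
```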

32 Convergence to priors
(Figure: chain and prior language frequencies across iterations, with ε = 0.05 and m = 3; the two panels vary the prior parameter, 0.50 vs. 0.01.)
Effect of the prior

33 The information bottleneck
(Figure: chain and prior language frequencies across iterations for bottleneck sizes m = 1, 3, and 10.)
No effect of the bottleneck

34 The information bottleneck
The bottleneck affects the relative stability of languages favored by the prior

35 Outline
1. Analyzing iterated learning
2. Iterated Bayesian learning
3. Examples
4. Iterated learning with humans
5. Conclusions and open questions

36 A method for discovering priors
Iterated learning converges to the prior…
…so the prior can be evaluated by reproducing iterated learning in the lab

37 Iterated function learning
Each learner sees a set of (x, y) pairs
Makes predictions of y for new x values
Predictions are the data for the next learner

38 Function learning in the lab
(Display: stimulus, response slider, and feedback.)
Examine iterated learning with different initial data

39 (Figure: human iterated function learning across iterations 1-9, starting from different initial data; Kalish, 2004.)

40 Outline
1. Analyzing iterated learning
2. Iterated Bayesian learning
3. Examples
4. Iterated learning with humans
5. Conclusions and open questions

41 Conclusions and open questions
Iterated Bayesian learning converges to the prior
–properties of languages are properties of learners
–the information bottleneck doesn't affect the equilibrium
What about other learning algorithms?
What determines rates of convergence?
–amount and structure of the input data
What happens with people?


