1
Analyzing cultural evolution by iterated learning
Tom Griffiths, Department of Psychology & Cognitive Science Program, UC Berkeley
2
Inductive problems
Learning languages from utterances (e.g., "blicket toma", "dax wug", "blicket wug", generated by a grammar S → X Y, X → {blicket, dax}, Y → {toma, wug})
Learning categories from instances of their members
Learning functions from (x, y) pairs
3
Learning data
5
Iterated learning (Kirby, 2001)
What are the consequences of learners learning from other learners?
6
Outline Part I: Formal analysis of iterated learning
Part II: Iterated learning in the lab
7
Outline Part I: Formal analysis of iterated learning
Part II: Iterated learning in the lab
8
Objects of iterated learning
How do constraints on learning (inductive biases) influence cultural universals?
9
Language
The languages spoken by humans are typically viewed as the result of two factors:
individual learning
innate constraints (biological evolution)
This limits the possible explanations for different kinds of linguistic phenomena
10
Linguistic universals
Human languages possess universal properties, e.g. compositionality (Comrie, 1981; Greenberg, 1963; Hawkins, 1988)
Traditional explanation: linguistic universals reflect strong innate constraints specific to a system for acquiring language (e.g., Chomsky, 1965)
11
Cultural evolution
Languages are also subject to change via cultural evolution (through iterated learning)
Alternative explanation: linguistic universals emerge from the fact that language is learned anew by each generation, using general-purpose learning mechanisms that express weak constraints on languages (e.g., Briscoe, 1998; Kirby, 2001)
12
Analyzing iterated learning
[Diagram: a chain of learners; each infers a hypothesis from data via PL(h|d), then generates data for the next learner via PP(d|h)]
PL(h|d): probability of inferring hypothesis h from data d
PP(d|h): probability of generating data d from hypothesis h
13
Markov chains
Variables: x(t+1) is independent of the history given x(t)
Transition matrix: P(x(t+1) | x(t))
Converges to a stationary distribution under easily checked conditions (i.e., if the chain is ergodic)
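A minimal numerical sketch of this claim (Python, with a made-up three-state transition matrix, not anything from the talk): iterating an ergodic chain converges to its stationary distribution, which can also be read off as the eigenvector of the transition matrix with eigenvalue 1.

```python
import numpy as np

# Hypothetical 3-state transition matrix T[i, j] = P(x(t+1)=j | x(t)=i).
# Rows sum to 1; the chain is ergodic, so it has a unique stationary distribution.
T = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

p = np.array([1.0, 0.0, 0.0])   # start with all probability mass on state 0
for _ in range(100):
    p = p @ T                   # one step of the chain

print("distribution after 100 steps:", p)

# The stationary distribution is the left eigenvector of T with eigenvalue 1.
vals, vecs = np.linalg.eig(T.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi /= pi.sum()
print("stationary distribution:", pi)
```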
14
Analyzing iterated learning
[Diagram: h1 → d1 → h2 → d2 → h3 → …, alternating PL(h|d) and PP(d|h)]
A Markov chain on hypotheses: h1 → h2 → h3 → …, with transition probability Σd PP(d|h) PL(h'|d)
A Markov chain on data: d0 → d1 → d2 → …, with transition probability Σh PL(h|d) PP(d'|h)
15
Bayesian inference (portrait: Reverend Thomas Bayes)
16
Bayes’ theorem
P(h|d) = P(d|h) P(h) / Σh' P(d|h') P(h')
(posterior probability = likelihood × prior probability, normalized by a sum over the space of hypotheses)
h: hypothesis, d: data
17
A note on hypotheses and priors
No commitment to the nature of hypotheses
neural networks (Rumelhart & McClelland, 1986)
discrete parameters (Gibson & Wexler, 1994)
Priors do not necessarily represent innate constraints specific to language acquisition
not innate: can reflect independent sources of data
not specific: general-purpose learning algorithms also have inductive biases expressible as priors
18
Iterated Bayesian learning
[Diagram: learners alternate between inferring hypotheses via PL(h|d) and generating data via PP(d|h)]
Assume learners sample from their posterior distribution:
PL(h|d) = PP(d|h) P(h) / Σh' PP(d|h') P(h')
19
Stationary distributions
Markov chain on h converges to the prior, P(h) Markov chain on d converges to the “prior predictive distribution” (Griffiths & Kalish, 2005)
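A toy simulation of this result, under assumptions of my own (two hypotheses, five binary tokens per generation, an uneven prior), not the talk's actual languages: with learners who sample from their posterior, the hypotheses visited by the chain end up distributed according to the prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: two hypotheses h0, h1 that emit a binary token with
# different probabilities, and an uneven prior over them.
prior = np.array([0.8, 0.2])        # P(h)
p_token = np.array([0.3, 0.9])      # PP(token = 1 | h)
n_tokens = 5                        # each learner sees 5 tokens

def sample_posterior(k):
    # PL(h|d) proportional to PP(d|h) P(h), where d contains k ones out of n_tokens
    like = p_token**k * (1 - p_token)**(n_tokens - k)
    post = like * prior
    post /= post.sum()
    return rng.choice(2, p=post)

h = 0
visited = []
for generation in range(50_000):
    k = rng.binomial(n_tokens, p_token[h])   # PP(d|h): generate data
    h = sample_posterior(k)                  # PL(h|d): next learner samples
    visited.append(h)

# The visited hypotheses are distributed (approximately) according to the prior.
print("empirical distribution over hypotheses:",
      np.bincount(visited) / len(visited))   # close to [0.8, 0.2]
```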
20
Explaining convergence to the prior
Intuitively: the data acts once, the prior acts many times
Formally: iterated learning with Bayesian agents is a Gibbs sampler on P(d,h) (Griffiths & Kalish, 2007)
21
Gibbs sampling
For variables x = x1, x2, …, xn:
Draw xi(t+1) from P(xi | x-i), where x-i = x1(t+1), x2(t+1), …, xi-1(t+1), xi+1(t), …, xn(t)
Converges to P(x1, x2, …, xn) (Geman & Geman, 1984)
(a.k.a. the heat bath algorithm in statistical physics)
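A generic Gibbs-sampling sketch (a bivariate Gaussian chosen purely for illustration, not from the talk): alternately resampling each variable from its conditional distribution yields samples from the joint distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative target: a bivariate Gaussian with unit variances and
# correlation rho. Each conditional P(xi | x-i) is a 1-D Gaussian.
rho = 0.8
n_samples = 50_000

x1, x2 = 0.0, 0.0
samples = np.empty((n_samples, 2))
for t in range(n_samples):
    # Draw x1 from P(x1 | x2), then x2 from P(x2 | x1) using the updated x1.
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))
    samples[t] = (x1, x2)

print("sample correlation:", np.corrcoef(samples.T)[0, 1])  # close to 0.8
```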
22
Gibbs sampling (MacKay, 2003)
23
Explaining convergence to the prior
When the target distribution is P(d,h) = PP(d|h) P(h), the conditional distributions are PL(h|d) and PP(d|h)
24
Implications for linguistic universals
When learners sample from P(h|d), the distribution over languages converges to the prior
This identifies a one-to-one correspondence between inductive biases and linguistic universals
25
Iterated Bayesian learning
[Diagram: learners alternate between inferring hypotheses via PL(h|d) and generating data via PP(d|h)]
Assume learners sample from their posterior distribution:
PL(h|d) = PP(d|h) P(h) / Σh' PP(d|h') P(h')
26
From sampling to maximizing
27
From sampling to maximizing
Assume learners choose a hypothesis with probability proportional to the posterior raised to a power r (r = 1 recovers sampling; r → ∞ recovers choosing the most probable hypothesis)
General analytic results are hard to obtain (r = ∞ is Monte Carlo EM with a single sample)
For certain classes of languages, it is possible to show that:
the stationary distribution gives each hypothesis h probability proportional to P(h)^r
the ordering identified by the prior is preserved, but not the corresponding probabilities
(Kirby, Dowman, & Griffiths, 2007)
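A sketch of the effect described here, using an assumed two-hypothesis "noisy copy" setup of my own rather than the language classes analyzed by Kirby, Dowman & Griffiths (2007): learners choose h with probability proportional to the posterior raised to the power r, and the stationary distribution is computed exactly from the resulting transition matrix. In this toy setup the result is not exactly proportional to P(h)^r, but the printed values show the prior's influence growing with r.

```python
import numpy as np

# Toy "noisy copy" setup (hypothetical parameters): two hypotheses, a single
# binary datum that matches the hypothesis with probability 0.7. Learners
# choose h with probability proportional to the posterior raised to the
# power r (r = 1: sampling; large r: close to MAP).
prior = np.array([0.8, 0.2])
lik = np.array([[0.7, 0.3],      # lik[h, d] = PP(d | h)
                [0.3, 0.7]])

def stationary(r):
    post = (lik.T * prior) ** r              # unnormalized PL_r(h | d), rows indexed by d
    post /= post.sum(axis=1, keepdims=True)
    T = lik @ post                           # T[h, h'] = sum_d PP(d|h) PL_r(h'|d)
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return pi / pi.sum()

for r in [1, 2, 5, 20]:
    print(f"r = {r:2d}  stationary distribution over hypotheses:", stationary(r))
# r = 1 reproduces the prior [0.8, 0.2]; larger r pushes the distribution
# further toward the hypothesis favored by the prior.
```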
28
Implications for linguistic universals
When learners sample from P(h|d), the distribution over languages converges to the prior
This identifies a one-to-one correspondence between inductive biases and linguistic universals
As learners move towards maximizing, the influence of the prior is exaggerated
weak biases can produce strong universals
cultural evolution is a viable alternative to traditional explanations for linguistic universals
29
Infinite populations in continuous time
“Language dynamical equation”
“Neutral model”: fj(x) constant
Stable equilibrium at the first eigenvector of Q, which is our stationary distribution
(Nowak, Komarova, & Niyogi, 2001; Komarova & Nowak, 2003)
30
Analyzing iterated learning
The outcome of iterated learning is strongly affected by the inductive biases of the learners
hypotheses with high prior probability ultimately appear with high probability in the population
Clarifies the connection between constraints on language learning and linguistic universals…
…and provides formal justification for the idea that culture reflects the structure of the mind
31
Outline Part I: Formal analysis of iterated learning
Part II: Iterated learning in the lab
32
Inductive problems
Learning languages from utterances (e.g., "blicket toma", "dax wug", "blicket wug", generated by a grammar S → X Y, X → {blicket, dax}, Y → {toma, wug})
Learning categories from instances of their members
Learning functions from (x, y) pairs
33
Revealing inductive biases
Many problems in cognitive science can be formulated as problems of induction
learning languages, concepts, and causal relations
Such problems are not solvable without bias (e.g., Goodman, 1955; Kearns & Vazirani, 1994; Vapnik, 1995)
What biases guide human inductive inferences?
If iterated learning converges to the prior, then it may provide a method for investigating biases
34
Serial reproduction (Bartlett, 1932)
35
General strategy
Step 1: use well-studied and simple tasks for which people’s inductive biases are known
function learning
concept learning
Step 2: explore learning problems where the effects of inductive biases are controversial
frequency distributions
systems of color terms
36
Iterated function learning
Each learner sees a set of (x, y) pairs
Makes predictions of y for new x values
Predictions are data for the next learner
(data: the (x, y) pairs; hypotheses: functions relating x to y)
(Kalish, Griffiths, & Lewandowsky, 2007)
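A sketch of such a transmission chain under a deliberately simple assumed setup (Bayesian linear regression with a Gaussian prior on intercept and slope, not the actual experimental functions): each learner fits the previous learner's predictions, samples a function from its posterior, and passes on noisy predictions at the same x values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed stand-in for the experiment: hypotheses are lines y = w0 + w1*x,
# with a Gaussian prior on the weights (intercept near 0, slope near 1).
prior_mean = np.array([0.0, 1.0])
prior_cov = np.eye(2)
noise_var = 1.0
x = np.linspace(0, 1, 5)
X = np.column_stack([np.ones_like(x), x])    # design matrix [1, x]

w = np.array([3.0, -2.0])                    # arbitrary starting function
visited = []
for generation in range(5000):
    y = X @ w + rng.normal(0, np.sqrt(noise_var), size=len(x))        # PP(d|h)
    prec = np.linalg.inv(prior_cov) + X.T @ X / noise_var             # posterior precision
    cov = np.linalg.inv(prec)
    mean = cov @ (np.linalg.inv(prior_cov) @ prior_mean + X.T @ y / noise_var)
    w = rng.multivariate_normal(mean, cov)   # PL(h|d): sample a hypothesis
    visited.append(w)

visited = np.array(visited)
print("mean of visited (intercept, slope):", visited.mean(axis=0))  # close to prior mean
print("std  of visited (intercept, slope):", visited.std(axis=0))   # close to prior std (1, 1)
```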
37
Function learning experiments
[Experimental display: stimulus, response slider, feedback]
Examine iterated learning with different initial data
38
[Results figure: functions produced across iterations, for different initial data]
39
Iterated concept learning
Each learner sees examples from a species
Identifies the species of four amoebae
Species correspond to Boolean concepts
(data: labeled examples; hypotheses: Boolean concepts)
(Griffiths, Christian, & Kalish, 2006)
40
Types of concepts (Shepard, Hovland, & Jenkins, 1961)
[Figure: stimuli vary in color, size, and shape; concept Types I–VI]
41
Results of iterated learning
[Figure: probability of each concept type across iterations, for human learners and the Bayesian model]
42
Frequency distributions
Each learner sees objects receiving two labels (e.g., "DUP" 5 times and "NEK" 5 times)
Produces labels for those objects at test
First learner: one label occurs {0, 1, 2, 3, 4, 5}/10 times
(data: the observed labels; hypotheses: P("DUP" | object))
(Vouloumanos, 2008; Reali & Griffiths, submitted)
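A sketch of iterated learning of a two-label frequency distribution, with an assumed Beta prior on the probability of the first label (the actual human prior is what these experiments aim to reveal): with a prior that favors regularity (alpha < 1), chains tend to drift toward using mostly one label.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed Beta(alpha, alpha) prior on theta, the probability of producing
# "DUP"; learners sample theta from their posterior, then produce n tokens.
alpha = 0.5      # hypothetical prior; alpha < 1 favors regular, one-label languages
n = 10           # label tokens transmitted per generation

def run_chain(k0, generations=100):
    k = k0                                           # count of "DUP" among the n tokens
    for _ in range(generations):
        theta = rng.beta(alpha + k, alpha + n - k)   # PL(h|d): posterior sample
        k = rng.binomial(n, theta)                   # PP(d|h): produce new data
    return k

# First learner sees "DUP" 5 out of 10 times; many chains drift toward
# using (almost) only one of the two labels.
finals = [run_chain(k0=5) for _ in range(1000)]
print("fraction of chains ending with 0 or 10 'DUP' tokens:",
      np.mean([k in (0, n) for k in finals]))
```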
43
Results after one generation
[Figure: frequency of the target label after one generation, by condition]
44
Results after five generations
[Figure: frequency of the target label after five generations]
45
The Wright-Fisher model
Basic model: x copies of gene A in population of N With mutation…
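A minimal Wright-Fisher sketch (parameters are my own, and the mutation scheme here is symmetric at rate u, which may differ from the slide's): each generation the N offspring draw their allele by binomial sampling from the post-mutation frequency.

```python
import numpy as np

rng = np.random.default_rng(4)

# x copies of allele A in a population of size N; each generation the N
# offspring sample their allele from the current frequency, with a small
# symmetric mutation probability u in each direction.
N, u = 100, 0.01
x = 50                                   # initial number of A copies

trajectory = []
for generation in range(500):
    p = x / N
    p_mut = p * (1 - u) + (1 - p) * u    # allele frequency after mutation
    x = rng.binomial(N, p_mut)           # binomial resampling (genetic drift)
    trajectory.append(x)

print("final count of A:", x)
print("mean frequency of A over the run:", np.mean(trajectory) / N)
```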
46
Iterated learning and Wright-Fisher
Basic model is MAP learning with a uniform prior; mutation model is MAP learning with a Beta prior
Extends to other models of genetic drift…
connection between drift models and inference
(with Florencia Reali)
47
Cultural evolution of color terms
with Mike Dowman and Jing Xu
48
Identifying inductive biases
Formal analysis suggests that iterated learning provides a way to determine inductive biases
Experiments with human learners support this idea
when stimuli for which biases are well understood are used, those biases are revealed by iterated learning
What do inductive biases look like in other cases?
continuous categories
causal structure
word learning
language learning
49
Conclusions
Iterated learning provides a lens for magnifying the inductive biases of learners
small effects for individuals are big effects for groups
When cognition affects culture, studying groups can give us better insight into individuals
50
Credits
Joint work with: Brian Christian, Mike Dowman, Mike Kalish, Simon Kirby, Steve Lewandowsky, Florencia Reali, Jing Xu
Computational Cognitive Science Lab