1 LING 696B: Computational Models of Phonological Learning Ying Lin Department of Linguistics University of Arizona.

2 Instructor Education: Beijing University (Information Science), UCLA (Math, Linguistics) Research interest: Computational linguistics Theories and models of learning Phonetics and phonology Language evolution

3 And you? Email me: Name Background / department / what classes you have taken, etc. Why are you interested in this course? Topics that you would like to be discussed (cf. the shopping list in the syllabus) Rescheduling requests (office hours, etc.)

4 Outline Motivation / inspiration: Why study learning? Why use computational models? Phonological acquisition in the first few years of life Statistical learning by infants and by machines Course business

5 Why study learning? Phonology: study of the knowledge of sound patterns (LING 510) Inventories, contrasts, features Phonotactics: blick, not bnick or rbick Alternations: in + possible -> impossible Traditional methodology: identify these patterns by eye, construct grammars Implicitly assuming these patterns are equally important for the learner

6 Why study learning? A different perspective: modeling speakers instead of languages Take a grammar, and ask: If this is what the speaker knows about her language, how can she learn the grammar from the evidence available to her? This can potentially be a test for theories

7 Why study learning? Phonetics: perception, production and acoustic properties of speech sounds (LING 515) Traditional division of labor: Phonology: dealing with form, symbolic Phonetics: dealing with substance, numerical The “interface” problem: how should they talk to each other?

8 Why study learning? The issue of representation 1. How is the speech signal encoded in the speaker’s mind? 2. How do infants “crack the speech code”, and become someone who can use the code? Phonology has provided many conjectures for Q1, but not enough work has been done to address Q2

9 Alternative methodologies Experimental work Extending knowledge of existing patterns to novel forms (many in this department) Miniature / artificial languages that contain relevant patterns of interest (Gerken, Gomez) This class -- computational work

10 Why computational models? Intuitive ideas about learning are often vague Inductive learning: can humans learn any type of generalization? Answer: No The most convincing arguments are computational ones

11 Why computational models? Universal Grammar: the crucial initial bias / hypothesis space that a learner needs to know Computational modeling quantifies such a bias, and makes it assessable with empirical data

12 A quick overview of phonological acquisition Prenatal to newborn 4-day-old French babies prefer listening to French over Russian (Mehler et al, 1988) Prepared to learn any phonetic contrast in human languages (Eimas, 71)

13 A quick overview of phonological acquisition 6 months: Effects of experience begin to surface (Kuhl, 92) 6 - 12 months: from a universal perceiver to a language-specific one (Werker & Tees, 84)

14 A quick overview of phonological acquisition 6 - 9 months: infants know the sequential patterns of their language English vs. Dutch (Jusczyk, 93) “avoid” vs. “waardig” Weird English vs. better English (Jusczyk, 94) “mim” vs. “thob” Impossible Dutch vs. Dutch (Friederici, 93) “bref” vs. “febr”

15 A quick overview of phonological acquisition 8 months: segment words out of the speech stream based on transitional probabilities (Saffran et al, 96) pabikutibudogolatupabikudaropi… 8 months: remember words heard in stories read to them by graduate students (Jusczyk and Hohne, 97)
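The segmentation idea on this slide can be illustrated with a small sketch. It is not the procedure from Saffran et al; it only shows how transitional probabilities between syllables could mark word boundaries. The four-word lexicon is read off the syllable stream on the slide, and the fixed two-letter syllables, the word order, and the 0.9 threshold are all assumptions made for this toy.

```python
from collections import Counter

# Hypothetical four-word lexicon, read off the syllable stream on the slide.
WORDS = ["pabiku", "tibudo", "golatu", "daropi"]
# Hypothetical word order: each word is followed by each of the other three once.
ORDER = [0, 1, 2, 0, 3, 1, 3, 2, 1, 0, 2, 3, 0]


def build_stream(words=WORDS, order=ORDER):
    """Concatenate words into one unsegmented stream, as the infant would hear it."""
    return "".join(words[i] for i in order)


def syllabify(stream, size=2):
    """Split the stream into fixed-size CV syllables (a toy assumption)."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def transitional_probabilities(syllables):
    """Estimate P(next | current) as count(current, next) / count(current)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, tp, threshold=0.9):
    """Place a word boundary wherever the transitional probability dips below threshold."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


if __name__ == "__main__":
    syllables = syllabify(build_stream())
    tp = transitional_probabilities(syllables)
    print(segment(syllables, tp))
```

In this toy stream every within-word transition (e.g. pa to bi) has probability 1, while every transition across a word boundary (e.g. ku to ti) has probability 1/3, so thresholding recovers the four words. Real speech is far less clean, which is part of why the problem is interesting.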

16 Mechanism: Statistical learning by infants Learning phonetic categories from bi-modal distributions (Maye, Gerken & Werker, 02) Phonotactics (Jusczyk et al, 93) Word segmentation (Saffran et al, 96)
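As an illustration of what learning categories from a bi-modal distribution could mean for a machine, here is a minimal sketch that fits two Gaussian categories to one-dimensional data with the EM algorithm. It is not the method or the stimuli of Maye, Gerken & Werker; the function names, the choice of a Gaussian mixture, and the assumption that there are exactly two categories are all made up for this example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_categories(data, iterations=50):
    """EM for a two-component 1-D Gaussian mixture.

    Returns [(weight, mean, sd), (weight, mean, sd)], lower category first.
    The number of categories (two) is assumed here, not learned.
    """
    lo, hi = min(data), max(data)
    spread = (hi - lo) / 4 or 1.0
    # Start with one category at each extreme of the data.
    params = [(0.5, lo, spread), (0.5, hi, spread)]
    for _ in range(iterations):
        # E-step: how responsible is category 0 for each token?
        resp = []
        for x in data:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate weight, mean and sd of each category.
        updated = []
        for k in (0, 1):
            r = [ri if k == 0 else 1.0 - ri for ri in resp]
            total = sum(r)
            mean = sum(ri * x for ri, x in zip(r, data)) / total
            var = sum(ri * (x - mean) ** 2 for ri, x in zip(r, data)) / total
            updated.append((total / len(data), mean, max(math.sqrt(var), 1e-3)))
        params = updated
    return params
```

Given tokens drawn from two well-separated clusters along some phonetic dimension, the two fitted means should settle near the cluster centers. With a uni-modal input the same procedure still returns two categories, because the number of categories is built in; that is one example of a bias the modeler has to state explicitly.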

17 Statistical learning by machines A booming enterprise, influencing a number of fields Computer Science Electrical Engineering (signal processing, communication, control, …) Biology Has generated sophisticated tools Motivated by real applications

18 What do those machines have to do with humans? Two kinds of statistical machines 1. Architecture is based on an understanding of the domain. Variables and interactions have clear meanings 2. An input-output device that can be easily applied to any domain, by turning various knobs [Diagram: machine. Input: text, signal, parse trees, … Output: yes/no or some score]

19 Why do we need statistics in the model? Sensory input to the learner is often noisy and ambiguous, and contains much variation The language of probability is the only coherent way of reasoning about uncertainty. The statistical perspective unifies a number of proposals, e.g. Rules vs. analogy Exemplars vs. categories

20 Statistics and UG My take on this: not an “either-or”, but a “both-and” relationship Statistics help build models that can be tested on realistic data Strong / weak assumptions about UG lead to different models Weak bias: exemplars, neural nets Stronger bias: Markov chain phonotactics Even stronger bias: stochastic OT
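To make "Markov chain phonotactics" concrete, here is a minimal sketch of a first-order (bigram) model with add-one smoothing. It treats letters as stand-ins for segments and is trained on a tiny made-up word list, so the lexicon, the function names, and the smoothing choice are all assumptions for illustration, not a model from the literature cited in this course.

```python
import math
from collections import Counter

BOUNDARY = "#"

# Hypothetical toy lexicon; letters stand in for segments.
LEXICON = ["black", "blink", "block", "brick", "click",
           "lick", "nick", "snack", "sick", "bring"]


def train_bigram(words):
    """Count segment bigrams, with a boundary symbol at both word edges."""
    bigrams, contexts, alphabet = Counter(), Counter(), {BOUNDARY}
    for word in words:
        symbols = [BOUNDARY] + list(word) + [BOUNDARY]
        alphabet.update(symbols)
        for a, b in zip(symbols, symbols[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts, alphabet


def log_prob(word, model):
    """Log probability of a word under the add-one smoothed bigram chain."""
    bigrams, contexts, alphabet = model
    symbols = [BOUNDARY] + list(word) + [BOUNDARY]
    v = len(alphabet)
    return sum(math.log((bigrams[(a, b)] + 1) / (contexts[a] + v))
               for a, b in zip(symbols, symbols[1:]))


if __name__ == "__main__":
    model = train_bigram(LEXICON)
    for form in ("blick", "bnick", "rbick"):
        print(form, round(log_prob(form, model), 2))
```

On this toy lexicon the model ranks "blick" above "bnick" and "bnick" above "rbick", echoing the phonotactic judgments mentioned earlier in the lecture. The bias is visible in the architecture: the model can only ever count adjacent pairs of segments.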

21 Statistics and UG But, given a set of data Lots of generalizations can be made Lots of descriptive statistics can be counted You must know what to count in order to draw a meaningful conclusion

22 The basic steps of computational modeling Identifying and formulating a problem, with your theoretical commitments Gathering enough data that mirrors a learner’s input This is often in the form of a corpus, searchable by machine Providing an algorithm, carrying out computation, assessing results

23 The basic steps of computational modeling Identifying and formulating a problem Gathering enough data that mirrors a learner’s input This is often in the form of a corpus, searchable by machine Providing an algorithm, carrying out computation, assessing results We will do lots of this in LING 539!

24 Some examples (details to follow later) Learning phonological rules from morphological paradigms (Albright & Hayes, 03) Requires pairs of words e.g. spling / splung -> i -> u / spl_ng But are these words already on the workbench?
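The spling / splung example can be sketched as a small string operation: factor a pair of words into a change and its context by stripping the longest shared prefix and suffix. This covers only the word-specific first step and is not the Albright & Hayes learner itself, which goes on to generalize over many such rules; the function names are hypothetical and letters stand in for segments.

```python
def extract_rule(base, derived):
    """Factor a word pair into (change_from, change_to, left_context, right_context)."""
    shortest = min(len(base), len(derived))
    # Longest shared prefix.
    p = 0
    while p < shortest and base[p] == derived[p]:
        p += 1
    # Longest shared suffix that does not overlap the prefix.
    s = 0
    while s < shortest - p and base[len(base) - 1 - s] == derived[len(derived) - 1 - s]:
        s += 1
    return (base[p:len(base) - s], derived[p:len(derived) - s],
            base[:p], base[len(base) - s:])


def format_rule(change_from, change_to, left, right):
    """Render the rule in the A -> B / C_D notation used on the slide."""
    return f"{change_from} -> {change_to} / {left}_{right}"


if __name__ == "__main__":
    print(format_rule(*extract_rule("spling", "splung")))
```

For the pair on the slide this yields the rule i -> u / spl_ng. Note that the sketch presupposes the learner already has the two words paired up, which is exactly the assumption the slide goes on to question.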

25 Some examples (details to follow later) Not most of the time: Maybe they are chopped out of longer stretches of sequences (Brent, 96) Chopping is better done with some idea of what sound sequences are allowed (Brent, Yang) Phonotactic learning can be modeled with OT (Hayes, 99) [Diagram: the stream "dujulaIkD  kIti" is chopped into "Do you like the kitty?"]

26 Some examples (details to follow later) Not most of the time: Maybe they are chopped out of longer stretches of sequences (Brent, 96) Chopping is better done with some idea of what sound sequences are allowed (Brent, Yang) Phonotactic learning can be modeled with OT (Hayes, 99) [Diagram: the stream "dujulaIkD  kIti" is chopped into "Do you like the kitty?"] But wait, is this what it sounds like to toddlers?

27 Some examples (details to follow later) Probably not when they are young: Maybe they have to form categories first by chopping up waveforms that they have heard (Lin, 05)

28 Some examples (details to follow later) Phonological acquisition is a big problem Infants -- the best learning machines around -- take in a large amount of input The whole picture will likely consist of many interacting parts We may have to focus on one smaller problem at a time [Diagram: units, phonotactics, lexicon, morphology]

29 Course business Format of the course: Lectures In-class presentations Guest lectures Requirements: Readings: will be posted on the webpage (coming soon) 4 assignments (exercises / short papers) or a term project

30 Course business Email me (yinglin@email.arizona.edu): Name Background / department / what classes you have taken, etc. Why are you interested in this course? Topics that you would like to be discussed (cf. the shopping list in the syllabus) Rescheduling requests (office hours, etc.)
