Detecting evolutionary forces in language change (2017)

Presentation transcript:

Detecting evolutionary forces in language change (2017) Authors: Mitchell G. Newberry, Christopher A. Ahern, et al. Presenter: Chen Chang (Peter) r05945030 Date: 2017/12/26

Linguistics and Evolutionary Biology Competition between forms Syntactic structure, phonetics, morphology, etc. Selective mechanism Selection Stochastic Drift

A null model of language change Stochastic drift, random fluctuations in the frequencies of alternative forms, can accumulate to produce substantial change over time. Evidence of directional force

Three language changes of interest Development of the morphological past tense in contemporary American English (spilt → spilled) The rise of the periphrastic ‘do’ in Early Modern English (You say not → You do not say) Jespersen’s cycle of sentential negation in Middle English (Ic ne secge → I ne seye not → I say not)

Materials and Methods Corpus Data: annotated texts that range in time from the Norman conquest of England to the 21st century. Methods: Compare the frequencies of alternative linguistic variants over time to predictions under the Wright-Fisher model of neutral stochastic drift.
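
The null model named on this slide can be sketched in a few lines. This is only an illustration of neutral Wright-Fisher drift, not the authors' code: the function name, the parameters (`pop_size`, `initial_freq`, `generations`, `seed`) and the choice to resample every token independently each generation are assumptions made for the example.

```python
import random


def simulate_wright_fisher(pop_size, initial_freq, generations, seed=None):
    """Simulate neutral drift of one variant's frequency (hypothetical helper).

    Each generation, every one of the pop_size tokens copies the variant
    with probability equal to its current frequency, so no form is favored.
    """
    rng = random.Random(seed)
    freq = initial_freq
    trajectory = [freq]
    for _ in range(generations):
        # Binomial resampling: count how many tokens pick the variant.
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = count / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    # Two runs that differ only in population size (toy values).
    print(simulate_wright_fisher(50, 0.5, 10, seed=1))
    print(simulate_wright_fisher(5000, 0.5, 10, seed=1))
```

Even without any directional force, a trajectory from this sketch can wander far from its starting value, which is why observed change alone is not evidence of selection.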

Frequency Increment Test (FIT) The authors first apply a transformation that produces homoscedastic frequency increments under the null hypothesis of stochastic drift. The FIT then tests the null hypothesis of neutral drift against the alternative hypothesis that some directional force influences the course of evolution. Directional forces (selection, mutation) vs. neutral stochastic drift

Past-tense verb conjugation Verb Selection Corpus of Historical American English; tag Lemmas with two past-tense variants, each occurring at least 50 times Post processing Rare vs. Common A two-sided P value is computed to test whether neutral stochastic drift can be rejected
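
The selection rule on this slide (keep lemmas whose past tense has two variants, each attested at least 50 times) might be applied as in the sketch below. The input format, a list of `(lemma, past_tense_form)` pairs, the function name `polymorphic_lemmas`, and the reading "at least two variants above the threshold" are all assumptions. The toy counts are invented for illustration and are not corpus figures.

```python
from collections import Counter, defaultdict


def polymorphic_lemmas(tokens, min_count=50):
    """Select lemmas with at least two past-tense variants above min_count.

    tokens is an iterable of (lemma, past_tense_form) pairs (assumed format).
    Returns {lemma: {form: count}} restricted to the qualifying forms.
    """
    counts = defaultdict(Counter)
    for lemma, form in tokens:
        counts[lemma][form] += 1
    selected = {}
    for lemma, forms in counts.items():
        frequent = {form: c for form, c in forms.items() if c >= min_count}
        if len(frequent) >= 2:
            selected[lemma] = frequent
    return selected


if __name__ == "__main__":
    # Invented toy counts, not taken from any corpus.
    toy = (
        [("dive", "dived")] * 60
        + [("dive", "dove")] * 55
        + [("walk", "walked")] * 200
        + [("sneak", "sneaked")] * 80
        + [("sneak", "snuck")] * 10
    )
    print(polymorphic_lemmas(toy))
```

In this toy input only "dive" qualifies: "walk" has a single form, and the second form of "sneak" falls below the threshold.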

Results 6 Polymorphic verbs, each with nominal P < 0.05

Results Cases where the irregular variant is favored: Lighted → Lit, Waked → Woke, Sneaked → Snuck, Dived → Dove Cases where the regular form is preferred: Wove → Weaved, Smelt → Smelled

Regularization Economy or cognitive ease Trends toward past-tense regularization have been observed, especially for rare words, from Old to Modern English.

Irregularization One possible explanation: rhyming Psychological studies have found speakers willing to copy or invent irregular variants that rhyme with existing irregular verbs. The irregular variant of a polymorphic past-tense verb is favored if similar-sounding irregular verbs are on the rise in the corpus. However, the opposite trend has also been observed.

Drift Can explain most of the cases in Modern English Drift vs. Selection Rare words vs. common words: rare words experience more stochasticity in transmission
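
One way to see why rarity matters, assuming the Wright-Fisher picture used as the null model: suppose a form is carried by $N$ tokens per generation and currently has frequency $x$ (these symbols are illustrative, not notation from the slides). The next-generation count $K'$ is then binomially distributed, which gives

```latex
\[
K' \sim \mathrm{Binomial}(N, x), \qquad
x' = \frac{K'}{N}, \qquad
\mathbb{E}[x'] = x, \qquad
\operatorname{Var}[x'] = \frac{x(1-x)}{N}.
\]
```

The expected frequency does not move, but the variance of the fluctuation scales as $1/N$, so a rarely used verb (small $N$) wanders more from one generation to the next than a common one.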

Do-support Penn Parsed Corpora of Historical English Potential do-support in different contexts: Affirmative questions (Do you…) Negative questions (Don’t you…) Negative declaratives (I don’t…) Negative imperatives (Don’t do…)

Results The rise of the periphrastic ‘do’ was more rapid in negative declarative and imperative statements, for which drift is rejected, than it was in affirmative questions, for which drift isn’t rejected

Results The periphrastic ‘do’ first drifted by chance to high frequency in questions, which then induced a directional bias towards ‘do’ in declarative and imperative statements for reasons of grammatical consistency or cognitive ease.

Jespersen’s cycle Jespersen's Cycle (JC) describes the historical development of the expression of negation in a variety of languages Stage I: negation is expressed by a single pre-verbal element Stage II: both a preverbal and a post-verbal element are obligatory Stage III: the original preverbal element becomes optional or is lost altogether Jespersen's Cycle (JC) is a series of processes in historical linguistics, which describe the historical development of the expression of negation in a variety of languages, from a simple pre-verbal marker of negation, through a discontinuous marker (elements both before and after the verb) and in some cases through subsequent loss of the original pre-verbal marker. Reference: https://en.wikipedia.org/wiki/Jespersen%27s_Cycle

Results Reject neutral drift. This provides statistical support for longstanding hypotheses that changes in verbal negation are driven by directional forces, such as phonetic weakening, or a tendency for speakers to over-use more emphatic forms of negation that then lose emphasis as they become dominant.

Conclusion Combining massive digital corpora with time series inference techniques from population genetics now allows us to disentangle distinct forces that drive language evolution. However, how exactly individual-level cognitive processes in a language learner produce population-level phenomena, such as drift and selection, remains a topic for future research.