Lexical exceptions and lexical representations: a variationist perspective Gregory R. Guy phonoLAM group July 2013.


The problem of lexical scope Some phonological generalizations are valid only for a subset of the lexicon in a given language. The subsets are at least partly defined by lexical identity, not phonological shape. Subsets range in size from small to very large.

Example: English laxing rule /i/ alternates with /ɛ/ in many derived words: serene-serenity, obscene-obscenity, scheme-schematic, spleen-splenetic …but famously fails to alternate in obese-obesity

Related lexical issues for phonology Lexical exceptions Historical borrowings with distinct phonology (e.g., Latinate vocabulary of English, Chinese-origin vocabulary of Japanese) Recent (unassimilated) borrowings; cf. in English ‘Bach’ [bax] Proper names

Lexical exceptions Lexical exceptions are lexical items that exceptionally fail to conform to some generalization found in the rest of the lexicon, or show some phonological pattern that (most) other words do not have.

Lexical exceptions and lexical classes “Exceptions” typically involve just one or a few words. Larger sets of lexical items showing distinct patterns exist in some languages (e.g. Chinese-origin words in Japanese); these are not usually treated as exceptions, but they raise similar issues: how to tie phonological processes to specific words?

Partial lexical scope: The theoretical issue How to associate the statement of the phonological generalization (typically captured by the rule or constraint component of the phonology) with the appropriate lexical set (typically defined in the lexicon)

Two strategies for handling lexically-restricted patterns Word-based: define lexical entries that pre-empt or pre-determine output Phonology-based: constrain processes to apply only to particular lexical subsets

The lexical strategy A lexically-restricted generalization is already encoded in underlying representations, not generated by the phonology Words that fail to show some generalization get URs that block that outcome

The phonological strategy In a rule-based phonology: Exception features: rules are sensitive to features associated with particular lexical items (cf. Chomsky & Halle 1968) Features can trigger or block specific rules Phonological rules are thereby co-indexed with lexical items they apply to

The phonological strategy in OT In a constraint-based approach: Define different constraints for different subsets of the lexicon Co-phonologies: different constraint rankings for different subsets of the lexicon (cf. Inkelas, Ito & Mester, Pater & Coetzee)

Example: Philadelphia /æ/ The TRAP vowel (a.k.a. ‘short a’, /æ/) has tense and lax variants in Philadelphia English: /æ/ is tense before tautosyllabic front nasals and fricatives (e.g., ham, man, half, path, pass) – but mad, bad, glad are also tense – while all other words with following /d/ are not tense (e.g., sad, Dad, had, pad, lad…)

Example: Philadelphia /æ/ Lexical strategy: list mad, bad, glad with tense /æ/ in the lexicon Phonological strategy: the tensing rule can be triggered by an exception feature, which is listed in the lexical entries for mad, bad, glad

How to choose? In the above example, the two approaches to exceptionality make the same predictions, and are essentially notational equivalents Both strategies are evident in early generative phonology Forty years of research has not decided the issue Both strategies survive the transition to constraint-based phonology

The practice In the absence of a theoretical or empirical proof of the superiority of one or the other, the issue has been left undecided Phonologists pick and choose their strategies according to their preference

And yet… The two strategies make quite different claims about mental grammar: – The lexical strategy implies that speakers store lots of detail in the lexicon, even if it is redundant and generalizable – Phonological strategy implies that speakers always seek to maximize the use of generalizations

An obstacle to resolution: The focus on invariant processes A choice between the two strategies is hampered by the focus on invariant processes: obesity always has /i/, serenity always has /ɛ/ No interaction with context

The limitations of an invariant perspective Categorical processes are abrupt In any given context, a unique outcome is expected Hence, they cannot reflect effects of intersecting constraints (in variationist terminology, all constraints are ‘knockouts’)

An alternative approach: look at variable processes Weinreich, Labov and Herzog (1968): “Orderly heterogeneity”: language variation shows systematic quantitative regularities, probabilistically constrained

The insights from variation Variable processes reflect multiple constraints Every item is affected simultaneously by every contextual feature Hence, quantitative patterns can reveal phonological nuances that don’t show up in categorical processes. They may provide solutions to unresolved theoretical problems

Lexical exceptions in variation Many variable processes are known to exhibit unusual frequencies of occurrence in particular lexical items. -e.g., coronal stop deletion in English is exceptionally frequent in ‘and’ (Exceptional because deletion occurs significantly more often in and than in phonologically comparable words like sand, band, hand, etc.)

Table 1. Exceptional and in the ONZE corpus (N and % deletion rate for and vs. other words; cell values not preserved)

Some other examples Final /-s/ deletion in Caribbean Spanish: certain discourse markers have exceptionally high rates of –s absence. – entonce(s) ‘then’, digamo(s), ‘let’s say’. Final /-s/ deletion in Brazilian Portuguese: the first person plural verbal suffix –mos shows an exceptionally high rate of –s absence. – temo(s) ‘we have’; falamo(s) ‘we speak’

How to handle lexical exceptions to variable processes? Phonological strategy: exceptional lexical items have a feature that raises or lowers the probability of a given phonological process occurring in that word. Lexical strategy: exceptional words have distinctive lexical entries that affect the surface frequencies of occurrence of variants.

Background: The VR framework The ‘variable rule’ model of variation treats variable productions as a function of contextual constraint effects (cf. Labov 1968, Cedergren & Sankoff 1973). Each context affecting a phonological process is associated with a probabilistic weight (p_i, p_j, …) expressing that effect. The probability of occurrence of a process in a particular utterance is a logistic function of all relevant contexts.
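The logistic combination described on this slide can be sketched as follows. This is a minimal illustration, assuming the standard logistic variable rule formulation in which log-odds are summed; the function names and all numeric weights are hypothetical and are not taken from the talk.

```python
import math
from typing import Sequence


def logit(p: float) -> float:
    """Convert a probability (0 < p < 1) to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def rule_probability(input_prob: float, factor_weights: Sequence[float]) -> float:
    """Combine an overall input probability with one weight per relevant context.

    Each weight contributes independently on the log-odds scale;
    a weight of 0.5 is neutral, above 0.5 favors the process, below disfavors it.
    """
    total = logit(input_prob) + sum(logit(w) for w in factor_weights)
    return inv_logit(total)


# Hypothetical weights, for illustration only (not estimates from any corpus):
# an input probability of 0.40, a following-segment weight, a morphology weight.
p_before_consonant = rule_probability(0.40, [0.70, 0.60])
p_before_vowel = rule_probability(0.40, [0.30, 0.60])
print(p_before_consonant > p_before_vowel)
```

Because the contributions add on the log-odds scale, changing one weight shifts the outcome without altering the contribution of any other factor, which is the independence property the later slides rely on.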

A VR account of English coronal stop deletion The deletion of final stops in cases like west side > wes’ side, old man > ol’ man is independently affected by – preceding segment: more deletion after obstruents (west) than laterals (old) – following segment: more deletion before consonants (wes’ side) than before vowels (west end) – morphology: more deletion in monomorphemic words (mist) than past tense forms (missed)

The frequency of final -t deletion in west side is therefore a function of – presence of a preceding /s/, – presence of a following /s/, – morphological status as a monomorpheme Each of these effects is independent of all others, and makes a separate contribution to the overall outcome

Phonological strategy for and in the VR model The phonological treatment of lexical exceptions simply assigns a distinct weight to and which is not associated with other lexical items. This weight is high, strongly favoring deletion Tokens of and are still equally affected by all other factors, e.g., following segment

The lexical strategy for and A lexical treatment of the exceptional behavior of and assigns it an alternate entry that pre-encodes the output of the process. – and has an alternate lexical entry an’ or ‘n’. When this form is selected, it always surfaces without a final /d/, thereby boosting the apparent rate of coronal stop deletion. (cf. rock ‘n’ roll, an orthographic representation of this underlying form?)

Testing the strategies: Variation as a window into phonological organization The two strategies for handling lexical exceptions may not be decidable on obligatory/categorical data because of absence of constraint interaction But variation data, showing constraint interaction, allows a test of the models.

The two strategies make different quantitative predictions The phonological strategy using an exception feature simply boosts the overall probability of deletion in and, leaving other constraint effects unchanged. – Hence, the effect of following C vs. V should be the same in exceptional and unexceptional words: and in cheese ‘n’ crackers should always show more deletion than and in ham ‘n’ eggs

The lexical strategy achieves elevated surface rates of -d absence in and by selection of UR an’, which does not undergo coronal stop deletion, and is therefore insensitive to constraints on that process. – Hence, lexical exceptions show reduced effect of following C vs V: Cheese ‘n’ crackers is no more likely than ham ‘n’ eggs

The specific quantitative effect of the lexical strategy: A surface corpus of exceptional words is a mixture of two sets of forms: -some are derived from underlying full forms (e.g. and) and show the effects of constraints on the process, -others are derived from underlying reduced forms (an’) and are not affected by constraints on the process

The mixture of the two sets has the quantitative effect of attenuating the observed effect of constraints on the process. Tokens derived from underlying and, showing an external context effect of a certain magnitude m, are mixed in with tokens derived from an’ showing zero external context effect. The total set will show an effect intermediate between 0 and m.
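The attenuation argument can be made concrete with a small sketch. It assumes, hypothetically, that a fixed share of tokens is produced from the reduced entry an’ regardless of context, and that the remaining tokens undergo deletion at context-dependent rates; the function names and the numbers in the usage lines are illustrative only.

```python
def observed_rate(reduced_share: float, process_rate: float) -> float:
    """Surface rate of /d/ absence in a mixed corpus.

    reduced_share: proportion of tokens selected from the reduced entry (always /d/-less).
    process_rate: deletion rate applied to tokens derived from the full entry.
    """
    return reduced_share + (1.0 - reduced_share) * process_rate


def observed_effect(reduced_share: float, rate_c: float, rate_v: float) -> float:
    """Observed difference between pre-consonantal and pre-vocalic contexts."""
    return observed_rate(reduced_share, rate_c) - observed_rate(reduced_share, rate_v)


# Illustrative values: full-entry tokens delete at 0.6 before C and 0.2 before V,
# so the underlying effect m is 0.4. With half the tokens drawn from the reduced
# entry, the observed effect shrinks to (1 - 0.5) * m, i.e. about 0.2.
m = 0.6 - 0.2
print(observed_effect(0.5, 0.6, 0.2), (1 - 0.5) * m)
```

Under these assumptions the observed effect equals (1 − share) × m, so it lies between 0 (all tokens from the reduced entry) and m (no tokens from the reduced entry).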

Measuring attenuation In a multivariate analysis, this attenuation should be manifested as a smaller range of values in lexical exceptions for a factor group that measures a constraint on the process (e.g., the following segment effect on coronal stop deletion).

Predictions Exception feature: constraint effects should be equivalent in exceptional and nonexceptional corpora Multiple underlying entries: constraint effects should appear to be weaker in exceptional than nonexceptional corpora.
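The two predictions can be compared on the log-odds scale, which is the scale on which a logistic multivariate analysis estimates factor effects. The sketch below is an illustration under stated assumptions, not the analysis used in the talk: the exception feature is modeled as a constant added to the log-odds, the alternate entry as a context-independent share of pre-reduced tokens, and all numbers are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Convert a probability (0 < p < 1) to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def exception_feature_rates(rate_c: float, rate_v: float, boost: float):
    """Exception feature: the same log-odds boost is added in every context."""
    return inv_logit(logit(rate_c) + boost), inv_logit(logit(rate_v) + boost)


def alternate_entry_rates(rate_c: float, rate_v: float, reduced_share: float):
    """Alternate entry: a share of tokens is already reduced, whatever the context."""
    return (reduced_share + (1.0 - reduced_share) * rate_c,
            reduced_share + (1.0 - reduced_share) * rate_v)


def log_odds_effect(rate_c: float, rate_v: float) -> float:
    """Size of the following-context effect on the log-odds scale."""
    return logit(rate_c) - logit(rate_v)


# Hypothetical rates for unexceptional words: 0.6 before C, 0.2 before V.
baseline = log_odds_effect(0.6, 0.2)
feature = log_odds_effect(*exception_feature_rates(0.6, 0.2, boost=1.0))
alternate = log_odds_effect(*alternate_entry_rates(0.6, 0.2, reduced_share=0.5))
print(baseline, feature, alternate)
```

In this sketch the exception-feature effect matches the baseline, while the alternate-entry effect is smaller, which mirrors the contrast between the two predictions stated on the slide.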

Table 2. Following context effect on English CSD and exceptional and (Source: Neu 1980). Columns: N and % del for non-exceptional words and for the exception (and); rows: __C, __V. Range: 23.5% > 13.6%

Table 3. Context effects in the ONZE corpus (18 speakers from the ONZE corpus at U Canterbury). Columns: N and % del for other words and for and; rows (following context): __C, __V. Range: 47.9% > 12.6%

Table 4. Multivariate analysis of following context effect in the ONZE data (18 speakers from the ONZE corpus at U Canterbury). Adjusted probabilities of –d deletion for other words and for and; rows (following context): __C[+cor], __C[-cor, +vce], __C[-cor, -vce], __/w/, __V. Range: .61 > .31

Following context effect appears significantly weaker in exceptional and In both raw deletion percentages and multivariate analyses, in two independent corpora, the effect of following context is much weaker for tokens of and than for other words This is consistent with the lexical strategy: and has an additional UR without a final /d/

Contextual effects on Brazilian Portuguese –s deletion Prior research shows this to be strongly constrained by following context Mainly occurs in preconsonantal position Deletion rates are affected by place, manner, and voicing of following C Do these constraints affect exceptional –mos words just like other words?

Table 5. Lexical exceptions in Brazilian Portuguese -s deletion. Columns: non-exceptions, lexical exceptions (-mos forms). Rows (features of following C): Voice/Manner: sonorant, voiced obstruent, voiceless obstruent (Range: .33 > .14); Place: labial, coronal, velar (Range: .29 > .19); N; Goodness of fit (log likelihood)

Following context effect appears significantly weaker in exceptional -mos Range of probabilities is smaller for both the place effect and the manner/voicing effect The goodness of fit measure is significantly worse for the exceptional forms, suggesting that they aren’t as well explained by the contextual conditions Again, this is consistent with a lexical account: the first person plural morpheme has an alternate entry –mo without final –s.

Contextual effects on Salvadoran Spanish –s deletion Like other Caribbean Spanish dialects, the Spanish of El Salvador has variable final –s deletion Hoffman 2004 finds strong constraint effects on deletion: more deletion in stressed syllables, and more deletion before consonants, especially voiced consonants, than before vowels Three discourse markers show exceptionally high rates of deletion: entonces, digamos, pues

Table 6. -s deletion in Salvadoran Spanish (Hoffman 2004). Columns: non-exceptional words, lexical exceptions (entonces, digamos, pues). Rows: Following context: sonorant, voiced obstruent, voiceless obstruent, vowel, pause (Range: .42 > .25); Syllable stress: stressed, unstressed (Range: .24 > .16)

Another variable: monophthongal /ay/ in Southern American English (SoAmEng) The English diphthong /ay/ is monophthongized to /a/ in Southern American English This is a variable process, subject to social and contextual constraints More monophthongs are found in pre-voiced contexts (ride vs. right), in phonetically shorter syllables, and among lower status speakers I and my are lexical exceptions, with very high rates of monophthongization, even before voiceless consonants (cf: ‘my time’)

Table 7. /ay/ monophthongization in SoAmEng by following context (Woods 2008)

                     Other words    I, my
% monophthong        34%            53%
Fol. context:
  __C[+vce]          .76            (.51)
  __V or G           .41            (.49)
  __C[-vce]          .17            (.48)
Range:               .59      >     .03 (n.s.)
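The range figures in Table 7 can be recomputed from the factor weights it reports. The sketch below uses only the values given in the table; the helper name factor_range and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def factor_range(weights: dict) -> float:
    """Range of a factor group: largest weight minus smallest, rounded to 2 places."""
    values = list(weights.values())
    return round(max(values) - min(values), 2)


# Factor weights for following context, as reported in Table 7 (Woods 2008).
other_words = {"__C[+vce]": 0.76, "__V or G": 0.41, "__C[-vce]": 0.17}
i_my = {"__C[+vce]": 0.51, "__V or G": 0.49, "__C[-vce]": 0.48}

print(factor_range(other_words))  # 0.59
print(factor_range(i_my))         # 0.03
```

A smaller range for the exceptional items is the attenuation measure used throughout the comparison tables.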

Table 8. /ay/ monophthongization in Southern AmEng: duration effect (Data from Woods 2008). Columns: other words; I, my. Rows (duration): shorter, longer. Range: .40 > .23

Contextual effects are much weaker on exceptional I, my in SoAmEng Following context effect is not significant for I, my Duration effect is much weaker Monophthongization occurs much more often in these two words, and is relatively insensitive to context. This is consistent with alternate URs with monophthongal syllabic nuclei /a:/, /ma:/

Summary: In 7 constraints on 4 processes in 3 languages… Magnitude of constraint effect is always weaker for exceptional lexical items than for non-exceptional words This is consistent with predictions of the alternate lexical entry model These results contradict the phonological ‘exception feature’ model, which predicts that contextual effects should be stable and independent of exceptional status

Conclusion: Speakers alter the lexicon Lexical exceptions to variable processes are encoded in the mental grammar by alterations to underlying representations, and the existence of multiple lexical entries for exceptional words (cf. Kiparsky’s treatment of -t,d deletion in stratal OT)

Another model: Exemplar Theory Every word is represented by an exemplar cloud of remembered tokens; hence all words are equally ‘exceptional’, differing only by lexical frequency High frequency forms should undergo high rates of lenition – and has a high deletion rate simply because it is the highest frequency word eligible for coronal stop deletion – Other high frequency words like just should behave similarly to and

Exemplar model An exemplar-theory treatment of lexical exceptions is like a multiple-entry model (some remembered exemplars have /d/, others do not, and in production, speakers sometimes choose a target lacking a final segment) But words should differ mainly as a function of frequency; thus all high frequency words should have attenuated contextual effects in lenition processes

Prediction of the Exemplar Model There should be no special status of ‘lexical exceptions’ In lenition processes, ‘lexical exceptions’ should simply be high frequency words that have an elevated number of lenited exemplars in memory. Such words should not behave differently from other high frequency words

Are lexical exceptions just high frequency words? This does not appear to be the case. The second-highest frequency word in the ONZE corpus was just; it showed significant following context effects and did not behave like and Spanish menos is higher in frequency than entonces and digamos, but does not behave exceptionally

A possible asymmetry Phonological strategy, using exception features, permits both positive and negative exceptions (lexical items that undergo a process at a higher or lower probability than other words) Lexical strategy, with alternate URs, allows only positive exceptions, with higher probabilities (e.g., what UR would block -t,d deletion?)

Impressionistic confirmation All lexical exception cases in variation studies I am familiar with involve elevated rates of occurrence of a variable process, never reduced rates. This confirms the prediction of the lexical entry approach. In other words, speakers can pre-encode the output of a general phonological process in a UR, but they don’t appear to block such a process in specific lexical items.

Conclusions In phonological variation, speakers consistently handle lexical exceptions by means of alternate lexical entries. The types of lexical exceptions that occur are restricted to those which can be handled in this way Mental grammars don’t use exception features (pace Chomsky & Halle 1968)

Is this finding valid for invariant phonological processes? – Perhaps not, but such processes may not permit an empirical test of the two strategies – In the 40 years since SPE was written, no definitive evidence has emerged favoring the ‘exception feature’ strategy

The empirical resolution presented here depends crucially on looking at variable processes Linguistic variation offers a unique window into the phonological operations of the mental grammar

Thank you! Gracias Obrigado Merci Arigato Dank je wel!