Word Order in Second Language Acquisition Corpora


The WOSLAC project: Word Order in Second Language Acquisition Corpora
http://www.uam.es/woslac
Université catholique de Louvain (Belgium), “Learner Corpus Colloquium”, 3 April 2006
Amaya Mendikoetxea (amaya.mendikoetxea@uam.es)
Cristóbal Lozano (clozan2@yahoo.com)
Universidad Autónoma de Madrid

SUMMARY
The main purpose of this project is to determine the lexicon-syntax and syntax-discourse properties which constrain word order in the interlanguage of L2 learners of English (with L1 Spanish) and L2 learners of Spanish (with L1 English). In particular, we wish to examine the validity of the Unaccusative Hypothesis at the lexicon-syntax interface and the role of discourse functions such as topic and focus at the syntax-discourse interface in L2 Spanish and L2 English. Our initial hypotheses are the following: (1) The Unaccusative Hypothesis plays a role in word order in L2 learners’ interlanguage; (2) Lexicon-syntax properties are acquired before syntax-discourse properties, i.e., properties at the lexicon-syntax interface are present in the initial stages of grammatical development, while properties at the syntax-discourse interface are persistently difficult to acquire and generate deficits even at advanced levels of proficiency; (3) Interlanguages have structures that cannot be explained with reference to L1 or L2, but rather reflect universal properties of languages. To test these hypotheses, a corpus will be compiled and appropriate search tools will be developed. The data obtained will be analysed both qualitatively and quantitatively. The data will be interpreted within a comparative framework which will help determine the role of L1 in L2 acquisition in the grammar areas of the study.

MAIN PURPOSE
To determine the lexicon-syntax and syntax-discourse properties which constrain word order in the interlanguage of L2 learners:
- L2 English (with L1 Spanish)
- L2 Spanish (with L1 English)
To examine, in L2 Spanish and L2 English:
- the validity of the Unaccusative Hypothesis at the lexicon-syntax interface
- the role of discourse functions such as topic and focus at the syntax-discourse interface

RESEARCH QUESTIONS
(1) The Unaccusative Hypothesis plays a role in word order in L2 learners’ interlanguage.
(2) Lexicon-syntax > syntax-discourse: lexicon-syntax properties are acquired before syntax-discourse properties, i.e., properties at the lexicon-syntax interface are present in the initial stages of grammatical development, while properties at the syntax-discourse interface are persistently difficult to acquire and generate deficits even at advanced levels of proficiency.
(3) Interlanguages have structures that cannot be explained with reference to L1 or L2, but rather reflect universal properties of languages.

DATA
2 written corpora:
- L2 English – L1 Spanish
- L2 Spanish – L1 English
Data analysis: qualitative and quantitative (descriptive and inferential statistics, SPSS).

L2 English – L1 Spanish
- 260 academic essays in electronic format
- Range: 500 to 2,000 words (300,000 words)
- 1st and 3rd year Spanish students in an academic writing course on a degree in English Philology at the Universidad Autónoma de Madrid
Basic procedure for gathering the data:
- Learner Profile
- Essay Profile
- Oxford Quick Placement Test

Software for text annotation: UAM CorpusTool
- It allows an analyst to select a text from the corpus and annotate it in various ways.
- The analyst can highlight a segment (e.g., an it-cleft) and then assign features to that segment.
- The tool produces an XML-encoded version of the text file, including the features assigned to the segments.
- It can then automatically detect instances of the pattern.

L2 Spanish – L1 English (online)

THEORETICAL FRAMEWORK
A comparative framework which will help determine the role of L1 in L2 acquisition in the grammar areas of the study:
- L1 properties
- L2 properties
- Universal properties
We look at:
- Properties operating at the lexicon-syntax interface (unergatives vs. unaccusatives).
- Properties operating at the syntax-discourse interface (topic vs. focus).

THEORETICAL FRAMEWORK (cont’d)
→ Underlying idea: formal and functional features interact in the structures under consideration. Formal and functional approaches are, therefore, both essential for the understanding of SLA data. At the same time, SLA data from non-native grammars can be potentially significant for the understanding of linguistic phenomena in native grammars.

THEORETICAL FRAMEWORK: INTERFACES
Chomsky’s (1995, 2001, etc.) Minimalist Program: emphasis on interface conditions.
→ A well-designed language faculty would involve nothing other than interface conditions.
“a theory of language that takes a linguistic expression to be nothing other than a formal object that satisfies the interface conditions in the optimal way.” (Chomsky 1995: 171).

Features of the language sample to be studied
1. Non-canonical word order: clause patterns which do not conform to the S(ubject) V(erb) O(bject) order.
(a) Inverted subjects
- Into the room came a tiny old lady.
- Then came voices all shouting together.
- Not before in our history have so many strong influences united to produce so large a disaster.
- As infections increased in women, so did infections in their babies.
(b) Dislocation in the left periphery (fronting)
- The paper Terry buys every day (not a book!)
- Why he said that I will never know.
(c) Right-dislocation
- The teacher made clearer the standards which students should be aiming for.
- It’s a pity that he cannot speak Russian.

Features of the language sample to be studied (cont’d)
2. Variation in word order and special constructions
(a) Passive constructions
- The paper is bought (by Terry) every day.
(b) There-constructions
- There came a tiny old lady into the room.
(c) Ditransitive constructions
- I’ll give Mary the book.
- I’ll give the book to Mary.
(d) Object placement in phrasal verbs
- He picked up the telephone.
- He picked the telephone up.
(e) Clefts
- It was his voice that held me.
- What held me was his voice.

FIRST WOSLAC STUDY
Postverbal subjects in L1 Spa – L2 Eng (ICLE corpus)
Interfaces:
- Lexicon-syntax
- Syntax-discourse

Word order in native English
Very restricted: canonical word order is SV.
- Four girls sang.
- Four girls arrived.
Lexicon-syntax interface (Levin & Rappaport-Hovav, etc.): Unaccusative Hypothesis (Burzio 1986, etc.)
- *There sang four girls at the opera. [unergative verb]
- There arrived four girls at the station. [unaccusative verb]
Syntax-discourse interface (Biber et al., Birner, etc.): postverbal material tends to be focus (new info)
- We have complimentary soft drinks and coffee. Also complimentary is red and white wine.
Syntax-Phonological Form (PF) interface (Arnold et al., etc.): heavy material is sentence-final (Principle of End-Weight, Quirk)
- That money is important is obvious.
- It is obvious that money is important.
Subjects which are focus, long and complex tend to occur postverbally in those structures which allow postverbal subjects.

Previous L2 findings
Production of postverbal subjects in L2 English (Rutherford 1989, Oshita 2004), L1 Spanish – L2 English:
- …it arrived the day of his departure…
- And then at last comes the great day.
- In every country exist criminals
- …after a few minutes arrive the girlfriend with his family too.
Only with unaccusative verbs (never with unergatives).
Unaccusatives: arrive, happen, exist, come, appear, live…
Explanation: syntax-lexicon interface (Unaccusative Hypothesis)
Previous studies focused on ERRORS, thus emphasising the differences between native and non-native structures.
Our study emphasises the similarities between native and non-native structures → licensing conditions are the same.

Hypotheses: VS order in L1 Spa – L2 Eng
GENERAL HYPOTHESIS: Conditions licensing VS in L2 Eng are the same as those in native Eng, DESPITE differences in grammaticalisation.
H1: Lexicon-syntax interface: postverbal subjects with unaccusatives (never with unergatives)
H2: Syntax-PF interface: postverbal subjects are heavy (NOT light)
H3: Syntax-discourse interface: postverbal subjects are focus (NOT topic)

Method
Learner corpus: L1 Spa – L2 Eng
- ICLE Spanish subcorpus (Granger et al. 2002). Problem: proficiency level?
- UAM corpus [2nd edition of ICLE]. Problem: proficiency level?
Tools: WordSmith v. 4.0 (Scott 2004), Excel, SPSS v. 12.0
→ Concordance queries can be performed automatically with WordSmith, BUT there is a lot of manual work (filtering out unusable data, coding data in Excel, analysing data in SPSS, etc.).

Data analysis
Based on Levin (1993) and Levin & Rappaport-Hovav (1995):
- Unergatives: cough, cry, shout, speak, walk, dance… [TOTAL: 41]
- Unaccusatives: exist, live, appear, emerge, happen, arrive… [TOTAL: 34]
WordSmith query searches. For every lemma (e.g., APPEAR, ARISE), we searched for:
- All possible native forms: appear, appears, appearing, appeared; arise, arises, arising, arose, arisen
- All possible overregularised and overgeneralised learner forms: arised, arosed, arisened, arosened (“So arised the Saint Inquisition”)
- All possible forms with probable L1 transfer of spelling: apear, apears, apearing, apeared
- All other possible misspelled forms: appeard, apeard
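The search strategy above (one query per lemma, covering native, overregularised and misspelled forms) can be sketched in Python. This is only an illustration and not the WordSmith procedure used in the study: the names `FORMS`, `build_pattern` and `find_forms` are hypothetical, and only the forms listed on the slide for APPEAR and ARISE are included.

```python
import re

# Forms copied from the slide for the lemmas APPEAR and ARISE.
FORMS = {
    "APPEAR": [
        "appear", "appears", "appearing", "appeared",  # native forms
        "apear", "apears", "apearing", "apeared",      # probable L1 spelling transfer
        "appeard", "apeard",                           # other misspellings
    ],
    "ARISE": [
        "arise", "arises", "arising", "arose", "arisen",  # native forms
        "arised", "arosed", "arisened", "arosened",       # overregularised learner forms
    ],
}


def build_pattern(forms):
    """Compile one case-insensitive, whole-word pattern matching any listed form."""
    # The \b anchors already force whole-word matches; sorting longest
    # first is just a defensive habit for alternations.
    alternatives = sorted(set(forms), key=len, reverse=True)
    body = "|".join(re.escape(form) for form in alternatives)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


def find_forms(text, lemma):
    """Return every token in `text` that matches a listed form of `lemma`."""
    return build_pattern(FORMS[lemma]).findall(text)
```

For example, `find_forms("So arised the Saint Inquisition", "ARISE")` picks out the learner form `arised`, while a word such as `disappear` is not matched by the APPEAR pattern because of the word-boundary anchors.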

Data analysis (cont’d)
CONCORDANCES: RAW OUTPUT
Thousands of concordances, BUT approx. ¾ were unusable. Filtering criteria had to be applied manually.
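For readers unfamiliar with concordance output, a keyword-in-context (KWIC) listing can be sketched as follows. This is a minimal stand-in, not WordSmith's implementation; the function name `kwic` and the `width` parameter are assumptions made for the example.

```python
import re


def kwic(text, pattern, width=30):
    """Return (left context, keyword, right context) for each match of `pattern`."""
    rows = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters on either side of the keyword.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows
```

Each row still has to be read and coded by hand, which is where the manual filtering mentioned above comes in.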

Data coding/analysis: EXCEL

Data analysis: preliminary descriptive stats - EXCEL

Data analysis – inferential stats: SPSS

GENERAL HYPOTHESIS: Result: types of VS structures produced
GRAMMATICAL
- Locative inversion: In the main plot appear the main characters: Volpone and Mosca.
- There-insertion: There exist positive means of earning money.
- AdvP-insertion: …and here emerges the problem.
UNGRAMMATICAL
- *it-insertion: *In the name of religion it had occurred many important events…
- *XP-insertion: *In 1760 occurs the restoration of Charles II in England.
- *Ø-insertion: …*because exist the science technology and the industrialisation.

H1: Results: VS and unaccusativity

H2: Result: VS and weight
Syntactic weight has to be measured manually according to some theoretical criteria.
HEAVY
- Against this society drama emerged an opposition headed by Oscar Wilde and Bernard Shaw.
- …so came the decline of the theatre.
- Then come the necessity to earn more.
LIGHT
- So arised the Saint Inquisition…
- …and from there began a fire.
- Still today … exists the bloody fights.

H2: Result: SV and weight
HEAVY
- …the cases of men mistreated do not appear in the media…
- …a disintegration of culture, tradition and society would begin…
- …the utopian societies created by the early socialists appeared.
LIGHT
- …but they may appear everywhere.
- …since the day eventually came…
- …these people should exist, …
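The study measured weight manually against theoretical criteria that the slides do not spell out. As a rough illustration only, length in words can serve as a crude proxy for weight. In the sketch below the subject strings are our own bracketing of the postverbal subjects in the VS examples, and `subject_length` and `mean_length` are hypothetical helper names.

```python
# Postverbal subjects taken from the VS examples on the slides.
# Which span counts as the subject is our own bracketing (an assumption).
HEAVY_SUBJECTS = [
    "an opposition headed by Oscar Wilde and Bernard Shaw",
    "the decline of the theatre",
    "the necessity to earn more",
]
LIGHT_SUBJECTS = [
    "the Saint Inquisition",
    "a fire",
    "the bloody fights",
]


def subject_length(subject):
    """Length of a subject in whitespace-separated words (a crude weight proxy)."""
    return len(subject.split())


def mean_length(subjects):
    """Mean length in words over a list of subjects."""
    return sum(subject_length(s) for s in subjects) / len(subjects)
```

On these six examples the subjects coded as heavy average more words than those coded as light, but a word count ignores structural complexity, so it cannot replace the manual coding.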

H3: Result: VS and discourse
Discourse status (topic/focus) has to be determined manually by establishing theoretical criteria and then checking the context (or even the whole essay).
FOCUS
- …there also exists a wide variety of optional channels which have to be paid.
- So arised the Saint Inquisition.
- In 1880 it begun the experiments whose result was the appearance of the television some years later.
TOPIC
- …our modern world, dominated by science and technology and industrialisation …because exist the science technology and the industrialisation.

H3: Result: SV and discourse
TOPIC
- I use the Internet … I find windows … if they press on any of these windows … these windows cannot appear because a child could enter easily…
- …the world of drugs: mafias … problems with mafias finished … dangerous people making money … no reason why these people should exist.

Summary/Conclusion
Lexicon-syntax: NPsubj Vunacc
- VS (Vunacc NPsubj): Syntax-discourse FOCUS; Syntax-PF HEAVY
- SV (NPsubj Vunacc): Syntax-discourse TOPIC; Syntax-PF LIGHT

Thank you!

Heavy/Light scale

Data analysis (cont’d)
CONCORDANCES: 6 BASIC FILTERING CRITERIA
1. The verb must be intransitive (unergative or unaccusative).
- In the screen of the television one or two “rombos” should appear. [unac]
- Leontes cries and the statue talks. [unerg]
- This government’s movement has created several opinions. [trans]
2. The verb must be finite, with or without an auxiliary.
- …also it exists the psychological agresssions… [finite no aux]
- …the cases of men mistreated do not appear in the media. [finite aux]
- This contradiction could disappear. [finite modal]
- There’s no reason for it to exist. [for clause + to inf]
- Poor people cross borders to escape from poverty. [to-inf clause]
- …let time pass… [‘let’ constructions]
- …make everyone’s life go ahead [causative + infinitive]
- Returning to the title of this paper,… [gerundive clauses]
- …they go away in order to escape to France. [‘in order to’ clauses]
- …women have to live with the agressor [have to/ought to/able to]
- …prudence was beginning to disappear. [verbal/aspectual periphrases]
- Before entering the argumentation,… [small clauses]
- …instead of following… [complement of P]
- …likely to happen… [complement of A]
- The tests to enter the army are quite difficult now. [complement of N]

Data analysis (cont’d)
3. The verb must be in the active voice.
- This contradiction could disappear. [active unaccusative]
- This situation has already been happened. [passivised unaccusative]
4. The subject must be an NP.
- …it arose [diverse social ranks, the rich and the poor that depended on the property they had]. [inverted NP subject]
- …it only remains [to add that nowadays we live in a world…] [extraposition]
- It happened [that the countries which make the weapons are…] [extraposition]
5. The sentence can be either grammatical or ungrammatical in native English.
- This contradiction could disappear. [gram]
- …it won’t exist nothing of what people don’t get bored or tired. [ungram]
6. The subject can appear either postverbally (VS) or preverbally (SV).
- …the real problem appears when they have to look for their first job. [SV]
- So arised the Saint Inquisition. [VS]
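The basic criteria can be thought of as a filter over hand-coded concordance lines. The sketch below is an assumption about how such coding might be represented, not the authors' Excel scheme: the `CodedLine` fields and `is_usable` are hypothetical, and the codings of the sample sentences follow our reading of the bracketed labels on the slides. The last two criteria do not exclude anything (both grammatical and ungrammatical sentences, and both SV and VS orders, are kept), so they are not part of the filter.

```python
from dataclasses import dataclass


@dataclass
class CodedLine:
    """One concordance line with hand-assigned codes (hypothetical scheme)."""
    text: str
    verb_class: str      # "unaccusative", "unergative" or "transitive"
    finite: bool         # the target verb is finite
    active: bool         # the target verb is in the active voice
    subject_is_np: bool  # the subject is an NP


def is_usable(line):
    """Apply the four exclusion criteria: intransitive, finite, active, NP subject."""
    return (
        line.verb_class in ("unaccusative", "unergative")
        and line.finite
        and line.active
        and line.subject_is_np
    )


# Example sentences from the slides, coded according to their bracketed labels.
SAMPLE = [
    CodedLine("This contradiction could disappear.", "unaccusative", True, True, True),
    CodedLine("This government's movement has created several opinions.", "transitive", True, True, True),
    CodedLine("This situation has already been happened.", "unaccusative", True, False, True),
    CodedLine("There's no reason for it to exist.", "unaccusative", False, True, True),
    CodedLine("So arised the Saint Inquisition.", "unaccusative", True, True, True),
]

USABLE = [line.text for line in SAMPLE if is_usable(line)]
```

Only the first and last sample lines survive the filter; the others are excluded for being transitive, passivised or non-finite.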

Data analysis (cont’d)
OTHER FILTERING CRITERIA
Target V + V (verbal coordination)
- Families without father exist and work well.
Coordinator + target V
- …we can manage to obtain it and live in a better world.
Interrogatives (only if V is the target)
- How could they live?
- Does exist then a manipulation of television?
Formulaic and set expressions in English
- As sometimes happens…
- …fall victim to…
- …the world we live in.
Set expressions transferred from the L1
- …it happens the same.
- …they fall into account that they have treated very badly Mr Hardcastle.
Phrasal verbs
- …a scientist come up with an intention…
Quotes (literary or other)
- “To what purpose, April, do you return again?
- “Feminism has to evolved or die”, Friedan said in 1982…

Data analysis (cont’d)
OTHER FILTERING CRITERIA (CONT’D)
Transitive alternants (unaccusatives)
- Rosamond lived a very comfortable life.
- …once you have passed this stage.
- …the University of Pennsylvania developed the electronic calculator.
Causativisations (unaccusatives)
- …how parents grew their children.
- But this idea could rise the question of…
Verbs that do not meet the semantic criteria proposed by Levin & Rappaport-Hovav
- …social classes appear to be broken. [≠ appearance]
- …we come to know about his personality… [≠ inherently directed motion]
Subject relative clauses
- …those fantastic relatives that still survive.
- …events of this kind which occurred in Spain.
Free relative clauses
- …trying to imagine what will remain…
- Hastings realizes what is happening…
Predicative complements
- Theatres remained closed.
- …men appear completely subordinated to the women’s desires.

Result: Type of VS structures

Result: VS and specific unaccusative verbs

Length of postverbal subject

Word order in native Spanish
Lexicon-syntax interface:
- UNERGATIVES: SV. A: ¿Qué pasó? (“What happened?”) B: Un hombre gritó (“A man shouted”) [SV]
- UNACCUSATIVES: VS. A: ¿Qué pasó? (“What happened?”) B: Llegó un hombre (“A man arrived”) [VS]
Syntax-discourse interface:
- UNERGATIVES: A: ¿Quién gritó? (“Who shouted?”) B: Gritó un hombre (“A man shouted”) [VS]
- UNACCUSATIVES: A: ¿Quién llegó? (“Who arrived?”) B: Llegó un hombre (“A man arrived”) [VS]
Theoretical evidence: Zubizarreta 1998, Casielles-Suárez 2004, Domínguez 2004
Empirical evidence: Hertel 2000, 2003, Lozano 2003, 2006

Result: VS and (in)definiteness
INDEFINITE
- …some decades ago, it appeared a new invent: the television.
- The play was very well performed and also appeared new elements in the stage.
- …it has appeared some cases of women that have killed their husbands…
DEFINITE
- …because later could appear the real evidence and the real guilty.
- …and usually appears the noble young man that either waste or has wasted his fortune.
- In the main plot appear the main characters: Volpone and Mosca.

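Results: lexicon-syntax
¿Qué pasó? (“What happened?”)
- Unaccusatives (VS): Llegó un hombre (“A man arrived”)
- Unergatives (SV): Un hombre gritó (“A man shouted”)
[Chart not reproduced; significance labels: n.s., sig, sig, sig, n.s., sig]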

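Results: syntax-discourse
¿Quién llegó / gritó? (“Who arrived / shouted?”)
- Unaccusatives (VS): Llegó un hombre (“A man arrived”)
- Unergatives (VS): Gritó un hombre (“A man shouted”)
[Chart not reproduced; significance labels: n.s., n.s., sig, sig, n.s., sig]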