CORPUS LINGUISTICS 1) A revision of corpus linguistics 2) Language corpora in the ESL/EFL classroom.

WHAT IS A CORPUS?
A corpus can be defined as a collection of texts assumed to be representative of a given language put together so that it can be used for linguistic analysis. Usually the assumption is that the language stored in a corpus is naturally-occurring, that it is gathered according to explicit design criteria, with a specific purpose in mind, and with a claim to represent natural chunks of language selected according to a specific typology (Tognini-Bonelli 2001:2).

“nowadays the term 'corpus' nearly always implies the additional feature of 'machine-readable'” (McEnery & Wilson, Corpus Linguistics. Online manual).

English language corpora: General vs. Specific

ENGLISH CORPORA: GENERAL LANGUAGE CORPORA
First-generation corpora:
- Brown Corpus of Written American English
- Lancaster-Oslo/Bergen (LOB) Corpus of Written British English
- 500 texts of around 2,000 words each
- no spoken data
- wide variety of written texts

ENGLISH CORPORA: GENERAL LANGUAGE CORPORA
Second-generation corpora:
- Bank of English
  - monitor corpus
  - both spoken and written text
  - different regional varieties of English
- British National Corpus (BNC)
  - 90 million written words
  - 10 million spoken words
  - freely accessible: Mark Davies' interface

OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- speech corpora:
  - sound recordings
  - SPOKEN ENGLISH CORPUS
  - detailed description of spoken phenomena: phonology, prosody (stress, tone units…), etc.
- multimedia corpora:
  - transcripts synchronised with audio/video recordings
  - TALKBANK website: SANTA BARBARA CORPUS OF SPOKEN AMERICAN ENGLISH (SBCSAE)

[Slide image labels: space for our own annotation; some mark-up for context; audiovisual element]

OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- parsed corpora:
  - syntactically analysed
  - SURFACE AND UNDERLYING STRUCTURAL ANALYSES AND NATURALISTIC ENGLISH CORPUS (SUSANNE)
- historical corpora:
  - English of earlier periods
  - may cover specific historical periods or genres
  - track and describe how language has evolved
  - A REPRESENTATIVE CORPUS OF HISTORICAL ENGLISH REGISTERS (ARCHER)

OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- specialised corpora:
  - focus on concrete genres/domains
  - BUSINESS LETTERS CORPUS (BLC)
- lingua franca corpora:
  - ENGLISH AS A LINGUA FRANCA IN ACADEMIC SETTINGS (ELFA) CORPUS
  - intercultural exchanges among speakers who use English as a lingua franca

OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- developmental language corpora:
  - output of non-adult native speakers of English
  - language less proficient than that in adult native-speaker corpora
  - POLYTECHNIC OF WALES (POW) CORPUS
- ESL/EFL learner corpora:
  - output of learners of English
  - learners sharing one L1 background or having different mother tongues
  - JAPANESE EFL LEARNER CORPUS (JEFLL)

WORDSMITH: FLEXIBLE CORPUS
- Computer program which permits users to compile their own corpus
- Texts must be in .txt format
- Any text can be subjected to the same process of analysis that official corpora undergo: concordance lines, word lists, etc.
- No need to pre-process such texts in advance
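The two analyses named above, word lists and concordance lines, can be illustrated with a minimal Python sketch. This is not WordSmith's own implementation: the function names `tokenize`, `word_list` and `concordance`, the simple regex tokeniser and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()

def concordance(tokens, node, width=3):
    """Return KWIC lines: up to `width` tokens of context on each side of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines

# Hypothetical sample text standing in for a user-compiled .txt corpus.
sample = "She made a mistake. He made the same mistake twice, and they made peace."
tokens = tokenize(sample)

print(word_list(tokens)[:3])
for line in concordance(tokens, "made"):
    print(line)
```

In classroom use, `sample` would be replaced by the contents of a plain-text file read with `open(path, encoding="utf-8").read()`.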

Corpus linguistics
- Insights into the internal workings of real language
- Knowledge in turn also used in other fields of enquiry
- Planning, designing, compiling and tagging
- Frequency lists and concordance lines (+ further analysis)
- Sinclair's (2003) "degeneralisation":
  - sceptical about 'received' descriptions
  - patterns found in the data: more precise or alternative descriptions
- Corpus-based dictionaries and grammars:
  - how lexis and grammar are "really" used
  - COLLINS COBUILD LEARNER'S DICTIONARY
  - THE LONGMAN GRAMMAR OF SPOKEN AND WRITTEN ENGLISH

CORPORA IN THE ESL/EFL CLASSROOM: PEDAGOGICAL FOUNDATIONS
- Mixture of instructional and naturalistic language learning
- Fulfilment of both the input and output hypotheses
- "Scaffolding" (though loosely speaking)
- Insights concerning English culture(s)
- Student-centred and related to constructivism: mastering corpora = learning autonomy

CORPUS-BASED ESL/EFL ACTIVITIES
- Focus on lexis, grammar and register
- Introductory notions concerning collocation, colligation, and formal vs. informal
- For already motivated students: BNC

Activity one: contractions. Formal or informal? Spoken or written? The key: * ?’??

[Slide image labels: 1, * ?’??, 2, 3, 4]

Quotation marks!

Activities two and three: Corpora as a source of knowledge concerning collocation and colligation

[v*] mistakes

powerful, not strong!!! [aj*]
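The idea behind these collocation queries (which words typically stand immediately before a given node word) can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the slides' queries use part-of-speech tags such as [v*] and [aj*], whereas this sketch counts every preceding word without any tagging, and the function name `left_collocates` and the sample text are hypothetical.

```python
import re
from collections import Counter

def left_collocates(text, node):
    """Count the words that occur immediately before `node` in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pairs = zip(tokens, tokens[1:])  # (previous word, current word)
    return Counter(prev for prev, word in pairs if word == node)

# Hypothetical sample text; a real activity would query a large corpus.
sample = (
    "Everyone makes mistakes. She made mistakes early on, "
    "but he rarely makes mistakes and never repeats mistakes."
)
counts = left_collocates(sample, "mistakes")
print(counts.most_common())
```

Restricting the counts to verbs or adjectives, as the tagged queries do, would require a part-of-speech tagger, which is left out here.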

Activity four: meaning via collocations and co-text

For non-motivated students: WordSmith
- Contact with the English language: input (at least lexis-wise)
- Popular culture: MUSIC IN ENGLISH!!!

Activity one: music corpora, lexis, and the BNC for grammar accuracy

[Slide image labels: author corpus; reference corpus]

Select the text you want a list of

Save both lists to compare them with Keyword

[Slide image labels: author corpus list; reference corpus list]
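Comparing the two word lists to find keywords can be sketched as follows. The sketch assumes a log-likelihood keyness statistic, a common choice in corpus linguistics; the tool's own calculation and settings may differ. The function names and the toy frequency counts are invented for illustration, and both corpora are assumed to be non-empty.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Keyness of a word seen `a` times in a study corpus of `c` tokens
    and `b` times in a reference corpus of `d` tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(study, reference):
    """Rank words that are relatively more frequent in `study` than in `reference`."""
    c, d = sum(study.values()), sum(reference.values())
    scored = []
    for word, a in study.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only positively key words
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy word lists (invented counts): song lyrics vs. a general reference list.
study = Counter({"love": 30, "baby": 20, "the": 50})
reference = Counter({"love": 5, "baby": 1, "the": 94})

for word, score in keywords(study, reference):
    print(f"{word}\t{score:.2f}")
```

Words such as "the", which are proportionally no more frequent in the author corpus than in the reference corpus, drop out of the ranking, which is what makes the remaining items characteristic of the author corpus.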

That was all! The nightmare is over! Thank you for listening! ^.^