CORPUS LINGUISTICS 1) A revision of corpus linguistics 2) Language corpora in the ESL/EFL classroom.


1 CORPUS LINGUISTICS 1) A revision of corpus linguistics 2) Language corpora in the ESL/EFL classroom

2 WHAT IS A CORPUS? A corpus can be defined as a collection of texts assumed to be representative of a given language put together so that it can be used for linguistic analysis. Usually the assumption is that the language stored in a corpus is naturally-occurring, that it is gathered according to explicit design criteria, with a specific purpose in mind, and with a claim to represent natural chunks of language selected according to a specific typology (Tognini-Bonelli 2001: 2).

3 “nowadays the term 'corpus' nearly always implies the additional feature of 'machine-readable'”. McEnery & Wilson, Corpus Linguistics. Online manual.

4 English language corpora: General vs. Specific

5 ENGLISH CORPORA: GENERAL LANGUAGE CORPORA
First generation corpora:
- Brown Corpus of Written American English
- Lancaster-Oslo/Bergen (LOB) Corpus of Written British English
- 500 texts of around 2000 words each
- no spoken data
- wide variety of written texts

6 ENGLISH CORPORA: GENERAL LANGUAGE CORPORA
Second generation corpora:
- Bank of English
  - monitor corpus
  - both spoken and written text
  - different regional varieties of English
- British National Corpus (BNC)
  - 90 million written words
  - 10 million spoken words
  - freely accessible: Mark Davies' interface

7 OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- speech corpora:
  - sound recordings
  - SPOKEN ENGLISH CORPUS
  - detailed description of spoken phenomena: phonology, prosody (stress, tone units…), etc.
- multimedia corpora:
  - transcripts synchronised with audio/video recordings
  - TALKBANK website: SANTA BARBARA CORPUS OF SPOKEN AMERICAN ENGLISH (SBCSAE)

8 space for our own annotation; some mark-up for context; audiovisual element

9 OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- parsed corpora:
  - syntactically analysed
  - SURFACE AND UNDERLYING STRUCTURAL ANALYSES OF NATURALISTIC ENGLISH CORPUS (SUSANNE)
- historical corpora:
  - English of earlier periods
  - may cover specific historical periods or genres
  - track and describe how language has evolved
  - A REPRESENTATIVE CORPUS OF HISTORICAL ENGLISH REGISTERS (ARCHER)

10 OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- specialised corpora:
  - focus on specific genres/domains
  - BUSINESS LETTERS CORPUS (BLC)
- lingua franca corpora:
  - ENGLISH AS A LINGUA FRANCA IN ACADEMIC SETTINGS (ELFA) CORPUS
  - intercultural exchanges among speakers who use English as a lingua franca

11 OTHER TYPES OF ENGLISH LANGUAGE CORPORA
- developmental language corpora:
  - output of non-adult native speakers of English
  - language less proficient than that found in adult native-speaker corpora
  - POLYTECHNIC OF WALES (POW) CORPUS
- ESL/EFL learner corpora:
  - output of learners of English
  - learners sharing one L1 background or having different mother tongues
  - JAPANESE EFL LEARNER CORPUS (JEFLL)

12 WORDSMITH: FLEXIBLE CORPUS
- Computer program which permits users to compile their own corpus
- Texts must be in .txt format
- Any text can be subjected to the same process of analysis that official corpora undergo: concordance lines, word lists, etc.
- No need to pre-process such texts
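The two analyses named on this slide, word lists and concordance lines, can be sketched in a few lines of Python. This is only an illustration of the idea, not how WordSmith itself works: the function names, the simple regex tokenizer, and the sample sentence are all assumptions made up for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=3):
    """Return keyword-in-context lines: up to `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Made-up sample text; in practice this would be the contents of a .txt file.
sample = "She made a mistake. He made mistakes too, but everyone makes mistakes."
tokens = tokenize(sample)

print(word_list(tokens)[:3])
for line in concordance(tokens, "mistakes"):
    print(line)
```

To try it on a real text, the `sample` string could be replaced with the result of reading a plain-text file, which matches the .txt requirement mentioned above.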

13 Corpus linguistics
- Insights into the internal workings of real language
- Knowledge in turn also used in other fields of enquiry
- Planning, designing, compiling and tagging
- Frequency lists and concordance lines (+ further analysis)
- Sinclair’s (2003) “degeneralisation”:
  - sceptical about 'received' descriptions
  - patterns found in the data: more precise or alternative descriptions
- Corpus-based dictionaries and grammars
  - how lexis and grammar are “really” used
  - COLLINS COBUILD LEARNER'S DICTIONARY
  - THE LONGMAN GRAMMAR OF SPOKEN AND WRITTEN ENGLISH

14 CORPORA IN THE ESL/EFL CLASSROOM: PEDAGOGICAL FOUNDATIONS
- Mixture of instructional and naturalistic language learning (LL)
- Fulfilment of both the input and output hypotheses
- “Scaffolding” (though loosely speaking)
- Insights concerning English culture(s)
- Student-centred and related to constructivism: mastering corpora = learning autonomy

15 CORPUS-BASED ESL/EFL ACTIVITIES
- Focus on lexis, grammar and register
- Introductory notions concerning collocation, colligation, and formal vs. informal
- For already motivated students: BNC

16 Activity one: contractions, formal or informal? spoken or written? The key * ?’??

17 1 * ?’?? 2 3 4

18 Quotation marks!

19 Activities two and three: Corpora as a source of knowledge concerning collocation and colligation

20 [v*] mistakes
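The query on this slide asks the corpus for verbs that occur immediately before "mistakes". A rough sketch of the same idea in Python follows. It assumes no part-of-speech tagger, so it counts every word directly to the left of the node word rather than verbs only, and the sample sentences are invented for illustration, not taken from the BNC.

```python
import re
from collections import Counter


def left_collocates(text, node):
    """Count the word immediately preceding each occurrence of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        tokens[i - 1] for i, tok in enumerate(tokens) if tok == node and i > 0
    )


# Invented sample sentences (an assumption for the example, not corpus data).
sample = (
    "Everybody makes mistakes. Learners make mistakes when they rush. "
    "Good teachers correct mistakes gently, and students who make mistakes learn."
)

counts = left_collocates(sample, "mistakes")
print(counts.most_common())
```

On a tagged corpus the count would be restricted to tokens tagged as verbs, which is what the `[v*]` part of the query does.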

21

22 powerful, not strong!!! [aj*]

23 Activity four: meaning via collocations and co-text

24

25 For non-motivated students: WordSmith
- Contact with the English language: input (at least lexis-wise)
- Popular culture: MUSIC IN ENGLISH!!!

26 Activity one: music corpora, lexis, and the BNC for grammar accuracy

27 author corpus; reference corpus

28 Select the text you want a list of

29 Save both lists to compare them with Keyword

30 author corpus list; reference corpus list
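Comparing an author word list against a reference word list amounts to asking which words are unusually frequent in the author corpus. A minimal sketch follows, assuming log-likelihood as the keyness statistic (one commonly used measure; the slides do not say which setting was chosen). The function names and the tiny word lists are hypothetical and serve only to make the example runnable.

```python
import math
from collections import Counter


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood keyness for a word seen `a` times in a study corpus of
    `total_a` tokens and `b` times in a reference corpus of `total_b` tokens."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_a)
    if b > 0:
        ll += b * math.log(b / expected_b)
    return 2 * ll


def keywords(study, reference):
    """Rank words that are relatively more frequent in `study` than in `reference`."""
    total_s, total_r = sum(study.values()), sum(reference.values())
    scored = []
    for word, a in study.items():
        b = reference.get(word, 0)
        # Keep only positive keywords: higher relative frequency in the study corpus.
        if a / total_s > b / total_r:
            scored.append((word, log_likelihood(a, b, total_s, total_r)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy word lists standing in for the saved author and reference lists.
study = Counter("love love baby tonight love the the baby".split())
reference = Counter("the of the and the report love of the and".split())

for word, score in keywords(study, reference):
    print(f"{word}\t{score:.2f}")
```

With real data, the two `Counter` objects would be built from the author corpus (for example, song lyrics) and a general reference corpus. Very small samples like the one above give unreliable scores and are used here only to show the mechanics.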

31

32

33 That was all! The nightmare is over! Thank you for listening! ^.^

