Intro to corpus linguistics


1 Intro to corpus linguistics
Corpora and language teaching John Corbett & Wendy Anderson

2 Today’s session
Applications of corpus linguistics to:
Classroom teaching
Course and materials design

3 Uses of corpora in ELT
Diagram labels: Dictionaries; Textbooks; Grammars; Teaching corpora; Classroom; Materials; Activities; Syllabus design; Testing; Basic research.
Based on Johansson, 2009.

4

5 Issues for discussion…
At what point can learners be expected to work with grammar inductively, using corpus-informed examples?
Should learners be trained to use corpora themselves?
What is the role of metalanguage in the ELT classroom (e.g. ‘collocation’, ‘concordance line’)?
What balance should be struck between awareness (receptive knowledge) and productive knowledge?
How do learners respond to ‘grey areas’ and ‘exceptions’?

6 The corpus as a source of texts
Some corpora can simply be used by classroom teachers as a source of ‘authentic’ texts, of the kind that are not often found in textbooks. The texts can be integrated into ‘process-based’ learning tasks (cf. Nunan, 1989).
Components of a TASK: Goal, Input, Activity, Learner role, Teacher role, Settings.

7 The corpus as a source of texts
Goal: to practise story-telling in conversation
Input: recordings from the SCOTS corpus
Activity: predict story structure; listen to stories from the corpus; attend to useful language; revise own stories accordingly
Learner role: anticipate; attend; revise
Teacher role: set up and monitor tasks; indicate useful language
Settings: pairwork

8 Exploiting the spoken texts
Spoken texts (including sound files) can be downloaded and sampled as the basis of classroom activities. One activity asks learners to ‘anticipate dialogues’. Let’s look at one example.

9 Modelling dialogue
Pairwork: A and B
Roleplay: Person A is a parent. Person B is a child, five years of age. The parent asks the child how Spiderman became Spiderman. The child explains.

10 How Spiderman became Spiderman
FATHER: [laugh] Okay. So, ehm, I was wondering if maybe you could tell me about how Spiderman became Spiderman?
CHILD: Ehm, so this guy wa-, called Peter Parker was in a place and officer um [?]crogrammes[/?] and there was a spider, //radioactive//
FATHER: //mm//
CHILD: spider.
FATHER: Oh!

11 How Spiderman became Spiderman
CHILD: And it climbed up Peter Parker and it bit him and then he tu- and then he was poisoned by that spider and he, and he was turning into a spider himself!
FATHER: Oh //He turned in-//
CHILD: //[?]With a[/?]// human head!

12 How Spiderman became Spiderman
FATHER: Oh so he had a spider's body and a human head?
CHILD: Yeah.
FATHER: Oh! So how did he become Spiderman that's not got the spider's body?
CHILD: Um, and there was actually a dream and a- and he walked onto the road and a car was coming //for him//
FATHER: //Oh!//

13 How Spiderman became Spiderman
CHILD: and he jumped and he was sticking to the wall!
FATHER: Really?
CHILD: Erm, and he noticed he had superpowers, so he would call himself Spiderman!

14 Some things to notice
Children are vague (‘Peter Parker was in a place’)
Adults repeat what children are saying (‘Oh so he had a spider's body and a human head?’)
Adults give lots of feedback (‘Oh! Really!’)
Children are dramatic (‘Spidermaaaaan!!!’)
Can we find any behaviour here that is generally useful in conversation?

15 Redrafting
So, change roles with your partner and describe how Superman became Superman. Remember, one partner is five years old and the other is a parent…

Certain corpora (e.g. corpora of speech and writing that allow access to full texts) can be used as easily explorable text/discourse archives that can be searched for generic texts (e.g. oral stories, written reports) that illustrate generic features. Learning tasks can be created to draw attention to these features, and to help learners internalise them.

16 ICLE - http://www.uclouvain.be/en-cecl-icle.html

17 Using learner corpora (LC)
Diagram labels: LC collection; LC analysis; LC-informed pedagogical application; Learner population x; Learner population y; Academic; publishers.
Caption: Learner corpora for delayed pedagogical use. From Granger, 2009.

18 Learner corpora of academic writing
“[…] features such as personal pronouns, contractions, the quantifier all or the demonstrative pronoun that, which are markedly more frequent in conversation than in writing, tend to be used more by German, Spanish and Bulgarian learners than by native speakers. Similarly, Granger and Rayson (1998) demonstrate that French-speaking learners overuse many lexical and grammatical features typical of speech, such as first and second person pronouns or short Germanic adverbs (also, only, so, very, etc.), but underuse many of the characteristics of formal writing, such as a high density of nouns and prepositions. Other studies have focused on more specific items, for example I think (Granger 1998, Aijmer 2002, Neff et al. 2007), of course (Granger and Tyson 1996, Altenberg and Tapper 1998, Narita and Sugiura 2006), because (Lorenz 1999) or so (Lorenz 1999, Anping 2002), showing that these items tend to be overused by learners and that this overuse gives learner writing a distinctly oral tone.” Gilquin and Paquot, 2007

19 ‘Over-used’ lexical items in ICLE

20 ‘Maybe’ across 4 corpora.

21 Why is this feature ‘over-used’?
The examination of academic essays produced by native students brings to light [an] explanation for the spoken-like nature of learner writing, namely the influence of developmental factors. We compared the ICLE data with data from the Louvain Corpus of Native English Essays (LOCNESS, cf. Granger 1996), which contains about 300,000 words of academic writing produced by British and American students, and came to the conclusion that novice writers tend to use spoken features, regardless of whether English is their mother tongue or not. Thus, Figure 4, which gives the frequency of maybe in four varieties of English (academic writing, student writing, learner writing and speech), shows that native students also have a tendency to overuse this spoken-like adverb, although it is slightly less marked than among EFL learners. Register confusion, therefore, seems to be as much part of the process of acquiring a foreign language as it is part of the process of becoming an expert writer.
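A comparison like the one described above depends on normalising raw counts, because the corpora differ in size. The following is only a minimal sketch of that step: the function name `per_million`, the simple tokeniser and the two toy samples are assumptions made for illustration, and they are not the ICLE or LOCNESS data.

```python
import re


def per_million(text: str, word: str) -> float:
    """Return the frequency of `word` per million tokens of `text`."""
    # Very rough tokeniser (an assumption): lower-case runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return tokens.count(word.lower()) / len(tokens) * 1_000_000


# Hypothetical toy samples standing in for real corpora.
samples = {
    "expert academic writing": "The results may perhaps reflect sampling error.",
    "learner writing": "Maybe the results are wrong. Maybe the sample is small.",
}

for name, text in samples.items():
    print(f"{name}: {per_million(text, 'maybe'):,.0f} per million words")
```

With real corpora, each sample string would be replaced by the full text of a corpus, and the normalised figures could then be compared across varieties.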

22 From learner corpora to dictionary
From Macmillan Dictionary entry for ‘maybe’

23 Using learner corpora
Diagram labels: Learner group x; LC collection; Teachers; LC analysis; LC application; Learner group x; Learner group y.
Caption: Learner corpora for immediate pedagogical use. From Granger, 2009.

24 ‘think’ in past student assignments
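The content of this slide is not reproduced in the transcript. Assuming it showed concordance lines, a rough sketch of how such lines could be generated from the text of past assignments follows. The function name `kwic`, the context width and the sample sentence are all hypothetical.

```python
import re


def kwic(text: str, keyword: str, width: int = 30) -> list[str]:
    """Return one key-word-in-context line per whole-word match of `keyword`."""
    lines = []
    pattern = rf"\b{re.escape(keyword)}\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both sides so the keyword lines up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


# Hypothetical sample text, not taken from any real assignment.
sample = "I think this is true. We do not think that the data support it."
for line in kwic(sample, "think", width=20):
    print(line)
```

Aligning the keyword in a column is what lets learners scan the left and right contexts for recurring patterns.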

25

26 Collocates of ‘because’ in BNC: Academic

27 Collocates of ‘because’ in BNC: Academic

28 Expressing stance about cause
Use corpus searches to identify options for modifying a subordinate clause of reason:
This is partly because ...
Alternatives to ‘partly’: largely, simply, merely, presumably, precisely, no doubt, mainly, possibly
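One way to approximate this kind of search on plain text is to count the word that immediately precedes each occurrence of ‘because’. The sketch below is illustrative only: `modifiers_before` is a hypothetical helper and the sample sentences are invented. It looks at a single preceding word, so a two-word modifier such as ‘no doubt’ would show up only as ‘doubt’, and non-adverbs such as ‘is’ are counted too and need filtering by eye.

```python
import re
from collections import Counter


def modifiers_before(text: str, target: str = "because") -> Counter:
    """Count the word immediately preceding each occurrence of `target`."""
    # Rough tokeniser (an assumption): lower-case runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        prev for prev, word in zip(tokens, tokens[1:]) if word == target
    )


# Invented sample sentences, not drawn from the BNC.
sample = (
    "This is partly because of cost. It failed, presumably because of timing. "
    "This is partly because the data were sparse."
)
for word, count in modifiers_before(sample).most_common():
    print(word, count)
```

Run over a large academic corpus, a ranked list of this kind gives learners a menu of stance options to choose from.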

29 Expressing stance about cause
This is partly because ...
Alternatives to ‘partly’: largely, simply, merely, presumably, precisely, no doubt, mainly, possibly
Student text: Moreover we may noticed that the most frequent use of idioms were in the genre of Spoken. And also I think that is because the spoken materials were much more difficult to collect in the old days due to the technical constraints.

30 Expressing stance about cause
This is partly because ...
Alternatives to ‘partly’: largely, simply, merely, presumably, precisely, no doubt, mainly, possibly
Revised student text: Moreover we may noticed that the most frequent use of idioms were in the genre of Spoken, presumably because the spoken materials were much more difficult to collect in the old days due to the technical constraints.

31 Take-home messages
Corpus-based linguistics has had an enormous impact on learner dictionaries and grammar books.
It has had a substantial (but more controversial) impact on textbooks and teaching materials.
It has had a lesser impact directly on classroom activities so far, partly (presumably?) because…
Computers are not available in all classrooms yet.
It is difficult to convert corpus-derived information into learning activities.
Corpora can even give learners ‘too much information.’

32 References
Aijmer, Karin, ed. (2009) Corpora and Language Teaching. Amsterdam: John Benjamins.
Anderson, Wendy and John Corbett (2017) Exploring English with Online Corpora. 2nd edition. London: Palgrave Macmillan.
Gilquin, Gaëtanelle and Magali Paquot (2007) ‘Spoken Features in Learner Academic Writing: Identification, Explanation and Solution’. Proceedings of the Fourth Corpus Linguistics Conference, University of Birmingham, July.
Granger, Sylviane (2009) ‘The contribution of learner corpora to second language acquisition and foreign language teaching: a critical evaluation’ in Aijmer, ed., pp
Johansson, Stig (2009) ‘Some thoughts on corpora and second language acquisition’ in Aijmer, ed., pp
O’Keeffe, Anne, Michael McCarthy and Ronald Carter (2007) From Corpus to Classroom: Language Use and Language Teaching. Cambridge: CUP.

