Profiling French Vocabulary: The shape of lexicons by frequency & coverage 10.45-11.15, Monday, March 23, Session K Nfld., Room 13, Mezzanine Tom Cobb.

Presentation transcript:


Abstract

Lexical frequency profiling (LFP; Laufer & Nation, 1995), which has been highly influential in ESL vocabulary research and instruction, has had a slower beginning in French. This has been due to lack of access to large corpora of French from which pedagogically relevant frequency information could be derived. Pioneering efforts in the 1990s (Goodfellow & Lamy, 2002) facilitated promising comparisons of the lexical coverage of French and English texts (Cobb & Horst, 2004), which had pedagogical implications that were both interesting and practical (Ovtcharov, Cobb & Halter, 2006) but inconclusive owing to incompleteness of the frequency information. Now, however, work behind the Frequency Dictionary of French by Lonsdale and Le Bras (Routledge, 2009) has produced and made available complete and lemmatized corpus-based frequency information for French. This means that both researchers and teachers can now in principle use the LFP methodology to explore thoroughly the lexical composition, sophistication, and ‘richness’ of French texts. The talk will discuss the method of incorporating the frequency information within an LFP methodology, examples of the sort of research such profiling makes possible, and the means by which researchers can access the tools of this analysis and use them for their own purposes. Representative initial findings from the application of this methodology to French will be offered, including a suggestion that French deploys its lexical resources rather differently from English and may present unique and previously undefined lexical challenges to its learners.

Recent corpus work in French has made lexical frequency profiling (LFP) methodology available to French researchers and teachers. Initial findings suggest that French may present unique lexical challenges to its learners. Attendees will be shown how to access the tools of this analysis for use in their own work.


The main new idea of the “vocab revolution” in ESL/FL is Zipf’s old idea that some words get way more use than others in any language, made recently usable by computer technology.
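As a rough illustration of Zipf's point, the sketch below counts word forms in a text and reports what share of the running words is covered by the n most frequent forms. The function name, the tokenizing regex, and the sample sentence are assumptions made for this example; they are not taken from the talk or from any profiling tool.

```python
from collections import Counter
import re


def coverage_of_top_n(text: str, n: int) -> float:
    """Share of running words (tokens) covered by the n most frequent word forms."""
    # Crude tokenizer (an assumption): runs of letters, lowercased.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(freq for _, freq in counts.most_common(n))
    return covered / len(tokens)


# Invented sample sentence, for illustration only.
sample = "le chat voit le chien et le chien voit le chat"
top2 = coverage_of_top_n(sample, 2)
top5 = coverage_of_top_n(sample, 5)
```

Even in this tiny sample, two word forms out of five account for more than half of the running words, which is the skew that frequency lists exploit.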

(1) Consistency, (2) where to look

The AWL effect

So it was a reasonable question to ask, “Is there an AWL in French?”
An interesting question for several reasons…
This gradually became a question that could be answered

FRENCH – v.1 (zoom)

English v. French: ENG 1k+2k = 80%, FR 1k+2k = 90%

So French is getting the AWL effect for free
And for fewer words

So the question had to be reformulated: not “Is there an AWL in French?” but “Is there room for an AWL in French?”

The answer seemed to be “No”
1k+2k is already giving 90% coverage
And the remaining 10% is presumably needed for technical, archaic, & oddball items
With the implication that acquiring a functional second lexicon was easier in French

Back to English

A happy picture in ESL vocab: 2k + AWL = 90% (+ technical = 95%)
BUT SHORT-LIVED
1. The goal of vocab development was recalculated (Nation, 2006): the comprehension bar got raised, from 95% coverage to 98% coverage
2. The how-to of building tech lists became less clear
3. Bigger, better frequency lists put the existence of an AWL in question
– BNC lists (2005)
– BNC-COCA lists (2012)
But the notion of 2000 words = 80% has pretty much survived

VP-BNC-COCA (zoom)

So the new question about French is not “Is there room for an AWL in French?” but “How are the medium- and low-frequency lexical resources of French deployed in the remaining 10% space available?”
What does this imply for learning French?
This question gradually became answerable

25 lemmatized French k-lists
From Lonsdale & Le Bras dictionary project at BYU
Based on 23-million word corpus
Continental + International French 50/50
Spoken and written 50/50
Literary 40%, expository 60%
List-crunched for RANGE + FREQ
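To show how k-lists like these feed a lexical frequency profile, here is a minimal sketch. It assumes the text has already been lemmatized and that each lemma maps to a band number (1 for the first thousand, 2 for the second, and so on). The function name, the toy band assignments, and the toy text are hypothetical and do not reproduce the actual lists or the VP-Fr tool.

```python
from collections import Counter


def frequency_profile(lemmas, band_of):
    """Return cumulative coverage (0-1) of a text at each frequency band.

    lemmas:  list of lemmatized tokens from the text being profiled
    band_of: dict mapping lemma -> band number (1 = first 1,000 lemmas, ...)
    Tokens missing from band_of count as off-list and are never covered.
    """
    total = len(lemmas)
    per_band = Counter(band_of[l] for l in lemmas if l in band_of)
    profile, running = {}, 0
    for band in range(1, max(per_band, default=0) + 1):
        running += per_band.get(band, 0)
        profile[band] = running / total
    return profile


# Toy, hypothetical band assignments; real k-lists hold 1,000 lemmas each.
toy_bands = {"le": 1, "être": 1, "chat": 2, "gris": 2, "farouche": 8}
toy_text = ["le", "chat", "être", "gris", "le", "chat",
            "être", "farouche", "Godot", "le"]
toy_profile = frequency_profile(toy_text, toy_bands)
```

In the toy run, band 1 covers half of the tokens, bands 1 and 2 together cover 80%, and the single band-8 item lifts coverage to 90%, with the proper noun left off-list.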

FRENCH – v.5

So now we can investigate the shape of the mid-frequency French lexicon
And make plausible comparisons with English
What lies between 90% and 95% coverage in French texts?
– Or between 90% and 98%?
Is there “less to learn” in French than in English?
(Remembering that lemmas ≠ families)
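The 95% and 98% questions above come down to reading off the first k-level at which cumulative coverage reaches the criterion. A small sketch of that lookup follows; the function name is hypothetical and the coverage curve is invented purely to exercise the code, so it should not be read as a result from the talk.

```python
def level_needed(cumulative, target):
    """Return the first k-level whose cumulative coverage reaches target, else None.

    cumulative: cumulative coverage percentages, index 0 = k1, index 1 = k2, ...
    target:     coverage criterion in percent, e.g. 95 or 98
    """
    for level, coverage in enumerate(cumulative, start=1):
        if coverage >= target:
            return level
    return None


# Invented figures for illustration only; they are not data from the talk.
made_up_curve = [78.0, 90.0, 93.0, 94.5, 95.5, 96.4, 97.1, 97.6, 98.1]
k_for_95 = level_needed(made_up_curve, 95)
k_for_98 = level_needed(made_up_curve, 98)
```

Running the same lookup on an English family-based curve and a French lemma-based curve would give the pair of k-levels that the comparison rests on.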

3 tests

Test 1: Translated popular texts
20 translated Reader’s Digest texts: 20 Fr, 20 Eng
Half translated E->F, half F->E
Total 2939 words Eng, 3650 words Fr
Run through VP-Fr as a mini-corpus (as a single file)

ENGLISH (95% and 98% coverage points marked)

FRENCH (95% and 98% coverage points marked)

Side by side, using 98% criterion: Eng (fams) v. Fr (lemmas)

Fr (lemmas): A lot of words in that blue circle!
The difference between k8 and k16 is only 100 word types
But these 100 words are drawn from a pool of 8,000 lemmas

Test 2: Translated extended literary work
Samuel Beckett’s idea: French as “an impoverished lexicon”?
Actually he never said this
But he did write in French, and “use stark language to convey a stark world”
How stark is Beckett’s French?

«En attendant Godot» v. “Waiting for Godot”
Moving proper nouns into 1k has changed the 1k-2k picture

Test 3
Maybe Tests 1+2 were something about translated texts?
OK, then let’s compare 4 random original editorial texts
Chosen March 2015
From (1) Le Monde – Paris (2) Le Devoir – Montreal (3) The Globe & Mail – Toronto (4) The NY Times – New York

Conclusion (1) Comparing languages:
– French may make slightly more use of its common words than English does
– But it makes far more use of its mid- and low-frequency lexical resources (3k to 20k+)
– Cobb & Horst (2004) was right as far as it went, but incomplete for lack of resources

Conclusion (2) Comparing learning tasks:
Learning enough vocab for 90% coverage looks slightly easier in French than English
But learning enough words for 98% or even 95% coverage looks far more difficult
How many FL2 students ever get there?

(3) The shapes of the two lexicons seem to be like this:
English (95% and 98% coverage points marked)

French (95% and 98% coverage points marked)

But notice that the French early advantage persists to about 4k
(So 3k words in French give better coverage than in English)

Discussion
Is the greater ease of acquiring a 90% lexicon in French a reason for the traditional FL2 emphasis on phonology and syntax?
Is it that French is a more “academic/elitist” lexicon…
Or just that English is less so?
– Maybe the shape of English reflects the lingua franca role the language has come to play
– Such that its writers use *circumlocution* for complex ideas, rather than seeking « le mot juste » (Flaubert)?

ENGLISH AS A LINGUA FRANCA? BUT SURELY NOT IN THE 19th CENTURY.

Further work
As ever in corpus work, this needs empirical validation
– Do L2 readers with 10k lexicons actually experience a comprehension deficit?
As ever in list work, new lists are probably just around the next corner
– Any picture is strictly provisional

Pedagogical implications
Are there manageable zones within the French lexicon, like “technical lists”?
– … that could be found through work with specialist corpora?
Till then, the message seems to be
– Get out your flashcards!
At least now we know what to put on them
OR …

All chapters + papers + /list_learn/ available at
Thank you!

A method note
But wait! We are comparing lemmas v. families
(cat, cats v. cat, cats, catty)
1000 families give more coverage than 1000 lemmas
– How much more?
Some recent work by Charles Browne suggests an answer

… / 2818 × 100 = 84%
1000 lems have ~16% less coverage than 1000 fams in Eng
At high-frequency NGSL zone (1k+2k)
(probably less at lower-frequency zones)

But even assuming (1) a 16% difference that (2) was maintained at lower-frequency zones:
About every six lemma lists (6 × 16% = 96%) we would lose a k-level to maintain lemma-family equivalence
– So in 18 levels we would lose 3
The picture would not change greatly
– Even in an exaggerated worst-case scenario
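The worst-case arithmetic above can be sketched in a few lines. The 16% deficit and its constancy across frequency zones are the slide's own stated assumptions; rounding the loss to the nearest whole k-level, and the function name, are assumptions of this sketch.

```python
def family_equivalent_levels(lemma_levels: int, deficit: float = 0.16) -> int:
    """Convert a count of lemma k-levels to a rough family-equivalent count.

    Assumes each lemma list gives `deficit` less coverage than a family list,
    and rounds the total loss to the nearest whole level (a sketch assumption).
    """
    lost = round(lemma_levels * deficit)
    return lemma_levels - lost


from_18 = family_equivalent_levels(18)  # 18 * 0.16 = 2.88, so 3 levels lost
from_16 = family_equivalent_levels(16)  # 16 * 0.16 = 2.56, so 3 levels lost
```

Under these assumptions 18 lemma levels shrink to 15 and 16 shrink to 13, so the adjustment shifts the French figures without changing their order of magnitude.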

Eng (fams) v. Fr (lemmas)
K8 E-fams = k16 F-lems for 98%?
K8 E-fams = k13 F-lems for 98%
Pattern is the same