Published by Randolf Jacobs. Modified over 9 years ago.
Profiling French Vocabulary: The shape of lexicons by frequency & coverage
10.45-11.15, Monday, March 23, Session K
Nfld., Room 13, Mezzanine
Tom Cobb
Abstract
Lexical frequency profiling (LFP; Laufer & Nation, 1995), which has been highly influential in ESL vocabulary research and instruction, has had a slower beginning in French. This has been due to lack of access to large corpora of French from which pedagogically relevant frequency information could be derived. Pioneering efforts in the 1990s (Goodfellow & Lamy, 2002) facilitated promising comparisons of the lexical coverage of French and English texts (Author & Horst, 2004), which had pedagogical implications that were both interesting and practical (Ovtcharov, Author & Halter, 2006) but inconclusive owing to incompleteness of the frequency information. Now, however, work behind the Frequency Dictionary of French by Lonsdale and Le Bras (Routledge, 2009) has produced and made available complete and lemmatized corpus-based frequency information for French. This means that both researchers and teachers can now in principle use the LFP methodology to explore thoroughly the lexical composition, sophistication, and ‘richness’ of French texts. The talk will discuss the method of incorporating the frequency information within an LFP methodology, examples of the sort of research such profiling makes possible, and the means by which researchers can access the tools of this analysis and use them for their own purposes. Representative initial findings from the application of this methodology to French will be offered, including a suggestion that French deploys its lexical resources rather differently from how English does and may present unique and previously undefined lexical challenges to its learners.

Recent corpus work in French has made lexical frequency profiling (LFP) methodology available to French researchers and teachers. Initial findings suggest that French may present unique lexical challenges to its learners. Attendees will be shown how to access the tools of this analysis for use in their own work.
The main new idea of the “vocab revolution” (1990-) in ESL/FL
– Is Zipf’s old idea that some words get way more use than others in any language
– Made recently usable by computer technology
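Zipf's point can be checked on any text by asking what share of the running words is covered by the few most frequent word types. The sketch below is only an illustration: the function name `top_n_coverage` and the toy English sentence are invented here, not taken from the talk or from any Lextutor tool.

```python
from collections import Counter

def top_n_coverage(tokens, n):
    """Share of running words (tokens) accounted for by the n most frequent word types."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    covered = sum(freq for _, freq in counts.most_common(n))
    return covered / len(tokens)

# Toy text, invented for illustration only (13 running words, 8 word types).
toy_tokens = "the cat sat on the mat and the dog sat on the rug".split()

for n in (1, 3, 8):
    print(n, round(top_n_coverage(toy_tokens, n), 3))
```

Even in this tiny sample the single most frequent type covers 4 of the 13 running words, which is the skew that frequency lists exploit.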
[Figure slides 11-14: images not recovered. Surviving annotation: (1) consistency, (2) where to look]
[Figure: The AWL effect]
So it was a reasonable question to ask: “Is there an AWL in French?”
– An interesting question for several reasons…
– This gradually became a question that could be answered
[Figure: FRENCH – v.1 (zoom)]
[Figure: English and French profiles side by side]
ENG 1k+2k = 80%, FR 1k+2k = 90%
So French is getting the AWL effect for free
– And for fewer words
So the question had to be reformulated:
– Not “Is there an AWL in French?”
– But “Is there room for an AWL in French?”
The answer seemed to be “No”
– 1k+2k is already giving 90% coverage
– And the remaining 10% is presumably needed for technical, archaic, & oddball items
With the implication that acquiring a functional second lexicon was easier in French
Back to English
1995-2005, a happy picture in ESL vocab: 2k + AWL = 90% (+ technical = 95%)
BUT SHORT LIVED
1. The goal of vocab development was recalculated (Nation, 2006)
– The comprehension bar got raised from 95% coverage to 98% coverage
2. The how-to of building tech lists became less clear
3. Bigger, better frequency lists put the existence of an AWL in question
– BNC lists (2005)
– BNC-COCA lists (2012)
But the notion of 2000 words = 80% has pretty much survived
[Figure: VP-BNC-COCA (zoom)]
So the new question about French is
– No longer “Is there room for an AWL in French?”
– But “How are the medium- and low-frequency lexical resources of French deployed in the remaining 10% space available?”
What does this imply for learning French?
This question gradually became answerable
25 lemmatized French k-lists
– From the Lonsdale & Le Bras dictionary project at BYU
– Based on a 23-million-word corpus
– Continental + International French 50/50
– Spoken and written 50/50
– Literary 40%, expository 60%
– List-crunched for RANGE + FREQ
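The slide does not say how range and frequency were combined, so the following is only one plausible sketch of "list-crunching": count each lemma's total frequency and its range (the number of subcorpora it occurs in), drop lemmas below a minimum range, and cut the ranked list into fixed-size bands. The names `range_and_frequency` and `build_k_lists`, the `min_range` filter, and the toy data are all assumptions made for illustration; the real lists use 1,000-lemma bands and already-lemmatized input.

```python
from collections import Counter

def range_and_frequency(subcorpora):
    """Return (frequency, range) counters for a list of tokenized subcorpora."""
    freq = Counter()
    rng = Counter()
    for tokens in subcorpora:
        freq.update(tokens)        # total occurrences across the corpus
        rng.update(set(tokens))    # +1 for each subcorpus the word occurs in
    return freq, rng

def build_k_lists(subcorpora, band_size=1000, min_range=1):
    """Rank words by frequency (ties: wider range, then alphabetical) and slice into bands.

    The min_range filter and the tie-breaking rule are assumptions, not the
    documented procedure behind the French lists.
    """
    freq, rng = range_and_frequency(subcorpora)
    eligible = [w for w in freq if rng[w] >= min_range]
    ranked = sorted(eligible, key=lambda w: (-freq[w], -rng[w], w))
    return [ranked[i:i + band_size] for i in range(0, len(ranked), band_size)]

# Toy, already-lemmatized subcorpora, invented for illustration only.
toy_subcorpora = [
    ["le", "chat", "le", "chien"],
    ["le", "livre", "chat"],
    ["le", "chien", "le"],
]
toy_bands = build_k_lists(toy_subcorpora, band_size=2, min_range=2)
print(toy_bands)
```

Here "livre" is dropped because it occurs in only one subcorpus, which shows why a range criterion keeps topic-bound words out of the high-frequency bands.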
[Figure: FRENCH – v.5]
So now we can investigate the shape of the mid-frequency French lexicon
And make plausible comparisons with English
– What lies between 90% and 95% coverage in French texts?
– Or between 90% and 98%?
– Is there “less to learn” in French than in English?
(Remembering that lemmas ≠ families)
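The questions above come down to reading a cumulative coverage profile: for each k-list, what percentage of a text's running words is covered by that list and all more frequent ones, and at which list a target such as 95% or 98% is first reached. The sketch below is a minimal illustration and not the VocabProfile implementation; the function names and the toy lists are invented, and tokens are assumed to be already lemmatized.

```python
def coverage_profile(tokens, k_lists):
    """Cumulative % of tokens covered by each successive frequency band."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    band_of = {}
    for i, band in enumerate(k_lists):
        for word in band:
            band_of.setdefault(word, i)   # a word counts in its first (most frequent) band
    hits = [0] * len(k_lists)
    for token in tokens:
        i = band_of.get(token)
        if i is not None:                 # off-list tokens count in the total only
            hits[i] += 1
    profile, running = [], 0
    for h in hits:
        running += h
        profile.append(100.0 * running / len(tokens))
    return profile

def first_band_reaching(profile, target):
    """1-based index of the first band whose cumulative coverage meets target, else None."""
    for k, cumulative in enumerate(profile, start=1):
        if cumulative >= target:
            return k
    return None

# Toy bands and text, invented for illustration only.
toy_k_lists = [["le", "chat", "est"], ["sur", "tapis"], ["ronronne"]]
toy_text = "le chat est sur le tapis le chat ronronne zut".split()

toy_profile = coverage_profile(toy_text, toy_k_lists)
print(toy_profile)
print(first_band_reaching(toy_profile, 80), first_band_reaching(toy_profile, 95))
```

A return value of `None` means the lists never reach the target for that text, which is how off-list items such as proper nouns show up in a profile.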
3 tests
Test 1: Translated popular texts
– 20 translated Reader’s Digest texts (20 Fr, 20 Eng)
– Half translated E->F, half F->E
– Total 2939 words Eng, 3650 words Fr
– Run through VP-Fr as a mini-corpus (as a single file)
[Figure: ENGLISH profile, 95% and 98% coverage marked]
[Figure: FRENCH profile, 95% and 98% coverage marked]
[Figure: Side by side, Eng (fams) and Fr (lemmas), using 98% criterion]
Fr (lemmas)
A lot of words in that blue circle!
– The difference between k8 and k16 is only 100 word types
– But these 100 words are drawn from a pool of 8,000 lemmas
Test 2: Translated extended literary work
– Samuel Beckett’s idea: French as “an impoverished lexicon”?
– Actually he never said this
– But he did write in French, and “use stark language to convey a stark world”
– How stark is Beckett’s French?
[Figure: «En attendant Godot» and “Waiting for Godot” profiles]
Proper nouns-<1k has changed the 1k-2k thing
Test 3
– Maybe Tests 1+2 were something about translated texts?
– OK, then let’s compare 4 random original editorial texts
– Chosen 14-15 March, 2015
– From (1) Le Monde, Paris; (2) Le Devoir, Montreal; (3) The Globe & Mail, Toronto; (4) The NY Times, New York
Conclusion (1) Comparing languages:
– French may make slightly more use of its common words than English does
– But it makes far more use of its mid- and low-frequency lexical resources (3k to 20k+)
– Cobb & Horst (2004) was right as far as it went, but incomplete for lack of resources
Conclusion (2) Comparing learning tasks:
– Learning enough vocab for 90% coverage looks slightly easier in French than in English
– But learning enough words for 98% or even 95% coverage looks far more difficult
– How many FL2 students ever get there?
Conclusion (3) The shapes of the two lexicons seem to be like this:
[Figure: English, 95% and 98% coverage marked]
[Figure: French, 95% and 98% coverage marked]
[Figure: French (F) and English (E) coverage curves]
But notice that the French early advantage persists to about 4k
(So 3k words in French gives better coverage than in English)
Discussion
Is the greater ease of acquiring a 90% lexicon in French a reason for the traditional FL2 emphasis on phonology and syntax?
Is it that French is a more “academic/elitist” lexicon…
Or just that English is less so?
– Maybe the shape of English reflects the lingua franca role the language has come to play
– Such that its writers use *circumlocution* for complex ideas, rather than seeking « le mot juste » (Flaubert)?
ENGLISH AS A LINGUA FRANCA? BUT SURELY NOT IN 19th CENT.
Further work
As ever in corpus work, this needs empirical validation
– Do L2 readers with 10k lexicons actually experience a comprehension deficit?
As ever in list work, new lists are probably just around the next corner
– Any picture is strictly provisional
Pedagogical implications
Are there manageable zones within the French lexicon, like “technical lists”?
– … that could be found through work with specialist corpora?
Till then, the message seems to be
– Get out your flashcards!
At least now we know what to put on them
OR
All chapters + papers + /list_learn/ available at www.lextutor.ca
Thank you!
cobb.tom@sympatico.ca
A method note
But wait! We are comparing lemmas v. families
– Lemma: cat, cats v. family: cat, cats, catty
1000 families give more coverage than 1000 lemmas
– How much more?
Some recent work by Charles Browne suggests an answer
http://www.newgeneralservicelist.org/
2368 / 2818 × 100 = 84%
1000 lems have ~16% less coverage than 1000 fams in Eng
– At the high-frequency NGSL zone (1k+2k)
– (Probably less at lower-frequency zones)
But even assuming (1) a 16% difference that (2) was maintained at lower-frequency zones
– About every six lemma lists (6 x 16% = 96%) we would lose a k-level to maintain lemma-family equivalence
– So in 18 levels we would lose 3
The picture would not change greatly
– Even in an exaggerated worst-case scenario
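The adjustment arithmetic can be written out as a short calculation. This sketch assumes, as the slide's worst case does, that the 2368 / 2818 ratio holds at every frequency level; the function name `family_equivalent_levels` and the rounding to whole k-levels are choices made here for illustration.

```python
NGSL_RATIO = 2368 / 2818   # about 0.84, the ratio quoted from the NGSL comparison

def family_equivalent_levels(lemma_levels, ratio=NGSL_RATIO):
    """Convert a count of lemma k-levels to roughly equivalent family k-levels.

    Assumes the same shortfall (1 - ratio, about 16%) at every level, which the
    slides describe as an exaggerated worst case.
    """
    lost = round(lemma_levels * (1 - ratio))   # whole k-levels given up
    return lemma_levels - lost

for levels in (6, 16, 18):
    print(levels, "lemma levels ~", family_equivalent_levels(levels), "family levels")
```

Six lemma levels give up one level and eighteen give up three, matching the slide's figures, so a point reached at k16 in lemmas moves only to about k13 in family terms.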
[Figure: Eng (fams) and Fr (lemmas) side by side]
K8 E-fams = k16 F-lems for 98%?
K8 E-fams = k13 F-lems for 98%
Pattern is the same