1 Tom Cobb Université du Québec à Montréal Didactique Des Langues DDL for French learners: A resource wish-list 8-10 Sept 2011 Association.


1 1 Tom Cobb, cobb.tom@uqam.ca, Université du Québec à Montréal, Didactique des langues
DDL for French learners: A resource wish-list
8-10 Sept 2011, Association for French Language Studies, Colloque annuel, Nancy

2 2 OR – What we’ve got + what we still need to make DDL 1.0 in French

3 3 Some personal stuff
Is my presence here a bit fraudulent?
> Not a corpus linguist
> Not a French linguist or teacher
> Not « au courant » about French education
> Not a very good French speaker
Just an applied linguist with amateur programming ability and a belief that corpus is a useful tool in learning and teaching
- Whose class notes became a best-seller
- Whose class website became must-click

7 7 A huge pile of tutorial & language analysis tools
> But at least all in one place
> And do not require “administrative privileges” to access in a lab
> Works on most of the various browser x platform combinations
Trying to keep it all together is like riding a whirlwind
But one does make an effort…

10 10 So, good « concordanciers » all, we instantly extract the pattern:
1. Pretty much a Nordic sport
2. When Europe sleeps > N. America lextutors
3. China never sleeps

12 12 But what do all those energetic folks “do” on Lextutor? Site Stats are pretty bare But we get some idea from their itineraries

13 13 PATHWAYS 1

14 14 PATHWAYS 2

15 15 PATHWAYS 3

16 16 PATHWAYS 4

17 17 PATHWAYS 5

18 18 PATHWAYS 6

19 19 PATHWAYS 7

20 20 PATHWAYS 8

21 21 PATHWAYS 9

22 22 PATHWAYS 9a

23 23 PATHWAYS 10

24 24 PATHWAYS 11

25 25 PATHWAYS 12

26 26 PATHWAYS 13

27 27 PATHWAYS 14

28 28 PATHWAYS 15

29 29 PATHWAYS 16

30 30 PATHWAYS 17

32 32
> Some of it random
> Some of it inexplicable
> Some of it makes sense
So we combine the well-motivated, well-worn paths

34 34 My only experience of French education was…

35 35 Young Americans at the Sorbonne learned…
le subjonctif du passé historique
> … etc
… who could not order breakfast
… who did not know 300 basic words
… who had never said or understood a full sentence
→ A STRONG ARGUMENT FOR SOME “COMMUNICATIVE METHOD”!
+ some small awareness of frequency

36 36 But as we now know… COMMUNICATIVE METHOD had its own problems
Top end:
> Tendency to plateau
Lower end:
> Missing high-frequency (HF) vocab + verb tenses
> Non-grammaticized multiword units
> No sense of “language as object”
ENTER FOCUS ON FORM and LANGUAGE AWARENESS

37 37 I have always wondered why…
TEXT COMPUTING + FonF / LA have not seemed a more obvious link
The computer since about 2005 - commandeered as “just another means of communication”
With all the limitations of the earlier communicative era?
My job has become to make and sell this link

38 38 Some underlying assumptions
Much interesting research in applied linguistics makes extensive use of a language technology
Few learners at any level will have any major experience of this
Most language technologies used in research are never “pedagogicalized”
Many easily could be

39 39 Some underlying assumptions (2)
Many interesting language technologies are hard to get your hands on
Most can be reverse engineered
A corpus is not just something techy to keep the stronger students busy
Rather it is a necessary tool in SLA

40 40 Some underlying assumptions (3)
Unless Chomsky was right
Language acquisition depends on input
At least in L2 for post-adolescents
But patterns in natural input are - fragmentary, distributed, imperceptible
Requiring 15 years - to form via osmosis
Successful SLA can only occur - with some sort of data assembly + compression
The best form of this is ~ a corpus + way to query it (concordance)
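
A minimal sketch of what "a corpus + way to query it" can look like, assuming the corpus is a plain-text string held in memory. The function name `concordance`, its `width` parameter, and the three-sentence French sample are illustrative assumptions, not a description of how Lextutor is implemented.

```python
import re

def concordance(text, keyword, width=30):
    """Return one keyword-in-context (KWIC) line per occurrence of `keyword`."""
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Hypothetical three-sentence "corpus" for demonstration only.
sample = (
    "Le chat dort sur le lit. Mon chat aime le lait. "
    "Elle cherche son chat dans le jardin."
)

for line in concordance(sample, "chat", width=20):
    print(line)
```

Stacking the hits with the keyword in a fixed column is what makes the otherwise "fragmentary, distributed" patterns visible at a glance.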

41 41 Some underlying assumptions (4)
But a pedagogical corpus is not necessarily the same as the computational linguist’s
Corpus as a learning tool ~
> Need not be enormous
As in language pedagogy generally, “Do more with less”
WITH PROGRAMMING + IMAGINATION + USER FEEDBACK
> Need not be tagged
It’s the learner’s task to parse surface structure input

42 42 Some underlying assumptions (4, continued)
Corpus as a learning tool ~
> Need not follow a sampling principle
Frequency may be more useful (“This corpus is 95% first thousand level words…”)
Or “Our Course Materials”
Or “The words of Author X” etc
> Need not pass through the education establishment
With the Web can reach learners directly

43 43 Some underlying assumptions (5) Corpus as a learning tool ~ Is probably the only ground between the extremes of Hot Potatoes exercises and pricey R+D in “intelligent” iCALL … that can serve as a framework for interesting, practical, real-world CALL development

44 44 Some underlying assumptions (6) Corpus as a learning tool ~ Is probably the only way of developing a truly multilingual CALL Just a question of finding or making comparable corpora in different languages - and passing them through the same algorithms

45 45 Is there any proof for any of this?

46 46 Some research findings for pedagogical concordancing Deep vocabulary knowledge can result from multi-contextual encounters with words in a concordancing activity Breadth & depth of vocabulary learning can be reconciled with concordancing

47 47 Some research findings for pedagogical concordancing (2) A plausible case for constructivism can be constructed within a DDL/concordancing framework The value of collaborative learning in SLA can be shown in co-constructed learner concordances

48 48 Some research findings for pedagogical concordancing (3) Concordance feedback to learner writing via embedded links (a) works well, (b) helps some learners, (c) hurts none Many functions of an NS “reading buddy” can be realized by “resource-assisted reading” esp. text-linked concordancing

49 49 Some research findings for pedagogical concordancing (4) …the reading buddy AND, LINKED TO GOOGLE BOOKS, CAN DELIVER “THE TEACHER WITH THE TEXT” so that African universities need not build $$$ libraries $$$

50 50 BUT the multi-languages idea?

53 53 What can we do for our learners with all these corpora?

65 65 So what else is needed for a French DDL?

66 66 Wish List 1 A resident 3-5 million word general French corpus

68 68 Some of this may be a “familization” problem The word exists in a 1 million word corpus, just not in a particular form

69 69 Following Nation’s heroic cracking of the 100 million-word BNC into 14 k-family units… Many things became possible on Lextutor Especially with smallish corpora

71 71 From Wang & Nation (2004), based on Bauer & Nation (1993)

73 73 1k 80.17% | 2k 5.65% | AWL 1.68% | Off-list 12.50%
Panels: GSL + AWL; BNC LISTS (N, 2006); BNC + proper noun auto-extract (C+L, 09)
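
Band percentages like these are the output of a vocabulary profile: each running word in a text is assigned to the first frequency band that lists it, and anything unlisted is counted as off-list. The sketch below assumes tiny, made-up band lists (`BANDS`) and a hypothetical function name `profile`; real 1k/2k lists would be loaded from files.

```python
import re
from collections import Counter

def profile(text, bands):
    """Return the percentage of running words (tokens) that fall in each band.

    `bands` maps a band label to a set of word forms. Bands are checked in
    insertion order; tokens found in no band are labelled "off-list".
    """
    # Alphabetic tokens only (accented letters included), lowercased.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return {}
    counts = Counter()
    for token in tokens:
        label = next(
            (name for name, words in bands.items() if token in words),
            "off-list",
        )
        counts[label] += 1
    return {label: round(100 * n / len(tokens), 2) for label, n in counts.items()}

# Hypothetical miniature band lists, for illustration only.
BANDS = {
    "1k": {"le", "chat", "dort", "sur", "et", "chien"},
    "2k": {"lit"},
}

result = profile("Le chat dort sur le lit et le chien aboie.", BANDS)
print(result)
```

Because every token lands in exactly one band, the percentages sum to 100, as the four figures on the slide do.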

78 78 Wish List 2. A complete, familized French frequency list from a big corpus Or at least lemmatized
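
To make "familized" concrete, here is a rough sketch of how a frequency list changes once word forms are counted under a family headword. The `FAMILIES` table is a hand-made, hypothetical stand-in; an actual French family list is exactly the resource this wish-list item asks for.

```python
import re
from collections import Counter

# Hypothetical family table: headword -> member forms (illustration only).
FAMILIES = {
    "parler": {"parler", "parle", "parles", "parlons", "parlez", "parlent"},
    "chat": {"chat", "chats"},
}

def build_lookup(families):
    """Invert the family table into a form -> headword dictionary."""
    return {form: head for head, forms in families.items() for form in forms}

def family_frequencies(text, families):
    """Count tokens by family headword, most frequent first.

    Forms that belong to no listed family are counted under their own
    surface form, so gaps in the table remain visible in the output.
    """
    lookup = build_lookup(families)
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(lookup.get(tok, tok) for tok in tokens).most_common()

text = "Nous parlons, vous parlez, ils parlent. Les chats dorment."
for head, count in family_frequencies(text, FAMILIES):
    print(head, count)
```

In this toy text no single form of "parler" occurs more than once, yet the family occurs three times, which is the effect that lets a word "exist" in a small corpus even when a particular form does not.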

79 79 With such a list we can offer learners… A summary of the whole lexicon of a language + a plan for getting acquainted with it

83 83 Of course, for learning basic vocabulary items (1k, 2k, 3k) Contextual learning from a “real” corpus will not work

84 84 Build a comprehensible corpus

85 85 Wish List 3. Graded French corpora Of at least 1 million words Ideally at 1k, 2k, 3k levels So we can say, “For this learner, 95% of items in this corpus are comprehensible”
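
One possible way to approximate a graded corpus, sketched under stated assumptions: keep only the sentences in which at least 95% of running words are on the learner's known-word list, then report coverage for what remains. `KNOWN_1K`, `coverage`, and `graded_subcorpus` are hypothetical names, and sentence-level filtering is only one of several ways such a corpus could be built.

```python
import re

def tokenize(text):
    """Split text into lowercase alphabetic tokens (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def coverage(text, known_words):
    """Fraction of running words in `text` that appear in `known_words`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(tok in known_words for tok in tokens) / len(tokens)

def graded_subcorpus(sentences, known_words, threshold=0.95):
    """Keep only the sentences whose coverage meets the threshold."""
    return [s for s in sentences if coverage(s, known_words) >= threshold]

# Hypothetical stand-in for a first-thousand (1k) list.
KNOWN_1K = {"le", "la", "chat", "dort", "mange", "une", "sur", "lit"}

sentences = [
    "Le chat dort sur le lit.",
    "Le chat mange une souris.",
    "La souris dort.",
]

kept = graded_subcorpus(sentences, KNOWN_1K)
print(kept)
print(coverage(" ".join(kept), KNOWN_1K))
```

With a real 1k, 2k, or 3k list in place of `KNOWN_1K`, the final figure is the number behind a statement such as "for this learner, 95% of items in this corpus are comprehensible".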

86 86 And… so far all this only deals with individual words

87 87 Some Multi-Word Units
> independent, non-compositional meaning
are so frequent that…
> they are actually 1st and 2nd 1000 items
> E.g., learners will meet “of course”
> More frequently than 2k item “window”
505 of these belong in the most frequent 5,000 (>10%)
Schmitt & Alvarez, Nottingham
However… We have now discovered *the V.H.F Multi-Word*

90 90 Wish List 4. A list of the VHF idiomatic multi-words in French (to incorporate in Wish-List item 3)
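
A first pass at such a list could start from raw n-gram counts, as in the sketch below. This is only a candidate generator: frequency alone does not show that a sequence has independent, non-compositional meaning, so the output would still need human judgement. The function name `ngram_counts` and the sample text are assumptions, and this simple version lets n-grams run across sentence boundaries.

```python
import re
from collections import Counter

def ngram_counts(text, n=2):
    """Count every run of `n` consecutive tokens in `text`."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )

# Hypothetical sample text, for illustration only.
text = "Bien sûr, il vient. Tout à fait, bien sûr. Il ouvre la fenêtre."

bigrams = ngram_counts(text, 2)
trigrams = ngram_counts(text, 3)
print(bigrams.most_common(3))
print(trigrams["tout à fait"])
```

Comparing these counts with single-word frequencies from the same corpus is what would show a multi-word unit outranking an individual word, as with "of course" and "window" in English.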

91 91 And when all that is done ~ The frequency list can be adjusted for homographs
« Les poules du couvent couvent » (“The convent’s hens are brooding”)
« Tu as l’as de pique » (“You have the ace of spades”)
« Les vers marchent vers Avignon » (“The worms are walking toward Avignon”)
A combined frequency rating based on word form?
English wish list: contextually sensitive word lists (DDL 2.0)

92 92 Wish List Summary
1. A mid-size general corpus of French for DDL web work (3 million-ish)
2. A graded corpus for DDL web work
3. A complete, familized frequency list from a big corpus
4. Identification of HF multiwords for eventual inclusion in 3

93 93 Follow-up…
Software: lextutor.ca/
Papers: lextutor.ca/cv/
Anything else
> Find me here
> Write me at cobb.tom@uqam.ca

95 95 Technology and Language Testing Corpus-Based Testing Parallel Concordancing Analyzing Speech Corpora Concordancing Language Teacher Training in Technology Language Trainer Training in Technology Technology and Listening Computer-Assisted Language Learning Effectiveness Research Learner Modeling in Intelligent Computer-Assisted Language Learning Intelligent Computer-Assisted Language Learning Natural Language Processing in Intelligent Computer-Assisted Language Learning Learner Corpora Automated Speech Recognition Technology and Discourse Intonation Computer-Assisted Vocabulary Load Analysis Technology and Usage-Based Teaching Applications Information Retrieval for Reading Tutors Online Communities of Practice Emerging Technologies for Language Learning Lexical Bundles Technology and Phrases Technology and Teaching Writing Latent Semantic Analysis

96 96 Mobile Assisted Language Learning Technology and Phonetics Computer-Mediated Communication and Second Language Use Computer-Mediated Communication and Second Language Development Multimodal Computer-Mediated Communication and Distance Education Distance Language Learning Massively Multiplayer Online Games Digital Divide Technology and Teaching Vocabulary Exporting Applied Linguistics Technology Monolingual Lexicography Bilingual Lexicography Lexicography Across Languages Internet and World English Searchlinguistics Internationalization and Localization Translation Terminology Technology and Literacy Corpora and Literature Keyword Analysis Connectionism Text-to-Speech Synthesis Development Text-to-Speech Synthesis Research Lexical Priming Technology and Culture Computer-Assisted Language Learning and Machine Translation

