Applying some Developments in Corpus Building Technology to Language Teaching and Learning. TALC 2006, Paris.


1 Applying some Developments in Corpus Building Technology to Language Teaching and Learning. TALC 2006, Paris

2 James Thomas & Jan Pomikálek, Department of Information Technology, Faculty of Informatics, Masaryk University, Brno, Czech Republic

3 Data Driven Learning: doctoral students of the Faculty of Informatics; training and trust needed: to ask questions, to be able to create queries, to believe answers, to trust descriptive accounts

4 TALC 2002: Corpus consultation hampered by students’ limited vocabulary: different tasks needed; concordances need to be sorted. Readability: average word frequency of each concordance. Reference: The design of a Lexical Difficulty Filter for language learning on the Internet (pdf)
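
The readability idea on this slide, ranking concordance lines by the average frequency of their words, can be sketched as follows. This is only an illustration under stated assumptions: the frequency list `FREQ` is a toy example with made-up counts, the regex tokenizer is a simplification, and scoring unknown words as 0 is one possible choice rather than the filter's documented behaviour.

```python
import re
from statistics import mean

def average_frequency(line, freq):
    """Mean frequency of the words in one concordance line."""
    words = re.findall(r"[a-z']+", line.lower())
    if not words:
        return 0.0
    # Assumption: words missing from the frequency list count as 0 (rare).
    return mean(freq.get(w, 0) for w in words)

def sort_by_readability(lines, freq):
    """Order lines so that those with the most frequent vocabulary come first."""
    return sorted(lines, key=lambda ln: average_frequency(ln, freq), reverse=True)

# Toy frequency list (hypothetical counts, not taken from any real corpus).
FREQ = {"the": 1000, "a": 800, "is": 700, "cat": 50, "big": 60,
        "ontology": 2, "instantiates": 1}

lines = [
    "the ontology instantiates a cat",
    "the cat is big",
]
ranked = sort_by_readability(lines, FREQ)
```

Here `ranked` places "the cat is big" first, because its words are on average more frequent in the toy list.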

5 What changed … Web-based interface: Bonito became Word Sketch Engine (WSE); user friendly. CQL now optional (example). New features, new results! (example): word sketches; sketch differences; thesaurus (statistical); frequency distribution (chunks/patterns)
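
To show what "CQL now optional" spares the learner, here is a minimal sketch of turning a plain phrase into a CQL token sequence of the form `[attribute="value"]`. The helper `phrase_to_cql` is hypothetical and is not part of the Word Sketch Engine; which attributes (`word`, `lemma`) a given corpus exposes is also an assumption.

```python
def phrase_to_cql(phrase, attribute="word"):
    """Turn a space-separated phrase into a CQL token sequence.

    Values are inserted as-is; CQL treats them as regular expressions,
    so special characters are not escaped in this sketch.
    """
    tokens = phrase.split()
    return " ".join(f'[{attribute}="{tok}"]' for tok in tokens)

query = phrase_to_cql("depend on")
lemma_query = phrase_to_cql("depend on", attribute="lemma")
```

With these inputs, `query` is `[word="depend"] [word="on"]`, the kind of expression a student previously had to type by hand.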

6 Addressing issues of faith and skills. Worksheets including instructions: example relating to the textbook. Classroom use of concordance printouts: prepositions. Activities set for corpus use: example relating to the textbook. Error correction of each other’s written work

7 Addressing Problem 1 (cont.) Faith in general corpus use: students find the results convincing and useful. Feedback from students: qualitative feedback only; see abstract; BNC not “computer savvy”

8 BNC: limited application. Dated: 94% of texts from 1985 to 1993; modern technology not accounted for. Technical vocabulary missing. Differences between word usage: higher frequency of academic vocabulary not represented (Coxhead); see key words list. Solution: revisit an old idea …

9 TALC 2004: Each department at FI MU was invited to contribute academic papers to a new Informatics Corpus. Metatag sections to serve as models for own writing. Language differences between introductions, methodology, conclusions

10 Ran aground. Demand for metadata, too fine-grained: too labour-intensive; few could see the point, unable to give priority to it. Convoluted uploading interface

11 Addressing Problem 2: “Build Corp”, “Corpus Builder”: configurable metadata list; POS tagging, lemmatization; other transformations can be incorporated, e.g., HTML to text; corpus configuration; building word sketches; compiling statistical thesaurus; user accounts management
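
Two of the steps listed on this slide, an HTML-to-text transformation and attaching configurable metadata, can be sketched with the Python standard library. This is not Corpus Builder's actual code. The output imitates the one-token-per-line "vertical" layout with a `<doc ...>` header that corpus managers commonly take as input; that Corpus Builder emits exactly this layout is an assumption, as are the metadata keys `title` and `acm` and the naive regex tokenizer. POS tagging and lemmatization are left out because they need an external tagger.

```python
import re
from html import escape
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def html_to_text(html):
    """Strip markup and normalise whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())

def to_vertical(text, metadata):
    """Wrap tokens, one per line, in a <doc> element carrying the metadata."""
    attrs = " ".join(f'{k}="{escape(str(v), quote=True)}"' for k, v in metadata.items())
    tokens = re.findall(r"\w+|[^\w\s]", text)  # naive tokenizer (assumption)
    return "\n".join([f"<doc {attrs}>", *tokens, "</doc>"])

page = "<html><head><style>p{}</style></head><body><p>Parsing is fun.</p></body></html>"
vertical = to_vertical(html_to_text(page), {"title": "Demo", "acm": "D.3"})
```

Keeping the metadata as a plain dictionary mirrors the "configurable metadata list" idea: adding or dropping a field changes only the `<doc>` header, not the pipeline.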

12 Simplified user’s procedure: interface for converting PDFs (ABBYY FineReader); save set in folder; upload files; metadata (ACM); notes provided to users; demo

13 An Informatics Corpus is born. Currently contains: 202 documents; 2,763,259 tokens; 18 ACM categories (over half of the documents in one category)

14 Uses to date: key term extraction (here); illustrative sentences: Moodle’s glossary module; words in need of pronunciation attention; some worksheets of adjectives with prepositions; website of sample searches
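
Key term extraction is usually done by comparing word frequencies in a specialist corpus with those in a reference corpus. The slides do not say which statistic was used, so the sketch below substitutes one simple, common choice: a smoothed ratio of per-million frequencies. The function names, the `smoothing` parameter and the toy token lists are all assumptions for illustration, and both token lists are assumed to be non-empty.

```python
from collections import Counter

def per_million(count, total):
    """Normalise a raw count to a frequency per million tokens."""
    return count * 1_000_000 / total

def keyness(focus_tokens, reference_tokens, smoothing=1.0):
    """Rank focus-corpus words by smoothed relative frequency ratio."""
    focus, ref = Counter(focus_tokens), Counter(reference_tokens)
    n_focus, n_ref = len(focus_tokens), len(reference_tokens)
    scores = {}
    for word, count in focus.items():
        f = per_million(count, n_focus)
        r = per_million(ref.get(word, 0), n_ref)
        # Smoothing avoids division by zero for words absent from the reference.
        scores[word] = (f + smoothing) / (r + smoothing)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data standing in for the Informatics Corpus and a general corpus.
focus = "the parser builds the tree the parser".split()
reference = "the dog saw the cat the end".split()
ranked_terms = keyness(focus, reference)
```

In this toy run "parser" ranks first and "the" last, since "the" is equally frequent in both lists.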

15 What the future holds: Language acquisition. Consulting resources doesn’t guarantee retention; log corpus consultation; converted into interactive revision activities, automatically; researching the effectiveness of DDL
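
The plan to log corpus consultations and convert them automatically into revision activities could look roughly like this. It is a sketch of one possible activity type (gap-fill), not the authors' design: the log structure with `query` and `lines` keys, and both function names, are hypothetical.

```python
import re

def make_gap_fill(sentence, keyword, blank="_____"):
    """Blank out the first whole-word occurrence of the keyword, if any."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    if not pattern.search(sentence):
        return None
    return {"prompt": pattern.sub(blank, sentence, count=1), "answer": keyword}

def activities_from_log(log):
    """Build gap-fill items from logged queries and the lines they returned."""
    items = []
    for entry in log:
        for sentence in entry["lines"]:
            item = make_gap_fill(sentence, entry["query"])
            if item:
                items.append(item)
    return items

# Hypothetical log entry: what a student searched for and what came back.
log = [{"query": "depend",
        "lines": ["Results depend on the input size.", "No match here."]}]
items = activities_from_log(log)
```

Each item pairs a prompt such as "Results _____ on the input size." with the word the student originally looked up, so revision material follows the student's own consultation history.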

16 What the future holds: Corpus Builder. Single click; keyword extraction; automatic conversion from various formats to plain text; POS tagging for LOTE; log user’s use

