Spoken multimedia corpora for pedagogical purposes. Sabine Braun (University of Surrey), Pascual Pérez-Paredes (Universidad de Murcia), Ylva Berglund (Oxford University).


Spoken multimedia corpora for pedagogical purposes Sabine Braun (University of Surrey) Pascual Pérez-Paredes (Universidad de Murcia) Ylva Berglund (Oxford University) Birmingham Corpus Linguistics Conference 2007

Introduction The usefulness of corpora in language pedagogy is widely recognised. But there is a need for pedagogically relevant corpora, reflected e.g. in initiatives to create 'ad-hoc' corpora in pedagogical contexts. The creation of pedagogically relevant corpora raises challenges for corpus design. Past and current initiatives have largely focussed on written corpora; spoken discourse is becoming more important in pedagogical contexts. The creation of pedagogically relevant spoken corpora raises additional challenges for corpus design.

The challenges (1) CORPUS DESIGN: traditional reference corpora (content, size, data format, transcription, annotation, query). CORPUS EXPLOITATION: Data-Driven Learning (focus on non-linear reading: concordances and co-texts). These do not fully support pedagogical requirements. Corpora contain textual records of discourse; their interpretation requires (re-)contextualisation. Learners may have difficulties analysing corpus data; they require pedagogical mediation. Pedagogical corpus uses differ from linguistic description; this requires e.g. pedagogically motivated query options. Corpora need to be integrated with curricula; this requires e.g. complementarity of content and effective delivery.

The challenges (2) CORPUS DESIGN: traditionally, representation in written format. CORPUS EXPLOITATION: work with text-only data and e.g. conversational markup. Again, this does not fully support pedagogical requirements. Spoken discourse is more dependent on shared physical contexts. It is adjusted to aural and online perception (e.g. chunking). It is affected by limitations of processing capacity (false starts, repair). It is marked by accents. It is multimodal.

Requirements
–Format: multimedia, to retain the multimodal character of spoken language
–Content: complementary with curriculum topics, more coherence than in traditional corpora
–Pedagogically motivated transcription, annotation and alignment (transcript-video)
–Combination of query methods: text-based exploration and application of corpus techniques
–Pedagogical enrichment of corpora with complementary resources (e.g. exercises, explanations)
–Effective delivery of corpora and additional resources to learners/teachers

Corpus creation (1) Examples: ELISA and SACODEYL
–ELISA: Professional English; accounts of professional life; different varieties
–SACODEYL: 7 European languages; youth language corpora; Speakers and
–Interview format
–Video clips with transcript
–Communicatively relevant topics, e.g. in SACODEYL topics outlined in the Common European Framework
–Elicitation process: briefing informants and prompting them during the interview, ensuring naturally flowing discourse

Corpus creation (1) Example of topics in SACODEYL
Topic | Interview questions | Age | CEF | Gramm. functions
Holidays | 1. Where did you spend your last holidays? | | A2: can describe past activities, personal experiences | Past tense
Holidays | 2. What are your plans for the next holidays? | 13-15 | B1: can describe dreams, hopes and ambitions | Future, Conditional, Modal verbs
Plans for the future | 1. What are your plans for your career? | 16-18 | B1: can explain/give reasons for my plans, intentions and actions | Future
Plans for the future | 2. On what grounds do you decide? | 16-18 | B2: can speculate about causes, consequences, hypothetical situations | Conditional, Modal verbs

Corpus creation (2) CONTINUUM: raw, orthographic transcription – annotated corpora. Transcription; markup; pedagogic annotation; XML files; TEI-compliant corpora.

Corpus creation (2) Tools: SACODEYL TRANSCRIPTOR, SACODEYL ANNOTATOR. Transcription; markup; pedagogic annotation; XML files; TEI-compliant corpora.
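To make the idea of pedagogic annotation in XML more concrete, the following is a minimal sketch of how annotated interview sections could be read and grouped by topic. The element names (interview, section, u), the attributes (topic, cef) and the sample utterances are hypothetical and purely illustrative; they are not taken from the SACODEYL schema or from TEI, and the interview id is only assumed from the media file name.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified markup: element names, attributes and the
# utterances are invented for illustration, not the project's schema.
SAMPLE = """
<interview id="ES02">
  <section id="s1" topic="holidays" cef="A2">
    <u who="E">Where did you spend your last holidays?</u>
    <u who="Chico">We went to the coast.</u>
  </section>
  <section id="s2" topic="plans for the future" cef="B1">
    <u who="E">What are your plans for your career?</u>
  </section>
</interview>
"""

def topic_index(xml_documents):
    """Map each topic to the (interview id, section id) pairs annotated with it."""
    index = defaultdict(list)
    for doc in xml_documents:
        root = ET.fromstring(doc.strip())
        for section in root.iter("section"):
            index[section.get("topic")].append((root.get("id"), section.get("id")))
    return dict(index)

if __name__ == "__main__":
    for topic, sections in topic_index([SAMPLE]).items():
        print(topic, sections)
```

With several interviews in the input list, the same index would gather comparable sections across interviews, which is roughly what a topic-based index needs.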

Corpus creation (3) SACODEYL TRANSCRIPTOR

Corpus creation (2)
[METADATA]
Title: La Unión Europea une a los ciudadanos
Date Recording:
Date Transcription:
Locale: I.E.S. Floridablanca, Murcia, España
Principal Investigator: Pascual Perez-Paredes
Researcher: Pascual Perez-Paredes
Transcriber: Encarnación Tornero Valero
Editor:
Autority: SACODEYL Project ID:
Language: ES
MediaFileName: ES02.avi
Participants:
person: Chico name: role: Entrevistado sex: Hombre age: 16 description:
person: E name: Andrés Mercader Rodríguez role: Entrevistador sex: Hombre age: 32 description:
[/METADATA]
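A header of this kind is straightforward to process automatically. The sketch below assumes that the block is stored with one "key: value" field per line and that each participant starts with a "person:" field; both are assumptions about the file layout, not a description of the actual SACODEYL Transcriptor format. The function name parse_metadata is likewise hypothetical, and the sample uses only a subset of the fields shown above.

```python
# Assumed layout: one "key: value" pair per line between the
# [METADATA] and [/METADATA] markers (not a documented format).
SAMPLE = """[METADATA]
Title: La Unión Europea une a los ciudadanos
Language: ES
MediaFileName: ES02.avi
Participants:
person: Chico
role: Entrevistado
age: 16
person: E
role: Entrevistador
age: 32
[/METADATA]"""

def parse_metadata(text):
    """Return (header fields, list of participant records) from a metadata block."""
    header, participants, current = {}, [], None
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[METADATA]":
            inside = True
            continue
        if line == "[/METADATA]":
            break
        if not inside or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "person":
            # A "person" field opens a new participant record.
            current = {"person": value}
            participants.append(current)
        elif current is not None:
            current[key] = value
        elif key != "Participants":
            header[key] = value
    return header, participants

if __name__ == "__main__":
    header, participants = parse_metadata(SAMPLE)
    print(header["Title"], [p["person"] for p in participants])
```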

Corpus query Query options will support text- and corpus-based exploration and include e.g. –Easy access to entire interviews –A topic index supporting the analysis of similar sections across interviews ("topic concordances") –Other indices based on the annotation categories –Ready-made data (e.g. frequency lists of each interview; selective concordances) –A concordancer for extended/advanced search; adapted to pedagogical requirements
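Two of the query options listed above, frequency lists and concordances, can be illustrated with a short sketch. It is not the project's concordancer: the tokeniser is deliberately naive, the function names (tokenize, frequency_list, kwic) are hypothetical, and the sample text is invented for illustration.

```python
import re
from collections import Counter

# Invented sample text standing in for an interview transcript.
SAMPLE_TEXT = (
    "Last summer we went to the coast. "
    "Next year we want to go to the mountains."
)

def tokenize(text):
    """Lower-case the text and split it into word tokens (naive tokeniser)."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())

def frequency_list(text, top=5):
    """Return the `top` most frequent tokens with their counts."""
    return Counter(tokenize(text)).most_common(top)

def kwic(text, keyword, width=3):
    """Return keyword-in-context lines as (left context, keyword, right context)."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

if __name__ == "__main__":
    print(frequency_list(SAMPLE_TEXT, top=3))
    for left, word, right in kwic(SAMPLE_TEXT, "we", width=2):
        print(f"{left:>15} [{word}] {right}")
```

A pedagogically adapted concordancer would presumably restrict or pre-select such output (for example per interview or per topic), but that filtering is left out here.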

Corpus query

Pedagogical enrichment The corpora will be enriched with prototypical learning activities. These will focus on one interview section or one interview as a whole or sections across interviews… They will include e.g. –linguistic and cultural explanations and exercises (form-focussed as well as communication-oriented), –(listening) comprehension and production tasks, –explorative tasks (concordance-based as well as interview-based). Use of authoring tool Telos Language Partner to create learning packages with ranges of activities.

Pedagogical enrichment

Corpus delivery Effective delivery is a further prerequisite for integration into the curriculum. In SACODEYL, the Moodle learning platform is used, giving access to: –Corpora (query interfaces) –Resources created in the project (different types of learning activities) –Resources created by future corpus users

Summary The method outlined is transferable to other pedagogical contexts, topics and languages. It helps to use corpora more efficiently in pedagogical contexts, moving from a sporadically used resource to systematic exploitation. Corpus creation complies with standards to facilitate reuse of the corpora in other contexts (research).

Contact Sabine Braun: Pascual Pérez-Paredes: Ylva Berglund: And visit our poster session… As well as our websites: