How I Learned to Stop Empiricising and Love my Intuitions
Or: Why corpus research is like a tornado
DOUGAL GRAHAM –


Me & My Research
o Computational background
o Academic Formulas List (Simpson-Vlach & Ellis, 2010)
o AFL for Engineering English

Q
1. In which genre (spoken, fiction, newspaper, academic) is "shall" used most, and in which the least, compared to "will"?
2. Put the following verbs in order of frequency (high to low): promise, shine, finish, enable, jump.
3. Which of the following would occur more frequently with "little", and which with "small": success, plate, hill, baby, impact, pieces, wonder, distance?
(Davies, 2011)

Empirical approaches
o Phrase research: “as shown in chapter”
o Phrase list plus…
o Empirical metrics:
  o Frequency
  o Range
  o Mutual Information
  o LL (log-likelihood)
  o FTW (formula teaching worth)
(Simpson-Vlach & Ellis, 2010)
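The first three metrics above can be illustrated with a minimal Python sketch. It is not the AFL procedure: the two-text toy corpus, the function names, and the pointwise mutual information formula (observed n-gram probability divided by the product of unigram probabilities, log base 2) are assumptions chosen for illustration. Range is taken here to mean the number of texts in which a phrase occurs. LL and FTW are left out.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_metrics(docs, n=3):
    """Compute (frequency, range, MI) for every n-gram.

    docs: dict mapping a text name to its list of tokens.
    Range = number of texts containing the n-gram (an assumed definition).
    MI = log2(P(ngram) / product of P(word)), one common formulation.
    """
    word_counts = Counter()
    gram_counts = Counter()
    gram_docs = defaultdict(set)
    for name, tokens in docs.items():
        word_counts.update(tokens)
        for gram in ngrams(tokens, n):
            gram_counts[gram] += 1
            gram_docs[gram].add(name)

    total_words = sum(word_counts.values())
    total_grams = sum(gram_counts.values())
    metrics = {}
    for gram, freq in gram_counts.items():
        p_gram = freq / total_grams
        p_independent = math.prod(word_counts[w] / total_words for w in gram)
        mi = math.log2(p_gram / p_independent)
        metrics[gram] = (freq, len(gram_docs[gram]), mi)
    return metrics


# Hypothetical toy corpus, invented for this example.
docs = {
    "text_a": "as shown in figure the value of the force".split(),
    "text_b": "the value of the current is as shown in the table".split(),
}
metrics = phrase_metrics(docs, n=3)

# List phrases by frequency, then range, then MI (all descending).
for gram, (freq, rng, mi) in sorted(
    metrics.items(), key=lambda kv: kv[1], reverse=True
):
    print(" ".join(gram), freq, rng, round(mi, 2))
```

On a real corpus the same three numbers would be computed per phrase and then thresholded or combined; how to weight them is exactly the design decision the following slides question.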

Results
Three Words       Four Words              Five Words
what is the       can be used to          at a rate of #
the number of     as a function of        you should be able to
as shown in       the magnitude of the    beyond the scope of this
# and #           as shown in figure      how long will it take
can be used       with respect to the     the first law of thermodynamics
shown in figure   in this chapter we      in such a way that
the value of      the value of the        the rate of change of

Intuitively…
o Results not so useful
o Goal: “A useful list of formulaic Eng. phrases”
o Re-visit metrics
  o Frequency
  o Range
  o Mutual Information
  o LL
  o FTW
(Simpson-Vlach & Ellis, 2010)

Re-evaluation
o Intuitively, the results weren’t useful
o Confusion
o Martinez & Schmitt’s PHRASE List
o Intuitive criteria

Problems
o AFL approach
  o Results not sufficiently useful
  o Are the assumptions warranted?
o PHRASE List approach
  o Criteria very intuitive
  o Hand-sorting 15,000 items

Liking my intuitions
o Needs to be useful for learners
o Should be difficult language
o How can we determine the language that will be difficult?

Results
Three Words       Four Words              Five Words
what is the       can be used to          at a rate of #
the number of     as a function of        you should be able to
as shown in       the magnitude of the    beyond the scope of this
# and #           as shown in figure      how long will it take
can be used       with respect to the     the first law of thermodynamics
shown in figure   in this chapter we      in such a way that
the value of      the value of the        the rate of change of

Semi-empirical
o Marked part of speech: “for a given”
o Marked word form: “is known as”
o Marked collocations: “under the action of”

Semi-Intuitive
o Non-prototypical word meaning: “let us consider”
o Non-literal phrase meaning: “we can write”
o Specialized syntax: “let X be”

1. Empiricism
2. Intuitive re-evaluation
3. Semi-empirical criteria
4. Semi-intuitive criteria
5. Results
[Diagram labels: Intuition, Empiricism]

Embrace the tornado

Final Points
o Embrace the tornado
o Iterative design
o Precision vs. Recall
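The precision vs. recall trade-off for a phrase list can be made concrete with a small Python sketch. The phrases are taken from the slides, but the split into an "extracted" list and a hand-judged "gold" list is hypothetical and invented for this example; it does not report any result from the talk.

```python
def precision_recall(extracted, gold):
    """Precision: share of extracted phrases that are in the gold set.
    Recall: share of gold phrases that were extracted."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical output of an automatic phrase extractor.
extracted = {"as shown in", "what is the", "with respect to the", "the value of"}
# Hypothetical set of phrases a human judged useful for learners.
gold = {"as shown in", "with respect to the", "let us consider", "is known as"}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Stricter criteria tend to raise precision at the cost of recall, and looser ones do the reverse, which is one reason to revisit the criteria over several iterations.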

Selected References
Davies, M. (2011). Synchronic and diachronic use of corpora. In V. Viana, S. Zyngier, & G. Barnbrook (Eds.), Perspectives on corpus linguistics (Vol. 48, pp. 63–80). John Benjamins Publishing.
Martinez, R., & Schmitt, N. (2012). A Phrasal Expressions List. Applied Linguistics, 33(3), 299–320.
Simpson-Vlach, R., & Ellis, N. C. (2010). An Academic Formulas List: New Methods in Phraseology Research. Applied Linguistics, 31(4), 487–512. doi: /applin/amp058