How I Learned to Stop Empiricising and Love My Intuitions
Or: Why corpus research is like a tornado
DOUGAL GRAHAM
Me & My Research
  o Computational background
  o Academic Formulas List (Simpson-Vlach & Ellis, 2010)
  o AFL for Engineering English
Q
  1. In which genre (spoken, fiction, newspaper, academic) is “shall” used most, and in which the least, compared to “will”?
  2. Put the following verbs in order of frequency (high to low): promise, shine, finish, enable, jump.
  3. Which of the following would occur more frequently with “little”, and which with “small”: success, plate, hill, baby, impact, pieces, wonder, distance?
  (Davies, 2011)
Empirical approaches
  o Phrase research: “as shown in chapter”
  o Phrase list plus…
  o Empirical metrics:
      o Frequency
      o Range
      o Mutual Information (MI)
      o Log-likelihood (LL)
      o Formula Teaching Worth (FTW)
  (Simpson-Vlach & Ellis, 2010)
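As a rough illustration of the first three metrics on this slide, the sketch below computes frequency, range, and a simple pointwise mutual information score for n-grams. The function names, the toy two-text corpus, and the independence-based MI formula for multi-word sequences are assumptions made for illustration; they are not the exact computations used by Simpson-Vlach & Ellis (2010), and LL and FTW are not reproduced here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_metrics(texts, n):
    """Compute frequency, range, and MI for every n-gram.

    texts: dict mapping a text (or sub-corpus) name to a list of tokens.
    Range is counted as the number of texts containing the n-gram.
    MI here is log2(P(n-gram) / product of P(word)), an assumed
    generalisation of pointwise MI to sequences longer than two words.
    """
    word_freq = Counter()
    gram_freq = Counter()
    gram_range = Counter()
    for tokens in texts.values():
        word_freq.update(tokens)
        grams = ngrams(tokens, n)
        gram_freq.update(grams)
        gram_range.update(set(grams))  # count each text at most once

    total_words = sum(word_freq.values())
    total_grams = sum(gram_freq.values())

    metrics = {}
    for gram, freq in gram_freq.items():
        p_gram = freq / total_grams
        p_independent = math.prod(word_freq[w] / total_words for w in gram)
        metrics[" ".join(gram)] = {
            "frequency": freq,
            "range": gram_range[gram],
            "mi": math.log2(p_gram / p_independent),
        }
    return metrics


# Toy corpus, invented for illustration only.
corpus = {
    "text_a": "as shown in figure 1".split(),
    "text_b": "as shown in the table".split(),
}
for phrase, scores in phrase_metrics(corpus, 3).items():
    print(phrase, scores)
```

On a corpus this small the MI values are not meaningful; the point is only to show how the three numbers are derived from the same counts.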
Results
  Three Words       Four Words             Five Words
  what is the       can be used to         at a rate of #
  the number of     as a function of       you should be able to
  as shown in       the magnitude of the   beyond the scope of this
  # and #           as shown in figure     how long will it take
  can be used       with respect to the    the first law of thermodynamics
  shown in figure   in this chapter we     in such a way that
  the value of      the value of the       the rate of change of
Intuitively…
  o Results not so useful
  o Goal: “A useful list of formulaic Eng. phrases”
  o Re-visit metrics:
      o Frequency
      o Range
      o Mutual Information
      o LL
      o FTW
  (Simpson-Vlach & Ellis, 2010)
Re-evaluation
  o Intuitively, the results weren’t useful
  o Confusion
  o Martinez & Schmitt’s (2012) PHRASE List
      o Intuitive criteria
Problems
  o AFL approach
      o Results not sufficiently useful
      o Are the assumptions warranted?
  o PHRASE List approach
      o Criteria very intuitive
      o Hand-sorting 15,000 items
Liking my intuitions
  o Needs to be useful for learners
  o Should be difficult language
  o How can we determine the language that will be difficult?
Results
  Three Words       Four Words             Five Words
  what is the       can be used to         at a rate of #
  the number of     as a function of       you should be able to
  as shown in       the magnitude of the   beyond the scope of this
  # and #           as shown in figure     how long will it take
  can be used       with respect to the    the first law of thermodynamics
  shown in figure   in this chapter we     in such a way that
  the value of      the value of the       the rate of change of
Semi-empirical
  o Marked part of speech: “for a given”
  o Marked word form: “is known as”
  o Marked collocations: “under the action of”
Semi-intuitive
  o Non-prototypical word meaning: “let us consider”
  o Non-literal phrase meaning: “we can write”
  o Specialized syntax: “let X be”
  1. Empiricism
  2. Intuitive re-evaluation
  3. Semi-empirical criteria
  4. Semi-intuitive criteria
  5. Results
  [Diagram labels: Intuition, Empiricism]
Embrace the tornado
Final Points
  o Embrace the tornado
  o Iterative design
  o Precision vs. Recall
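The precision vs. recall trade-off for a phrase list can be made concrete with a small sketch: precision is the share of listed phrases that are actually useful, and recall is the share of useful phrases that made it onto the list. The function name and the "useful" judgements below are hypothetical, invented purely for illustration; they are not results from this research.

```python
def precision_recall(candidates, useful):
    """Return (precision, recall) of a candidate phrase list.

    candidates: phrases selected by some set of criteria.
    useful: phrases judged useful for learners (a hand-made gold set).
    """
    candidates, useful = set(candidates), set(useful)
    hits = candidates & useful
    precision = len(hits) / len(candidates) if candidates else 0.0
    recall = len(hits) / len(useful) if useful else 0.0
    return precision, recall


# Hypothetical example: both sets are invented for illustration.
candidate_list = [
    "what is the",
    "the number of",
    "with respect to the",
    "in such a way that",
]
judged_useful = [
    "with respect to the",
    "in such a way that",
    "let us consider",
]
precision, recall = precision_recall(candidate_list, judged_useful)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Stricter criteria tend to raise precision at the cost of recall, which is one reason an iterative design that revisits the criteria after each pass can help.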
Selected References
  Davies, M. (2011). Synchronic and diachronic use of corpora. In V. Viana, S. Zyngier, & G. Barnbrook (Eds.), Perspectives on corpus linguistics (Vol. 48, pp. 63–80). John Benjamins Publishing.
  Martinez, R., & Schmitt, N. (2012). A Phrasal Expressions List. Applied Linguistics, 33(3), 299–320.
  Simpson-Vlach, R., & Ellis, N. C. (2010). An Academic Formulas List: New Methods in Phraseology Research. Applied Linguistics, 31(4), 487–512. doi: /applin/amp058