
1 751-3

2 Distinctive words
Examined frequency list comparison to obtain positive and negative keywords
Alternative – look for content words in a single frequency list
Alternative – use a stop word list to filter out grammatical words

3 Assignment

4 Stop list
Example of a stop list
a about above across after again against all almost alone along already also although always among an and another any anybody anyone anything anywhere are area areas around as ask asked asking asks at away b back backed backing backs be became because become becomes been before began behind being beings best better between big both but by c came can cannot case cases certain certainly clear clearly come could d did differ different differently do does done down downed downing downs during e each early either end ended ending ends enough even evenly ever every everybody everyone everything everywhere f face faces fact facts far felt few find finds first for four from full fully further furthered furthering furthers g gave general generally get gets give given gives go going good goods got great greater greatest group grouped grouping groups h had has have having he her here herself high higher highest him himself his how however i if important in interest interested interesting interests into is it its itself j just k keep keeps kind knew know known knows l large largely last later latest least less let lets like likely long longer longest m made make making man many may me member members men might more most mostly mr mrs much must my myself n necessary need needed needing needs never new newer newest next no nobody non noone not nothing now nowhere number numbers o of off often old older oldest on once one only open opened opening opens or order ordered ordering orders other others our out over p part parted parting parts per perhaps place places point pointed pointing points possible present presented presenting presents problem problems put puts q quite r rather really right room rooms s said same saw say says second seconds see seem seemed seeming seems sees several shall she should show showed showing shows side sides since small smaller smallest so some somebody someone something somewhere state states still such sure t take taken than that the their them then there therefore these they thing things think thinks this those though thought thoughts three through thus to today together too took toward turn turned turning turns two u under until up upon us use used uses v very w want wanted wanting wants was way ways we well wells went were what when where whether which while who whole whose why will with within without work worked working works would x y year years yet you young younger youngest your yours z
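
A stop list like the one above can be applied to a frequency list in a few lines. The following is a minimal sketch, not the tool used in class: the tiny STOP_WORDS set, the function name content_word_frequencies and the sample sentence are all made up for illustration, and the tokenizer is a deliberately crude regular expression.

```python
from collections import Counter
import re

# Illustrative stop list only; a real one would hold a few hundred words.
STOP_WORDS = {"a", "about", "and", "in", "is", "of", "the", "to"}

def content_word_frequencies(text, stop_words=STOP_WORDS):
    """Count the words in text, skipping any word found in the stop list."""
    # Crude tokenizer (an assumption): lower-case runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

sample = "The corpus is a collection of texts and the texts in the corpus vary."
freqs = content_word_frequencies(sample)
print(freqs.most_common(3))
```

With the grammatical words filtered out, the top of the list is left to content words such as "corpus" and "texts".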

5 Word cloud
wordle.net
jasondavies.com/wordlist

6 From my notes

7 Another view

8 Analysing concordance lines
Worksheet

9 Teaching Academic English
What to teach: what is Academic English?
How do we go about teaching it?

10 General experience
Find an “academic writing textbook”
Solves the problems:
What is academic English
How to teach academic English
But still … How does the author know what academic English is?

11 Find/create a suitable corpus
Now we can have a corpus of academic English
From that we can get, for example, a wordlist or other linguistic patterns

12 Academic corpus: what is academic English?
The English that academics use
Which academics?
What language?
Lectures/Ppts/Tutorials/Morning tea/Conference presentations/Abstracts/Articles

13 Academic English corpus
Diagram: Academic English and the Academic English corpus (representativeness?)

14 Academic English corpus
Diagram: Academic English and the Academic English corpus (representativeness?)
If you were going to create an academic corpus, what would you include (and in what proportions)?

15 Academic corpus
Typically we don’t have access to our preferred corpus
Existing corpora: MICASE, MICUSP, BASE, BAWE
Some people create a corpus from journal articles

16 MICASE
Transcripts from lectures in a variety of disciplines
Assignment is based on the Physical Science files.
A keyword analysis with the Times files as the reference corpus.
Select 15 keywords
Use the keywords to select 15 collocations/phrases
Some reflections on the process

17 Moving on from wordlists
In week 1 we looked at single words and frequency lists
How do we move from single words to larger units?
What are the larger units?
Grammatical units – verb phrase, sentence, … (requires Part of Speech tags or parsing)
Collocations – recurrent word combinations (tea time, class schedule, point of view, take place, …)

18 ngrams
One way to get sequences rather than words is to make a frequency list for word pairs, word triples, etc.
Called bigrams, trigrams, etc.
How many bigrams in a 1000 word corpus?
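
The n-gram idea can be sketched as a sliding window over the token list. This is an illustration only: the function name ngrams and the sample sentence are invented here. If the corpus is treated as one unbroken sequence of N tokens (an assumption, since a tool may stop at sentence or file boundaries), the window yields N - n + 1 n-grams, so a 1000-word corpus gives 999 bigram tokens.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "on the other hand the other view is on the table".split()

# Frequency list for word pairs (bigrams).
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(2))

# A corpus of 1000 tokens, treated as a single sequence, has 999 bigram tokens.
print(len(ngrams(["w"] * 1000, 2)))
```

The number of distinct bigram types is usually far smaller than the number of bigram tokens, because frequent pairs repeat.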

19 What are lexical bundles?
Term from Doug Biber
Similar to n-grams
With a minimum frequency
A minimum range (e.g. must occur in 15 out of the 20 files in the corpus)
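
The two thresholds can be sketched as a filter over n-gram counts. This is a minimal illustration, not Biber's procedure: the function name lexical_bundles, the threshold values and the three toy "files" are assumptions chosen so the example stays small.

```python
from collections import Counter

def lexical_bundles(files, n=4, min_freq=2, min_range=2):
    """Return n-grams that meet both a frequency and a range threshold.

    files: a list of token lists, one per corpus file.
    The result maps each bundle to (total frequency, number of files).
    """
    freq = Counter()        # total occurrences across the whole corpus
    file_range = Counter()  # number of files each n-gram occurs in
    for tokens in files:
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        file_range.update(set(grams))  # count each file at most once
    return {g: (freq[g], file_range[g]) for g in freq
            if freq[g] >= min_freq and file_range[g] >= min_range}

# Toy corpus (hypothetical): three very short "files".
files = [
    "it is important to note that".split(),
    "it is important to check this".split(),
    "note that note that note that".split(),
]
bundles = lexical_bundles(files)
print(bundles)
```

Here "note that note that" occurs twice but only in one file, so the range threshold removes it, while "it is important to" passes both tests.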

20 Lexical bundles in academic registers
Academic registers vs. non-academic registers
Across academic registers
Disciplinary variations in the same register

21 Corpus used – Biber et al

22 Examples of LB in academic registers
Referential expression: “at the bottom of”, “is one of the”
Discourse organizer: “on the other hand”, “in addition to the”
Stance expression: “it is difficult to”, “it is important to”

23 Disciplinary variation

24 Collocations
More or less recognisable, but not definable
No computer program can produce a definitive list of collocations – only a list of candidates

25 Computer identification of chunks
Frequency alone cannot be used because a highly frequent sequence may not be a unit
E.g., “and of the”, “that we will”
Need to manually check word-sequences
We can also use a statistic like Mutual Information or Log Likelihood to give different views of multiword lists
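
As one illustration of such a statistic, here is a minimal sketch of pointwise Mutual Information for adjacent word pairs: log2 of the observed pair probability divided by the probability expected if the two words were independent. The function name mutual_information and the toy token list are invented, and real tools differ in how they estimate the probabilities.

```python
import math
from collections import Counter

def mutual_information(tokens):
    """Score each adjacent word pair with pointwise Mutual Information."""
    n_words = len(tokens)
    n_pairs = n_words - 1
    word_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), count in pair_counts.items():
        p_pair = count / n_pairs
        p_w1 = word_counts[w1] / n_words
        p_w2 = word_counts[w2] / n_words
        # Observed probability relative to the "independent words" expectation.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores

tokens = "point of view and of the point of view".split()
scores = mutual_information(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy data the one-off pair "view and" outscores the repeated pair "point of", which shows why MI is usually combined with a frequency threshold and a manual check: it rewards rare words that happen to occur together.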

26 Expanded lexicon
Idea that L2 learners have to be familiar with collocations as well as individual words: change of heart, coffee cup, coffee beans, drip coffee, coffee shop.

27 Collocation lists
Phrasal Expressions List: Ron Martinez and Norbert Schmitt, Applied Linguistics, Aug 2012
Academic Formulas List: Rita Simpson and Nick Ellis, Applied Linguistics, 2010

28 AFL-Simpson & Ellis (Nick)
Frequent recurrent patterns
Distinctive of academic texts (like keywords)
Occurring in a range of academic genres
Referred to as “range” or “dispersion”

29 AFL-Simpson & Ellis (Nick)
Extracted word sequences
Comparison with non-academic texts (used Log Likelihood – same as keyword analysis)
Occurring in a range of academic genres: 4 out of 5 Academic Divisions
Used teachers to assess coherence of sequences in order to get the most reliable statistic
Ranked using a frequency and MI measure
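
The Log Likelihood comparison can be sketched as follows. This uses the two-term form commonly applied in keyword analysis (observed frequency in each corpus against the frequency expected if the item were equally common in both); it is not claimed to be the exact calculation behind the AFL. The function name log_likelihood and all the counts are made up for illustration.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log Likelihood for one item in a study corpus vs. a reference corpus."""
    total = freq_study + freq_ref
    # Expected frequencies if the item were equally common in both corpora.
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

# Hypothetical counts: a sequence occurring 50 times in 1,000 academic tokens
# and 10 times in 1,000 non-academic tokens.
print(round(log_likelihood(50, 1000, 10, 1000), 2))

# Same relative frequency in both corpora gives a score of zero.
print(log_likelihood(10, 1000, 10, 1000))
```

The score is larger the more the two relative frequencies differ, so it can be used to rank words or sequences by how distinctive they are of the study corpus.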

30 Simpson and Ellis
Phrases organised by FTW
FTW = Formula Teaching Worth

31 A Phrasal Expressions List
Martinez and Schmitt
Note that they used a different methodology
Highlight non-compositional sequences (e.g., “at all”) – those likely to cause difficulties for learners
Ngram analysis plus manual selection
Consider the relation of phrasal lists to word lists (and coverage)

32 Statistics
Based on probability
How likely is it that some event or outcome is based on chance (tossing coins)
Applied to experimental data: drug trials, teaching methods
Statements about the outcome: the probability that the outcome occurred by chance is less than 1 in 100 (p < 0.01)
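
The coin-tossing case can be worked through directly. This is a small sketch under the assumption of a fair coin and independent tosses; the function name prob_at_least is invented for the example.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more heads in n independent tosses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance of 9 or more heads in 10 tosses of a fair coin: 11/1024.
print(prob_at_least(9, 10))

# Chance of 10 heads in 10 tosses: 1/1024.
print(prob_at_least(10, 10))
```

Ten heads in ten tosses falls below the 1 in 100 level (p < 0.01), while nine or more heads sits just above it.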

33 Statistics and texts
I view stats as a way of ranking (presenting) data for you to examine
We cannot make statements such as “there is a 1 in 100 probability that this text data occurred by chance”
We can note that a word pair has a high Mutual Information score
Not an experiment. Text data is not random.

