ENG 626 CORPUS APPROACHES TO LANGUAGE STUDIES lexico-grammatical profiles Bambang Kaswanti Purwo


1 ENG 626 CORPUS APPROACHES TO LANGUAGE STUDIES lexico-grammatical profiles Bambang Kaswanti Purwo bkaswanti@atmajaya.ac.id

2 [Hunston Ch. 1]
» a corpus by itself can do nothing at all; it is just a store of used language
» corpus access software
▪ re-arranges the store
▪ enables observations of various kinds to be made
▪ processes data from the corpus in three ways, showing: frequency, phraseology, collocation
[O’Keeffe 1.5] basic corpus linguistic techniques using standard software such as WordSmith Tools (Scott 1999) and MonoConc Pro (2000)
▪ concordancing
▪ word frequency counts
▪ key word analysis
▪ cluster analysis
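The first two techniques in the list, concordancing and word frequency counts, can be sketched in a few lines of Python. This is only an illustration and does not reproduce how WordSmith Tools or MonoConc Pro work. The function names (`tokenize`, `wordlist`, `concordance`), the tokenizing rule and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def wordlist(tokens):
    """Return (word, frequency) pairs ranked from most to least frequent."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=4):
    """Return one keyword-in-context line for every occurrence of the node word."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the node words line up in a column.
            lines.append(f"{left:>30} [{tok}] {right}")
    return lines


# Invented sample text, not taken from any real corpus.
sample = ("She is interested in corpus work. It is interesting to see the patterns. "
          "He was interested in the results.")
tokens = tokenize(sample)

print(wordlist(tokens)[:3])
for line in concordance(tokens, "interested"):
    print(line)
```

Reading down the aligned node column is what makes recurrent patterns visible to the analyst.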

3 » concordancing
▪ a core tool in CL
▪ finds every occurrence of a particular word or phrase (the search word or phrase → “node”)
» word frequency counts or wordlists
▪ rapid calculation of word frequency lists (wordlists) for any batch of texts
▪ with a rank ordering of all the words by frequency
» key word analysis
▪ key words: words whose frequency is unusually high in comparison with some norm
▪ not usually the most frequent words in a text, but the more “unusually frequent” ones
▪ a useful way of characterizing a text or a genre
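Key word analysis compares a word's frequency in the text under study with its frequency in a reference corpus (the "norm"). The slides do not name a statistic; the sketch below assumes the log-likelihood score, which is one common choice. The function names and the two tiny token lists are invented for illustration, and both corpora are assumed to be non-empty.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Keyness score for a word seen a times in a study corpus of c tokens
    and b times in a reference corpus of d tokens."""
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_study)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score


def keywords(study_tokens, reference_tokens, top=5):
    """Rank the words that are over-represented in the study corpus by keyness."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref[word]
        if a / c > b / d:  # keep only words relatively more frequent in the study corpus
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


# Invented toy data: "the" is the most frequent word but barely "key".
study = "the corpus shows the pattern the corpus confirms the pattern".split()
reference = "the cat sat on the mat and the dog sat on the rug".split()

for word, score in keywords(study, reference):
    print(f"{word:10} {score:.2f}")
```

In this toy data "the" is the most frequent word in the study text, yet it scores far lower than "corpus" and "pattern", which is the point made above about key words not being the most frequent words.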

4 » cluster analysis
▪ how language systematically clusters into combinations of words or “chunks” (e.g. I mean; this, that and the other)
▪ how this contributes to the description of the vocabulary of a language (to help learners acquire vocabulary and develop fluency)
▪ 2-, 3-, 4-, 5-, or 6-word combinations
[key word analysis]
▪ potential applications in forensic linguistics, stylistics, content analysis, and text retrieval
▪ [in ELT] to create word lists (LSP programs)
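Cluster analysis, as described on this slide, amounts to counting recurrent n-word sequences. A minimal sketch follows, assuming a plain token list with no sentence boundaries; the name `clusters`, the `min_freq` cut-off and the sample tokens are illustrative choices, not taken from any particular tool.

```python
from collections import Counter


def clusters(tokens, n, min_freq=2):
    """Return the n-word sequences that occur at least min_freq times, most frequent first."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [(" ".join(gram), freq) for gram, freq in grams.most_common() if freq >= min_freq]


# Invented sample; a real analysis would run over a whole corpus.
tokens = "i mean it was fine i mean you know it was fine at the moment".split()

for n in (2, 3):
    print(n, clusters(tokens, n))
```

A fuller version would stop sequences at sentence or speaker-turn boundaries, which this sketch ignores.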

5 lexico-grammatical profiles
[when looking at concordance lines] → create a “lexico-grammatical profile” of a word and its contexts of use
1. collocates
▪ which word(s) occur most frequently, and with statistical significance, in the word’s environment?
2. chunks/idioms
▪ does the word form part of any recurrent chunks?
▪ is the word idiom-prone?
▪ what types occur (e.g. binomials or trinomials)? (rough and ready; ready, willing and able)
3. syntactic restrictions
▪ are there syntactic patterns that restrict the word (e.g. prepositions that go with the word)?
▪ what are the typical clause positions (initial/medial/final)?
▪ are there any tense/aspect restrictions?

6 4. semantic restrictions
▪ are there semantic restrictions? (e.g. applied to [+HUM] only, never with an intensifier)
5. (semantic) prosody
▪ words, as well as having typical collocates (e.g. blonde collocates with hair, not with car), tend to occur in particular environments: positive or negative
۰ 90% of the collocates of cause are negative (accident, cancer, commotion, crisis, delay)
۰ provide collocates with words of positive connotation (care, food, help, jobs, relief, support)
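The first and last questions in the profile (collocates and semantic prosody) can be roughed out in code. The sketch below counts the words found within a fixed span of the node and then measures what share of them belongs to a hand-made list of negative words. The span size, the function names and the sample sentence are assumptions; the negative word list reuses the collocates of cause given above.

```python
from collections import Counter


def collocates(tokens, node, span=4):
    """Count every word found within `span` tokens to the left or right of the node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


def prosody_share(collocate_counts, evaluative_words):
    """Share of collocate tokens that appear in a hand-labelled evaluative word list."""
    total = sum(collocate_counts.values())
    hits = sum(freq for word, freq in collocate_counts.items() if word in evaluative_words)
    return hits / total if total else 0.0


# Hand-labelled list for illustration only, using the collocates of "cause" from the slide.
NEGATIVE = {"accident", "cancer", "commotion", "crisis", "delay"}

# Invented sample sentence.
tokens = "smoking can cause cancer and fog can cause delay".split()
found = collocates(tokens, "cause", span=1)

print(found.most_common())
print(prosody_share(found, NEGATIVE))
```

The share is crude because function words such as "can" are counted as collocates too; a fuller profile would filter them out or test each collocate for statistical significance.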

7 [O’Keeffe Ch. 3]
traditional view of vocabulary: vocabulary = all the single words of a language
over the years, in the light of corpus analysis: the criteria have been opened up to search for recurrences of more than one word (i.e. pairs and trios of words, even larger groupings)
▪ “chunks” like a couple of, at the moment, all the time are as frequent as single words (possible, alone, fun, expensive)
▪ single words have been widely considered to be the basic unit; units of more than one word (phrasal verbs, compounds, idioms) → higher level of proficiency
exceptions:
۰ greetings and everyday expressions: how are things?, see you tomorrow, thanks very much

8 ۰ specialized functional phrases: Happy New Year, good luck
۰ common prepositional phrases: at the weekend, on the first of May
۰ high-frequency compounds: bus stop, whiteboard
» collocation
۰ groupings of more than one word with unity of meaning and specialized functions
۰ the statistical tendency of words to co-occur (Hunston 2001:12)
۰ collocations are not absolute or deterministic, but are probabilistic events, resulting from repeated combinations used and encountered by speakers of any language (e.g. strong tea, powerful cars)
۰ common verbs display distinct preferences for what they combine with:
things turn or go grey, brown, white
people go (*turn) mad, insane, bald, blind
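The idea that collocation is a statistical tendency rather than a rule can be made concrete with an association score. The sketch below assumes pointwise mutual information (PMI) over adjacent word pairs, which is only one of several measures in use; the slides do not specify one. The function name and the sample tokens are invented.

```python
import math
from collections import Counter


def pmi(tokens, first, second):
    """Pointwise mutual information of the adjacent pair (first, second), in bits.

    Positive values mean the pair occurs more often than chance would predict
    from the frequencies of the two words taken separately.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    pair = bigrams[(first, second)]
    if pair == 0:
        return float("-inf")  # the pair is never attested in this sample
    p_pair = pair / (n - 1)
    p_first = unigrams[first] / n
    p_second = unigrams[second] / n
    return math.log2(p_pair / (p_first * p_second))


# Invented sample built around the slide's examples.
tokens = "strong tea and powerful cars and strong tea and strong coffee".split()

print(pmi(tokens, "strong", "tea"))
print(pmi(tokens, "powerful", "tea"))
```

On a sample this small the scores mean little; the measure only becomes informative with corpus-sized counts, and it is known to favour rare words.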

9 » strings of words in corpora
▪ CL: it is lexis, rather than syntax, which accounts for the organization and patterning of language
▪ two fundamental principles at work in the creation of meaning: the “idiom principle” and the “open choice principle”
▪ syntax, i.e. the slots where there are choices to be made (the open choice principle), is far from being primary; it is only brought into service occasionally, as a kind of “glue” to cement the lexical chunks together
▪ form and meaning work hand in hand
Cambridge International Corpus [100 examples of be touched by]:
۰ 14% ‘experience physical contact’
۰ 86% nonphysical meaning, 80% of which ‘emotionally affected by’
→ touch [+passive]: nonphysical senses

10 » phraseology and idiomaticity
▪ contributors to the understanding of multi-word vocabulary:
۰ corpus linguistics
۰ phraseology and the study of idiomaticity (for teachers and learners)
▪ different terminologies to describe the phenomena of multi-word vocabulary or chunks:
۰ lexical phrases
۰ prefabricated patterns
۰ routine formulae
۰ formulaic sequences
۰ lexicalized stems
۰ chunks
۰ (restricted) collocations
۰ fixed expressions
۰ multi-word units/expressions
۰ idioms
۰ etc.
multi-word phenomena → fundamental feature of language use

11 concordance lines – many instances of use of a word or phrase
“latent patterning” – phraseology [Hunston Ch. 1]
phraseology vs. how teachers explain “confusing adjectives” such as interested and interesting
▪ “the minimal pair”: the boy is interested and the boy is interesting
▪ concordance lines: frequent patterns of
۰ interested: “someone is interested in something”
۰ interesting: often comes before a noun (“an interesting thing”), or occurs in frames such as “what is interesting is …”, “it is interesting to see …”

12 reference books have difficulty explaining between and through; a phraseological approach:
▪ between: frequently found after nouns such as difference, distinction, gap, contrast, conflict, and quarrel; relationship, agreement, comparison, meeting, contact, correlation
▪ through: frequently found after verbs such as go, pass, come, run, fall, and lead
“semantic functions”
▪ between has a “location” meaning: the channel between Africa and Sicily; earnings between £5 and £6 a week
▪ through has an “instrumental” meaning

13 native speakers (NSs) often recognize when a phraseology is unusual; explaining why that is the case is not easy
▪ “require to be done” seems wrong to Owen’s (1996) intuition
▪ Bank of English: “REQUIRE to be” occurs fairly frequently
۰ the past participle verbs that follow are specific [+SPEC], not a general verb such as do: These roses require to be pruned each spring
۰ require to be done: very few (3 out of 302)
→ Owen’s intuitions are backed up by the evidence of the corpus (on phraseological, not grammatical, grounds)

14 What is the contribution of the NS’s intuition?
make generalizations from a mass of specific information in a corpus
e.g. Bank of English: CONTACT – verb + noun (Sripicharn 1998)
▪ typically used with “official” persons (office, newspaper, etc.): contact your travel agent
▪ also found when the person is a family member or a friend: she had no contact with her father
→ the difference between the two kinds of noun (travel agent and father) is important (Sripicharn 1998)

15 REFERENCES
Hunston, Susan. 2002. Corpora in Applied Linguistics. Cambridge UP.
O’Keeffe, Anne; Michael McCarthy; Ronald Carter. 2007. From Corpus to Classroom: Language Use and Language Teaching. Cambridge UP.

