Published by Muriel Harmon. Modified over 9 years ago.
1
Labels: automation
Adam Kilgarriff
2
Auckland 2012, Kilgarriff / Labels: automation
Which words are:
  Most distinctive of business English?
  Most often in plural? (e.g. for English nouns)
  Most often used in gerund? (e.g. for Spanish verbs)
3
Common issue for lexicographers
  Ordinary cases
    No need to say anything in dictionary
  Extreme cases ("most X")
    Needs saying
4
Not hard in principle
  Given the right corpus
  For each word
    Count, under condition 1 (e.g. plural instances)
    Count, under condition 2 (e.g. all instances)
    Compute ratio
  Sort all words according to ratio
  Words at top of list are most X
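The counting-and-ranking procedure on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the corpus is taken to be a list of (lemma, tag) pairs already held in memory, the function name `rank_by_ratio` and its `min_count` parameter are hypothetical, and the toy data is invented. The tags NN1 and NN2 follow the tag names used in the FindX queries later in the deck.

```python
from collections import Counter


def rank_by_ratio(tokens, condition_1, condition_2, min_count=1):
    """Rank lemmas by count(condition 1) / count(condition 2).

    tokens is an iterable of (lemma, tag) pairs; each condition is a
    function that takes a tag and returns True or False.
    """
    count_1 = Counter()
    count_2 = Counter()
    for lemma, tag in tokens:
        if condition_1(tag):
            count_1[lemma] += 1
        if condition_2(tag):
            count_2[lemma] += 1

    # Only lemmas seen at least min_count times under condition 2 get a ratio,
    # which also avoids division by zero.
    ratios = {
        lemma: count_1[lemma] / total
        for lemma, total in count_2.items()
        if total >= min_count
    }
    # Highest ratio first; ties broken alphabetically for a stable order.
    return sorted(ratios.items(), key=lambda item: (-item[1], item[0]))


# Invented toy corpus: (lemma, tag) pairs.
toy_corpus = [
    ("scissors", "NN2"), ("scissors", "NN2"),
    ("cat", "NN1"), ("cat", "NN2"), ("cat", "NN1"), ("cat", "NN1"),
    ("advice", "NN1"), ("advice", "NN1"),
]

ranking = rank_by_ratio(
    toy_corpus,
    condition_1=lambda tag: tag == "NN2",          # plural instances
    condition_2=lambda tag: tag.startswith("NN"),  # all noun instances
)
for lemma, ratio in ranking:
    print(f"{lemma}\t{ratio:.2f}")
```

On a real corpus a minimum-frequency threshold such as `min_count` would probably be needed, since a word seen once, in the plural, would otherwise top the list.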
5
In practice
  Programming task
  Big corpora: big and slow
  Slightly different each time
  Very rarely done (except Keywords in WordSmith)
  Now: automated in Sketch Engine (demo)
6
FindX specification file: type 1
  =plural
      name
  Q1 [lempos="%s" & tag="NN2"]
  Q2 [lempos="%s" & tag="NN[10]"]
      The two queries to compare frequencies for.
      lempos="%s" means the list we want is a list of lempos (lemma + pos).
  RE ^[a-z]+-n$
      Items matching this RegExp only (here: all-lower-case nouns).
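To make the roles of the `%s` placeholder and the RE line concrete, here is a small Python sketch. It is not Sketch Engine's own code: it assumes that `%s` is filled in by ordinary string substitution and that the RE line is applied as a filter on candidate lempos items before any query is built. The function `build_queries` and the candidate list are hypothetical.

```python
import re

# Query templates and item filter copied from the type 1 specification.
Q1_TEMPLATE = '[lempos="%s" & tag="NN2"]'
Q2_TEMPLATE = '[lempos="%s" & tag="NN[10]"]'
ITEM_FILTER = re.compile(r"^[a-z]+-n$")


def build_queries(lempos_list):
    """Return {lempos: (q1, q2)} for every item that passes the RE filter."""
    queries = {}
    for lempos in lempos_list:
        if ITEM_FILTER.search(lempos):
            queries[lempos] = (Q1_TEMPLATE % lempos, Q2_TEMPLATE % lempos)
    return queries


# Invented candidates: only the all-lower-case, unhyphenated noun passes.
candidates = ["house-n", "London-n", "run-v", "mother-in-law-n"]
queries = build_queries(candidates)
for lempos, (q1, q2) in queries.items():
    print(lempos, q1, q2)
```

Note that the pattern `^[a-z]+-n$` also excludes hyphenated nouns, because `[a-z]+` does not match a hyphen inside the lemma.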
7
FindX specification file: type 2
  =passive
      name
  HR passives
      human-readable name (optional)
  WS passive
      use the word sketch relation 'passive'
  RE -v$
      only for items matching this RegExp (here: only verbs) (optional)
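The two specification types could be read into a common structure along the following lines. The exact file syntax is not given on the slides, so this sketch assumes a layout: each entry starts with a line `=name`, followed by one `KEY value` line per field (Q1, Q2, HR, WS, RE), with the explanatory notes from the slides left out. The function `parse_findx_spec` and the rule that an entry with a WS line counts as type 2 are illustrative assumptions, not the tool's documented behaviour.

```python
def parse_findx_spec(text):
    """Parse an assumed FindX-style spec into a list of dicts, one per entry."""
    specs = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("="):
            # A line beginning with "=" opens a new entry and gives its name.
            current = {"name": line[1:].strip()}
            specs.append(current)
        elif current is not None:
            # Everything after the first space is the field's value.
            key, _, value = line.partition(" ")
            current[key] = value.strip()
    for spec in specs:
        # Assumption: a WS (word sketch) line marks type 2, otherwise type 1.
        spec["type"] = 2 if "WS" in spec else 1
    return specs


example = """\
=plural
Q1 [lempos="%s" & tag="NN2"]
Q2 [lempos="%s" & tag="NN[10]"]
RE ^[a-z]+-n$

=passive
HR passives
WS passive
RE -v$
"""

specs = parse_findx_spec(example)
for spec in specs:
    print(spec["name"], spec["type"], spec.get("RE"))
```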