Published by Ruby Williamson. Modified over 9 years ago.
1
GoogleDictionary
Paul Nepywoda, Alla Rozovskaya
2
Goal
Develop a tool for English that, given a word, illustrates its usage.
3
Who Will Benefit
- Learners of English
- Teachers of English
- Native speakers who wish to find common usages of a word
4
Similar Tools?
Dictionaries, BUT our tool
- focuses on the usage of words and not on defining their meanings
- ranks expressions based on frequency
- extracts examples straight from context
5
Similar Tools?
Google, BUT our tool focuses on finding high-frequency neighboring words instead of simply the documents that contain the target word
6
Data Resources
- Corpus of newspaper articles (3.5 million words) [used for demo]
  Advantage: large amount of data
  Disadvantage: limited domain
- Use a search engine to build a corpus of documents containing the target word
  Advantages: various domains, dynamic data source
  Disadvantage: time to download documents
7
Implementation (1)
Search a corpus to determine the most typical words: extract the words within a certain window of the target word and rank them by frequency.
- Compute the rank of single words and of pairs of words within a window
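The window extraction described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tokenizer, the function names `tokenize` and `context_counts`, the default window size, and the choice to keep pairs in left-to-right order are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def context_counts(tokens, target, window=3):
    """Count single words and word pairs seen within `window` tokens of `target`."""
    singles = Counter()
    pairs = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # Context words around this occurrence, excluding the target position itself.
        context = [tokens[j] for j in range(lo, hi) if j != i]
        singles.update(context)
        # Pairs keep the left-to-right order in which the words appear.
        pairs.update(combinations(context, 2))
    return singles, pairs


corpus = (
    "The course information is online. "
    "Please read the course information before the course starts."
)
tokens = tokenize(corpus)
singles, pairs = context_counts(tokens, "course", window=2)
ranked_singles = singles.most_common()
ranked_pairs = pairs.most_common()
```

Sorting the counters with `most_common()` gives the frequency ranking; on a real corpus the raw counts would be dominated by function words such as "the", which is what the weighting on the next slide addresses.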
8
Implementation (2)
Computing rank of expression
- Tf: raw count
- Idf of a word:
- Position normalization: reward context words closer to the target
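One way the three ingredients on this slide could combine is sketched below. The slide's own idf formula is not preserved, so the example assumes the common form log(N / df); the position weight of 1 / distance and the names `idf` and `score_context` are likewise assumptions for illustration only.

```python
import math
from collections import defaultdict


def idf(word, documents):
    """Assumed idf form: log(N / df), with df = number of documents containing word."""
    df = sum(1 for doc in documents if word in doc)
    return math.log(len(documents) / df) if df else 0.0


def score_context(documents, target, window=3):
    """Score context words of `target` by position-weighted count times idf."""
    weighted_tf = defaultdict(float)
    for doc in documents:
        for i, tok in enumerate(doc):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(doc), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Assumed position weight: closer context words count more.
                    weighted_tf[doc[j]] += 1.0 / abs(j - i)
    return {w: tf * idf(w, documents) for w, tf in weighted_tf.items()}


# Each document is a list of tokens (toy data, not from the original corpus).
documents = [
    ["the", "course", "information", "is", "online"],
    ["read", "the", "course", "information"],
    ["the", "weather", "is", "fine"],
    ["come", "to", "class"],
]
scores = score_context(documents, "course", window=2)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data "the" and "information" have the same position-weighted count, but the idf factor pushes "information" ahead because "the" occurs in more documents.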
9
Interface
Output a ranked list of expressions with example sentences via the Web
Examples: course, information, notorious, come, come (without idf)
10
Further Improvements
- Use a search engine to build a corpus
- Allow phrase searching
- Provide an option to search for highly frequent phrases as opposed to idiomatic expressions
11
Conclusion
We have presented a tool that, given a word, finds typical usages of the word in natural language.
The tool should be useful for
- learners of English
- native speakers