1
Identifying terms with similar meanings across corpora
2
Sahami and Heilman: Kofi Annan, UN Secretary General
Google(Kofi Annan), Google(UN Secretary General)
My Project:
ForeignAffairs(Kofi Annan), Google(Kofi Annan)
BioDatabase(Python), Google(Python)
3
Main Program, Google Search API, Web, Lucene, Pre-computed IDFs
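A minimal sketch of how these components might fit together, loosely in the spirit of the Sahami and Heilman kernel: snippets retrieved for one term from two corpora are turned into TF-IDF vectors using a pre-computed IDF table, averaged into a single context vector per corpus, and compared by cosine similarity. Everything concrete here is an assumption for illustration: the `IDF` values, the `DEFAULT_IDF` fallback, the function names, and the hard-coded snippets, which stand in for results that would really come from the Google Search API or a Lucene index.

```python
import math
from collections import Counter

# Hypothetical pre-computed IDF table (illustrative values only).
IDF = {"python": 1.0, "snake": 3.0, "species": 3.2, "venom": 3.5,
       "language": 2.5, "code": 2.0, "programming": 2.8}
DEFAULT_IDF = 4.0  # assumed weight for terms missing from the table


def _unit(vec):
    """Scale a sparse vector (dict) to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def tfidf_vector(text):
    """Unit-length TF-IDF vector for one snippet."""
    counts = Counter(text.lower().split())
    return _unit({t: c * IDF.get(t, DEFAULT_IDF) for t, c in counts.items()})


def context_vector(snippets):
    """Unit-length centroid of the snippet vectors for one term in one corpus."""
    centroid = Counter()
    for snippet in snippets:
        for term, weight in tfidf_vector(snippet).items():
            centroid[term] += weight
    return _unit(centroid)


def similarity(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


# Toy stand-ins for BioDatabase(Python) and Google(Python).
bio_snippets = ["python snake species", "python venom snake"]
web_snippets = ["python programming language", "python code language"]

bio = context_vector(bio_snippets)
web = context_vector(web_snippets)
score = similarity(bio, web)
```

With these toy snippets the two context vectors share only the term "python", so `score` is well below the self-similarity of 1.0. A low score of this kind is what would suggest that the term's meaning in the specialised corpus differs from its dominant meaning on the web.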
4
Best Results So Far: IMDB
“Apocalypse Now” and “Gothika” clearly identified as popular. “The Body”, “Summer School”, “Antitrust” clearly identified as… overshadowed by other meanings. Compound identification (actor names, etc.) would probably be a big help here.
5
References
Sahami, M. and Heilman, T. D. A web-based kernel function for measuring the similarity of short text snippets. In Proceedings of the 15th International Conference on World Wide Web (Edinburgh, Scotland, May 2006). WWW '06. ACM Press, New York, NY.