Identifying terms with similar meanings across corpora

Sahami and Heilman vs. My Project

Sahami and Heilman: Kofi Annan / UN Secretary General
Google(Kofi Annan) vs. Google(UN Secretary General)

My Project:
ForeignAffairs(Kofi Annan) vs. Google(Kofi Annan)
BioDatabase(Python) vs. Google(Python)
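The comparison on this slide can be sketched in code. The following is a minimal, simplified sketch of a context-vector similarity in the spirit of Sahami and Heilman's kernel, not the project's actual implementation: the text returned for a term by each corpus is turned into TF-IDF vectors, each vector is normalized, the vectors are summed into a centroid, and two centroids are compared by dot product. The function names, the `idf` dictionary, and the toy snippets are all assumptions made for illustration; retrieving the snippets from a real corpus or search engine is left out.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def l2_normalize(vec):
    """Scale a sparse vector (dict) to unit length; empty if all-zero."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}


def context_vector(snippets, idf, default_idf=1.0):
    """Unit-length centroid of the normalized TF-IDF vectors of the snippets.

    `idf` maps a token to its IDF weight; tokens missing from it get
    `default_idf` (an assumption of this sketch).
    """
    centroid = defaultdict(float)
    for snippet in snippets:
        counts = defaultdict(int)
        for tok in tokenize(snippet):
            counts[tok] += 1
        doc_vec = l2_normalize(
            {t: c * idf.get(t, default_idf) for t, c in counts.items()}
        )
        # Summing instead of averaging is fine: the final normalization
        # removes the constant factor.
        for t, w in doc_vec.items():
            centroid[t] += w
    return l2_normalize(centroid)


def similarity(snippets_a, snippets_b, idf, default_idf=1.0):
    """Dot product of the two context vectors (cosine similarity)."""
    a = context_vector(snippets_a, idf, default_idf)
    b = context_vector(snippets_b, idf, default_idf)
    return sum(w * b.get(t, 0.0) for t, w in a.items())


# Toy stand-ins for BioDatabase(Python) and Google(Python); invented text.
bio_snippets = [
    "the python is a large constricting snake",
    "python snakes swallow prey whole",
]
web_snippets = [
    "python is a programming language",
    "download the python interpreter and libraries",
]
print(similarity(bio_snippets, web_snippets, idf={}))
```

A low score between the two corpora suggests that the term carries a different dominant meaning in each, which is the situation the BioDatabase(Python) vs. Google(Python) pair is meant to illustrate.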

Architecture

Components: Main Program; Google Search API (Web); Lucene; Pre-computed IDFs
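The slide does not say how the IDFs were pre-computed or stored. As a rough sketch under the assumption of the common formula idf(t) = log(N / df(t)), the table could be built once from a document collection and reloaded by the main program. The function names and the JSON file format below are hypothetical, and a plain Python list stands in for an index such as Lucene.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def compute_idf(documents):
    """Return {token: log(N / df)} where df counts documents containing it."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # set() so that each document counts a token at most once
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log(n_docs / df) for t, df in doc_freq.items()}


def save_idf(idf, path):
    """Write the IDF table to disk so it is computed only once."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(idf, f)


def load_idf(path):
    """Read a previously saved IDF table."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Toy collection; the real collection is not described on the slide.
docs = [
    "the snake is a python",
    "the language is python",
    "the movie is long",
]
idf = compute_idf(docs)
print(sorted(idf.items(), key=lambda kv: kv[1]))
```

Tokens that occur in every document receive a weight of zero, so they drop out of any TF-IDF comparison that uses this table.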

Best Results So Far: IMDB

“Apocalypse Now” and “Gothika” clearly identified as popular.
“The Body”, “Summer School”, “Antitrust” clearly identified as… overshadowed by other meanings.
Compound identification (actor names, etc.) would probably be a big help here.

References

Sahami, M. and Heilman, T. D. 2006. A web-based kernel function for measuring the similarity of short text snippets. In Proceedings of the 15th International Conference on World Wide Web (Edinburgh, Scotland, May 23-26, 2006). WWW '06. ACM Press, New York, NY, 377-386. DOI: http://doi.acm.org/10.1145/1135777.1135834