Slide 1
Semantic (Language) Models: Robustness, Structure & Beyond
Thomas Hofmann
Department of Computer Science, Brown University, TH@cs.brown.edu
Chief Scientist & Co-founder, RecomMind Inc., www.recommind.com
Slide 2
Three Key Challenges in IR …
– Robustness: insensitivity of search results with respect to variations of the query.
– Structure & topicality: extracting relevant concepts or topics and using those to improve accuracy and to structure the search result.
– Integration: combining statistical methods with prior/expert/linguistic knowledge and with different cues (terms, links, credibility of source, …).
Where do language models come in? Are these problems related?
Slide 3
Concept / Topic-Based View
Concept-specific language model – what is a concept?
– A (sparse) distribution over terms in the vocabulary.
– Probabilities: how likely is it that a term will express a certain concept?
– Concept = hidden, term = observed.
Document-specific "concept" model:
– Concept-based document representation
– (Concept-based user representation)
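Spelled out, the two parts of this slide combine into the standard aspect-model factorization that pLSA (cited on the next slide) estimates: a document mixes concepts via P(z | d), and each concept emits terms via P(w | z),

\[
  P(w \mid d) \;=\; \sum_{z} P(w \mid z)\, P(z \mid d).
\]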
Slide 4
From Concepts to Language Models
Putting both ingredients together: a concept-based language model.
Semantic Language Model:
– Unsupervised learning: Probabilistic Latent Semantic Analysis (pLSA, SIGIR'99).
– Qualitative pre-structuring of concepts based on thesauri, synsets, categories, topics, etc.
– Quantitative model by use of statistical estimation!
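A minimal sketch of the "statistical estimation" step: plain EM for pLSA on a dense document-term count matrix. The function name, the random initialisation, and the small numerical guard are illustrative assumptions, not details from the talk (the published pLSA work additionally describes a tempered-EM variant to control overfitting).

import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit P(z|d) and P(w|z) to a (n_docs, n_terms) count matrix by EM."""
    rng = np.random.default_rng(seed)
    n_docs, n_terms = counts.shape

    # Random initialisation of the two conditional distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)           # P(z | d)
    p_w_z = rng.random((n_topics, n_terms))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)           # P(w | z)

    for _ in range(n_iter):
        # E-step: posterior P(z | d, w) proportional to P(w | z) P(z | d).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]   # shape (d, z, w)
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)

        # M-step: re-estimate both distributions from the expected
        # counts n(d, w) * P(z | d, w).
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12

    return p_z_d, p_w_z

The dense (docs x topics x terms) arrays keep the update rules readable; a practical implementation would exploit the sparsity of the count matrix.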
Slide 5
Why Semantic Language Models?
– "Intelligent", domain-specific smoothing for document-specific unigram models (see the sketch after this list).
– Combines structure and numbers.
– Linguistic resources can be integrated.
– Category & topic information can be integrated.
– User profiles can be integrated (combination with collaborative filtering).
– Results for ambiguous queries can be structured – most relevant for short queries & heterogeneous domains (Web search [finally!]).
– Other ways to intelligently interact with users.
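One way to read the first bullet, building on the pLSA sketch above; the interpolation weight and the helper name are illustrative assumptions, not from the slides. The sparse maximum-likelihood unigram of a document is interpolated with the concept-based model, which moves probability mass onto semantically related terms the document itself never mentions.

import numpy as np

def smoothed_unigram(counts_d, p_z_d_row, p_w_z, lam=0.5):
    """Semantically smoothed unigram model for one document.

    counts_d  : (n_terms,) raw term counts of the document
    p_z_d_row : (n_topics,) P(z | d), e.g. one row of the pLSA fit above
    p_w_z     : (n_topics, n_terms) P(w | z)
    lam       : interpolation weight (illustrative choice)
    """
    p_ml = counts_d / counts_d.sum()           # maximum-likelihood unigram P_ML(w | d)
    p_sem = p_z_d_row @ p_w_z                  # concept-based model: sum_z P(w | z) P(z | d)
    return lam * p_ml + (1.0 - lam) * p_sem    # interpolated, document-specific P(w | d)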
Slide 6
Conclusion
– Using statistical estimation, language models allow us to enrich concept-based retrieval models with quantitative information.
– Semantic smoothing for improved language models.
– Integration of various sources of evidence.
– Richer models for interactive information access (they make sense).