

True/False questions (3pts*2)
1. Language models assume words are independent within a given text document.
2. True/False, explain: Compared to the head words, tail words contribute more in inferring the relation, e.g., similarity, among the documents.

Multiple choice questions (4pts*2)
1. Which of the following method(s) will improve the perplexity of a language model (estimated by maximum likelihood) on the test set?
(a) increase the training set (b) smoothing (c) reduce the order of the N-gram (d) switch to Bayesian estimation
2. Which of the following method(s) will reduce the vocabulary size for a vector space model?
(a) tokenization (b) stemming (c) normalization (d) N-grams
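As background for the perplexity question, here is a minimal sketch of how test-set perplexity can be computed for a unigram model, with and without additive smoothing. It is not an answer key. The toy sentences, the function name `unigram_perplexity`, the `alpha` parameter, and the closed-vocabulary assumption (the vocabulary includes the test words) are all illustrative choices that do not come from the slides.

```python
import math
from collections import Counter


def unigram_perplexity(train_tokens, test_tokens, vocab, alpha=0.0):
    """Perplexity of a unigram model on test_tokens.

    alpha = 0.0 gives the maximum likelihood estimate;
    alpha > 0 gives additive (Laplace-style) smoothing over vocab.
    """
    counts = Counter(train_tokens)
    denom = len(train_tokens) + alpha * len(vocab)
    log_prob = 0.0
    for word in test_tokens:
        p = (counts[word] + alpha) / denom
        if p == 0.0:
            # A zero-probability test word makes the perplexity infinite.
            return math.inf
        log_prob += math.log(p)
    return math.exp(-log_prob / len(test_tokens))


# Hypothetical toy data; "dog" never appears in the training text.
train = "the cat sat on the mat".split()
test = "the dog sat".split()
vocab = set(train) | set(test)  # assumption: closed vocabulary

mle_ppl = unigram_perplexity(train, test, vocab, alpha=0.0)
smoothed_ppl = unigram_perplexity(train, test, vocab, alpha=1.0)
print(mle_ppl, smoothed_ppl)
```

In this toy setting the unseen word drives the maximum likelihood perplexity to infinity, while the smoothed estimate stays finite. How the other options behave depends on the data and is left to the reader.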

Short answer question (6pts)
Why is cosine similarity preferred over Euclidean distance in Vector Space Models?
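To make the two measures concrete, the following sketch computes both on toy term-count vectors. The vectors and the function names are hypothetical, chosen only to show how document length affects each measure. It is an illustration under those assumptions, not the expected exam answer.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two term-count vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical term counts over a three-word vocabulary.
doc_short = [1, 2, 0]
doc_long = [10, 20, 0]   # same term proportions, ten times longer
doc_other = [2, 1, 1]    # different term mix, similar length to doc_short

print(cosine_similarity(doc_short, doc_long), euclidean_distance(doc_short, doc_long))
print(cosine_similarity(doc_short, doc_other), euclidean_distance(doc_short, doc_other))
```

Here `doc_long` is just a scaled copy of `doc_short`, so the angle between them is zero, yet their Euclidean distance is larger than the distance between `doc_short` and the topically different `doc_other`.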