True/False questions (3pts*2) Language models assume words are independent within a given text document. True/False, explain: Comparing to the head words, tail words contribute more in inferring the relation, e.g., similarity, among the documents.
Multiple choice questions (4pts*2) Which of the following method(s) will improve perplexity of a language model (estimated by maximum likelihood) on testing set: (a) increase training set (b) smoothing (c) reduce the order of N-gram (d) switch to Bayesian estimation Which of the following method(s) will reduce the vocabulary size for a vector space model (a) tokenization (b) stemming (c) normalization (d) N-grams
Short answer question (6pts) Why cosine similarity is preferred over Euclidian distance in Vector Space Models?