True/False questions (3pts*2)

Language models assume words are independent within a given text document.
True/False, explain: Compared with head words, tail words contribute more to inferring relations (e.g., similarity) among documents.

Multiple choice questions (4pts*2)
Which of the following method(s) will improve the perplexity of a language model (estimated by maximum likelihood) on the test set?
(a) increase the training set
(b) smoothing
(c) reduce the order of the N-gram
(d) switch to Bayesian estimation

Which of the following method(s) will reduce the vocabulary size of a vector space model?
(a) tokenization
(b) stemming
(c) normalization
(d) N-grams

Short answer question (6pts)
Why is cosine similarity preferred over Euclidean distance in vector space models?
