True/False questions (3pts*2)
Language models assume words are independent within a given text document.
True/False, explain: Compared with head words, tail words contribute more to inferring relations (e.g., similarity) among documents.
Multiple choice questions (4pts*2)
Which of the following method(s) will improve the perplexity of a language model (estimated by maximum likelihood) on the test set? (a) increase the training set size (b) smoothing (c) reduce the order of the N-gram (d) switch to Bayesian estimation
Which of the following method(s) will reduce the vocabulary size of a vector space model? (a) tokenization (b) stemming (c) normalization (d) N-grams
Short answer question (6pts)
Why is cosine similarity preferred over Euclidean distance in vector space models?
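One way to explore this question numerically is the minimal sketch below. It is not part of the original quiz: the three term-count vectors and the helper names `cosine_similarity` and `euclidean_distance` are hypothetical, chosen only to compare how each measure treats a document that is much longer but has the same term proportions.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two term-count vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    # Straight-line distance between two term-count vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical term counts over a three-word vocabulary (illustrative only).
doc_short = [1, 2, 0]
doc_long = [10, 20, 0]   # same term proportions as doc_short, ten times longer
doc_other = [2, 1, 1]    # different proportions, similar length to doc_short

cos_long = cosine_similarity(doc_short, doc_long)
cos_other = cosine_similarity(doc_short, doc_other)
euc_long = euclidean_distance(doc_short, doc_long)
euc_other = euclidean_distance(doc_short, doc_other)

print("cosine:    long =", cos_long, " other =", cos_other)
print("euclidean: long =", euc_long, " other =", euc_other)
```

In this toy setup, cosine similarity ranks `doc_long` as the closer match to `doc_short`, while Euclidean distance ranks `doc_other` closer, because the distance is dominated by the difference in document length.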