Recap: Conditional Exponential Model
  Prediction probability
  Model parameters: for each class y, we have weights w_y and threshold c_y
  Maximum likelihood estimation
  Translation invariance
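The formulas on this slide did not survive in the text. Assuming the usual form of a conditional exponential model over the weights w_y and thresholds c_y named above, the three items can be sketched as follows. The training pairs (x_i, y_i), the count n, the shift vector v, and the shift scalar b are symbols introduced here for illustration.

```latex
% Prediction probability
p(y \mid x) = \frac{\exp(w_y^\top x + c_y)}{\sum_{y'} \exp(w_{y'}^\top x + c_{y'})}

% Maximum likelihood estimation over training pairs (x_i, y_i)
\max_{\{w_y, c_y\}} \; \sum_{i=1}^{n} \log p(y_i \mid x_i)

% Translation invariance: adding the same v and b to every class cancels out
\frac{\exp\big((w_y + v)^\top x + c_y + b\big)}{\sum_{y'} \exp\big((w_{y'} + v)^\top x + c_{y'} + b\big)}
  = \frac{\exp(w_y^\top x + c_y)}{\sum_{y'} \exp(w_{y'}^\top x + c_{y'})}
```

The common factor exp(v^T x + b) appears in both numerator and denominator, so the parameters are only determined up to a shared shift.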
Modified Conditional Exponential Model
  Set w_1 to be the zero vector and c_1 to be zero
  Prediction probability
  Model parameter estimation
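A minimal sketch of why fixing w_1 = 0 and c_1 = 0 loses nothing: shifting every class's parameters by those of the first class leaves the predicted probabilities unchanged. The function names `predict_proba` and `pin_first_class` and the numeric values are hypothetical, chosen only for illustration, and the softmax form of the prediction probability is an assumption about the slide's missing formula.

```python
import math

def predict_proba(x, weights, thresholds):
    """Return p(y | x) for every class under a conditional exponential model."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) + c
              for w, c in zip(weights, thresholds)]
    m = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def pin_first_class(weights, thresholds):
    """Shift all parameters so that the first class has w_1 = 0 and c_1 = 0."""
    w1, c1 = weights[0], thresholds[0]
    new_w = [[w_j - w1_j for w_j, w1_j in zip(w, w1)] for w in weights]
    new_c = [c - c1 for c in thresholds]
    return new_w, new_c

# Illustrative parameters for three classes and a two-dimensional input.
weights = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
thresholds = [0.1, -0.4, 0.7]
x = [2.0, -1.0]

p_original = predict_proba(x, weights, thresholds)
pinned_w, pinned_c = pin_first_class(weights, thresholds)
p_pinned = predict_proba(x, pinned_w, pinned_c)
```

Both parameter sets give the same probabilities, so estimation only needs to search over the parameters of the remaining classes.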
MaxEnt for Classification Problems
  Favor uniform distributions: maximize the entropy of the distribution
  Consistent with training data: constraints on the mean of input features
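The slide's formulas are missing from the text. One common way to write the two requirements as an optimization problem is sketched below, under the assumption that the constraints match the model's class-wise feature means to the empirical ones. The symbols n, (x_i, y_i), and the indicator \delta are introduced here for illustration.

```latex
\max_{p} \; -\sum_{i=1}^{n} \sum_{y} p(y \mid x_i) \log p(y \mid x_i)

\text{subject to} \quad
\sum_{i=1}^{n} p(y \mid x_i)\, x_i = \sum_{i=1}^{n} \delta(y, y_i)\, x_i \quad \text{for every class } y,
\qquad
\sum_{y} p(y \mid x_i) = 1 \quad \text{for every } i
```

Here \delta(y, y_i) is 1 when y = y_i and 0 otherwise, so the right-hand side sums the input features of the training examples labeled y.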
Translation Problem
  Parameters: p(dans), p(en), p(au-cours-de), p(a), p(pendant)
  Represent each French word with two indicator features, {dans, en} and {dans, a}:

  Word               {dans, en}   {dans, a}
  dans               1            1
  en                 1            0
  au-cours-de        0            0
  a                  0            1
  pendant            0            0
  Empirical average  0.3          0.5
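Constraints
  p(dans) + p(en) = 0.3
  p(dans) + p(a) = 0.5
  p(dans) + p(en) + p(au-cours-de) + p(a) + p(pendant) = 1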
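Maximum Entropy Formulation for the Translation Problem
  Maximize the entropy H(p) = -sum_w p(w) log p(w) subject to the constraints
  Solution (rounded to one decimal place): p(dans) = 0.2, p(a) = 0.3, p(en) = 0.1, p(au-cours-de) = 0.2, p(pendant) = 0.2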
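The solution can be checked numerically. The sketch below assumes the constraints p(dans) + p(en) = 0.3, p(dans) + p(a) = 0.5, and total probability 1, which leave a single free quantity d = p(dans). It then finds the d with the largest entropy by ternary search. This is a simple illustrative solver, not necessarily the method used in the lecture, and the helper names `distribution`, `entropy`, and `solve` are hypothetical.

```python
import math

WORDS = ["dans", "en", "au-cours-de", "a", "pendant"]

def distribution(d):
    """Distribution satisfying all three constraints, given d = p(dans)."""
    rest = 0.2 + d  # mass left for au-cours-de and pendant together
    return {
        "dans": d,
        "en": 0.3 - d,            # from p(dans) + p(en) = 0.3
        "a": 0.5 - d,             # from p(dans) + p(a) = 0.5
        "au-cours-de": rest / 2,  # unconstrained pair: an even split maximizes entropy
        "pendant": rest / 2,
    }

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def solve(lo=0.0, hi=0.3, iterations=200):
    """Ternary search for the d in [lo, hi] that maximizes the concave entropy."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if entropy(distribution(m1).values()) < entropy(distribution(m2).values()):
            lo = m1
        else:
            hi = m2
    return distribution((lo + hi) / 2)

p = solve()
for word in WORDS:
    print(f"{word}: {p[word]:.3f}")
```

Under these assumptions the maximizer is approximately p(dans) = 0.186, p(en) = 0.114, p(a) = 0.314, and p(au-cours-de) = p(pendant) = 0.193, which rounds to the one-decimal values quoted on the slide.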