CS4705 Natural Language Processing
Midterm Review

Statistical v. Symbolic Processing
80/20 Rule
Regular Expressions
  Finite State Automata
  Determinism v. non-determinism
  (Weighted) Finite State Transducers
Morphology
  Word Classes
  Inflectional v. Derivational
  Affixation, infixation, concatenation
  Morphotactics
  Morphological parsing
  Koskenniemi’s two-level morphology
  Porter stemmer
Minimum Edit Distance (Levenshtein)
N-grams
  Markov assumption
  Chain Rule
Language Modeling
  Simple, Adaptive, Class-based (syntax-based), bursty
  Smoothing
    Add-one, Witten-Bell, Good-Turing
  Back-off
  Perplexity, Entropy
  Maximum Likelihood Estimation
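
As a review aid for the language modeling topics, here is a minimal sketch of a bigram model showing the Markov assumption, Maximum Likelihood Estimation, add-one smoothing, and perplexity. The function names (`train_bigrams`, `mle_prob`, `add_one_prob`, `perplexity`) and the `<s>` / `</s>` boundary markers are illustrative assumptions, not taken from the course materials.

```python
import math
from collections import Counter


def train_bigrams(sentences):
    """Collect bigram counts from tokenized sentences.

    Each sentence is padded with <s> and </s>. Returns the history counts
    C(prev), the bigram counts C(prev, word), and the set of predictable
    tokens (every word plus </s>, but not <s>, which is never predicted).
    """
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        vocab.update(padded[1:])
        for prev, word in zip(padded, padded[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts, vocab


def mle_prob(prev, word, history_counts, bigram_counts):
    """Maximum likelihood estimate: C(prev, word) / C(prev)."""
    if history_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / history_counts[prev]


def add_one_prob(prev, word, history_counts, bigram_counts, vocab_size):
    """Add-one (Laplace) estimate: (C(prev, word) + 1) / (C(prev) + V)."""
    return (bigram_counts[(prev, word)] + 1) / (history_counts[prev] + vocab_size)


def perplexity(tokens, prob_fn):
    """Perplexity of one sentence under prob_fn(prev, word).

    prob_fn must return a nonzero probability for every bigram, which is
    why an unsmoothed MLE model fails here on unseen bigrams.
    """
    padded = ["<s>"] + list(tokens) + ["</s>"]
    log_sum = sum(math.log2(prob_fn(prev, word))
                  for prev, word in zip(padded, padded[1:]))
    return 2 ** (-log_sum / (len(padded) - 1))
```

Under MLE an unseen bigram gets probability zero. Add-one smoothing moves some probability mass to unseen bigrams while keeping the distribution over the vocabulary normalized for each history.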
Syntax
  Context Free Grammars
  Chomsky’s view: Syntax is cognitive reality
  Parse Trees
  Dependency Structure
Part-of-Speech Tagging
  Hand Written Rules v. Statistical v. Hybrid
  Brill Tagging
Types of Ambiguity
Context Free Grammars
  Top-down v. Bottom-up Derivations
  Left Corners
  Grammar Equivalence
  Normal Forms (CNF)
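
One reason Chomsky Normal Form matters is that bottom-up chart parsing with CYK assumes every rule is either A -> B C or A -> word. The following is a minimal sketch of a CYK recognizer under that assumption. The function name `cyk_recognize`, the dictionary encoding of the grammar, and the toy grammar in the usage note are all hypothetical, chosen only for illustration.

```python
from itertools import product


def cyk_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if `words` can be derived from `start` in a CNF grammar.

    lexical_rules maps a word to the set of nonterminals A with A -> word.
    binary_rules maps a pair (B, C) to the set of nonterminals A with A -> B C.
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every nonterminal that derives words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Fill spans of length 1 from the lexical rules.
    for i, word in enumerate(words):
        table[i][i + 1] = set(lexical_rules.get(word, ()))
    # Build longer spans bottom-up from pairs of adjacent shorter spans.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= binary_rules.get((left, right), set())
    return start in table[0][n]
```

With a toy grammar S -> NP VP, VP -> V NP, NP -> Det N and the lexicon the/Det, dog/N, cat/N, chased/V, the recognizer accepts "the dog chased the cat". A probabilistic version (pCYK) would store the best probability for each nonterminal in each cell instead of a bare set.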
Probabilistic Parsing
  (p)CYK, Earley Parsing
  Derivational Probability
  Lexicalization
  Classification
  Supertagging
Machine Learning
  Dependent v. Independent variables
  Training v. Development Test v. Test sets
  Feature Vectors
  Metrics
    Accuracy
    Precision, Recall, F-Measure
  Gold Standards
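
The evaluation metrics listed above can be sketched as follows, comparing system output against gold-standard labels. The function names and the `positive` and `beta` parameters are illustrative assumptions. Precision, recall, and F-measure are computed for one class treated as the positive class.

```python
def accuracy(gold, predicted):
    """Fraction of items whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision_recall_f(gold, predicted, positive, beta=1.0):
    """Precision, recall, and F-measure for the label `positive`.

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F_beta    = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = ((1 + beta ** 2) * precision * recall
                 / (beta ** 2 * precision + recall))
    return precision, recall, f_measure
```

With `beta=1.0` the F-measure is the harmonic mean of precision and recall. Larger values of `beta` weight recall more heavily.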
Semantics
  Meaning Representations
  Semantic Roles, Subcategorization frames
  FOPC
    Pros
    Cons
  Temporal Representations
    Reichenbach
    Aspect
  Beliefs, Desires, Intention Representation
  Syntax-driven semantics