Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS4705 Natural Language Processing.  Regular Expressions  Finite State Automata ◦ Determinism v. non-determinism ◦ (Weighted) Finite State Transducers.

Similar presentations


Presentation on theme: "CS4705 Natural Language Processing.  Regular Expressions  Finite State Automata ◦ Determinism v. non-determinism ◦ (Weighted) Finite State Transducers."— Presentation transcript:

1 CS4705 Natural Language Processing

2  Regular Expressions  Finite State Automata ◦ Determinism v. non-determinism ◦ (Weighted) Finite State Transducers  Morphology ◦ Word Classes and p.o.s. ◦ Inflectional v. Derivational ◦ Affixation, infixation, concatenation ◦ Morphotactics

3 ◦ Different languages, different morphologies ◦ Evidence from human performance  Morphological parsing ◦ Koskenniemi’s two-level morphology ◦ FSAs vs. FSTs ◦ Porter stemmer  Noise channel model ◦ Bayesian inference

4  N-grams ◦ Markov assumption ◦ Chain Rule ◦ Language Modeling  Simple, Adaptive, Class-based (syntax-based)  Smoothing  Add-one, Witten-Bell, Good-Turing  Back-off models

5  Creating and using ngram LMs ◦ Corpora ◦ Maximum Likelihood Estimation  Syntax ◦ Chomsky’s view: Syntax is cognitive reality ◦ Parse Trees  Dependency Structure ◦ What is a good parse tree? ◦ Part-of-Speech Tagging  Hand Written Rules v. Statistical v. Hybrid  Brill Tagging  HMMs

6 ◦ Types of Ambiguity  Context Free Grammars ◦ Top-down v. Bottom-up Derivations  Early Algorithm ◦ Grammar Equivalence ◦ Normal Forms (CNF) ◦ Modifying the grammar  Probabilistic Parsing ◦ Derivational Probability ◦ Computing probabilities for a rule ◦ Choosing a rule probabilistically ◦ Lexicalization

7  Machine Learning ◦ Dependent v. Independent variables ◦ Training v. Development Test v. Test sets ◦ Feature Vectors ◦ Metrics  Accuracy  Precision, Recall, F-Measure ◦ Gold Standards  Semantics ◦ Where it fits ◦ Thematic roles ◦ First Order Predicate Calculus as a representation ◦ Semantic Analysis will not be covered on the midterm


Download ppt "CS4705 Natural Language Processing.  Regular Expressions  Finite State Automata ◦ Determinism v. non-determinism ◦ (Weighted) Finite State Transducers."

Similar presentations


Ads by Google