LING 438/538 Computational Linguistics Sandiway Fong Lecture 16: 10/19.



Administrivia
– review homework #3
– new homework #4
  – out today
  – usual rules apply
  – due next Thursday

Last Time
Spelling errors and correction
Error Correction
– correct
  – Bayesian Probability
– Minimum Edit Distance Computation
  – Dynamic Programming

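Minimum Edit Distance
example
– assuming insert = 1, delete = 1, substitution = 2 (or 0 for substituting the same character)
recursive formula
– incrementally computed from minimum edit distances of shorter strings
– the pair (intent, execut) is one edit operation away from each of the pairs (intent, execu), (inten, execut), and (inten, execu)
– with L, D, and B the distances already computed for those three neighboring pairs: min(L+1, D+0, B+1), where D is the substitution cell and costs +0 here because both strings end in t
– cost: =8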
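The recursive formula on this slide can be sketched in Python as follows. This is an illustrative sketch, not the lecture's own code: the function name `min_edit_distance` and its parameter names are assumptions, and memoization via `lru_cache` stands in for the dynamic-programming table.

```python
from functools import lru_cache


def min_edit_distance(source: str, target: str,
                      ins_cost: int = 1, del_cost: int = 1,
                      sub_cost: int = 2) -> int:
    """Minimum edit distance with the slide's costs (hypothetical helper)."""

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        # dist(i, j) is the distance between source[:i] and target[:j].
        if i == 0:
            return j * ins_cost          # build target[:j] by insertions only
        if j == 0:
            return i * del_cost          # erase source[:i] by deletions only
        same = source[i - 1] == target[j - 1]
        return min(
            dist(i - 1, j) + del_cost,                     # delete
            dist(i, j - 1) + ins_cost,                     # insert
            dist(i - 1, j - 1) + (0 if same else sub_cost)  # substitute / copy
        )

    return dist(len(source), len(target))


print(min_edit_distance("intention", "execution"))  # 8
```

Each call looks only at the three pairs that are one edit operation shorter, which is the min(L+1, D+0, B+1) step generalized to unequal final characters.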

Minimum Edit Distance Computation
one formula: Microsoft Excel implementation
$ in a cell reference means don’t change when copied from cell to cell
– e.g. in C$1, 1 stays the same (row protected)
– in $A3, A stays the same (not 3) (column protected)
formula (the cell references place it in C3):
  min(C2+1,B3+1,B2+if(C$1=$A3,0,2))
copied one column to the right (inc col):
  min(D2+1,C3+1,C2+if(D$1=$A3,0,2))
copied one row down (inc row):
  min(C3+1,B4+1,B3+if(C$1=$A4,0,2))
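For readers without a spreadsheet, the same fill pattern can be sketched in Python. The layout is an assumption made to mirror the formula: one word runs down the rows (the `$A3` header), the other across the columns (the `C$1` header), and the first row and column of the table hold the distances from the empty string. The name `edit_distance_table` is hypothetical.

```python
def edit_distance_table(row_word: str, col_word: str, sub_cost: int = 2):
    """Fill the whole table the way the copied spreadsheet formula does."""
    rows, cols = len(row_word) + 1, len(col_word) + 1
    table = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        table[r][0] = r                  # border column: r deletions
    for c in range(cols):
        table[0][c] = c                  # border row: c insertions
    for r in range(1, rows):
        for c in range(1, cols):
            same = row_word[r - 1] == col_word[c - 1]
            table[r][c] = min(
                table[r - 1][c] + 1,                          # like C2+1
                table[r][c - 1] + 1,                          # like B3+1
                table[r - 1][c - 1] + (0 if same else sub_cost),  # like B2+if(...)
            )
    return table


table = edit_distance_table("intention", "execution")
print(table[-1][-1])  # bottom-right cell holds the distance: 8
```

The two nested loops play the role of copying the formula across columns and down rows; the header comparison `C$1=$A3` becomes the `same` test.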

Minimum Edit Distance Computation
demo
example pairs, with min edit distance (assuming substitution cost 2)
– intention, intent:
– intention, intentional:
– intention, ten:
– intention, ton:
– intention, teen:

Homework 3 Review

Question 1 438/538 (4pts)
Give the minimum size regular expression for the FSA below (2pts)
[FSA diagram: states s, x, y; arcs labeled a, a, b, ε]
Minimum size regular expression for the FSA:
– a⁺b*
not minimum size in terms of number of symbols:
– aa*b*
– (aa*)|(aa*b*)

Question 1 438/538 (4pts)
Give an equivalent FSA without the ε-transition (2pts)
– answers in the form of a diagram, a formal definition, or a Prolog definition are all ok
Equivalent ε-free FSA
[diagrams: original FSA with states s, x, y and arcs labeled a, a, b, ε; equivalent ε-free FSA with states s, a, b]
How to arrive at this answer?
– by inspection, or
– by consideration of a⁺b*, where b* = ε | b⁺

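Question 1 438/538 (4pts)
Give an equivalent FSA without the ε-transition (2pts)
– answers in the form of a diagram, a formal definition, or a Prolog definition are all ok
Set-of-States Construction method:
[diagrams: original FSA with states s, x, y and arcs labeled a, a, b, ε; set-of-states FSA with states {s}, {x,y}, {y}; the same FSA with states renamed s, a, b]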
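The set-of-states construction can be sketched in Python as below. The arc list is an assumption read off the diagram labels and the resulting state sets (s –a→ x, x –a→ x, x –ε→ y, y –b→ y, with y final); the original diagram is not preserved in this transcript. The names `epsilon_closure` and `set_of_states` are hypothetical.

```python
from collections import deque

EPS = ""  # label used here for ε-arcs (an encoding choice for this sketch)


def epsilon_closure(states, delta):
    """All states reachable from `states` using only ε-arcs."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPS), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def set_of_states(start, finals, delta, alphabet):
    """Build an ε-free deterministic FSA whose states are sets of NFA states."""
    start_set = epsilon_closure({start}, delta)
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            moved = {n for q in current for n in delta.get((q, sym), ())}
            if not moved:
                continue                 # no arc on this symbol
            target = epsilon_closure(moved, delta)
            dfa_delta[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_finals = {s for s in seen if s & finals}
    return start_set, dfa_finals, dfa_delta, seen


# Assumed arcs for the Question 1 machine (see lead-in).
delta = {("s", "a"): {"x"}, ("x", "a"): {"x"},
         ("x", EPS): {"y"}, ("y", "b"): {"y"}}
start, finals, arcs, states = set_of_states("s", {"y"}, delta, "ab")
for (src, sym), dst in sorted(arcs.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(src), sym, sorted(dst))
```

Under those assumed arcs the construction yields the three states {s}, {x,y}, and {y} named on the slide.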

Question 2 438/538 (8pts)
convert the NDFSA into a deterministic FSA (3pts)
– figure 2.27 in the textbook
set-of-states construction:
[diagram: deterministic FSA with states {1}, {2}, {3,4}, {2,3}; {1} –a→ {2} –b→ {3,4} –a→ {2,3}, with {2,3} –b→ {3,4} and {2,3} –a→ {2}]

Question 2 438/538 (8pts)
implement both the NDFSA and the equivalent FSA in Prolog using the “one predicate per state” encoding
Prolog code (NDFSA):
  one([a|L]) :- two(L).
  two([b|L]) :- three(L).
  two([b|L]) :- four(L).
  three([]).
  three([a|L]) :- two(L).
  four([a|L]) :- three(L).
strings abab and abaaba, how many steps (transitions + final stop)?

Question 2 438/538 (8pts)
implement both the NDFSA and the equivalent FSA in Prolog using the “one predicate per state” encoding
Prolog code (deterministic FSA):
  s1([a|L]) :- s2(L).
  s2([b|L]) :- s34(L).
  s34([]).
  s34([a|L]) :- s23(L).
  s23([]).
  s23([b|L]) :- s34(L).
  s23([a|L]) :- s2(L).
[diagram: deterministic FSA with states {1}, {2}, {3,4}, {2,3}]
strings abab and abaaba, how many steps (transitions + final stop)?
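As a cross-check that the two Prolog encodings accept the same strings, the clauses can be transcribed into Python transition tables. This is a sketch only: the table and function names are hypothetical, the NDFSA is simulated by tracking a set of states rather than by Prolog-style backtracking, and so it does not reproduce the step counts the slide asks about.

```python
# NDFSA arcs, transcribed from the one/two/three/four clauses.
NFA = {("1", "a"): {"2"}, ("2", "b"): {"3", "4"},
       ("3", "a"): {"2"}, ("4", "a"): {"3"}}
NFA_FINALS = {"3"}                       # three([]).

# Deterministic arcs, transcribed from the s1/s2/s34/s23 clauses.
DFA = {("s1", "a"): "s2", ("s2", "b"): "s34", ("s34", "a"): "s23",
       ("s23", "b"): "s34", ("s23", "a"): "s2"}
DFA_FINALS = {"s34", "s23"}              # s34([]). and s23([]).


def nfa_accepts(string: str) -> bool:
    current = {"1"}
    for sym in string:
        # Follow every possible arc at once instead of backtracking.
        current = {n for q in current for n in NFA.get((q, sym), ())}
    return bool(current & NFA_FINALS)


def dfa_accepts(string: str) -> bool:
    state = "s1"
    for sym in string:
        state = DFA.get((state, sym))
        if state is None:                # no clause matches: fail
            return False
    return state in DFA_FINALS


for word in ["abab", "abaaba", "a", "abaa"]:
    print(word, nfa_accepts(word), dfa_accepts(word))
```

The deterministic machine takes exactly one transition per input symbol, whereas the Prolog NDFSA may try the `three` clause for b before falling back to `four`.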

Question 3 438/538 (8pts)
(5pts) Give an FSA in Prolog that accepts a binary string (made up of 0’s and 1’s) if and only if it begins with a 1 and contains exactly one 0
– examples:
  –
  – 10
  – *
FSA:

Question 3 438/538 (8pts)
(5pts) Give an FSA in Prolog that accepts a binary string (made up of 0’s and 1’s) if and only if it begins with a 1 and contains exactly one 0
(3pts) Give the regular expression equivalent of the FSA
Regular Expression:
– 11*01*
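The FSA diagram for this question is not preserved here, so the following Python sketch shows one machine that fits the description and checks it against the regular expression 11*01*. The state names (`start`, `no_zero`, `one_zero`) and the function name `accepts` are assumptions; the homework itself asks for Prolog.

```python
import re

# One possible transition table: missing entries mean "reject".
TRANSITIONS = {
    ("start", "1"): "no_zero",      # must begin with a 1
    ("no_zero", "1"): "no_zero",    # more 1s before the 0
    ("no_zero", "0"): "one_zero",   # the single 0
    ("one_zero", "1"): "one_zero",  # only 1s may follow
}
FINAL_STATES = {"one_zero"}

PATTERN = re.compile(r"11*01*")


def accepts(string: str) -> bool:
    state = "start"
    for sym in string:
        state = TRANSITIONS.get((state, sym))
        if state is None:
            return False
    return state in FINAL_STATES


for word in ["10", "1101", "100", "01", "11"]:
    print(word, accepts(word), PATTERN.fullmatch(word) is not None)
```

A second 0 has no arc out of `one_zero`, which is how the machine enforces "exactly one 0".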

Homework #4

Question 1 438/538 (8pts)
Implement the e-insertion rule
(Context-Sensitive) Spelling Rule: (3.5)
– ε → e / { x, s, z } ^ __ s#
– as a FST in Prolog
Goals:
– pass through non-matching cases unchanged
– implement rule exactly
– no deletion of boundaries ^ and #
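To make the rule's behavior concrete, here is a small Python sketch of a deterministic transducer for it. It is not the Prolog solution the homework asks for: the state names q0 to q3, the function name `e_insertion`, and the choice to delay output of a possible suffix s until the next symbol is seen are all assumptions of this sketch.

```python
SIBILANTS = {"x", "s", "z"}


def e_insertion(tape: str) -> str:
    """Insert e between ^ and s# when ^ is preceded by x, s, or z."""
    out = []
    state = "q0"   # q0: default, q1: just saw x/s/z, q2: saw x/s/z then ^,
                   # q3: saw x/s/z ^ s, with that s held back
    for ch in tape:
        if state == "q3":
            if ch == "#":
                out.append("es#")        # rule fires: emit e, held s, and #
                state = "q0"
                continue
            out.append("s")              # rule does not fire: release the s
            state = "q1"                 # the released s is itself in {x, s, z}
        if state == "q1" and ch == "^":
            out.append(ch)
            state = "q2"
        elif state == "q2" and ch == "s":
            state = "q3"                 # hold the s until we see what follows
        elif ch in SIBILANTS:
            out.append(ch)
            state = "q1"
        else:
            out.append(ch)
            state = "q0"
    if state == "q3":
        out.append("s")                  # input ended while holding an s
    return "".join(out)


print(e_insertion("fox^s#"))   # fox^es#
print(e_insertion("cat^s#"))   # cat^s#
```

Both boundary symbols survive in the output, and inputs that never match the context pass through unchanged, in line with the stated goals.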

Question 2 438/538 (6pts)
What does the Porter Stemmer output for the following words:
– (2pts) availability
– (2pts) shipping
– (2pts) unbelievable
Show the steps (stages) in your answer

Question 2 438/538 (6pts)
– the Porter Stemmer handles -ement for cases like replacement → replac(e)
– it doesn’t handle statement → stat(e), i.e. it outputs statement
– Why? Explain (2pts)
– Modify the Porter rule responsible to allow for statement → stat(e)
  – Submit your rule (2pts)
  – Give 2 examples where the modified rule would be too liberal, i.e. it overstems (2pts)
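As background for this question, the Porter algorithm conditions its -ement rule on the measure m of the remaining stem (the number of vowel-consonant sequences); the rule is (m>1) EMENT → null. The Python sketch below computes m and applies that one rule in isolation. It is not a full stemmer: the function names are hypothetical, and the other Porter steps are omitted.

```python
def is_consonant(stem: str, i: int) -> bool:
    """Porter's definition: y counts as a vowel when it follows a consonant."""
    ch = stem[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        return i == 0 or not is_consonant(stem, i - 1)
    return True


def measure(stem: str) -> int:
    """Number of VC sequences in the stem, i.e. m in [C](VC)^m[V]."""
    pattern = "".join("C" if is_consonant(stem, i) else "V"
                      for i in range(len(stem)))
    return pattern.count("VC")           # each V-to-C boundary is one VC


def strip_ement(word: str) -> str:
    """Apply only the rule (m>1) EMENT -> null."""
    if word.endswith("ement"):
        stem = word[: -len("ement")]
        if measure(stem) > 1:
            return stem
    return word


for w in ["replacement", "statement"]:
    stem = w[: -len("ement")]
    print(w, "stem:", stem, "m =", measure(stem), "->", strip_ement(w))
```

Comparing the two measures printed here is a useful starting point for the "Why?" part of the question.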

Summary
– Q1: 8pts
– Q2: 6+6=12pts
– Total: 20pts