Finite-state automata 3: Morphology. Day 14, LING 681.02 Computational Linguistics, Harry Howard, Tulane University.

25-Sept-2009, LING 681.02, Prof. Howard, Tulane University

Course organization
- NLTK is installed on the computers in this room!
- How would you like to use the Provost's $150?

SLP §2.2 Finite-state automata: Recognition as search

Non-deterministic recognition: Search
- In a non-deterministic FSA, there is at least one path through the machine that leads to an accept state for a string that is in the language defined by the machine.
- There is no path through the machine that leads to an accept state for a string not in the language.
- But not all paths through the machine for an accepted string lead to an accept state.

Non-deterministic recognition
- So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept state.
- Failure occurs when all of the possible paths for a given string lead to failure.

Back to the example
[Figure: input tape b a a a ! with the machine states q0, q1, q2, q2, q3, q4 aligned under it.]

Example
[Figure: search trace on the input baaa!. Search states in slide order: q0, q1, q2, q2, q2 (marked X, a dead end), q3, q4.]

Summary
- States in the search space are pairings of tape positions and states in the machine.
- By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.

Keeping track
- But how do you keep track?
- Depth-first / last in, first out (LIFO) / stack
  - Unexplored states are added to the front of the agenda, and the most recently added one is explored next.
- Breadth-first / first in, first out (FIFO) / queue
  - Unexplored states are added to the back of the agenda, and the oldest one (at the front) is explored next.

Depth-first/LIFO/stack
[Figure: the agenda drawn as a stack. State labels shown: q2, q18, q12, q41, q27, q50, q31.]

Breadth-first/FIFO/queue
[Figure: the agenda drawn as a queue. State labels shown: q2, q18, q12, q41, q27, q50, q31.]
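The agenda idea can be sketched in a few lines of Python. This is only an illustration, not the course's code: the transition table is an assumed encoding of the baa+! example (a loop on q2 plus an arc from q2 to q3, both on a), and the names `TRANSITIONS`, `ACCEPT` and `nd_recognize` are made up here. The machine is assumed to have no epsilon arcs, so every step consumes one symbol.

```python
from collections import deque

# Assumed encoding of the non-deterministic baa+! machine.
# Keys are (state, symbol); values list the possible next states.
TRANSITIONS = {
    (0, "b"): [1],
    (1, "a"): [2],
    (2, "a"): [2, 3],   # the non-deterministic choice: stay in q2 or move to q3
    (3, "!"): [4],
}
ACCEPT = {4}


def nd_recognize(tape, depth_first=True):
    """Return (accepted, visited); visited lists search states in the order explored."""
    agenda = deque([(0, 0)])   # a search state is (machine state, tape position)
    visited = []
    while agenda:
        # Stack (LIFO): take the newest entry. Queue (FIFO): take the oldest.
        state, pos = agenda.pop() if depth_first else agenda.popleft()
        visited.append((state, pos))
        if pos == len(tape):
            if state in ACCEPT:
                return True, visited
            continue            # end of tape in a non-accept state: dead end
        for nxt in TRANSITIONS.get((state, tape[pos]), []):
            agenda.append((nxt, pos + 1))
    return False, visited       # every path has failed


accepted, trace = nd_recognize("baaa!")
print(accepted, trace)
```

Both settings of `depth_first` give the same yes/no answer. Only the order in which search states are explored changes, which is visible in the returned trace.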

SLP §2.2 Finite-state automata: Comparison

Equivalence
- Non-deterministic machines can be converted to deterministic ones with a fairly simple construction.
- That means that they have the same power:
  - non-deterministic machines are not more powerful than deterministic ones in terms of the languages they can accept.
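One standard way to do the conversion is the subset construction, in which each state of the deterministic machine stands for a set of states of the non-deterministic one. The slide does not say which construction is meant, so the sketch below is an assumption on that point. It also assumes a machine without epsilon arcs, and the names `determinize`, `dfa_accepts` and `SHEEP_NFA` are hypothetical.

```python
from collections import deque


def determinize(nfa, start, accept):
    """Subset construction for an NFA without epsilon arcs.

    nfa maps (state, symbol) to a set of next states.
    Returns (dfa, dfa_start, dfa_accept); each DFA state is a frozenset of NFA states.
    """
    alphabet = {symbol for (_, symbol) in nfa}
    dfa_start = frozenset([start])
    dfa = {}
    dfa_accept = set()
    seen = {dfa_start}
    todo = deque([dfa_start])
    while todo:
        current = todo.popleft()
        if current & accept:                 # contains at least one NFA accept state
            dfa_accept.add(current)
        for symbol in alphabet:
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, symbol), ())
            )
            if not target:
                continue                     # no arc for this symbol
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa, dfa_start, dfa_accept


def dfa_accepts(dfa, dfa_start, dfa_accept, tape):
    """Run the deterministic machine: exactly one path, so no search is needed."""
    state = dfa_start
    for symbol in tape:
        state = dfa.get((state, symbol))
        if state is None:
            return False
    return state in dfa_accept


# Assumed encoding of the non-deterministic baa+! machine.
SHEEP_NFA = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}

dfa, dfa_start, dfa_accept = determinize(SHEEP_NFA, 0, {4})
print(dfa_accepts(dfa, dfa_start, dfa_accept, "baaa!"))
```

The resulting machine accepts the same strings as the original, but it can have more states in general, since its states are sets of the original states.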

Why bother?
- Non-determinism doesn’t get us more formal power and it causes headaches, so why bother?
- More natural (understandable) solutions.

SLP §3 Words and transducers: Intro

Concepts and terminology
- orthography: study of spelling
- morphology: study of word composition
- parsing: to build a structured representation of a word or sentence
- surface or input form: input to this process
- productive: a process that applies without limitations
- Can all forms be stored in advance?

Concepts and terminology
- morpheme: the minimal meaning-bearing unit in a language
- stem: the main unit
- affix: additional units
- a unit that:
  - prefix: precedes the main one
  - suffix: follows the main one
  - circumfix: surrounds the main one
  - infix: is inserted within the main one
- agglutinative: a language in which the main unit can have many additional units

Concepts and terminology
- inflection: combining an affix with a stem does not change the part of speech of the stem.
- derivation: combining an affix with a stem DOES change the part of speech of the stem.
- compounding: combining multiple stems.
- cliticization: combining a stem with a phonologically reduced stem.

SLP §3 Words and transducers: §3.1 Survey of (mostly) English morphology

Inflectional morphology

stem    -s       -ing      preterite  past part.
walk    walks    walking   walked     walked
try     tries    trying    tried      tried
map     maps     mapping   mapped     mapped
eat     eats     eating    ate        eaten
catch   catches  catching  caught     caught
be      is       being     was        been
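As a rough illustration of what the table encodes, the Python sketch below generates the same five forms with a few spelling rules and a small list of irregular forms. It is a toy, not the finite-state treatment of SLP §3: the function and dictionary names are hypothetical, the rules cover only what these six verbs need (no e-deletion as in make/making), and the consonant-doubling test ignores stress, so it wrongly doubles in longer stems such as visit.

```python
VOWELS = set("aeiou")

# Irregular forms are simply listed; these entries come from the table above.
IRREGULAR = {
    "eat": {"preterite": "ate", "past_part": "eaten"},
    "catch": {"preterite": "caught", "past_part": "caught"},
    "be": {"s": "is", "preterite": "was", "past_part": "been"},
}


def _doubles_final(stem):
    """Rough consonant-vowel-consonant test for doubling (map -> mapping)."""
    if len(stem) < 3:
        return False
    c1, v, c2 = stem[-3], stem[-2], stem[-1]
    return (c1 not in VOWELS and v in VOWELS
            and c2 not in VOWELS and c2 not in "wxy")


def inflect(stem):
    """Return the five forms of the table for one verb stem."""
    # Final y after a consonant becomes i before -es and -ed (try -> tries, tried).
    y_to_i = stem.endswith("y") and len(stem) > 1 and stem[-2] not in VOWELS
    base = stem + stem[-1] if _doubles_final(stem) else stem

    if stem.endswith(("s", "x", "z", "ch", "sh")):
        s_form = stem + "es"               # e-insertion: catch -> catches
    elif y_to_i:
        s_form = stem[:-1] + "ies"
    else:
        s_form = stem + "s"

    past = stem[:-1] + "ied" if y_to_i else base + "ed"
    forms = {"stem": stem, "s": s_form, "ing": base + "ing",
             "preterite": past, "past_part": past}
    forms.update(IRREGULAR.get(stem, {}))  # listed irregular forms win over the rules
    return forms


for verb in ["walk", "try", "map", "eat", "catch", "be"]:
    print(inflect(verb))
```

The split between rules and listed exceptions mirrors the question of whether all forms can be stored in advance: regular forms are computed, irregular ones are looked up.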

Next time
- P4
- SLP §3.2ff