Published by Della Garrison. Modified over 9 years ago.
1
The Simplest NL Applications: Text Searching and Pattern Matching Read J & M Chapter 2
2
Searching for a Single String Using a Nondeterministic FSM
[Figure: FSM with states 1 through 8 and transitions labeled c, o, c, o, n, u, t]
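The machine on this slide can be simulated by tracking the set of states it could be in after each input character. The following is a minimal sketch, not the slide's own construction: the function name `nfa_search` is hypothetical, states are numbered from 0 rather than 1, and the pattern is assumed to be non-empty.

```python
def nfa_search(pattern, text):
    """Return the start index of every occurrence of pattern in text.

    State i means "the last i characters read match pattern[:i]".
    State 0 loops on every character, which is the source of the
    nondeterminism: at each step the machine may also begin a new match.
    """
    accept = len(pattern)
    active = {0}                  # all states the NFA could currently be in
    hits = []
    for pos, ch in enumerate(text):
        nxt = {0}                 # self-loop on the start state
        for state in active:
            if state < accept and pattern[state] == ch:
                nxt.add(state + 1)
        active = nxt
        if accept in active:      # the final state was reached at this position
            hits.append(pos - accept + 1)
    return hits


print(nfa_search("coconut", "lococonut"))
```

Tracking a set of states avoids backtracking over the text: each input character is read exactly once.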
3
Searching for a Single String Using the Boyer Moore Algorithm
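As a rough illustration of the idea, the sketch below uses the Horspool simplification of Boyer-Moore, which keeps only a bad-character shift table and omits the good-suffix rule of the full algorithm. The function name `horspool_search` is hypothetical.

```python
def horspool_search(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1.

    Boyer-Moore-Horspool: compare the window right to left, then shift by
    an amount determined by the text character under the window's last cell.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Distance from the rightmost occurrence of each character (excluding
    # the final pattern position) to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    i = 0
    while i <= n - m:
        j = m - 1
        while j >= 0 and text[i + j] == pattern[j]:
            j -= 1
        if j < 0:
            return i              # the whole window matched
        # Characters absent from the table allow a shift of the full length.
        i += shift.get(text[i + m - 1], m)
    return -1


print(horspool_search("coconut", "lococonut"))
```

Because the window is compared from its right end, a mismatch can often skip several text positions at once, so many text characters are never examined.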
4
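Searching for Multiple Strings
[Figure: FSM with states 1 through 8 and transitions labeled c, o, c, o, n, u, t, plus additional transitions labeled o, c, o, s and l (state labels 2 through 6)]
Example: lococonut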
5
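Converting to a Deterministic FSM
[Figure: FSM with states 1 through 8 and transitions labeled c, o, c, o, n, u, t, plus additional transitions labeled o, c, o, s and l (state labels 2 through 6)]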
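One standard way to make such a machine deterministic is the subset construction, in which each deterministic state stands for a set of nondeterministic states. The sketch below is an illustration under assumptions: the helper names (`build_nfa`, `subset_construction`, `find_ends`) are hypothetical, and the two patterns in the usage line are illustrative choices, not necessarily the strings drawn on the slide.

```python
from collections import deque


def build_nfa(patterns):
    """Trie-shaped NFA; state 0 is the start and loops on every character."""
    trans = [{}]                      # trans[state][char] -> next state
    accepting = set()
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in trans[state]:
                trans.append({})
                trans[state][ch] = len(trans) - 1
            state = trans[state][ch]
        accepting.add(state)
    return trans, accepting


def subset_construction(trans, accepting, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start = frozenset({0})
    dfa = {}                          # dfa[subset][char] -> subset
    dfa_accepting = set()
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        if subset in dfa:
            continue                  # already expanded
        dfa[subset] = {}
        if subset & accepting:
            dfa_accepting.add(subset)
        for ch in alphabet:
            target = {0}              # the start state's self-loop
            for s in subset:
                if ch in trans[s]:
                    target.add(trans[s][ch])
            target = frozenset(target)
            dfa[subset][ch] = target
            queue.append(target)
    return start, dfa, dfa_accepting


def find_ends(text, patterns):
    """Return the positions in text where some pattern ends."""
    trans, accepting = build_nfa(patterns)
    alphabet = set(text) | set("".join(patterns))
    start, dfa, dfa_accepting = subset_construction(trans, accepting, alphabet)
    state, ends = start, []
    for pos, ch in enumerate(text):
        state = dfa[state][ch]        # exactly one move per character
        if state in dfa_accepting:
            ends.append(pos)
    return ends


print(find_ends("lococonut", ["coconut", "locos"]))
```

Once the table is built, matching needs a single lookup per input character, at the cost of constructing the deterministic states up front.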
6
Regular Expressions
Two different (but related) uses of the term:
Expressions that define all and only the regular languages: (aa ∪ ab ∪ ba ∪ bb)*
Expressions in a useful pattern language:
Matching IP addresses: s!([0-9]+(\.[0-9]+){3})!$1!
Finding doubled words: \
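The two pattern-language tasks named on this slide can be approximated with Python's `re` module. This is a sketch under assumptions: the `<ip>` tag used in the replacement and the doubled-word pattern are illustrative choices, not the patterns from the original slide, and the function names are hypothetical.

```python
import re

# Four dot-separated runs of digits, as in the slide's IP address pattern.
IP_RE = re.compile(r"([0-9]+(\.[0-9]+){3})")

# A word, whitespace, then the same word again (assumed pattern, case-sensitive).
DOUBLED_RE = re.compile(r"\b([A-Za-z]+)\s+\1\b")


def tag_ips(text):
    """Wrap each IP-like string in a hypothetical <ip> tag."""
    return IP_RE.sub(r"<ip>\1</ip>", text)


def doubled_words(text):
    """Return each word that is immediately repeated."""
    return [m.group(1) for m in DOUBLED_RE.finditer(text)]


print(tag_ips("host 192.168.0.1 up"))
print(doubled_words("this is the the test"))
```

The backreference `\1` is what takes the doubled-word pattern beyond the formal regular expressions of the first sense: it requires the second word to equal whatever the first group matched.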
7
REs: Syntax and Semantics
Syntax
The regular expressions over an alphabet Σ are all strings over the alphabet Σ ∪ {(, ), ∅, ∪, *} that can be obtained as follows:
1. ∅ and each member of Σ is a regular expression.
2. If α, β are regular expressions, then so is αβ.
3. If α, β are regular expressions, then so is α ∪ β.
4. If α is a regular expression, then so is α*.
5. If α is a regular expression, then so is (α).
6. Nothing else is a regular expression.
8
REs: Syntax and Semantics
Regular expressions define languages via a semantic interpretation function we'll call L:
1. L(∅) = ∅ and L(a) = {a} for each a ∈ Σ.
2. If α, β are regular expressions, then L(αβ) = L(α) L(β) = all strings that can be formed by concatenating to some string from L(α) some string from L(β).
3. If α, β are regular expressions, then L(α ∪ β) = L(α) ∪ L(β).
4. If α is a regular expression, then L(α*) = L(α)*.
5. If (α) is a regular expression, then L((α)) = L(α).
A language is regular if and only if it can be described by a regular expression.
Note: L is compositional.
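Compositionality means that the language of an expression is computed only from the languages of its parts. The sketch below makes that concrete with one case per clause of the definition. It rests on assumptions: expressions are encoded as nested tuples (a hypothetical encoding in which parentheses are implicit in the nesting), and because L may be infinite, only strings up to `max_len` are enumerated.

```python
def lang(regex, max_len):
    """Return the strings of L(regex) whose length is at most max_len.

    Encoding (an assumption of this sketch):
      ("empty",)  ("sym", a)  ("cat", r, s)  ("union", r, s)  ("star", r)
    """
    kind = regex[0]
    if kind == "empty":                       # L(empty set) is the empty set
        return set()
    if kind == "sym":                         # L(a) = {a}
        return {regex[1]} if len(regex[1]) <= max_len else set()
    if kind == "cat":                         # concatenate the two languages
        left, right = lang(regex[1], max_len), lang(regex[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "union":                       # union of the two languages
        return lang(regex[1], max_len) | lang(regex[2], max_len)
    if kind == "star":                        # Kleene star of the language
        base = lang(regex[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:                       # extend until nothing new fits
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError("unknown node: %r" % (kind,))


# (aa U ab U ba U bb)* from the earlier slide, built bottom-up.
a, b = ("sym", "a"), ("sym", "b")
pairs = [("cat", x, y) for x in (a, b) for y in (a, b)]
union = pairs[0]
for p in pairs[1:]:
    union = ("union", union, p)
EVEN_AB = ("star", union)

print(sorted(lang(EVEN_AB, 2)))
```

Each branch calls `lang` only on the immediate subexpressions, mirroring the way each clause of L refers only to L of the parts.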
9
The Importance of Compositionality What is the meaning of: Mary cooked the yujutes. Mary tyroked the yujutes.
10
Morphological Analysis Read J & M Chapter 3 Recognize words Parse words
11
Morphological Parsing Goal: to represent the facts declaratively so that a single representation can be used for both recognition and generation. Note: ^ marks morpheme boundaries. # marks word boundaries.
12
From Lexical to Intermediate Note: All the transducers in the book are described as lexical:intermediate, but they can run the other direction.
13
Where Did reg-noun-stem Come From?
14
We Can Cascade or Compose
15
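From Intermediate to Surface
For text, we need spelling rules.
ε → e / {x, s, z} ^ ___ s #
Read this as: “Replace ε with e in the context after the /.” That is, insert an e when a morpheme ending in x, s, or z is followed by the suffix s at the end of a word.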
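The effect of this spelling rule can be imitated with a regular expression substitution. This is only a sketch of the rule's behavior, not the transducer the slides go on to build: the function name `intermediate_to_surface` is hypothetical, and only this one rule is handled.

```python
import re


def intermediate_to_surface(form):
    """Map an intermediate form such as 'fox^s#' to a surface form.

    ^ marks a morpheme boundary and # marks a word boundary.
    """
    # E-insertion: after x, s, or z at a morpheme boundary, before a final s.
    form = re.sub(r"([xsz])\^(?=s#)", r"\1^e", form)
    # Boundary symbols do not appear in the surface string.
    return form.replace("^", "").replace("#", "")


print(intermediate_to_surface("fox^s#"))
print(intermediate_to_surface("cat^s#"))
```

The lookahead `(?=s#)` checks the right-hand context without consuming it, which corresponds to the material after the blank in the rule.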
16
Turning the Rule into a Transducer foxes xerox fox#sat
17
Disambiguation - Local Local ambiguities: asses # s# luxury
18
Disambiguation - Harder Sometimes additional knowledge is necessary: foxes: fox +N + PL or fox +V +SG Can we think of nouns that cannot also be verbs?
19
Search
For FSMs, we can build a deterministic machine. In other cases, we will have to search:
Depth-first
Breadth-first (chart parsing)
Example: I hit the boy with a bat.
[Figure: parse tree for the example sentence; node labels S, VP, NP, PP, V, PR, N, DET, PREP]
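The difference between the two search orders can be seen in how pending choices are stored. The sketch below explores the configurations of a small nondeterministic machine and is not a chart parser: the function name `accepts` and the toy machine (which accepts strings ending in "ab") are assumptions made for illustration.

```python
from collections import deque


def accepts(transitions, start, accepting, text, depth_first=True):
    """Search the (state, position) configurations of a nondeterministic machine.

    transitions maps (state, char) to a list of possible next states.
    Taking the newest configuration first gives depth-first search;
    taking the oldest first gives breadth-first search.
    """
    frontier = deque([(start, 0)])
    seen = set()
    while frontier:
        state, pos = frontier.pop() if depth_first else frontier.popleft()
        if (state, pos) in seen:
            continue                      # already explored this configuration
        seen.add((state, pos))
        if pos == len(text):
            if state in accepting:
                return True
            continue
        for nxt in transitions.get((state, text[pos]), []):
            frontier.append((nxt, pos + 1))
    return False


# Toy machine: state 0 loops on a and b, and may guess that "ab" starts here.
ENDS_IN_AB = {(0, "a"): [0, 1], (0, "b"): [0], (1, "b"): [2]}

print(accepts(ENDS_IN_AB, 0, {2}, "aab", depth_first=True))
print(accepts(ENDS_IN_AB, 0, {2}, "aab", depth_first=False))
```

Both orders reach the same answer; they differ in which alternatives are explored first and in how many pending configurations are held at once.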