Formal Language, chapter 4, slide 1Copyright © 2007 by Adam Webber Chapter Four: DFA Applications.

Slides:



Advertisements
Similar presentations
I/O Basics 12 January 2014Smitha N. Pai, CSE Dept.1.
Advertisements

1 Streams and Input/Output Files Part 2. 2 Files and Exceptions When creating files and performing I/O operations on them, the systems generates errors.
1 StringBuffer & StringTokenizer Classes Chapter 5 - Other String Classes.
Introduction to Java 2 Programming Lecture 7 IO, Files, and URLs.
Lecture 15: I/O and Parsing
Chapter 10 Ch 1 – Introduction to Computers and Java Streams and File IO 1.
MOD III. Input / Output Streams Byte streams Programs use byte streams to perform input and output of 8-bit bytes. This Stream handles the 8-bit.
Today’s lecture Review of Chapter 1 Go over homework exercises for chapter 1.
Chapter Three: Closure Properties for Regular Languages
Compiler construction in4020 – lecture 2 Koen Langendoen Delft University of Technology The Netherlands.
Chapter 1: Computer Systems
CSc 453 Lexical Analysis (Scanning)
Lexical Analysis (4.2) Programming Languages Hiram College Ellen Walker.
Formal Language, chapter 9, slide 1Copyright © 2007 by Adam Webber Chapter Nine: Advanced Topics in Regular Languages.
Chapter Two: Finite Automata
Unit 201 FILE IO Types of files Opening a text file for reading Reading from a text file Opening a text file for writing/appending Writing/appending to.
Fall 2006Costas Busch - RPI1 Deterministic Finite Automata And Regular Languages.
1 Chapter 4 Language Fundamentals. 2 Identifiers Program parts such as packages, classes, and class members have names, which are formally known as identifiers.
1 Scanning Aaron Bloomfield CS 415 Fall Parsing & Scanning In real compilers the recognizer is split into two phases –Scanner: translate input.
Grammars and Parsing. Sentence  Noun Verb Noun Noun  boys Noun  girls Noun  dogs Verb  like Verb  see Grammars Grammar: set of rules for generating.
1 Course Lectures Available on line:
Abstract Data Types (ADTs) and data structures: terminology and definitions A type is a collection of values. For example, the boolean type consists of.
Recitation 1 CS0445 Data Structures Mehmud Abliz.
Prepared by : A.Alzubair Hassan Kassala university Dept. Computer Science Lecture 2 I/O Streams 1.
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
Introduction to Programming David Goldschmidt, Ph.D. Computer Science The College of Saint Rose Java Fundamentals (Comments, Variables, etc.)
Basic Java Programming CSCI 392 Week Two. Stuff that is the same as C++ for loops and while loops for (int i=0; i
JAVA Tokens. Introduction A token is an individual element in a program. More than one token can appear in a single line separated by white spaces.
Geoff Holmes and Bernhard Pfahringer COMP206-08S General Programming 2.
Chapter 9 1 Chapter 9 – Part 1 l Overview of Streams and File I/O l Text File I/O l Binary File I/O l File Objects and File Names Streams and File I/O.
Fall 2006Costas Busch - RPI1 Deterministic Finite Automaton (DFA) Input Tape “Accept” or “Reject” String Finite Automaton Output.
D. M. Akbar Hussain: Department of Software & Media Technology 1 Compiler is tool: which translate notations from one system to another, usually from source.
8-1 Chapter 8: Arrays Arrays are objects that help us organize large amounts of information Today we will focuses on: –array declaration and use –bounds.
1 Week 12 l Overview of Streams and File I/O l Text File I/O Streams and File I/O.
1 Assignment #1 is due on Friday. Any questions?.
1 Prove the following languages over Σ={0,1} are regular by giving regular expressions for them: 1. {w contains two or more 0’s} 2. {|w| = 3k for some.
Data TypestMyn1 Data Types The type of a variable is not set by the programmer; rather, it is decided at runtime by PHP depending on the context in which.
1 Languages and Compilers (SProg og Oversættere) Lexical analysis.
CS 153: Concepts of Compiler Design October 10 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CSc 453 Lexical Analysis (Scanning)
Fall 2002CS 150: Intro. to Computing1 Streams and File I/O (That is, Input/Output) OR How you read data from files and write data to files.
CMSC 330: Organization of Programming Languages Finite Automata NFAs  DFAs.
Java Language Basics By Keywords Keywords of Java are given below – abstract continue for new switch assert *** default goto * package.
Chapter 9 1 Chapter 9 – Part 2 l Overview of Streams and File I/O l Text File I/O l Binary File I/O l File Objects and File Names Streams and File I/O.
CSI 3125, Preliminaries, page 1 Java I/O. CSI 3125, Preliminaries, page 2 Java I/O Java I/O (Input and Output) is used to process the input and produce.
ICS3U_FileIO.ppt File Input/Output (I/O)‏ ICS3U_FileIO.ppt File I/O Declare a file object File myFile = new File("billy.txt"); a file object whose name.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 2: Lexical Analysis.
Overview of Previous Lesson(s) Over View  A token is a pair consisting of a token name and an optional attribute value.  A pattern is a description.
A data type in a programming language is a set of data with values having predefined characteristics.data The language usually specifies:  the range.
1 Arrays Chapter 8. Objectives You will be able to Use arrays in your Java programs to hold a large number of data items of the same type. Initialize.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
Chapter 2-II Scanning Sung-Dong Kim Dept. of Computer Engineering, Hansung University.
Object Oriented Programming Lecture 2: BallWorld.
Lecture Three: Finite Automata Finite Automata, Lecture 3, slide 1 Amjad Ali.
Chapter 2 Scanning – Part 1 June 10, 2018 Prof. Abdelaziz Khamis.
CSc 453 Lexical Analysis (Scanning)
CS 153: Concepts of Compiler Design October 17 Class Meeting
CSc 453 Lexical Analysis (Scanning)
CHAPTER 5 JAVA FILE INPUT/OUTPUT
I/O Basics.
Primitive Types Vs. Reference Types, Strings, Enumerations
Lexical Analysis - An Introduction
Introduction to Java Programming
An overview of Java, Data types and variables
Deterministic Finite Automata And Regular Languages Prof. Busch - LSU.
Chapter Four: DFA Applications
Chapter Two: Finite Automata
Lexical Analysis - An Introduction
CSc 453 Lexical Analysis (Scanning)
Presentation transcript:

Formal Language, chapter 4, slide 1Copyright © 2007 by Adam Webber Chapter Four: DFA Applications

Formal Language, chapter 4, slide 2Copyright © 2007 by Adam Webber We have seen how DFAs can be used to define formal languages. In addition to this formal use, DFAs have practical applications. DFA- based pieces of code lie at the heart of many commonly used computer programs.

Formal Language, chapter 4, slide 3Copyright © 2007 by Adam Webber Outline 4.1 DFA Applications 4.2 A DFA-Based Text Filter in Java 4.3 Table-Driven Alternatives

Formal Language, chapter 4, slide 4Copyright © 2007 by Adam Webber DFA Applications Programming language processing –Scanning phase: dividing source file into "tokens" (keywords, identifiers, constants, etc.), skipping whitespace and comments Command language processing –Typed command languages often require the same kind of treatment Text pattern matching –Unix tools like awk, egrep, and sed, mail systems like ProcMail, database systems like MySQL, and many others

Formal Language, chapter 4, slide 5Copyright © 2007 by Adam Webber More DFA Applications Signal processing –Speech processing and other signal processing systems use finite state models to transform the incoming signal Controllers for finite-state systems –Hardware and software –A wide range of applications, from industrial processes to video games

Formal Language, chapter 4, slide 6Copyright © 2007 by Adam Webber Outline 4.1 DFA Applications 4.2 A DFA-Based Text Filter in Java 4.3 Table-Driven Alternatives

Formal Language, chapter 4, slide 7Copyright © 2007 by Adam Webber The Mod3 DFA, Revisited We saw that this DFA accepts a language of binary strings that encode numbers divisible by 3 We will implement it in Java We will need one more state, since our natural alphabet is Unicode, not {0,1}

Formal Language, chapter 4, slide 8Copyright © 2007 by Adam Webber The Mod3 DFA, Modified Here,  is the Unicode character set The DFA enters the non-accepting trap state on any symbol other than 0 or 1

Formal Language, chapter 4, slide 9Copyright © 2007 by Adam Webber /** * A deterministic finite-state automaton that * recognizes strings that are binary * representations of integers that are divisible * by 3. Leading zeros are permitted, and the * empty string is taken as a representation for 0 * (along with "0", "00", and so on). */ public class Mod3 { /* * Constants q0 through q3 represent states, and * a private int holds the current state code. */ private static final int q0 = 0; private static final int q1 = 1; private static final int q2 = 2; private static final int q3 = 3; private int state;

Formal Language, chapter 4, slide 10Copyright © 2007 by Adam Webber static private int delta(int s, char c) { switch (s) { case q0: switch (c) { case '0': return q0; case '1': return q1; default: return q3; } case q1: switch (c) { case '0': return q2; case '1': return q0; default: return q3; } case q2: switch (c) { case '0': return q1; case '1': return q2; default: return q3; } default: return q3; } }

Formal Language, chapter 4, slide 11Copyright © 2007 by Adam Webber /** * Reset the current state to the start state. */ public void reset() { state = q0; } /** * Make one transition on each char in the given * string. in the String to use */ public void process(String in) { for (int i = 0; i < in.length(); i++) { char c = in.charAt(i); state = delta(state, c); } }

Formal Language, chapter 4, slide 12Copyright © 2007 by Adam Webber /** * Test whether the DFA accepted the string. true iff the final state was accepting */ public boolean accepted() { return state==q0; } } Usage example: Mod3 m = new Mod3(); m.reset(); m.process(s); if (m.accepted())...

Formal Language, chapter 4, slide 13Copyright © 2007 by Adam Webber import java.io.*; /** * A Java application to demonstrate the Mod3 class by * using it to filter the standard input stream. Those * lines that are accepted by Mod3 are echoed to the * standard output. */ public class Mod3Filter { public static void main(String[] args) throws IOException { Mod3 m = new Mod3(); // the DFA BufferedReader in = // standard input new BufferedReader(new InputStreamReader(System.in));

Formal Language, chapter 4, slide 14Copyright © 2007 by Adam Webber // Read and echo lines until EOF. String s = in.readLine(); while (s!=null) { m.reset(); m.process(s); if (m.accepted()) System.out.println(s); s = in.readLine(); } } }

Formal Language, chapter 4, slide 15Copyright © 2007 by Adam Webber C:\>type numbers C:\>java Mod3Filter < numbers C:\>

Formal Language, chapter 4, slide 16Copyright © 2007 by Adam Webber Outline 4.1 DFA Applications 4.2 A DFA-Based Text Filter in Java 4.3 Table-Driven Alternatives

Formal Language, chapter 4, slide 17Copyright © 2007 by Adam Webber Making Delta A Table We might want to encode delta as a two- dimensional array Avoids method invocation overhead Then process could look like this: static void process(String in) { for (int i = 0; i < in.length(); i++) { char c = in.charAt(i); state = delta[state, c]; } }

Formal Language, chapter 4, slide 18Copyright © 2007 by Adam Webber Keeping The Array Small If delta[state,c] is indexed by state and symbol, it will be big: 4 by 65536! And almost all entries will be 3 Instead, we could index it by state and integer, 0 or 1 Then we could use exception handling when the array index is out of bounds

Formal Language, chapter 4, slide 19Copyright © 2007 by Adam Webber /* * The transition function represented as an array. * The next state from current state s and character c * is at delta[s,c-'0']. */ static private int[][] delta = {{q0,q1},{q2,q0},{q1,q2},{q3,q3}}; /** * Make one transition on each char in the given * string. in the String to use */ public void process(String in) { for (int i = 0; i < in.length(); i++) { char c = in.charAt(i); try { state = delta[state][c-'0']; } catch (ArrayIndexOutOfBoundsException ex) { state = q3; } } }

Formal Language, chapter 4, slide 20Copyright © 2007 by Adam Webber Tradeoffs Function or table? Truncated table or full table? –By hand, a truncated table is easier –Automatically generated systems generally produce the full table, so the same process can be used for different DFAs Table representation –We used an int for every entry: wasteful! –Could have used a byte, or even just two bits –Time/space tradeoff: table compression saves space but slows down access