Week 13 - Wednesday
What did we talk about last time? Exam 3. Before review: graphing functions, rules for manipulating asymptotic bounds, and computing bounds for running time functions.
Ten people are marooned on a deserted island. During their first day they gather many coconuts and put them all in a community pile. They are so tired that they decide to divide them into ten equal piles the next morning. That night one castaway wakes up hungry and decides to take his share early. After dividing up the coconuts he finds he is one coconut short of ten equal piles. He notices a monkey holding one coconut. He tries to take the monkey's coconut so that the total is evenly divisible by 10. However, when he tries to take it, the monkey hits him on the head with it, killing him. Later, another castaway wakes up hungry and decides to take his share early. On the way to the coconuts he finds the body of the first castaway and realizes that he is now entitled to 1/9 of the total pile. After dividing them up into nine piles he is again one coconut short of an even division and tries to take the monkey's (slightly) bloody coconut. Again, the monkey hits the second man on the head and kills him. Each of the remaining castaways goes through the same process, until the 10th person to wake up realizes that the entire pile is his. What is the smallest number of coconuts in the original pile (ignoring the monkey's)?
Computer science grew out of a lot of different pieces: mathematics, engineering, and linguistics. Describing an algorithm precisely requires that it be framed in terms of some formal language with exact rules.
We say that a language is a set of strings. A string is an ordered n-tuple of elements of an alphabet Σ, or the empty string ε (which has no characters). An alphabet Σ is a finite set of characters.
Let alphabet Σ = {a, b}. Define a language L1 over Σ to be the set of all strings that begin with the character a and have length at most three characters. Write out L1. A palindrome is a string which stays the same if the order of its characters is reversed. Define a language L2 over Σ to be the set of all palindromes made up of characters from Σ. Write 10 strings in L2.
Let Σ be some alphabet. For any nonnegative integer n, let
▪ Σ^n be the set of all strings over Σ that have length n
▪ Σ^+ be the set of all strings over Σ that have length at least 1
▪ Σ* be the set of all strings over Σ
Σ* is called the Kleene closure of Σ, and the * operator is often called the Kleene star.
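The three sets above can be enumerated mechanically. The following is a minimal Python sketch, assuming a small example alphabet {a, b}; the function names `sigma_n`, `sigma_star`, and `sigma_plus` are made up for illustration. Since Σ^+ and Σ* are infinite, they are produced lazily, shortest strings first.

```python
from itertools import count, islice, product

def sigma_n(alphabet, n):
    """Return Σ^n: the set of all strings over `alphabet` of length exactly n."""
    return {"".join(chars) for chars in product(sorted(alphabet), repeat=n)}

def sigma_star(alphabet):
    """Lazily yield Σ*, ordered by length and alphabetically within a length."""
    for n in count(0):
        for chars in product(sorted(alphabet), repeat=n):
            yield "".join(chars)

def sigma_plus(alphabet):
    """Lazily yield Σ^+, which is Σ* without the empty string."""
    return islice(sigma_star(alphabet), 1, None)

SIGMA = {"a", "b"}  # example alphabet, chosen only for this sketch
print(sorted(sigma_n(SIGMA, 2)))            # ['aa', 'ab', 'ba', 'bb']
print(list(islice(sigma_star(SIGMA), 7)))   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
print(list(islice(sigma_plus(SIGMA), 3)))   # ['a', 'b', 'aa']
```

Note that Σ^0 contains exactly one string, the empty string, so `sigma_n(SIGMA, 0)` is `{""}` rather than the empty set.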
Let alphabet Σ = {x, y, z}. Find Σ^0, Σ^1, and Σ^2. What is A = Σ^0 ∪ Σ^1? What is B = Σ^1 ∪ Σ^2? How would you describe these sets and the set A ∪ B in words? Describe a systematic way of writing out Σ^+. How would you have to change your system to write out Σ*?
Let Σ be a finite alphabet. Given strings x and y over Σ, the concatenation of x and y is the string made by writing x with y appended afterwards. With languages L and L' over Σ, we can define the following new languages:
Concatenation of L and L', written LL'
▪ LL' = { xy | x ∈ L and y ∈ L' }
Union of L and L', written L ∪ L'
▪ L ∪ L' = { x | x ∈ L or x ∈ L' }
Kleene closure of L, written L*
▪ L* = { x | x is a concatenation of any finite number of strings in L }
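For finite languages, these operations translate almost directly into Python set operations. The sketch below is illustrative only: the function names and the example languages are assumptions, and because L* is usually infinite, `kleene_up_to` returns only the strings of L* up to a chosen length.

```python
def concat(L, M):
    """LL' = { xy | x in L and y in L' }."""
    return {x + y for x in L for y in M}

def union(L, M):
    """L ∪ L' = { x | x in L or x in L' }."""
    return L | M

def kleene_up_to(L, max_len):
    """Return the strings of L* whose length is at most max_len."""
    result = {""}      # the concatenation of zero strings is the empty string
    frontier = {""}    # strings discovered in the previous round
    while frontier:
        frontier = {
            x + y for x in frontier for y in L if len(x + y) <= max_len
        } - result
        result |= frontier
    return result

# Example languages, chosen only for this sketch.
L = {"ab", "c"}
M = {"", "d"}
print(sorted(concat(L, M)))   # ['ab', 'abd', 'c', 'cd']
print(sorted(union(L, M)))    # ['', 'ab', 'c', 'd']
print(sorted(kleene_up_to(L, 3), key=lambda s: (len(s), s)))
# ['', 'c', 'ab', 'cc', 'abc', 'cab', 'ccc']
```

The empty string always belongs to L*, even when L itself does not contain it, because it is the concatenation of zero strings from L.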
Let alphabet Σ = {a, b}. Let L1 be the set of all strings consisting of an even number of a's (including the empty string). Let L2 = {b, bb, bbb}. Find L1L2, L1 ∪ L2, and (L1 ∪ L2)*.
It's getting annoying trying to describe infinite languages using ellipses. A notation called regular expressions lets us express languages precisely and compactly. Given a finite alphabet Σ, we can define regular expressions recursively:
I. Base: The empty set ∅, the empty string ε, and any individual character in Σ are regular expressions
II. Recursion: If r and s are regular expressions over Σ, then the following are too:
a) Concatenation: (rs)
b) Alternation: (r | s)
c) Kleene star: (r*)
III. Restriction: Nothing else is a regular expression
For a finite alphabet Σ, the language L(r) defined by a regular expression r is as follows:
Base: L(∅) = ∅, L(ε) = {ε}, L(a) = {a} for every a ∈ Σ
Recursion: If L(r) and L(r') are the languages defined by the regular expressions r and r' over Σ, then
▪ L(rr') = L(r)L(r')
▪ L(r | r') = L(r) ∪ L(r')
▪ L(r*) = (L(r))*
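The recursive definition of L(r) can be followed case by case in code. This is a sketch under stated assumptions, not a standard library facility: regular expressions are represented as nested tuples with made-up tags (`"empty"`, `"eps"`, `"char"`, `"cat"`, `"alt"`, `"star"`), and the function returns only the strings of L(r) up to a given length, since L(r) may be infinite.

```python
def language(r, max_len):
    """Return the strings in L(r) that have length at most max_len."""
    kind = r[0]
    if kind == "empty":                       # L(∅) = ∅
        return set()
    if kind == "eps":                         # L(ε) = {ε}
        return {""}
    if kind == "char":                        # L(a) = {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "alt":                         # L(r | r') = L(r) ∪ L(r')
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "cat":                         # L(rr') = L(r)L(r')
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":                        # L(r*) = (L(r))*
        base = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {
                x + y for x in frontier for y in base if len(x + y) <= max_len
            } - result
            result |= frontier
        return result
    raise ValueError(f"unknown regular expression node: {kind!r}")

# Example expression a(b | c)*, chosen only for this sketch.
r = ("cat", ("char", "a"), ("star", ("alt", ("char", "b"), ("char", "c"))))
print(sorted(language(r, 2)))   # ['a', 'ab', 'ac']
```

Truncating each sub-language to `max_len` loses nothing, because any string of length at most `max_len` can only be built from pieces that are themselves no longer than `max_len`.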
Let Σ = {a, b, c}. Let language L = a | (b | c)* | (ab)*. Write 5 strings in L. Let language M = ab*(c | ε). Write 5 strings in M.
For the sake of consistency, regular expressions obey a particular order of precedence: * has the highest precedence, concatenation is the next highest, and alternation is the lowest. Parentheses can be omitted if there is no ambiguity. Write (a((bc)*)) with as few parentheses as possible. Write a | b* c, using parentheses to mark the precedence of each operation.
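The precedence rules can be checked experimentally. The sketch below uses Python's `re` module, which follows the same ordering for these three operators; the helper `matches` is a made-up name, and `re.fullmatch` is used so that the whole string must belong to the language.

```python
import re

def matches(pattern, s):
    """True when the whole string s belongs to the language of pattern."""
    return re.fullmatch(pattern, s) is not None

# Star binds tighter than concatenation: ab* means a(b*), not (ab)*.
print(matches("ab*", "abbb"))     # True
print(matches("ab*", "abab"))     # False
print(matches("(ab)*", "abab"))   # True

# Concatenation binds tighter than alternation: a|bc means a|(bc), not (a|b)c.
print(matches("a|bc", "bc"))      # True
print(matches("a|bc", "ac"))      # False
print(matches("(a|b)c", "ac"))    # True
```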
As before, let Σ = {a, b}. Can you describe (a | b)* in another way? What about (ε | a* | b*)*? Given that L = a*b(a | b)*, write 5 strings that belong to L. Let M = a* | (ab)*. Which of the following belong to M?
▪ a
▪ b
▪ aaaa
▪ abba
▪ ababab
Let Σ = {0, 1}. Find regular expressions for the following languages:
▪ The language of all strings of 0's and 1's that have even length and in which the 0's and 1's alternate
▪ The language consisting of all strings of 0's and 1's with an even number of 1's
▪ The language consisting of all strings of 0's and 1's that do not contain two consecutive 1's
▪ The language that gives all binary numbers written in normal form (that is, without leading zeroes, and the empty string is not allowed)
Regular expressions are used in some programming languages (notably Perl) and in grep and other find-and-replace tools. The notation is generally extended to make it a little easier, as in the following:
▪ [A-C] means any character in that range, so [A-C] means (A | B | C)
▪ [0-9] means (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
▪ [ABC] means (A | B | C)
▪ ABC means the concatenation of A, B, and C
▪ A dot stands for any letter: A.C could match AxC, A&C, ABC
▪ ^ at the start of a bracket means NOT, thus [^D-Z] means not the characters D through Z
▪ Repetitions: R? means 0 or 1 repetitions of R, R* means 0 or more repetitions of R, R+ means 1 or more repetitions of R
Notations vary and have considerable complexity. Use this notation to describe the regular expression for legal C++ identifiers.
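Python's `re` module is one tool that supports this extended notation, so the pieces can be tried out directly. This is a small sketch: the helper `matches` and the pattern `ROOM_CODE` are hypothetical and exist only for illustration. One difference worth noting is that in Python a dot matches any character except a newline, not just letters.

```python
import re

def matches(pattern, s):
    """True when the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# Character ranges and negated ranges.
print(matches("[A-C]", "B"))     # True
print(matches("[A-C]", "D"))     # False
print(matches("[^D-Z]", "A"))    # True
print(matches("[^D-Z]", "Q"))    # False

# The dot stands for any single character (except a newline in Python).
print(matches("A.C", "A&C"))     # True

# Hypothetical pattern: one letter A through C, one or more digits, optional x.
ROOM_CODE = "[A-C][0-9]+x?"
print(matches(ROOM_CODE, "B42"))    # True
print(matches(ROOM_CODE, "B42x"))   # True
print(matches(ROOM_CODE, "B"))      # False: + requires at least one digit
print(matches(ROOM_CODE, "D42"))    # False: D is outside [A-C]
```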
A finite-state automaton is an idealized machine composed of five objects:
1. A finite set I, called the input alphabet, of input symbols
2. A set S of states the automaton can be in
3. A designated state s0 called the initial state
4. A designated set of states called the set of accepting states
5. A next-state function N: S × I → S that maps a current state with current input to the next state
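The five objects map naturally onto a small record type. The following Python sketch is one possible encoding, not a standard one: the class `FSA`, its field names, and the example automaton (which accepts strings of 0's and 1's containing at least one 1) are all assumptions made for illustration. The next-state function N is stored as a dictionary keyed by (state, symbol) pairs.

```python
from dataclasses import dataclass

@dataclass
class FSA:
    input_alphabet: set   # I, the input symbols
    states: set           # S, the states
    initial: str          # s0, the initial state
    accepting: set        # the accepting states, a subset of S
    next_state: dict      # N, stored as {(state, symbol): state}

    def is_well_formed(self):
        """Check that N is a function from S x I to S and that the initial
        and accepting states all belong to S."""
        domain = {(s, i) for s in self.states for i in self.input_alphabet}
        total = all(self.next_state.get(pair) in self.states for pair in domain)
        no_extra = set(self.next_state) <= domain
        return (
            total
            and no_extra
            and self.initial in self.states
            and self.accepting <= self.states
        )

# Hypothetical example: accepts strings that contain at least one 1.
has_a_one = FSA(
    input_alphabet={"0", "1"},
    states={"q0", "q1"},
    initial="q0",
    accepting={"q1"},
    next_state={
        ("q0", "0"): "q0", ("q0", "1"): "q1",
        ("q1", "0"): "q1", ("q1", "1"): "q1",
    },
)
print(has_a_one.is_well_formed())          # True
print(has_a_one.next_state[("q0", "1")])   # q1
```

Storing N as a dictionary makes the requirement that N be defined for every state and every input symbol easy to check, which is what `is_well_formed` does.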
FSAs are often described with a state transition diagram. The starting state has an arrow. The accepting states are marked with circles. Each rule is represented by a labeled transition arrow. The following FSA represents a vending machine.
[Diagram: states 0¢, 25¢, 50¢, 75¢, $1, $1.25; transitions labeled quarter and half-dollar]
Consider this FSA: What are its states? What are its input symbols? What is the initial state of A? What are the accepting states of A? What is N(s1, 1)? What's a verbal description for the strings accepted?
[Diagram: FSA with states s0, s1, s2]
Consider the same FSA: We can also describe an FSA using an annotated next-state table. A next-state table shows what the transition is for each state for all possible inputs. An annotated next-state table also marks the initial state and accepting states. Find the annotated next-state table for this FSA.
[Diagram: FSA with states s0, s1, s2]
Consider the following annotated next-state table (one symbol marks the initial state, another marks the accepting states). Draw the corresponding state transition diagram.
      a  b  c
  U   Z  Y  Y
  V   V  V  V
  Y   Z  V  Y
  Z   Z  Z  Z
Consider this FSA again: Which state will be reached on the following inputs? i. 01 ii. iii. iv. What's a verbal description for the strings accepted?
[Diagram: FSA with states s0, s1, s2]
Let A be an FSA with a set of states S, set of input symbols I, and next-state function N: S × I → S. Let I* be the set of all strings over I. The eventual-state function N*: S × I* → S is the following: N*(s, w) = the state that A goes to if the symbols of w are input to A in sequence, starting with A in state s. All of this is just a notational convenience so that we have a way of talking about the state that a string will transition an FSA to. We say that w is accepted by A iff N*(s0, w) is an accepting state of A. The language of A is L(A) = { w ∈ I* | w is accepted by A }.
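The eventual-state function is just the next-state function applied once per input symbol, which is a left fold over the string. Below is a minimal Python sketch of that idea. The function names and the example automaton (which accepts strings of 0's and 1's that end in 1) are assumptions for illustration, and N is assumed to be stored as a dictionary keyed by (state, symbol) pairs.

```python
from functools import reduce

def eventual_state(next_state, s, w):
    """N*(s, w): the state reached from s after reading the symbols of w in order."""
    return reduce(lambda state, symbol: next_state[(state, symbol)], w, s)

def accepts(next_state, initial, accepting, w):
    """w is accepted iff N*(s0, w) is an accepting state."""
    return eventual_state(next_state, initial, w) in accepting

# Hypothetical FSA: state "yes" means the last symbol read was a 1.
N = {
    ("no", "0"): "no",  ("no", "1"): "yes",
    ("yes", "0"): "no", ("yes", "1"): "yes",
}
INITIAL = "no"
ACCEPTING = {"yes"}

print(eventual_state(N, INITIAL, "0110"))    # no
print(eventual_state(N, INITIAL, ""))        # no (the empty string leaves the state unchanged)
print(accepts(N, INITIAL, ACCEPTING, "011")) # True
print(accepts(N, INITIAL, ACCEPTING, "10"))  # False
```

For the empty string, the fold performs no steps, so N*(s, ε) = s for every state s.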
Design a finite-state automaton that accepts the set of all strings of 0's and 1's such that the number of 1's in the string is divisible by 3. Make a regular expression for this language. Design a finite-state automaton that accepts the set of all strings of 0's and 1's that contain exactly one 1. Make a regular expression for this language.
More on finite-state automata. Simplifying FSAs.
Read Chapter 12