Lecture 1 String and Language. String string is a finite sequence of symbols. For example, string ( s, t, r, i, n, g) CS4384 ( C, S, 4, 3, 8) 101001 (1,


1 Lecture 1 String and Language

2 String A string is a finite sequence of symbols. For example, string (symbols s, t, r, i, n, g), CS4384 (symbols C, S, 4, 3, 8), and 101001 (symbols 1, 0). Symbols are drawn from an alphabet. An alphabet is a finite set of symbols.

3 Examples of Alphabet {a, b, c, ..., x, y, z} (Roman alphabet); {0, 1, ..., 9} (decimal digits); {0, 1} (binary alphabet).

4 Length of a String The length of a string x, denoted by |x|, is the number of symbols in x. For example, |string| = 6, |CS5400| = 6, and |101001| = 6. The empty string, denoted by ε, is the string with no symbols, so |ε| = 0.

5 Equal Two strings $x_1 x_2 \cdots x_n$ and $y_1 y_2 \cdots y_m$ are equal if and only if (1) $n = m$ and (2) $x_i = y_i$ for all $i$. For example, 01 ≠ 010 and 1010 ≠ 1110.

6 Substring A string s is a substring of x if there exist strings y and z such that x = ysz. In particular, when x = sz (y = ε), s is called a prefix of x; when x = ys (z = ε), s is called a suffix of x. For example, CS is a prefix of CS5400 and 5400 is a suffix of CS5400.
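The definitions of prefix, suffix, and substring can be checked directly on Python strings. The sketch below is only an illustration; the function names `is_prefix`, `is_suffix`, and `is_substring` are assumptions introduced here, not part of the lecture.

```python
def is_prefix(s: str, x: str) -> bool:
    """True if x = s z for some string z."""
    return x[:len(s)] == s


def is_suffix(s: str, x: str) -> bool:
    """True if x = y s for some string y."""
    return len(s) <= len(x) and x[len(x) - len(s):] == s


def is_substring(s: str, x: str) -> bool:
    """True if x = y s z for some strings y and z."""
    # Try every split point i: y = x[:i], and s must be a prefix of the rest.
    return any(is_prefix(s, x[i:]) for i in range(len(x) + 1))


print(is_prefix("CS", "CS5400"))      # True
print(is_suffix("5400", "CS5400"))    # True
print(is_substring("S54", "CS5400"))  # True
print(is_substring("", "CS5400"))     # True: ε is a substring of every string
```

Every prefix and every suffix is also a substring, because y or z may be ε.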

7 Concatenation The concatenation of two strings x and y is the string xy, i.e., x followed by y. For example, CS5400 is the concatenation of CS and 5400. In particular, we write $xx = x^2$, $xxx = x^3$, $xxxx = x^4$, ..., and define $x^0 = \varepsilon$. For example, $101010 = (10)^3$ and $(10)^0 = \varepsilon$.

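8 Solve equation 011x = x011 If x = ε, the equation holds. If |x| = 1, there is no solution. If |x| = 2, there is no solution. If |x| ≥ 3, then x = 011y for some string y. Hence 011x = 011y011, so x = y011, and therefore 011y = y011. That is, y satisfies the same equation and is shorter than x, so by induction $x = (011)^k$ for some $k \ge 0$.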
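A brute-force search is a quick sanity check on this argument. The sketch below assumes x ranges over binary strings and only looks at short lengths; the function name `solutions` is hypothetical.

```python
from itertools import product


def solutions(max_len: int) -> list:
    """All binary strings x with |x| <= max_len satisfying 011x = x011."""
    found = []
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            x = "".join(symbols)
            if "011" + x == x + "011":
                found.append(x)
    return found


print(solutions(9))  # ['', '011', '011011', '011011011']
```

The search finds exactly the powers of 011 up to the length bound, which agrees with the derivation; it does not replace the proof, since it covers only finitely many lengths.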

9 Language A language is a set of strings. For example, {0, 1}, {all English words}, and $\{0^0, 0^1, 0^2, \ldots\}$ are all languages. The following are operations on sets and hence also on languages. Union: A ∪ B. Intersection: A ∩ B. Difference: A \ B (also written A - B when B ⊆ A). Complement: $\overline{A} = \Sigma^* - A$, where Σ* is the set of all strings over the alphabet Σ.

10 Concatenation of Languages Concatenation: AB = {ab | a ∈ A, b ∈ B}. For example, {0, 1}{1, 2} = {01, 02, 11, 12}. In particular, we write $A^1 = A$, $A^2 = AA$, ..., and define $A^0 = \{\varepsilon\}$.
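For finite languages, concatenation and powers can be computed directly with Python sets of strings. This is a minimal sketch; the helper names `concat` and `power` are assumptions made for illustration, and ε is represented by the empty Python string.

```python
def concat(A: set, B: set) -> set:
    """AB = {ab | a in A, b in B} for finite languages A and B."""
    return {a + b for a in A for b in B}


def power(A: set, k: int) -> set:
    """A^k, built by concatenating A with itself k times, starting from A^0 = {ε}."""
    result = {""}
    for _ in range(k):
        result = concat(result, A)
    return result


print(sorted(concat({"0", "1"}, {"1", "2"})))  # ['01', '02', '11', '12']
print(sorted(power({"0", "1"}, 2)))            # ['00', '01', '10', '11']
print(power(set(), 0))                         # {''}
```

Note that the empty language and {ε} behave differently: concatenating with the empty set gives the empty set, while concatenating with {ε} leaves a language unchanged.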

11 If AB = B for every language B, then A = {ε}. Proof. Choose B = {ε}. Then A = A{ε} = AB = B = {ε}. In particular, A is not empty and A cannot contain a nonempty string.

12 Examples For Σ = {0, 1}, $\Sigma^2 = \{00, 01, 10, 11\}$. ($\Sigma^k$ is the set of all strings of length k over Σ.) Therefore, $\Sigma^* = \Sigma^0 \cup \Sigma^1 \cup \Sigma^2 \cup \cdots$.
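The set of all strings of a given length can be enumerated with `itertools.product`. The sketch below assumes every symbol of the alphabet is a single character; the function names are hypothetical.

```python
from itertools import product


def strings_of_length(sigma: set, k: int) -> set:
    """All strings of length k over the alphabet sigma (the k-th power of sigma)."""
    return {"".join(p) for p in product(sorted(sigma), repeat=k)}


def strings_up_to(sigma: set, n: int) -> set:
    """Union of the powers 0..n of sigma: a finite slice of the set of all strings."""
    result = set()
    for k in range(n + 1):
        result |= strings_of_length(sigma, k)
    return result


print(sorted(strings_of_length({"0", "1"}, 2)))  # ['00', '01', '10', '11']
print(len(strings_up_to({"0", "1"}, 3)))         # 15 = 1 + 2 + 4 + 8
```

Only a finite slice can be materialized, because the full set of strings over a nonempty alphabet is infinite.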

13 Kleene Closure Kleene closure: $A^* = A^0 \cup A^1 \cup A^2 \cup \cdots$. Notation: $A^+ = A^1 \cup A^2 \cup A^3 \cup \cdots$.

14 A={grand, ε}, B={father, mother}. What is A*B? A*B={father, mother, grandfather, grandmother, …}
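Because a Kleene closure is usually infinite, a program can only list its members up to a length bound. The sketch below does that for a finite language and then reproduces the grandfather example; the names `concat`, `kleene_star_up_to`, `limit`, and `a_star_b` are assumptions introduced for illustration.

```python
def concat(A: set, B: set) -> set:
    """AB = {ab | a in A, b in B} for finite languages A and B."""
    return {a + b for a in A for b in B}


def kleene_star_up_to(A: set, max_len: int) -> set:
    """All strings in the Kleene closure of A whose length is at most max_len."""
    result = {""}      # the 0-th power contributes ε
    frontier = {""}    # strings discovered in the previous round
    while frontier:
        # Extend the newest strings by one more factor from A, keep the short new ones.
        frontier = {w for w in concat(frontier, A) if len(w) <= max_len} - result
        result |= frontier
    return result


A = {"grand", ""}
B = {"father", "mother"}
limit = 16
a_star_b = {w for w in concat(kleene_star_up_to(A, limit), B) if len(w) <= limit}
print(sorted(a_star_b, key=lambda w: (len(w), w)))
# ['father', 'mother', 'grandfather', 'grandmother',
#  'grandgrandfather', 'grandgrandmother']
```

The loop stops once a round adds no new string, which must happen because only finitely many strings fit under the length bound.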

15 What is ∅*? Here ∅ is the empty language.

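16 $A^* = A^+$ if and only if ε is in A. Since $A^* = \{\varepsilon\} \cup A^+$, the two are equal exactly when ε is in $A^+$. If ε is in A, then ε is in $A^+$. Hence $A^* = A^+$. If ε is not in A, then ε is not in $A^+$. Hence $A^* \ne A^+$.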

17 {0, 10}* is the language of strings that do not contain the substring 11 and do not end with 1. What is the language of strings that do not contain the substring 11 and end with 0? $\{0, 10\}^+$.
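Both characterizations can be tested exhaustively on short binary strings. The sketch below uses a small dynamic-programming membership test for the Kleene closure of a finite language; the names `in_star` and `check` are hypothetical, and the check covers only strings up to the given length.

```python
from itertools import product


def in_star(w: str, A: set) -> bool:
    """True if w is a concatenation of zero or more strings from the finite language A."""
    ok = [True] + [False] * len(w)   # ok[i]: the prefix w[:i] is in the closure
    for i in range(1, len(w) + 1):
        ok[i] = any(
            ok[i - len(a)] and w[i - len(a):i] == a
            for a in A
            if 0 < len(a) <= i
        )
    return ok[len(w)]


def check(max_len: int) -> bool:
    """Compare both claims against every binary string of length <= max_len."""
    A = {"0", "10"}
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            no_11 = "11" not in w
            if in_star(w, A) != (no_11 and not w.endswith("1")):
                return False
            # ε is not in A, so the positive closure is the closure minus ε.
            in_plus = w != "" and in_star(w, A)
            if in_plus != (no_11 and w.endswith("0")):
                return False
    return True


print(check(12))  # True
```

The only string that separates the two languages is ε: it has no substring 11 and does not end with 1, but it also does not end with 0.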

18 Puzzle How many strings of length at most 40 are in the following language ?

19 Lecture 2 Regular Language and Regular Expression.

20 Regular Languages The concept of regular languages over an alphabet Σ is defined recursively as follows: (1) The empty language ∅ is regular. (2) For every symbol a ∈ Σ, {a} is regular. (3) If A and B are regular languages, then A ∪ B, AB, and A* are regular. (4) Nothing else is a regular language.

21 {ε} is regular. Because the empty language ∅ is regular, ∅* = {ε} is regular.

22 For Σ = {0, 1}, {011} is regular. Since {0} and {1} are regular, {011} = {0}{1}{1} is regular. Remark: Every language containing only one string is regular.

23 {011, 100} is regular. Because {011} and {100} are regular, {011, 100} = {011} ∪ {100} is regular. Remark: Every finite language is regular. Remark: Every infinite regular language must be obtained with Kleene closure.

24 Operation Precedence Example: ({0}* ∪ {0}{1}{1}*){0}{0}{1}*. (1) Kleene closure has higher precedence than concatenation and union. (2) Concatenation has higher precedence than union.

25 The language of all binary strings starting with 01 is regular. Proof. Every string in this language has the form $01 x_1 \cdots x_n$ where $x_1 \cdots x_n$ ∈ {0,1}*. Therefore, the language can be written as {01}{0,1}* = ({0}{1})({0} ∪ {1})*, which is regular.

26 The language of all binary strings ending with 01 is regular. Proof. Every string in this language has the form $x_1 \cdots x_n 01$ where $x_1 \cdots x_n$ ∈ {0,1}*. Therefore, the language can be written as {0,1}*{01} = ({0} ∪ {1})*({0}{1}), which is regular.

27 The language of all binary strings having substring 01 is regular. Proof. Every string in this language has the form $x_1 \cdots x_n 01 y_1 \cdots y_m$ where $x_1 \cdots x_n$ and $y_1 \cdots y_m$ are in {0,1}*. Therefore, the language can be written as {0,1}*{01}{0,1}* = ({0} ∪ {1})*({0}{1})({0} ∪ {1})*, which is regular.

28 Question: Do you feel that the expressions for the regular sets in the examples above contain too many parentheses? Here is a simpler notation: the regular expression.

29 Regular Expression (1) ∅ is a regular expression of the empty language. (2) ε is a regular expression of {ε}. (3) For any symbol a, a is a regular expression of {a}. (4) If $r_A$ and $r_B$ are regular expressions of languages A and B, then $r_A + r_B$ is a regular expression of A ∪ B, $r_A r_B$ is a regular expression of AB, and $r_A^*$ is a regular expression of A*.

30 Examples 011 is a regular expression of {0}{1}{1}. 0+1 is a regular expression of {0, 1}. (0+1)* is a regular expression of {0, 1}*. Remark: $(0+1)^+$ is also considered to be a regular expression of $\{0, 1\}^+$.

31 The language of all binary strings starting with 01 has a regular expression 01(0+1)*. The language of all binary strings ending with 01 has a regular expression (0+1)*01. The language of all binary strings having substring 01 has a regular expression (0+1)*01(0+1)*.
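These three expressions can be tried out with Python's `re` module. Note one notational difference: the lecture writes union as `+`, while Python writes it as `|` (and uses `+` for "one or more"). The sketch below is an illustration only; the constant names and the helper `matches` are assumptions.

```python
import re

# Lecture notation on the right; Python spells union as "|".
STARTS_WITH_01 = re.compile(r"01(0|1)*")        # 01(0+1)*
ENDS_WITH_01 = re.compile(r"(0|1)*01")          # (0+1)*01
CONTAINS_01 = re.compile(r"(0|1)*01(0|1)*")     # (0+1)*01(0+1)*


def matches(pattern, w: str) -> bool:
    """True if the whole string w belongs to the language of the pattern."""
    return pattern.fullmatch(w) is not None


print(matches(STARTS_WITH_01, "0110"))  # True
print(matches(ENDS_WITH_01, "1101"))    # True
print(matches(CONTAINS_01, "1001"))     # True
print(matches(CONTAINS_01, "1100"))     # False: 1100 has no substring 01
```

`fullmatch` is used rather than `search` because a regular expression in the lecture's sense describes whole strings, not fragments of them.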

32 Induction Proof Because regular languages are defined recursively, we can prove that every regular language has a given property by proving the following: (1) ∅ has the property. (2) For any symbol a ∈ Σ, {a} has the property. (3) If A and B have the property, then A ∪ B, AB, and A* all have the property. This is a proof by induction: (1) and (2) form the basis step and (3) is the induction step.

33 For a string $x = x_1 x_2 \cdots x_n$, $x^R = x_n \cdots x_2 x_1$. For a language A, $A^R = \{x^R \mid x \in A\}$. Show that if A is regular, so is $A^R$. Proof. (1) $\emptyset^R = \emptyset$ is regular. (2) For any symbol a, $\{a\}^R = \{a\}$ is regular. (3) Suppose that for regular languages A and B, $A^R$ and $B^R$ are regular. Then $(A \cup B)^R = A^R \cup B^R$ is regular, $(AB)^R = B^R A^R$ is regular, and $(A^*)^R = (A^R)^*$ is regular.
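The three cases of this proof translate into a recursive procedure on the syntax of a regular expression. The sketch below is one possible encoding, not the lecture's: expressions are nested tuples with the hypothetical tags `"empty"`, `"sym"`, `"union"`, `"concat"`, and `"star"`, and `language` lists the strings of an expression only up to a length bound so that the two sides can be compared.

```python
def reverse(r):
    """Build an expression for the reversed language, following the induction proof."""
    kind = r[0]
    if kind in ("empty", "sym"):          # basis: both are their own reversal
        return r
    if kind == "union":                   # (A ∪ B)^R = A^R ∪ B^R
        return ("union", reverse(r[1]), reverse(r[2]))
    if kind == "concat":                  # (AB)^R = B^R A^R
        return ("concat", reverse(r[2]), reverse(r[1]))
    if kind == "star":                    # (A*)^R = (A^R)*
        return ("star", reverse(r[1]))
    raise ValueError(kind)


def language(r, max_len):
    """All strings of length <= max_len in the language of expression r."""
    kind = r[0]
    if kind == "empty":
        return set()
    if kind == "sym":
        return {r[1]} if len(r[1]) <= max_len else set()
    if kind == "union":
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":
        left, right = language(r[1], max_len), language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":
        base = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                   # add one more factor per round
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(kind)


zero, one = ("sym", "0"), ("sym", "1")
r = ("concat", ("concat", zero, one), ("star", ("union", zero, one)))  # 01(0+1)*
lhs = {w[::-1] for w in language(r, 6)}   # reverse every string of the language
rhs = language(reverse(r), 6)             # language of the reversed expression
print(lhs == rhs)  # True
```

For the example, `reverse(r)` encodes (0+1)*10, the strings ending with 10, which is the reversal of the strings starting with 01.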

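34 Find a regular expression for $\{x w x^R \mid x \in (0+1)^*, w \in (0+1)^*\}$. Since x may be ε, every binary string w is in the language: $\{x w x^R \mid x \in (0+1)^*, w \in (0+1)^*\} = (0+1)^*$.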

35 Find a regular expression for $\{x w x^R \mid x \in (0+1)^+, w \in (0+1)^*\}$. $\{x w x^R \mid x \in (0+1)^+, w \in (0+1)^*\} = 0(0+1)^*0 + 1(0+1)^*1$.
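The claimed equality can be checked by brute force on short strings. The sketch below tries every possible nonempty x directly from the definition and compares the result with the regular expression, written in Python's `re` syntax where union is `|`. The names `has_form_x_w_xr`, `FIRST_EQUALS_LAST`, and `agree` are assumptions introduced for illustration.

```python
import re
from itertools import product

# 0(0+1)*0 + 1(0+1)*1 in Python syntax.
FIRST_EQUALS_LAST = re.compile(r"0(0|1)*0|1(0|1)*1")


def has_form_x_w_xr(s: str) -> bool:
    """True if s = x w x^R for some nonempty x and some (possibly empty) w."""
    for k in range(1, len(s) // 2 + 1):
        x = s[:k]
        if x[::-1] == s[len(s) - k:]:   # the last k symbols must spell x reversed
            return True
    return False


def agree(max_len: int) -> bool:
    """Compare the definition and the regular expression on all short binary strings."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            s = "".join(symbols)
            if has_form_x_w_xr(s) != (FIRST_EQUALS_LAST.fullmatch(s) is not None):
                return False
    return True


print(agree(10))  # True
```

The check reflects why the expression is so simple: if any nonempty x works, then the first and last symbols of the string agree, and in that case a one-symbol x already works.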

36 Puzzle How many regular expressions can a language have?

