1
CSC 212 – Data Structures Lecture 34: Strings and Pattern Matching
2
Problem of the Day
You drive a bus from Rotterdam to Delft. At the 1st stop, 33 people get in. At the 2nd stop, 7 more people get in, and 11 passengers leave. At the 3rd stop, 5 people leave and 2 get in. After one hour, the bus arrives in Delft. What is the name of the driver?
Read the question: You are the driver!
3
Strings
Algorithmically, a String is just a sequence of concatenated data:
  “CSC212 STUDENTS IN DA HOUSE”
  “I can’t believe this is a String!”
  Java programs
  HTML documents
  Digitized images
  DNA sequences
4
Strings in Java
Java Strings are immutable
For String literals, Java maintains a pool that maps text to String objects
  Each time a String literal is used, the pool is checked
  If the text exists, Java uses the String object to which it is mapped
  Otherwise, it makes a new String and adds the text and object to the pool
  Strings built at run time (e.g., with new String(...)) join the pool only through intern()
Happens “under the hood”
Makes String work much like a primitive type
Also makes it cheap to reuse the same text many times
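As a minimal sketch of the pooling behavior described above, the Java snippet below compares String references with ==. The class name InternDemo and its method names are illustrative and not from the lecture. It assumes a standard JVM, where identical literals share one pooled object while new String(...) creates a separate one.

```java
// Illustrative sketch; the class and method names are hypothetical.
class InternDemo {
    // Two identical literals refer to the same pooled String object.
    static boolean literalsShareObject() {
        String a = "CSC212";
        String b = "CSC212";
        return a == b;
    }

    // new String(...) creates a distinct object that is not the pooled one.
    static boolean newStringSharesObject() {
        String a = "CSC212";
        String c = new String("CSC212");
        return a == c;
    }

    // intern() returns the pooled object holding the same text.
    static boolean internedSharesObject() {
        String a = "CSC212";
        String c = new String("CSC212");
        return a == c.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());   // true
        System.out.println(newStringSharesObject()); // false
        System.out.println(internedSharesObject());  // true
    }
}
```

Because == compares references rather than contents, equals() remains the safe way to compare the text of two Strings.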
5
String Terminology
A String is drawn from elements in an alphabet
  ASCII or Unicode
  Bits
  Pixels
  DNA bases
Substring P[i ... j] contains the characters from P[i] through P[j]
  A substring starting at rank 0 is called a prefix
  A substring ending at the string’s last rank is called a suffix
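The terminology above can be sketched in Java as follows. The class SubstringDemo and the helpers sub, isPrefix, and isSuffix are hypothetical names introduced for illustration. Note that P[i ... j] includes both ranks, while Java's substring takes an exclusive end index, so the helper adds 1.

```java
// Illustrative sketch; the class and helper names are hypothetical.
class SubstringDemo {
    // Returns P[i ... j], with both ranks inclusive as in the slide's notation.
    static String sub(String p, int i, int j) {
        return p.substring(i, j + 1); // Java's end index is exclusive
    }

    // A prefix is a substring that starts at rank 0.
    static boolean isPrefix(String candidate, String p) {
        return p.startsWith(candidate);
    }

    // A suffix is a substring that ends at the string's last rank.
    static boolean isSuffix(String candidate, String p) {
        return p.endsWith(candidate);
    }

    public static void main(String[] args) {
        String p = "I am the Lizard King!";
        System.out.println(sub(p, 2, 3));          // am
        System.out.println(isPrefix("I am", p));   // true
        System.out.println(isSuffix("King!", p));  // true
    }
}
```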
6
Suffixes and Prefixes
“I am the Lizard King!”

Prefixes                     Suffixes
“I”                          “!”
“I ”                         “g!”
“I a”                        “ng!”
“I am”                       “ing!”
…                            …
“I am the Lizard Kin”        “am the Lizard King!”
“I am the Lizard King”       “ am the Lizard King!”
“I am the Lizard King!”      “I am the Lizard King!”
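The two columns above can be generated mechanically. The sketch below lists every non-empty prefix and suffix of a string, shortest first; the class PrefixSuffixList and its method names are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; the class and method names are hypothetical.
class PrefixSuffixList {
    // All non-empty prefixes, from shortest to longest.
    static List<String> prefixes(String s) {
        List<String> result = new ArrayList<>();
        for (int len = 1; len <= s.length(); len++) {
            result.add(s.substring(0, len));
        }
        return result;
    }

    // All non-empty suffixes, from shortest to longest.
    static List<String> suffixes(String s) {
        List<String> result = new ArrayList<>();
        for (int start = s.length() - 1; start >= 0; start--) {
            result.add(s.substring(start));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(prefixes("King!")); // [K, Ki, Kin, King, King!]
        System.out.println(suffixes("King!")); // [!, g!, ng!, ing!, King!]
    }
}
```

A string of length n has n non-empty prefixes and n non-empty suffixes, and the whole string appears in both lists.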
7
Pattern Matching Problem
Given strings T and P, find the first substring of T matching P
  T is the “text”
  P is the “pattern”
Has many, many, many applications
  Search engines
  Database queries
  Biological research
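For reference, Java's standard library already answers this question through String.indexOf, which returns the rank where the first match begins or -1 when there is none. The wrapper class IndexOfDemo and the method firstMatch are hypothetical names used only to frame the example.

```java
// Illustrative sketch; the class and method names are hypothetical.
class IndexOfDemo {
    // Returns the rank in text where the first match of pattern starts, or -1.
    static int firstMatch(String text, String pattern) {
        return text.indexOf(pattern);
    }

    public static void main(String[] args) {
        String text = "I am the Lizard King!";
        System.out.println(firstMatch(text, "Lizard")); // 9
        System.out.println(firstMatch(text, "Queen"));  // -1
    }
}
```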
8
Brute-Force Approach
Common method of solving problems
  Easy to develop
  Often requires little coding
  Needs little brain power to figure out
Uses the computer’s speed for analysis
  Examines every possible option
  Can be painfully slow and use lots of memory
  Generally good only with small problems
9
Brute-Force Pattern Matching
Compare P with every substring of T until we either
  find a substring of T equal to P, or
  reject all possible substrings of T
If |P| = m and |T| = n, this takes O(nm) time
Worst case:
  T = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
  P = aaag
This is a common case for images and DNA data
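To make the O(nm) bound concrete, the sketch below counts character comparisons for a worst-case input of the same shape as the one above. The class WorstCaseCount, its helpers, and the text length of 20 are assumptions chosen for illustration; they are not the slide's exact values.

```java
// Illustrative sketch; the class name, helpers, and input sizes are assumptions.
class WorstCaseCount {
    // Counts the character comparisons a brute-force scan performs.
    static long countComparisons(String text, String pattern) {
        long comparisons = 0;
        for (int i = 0; i <= text.length() - pattern.length(); i++) {
            int j = 0;
            while (j < pattern.length()) {
                comparisons++;
                if (text.charAt(i + j) != pattern.charAt(j)) {
                    break; // mismatch: move on to the next starting rank
                }
                j++;
            }
            if (j == pattern.length()) {
                break; // full match found, so the scan stops
            }
        }
        return comparisons;
    }

    // Builds a string made of n copies of the letter 'a'.
    static String repeatA(int n) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < n; k++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // n = 20, m = 4: each of the 17 starting ranks costs 4 comparisons.
        System.out.println(countComparisons(repeatA(20), "aaag")); // 68
    }
}
```

Every starting rank matches three characters before failing on the fourth, so the total is (n - m + 1) * m comparisons, which grows as O(nm).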
10
Brute-Force Pattern Matching
Algorithm BruteForceMatch(String T, String P)
  // Check if each rank of T starts a matching substring
  for i ← 0 to T.length() - P.length()
    // Compare substring starting at T[i] with P
    j ← 0
    while j < P.length() && T.charAt(i + j) == P.charAt(j)
      j ← j + 1
    if j == P.length()
      return i   // Return 1st place in T we find P
  return -1      // No matching substring exists
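The pseudocode above translates almost line for line into Java. The following is one possible rendering, not necessarily the version used in the course; the class name BruteForceMatch and the method name match are assumptions.

```java
// Illustrative sketch; the class and method names are assumptions.
class BruteForceMatch {
    // Returns the first rank in text where pattern occurs, or -1 if it never does.
    static int match(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        // Check whether each rank of text starts a matching substring.
        for (int i = 0; i <= n - m; i++) {
            int j = 0;
            // Compare the substring starting at text[i] with pattern.
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // first place in text where pattern is found
            }
        }
        return -1; // no matching substring exists
    }

    public static void main(String[] args) {
        System.out.println(match("I am the Lizard King!", "Lizard")); // 9
        System.out.println(match("aaaaaaaa", "aaag"));                // -1
    }
}
```

With these loop bounds, a pattern longer than the text returns -1 without any comparisons, and an empty pattern matches at rank 0.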
11
Your Turn
Get back into groups and do the activity
12
Before Next Lecture…
Keep up with your reading!
  Cannot stress this enough
Get ready for Lab Mastery Exam
Start thinking about questions for Final