1 A pair of sometimes useful functions Function ord returns a character’s ordinance / character code (Unicode) Function chr returns the character with.


1 A pair of sometimes useful functions

Function ord returns a character’s ordinal, i.e. its character code (Unicode).
Function chr returns the character with the given character code.

    >>> ord('ff')
    Traceback (most recent call last):
      File " ", line 1, in ?
    TypeError: ord() expected a character, but string of length 2 found
    >>> ord('f')
    102
    >>> ord('.')
    46
    >>> chr(46)
    '.'
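The session above can be replayed as a small script. This is a minimal sketch written for Python 3 (the slides use a Python 2 interpreter); the variable names are only illustrative.

```python
# ord maps a single character to its Unicode code point;
# chr maps a code point back to the character.
code = ord('f')
char = chr(46)
round_trip = chr(ord('.'))   # chr undoes ord for a single character

# ord only accepts strings of length 1; anything longer raises TypeError.
try:
    ord('ff')
    raised = False
except TypeError:
    raised = True

print(code, char, round_trip, raised)
```

Because chr(ord(c)) returns c for any single character, the pair is handy for simple character arithmetic such as stepping through the alphabet.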

2 Danish Intelligence Agency

Memo
Concerning: incident where activists threw red paint at prime minister Anders Fogh Rasmussen
Task: improve electronic surveillance to avoid such incidents in the future

String searching using find

3 surveillance.py

- Magic: red text
- Find index of first occurrence of word starting at startindex
- Print substring around suspicious word without exceeding string
- Strings are immutable: manipulation methods return new strings
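The code of surveillance.py is not part of this transcript. The following is a minimal Python 3 sketch of the steps named on the slide, assuming hypothetical helper names (find_occurrences, context, is_suspicious) and a hypothetical margin parameter; it is not the original program, and it leaves out the red-text highlighting.

```python
def find_occurrences(text, word):
    """Return the start index of every occurrence of word in text."""
    indices = []
    start_index = 0
    while True:
        # find returns the index of the first occurrence at or after
        # start_index, or -1 if there is none
        index = text.find(word, start_index)
        if index == -1:
            return indices
        indices.append(index)
        start_index = index + 1


def context(text, index, word, margin=20):
    """Return the substring around the hit, clipped to the string bounds."""
    begin = max(0, index - margin)
    end = min(len(text), index + len(word) + margin)
    return text[begin:end]   # slicing builds a new string; text is unchanged


def is_suspicious(text, words):
    """True if every word in words occurs somewhere in text."""
    return all(text.find(word) != -1 for word in words)


memo = "George and Ringo attack him and throw the paint."
hits = find_occurrences(memo, "paint")
snippet = context(memo, hits[0], "paint", margin=10)
print(hits, repr(snippet))
```

Clipping with max and min is what keeps the printed window from running past either end of the string.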

4 surveillancetest.py

5 surveillancetest.py, output

    Not all words found, text okay
    All words found, text is suspicious!
    ting by Douglas Coupland was sold for 100.000.000
     a slight change of plans, the prime minister atten
    All words found, text is suspicious!
    fice. He hides the paint behind a plant. Tuesday m
     him and throw the paint. They keep attacking him,
    George and Ringo attack him and throw the paint.
    e paint. They keep attacking him until they're arr
    y're arrested. The attack should take place at 10a
    Here's the plan: Paul breaks into Christia
     the paint behind a plant. Tuesday morning before t

We find words containing a suspicious word: may be important

6 Parents Music Resource Center

Concerning: crude language in much of today’s music
Task: implement censorship to remove bad words

More string methods: splitlines, join, replace

7 censorship.py

- Split text into a list of lines
- In each line, replace each bad word with BEEP
- If any words were BEEPed, print line and play one beep per word
- Join censored lines with newlines and return full text
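The code of censorship.py is not reproduced in this transcript. A minimal Python 3 sketch of the same steps could look as follows, assuming a hypothetical function name censor, a mild placeholder word list, and no audible beep; it only illustrates splitlines, replace and join.

```python
def censor(text, bad_words):
    """Replace every bad word with BEEP; return (censored text, beep count)."""
    censored_lines = []
    beeped = 0
    for line in text.splitlines():            # split text into a list of lines
        for word in bad_words:
            beeped += line.count(word)        # count before replacing
            line = line.replace(word, "BEEP") # replace returns a new string
        censored_lines.append(line)
    return "\n".join(censored_lines), beeped  # join lines with newlines


# Hypothetical sample input
lyrics = "What a darn shame\nNothing to see here\nDarn, that darned heck"
clean, count = censor(lyrics, ["darn", "heck"])
print(clean)
print("Beeped words:", count)
```

Note that replace works on substrings, so "darned" becomes "BEEPed", and it is case-sensitive, so "Darn" is left alone.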

8 8 Celine Dion: With each moment, moment pBEEPing by Beeped words: 1 Crime Mob: Ol' stankin BEEP (Hoe) Jank BEEP (Hoe) Suck my BEEP you (Hoe) Ol' fat BEEP (Hoe) But aiight! We finna get these lame BEEP niggaz You see a hoe BEEP nigga, call his BEEP out. Aye! Aye! Stomp his BEEP like (Hoe) Ol' lame BEEP (Hoe) I'ma tell you how it is nigga you betta get the BEEP back cause a nigga like me don't give a BEEP A nigga suppose to gon leave yo BEEP choked You sound like a BEEP yo BEEP I'ma hit we don't give a BEEP cause you is a lame One hitter quitter yo BEEP get popped Back the BEEP up 'fore I show you who reala Whats up wit ya BEEP nigga Ol' sucka BEEP, busta BEEP, cryin to yo momma BEEP I'ma keep up drama I'm a muthaBEEPin plum BEEP See you just a dumb BEEP go on wit yo young BEEP Try me like a sucka but I know you just a lame BEEP In my section they glad to see a nigga that don't give a BEEP Stomp you to the floor and tell you get yo pussy BEEP up Pick that nigga BEEP up, tear his lame BEEP up Niggaz representin Ellenwood time to mBEEP up Throwin blows like Johnny Cage, you think you wanna BEEP wit me Do this BEEP like Pastor Troy Uuh Huh I'm outside hoe Take my BEEPin word I ain't got no reason to lie hoe Beeped words: 34 Program tested on two songs by Celine Dion and Crime Mob : We find words containing a suspicious word: not desirable here. See exercise.

9 Regular Expressions – Motivation

Problem: search suspicious text for any Danish email address: @.dk

    text1 = "No Danish email here bush@whitehouse.org *@$@.hls.29! fj3a"
    text2 = "But here: chili@daimi.au.dk what a *(.@#$ nice @#*.( el ds"
    text3 = "And here perhaps? rubbish@junk.garbage@bogus@dk @.dk a@.dk"

- Cumbersome using ordinary string methods.

10 RegExp solution (to be explained later)

    Text2 contains this Danish email address: chili@daimi.au.dk
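The regular expression used on this slide is not included in the transcript. As one possible pattern, and purely as an assumption about what counts as a Danish address here (word characters, an @, one or more dot-separated labels, ending in .dk), a Python 3 sketch could be:

```python
import re

# Assumed pattern, not the one from the slide:
# name @ label(.label)* .dk, ending at a word boundary
DANISH_EMAIL = re.compile(r"\w+@\w+(\.\w+)*\.dk\b")

text1 = "No Danish email here bush@whitehouse.org *@$@.hls.29! fj3a"
text2 = "But here: chili@daimi.au.dk what a *(.@#$ nice @#*.( el ds"
text3 = "And here perhaps? rubbish@junk.garbage@bogus@dk @.dk a@.dk"


def find_danish_email(text):
    """Return the first Danish-looking address in text, or None."""
    match = DANISH_EMAIL.search(text)
    return match.group() if match else None


for name, text in [("Text1", text1), ("Text2", text2), ("Text3", text3)]:
    address = find_danish_email(text)
    if address:
        print(name, "contains this Danish email address:", address)
```

With this pattern only text2 yields a match; the fragments in text3 such as @.dk and a@.dk are rejected because a label must contain at least one word character.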

11 Regular Expressions

Provide a more efficient and powerful alternative to string search methods.
Instead of searching for a specific string we can search for a text pattern:
– We don’t have to search explicitly for ‘Monday’, ‘Tuesday’, ‘Wednesday’, ...: there is a pattern in these search strings.
– A regular expression is a text pattern.
In Python, regular expression processing capabilities are provided by module re.

12 Example

Simple regular expression:

    regExp = "football"

- matches only the string "football"

To search a text for regExp, we can use

    re.search( regExp, text )
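As a small illustration of what re.search returns, here is a Python 3 sketch with two made-up sample sentences. The result is a match object when the pattern occurs and None otherwise.

```python
import re

regExp = "football"

# Sample texts chosen for illustration
match = re.search(regExp, "Here are the football results")
no_match = re.search(regExp, "A complete list of python keywords")

if match:
    # group() is the matched text, start() its index in the searched string
    print(match.group(), "found at index", match.start())
if no_match is None:
    print("second text does not contain the pattern")
```

Since None is falsy and a match object is truthy, the result can be tested directly in an if statement.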

13 Compiling Regular Expressions

    re.search( regExp, text )

1. Compile regExp to a special format (an SRE_Pattern object)
2. Search for this SRE_Pattern in text
3. Result is an SRE_Match object

If we need to search for regExp several times, it is more efficient to compile it once and for all:

    compiledRE = re.compile( regExp )

1. Now compiledRE is an SRE_Pattern object

    compiledRE.search( text )

2. Use search method in this SRE_Pattern to search text
3. Result is same SRE_Match object

14 Searching for ‘football’

    import re

    text1 = "Here are the football results: Bosnia - Denmark 0-7"
    text2 = "We will now give a complete list of python keywords."

    regularExpression = "football"

    # Compile regular expression and get the SRE_Pattern object
    compiledRE = re.compile( regularExpression )

    # Use the same SRE_Pattern object to search both texts and get
    # two SRE_Match objects (or None if the search was unsuccessful)
    SRE_Match1 = compiledRE.search( text1 )
    SRE_Match2 = compiledRE.search( text2 )

    if SRE_Match1:
        print("Text1 contains the substring 'football'")
    if SRE_Match2:
        print("Text2 contains the substring 'football'")

Output:

    Text1 contains the substring 'football'

15 Building more sophisticated patterns

Metacharacters:
  ? : matches zero or one occurrence of the expression it follows
  + : matches one or more occurrences of the expression it follows
  * : matches zero or more occurrences of the expression it follows

    # search for zero or one t, followed by two a's:
    regExp1 = "t?aa"
    # search for g followed by one or more c's followed by one a:
    regExp2 = "gc+a"
    # search for ct followed by zero or more g's followed by one a:
    regExp3 = "ctg*a"

16 Use the SRE_Pattern objects to search the text and get SRE_Match objects

Output:

    Text contains the regular expression t?aa
    Text contains the regular expression gc+a
    Text contains the regular expression ctg*a
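The text that was searched on this slide is not in the transcript. The Python 3 sketch below uses a hypothetical sample string to show what each of the three patterns actually matches.

```python
import re

# Hypothetical sample text, chosen so that all three patterns occur in it
text = "gattaagccactgga"

patterns = ["t?aa", "gc+a", "ctg*a"]
found = {}
for pattern in patterns:
    match = re.compile(pattern).search(text)
    if match:
        found[pattern] = match.group()   # the substring that was matched
        print("Text contains the regular expression", pattern)

# * allows zero repetitions, + requires at least one
zero_g = re.search("ctg*a", "cta")   # matches: zero g's are allowed
no_c = re.search("gc+a", "ga")       # None: at least one c is required
```

Inspecting match.group() makes the difference between the metacharacters visible: here the optional t, two c's and two g's are all part of the matched substrings.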

