Published by Antonin Delisle. Modified over 5 years ago.
1
LING/C SC/PSYC 438/538 Lecture 11 Sandiway Fong
2
Administrivia
Homework 4 Review: graded, from me
Continuing with Perl regex
Ungraded Homework Exercises for next time
3
Homework 4 Review
File: population.txt
Contents: rank, name, continent, population (2016), population (2017)
Fields are separated by a tab (\t)
Source: Wikipedia
4
Homework 4: Question 1 Review
Using Perl:
read the file
create hash table(s) indexed by country name containing the following information: continent, 2016 population, 2017 population
Compute and print the country that decreased in population.
Compute and print the country with the smallest positive increase in population.
Print a table of countries in Asia and 2016 population, ranked by population.
Print a table of countries in Africa and 2016 population, ranked inversely by population.
5
Homework 4: Question 1 Review
perl hw4.perl population.txt
Japan decreased in population
Only one country, Japan, decreased in population!
Russia increased in population the least!
Asian countries ranked by 2016 population
China
India
Indonesia
Pakistan
Bangladesh
Japan
Philippines
Vietnam
Iran
Turkey
Thailand
African countries ranked inversely by 2016 population
Democratic Republic of the Congo
Egypt
Ethiopia
Nigeria
6
Homework 4: Question 1 Review
Rank: #1, #2, #3, …
7
Homework 4: Question 1 Review
Separate hash table for each field (continent, p2016, p2017, diff); this avoids references
For each line:
chomp
trim
split
remove commas
assign
8
Homework 4: Question 1 Review
$neggrowth = country with negative population growth
$count = number of countries with negative population growth
9
Homework 4: Question 1 Review
10
Homework 4: Question 1 Review
11
Homework 4: Question 1 Review
12
Homework 4: Question 2 Review
Do the same exercise in Python3 using a dictionary or dictionaries
13
Homework 4: Question 2 Review
python3 hw4.py population.txt
Only one country, Japan , lost population
Country with minimum positive population growth is Russia
Country                            2016 population
China                                1,403,500,365
India                                1,324,171,354
Indonesia                              261,115,456
Pakistan                               193,203,476
Bangladesh                             162,951,560
Japan                                  127,748,513
Philippines                            103,320,222
Vietnam                                 94,569,072
Iran                                    80,277,428
Turkey                                  79,512,426
Thailand                                68,863,514
Country                            2016 population
Democratic Republic of the Congo        78,736,153
Egypt                                   95,688,681
Ethiopia                               102,403,196
Nigeria                                185,989,640
14
Homework 4: Question 2 Review
with … as f: automatically closes the filehandle f
for … in f: iterates over all the lines
.strip().replace(): methods are applied in left-to-right sequence order
fields[0] rank
fields[1] name
fields[2] continent
fields[3] 2016 population
fields[4] 2017 population
fields[2:] is a slice, the same as fields[2:5] here
e.g. for 'China' the slice is ['Asia', '…', '…'] (continent, 2016 population, 2017 population)
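The reading loop described on this slide might look roughly like the sketch below. It is not the homework solution itself: the helper name `read_table`, the in-memory `SAMPLE` text standing in for population.txt, and the ranks and 2017 figures in it are all assumptions made for illustration.

```python
import io

# Hypothetical stand-in for population.txt (tab-separated fields:
# rank, name, continent, 2016 population, 2017 population).
# The ranks and 2017 figures below are invented placeholders.
SAMPLE = (
    "1\tChina\tAsia\t1,403,500,365\t1,409,000,000\n"
    "2\tJapan\tAsia\t127,748,513\t127,000,000\n"
)

def read_table(f):
    """Build {country: [continent, pop2016, pop2017]} from an open file."""
    table = {}
    for line in f:  # iterates over all the lines
        # Methods apply left to right: strip whitespace, drop commas, split on tabs.
        fields = line.strip().replace(",", "").split("\t")
        table[fields[1]] = fields[2:]  # same as fields[2:5] for five fields
    return table

# "with ... as f" closes the filehandle automatically on exit.
with io.StringIO(SAMPLE) as f:
    table = read_table(f)

print(table["China"])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by `open("population.txt")`; the populations are kept as strings here, so they need `int()` before any arithmetic or comparison.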
15
Homework 4: Question 2 Review
list comprehension grabs all country names c for countries where the 2016 population (table[c][1]) > 2017 population (table[c][2])
e.g. table['China'] = ['Asia', '…', '…']
table['China'][0] = 'Asia'
table['China'][1] = '…' (2016 population)
table['China'][2] = '…' (2017 population)
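A minimal sketch of such a list comprehension follows. The small `table` is a hypothetical example: the 2016 figures for China, India and Japan match the output shown earlier, while the Russia row and every 2017 figure are invented so that the example runs.

```python
# Hypothetical table: country -> [continent, 2016 population, 2017 population].
# The Russia row and all 2017 figures are invented for illustration.
table = {
    "China": ["Asia", "1403500365", "1409000000"],
    "India": ["Asia", "1324171354", "1339000000"],
    "Japan": ["Asia", "127748513", "127000000"],
    "Russia": ["Europe", "144000000", "144020000"],
}

# Keep every country c whose 2016 population exceeds its 2017 population.
# int() matters: comparing the raw strings would compare them alphabetically.
decreased = [c for c in table if int(table[c][1]) > int(table[c][2])]

print(decreased)
```

If the stored values were already integers, the `int()` calls could be dropped.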
16
Homework 4: Question 2 Review
.pop() removes 'Japan' from the table; the value is stored in the variable saved
function min() computes the smallest value in the table and returns the key (country name) associated with that value
key= tells min() to compare the values given by a function (lambda) that, when supplied with a country (k), returns the expression 2017 population - 2016 population
last line restores 'Japan' to the table
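The pop / min / restore sequence can be sketched as follows, under the assumption that the table stores populations as digit strings. The `table` contents are hypothetical (the Russia row and all 2017 figures are invented), and `smallest` is a name chosen here for illustration.

```python
# Hypothetical table: country -> [continent, 2016 population, 2017 population].
# The Russia row and all 2017 figures are invented for illustration.
table = {
    "China": ["Asia", "1403500365", "1409000000"],
    "India": ["Asia", "1324171354", "1339000000"],
    "Japan": ["Asia", "127748513", "127000000"],
    "Russia": ["Europe", "144000000", "144020000"],
}

# Temporarily remove the one country with negative growth.
saved = table.pop("Japan")

# min() iterates over the keys; key= supplies the value to compare:
# growth = 2017 population - 2016 population.
smallest = min(table, key=lambda k: int(table[k][2]) - int(table[k][1]))

# Put 'Japan' back so later steps see the full table.
table["Japan"] = saved

print(smallest)
```

Removing the shrinking country first is what makes the minimum a positive increase; an alternative that avoids mutating the table would be to take `min()` over a filtered list of countries.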
17
Homework 4: Question 2 Review
the list comprehension finds all the countries in Asia
function sorted() reverse sorts that list with parameter key=lambda k: int(table[k][1])
format string has basic form '{:s} {:d}'.format(X,Y), where s = string and d = (decimal) integer
Options are: < = left align, > = right align, and , = thousands comma
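A sketch of the ranked table for Asia, assuming the same string-valued table layout. The column widths (12 and 15) are arbitrary choices made here, not taken from the slides, and the 2017 entries are invented placeholders that this step never reads.

```python
# Hypothetical table: country -> [continent, 2016 population, 2017 population].
# 2017 figures (and the whole Russia row) are invented placeholders.
table = {
    "Japan": ["Asia", "127748513", "127000000"],
    "China": ["Asia", "1403500365", "1409000000"],
    "Russia": ["Europe", "144000000", "144020000"],
    "India": ["Asia", "1324171354", "1339000000"],
}

# List comprehension: all countries whose continent field is 'Asia'.
asia = [c for c in table if table[c][0] == "Asia"]

rows = []
# Sort by 2016 population, largest first.
for c in sorted(asia, key=lambda k: int(table[k][1]), reverse=True):
    # < left-aligns the name, > right-aligns the number, "," adds thousands commas.
    rows.append("{:<12s} {:>15,d}".format(c, int(table[c][1])))

print("\n".join(rows))
```

Dropping `reverse=True` gives the ascending order used for the inversely ranked Africa table.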
18
Homework 4: Question 2 Review
Very similar code to that of the previous slide, but without reverse=True
19
Homework 4: Question 3 Review
Most of you preferred Python
3 of you preferred Perl
Some cited $ as making Perl hard (to write/read)
Some used pandas
20
Reading Homework
Read up on the syntax of Perl Regular Expressions
Online tutorials
Practice (ungraded): do regex exercises 2.1 in JM (pg. 42)
I will review some of them on Thursday
21
Today's Topic
More Perl Regex:
Variables: $&, $`, $', $1, $2, $3, …
Backreferences
Greedy and non-greedy matching
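For readers coming from the Python homework, the Perl match variables have rough counterparts on Python's `re` match object. The sketch below uses Python as a stand-in, not Perl; the sample sentence and pattern are chosen here for illustration.

```python
import re

s = "The food is under the bar in the barn."
m = re.search(r"(\w+) is (\w+)", s)

matched = m.group(0)      # like Perl's $&  (the whole match)
prematch = s[:m.start()]  # like Perl's $`  (text before the match)
postmatch = s[m.end():]   # like Perl's $'  (text after the match)
first = m.group(1)        # like Perl's $1
second = m.group(2)       # like Perl's $2

print(repr(prematch), repr(matched), repr(postmatch), first, second)
```

One difference worth keeping in mind: Perl sets these as global variables after a successful match, whereas Python returns a match object (or None on failure) that has to be kept and queried.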
22
Online regex tester
23
Chapter 2: JM
Precedence of operators
Example: /house(cat(s|)|)/ (| = disjunction; ? = optional)
Perl: in a regular expression, the text matched within a pair of parentheses is stored in the global variables $1, $2, and so on
Precedence Hierarchy:
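To see how the nested parentheses in house(cat(s|)|) are numbered, here is a small sketch using Python's `re` as a stand-in for Perl. Groups are numbered by their opening parenthesis, so group 1 is (cat(s|)|) and group 2 is (s|); the `results` dictionary is a name introduced here for illustration.

```python
import re

pattern = re.compile(r"house(cat(s|)|)")

# Map each test word to its captured groups (group 1, group 2).
results = {
    word: pattern.fullmatch(word).groups()
    for word in ["house", "housecat", "housecats"]
}

for word, groups in results.items():
    print(word, groups)
```

In Python a group that took no part in the match is reported as None (group 2 for "house"); in Perl the corresponding variable would be undefined.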
24
Perl regex
A match returns 1 (true) or "" (the empty string, if false)
A shortcut: in list context, a match returns a list of the captured groups
25
Chapter 2: JM
s/([0-9]+)/<\1>/
What does this do?
Backreferences give Perl regexes more expressive power than Finite State Automata (FSA)
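One way to explore the question is with Python's `re.sub` as a stand-in for Perl's s///. Without the /g modifier Perl replaces only the first match, which `count=1` imitates here. The second half shows a backreference inside the pattern itself, the use that goes beyond what a finite-state automaton can recognize; both sample strings are made up for illustration.

```python
import re

# Rough analogue of Perl's s/([0-9]+)/<\1>/ : wrap the first run of digits.
text = "room 438 and 538"
wrapped = re.sub(r"([0-9]+)", r"<\1>", text, count=1)
print(wrapped)

# A backreference inside the pattern: \1 must repeat whatever group 1 matched,
# so this finds a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine")
print(doubled.group(0))
```

Omitting `count=1` would correspond to Perl's s///g and wrap every run of digits.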
26
Shortest vs. Greedy Matching
default behavior in Perl RE match: take the longest possible matching string (aka greedy matching)
This behavior can be changed; see next slide
27
Shortest vs. Greedy Matching
Example:
  $_ = "The food is under the bar in the barn.";
  if ( /foo(.*?)bar/ ) { print "matched <$1>\n"; }
Output:
  greedy (.*): matched <d is under the bar in the >
  shortest (.*?): matched <d is under the >
Notes: ? immediately following a repetition operator like * (or +) makes the operator work in non-greedy mode
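The same contrast can be reproduced in Python, whose `re` module uses the same `*?` notation for non-greedy repetition. This is a sketch of the Perl example above translated to Python, not the original slide code.

```python
import re

s = "The food is under the bar in the barn."

# Greedy: .* grabs as much as possible, so the match runs to the last "bar".
greedy = re.search(r"foo(.*)bar", s).group(1)

# Non-greedy: .*? grabs as little as possible, so the match stops at the first "bar".
shortest = re.search(r"foo(.*?)bar", s).group(1)

print("greedy:   <{}>".format(greedy))
print("shortest: <{}>".format(shortest))
```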