LING/C SC/PSYC 438/538 Lecture 11 Sandiway Fong.



Administrivia
Homework 4 Review: graded, email from me
Continuing with Perl regex
Ungraded Homework Exercises for next time

Homework 4 Review
File: population.txt
Contents (Source: Wikipedia): rank, name, continent, population (2016), population (2017)
Fields are separated by a tab (\t)

Homework 4: Question 1 Review
Using Perl:
read the file
create hash table(s) indexed by country name containing the following information: continent, 2016 population, 2017 population
Compute and print the country that decreased in population.
Compute and print the country with the smallest positive increase in population.
Print a table of countries in Asia and 2016 population, ranked by 2016 population.
Print a table of countries in Africa and 2016 population, ranked inversely by 2016 population.

Homework 4: Question 1 Review
perl hw4.perl population.txt
Japan decreased in population
Only one country, Japan, decreased in population!
Russia increased in population the least!
Asian countries ranked by 2016 population
China          1403500365
India          1324171354
Indonesia       261115456
Pakistan        193203476
Bangladesh      162951560
Japan           127748513
Philippines     103320222
Vietnam          94569072
Iran             80277428
Turkey           79512426
Thailand         68863514
African countries ranked inversely by 2016 population
Democratic Republic of the Congo   78736153
Egypt                              95688681
Ethiopia                          102403196
Nigeria                           185989640

Homework 4: Question 1 Review Rank: #1, #2, #3, …

Homework 4: Question 1 Review
Separate hash table for each field (continent, p2016, p2017, diff); this avoids references
For each line: chomp, trim, split, remove commas, assign

Homework 4: Question 1 Review $neggrowth = country with negative population growth $count = counts countries with negative population growth


Homework 4: Question 2 Review Do the same exercise in Python3 using a dictionary or dictionaries

Homework 4: Question 2 Review
python3 hw4.py population.txt
Only one country, Japan , lost population
Country with minimum positive population growth is Russia
Country        2016 population
China          1,403,500,365
India          1,324,171,354
Indonesia        261,115,456
Pakistan         193,203,476
Bangladesh       162,951,560
Japan            127,748,513
Philippines      103,320,222
Vietnam           94,569,072
Iran              80,277,428
Turkey            79,512,426
Thailand          68,863,514
Country                            2016 population
Democratic Republic of the Congo    78,736,153
Egypt                               95,688,681
Ethiopia                           102,403,196
Nigeria                            185,989,640

Homework 4: Question 2 Review
with … as f: automatically closes the filehandle f
for … in f: iterates over all the lines
.strip().replace() are applied in left-to-right order
fields[0] rank
fields[1] name
fields[2] continent
fields[3] 2016 population
fields[4] 2017 population
fields[2:] is a slice, here the same as fields[2:5]
Example: key 'China', value ['Asia', '1403500365', '1409517397']
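A minimal sketch of the reading loop described on this slide, not the graded solution. It assumes the numbers in the file contain commas and that the table is keyed by country name. The one-line sample string is a stand-in for population.txt, built from the China row shown on the slide, and io.StringIO stands in for open().

```python
import io

# Stand-in for population.txt: rank, name, continent, 2016 pop, 2017 pop,
# separated by tabs. Only the China row from the slide is used here.
sample = "1\tChina\tAsia\t1,403,500,365\t1,409,517,397\n"

table = {}
with io.StringIO(sample) as f:   # a real script would use open(filename)
    for line in f:
        # strip() runs first (drops the newline), then replace() drops commas
        fields = line.strip().replace(',', '').split('\t')
        # key: country name; value: [continent, 2016 pop, 2017 pop]
        table[fields[1]] = fields[2:]

print(table['China'])
```

The populations are still strings at this point, so later steps need int() before comparing or subtracting them.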

Homework 4: Question 2 Review
The list comprehension grabs all country names c for countries where the 2016 population (table[c][1]) > 2017 population (table[c][2])
e.g. table['China'] = ['Asia', '1403500365', '1409517397']
table['China'][0] = 'Asia'
table['China'][1] = '1403500365'
table['China'][2] = '1409517397'
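A sketch of that list comprehension on a toy table. The country names Xland, Yland and Zland and their numbers are invented for illustration; they are not from population.txt. The int() calls are an assumption: the stored values are strings, and comparing strings directly would compare them character by character.

```python
# Hypothetical toy table: name -> [continent, 2016 pop, 2017 pop]
table = {
    'Xland': ['Asia', '100', '90'],
    'Yland': ['Asia', '200', '201'],
    'Zland': ['Africa', '50', '60'],
}

# Countries whose 2016 population exceeds their 2017 population
decreased = [c for c in table if int(table[c][1]) > int(table[c][2])]
print(decreased)
```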

Homework 4: Question 2 Review
.pop() removes 'Japan' from the table; the value is stored in the variable saved
function min() computes the smallest value in the table and returns the key (country name) associated with that value
key= tells min() to compare the values given by a function (lambda) that, when supplied with a country (k), returns the expression 2017 population - 2016 population
the last line restores 'Japan' to the table
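The pop / min / restore sequence might look like the following on a toy table. The names and numbers are invented for illustration; Xland plays the role of Japan, the one country with negative growth.

```python
# Hypothetical toy table: name -> [continent, 2016 pop, 2017 pop]
table = {
    'Xland': ['Asia', '100', '90'],     # shrinking, like Japan on the slide
    'Yland': ['Asia', '200', '201'],
    'Zland': ['Africa', '50', '60'],
}

saved = table.pop('Xland')              # take the shrinking country out
# min() walks the keys and compares 2017 population - 2016 population
smallest = min(table, key=lambda k: int(table[k][2]) - int(table[k][1]))
table['Xland'] = saved                  # put it back

print(smallest)
```

Without the pop(), min() would return the shrinking country, since its difference is negative.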

Homework 4: Question 2 Review
the list comprehension finds all the countries in Asia
function sorted() reverse sorts that list with parameter key=lambda k: int(table[k][1])
the format string has the basic form '{:s} {:d}'.format(X,Y), where s = string and d = (decimal) integer
Options: < = left align, > = right align, and , = thousands separator
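A sketch of the sort-and-format step on a toy table. Names, numbers and column widths (10 and 15) are invented for illustration and are not the values used in the homework solution.

```python
# Hypothetical toy table: name -> [continent, 2016 pop, 2017 pop]
table = {
    'Xland': ['Asia', '1000', '900'],
    'Yland': ['Asia', '2500000', '2500100'],
    'Zland': ['Africa', '50', '60'],
}

asia = [c for c in table if table[c][0] == 'Asia']
# reverse=True gives largest 2016 population first
ranked = sorted(asia, key=lambda k: int(table[k][1]), reverse=True)

rows = []
for c in ranked:
    # < left-aligns the name, > right-aligns the number, the comma adds separators
    rows.append('{:<10s} {:>15,d}'.format(c, int(table[c][1])))
print('\n'.join(rows))
```

Dropping reverse=True gives the ascending ("ranked inversely") order used for the African table.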

Homework 4: Question 2 Review
Very similar code to that of the previous slide, but without reverse=True

Homework 4: Question 3 Review
Most of you preferred Python
3 of you preferred Perl
Some cited % @ $ as making Perl hard (to write/read)
Some used pandas (https://pandas.pydata.org)

Reading Homework
Read up on the syntax of Perl Regular Expressions
Online tutorials:
http://perldoc.perl.org/perlrequick.html
http://perldoc.perl.org/perlretut.html
Practice (ungraded): do regex exercises 2.1 in JM (pg. 42)
I will review some of them on Thursday

Today's Topic
More Perl Regex:
Variables: $&, $`, $', $1, $2, $3, …
Backreferences
Greedy and non-greedy matching

Online regex tester https://regex101.com

Chapter 2: JM
Precedence of operators (precedence hierarchy)
Example: /house(cat(s|)|)/ (| = disjunction; ? = optional)
Perl: in a regular expression, the text matched within a pair of parentheses is stored in the global variables $1, $2, and so on
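The grouping in /house(cat(s|)|)/ can be tried out with Python's re module, used here as an analogue of the Perl behavior rather than Perl itself: m.group(1) and m.group(2) play the roles of $1 and $2. The three test words are chosen for illustration.

```python
import re

pattern = re.compile(r'house(cat(s|)|)')

results = {}
for word in ['house', 'housecat', 'housecats']:
    m = pattern.fullmatch(word)
    # groups() returns (group 1, group 2); a group that took no part
    # in the match is reported as None
    results[word] = m.groups()
    print(word, results[word])
```

Groups are numbered by the position of their opening parenthesis, so the outer (cat(s|)|) is group 1 and the inner (s|) is group 2.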

Perl regex
http://perldoc.perl.org/perlretut.html
A match returns 1 (true) or "" (empty string, if false)
A shortcut: in list context, a match returns a list of the captured groups

Chapter 2: JM
s/([0-9]+)/<\1>/ what does this do?
Backreferences give Perl regexes more expressive power than Finite State Automata (FSA)
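One way to check the answer: the substitution wraps the first run of digits in angle brackets. The sketch below uses Python's re.sub as an analogue of Perl's s///, with count=1 because the Perl substitution has no /g flag. The sample strings are invented for illustration. The second pattern puts a backreference inside the pattern itself to find a doubled word.

```python
import re

text = 'room 42 floor 7'
# \1 in the replacement refers back to whatever group 1 captured
wrapped = re.sub(r'([0-9]+)', r'<\1>', text, count=1)
print(wrapped)

# \1 inside the pattern must match the same text that group 1 matched
m = re.search(r'\b(\w+) \1\b', 'it is is here')
doubled = m.group(1)
print(doubled)
```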

Shortest vs. Greedy Matching
Default behavior in a Perl RE match: take the longest possible matching string (aka greedy matching)
This behavior can be changed; see the next slide

Shortest vs. Greedy Matching
from http://www.perl.com/doc/manual/html/pod/perlre.html
Example:
    $_ = "The food is under the bar in the barn.";
    if ( /foo(.*?)bar/ ) { print "matched <$1>\n"; }
Output:
    greedy (.*): matched <d is under the bar in the >
    shortest (.*?): matched <d is under the >
Notes: ? immediately following a repetition operator like * (or +) makes the operator work in non-greedy mode
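The same comparison can be reproduced with Python's re module, which shares the *? syntax for non-greedy repetition. This is an analogue of the Perl example above, not Perl code.

```python
import re

s = "The food is under the bar in the barn."

greedy = re.search(r'foo(.*)bar', s).group(1)      # runs to the last "bar"
shortest = re.search(r'foo(.*?)bar', s).group(1)   # stops at the first "bar"

print('greedy   <{}>'.format(greedy))
print('shortest <{}>'.format(shortest))
```

The greedy version reaches the "bar" inside "barn", while the non-greedy version stops at the standalone word "bar".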