1
Programming Languages
Meeting 13, November 29/30, 2016
2
Short Exam Reflections
3
Matrix Project
Helpfulness of LispWorks
Primitives and functionals submitted
Questions?
4
Scripting Languages
Originally, tools for
  Quick, dirty programs
  Rapid prototyping for text-based computation
  Glue between other programs
  File format conversion
Evolved to mainstream programming tools
  Check current estimates on the number of lines of JavaScript code doing productive work
5
Characteristics
Strings as the basic, maybe only, data type
Associative arrays (hashes?) as the basic structured data type
Regular expressions as a principal programming structure
Minimal use of types and declarations
Usually interpreted rather than compiled
6
Examples
Unix shell
AWK
Perl
Python
Tcl
JavaScript
VBScript, JScript
PHP
Lua
7
Our Approach
Look at AWK and Perl first
Determine the syntactic structure of each language, starting with AWK
Use experimentation to discover the semantics
Work several exercises on each language during class time
8
AWK
Age: about 40 years
Developed at Bell Labs by Al Aho, Brian Kernighan, Peter Weinberger
  Aside: Where is each of these computing pioneers now?
Intended for simple manipulation of, and data extraction from, text files
  40 years ago, data existed in text files almost exclusively
9
AWK Examples
Print all lines longer than 80 characters.
  length > 80
Replace the second field in each line by its logarithm and print the line.
  { $2 = log($2); print }
Add up the numbers in the first field and report the sum and average.
  { sum += $1 }
  END { print sum, sum/NR }
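The second and third examples can be tried without any data file. The sketch below feeds a few invented lines to awk through a pipe; the helper names (sample, log_field, sum_avg) and the numbers are assumptions made only for this illustration.

```shell
# Invented sample data: three lines, two numeric fields each.
sample() { printf '10 1\n20 1\n30 1\n'; }

# Replace the second field by its logarithm and print the line.
log_field() { sample | awk '{ $2 = log($2); print }'; }

# Add up the first field, then report the sum and the average.
sum_avg() { sample | awk '{ sum += $1 } END { print sum, sum/NR }'; }

log_field
sum_avg
```

Because log(1) is 0, every second field becomes 0 in the first run; the second run reports the sum 60 and the average 20.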
10
AWK Syntax
<program> ::= [<begstate>] <stateseq> [<endstate>]
<stateseq> ::= <statement> {<statement>}
<statement> ::= <pattern> | { <action> } | <pattern> { <action> }
<begstate> ::= BEGIN { <action> }
<endstate> ::= END { <action> }
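A small program that uses every form in the grammar may help when reading it. This is only a sketch on invented input; the function name syntax_demo and the variables total and small are made up for the illustration.

```shell
syntax_demo() {
  printf '5\n12\n7\n' | awk '
    BEGIN   { print "start" }        # <begstate>
    $1 > 10                          # <pattern> alone: print matching lines
            { total += $1 }          # { <action> } alone: runs on every line
    $1 < 6  { small++ }              # <pattern> { <action> }
    END     { print total, small }   # <endstate>
  '
}

syntax_demo
```

Note that a pattern and its action must start on the same line; a pattern followed by a newline is a complete statement by itself.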
11
Heart of AWK
<pattern> <action>
<pattern>
  A regular expression
  A numeric expression
  A combination of the previous two
<action>
  Executable code, similar in structure to C
12
Inferred Control Structure
Previous languages: sequence
  Execute S1, then S2, then S3, …
AWK: a triply nested for-loop surrounding an if-then. Specifically,
  for each file
    for each input line
      for each pattern
        if pattern matches input line then action
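The implied loop can be observed directly: with two patterns in a program, each input line is tested against both, in order, before the next line is read. A minimal sketch on invented input (the function name control_demo is hypothetical):

```shell
control_demo() {
  printf 'alpha\nbeta\n' | awk '
    /a/   { print NR ": matched /a/" }
    /^b/  { print NR ": matched /^b/" }
    END   { print "lines read: " NR }
  '
}

control_demo
```

Line 2 (beta) satisfies both patterns, so both actions run for it before the END action fires.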
13
Program Development
Use your favorite text editor to create an AWK program file.
Run the program
  awk -f myprog [file1 file2 … ]
OR for really short programs (one line)
  awk 'program' [file1 file2 …]
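Both ways of running a program can be compared side by side. The sketch below writes a one-rule program and a two-line data file into a temporary directory; the file names data.txt and myprog and the variable names are assumptions for the illustration.

```shell
workdir=$(mktemp -d)

# Invented data file and a one-rule program file.
printf 'one two\nthree four five\n' > "$workdir/data.txt"
printf '{ print NF }\n' > "$workdir/myprog"

# Program read from a file with -f ...
from_file=$(awk -f "$workdir/myprog" "$workdir/data.txt")
# ... and the same program given inline in single quotes.
inline=$(awk '{ print NF }' "$workdir/data.txt")

echo "$from_file"
echo "$inline"
rm -r "$workdir"
```

The single quotes must be plain ASCII apostrophes so the shell passes the program text to awk unchanged.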
14
Some Interesting Data
From the course website, under Resources, download the four data files
  Flight Aware data
  Moby Dick extract
  Web form data
  Classic AWK data on countries (these data come from the 1985 AWK manual)
15
Some AWK Programs
Try each of these programs using flightaware.txt as the input file
  { print NR, $0 }
  { $1 = NR; print }
Note that <action> is surrounded by { } and may consist of several statements separated by ;
16
More Programs
  {print $NF}
  /Cancelled/
  END {print NR}
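The layout of flightaware.txt is not shown here, so the sketch below runs the same five programs on three invented lines instead. The flights helper and its contents are stand-ins, not the course data.

```shell
# Invented stand-in for flightaware.txt: flight id, then status.
flights() { printf 'UA100 Scheduled\nUA200 Cancelled\nUA300 Arrived\n'; }

flights | awk '{ print NR, $0 }'      # number every line
flights | awk '{ $1 = NR; print }'    # overwrite field 1 with the line number
flights | awk '{ print $NF }'         # last field of each line
flights | awk '/Cancelled/'           # pattern only: print matching lines
flights | awk 'END { print NR }'      # count the lines
```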
17
AWK Built-Ins
NF – number of fields on the current line
NR – number of lines (records) read so far, that is, the number of the current line; in the END action, the total number of lines read
$k – name of the kth field, count starts at 1
$0 – name of the current line
FILENAME – name of current input file
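One print statement can show all of these at once. FILENAME only has a useful value when awk reads a named file, so this sketch creates a small temporary file; the names builtins_demo and demo.txt are made up for the illustration.

```shell
builtins_demo() {
  dir=$(mktemp -d)
  printf 'a b c\nd e\n' > "$dir/demo.txt"
  # Run inside the directory so FILENAME is just "demo.txt".
  ( cd "$dir" && awk '{ print FILENAME, NR, NF, $1, $NF }' demo.txt )
  rm -r "$dir"
}

builtins_demo
```

Since NF holds the field count, $NF names the last field of the current line.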
18
More Built-Ins
FS – field separator, typically reset in the BEGIN action if something other than space is needed
OFS – output field separator
RS – record separator
ORS – output record separator
OFMT – output format for numbers
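A short sketch of resetting separators in the BEGIN action, using invented colon-separated input. The function names are hypothetical; the second function shows OFMT controlling how print formats a non-integer number.

```shell
# Read colon-separated fields, write tab-separated fields.
separators_demo() {
  printf 'alpha:1:2\nbeta:3:4\n' |
    awk 'BEGIN { FS = ":"; OFS = "\t" } { print $1, $3 }'
}

# OFMT is the format print applies to non-integer numbers.
ofmt_demo() { awk 'BEGIN { OFMT = "%.2f"; print 3.14159265 }'; }

separators_demo
ofmt_demo
```

OFS is inserted wherever a comma appears in the print statement, which is why print $1, $3 produces a tab between the two fields.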
19
Your Turn
Write AWK programs to:
  Determine how many flights were cancelled
  Find the latest arriving flight from those in the list
  List the different aircraft used for this flight number.
20
Checking What You Know
Go to kahoot.it
Enter quiz PIN:
Enter your name
21
Regular Expressions
A single character not otherwise endowed with special meaning matches that character.
  Example: m matches any string containing an m
Two regular expressions concatenated match a match of the first pattern followed immediately by a match of the second.
  Examples:
  mc matches any string containing mc
  Robert matches any string containing Robert (this is a concatenation of 6 regular expressions)
22
Regular Expressions (2)
Characters with special meaning in certain contexts are:
  \ [ ] ^ $ * ? ( ) |
Two questions:
  What are the “certain contexts” that give rise to “special meanings”?
  What if you don’t want a “special meaning”?
Answer: A \ followed by a single character matches that character.
  Examples:
  \$ matches the dollar sign $
  \\ matches the backslash \
  \\\\ matches two backslashes in a row
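The escaping rule can be checked on a few invented lines. In this sketch the helper names are made up, and the awk programs sit inside single quotes so the shell leaves the backslashes alone.

```shell
# Invented lines: one with a dollar sign, one plain, one with a backslash.
sample_lines() { printf 'cost: $5\nno charge\nC:\\temp\n'; }

# \$ matches a literal dollar sign rather than end-of-line.
dollar_lines() { sample_lines | awk '/\$/'; }

# \\ matches a single literal backslash.
backslash_lines() { sample_lines | awk '/\\/'; }

dollar_lines
backslash_lines
```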
23
Regular Expressions (3)
A . matches any (one) character.
  Example: A.b matches Aeb, A$b, A_b, Abb
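A quick check that the dot stands for exactly one character, on invented input (dot_demo is a hypothetical name). The line Ab is too short to match and ab lacks the capital A.

```shell
dot_demo() { printf 'Aeb\nA$b\nAb\nab\n' | awk '/A.b/'; }

dot_demo
```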
24
Regular Expressions (4)
A string enclosed in brackets [ ] matches any single character from the string. Some characters in the string may have special meanings depending on their positions.
  [0-9] matches a line with a digit in it
  ] as a string element must be the first character.
  []a] matches either ] in the line or a in the line
25
Regular Expressions (5)
- between two characters in ascending ASCII order denotes the inclusive set of characters; otherwise, it has no special meaning.
  E.g. A-M denotes the first 13 upper case letters.
^ as the first string element indicates the complement of the string in the alphabet; otherwise, it has no special meaning.
  [^0-9] matches a string that has a character other than a digit in it
  [^a-zA-Z0-9] matches a string that has at least one symbol (a character that is neither a letter nor a digit) in it
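The bracket rules can be tried on four invented lines. The helper names are assumptions for this sketch; each function prints the lines that its bracket expression matches.

```shell
# Invented lines: letters, digits, a line containing ], and punctuation only.
lines() { printf 'abc\n123\na]b\n---\n'; }

has_digit()        { lines | awk '/[0-9]/'; }         # a digit somewhere
has_bracket_or_a() { lines | awk '/[]a]/'; }          # ] listed first is literal
has_non_digit()    { lines | awk '/[^0-9]/'; }        # some character that is not a digit
has_symbol()       { lines | awk '/[^a-zA-Z0-9]/'; }  # some non-alphanumeric character

has_digit
has_bracket_or_a
has_non_digit
has_symbol
```

Note that [^0-9] does not mean "no digits on the line"; it matches any line that contains at least one non-digit character.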
26
Regular Expressions (6)
A regular expression followed by * matches a sequence of 0 or more matches of the regular expression.
A regular expression followed by + matches a sequence of 1 or more matches of the regular expression.
A regular expression preceded by ^ is constrained to matches at the beginning of the line.
A regular expression followed by $ is constrained to matches at the end of the line.
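The four operators can be compared on one invented list of strings. The function names below are made up for this sketch.

```shell
samples() { printf 'ac\nabc\nabbbc\nxabc\nabcx\n'; }

star_demo()   { samples | awk '/^ab*c$/'; }  # zero or more b between a and c
plus_demo()   { samples | awk '/^ab+c$/'; }  # one or more b between a and c
begins_demo() { samples | awk '/^ab/'; }     # ab at the beginning of the line
ends_demo()   { samples | awk '/bc$/'; }     # bc at the end of the line

star_demo
plus_demo
begins_demo
ends_demo
```

The only difference between the first two results is the line ac, which has zero b characters and so satisfies * but not +.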
27
AWK Patterns
One form of a pattern for AWK is a regular expression enclosed in a pair of slashes / /
A pattern can be limited to one field by using the match (~) or does-not-match (!~) operator, with the field on the left and the regular expression on the right
  $2 ~ /plane/
  $4 !~ /train/
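The sketch below contrasts a whole-line pattern with a field-restricted one, written with the field on the left of ~, which is the conventional AWK form. The input lines and the function names are invented for the illustration.

```shell
trips() { printf 'trip1 plane\ntrip2 train\nplane-spotting bus\n'; }

anywhere()       { trips | awk '/plane/'; }        # plane anywhere on the line
in_field_2()     { trips | awk '$2 ~ /plane/'; }   # plane in field 2 only
not_in_field_2() { trips | awk '$2 !~ /plane/'; }  # field 2 does not contain plane

anywhere
in_field_2
not_in_field_2
```

The third line contains plane in its first field, so the whole-line pattern selects it while the field-2 pattern does not.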
28
AWK Patterns (2)
Your turn: For each of the patterns on the next slide, describe the lines that would be matched by an AWK program using that pattern.
29
AWK Patterns (3)
  /F$/
  /[xy]/
  /Mc/
  /ab*c/
  /[A-Za-z]*/
  /[0-9].[0-9]/
  /$$/
  /ab.c*/
  /^[^A-Za-z0-9]/
  /[JS]r\.$/
30
AWK Patterns (4)
For the web form data, generate a list of the properly constructed US and Canada telephone numbers.
Find all proper names in the Moby Dick extract.
  Caution: This is an HTML file, so tags abound
For the countries data
  What is the population of Asia?
  What percentage of the world’s population lives in Asia?
  How many countries the size of Sudan will fit inside Canada?
31
AWK Exercises
Given a file of words, one per line, write an AWK script that returns the frequency count of the letters in the words.
Use a template that
  has one action statement in the body, a for loop
  has one statement for the END pattern, a for loop that controls the printing
  uses one user-defined variable, an array called lc
  uses the substring function, substr, to split each word into its individual characters.
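One possible shape for such a script, offered as a sketch rather than the intended answer: it follows the template with a single for loop in the body and a single for loop in the END action, with lc as the only array. The loop variables i and c, the two-word input, and the function name letter_counts are assumptions of this sketch, and the output is piped through sort because the order of a for-in loop over an array is not specified.

```shell
letter_counts() {
  printf 'abba\ncab\n' | awk '
    # Body: walk the word one character at a time and count each letter.
    { for (i = 1; i <= length($0); i++) lc[substr($0, i, 1)]++ }
    # END: print every letter seen together with its count.
    END { for (c in lc) print c, lc[c] }
  ' | sort
}

letter_counts
```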
32
AWK Exercises
Suppose each line of a file looks like the 14-character string that represents a US phone number in cellphone standard form (610)
How many fields are there in the string if you are using default FS? What are their lengths?
What if FS = "-"?
Neither of these answers is acceptable for processing phone numbers.
Reformat the string (610) and others of this form by using an AWK script with a single action statement and writing the string as