CS 403: Programming Languages Lecture 15 Fall 2003 Department of Computer Science University of Alabama Joel Jones
Lecture 15 ©2003 Joel Jones

Overview
- Announcements
- Homework
- Minilanguages
- Preliminaries: data formats
- Taxonomy of Minilanguages
- Case Studies

Minilanguage
- Definition: a little language specialized for a particular application domain
- "A good notation has a subtlety and suggestiveness which at times makes it almost seem like a live teacher." (Bertrand Russell, The World of Mathematics, 1956)
- Pair Up: Why use a minilanguage?
  Hint: Error rates are largely independent of the language used.

Data Formats
- Programs store and retrieve data from files and/or transmit data to another program, via a network or some other mechanism.
- But what should the format be?
  - Binary: fast, but inflexible
  - Textual: flexible, but slow(?)

But what distinguishes text from binary?
- Pair Up: List differences between a text format and a binary format.

On the importance of being textual
- Flexible, but what does that mean?
  - Not brittle: allows for future changes in the format or contents
- Pair Up: What happens if larger numbers are needed in a binary format? In a text format?
- We'll examine the issue of speed later.
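
One way to make the brittleness question concrete is a minimal sketch, assuming a hypothetical record with a single numeric ID field. The field, its 16-bit width, and the function names are illustrative assumptions, not part of the lecture. The binary layout rejects a value that outgrows its fixed width, while the text layout simply uses more digits.

```python
import struct

def encode_binary(uid):
    # Hypothetical binary layout: one little-endian unsigned 16-bit field.
    return struct.pack("<H", uid)

def encode_text(uid):
    # Hypothetical text layout: decimal digits followed by a newline.
    return f"{uid}\n"

def fits_binary(uid):
    # True if the value can be stored in the fixed-width binary field.
    try:
        encode_binary(uid)
        return True
    except struct.error:
        return False

if __name__ == "__main__":
    for uid in (65535, 70000):
        print(uid, "fits binary field:", fits_binary(uid),
              "| text form:", repr(encode_text(uid)))
```

Widening the binary field would change the layout for every existing reader and writer. A text reader that parses digits up to the newline may need no change, provided its in-memory integer type is large enough.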

Styles (metaformats) of structuring textual data
- Delimiter-separated values
- Example, Unix passwd file:

games:*:12:100:games:/usr/games:
gopher:*:13:30:gopher:/usr/lib/gopher-data:
ftp:*:14:50:FTP User:/home/ftp:
esr:0SmFuPnH5JlNs:23:23:Eric S. Raymond:/home/esr:
nobody:*:99:99:Nobody:/:

- Colons are allowed inside fields by using an escape mechanism: \:
- Much better than the quoting mechanism used by various CSV files from Microsoft (quotes within quotes)
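
The delimiter-plus-escape convention described above can be sketched as a small splitter. This is a minimal illustration, assuming that a backslash makes the next character literal; the function name `split_dsv` and the `demo` record are hypothetical, and the sketch does not check whether any particular system tool honors the escape.

```python
def split_dsv(line, delimiter=":"):
    """Split one record on unescaped delimiters.

    Assumption: a backslash makes the following character literal,
    so "\\:" yields a colon inside a field.
    """
    fields = []
    current = []
    chars = iter(line.rstrip("\n"))
    for ch in chars:
        if ch == "\\":
            # Take the next character as-is; a trailing backslash is kept.
            current.append(next(chars, "\\"))
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields

if __name__ == "__main__":
    record = "esr:0SmFuPnH5JlNs:23:23:Eric S. Raymond:/home/esr:"
    print(split_dsv(record))
    # Hypothetical record with an escaped colon in the name field.
    print(split_dsv(r"demo:*:1:1:Name\: with colon:/home/demo:"))
```

The trailing colon in each passwd line produces a final empty field, so the sample records split into seven fields.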

Textual works for protocols also
SMTP: Simple Mail Transfer Protocol, used to send mail

C:
C: HELO snark.thyrsus.com                sending host identifies self
S: 250 OK Hello snark, glad to meet you  receiver acknowledges
C: MAIL FROM:                            identify sending user
S: Sender ok                             receiver acknowledges
C: RCPT TO:                              identify target user
S: 250 root... Recipient ok              receiver acknowledges
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Scratch called. He wants to share
C: a room with us at Balticon.
C: .                                     end of multiline send
S: 250 WAA01865 Message accepted for delivery
C: QUIT                                  sender signs off
S: 221 cpmy.com closing connection       receiver disconnects
C:
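
Because the dialogue is plain text, a few lines of code are enough to interpret it. The following is a minimal sketch, not a mail client: it assumes single-line server replies of the form "code text", and the helper names `parse_reply`, `reply_class`, and `frame_data` are hypothetical. It also shows the usual way a body is terminated by a lone "." line, with an extra dot added to body lines that themselves begin with a dot.

```python
def parse_reply(line):
    """Split a single-line reply such as '250 OK Hello snark' into (code, text).

    Simplification: multi-line replies (code followed by '-') are not handled.
    """
    code, _, text = line.strip().partition(" ")
    return int(code), text

def reply_class(code):
    # The first digit of a reply code gives its general meaning.
    classes = {
        2: "success",
        3: "intermediate",
        4: "transient failure",
        5: "permanent failure",
    }
    return classes.get(code // 100, "unknown")

def frame_data(body_lines):
    """Prepare body lines for the DATA phase and append the terminating '.'."""
    framed = []
    for line in body_lines:
        # Dot-stuffing: a body line starting with '.' gets one extra '.'.
        framed.append("." + line if line.startswith(".") else line)
    framed.append(".")
    return framed

if __name__ == "__main__":
    code, text = parse_reply('354 Enter mail, end with "." on a line by itself')
    print(code, reply_class(code), text)
    for line in frame_data(["Scratch called. He wants to share",
                            "a room with us at Balticon."]):
        print("C:", line)
```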

Taxonomy of minilanguages
[Figure: languages arranged along an axis of increasing loopiness, grouped as Data Formats, Minilanguages, and Interpreters. The axis also runs from less to more general, declarative to imperative, and flat to structured. Languages shown: /etc/passwd, SNG, regexps, Yacc, make, XSLT, awk, Postscript, bc, JavaScript, sh, Perl, Python, Java.]

Case Studies
- Performed live on stage!
- The following demo will be captured using script, a Unix command that captures input and output
- Using a shell as glue
- Pipes and filters
- Wrappers vs. programs

Case Studies: Regular Expressions
.    Matches any single character
*    Matches zero or more copies of the preceding expression
[]   Character class, which matches any character within the brackets. A leading circumflex "^" negates the test. A dash indicates a range.
^    Matches the beginning of a line
$    Matches the end of a line
\    Escapes metacharacters, e.g. "\n" is a newline character, "\*" is a literal asterisk
+    Matches one or more occurrences of the preceding expression
?    Matches zero or one occurrence of the preceding expression
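
The operators listed above can be tried out with a minimal sketch using Python's `re` module, which supports each of them. The sample patterns and strings, and the helper name `matches`, are illustrative assumptions rather than examples from the lecture.

```python
import re

def matches(pattern, text):
    # True if the pattern matches anywhere in the text.
    return re.search(pattern, text) is not None

# (pattern, text that matches, text that does not match)
EXAMPLES = [
    (r"a.c",     "abc",   "ac"),      # .  any single character
    (r"^ab*c$",  "ac",    "adc"),     # *  zero or more of the preceding item
    (r"^[0-9]$", "7",     "x"),       # [] character class with a range
    (r"^[^0-9]$", "x",    "7"),       # [^...] negated character class
    (r"^foo$",   "foo",   "foobar"),  # ^ and $ anchor to line start and end
    (r"\*",      "a*b",   "ab"),      # \  escapes a metacharacter
    (r"^ab+c$",  "abbc",  "ac"),      # +  one or more of the preceding item
    (r"^colou?r$", "color", "colouur"),  # ?  zero or one of the preceding item
]

if __name__ == "__main__":
    for pattern, good, bad in EXAMPLES:
        print(pattern, matches(pattern, good), matches(pattern, bad))
```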

Regular Expressions (Continued)
|      Matches either the regular expression before it or the one after it
"…"    Interprets everything inside the quotes literally
()     Groups a series of regular expressions together into a new regular expression
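
Alternation and grouping can be sketched the same way in Python's `re` module. Python's `re` has no quote operator for literal text, so this sketch substitutes `re.escape` to get a comparable effect; that substitution, like the sample patterns, is an assumption for illustration.

```python
import re

# | chooses between alternatives; () groups them so + applies to the whole group.
pairs = re.compile(r"^(ab|cd)+$")

# re.escape stands in for quoting: every metacharacter in the string is literal.
literal = re.compile(re.escape("a+b*"))

if __name__ == "__main__":
    print(bool(pairs.match("abcdab")))      # whole string is built from ab / cd
    print(bool(pairs.match("abc")))         # trailing "c" is not a full group
    print(bool(literal.search("x = a+b*c")))  # contains the literal text a+b*
    print(bool(literal.search("aab")))      # would match a+b* only as a regex
```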