Published by Tyler Harrell. Modified over 9 years ago.
1
Introduction to Perl
Pawel Sirotkin
28.11-01.12.2008, Riga
2
Overview
Introduction to Perl, NLL Riga 2008, by Pawel Sirotkin
  About programming
  Why Perl?
  How to write, how to run
  Variables
  Operations
  Basic input and output
  Conditionals and loops
  Regular expressions
3
About programming
  Working with algorithms
  Program needs to contain exact commands
    (Mostly) not: Go buy some bread
    But: Put on your coat and shoes, open the door, go through it, close the door, go down the stairs…
  A program:
    Has a certain input
    Processes it
    Produces a certain output
4
Why Perl?
  Easy to learn
  Simple syntax
  Good at manipulating text
  Good at dealing with regular expressions
5
How to write a Perl program
  Perl programs can be written in any text editor
    Notepad, vim, even Word…
    Recommended: a simple text editor with syntax highlighting
  Write the program code
  Save the file as xxx.pl
    The .pl extension is not necessary, but useful
6
What is a Perl program like?
    # This *very* simple program prints "Hello World!"
    print "Hello World!";
7
What is a Perl program like?
  Everything on a line after the # is a comment. It is ignored by the program
  What are comments for, then?
    They are for you, and for others who will have to read the code
    Imagine looking at a complex program in a few months and trying to figure out what it does
  Write as many comments as you can
    # This *very* simple program prints "Hello World!"
    print "Hello World!";
8
What is a Perl program like?
  print is a Perl command
    In this case, for printing text on the screen
  Every command should start on a new line
    Not a Perl requirement, but crucial for readability
  Every command should end with a semicolon;
  Many commands take arguments
    Here: "Hello World!"
    # This *very* simple program prints "Hello World!"
    print "Hello World!";
9
What to do with the program?
  Perl works from the command line
    Windows: "Start", then "Run…"
  Go to the directory where you saved the program
    E.g.: cd C:\Perl\MyPrograms
  Run the program: perl myprogram.pl
  See the results of your labours!
10
Exercise (1)
  Create a folder for your Perl programs
  Open the editor of your choice and write the "Hello World" program
    The command is print "Hello World!";
    Don't forget the comment!
  Save the program
  Run it!
  What happens if you mistype the print command?
11
Variables
  The "Hello World" program always has the same output
    Not a very useful program, as such
  We need to be able to change the output
  Variables are objects that can hold different values
12
Defining variables
  To define a variable, write a dollar sign followed by the variable's name
  Names should consist of letters, digits and the underscore
    They should start with a letter
  Variable names are case-sensitive!
    $a and $A are different variables!
  Generally, a variable's name should tell you what the variable is for
    # We define a variable "a" and assign it a value of "42"
    $a = 42;
13
Defining variables
  Variables can be assigned values
    Strings: text (a character sequence) in single or double quotes
      $a = "some text";
    Numbers
      $a = 42;
    # We define a variable "a" and assign it a value of "42"
    $a = 42;
14
Changing variables
  Arithmetic operations
    $a = 42 / 2;     # division
    $a = 42 + 5;     # addition
    $a = $b * 2;     # multiplication
    $a = $a - $b;    # subtraction
  Also useful:
    $a += 42;        # the same as $a = $a + 42;
    The same works for -, * and / (-=, *=, /=)
  String operations
    $a = "some" . " text";    # concatenation
    $a = $a . " more text";
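The operations above can be tried out in one short script. This is a minimal sketch: the variable names are made up for the example, and `use strict;`, `use warnings;` and `my` are additions that do not appear on the slides (they make Perl report misspelled variable names).

```perl
use strict;
use warnings;

# Arithmetic on numbers (variable names are illustrative only)
my $count = 42;
my $half  = $count / 2;    # division: 21
$count += 8;               # same as $count = $count + 8; now 50
$count -= 10;              # now 40
$count *= 2;               # now 80
$count /= 4;               # now 20

# String concatenation with the dot operator
my $text = "some" . " text";   # "some text"
$text .= " and more";          # .= appends to the existing string

print "half = $half, count = $count\n";
print "$text\n";
```

Running it prints `half = 21, count = 20` followed by `some text and more`.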
15
Basic output
  We have already seen an output command
    print "text";
    print $a;
    print "text $a";
    print "text " . ($a + $b) . " more text.";
      The parentheses are needed: without them, "text " . $a is computed first and the result is then added to $b as a number
  Special characters:
    \n: new line
    \t: tabulator
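In Perl, `.` and `+` have the same precedence and are evaluated left to right, so mixing them in one print statement is a common source of surprises. The sketch below (variable names are assumptions for the example) shows the parenthesized form next to plain interpolation.

```perl
use strict;
use warnings;

my $x = 2;
my $y = 3;

# Parentheses make the addition happen before the concatenation.
my $with_parens = "sum: " . ($x + $y) . ".";

# Inside double quotes, variables are interpolated but arithmetic is not.
my $interpolated = "x is $x and y is $y";

print "$with_parens\n";      # sum: 5.
print "$interpolated\n";     # x is 2 and y is 3
print "tab\tseparated\n";    # \t inserts a tab, \n ends the line
```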
16
Exercise (2)
  Define a variable
  Assign it a value of 15
  Print it
  Double the value
  Print it again
  Define another variable with the string "apples"
  Print both variables
  Change the first variable to its square and the second to "pears"
  Print both variables
17
Basic input
  The <> operator returns input from the standard source (usually, the keyboard)
  Syntax: $a = <>;
  Don't forget to tell the user what he's supposed to enter!
  Try the following program:
    # This program asks the user for his name and greets him
    print "What is your name? ";
    $name = <>;
    print "Hello $name!";
18
Input, output and new lines
  As the user input is followed by the [Enter] key, the string in $name ends in a new line
  The chomp function deletes the new line at the end of a string
  Try the following, modified program:
    # This program asks the user for his name and greets him
    print "What is your name? ";
    $name = <>;
    chomp($name);
    print "Hello $name!";
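The effect of chomp can be seen without typing anything by using a string that already ends in a new line. In this sketch the hard-coded value stands in for a line read with `<>`; the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $name   = "Pawel\n";        # stands in for a line read with <>
my $before = length($name);    # 6: five letters plus the new line
chomp($name);                  # removes the trailing new line, if there is one
my $after  = length($name);    # 5

chomp($name);                  # a second chomp changes nothing

print "Hello $name! (length went from $before to $after)\n";
```

Because chomp only removes a trailing new line, it is safe to call it on strings that have none.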
19
Exercise (3)
  Let the user enter the radius of a circle
  Tell him the diameter (2r), circumference (2πr) and area (πr²) of the circle
  Try doing this using one variable for each measure
  Try doing this using only one variable
20
If, else
  Until now, the course of the program has been fixed
  The if clause allows us to take different actions in different circumstances
    # Let's try out a conditional clause
    print "Please enter password: ";
    $password = <>;
    if ($password == 42) {
        print "Correct password! Welcome.";
    }
    else {
        print "Wrong password! Access denied.";
    }
21
If, else
  Note: = is the assignment operator, == is the (numeric) comparison operator
  else is an optional clause that runs when the if condition is false
    # Let's try out a conditional clause
    print "Please enter password: ";
    $password = <>;
    if ($password == 42) {
        print "Correct password! Welcome.";
    }
    else {
        print "Wrong password! Access denied.";
    }
22
Exercise (4)
  Try out the password program. Why doesn't it work correctly? Fix it.
  Tell the user if the number he entered is too large or too small
    Hint: The comparison operators you'll need are < and >
  Ask the user for a geometrical form (circle or square), and then for a radius or side length. Return the area and perimeter.
23
While
  What if we want to do checks until something happens?
  The while loop repeats its commands as long as its condition is true
  Note: in the example below, $password has no value at first, so it specifically doesn't have the value 42
    # Now on to a "while" loop
    while ($password != 42) {
        print "Access denied.\n";
        print "Please enter password: ";
        $password = <>;
        chomp($password);
    }
    print "Correct password! Welcome.";
24
Exercise (5)
  Write a small game: take a number, and make the user guess it.
  Tell him if his guess is too high or too low.
  If the user gets it right, the program terminates.
  If you like, you can take a random number (a whole number from 0 to 9):
    $random = int(rand(10));
25
Perl regular expressions
  Regular expressions are very useful for text processing
  Perl matching operator: =~
  Perl non-matching operator: !~
  The regular expression must be enclosed in forward slashes: /regex/
  The program below accepts any password that contains the characters "42" anywhere
    # A "while" loop with regular expressions
    while ($password !~ /42/) {    # While the entered line doesn't contain "42"
        print "Access denied.\n";
        print "Please enter password: ";
        $password = <>;
        chomp($password);
    }
    print "Correct password! Welcome.";
26
Perl regular expressions
  Simple string: some text
  One of a set of symbols: [aA]
    Matches a or A
    Also possible: [tT]he, matching the or The
  One of a continuous range of symbols: [a-h][1-8]
    Matches any two-character string from a1 to h8
  Special characters
    ^ matches the beginning of a line
    $ matches the end of a line
27
Perl regular expressions
  More special characters
  Wildcard: the dot. Matches any single character
    b.d matches bad, bed, bid, bud…
    Don't forget: it also matches forbid, badly…
  + matches one or more occurrences of the previous character
    re+d matches red and reed (and also reeed and so on!)
  * matches zero or more occurrences of the previous character
    bel* matches be, bel and bell (and belll…)
  ? matches zero or one occurrence of the previous character
    soo?n matches son or soon
28
Perl regular expressions
  Character classes
  \d: digits
    Rule \d+ matches Rule 1, Rule 2, ..., Rule 334...
  \w: "word characters": letters, digits, _
    \w \w: any two word characters separated by a blank
  \s: any whitespace (blanks, tabs)
    ^\s+\d: any line that starts with whitespace followed by a digit
  Capitalize the symbols to get the opposite
    \S is anything but whitespace, \D are non-digits…
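The patterns from the last three slides can be checked against a few sample strings. This is a small sketch, with the test strings taken from the slide examples and the counter names made up for the illustration; `use strict;`, `use warnings;` and `my` are additions not used on the slides.

```perl
use strict;
use warnings;

my $hits   = 0;    # patterns that matched as expected
my $misses = 0;    # strings that were correctly rejected

$hits++ if "The cat"   =~ /[tT]he/;          # set of characters
$hits++ if "e4"        =~ /^[a-h][1-8]$/;    # ranges, anchored at both ends
$hits++ if "forbid"    =~ /b.d/;             # the dot matches the "i"
$hits++ if "reeed"     =~ /re+d/;            # + : one or more "e"
$hits++ if "be"        =~ /bel*/;            # * : zero or more "l"
$hits++ if "son"       =~ /soo?n/;           # ? : the second "o" is optional
$hits++ if "Rule 334"  =~ /Rule \d+/;        # \d+ : one or more digits
$hits++ if "two words" =~ /\w \w/;           # word character, blank, word character

$misses++ if "rd"    !~ /re+d/;              # + needs at least one "e"
$misses++ if "sooon" !~ /soo?n/;             # three "o"s are one too many
$misses++ if "i9"    !~ /^[a-h][1-8]$/;      # both characters are out of range

print "$hits patterns matched, $misses strings rejected\n";
```

Without the anchors ^ and $, a pattern matches anywhere inside the string, which is why b.d also accepts "forbid".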
29
Exercise (6)
  Write a program which asks the user for his e-mail address.
  Check if the address is syntactically correct. Possible rules:
    Must contain an @ character
    At least one symbol before it
    Must contain a dot
    At least two symbols between the @ and the dot
    At least two symbols after the dot
    No fancy symbols like {§*
  Do you accept addresses with more than one dot?
30
Perl regular expressions
  Switches
    Tell Perl how to deal with the regular expression
  /regex/i: ignore lower/upper case
    /wiebke/i matches Wiebke and wiebke
  s/regex/replacement/: substitute whatever regex matches with the replacement text
    $text =~ s/Mark/Euro/
  /regex/g: repeat the match until the end of the string
    # What the //g switch does
    $text = "The meat costs 10 Mark, the fish costs 15 Mark.";
    $text2 = $text;
    $text =~ s/Mark/Euro/;      # "The meat costs 10 Euro, the fish costs 15 Mark."
    $text2 =~ s/Mark/Euro/g;    # "The meat costs 10 Euro, the fish costs 15 Euro."
31
Perl regular expressions
  Grouping
    Allows us to use the matched string
  /(text)/ matches text and stores it in a variable
    The first group is stored in $1, the second in $2...
    # Substitution and grouping
    $sum = 0;    # initializing the variable with zero
    $text = "The meat costs 10 Mark, the fish costs 15 Mark.";
    while ($text =~ s/(\d+) Mark/$1 Euro/) {    # digits, a space, "Mark"
        $sum = $sum + $1;    # adding amount to $sum value
    }
    print "Substituted $sum Mark for Euro!";
32
Reading files
  What if we want to have input from a file, not from the user?
  Open file for reading:
    open(INPUT, "<file.ext");
  Read a line:
    $line = <INPUT>;
    $line = <>;    # is just a special case
33
Writing files
  What if we want to print to a file, not to the screen?
  Open file for writing:
    open(OUTPUT, ">file.ext");
  Write:
    print OUTPUT "Some text...";
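Reading and writing can be combined in one script. The sketch below uses the three-argument form of open with a variable as the file handle and checks whether each open succeeded; this is a common alternative to the two-argument form shown on the slides, not the form the slides use. The file name is hypothetical.

```perl
use strict;
use warnings;

my $file = "example_output.txt";    # hypothetical file name

# Open for writing; "or die" reports the reason if the open fails.
open(my $out, ">", $file) or die "Cannot write $file: $!";
print $out "first line\n";
print $out "second line\n";
close($out);

# Open the same file for reading and process it line by line.
open(my $in, "<", $file) or die "Cannot read $file: $!";
my $count = 0;
while (my $line = <$in>) {
    chomp($line);
    $count++;
    print "$count: $line\n";
}
close($in);

unlink($file);    # remove the example file again
```

Checking the result of open matters in practice: without it, a misspelled file name silently produces no input at all.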
34
Reading files
  A program for testing e-mail addresses
  Note: If we want to use a special character literally, we need to escape it with a backslash
    In strings: "
    In regular expressions: . + * ^ $ and the backslash \ itself
    open(INPUT, "<test.txt");
    while ($line = <INPUT>) {
        chomp($line);
        if ($line =~ /^.+@..+\...+$/) {    # testing for e-mail: x@xx.xx
            print "\"$line\" is a valid e-mail address.\n";
        }
        else {
            print "E-mail address \"$line\" not valid.\n";
        }
    }
    close(INPUT);
35
Exercise (7)
  Make a text file and fill it with a Wikipedia article
  Count the number of definite and indefinite articles
  Count the number of numbers and digits
  Insert a tag before every number
36
Arrays
  Arrays contain lists of values
  Syntax:
    @days = ("Monday", "Tuesday", "Friday");
    $days[0] = "Saturday";
    $day = $days[2];
  Useful for storing linear sequences of values
  Note: @ for whole lists, $ for single elements
  Note: list elements go in parentheses; indices start at 0 and go in square brackets
37
Arrays
  Useful array commands
  push(@array, "element");
    Adds a new element to the end of the array
    Creates the array if necessary
  $element = pop(@array);
    Removes the last value of @array and stores it in $element
    # Trying out arrays
    @tags = ("N", "V", "Adj");
    $tag1 = pop(@tags);            # $tag1 is now "Adj", @tags is ("N", "V")
    $tag2 = pop(@tags);            # $tag2 is now "V", @tags is ("N")
    push(@tags, $tag2, $tag1);     # @tags is now again ("N", "V", "Adj")
38
Hashes
  Hashes are associative arrays
  They are lists where the elements are not ordered, but identified by a "name" (the key)
  Syntax:
    %probability = ("verb", 0.32, "adjective", 0.02, "adverb", 0);
    $probability{"noun"} = 0.52;
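A hash is usually read back by looping over its keys. The sketch below reuses the values from the slide; the `=>` operator is simply a more readable comma, and `keys`, `sort` and `exists` are standard Perl functions that the slides do not introduce. The helper variable names are assumptions for the example.

```perl
use strict;
use warnings;

my %probability = ("verb" => 0.32, "adjective" => 0.02, "adverb" => 0);
$probability{"noun"} = 0.52;    # adds a new key/value pair

# keys returns the names in no particular order, so sort them for stable output.
for my $tag (sort keys %probability) {
    print "$tag: $probability{$tag}\n";
}

# exists checks for a key even when its value is 0, as with "adverb".
my $has_adverb  = exists $probability{"adverb"}  ? "yes" : "no";
my $has_pronoun = exists $probability{"pronoun"} ? "yes" : "no";
my $size        = scalar(keys %probability);     # number of pairs: 4

print "adverb known: $has_adverb, pronoun known: $has_pronoun, size: $size\n";
```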
39
Exercise (8)
  What happens if you try to print an array?
  What about a hash?
  What happens if you convert an array into a hash, or the other way round?
40
Practical: Tokenizer
  Take a Wikipedia article and put it into a text file
    Clean it up if necessary
  Tokenize it!
    We only want one word per line
    Insert a "sentence boundary" symbol where appropriate
    The output should be another file
  Think about what choices you make and why!
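One possible starting point for the tokenizer is sketched below. It is deliberately naive: the `tokenize` function, the `<S>` boundary symbol and the rule "every . ! or ? ends a sentence" are all assumptions made for this illustration, and abbreviations such as "e.g." will be split incorrectly. It works on a string rather than on files to keep the example short.

```perl
use strict;
use warnings;

# Naive tokenizer: every run of word characters and every single
# punctuation mark becomes a token. "<S>" (an assumed boundary symbol)
# is added after sentence-final punctuation.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    while ($text =~ /(\w+|[^\w\s])/g) {
        my $token = $1;                  # copy $1 before matching again
        push(@tokens, $token);
        push(@tokens, "<S>") if $token =~ /^[.!?]$/;
    }
    return @tokens;
}

my @tokens = tokenize("Perl is fun. Is it?");

# One token per line, as the task requires
for my $token (@tokens) {
    print "$token\n";
}
```

Deciding what to do with hyphens, apostrophes, numbers like 3.14 and abbreviations is exactly the kind of choice the task asks you to think about.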
41
Practical: Tagger
  Take the POS-annotated corpus from treebank.txt
    Clean and tokenize it
  Count the tag-token probabilities
  Count the transition probabilities
    For a first attempt, I strongly recommend bigrams
  Apply the Viterbi algorithm and tag an input file of your choice!
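The two counting steps can be sketched with nested hashes. Everything about the data here is an assumption: the sentences are invented, the "word/TAG" format may not be what treebank.txt actually uses, and `<S>` is a made-up sentence-start marker. The probabilities are plain relative frequencies without smoothing, and the Viterbi step is not shown.

```perl
use strict;
use warnings;

# ASSUMPTION: one sentence per entry, written as "word/TAG" pairs
# separated by single blanks. The real corpus format may differ.
my @sentences = (
    "the/DET dog/N barks/V",
    "the/DET cat/N sleeps/V",
);

my (%emission, %transition, %tag_total);

for my $sentence (@sentences) {
    my $previous = "<S>";                   # assumed sentence-start marker
    for my $pair (split(/ /, $sentence)) {
        my ($word, $tag) = split(/\//, $pair);
        $emission{$tag}{$word}++;           # how often $tag appears with $word
        $transition{$previous}{$tag}++;     # how often $tag follows $previous
        $tag_total{$tag}++;                 # how often $tag appears at all
        $previous = $tag;
    }
}

# Relative frequencies as unsmoothed probability estimates
my $p_dog_given_n = $emission{"N"}{"dog"} / $tag_total{"N"};        # 1 of 2
my $p_n_after_det = $transition{"DET"}{"N"} / $tag_total{"DET"};    # 2 of 2

printf("P(dog | N)  = %.2f\n", $p_dog_given_n);
printf("P(N | DET)  = %.2f\n", $p_n_after_det);
```

Dividing by the tag total is only exact for transitions when the first tag never ends a sentence, as is the case for DET here; a fuller version would count how often each tag occurs as a predecessor.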
42
Practical: Tagger++
  If it's still too easy, or if you want a long-term aim:
  Implement smoothing: words can have tags you haven't seen them with, or appear in contexts you have never seen them in before
  Try to figure out a way to guess the tags for unknown words better
  Write a program to train on 9/10 of the corpus, and test it on the rest. Compare your results to the actual annotations
    Do this 10 times, once for each possible 9/10 split
  Still too easy? Implement trigrams and compare the results.