Perl


Perl notes 2: Perl
Perl - Practical Extraction and Report Language
– for text files
– system management
– combines C, sed, awk, sh
– interpreted
– dynamic

Perl notes 3: Data Structures
scalars              $num
arrays               @num
associative arrays   %num
$num[50]  – element at index 50 of the array num (indexing starts at 0)
$#num     – last index of num
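A minimal sketch of the data types listed above. The values are made up for illustration; only the names $num, @num and %num come from the slide.

```perl
use strict;
use warnings;

# Scalar: a single value (number or string).
my $num = 42;

# Array: an ordered list, indexed from 0.
my @num = (10, 20, 30, 40);

# Associative array (hash): maps string keys to values.
my %num = (one => 1, two => 2);

# Each sigil has its own namespace, so $num, @num and %num are unrelated.
print "scalar:      $num\n";
print "index 2:     $num[2]\n";             # element of @num, not of $num
print "last index:  ", $#num, "\n";         # one less than the length
print "length:      ", scalar(@num), "\n";
print "hash lookup: $num{two}\n";
```

Note that $num[2] uses the $ sigil because a single element is a scalar, even though it lives in @num.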

Perl notes 4: Examples
#!/usr/local/bin/perl -w
# find the sum of a list of numbers from STDIN
# one number per line
$sum = 0;
while (<STDIN>) {
    $sum += int $_;
}
print "the sum is $sum\n";

Perl notes 5: Examples
#!/usr/bin/perl -w
# find the sum of a list of numbers from STDIN
# several numbers per line
$sum = 0;
while (<STDIN>) {
    @nums = split;            # split $_ on whitespace
    foreach (@nums) {
        $sum += int $_;
    }
}
print "the sum is $sum\n";

Perl notes 6: Average
#!/usr/bin/perl -w
# find the average of a list of
# numbers from STDIN
# several numbers per line
$sum = 0;
$count = 0;
while (<STDIN>) {
    @nums = split;            # split $_ on whitespace
    foreach (@nums) {
        $sum += int $_;
        $count++;
    }
}
print "the average is ", $sum/$count, "\n";

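Perl notes 7: Median
#!/usr/bin/perl -w
# find the median of a list of numbers
# from STDIN
# several numbers per line
@nums = ();
while (<STDIN>) {
    push(@nums, split);
}
@nums = sort { $a <=> $b } @nums;     # the median needs the values in numeric order
if ($#nums % 2) {
    # even count: average the two middle values
    $median = ($nums[($#nums - 1)/2] + $nums[($#nums + 1)/2])/2;
} else {
    # odd count: take the middle value
    $median = $nums[$#nums/2];
}
print "the median is $median\n";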

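Perl notes 8: Output?
#!/usr/bin/perl
@stuff = ("one", "two", "three");
print @stuff, "\n";
$stuff = ("one", "two", "three");     # comma operator: the last value is assigned
print $stuff, "\n";
$stuff = @stuff;                      # array in scalar context: its length
print $stuff, "\n";

Output:
onetwothree
three
3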

Perl notes 9: Pattern Matching
m//
s///
Modifiers
i   case-insensitive
m   multiple lines
s   single line
x   extended
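The modifiers above can be sketched as follows. The sample text and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "Hello World\nhello perl\n";

# /i: ignore case, so "hello" also matches "Hello".
my $count_i = () = $text =~ /hello/gi;

# /m: ^ and $ match at every line start and end, not just the string's.
my @line_starts = $text =~ /^(\w+)/gm;

# /s: . also matches a newline, so .* can span both lines.
my ($spanning) = $text =~ /World(.*)perl/s;

# /x: whitespace and comments inside the pattern are ignored.
my ($first, $second) = $text =~ /
    (\w+)   # first word
    \s+     # separator
    (\w+)   # second word
/x;

# s///: substitution, here with /g to replace every occurrence.
(my $shouted = $text) =~ s/hello/HELLO/gi;

print "matches with i: $count_i\n";
print "line starts:    @line_starts\n";
print "first words:    $first $second\n";
print $shouted;
```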

Perl notes 10: Regular Expressions
Code    Meaning
\w      Alphanumeric characters (plus underscore)
\W      Non-alphanumeric characters
\s      White space
\S      Non-white space
\d      Digits
\D      Non-digits
\b      Word boundary
\B      Non-word boundary
\A ^    At the beginning of a string
\Z $    At the end of a string
.       Match any single character

Perl notes 11: Regular Expressions
*                    Zero or more occurrences
?                    Zero or one occurrence
+                    One or more occurrences
{N}                  Exactly N occurrences
{N,M}                Between N and M occurrences
.*                   Greedy match, up to the last thingy
.*?                  Non-greedy match, up to the first thingy
[set_of_things]      Match any item in the set
[^set_of_things]     Match any item not in the set
(some_expression)    Tag an expression
$1..$N               Tagged expressions used in substitutions
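A short sketch of the difference between greedy and non-greedy matching, plus tagged expressions in a substitution. The input strings are made up for illustration.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .* runs up to the last ">" in the string.
my ($greedy) = $html =~ /<(.*)>/;

# Non-greedy: .*? stops at the first ">".
my ($lazy) = $html =~ /<(.*?)>/;

# {N,M}: between 2 and 4 characters from the set [0-9].
my ($digits) = "id 123456" =~ /([0-9]{2,4})/;

# Tagged expressions $1 and $2 reused in a substitution to swap two words.
(my $swapped = "world hello") =~ s/(\w+) (\w+)/$2 $1/;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "digits:  $digits\n";
print "swapped: $swapped\n";
```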

Perl notes 12: Rules
Rule 1
– The engine tries to match as far left as it can
Rule 2
– The regular expression is regarded as a set of alternatives. The engine tries them left to right. (see page 61)
Rule 3
– Items that have choices match from left to right: /x*y*/
Rule 4
– Assertions
– ^ $ \b \B \A \Z \G (?=…) (?!…)

Perl notes 13: Rules
Rule 5
– A quantified atom matches only if the atom itself matches some number of times allowed by the quantifier
Maximal   Minimal   Allowed range
{n,m}     {n,m}?    At least n, at most m
{n,}      {n,}?     At least n
{n}       {n}?      Exactly n
*         *?        0 or more
+         +?        1 or more
?         ??        0 or 1

Perl notes 14: Rules
Rule 6
– Each atom matches according to its type
– (…) ==> grouping + storage in $1, $2
– . matches any char except \n
– […] character class
– Special characters \a \n \r …
– \1 \2 ... backreference to (…)
– \033 octal char
– \xf7 hex char
– \cD control char
– any other \ matches the char itself
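A few of the atom types from Rule 6, sketched with made-up inputs. Each flag is 1 when the pattern matches and 0 otherwise.

```perl
use strict;
use warnings;

# \1 refers back to whatever the first (…) group matched: a doubled word.
my ($doubled) = "this is is a test" =~ /\b(\w+) \1\b/;

# . matches any character except "\n".
my $dot_matches_newline = ("\n" =~ /^.$/) ? 1 : 0;

# \033 is an octal escape, \x41 a hex escape, \cD a control character.
my $octal_ok   = (chr(27) =~ /^\033$/) ? 1 : 0;
my $hex_ok     = ("A"     =~ /^\x41$/) ? 1 : 0;
my $control_ok = (chr(4)  =~ /^\cD$/)  ? 1 : 0;

# A backslash before any other character matches that character itself.
my $literal_dot = ("a.b" =~ /a\.b/) ? 1 : 0;
my $not_any     = ("axb" =~ /a\.b/) ? 1 : 0;

print "doubled word: $doubled\n";
print "dot matches newline: $dot_matches_newline\n";
print "escapes: $octal_ok $hex_ok $control_ok\n";
print "literal dot: $literal_dot, matches 'axb': $not_any\n";
```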

Perl notes 15: Precedence (highest to lowest)
() (?: )
Repetition
Sequence
|  alternation

Perl notes 16: How do you fix it?
/('[^']'*')/

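Perl notes 17: Examples
s/^([^ ]+) +([^ ]+)/$2 $1/         # swap the first two words
/(\w+)\s*=\s*\1/                   # the same word on both sides of "="
/.{40,}/                           # a line of at least 40 characters
/^(\d+\.?\d*|\.\d+)$/              # a decimal number
if (/Time: (..):(..):(..)/) {
    $hours = $1;
    $minutes = $2;
    $seconds = $3;
}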

Perl notes 18: Default STDIN
sub foo {
    my $x = shift;            # inside a sub, shift defaults to @_
}
# in the main program, shift defaults to @ARGV
while ($_ = shift) {
    if (/^-(.*)/) {
        process_option($1);
    } else {
        process_file($_);
    }
}

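Perl notes 19: Reading a stream
open FIN, "myfile" or die;
while (<FIN>) {               # reads one line at a time
    # do something with $_
}
foreach (<FIN>) {             # reads the whole file into a list first
    # do something with $_
}
print sort <FIN>;             # list context: all lines, sorted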

Perl notes 20: Reading a stream
# print each line that contains Shazam, with the line before and after it
@f = <FIN>;
foreach (0 .. $#f) {
    if ($f[$_] =~ /\bShazam\b/) {
        $lo = ($_ > 0)   ? $_ - 1 : $_;
        $hi = ($_ < $#f) ? $_ + 1 : $_;
        print map { "$_: $f[$_]" } $lo .. $hi;
    }
}

Perl notes 21: Sorting
# sort numerically
sub numerically { $a <=> $b }
@sorted = sort numerically (16, 1, 8, 2, 4, 32);
@sorted = sort { $a <=> $b } (16, 1, 8, 2, 4, 32);
@sorted = sort { uc($a) cmp uc($b) } qw(this is a test);    # ignore case
@sorted = sort { $b <=> $a } (16, 1, 8, 2, 4, 32);          # descending

Perl notes 22: Example
#!/usr/bin/perl -w
# This script will count the frequency of distinct words
# in the file that is given as an argument.
# Warning: Error checking is minimal!
die "usage: $0 file\n" unless @ARGV;
while (<>) {
    tr/A-Z/a-z/;                      # translate to lower case
    @words = split(/[\W]+/, $_);      # split into words
    foreach (@words) {
        $list{$_}++;                  # increment the counter
    }
}
# most frequent words first
foreach $key (sort { $list{$b} <=> $list{$a} } keys %list) {
    print $key, ' = ', $list{$key}, "\n";
}

Perl notes 23: Tokenizing
# tokenize an arithmetic expression
while ($_) {
    if (/^(\d+)/) {
        push @tok, 'num', $1;
    } elsif (/^([+\-\/*()])/) {
        push @tok, 'punct', $1;
    } elsif (/^([\d\D])/) {
        die "invalid char $1 in input";
    }
    $_ = substr($_, length $1);
}
substr slows things down
– cut start of string

Perl notes 24: Tokenizing 2
while (/ (\d+) | ([+\-\/*()]) | ([\d\D]) /gx) {
    if ($1 ne "") {
        push @tok, 'num', $1;
    } elsif ($2 ne "") {
        push @tok, 'punct', $2;
    } else {
        die "invalid char $3 in input";
    }
}

Perl notes 25: Tokenizing 3
{
    if (/\G(\d+)/gc) {
        push @tok, 'num', $1;
    } elsif (/\G([+\-\/*()])/gc) {
        push @tok, 'punct', $1;
    } elsif (/\G([\d\D])/gc) {
        die "invalid char $1 in input";
    } else {
        last;
    }
    redo;
}

Perl notes 26: Use split for clarity
($a, $b, $c) = /^(\S+)\s+(\S+)\s+(\S+)/;
($a, $b, $c) = split /\s+/, $_;
($a, $b, $c) = split;

Get the fifth field:
($a) = /[^:]*:[^:]*:[^:]*:[^:]*:([^:]*)/;
or
($a) = /(?:[^:]*:){4}([^:]*)/;
or
($a) = (split /:/)[4];
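A quick check that the three ways of getting the fifth field agree, using a made-up passwd-style line. It also shows one difference worth knowing: a bare split skips leading whitespace, while split /\s+/ returns an empty first field.

```perl
use strict;
use warnings;

# Hypothetical input line with colon-separated fields.
my $line = "root:x:0:0:System Administrator:/root:/bin/sh";

# Three ways to pull out the fifth field (index 4).
my ($by_regex)  = $line =~ /[^:]*:[^:]*:[^:]*:[^:]*:([^:]*)/;
my ($by_repeat) = $line =~ /(?:[^:]*:){4}([^:]*)/;
my $by_split    = (split /:/, $line)[4];

# split ' ' (what a bare split does) versus split /\s+/ on indented input.
my $padded      = "  alpha beta gamma";
my @awk_style   = split ' ', $padded;
my @regex_style = split /\s+/, $padded;

print "$by_regex | $by_repeat | $by_split\n";
print scalar(@awk_style), " fields vs ", scalar(@regex_style), " fields\n";
```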

Perl notes 27: unpack
ps l
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
… rt_sig S pts/2 0:00 -tcsh
… R pts/2 0:00 ps l

chomp(@ps = `ps l`);
foreach (@ps) {
    ($uid, $pid, $sz, $tt) = unpack '… A7', $_;
    print "$uid, $pid, $sz, $tt\n";
}
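A minimal sketch of unpack on fixed-width text. The record layout and field widths here are invented for illustration and are not the template from the slide.

```perl
use strict;
use warnings;

# Made-up fixed-width record: name in columns 0-9, pid in 10-15, tty in 16-23.
my $record = sprintf "%-10s%-6s%-8s", "tcsh", "1234", "pts/2";

# A<n> reads n characters and strips trailing spaces.
my ($name, $pid, $tty) = unpack 'A10 A6 A8', $record;

# @<n> jumps to an absolute offset, so leading fields can be skipped.
my ($tty_only) = unpack '@16 A8', $record;

print "$name, $pid, $tty\n";
print "tty only: $tty_only\n";
```

For column-aligned output such as ps, this avoids writing a regular expression that has to count whitespace.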

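Perl notes 28: Avoid regex for simple strings
do_it() if $answer eq 'yes';          # plain string comparison
do_it() if $answer =~ /^yes$/;        # also accepts "yes\n"
do_it() if $answer =~ /yes/;          # accepts any string containing "yes"
do_it() if lc($answer) eq 'yes';      # case-insensitive comparison
do_it() if $answer =~ /^yes$/i;       # case-insensitive regex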

Perl notes 29
#!/usr/bin/perl
# remove the comments from a C program
$filename = shift or die "usage $0 filename\n";
open FIN, $filename or die "can't open file";
while (<FIN>) {
    chomp;                            # the newline is printed once per line below
    # split into string literals, comment delimiters, and everything else
    for (split m!("(?:\\\W|.)*?"|/\*|\*/)!) {
        if ($in_comment) {
            $in_comment = 0 if $_ eq "*/";
        } else {
            if ($_ eq "/*") {
                $in_comment = 1;
                print " ";
            } else {
                print;
            }
        }
    }
    print "\n";
}

Perl notes 30: References
$a = …;
$scalar_ref = \$a;
$array_ref = \@a;
$hash_ref = \%a;
$array_el_ref = \$a[3];
$hash_el_ref = \$a{'John'};
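A sketch of taking the references listed above and then dereferencing them. The variable names and values are made up; the slide's $a, @a and %a are renamed here to keep clear of the $a that sort uses.

```perl
use strict;
use warnings;

my $name  = "John";
my @list  = (10, 20, 30, 40);
my %phone = (John => "555-1234");

my $scalar_ref   = \$name;
my $array_ref    = \@list;
my $hash_ref     = \%phone;
my $array_el_ref = \$list[3];
my $hash_el_ref  = \$phone{John};

# Dereference by prefixing the sigil, or use -> to reach an element.
my $via_scalar = $$scalar_ref;
my $via_arrow  = $array_ref->[1];
my $via_hash   = $hash_ref->{John};
my $count      = scalar @$array_ref;

# Writing through an element reference changes the original array.
$$array_el_ref = 99;

print "$via_scalar, $via_arrow, $via_hash, $count, $list[3], $$hash_el_ref\n";
```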

Perl notes 31: Lists of Lists
@LoL = (
    [ "fred", "barney" ],
    [ "george", "jane", "elroy" ],
    [ "homer", "marge", "bart" ],
);
print $LoL[2][2];                 # prints "bart"

$ref_to_LoL = [
    [ "fred", "barney" ],
    [ "george", "jane", "elroy" ],
    [ "homer", "marge", "bart" ],
];
print $ref_to_LoL->[2][2];

Note: $LoL[2][2] implies $LoL[2]->[2]

Perl notes 32: Grow your own
while (<>) {
    @tmp = split;
    push @LoL, [ @tmp ];          # [ @tmp ] copies the list into a new anonymous array
}

Perl notes 33: Hashes of Arrays
%HoL = (
    flintstones => [ "fred", "barney" ],
    jetsons     => [ "george", "jane", "elroy" ],
    simpsons    => [ "homer", "marge", "bart" ],
);

generation
# reading from a file with format:
# flintstones: fred barney ..
while (<>) {
    next unless s/^(.*?):\s*//;
    $HoL{$1} = [ split ];
}
or
while ($line = <>) {
    ($who, $rest) = split /:\s*/, $line, 2;
    @fields = split ' ', $rest;
    $HoL{$who} = [ @fields ];
}

Perl notes 34: Hashes of Arrays
# calling a function
for $group ("flintstones", "jetsons", "simpsons") {
    $HoL{$group} = [ get_family($group) ];
}
# append members to an existing family
push @{ $HoL{flintstones} }, "wilma", "betty";

access
$HoL{flintstones}[0] = "fred";
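A small sketch of walking a hash of arrays once it has been built. The data is a cut-down copy of the slide's example, and the @report array is a name introduced here for illustration.

```perl
use strict;
use warnings;

my %HoL = (
    flintstones => [ "fred", "barney" ],
    jetsons     => [ "george", "jane", "elroy" ],
);

# Append to one family through its array reference.
push @{ $HoL{flintstones} }, "wilma";

# Build one output line per family, sorted by family name.
my @report;
for my $family (sort keys %HoL) {
    my @members = @{ $HoL{$family} };
    push @report, "$family: " . join(", ", @members) . " (" . scalar(@members) . ")";
}
print "$_\n" for @report;
```

Each hash value is a reference, so @{ … } is needed wherever the whole array is wanted, while a single element can be reached directly as $HoL{jetsons}[1].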

Perl notes 35: Packages, Modules, and Object Classes