1.0.1.8 – Introduction to Perl 10/24/2015 1.0.1.8.4 - Introduction to Perl - Hashes and Sorting 1 1.0.1.8.4 Introduction to Perl Session 4 · hashes · sorting.

Slides:



Advertisements
Similar presentations
Intro to Scala Lists. Scala Lists are always immutable. This means that a list in Scala, once created, will remain the same.
Advertisements

Arrays A list is an ordered collection of scalars. An array is a variable that holds a list. Arrays have a minimum size of 0 and a very large maximum size.
Chapter 6 Lists and Dictionaries CSC1310 Fall 2009.
Programming and Data Structure
1 Arrays in PHP PHP Functions. 2 Array Definition An array is a complex variable that enables you to store multiple values in a single variable; It comes.
LING/C SC/PSYC 438/538 Lecture 4 Sandiway Fong. Administrivia Homework 1 graded – you should have gotten an from me.
Hashes a “hash” is another fundamental data structure, like scalars and arrays. Hashes are sometimes called “associative arrays”. Basically, a hash associates.
Fourteen lists and compound data. Some primitive data types Integers (whole numbers) 1, 2, 3, -1, 0, , etc. “Floating-point” numbers ,
Sorting. Simple Sorting As you are probably aware, there are many different sorting algorithms: selection sort, insertion sort, bubble sort, heap sort,
CS 106 Introduction to Computer Science I 03 / 30 / 2007 Instructor: Michael Eckmann.
CS 106 Introduction to Computer Science I 03 / 03 / 2008 Instructor: Michael Eckmann.
MATLAB Cell Arrays Greg Reese, Ph.D Research Computing Support Group Academic Technology Services Miami University.
CS 106 Introduction to Computer Science I 02 / 19 / 2007 Instructor: Michael Eckmann.
CS 106 Introduction to Computer Science I 10 / 15 / 2007 Instructor: Michael Eckmann.
CS 106 Introduction to Computer Science I 10 / 16 / 2006 Instructor: Michael Eckmann.
CMPT 120 Lists and Strings Summer 2012 Instructor: Hassan Khosravi.
1 An Introduction to Perl Part 2 CSC8304 – Computing Environments for Bioinformatics - Lecture 8.
PERL Variables and data structures Andrew Emerson, High Performance Systems, CINECA.
– Intermediate Perl 9/16/ Intermediate Perl - References Intermediate Perl Session 1 · references · complex data structres.
Lists in Python.
LING/C SC/PSYC 438/538 Lecture 4 Sandiway Fong. Continuing with Perl Homework 3: first Perl homework – due Sunday by midnight – one PDF file, by .
Session 7 JavaScript/Jscript: Arrays Matakuliah: M0114/Web Based Programming Tahun: 2005 Versi: 5.
Control Structures. Important Semantic Difference In all of these loops we are going to discuss, the braces are ALWAYS REQUIRED. Even if your loop/block.
DHTML AND JAVASCRIPT Genetic Computer School LESSON 5 INTRODUCTION JAVASCRIPT G H E F.
Object Oriented Programming Lecture 5: Arrays and Strings Mustafa Emre İlal
Arrays BCIS 3680 Enterprise Programming. Overview 2  Array terminology  Creating arrays  Declaring and instantiating an array  Assigning value to.
– Introduction to Perl 10/20/ Introduction to Perl - Lists and Arrays Introduction to Perl Session 3 · lists and arrays.
Meet Perl, Part 2 Flow of Control and I/O. Perl Statements Lots of different ways to write similar statements –Can make your code look more like natural.
Perl Practical(?)‏ Extraction and Report Language.
– Introduction to Perl 10/25/ Introduction to Perl - Recipes and Idioms Introduction to Perl Session 8 · recipes and.
Class 2Intro to Databases Goals of this class Include & Require in PHP Generating Random Numbers in PHP Arrays – Numerically Indexed and Associative Program.
CS107 References and Arrays By Chris Pable Spring 2009.
Introduction to Perl Giorgos Georgakilas Graduated from C.E.I.D.Graduated from C.E.I.D. M.Sc. degree in ITMBM.Sc. degree in ITMB Ph.D. student in DIANA-LabPh.D.
Built-in Data Structures in Python An Introduction.
Chapter 9: Perl Programming Practical Extraction and Report Language Some materials are taken from Sams Teach Yourself Perl 5 in 21 Days, Second Edition.
And other languages….  Array literals/initialization a = [1,2,3] a2 = [-10..0, 0..10] a3 = [[1,2],[3,4]] a4 = [w*h, w, h] a5 = [] empty = Array.new zeros.
Data TypestMyn1 Data Types The type of a variable is not set by the programmer; rather, it is decided at runtime by PHP depending on the context in which.
Computer Programming for Biologists Class 3 Nov 13 th, 2014 Karsten Hokamp
CS 105 Perl: Data Types Nathan Clement 15 May 2014.
Python Primer 1: Types and Operators © 2013 Goodrich, Tamassia, Goldwasser1Python Primer.
Perl Basics. sh-bang !!!! Every perl program starts with a sh-bang line #!/usr/bin/perl # hello.pl printf “Hello, world!\n”; printf STDOUT “Hello, world!\n”;
– Introduction to Perl 12/12/ Introduction to Perl - Searching and Replacing Text Introduction to Perl Session 7 ·
– Introduction to Perl 12/13/ Introduction to Perl - Strings, Truth and Regex Introduction to Perl Session 2 · manipulating.
Introduction to Perl October 4, 2004 Class Meeting 7 * Notes on Perl by Lenwood Heath, Virginia Tech © 2004.
 In computer programming, a loop is a sequence of instruction s that is continually repeated until a certain condition is reached.  PHP Loops :  In.
Introduction to Perl. What is Perl Perl is an interpreted language. This means you run it through an interpreter, not a compiler. Similar to shell script.
Perl Variables: Array Web Programming1. Review: Perl Variables Scalar ► e.g. $var1 = “Mary”; $var2= 1; ► holds number, character, string Array ► e.g.
Copyright © 2000, Department of Systems and Computer Engineering, Carleton University 1 Introduction An array is a collection of identical boxes.
CS 106 Introduction to Computer Science I 03 / 02 / 2007 Instructor: Michael Eckmann.
– Intermediate Perl 2/1/ Intermediate Perl - sort/grep/map Intermediate Perl – Session 3 · map · transforming data ·
An Introduction to Java – Part 1 Erin Hamalainen CS 265 Sec 001 October 20, 2010.
_______________________________________________________________________________________________________________ PHP Bible, 2 nd Edition1  Wiley and the.
BINF 634 Fall LECTURE061 Outline Lab 1 (Quiz 3) Solution Program 2 Scoping Algorithm efficiency Sorting Hashes Review for midterm Quiz 4 Outline.
Arrays and Lists. What is an Array? Arrays are linear data structures whose elements are referenced with subscripts. Just about all programming languages.
Arrays Declaring arrays Passing arrays to functions Searching arrays with linear search Sorting arrays with insertion sort Multidimensional arrays Programming.
Creating PHP Pages Chapter 10 PHP Arrays. Arrays An array can store one or more values in a single variable name. An element of an associative accessed.
Operators. Perl has MANY operators. –Covered in Chapter 3 of Camel –perldoc perlop Many operators have numeric and string version –remember Perl will.
 2008 Pearson Education, Inc. All rights reserved JavaScript: Arrays.
– sort/grep/map in Perl 6/14/ A closer look at sort/grep/map Perl’s sort/grep/map · map · transforming data · sort.
Lists/Dictionaries. What we are covering Data structure basics Lists Dictionaries Json.
Course Contents KIIT UNIVERSITY Sr # Major and Detailed Coverage Area
Working with Collections of Data
Containers and Lists CIS 40 – Introduction to Programming in Python
Miscellaneous Items Loop control, block labels, unless/until, backwards syntax for “if” statements, split, join, substring, length, logical operators,
Lists Part 1 Taken from notes by Dr. Neil Moore & Dr. Debby Keen
Arrays, For loop While loop Do while loop
Perl Variables: Array Web Programming.
Object Oriented Programming in java
Python Primer 1: Types and Operators
Arrays.
Presentation transcript:

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting Introduction to Perl Session 4 · hashes · sorting do this for clarity and conciseness Perlish construct unless you are a donkey, don’t do this

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 2 Recap · array variables are prefixed and are 0-indexed · arrays are used when · you have an ordered set of values, or · you want to group values together, without caring about = = (1..3); $array[0];# first element $array[1];# second element $array[-1];# last element $array[-2];# second-last element $#array;# index of last make a copy of array – list context $length number of elements in array – scalar context $array[$#array];# last element # last element

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 3 Recap · we iterated over an array in two ways · iterate over elements · iterate over index · we saw that arrays grow and shrink as necessary · push/unshift were used to add elements to back/front of array · manipulating $#array directly changed the size of the = (1..10); # iterate over elements for $elem { print qq{element is $elem}; } # iterate over index for $i { print qq{index is $i element is $array[$i]}; }

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 4 Final Variable Type - Hash · recall that Perl variables are preceded by a character that identifies the plurality of the variable · today we will explore the hash variable, prefixed by % · an array is a set of elements indexed by a range of integers [0,1,2,...] · a hash is a set of elements indexed by any set of distinct strings animal scalararrayhash

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 5 Scalars, Arrays and Hashes · scalar holds a single value · “indexed” by variable n-1 apple banana grape pear plum... $fruit apple %fruits red yellow blue green purple apple banana grape pear lemon... scalar array hash · array holds any number of values · elements indexed by integers 0..n-1, where n is the number of elements · elements are stored in order, i.e. there is a sense of previous/next element · hash holds any number of values · values indexed by strings (keys), which must be unique · values are not stored in order, there is no sense of previous/next element key:value index:value

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 6 Declaring and Initializing Hashes · a hash is composed of a set of key/value pairs · an element is accessed using $hash{key} syntax · c.f. $array[$index] · whereas [ ] were used for arrays, { } are used in = ();# empty array %fruits = ();# empty hash $fruits{yellow} = “banana”; $fruits{red} = qq(apple); $fruits{green} = q(pear); ($fruits{purple},$fruits{orange}) = qw(plum mango);

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 7 Declaring and Initializing Hashes · you can declare and initialize an entire hash at once · you do not need to quote single-word keys · hash can be interpreted as an array with even number of elements with element 2i being the key and 2i+1 being the value %fruits = ( yellow => “banana”, red => “apple”, green => “pear” ) ; %fruits = ( yellow => “banana”, red => “apple”, green => “pear” ); notice the ( ) brackets here which are reminiscent of initializing an array do not use { } brackets when initializing a hash you’ll get strange results which we will explore in Intermediate Perl

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 8 Accessing Hash Elements · to fetch a hash value, use $hash{$key} · if you have a list of the keys, you can iterate across the hash · most of the time you won’t have the list of keys and will need to get it from the hash directly – this is where keys comes in print qq(One red fruit is an $fruits{red}); print qq(One green fruit is a = qw(red green purple orange yellow); for $color { print qq($fruits{$color} is $color); }

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 9 Extracting Hash Keys with keys · the keys function returns a list of the keys of the hash · the keys are returned in no particular (but reproducible if the hash is not altered) = keys %fruits; for $color { print qq($fruits{$color} is $color); } # it’s better to avoid a temporary variable that holds the keys for $color (keys %fruits) { print qq($fruits{$color} is $color); }

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 10 An Example – OMG a real script! · let’s create a script that performs the following · creates 1000 random 4bp sequences · stores and prints the number of times each sequence has been seen · returns sequences and counts of all sequences that contain aaa, ccc, ggg or ttt · returns the number of a, c, g and t characters across all = qw(a t g c); # explicitly initialize the list of = (); for ( ) { # set the sequence to an empty string – not necessary $seq = “”; for (1..4) { # add a random base pair $seq = $seq. } $seq; } # in Intermediate Perl you will see how to take the code above and write = map { join("", map { qw(a t g c)[rand(4)] } (1..4)) } ( );

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 11 An Example · we now have our 1,000 random sequences · let’s count how many times each sequence appears · we’re going to use a hash · the key is the sequence · the value is the number of times it is  qw(atgc aatg ggtc... ggtc); %sequence_count = (); for $seq { $sequence_count{$seq} = $sequence_count{$seq} + 1; }

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 12 An Example · to print the number of times each sequence has been seen, iterate through the hash of counts · how many unique sequences were seen? · this is the number of keys in the hash for $seq (keys %sequence_count) { print qq(sequence $seq seen $sequence_count{$seq} times); } sequence acgc seen 3 times sequence ggta seen 2 times sequence aacg seen 3 times sequence gatt seen 6 times... my $unique_sequence_count = keys %sequence_count; print qq(Saw $unique_sequence_count unique sequences);

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 13 An Example · now let’s report on sequences that contain aaa, ttt, ggg or ccc · still iterating across the entire hash · applying regex to key – using alternation via | (i.e. aaa OR ttt OR ccc OR ggg) for $seq (keys %sequence_count) { if ($seq =~ /aaa|ttt|ccc|ggg/) { print qq(3-homo polymer sequence $seq seen $sequence_count{$seq} times); } 3-homo polymer sequence aaag seen 3 times 3-homo polymer sequence gaaa seen 4 times 3-homo polymer sequence aaaa seen 9 times 3-homo polymer sequence accc seen 2 times 3-homo polymer sequence cccc seen 5 times 3-homo polymer sequence tttt seen 4 times... regex “|” is alternation $str =~ /this|that/;

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 14 An Example · finally, let’s count all the base pairs across all sequences %bp_count = (); # method 1 – iterate across sequences, split sequence into list of characters for $seq { for $bp (split(“”,$seq)) { $bp_count{$bp} = $bp_count{$bp} + 1; } # method 2 – iterate across hash, split key, increment by hash value for $seq (keys %sequence_count) { for $bp (split(“”,$seq)) { $bp_count{$bp} = $bp_count{$bp} + $sequence_count{$seq}; } for $bp (keys %bp_count) { print qq(base pair $bp seen $bp_count{$bp} times); } base pair c seen 1053 times base pair a seen 979 times base pair g seen 997 times base pair t seen 971 times split(“”,$string) produces list of individual characters in $string split(“”,”baby”)  qw(b a b y)

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 15 Iterating Across a Hash with values · consider the task of determining the average number of times a sequence appears · we want the sequence counts, but not necessarily the sequences · we don’t care about the key · we care about the value · we can accomplish this by verbosely iterating across with keys and fetching the counts via $sequence_count{$key} · we can be more concise by using values $sum = 0; for $seq (keys %sequence_count) { $sum = $sum + $sequence_count{$seq}; } print “average sequence count is ”,$sum / keys %sequence_count;

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 16 Iterating across a Hash with values · recall that keys produced a list of a hash’s keys · values returns a list of a hash’s values %fruits red yellow blue green purple apple banana grape pear lemon... hash keys %fruits  qw(red yellow blue green purple) values %fruits  qw(apple banana grape pear lemon) key:value

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 17 Iterating across a Hash with values · we’re now in a position to determine the average count · if not, assume position · remember that a hash has no inherent order · when you use keys, generally it is to use the list for iterating over the hash · when you use values, generally it is because you don’t need the keys $sum = 0; for $count (values %sequence_count) { $sum = $sum + $count; } print “average sequence count is ”,$sum / keys %sequence_count; averge sequence count is

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 18 Checking for Existence · given an array, you can easily determine whether a certain index is populated · fetch $#array · elements indexed by 0..$#array exist, though any of them may be undefined (undef) · given a hash, it is frequently desirable to check whether a certain key exists · like with arrays, a key may exist but point to an undefined value (undef) %fruits red yellow blue green apple 0 undef hash if $fruits{red} # value=apple, TRUE if $fruits{yellow} # value=0, FALSE if defined $fruits{yellow} # value=0, 0 is defined  TRUE if $fruits{blue} # value=undef, FALSE if defined $fruits{blue} # value=undef, FALSE if exists $fruits{blue} # value=undef, key exists  TRUE if exists $fruits{green} # no such key, FALSE key:value

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 19 Testing Values with defined vs exists · exists is used on arrays/hashes to check whether an element/key has ever been initialized · an element is true only if it is defined · an element is defined only if it exists · both statements are not necessarily true in the converse · e.g., 0 is defined but is not true · e.g., undef exists, but it is not defined · be conscious of testing values (e.g. counts) which may be zero · are you testing for truth (excludes zero) or definition (includes zero) if $sequence{atgc} # TRUE only if atgc key exists and hash value is TRUE if defined $sequence{atgc} # TRUE if atgc key exists and hash value is defined (e.g. 0) if exists $sequence{atgc} # TRUE if atgc key exists (hash value may be undefined)

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 20 Quick Hash Recap %fruits = (); $fruits{red} = “apple”; $fruits{green} = “pear”; $fruits{yellow} = “lemon; keys %fruits; # qw(red green yellow), but in no particular order values %fruits; # qw(apple pear lemon), but in no particular order # (but compatible with output of keys) for $color (keys %fruits) {... $fruits{$color}... } for $fruit (values %fruits) {... $fruit... } print “no color purple” if ! exists $fruits{purple}; print “found color red” if exists $fruits{red}; print “found red fruit” if defined $fruits{red};

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 21 Sorting · we’ve seen several Perl functions now, such as print, split and join · they each took one or more arguments · Perl’s sort is slightly different · it takes as arguments a function and a list · the list tells sort what to sort · the function tells sort how to sort · what does sorting require? · a set of elements · for a given pair of elements, some method to determine which comes first · e.g. size (numbers) or alphabetical order (characters) or length (strings)

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 22 Sorting - Introduction # by default sort will arrange things ASCIIbetically – good for = for $seq { print $seq; } aaaa aaac aaag aaat aaca aacc...

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 23 Sorting - Introduction # remember – ASCIIbetically! – bad for numbers for $num (sort (1..20)) { print $num; }

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 24 Sorting – Specifying How · to tell sort how to sort, the sort { CODE } LIST paradigm is used · CODE is Perl code that informs sort about the relative ordinality of two elements · the is the spaceship operator · returns relative ordinality of = (1..20); # default sort – asciibetic – not what we = # numerical sort, ascending = sort { $a $b -1 if $a < $b $a $b  0 if $a == $b +1 if $a > $b

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 25 Sort – Specifying How · while is the operator for relative ordinality of numbers, cmp is the corresponding operator for strings # asciibetic sort, ascending = sort { $a cmp $b # { $a cmp $b } is sort’s default behaviour # the above gives the same result = -1 if $a lt $b $a cmp $b  0 if $a eq $b +1 if $a gt $b lt, eq, gt string equivalents of comparisons

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 26 Sort – Specifying Direction · to specify the direction of sort, it is sufficient to exchange the position of the $a and $b variables # numerical sort, ascending = sort { $a $b # numerical sort, descending = sort { $b $a

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 27 Sorting in Place · you can sort in place, without defining temporary variables · sort returns a list, so you can do anything with the output of sort that you can do with a list · what do you think these do? # sort in = # sort and concatenate in place $big_sequence = join(“”, $x = ($y) =

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 28 More Complex Sorting · the CODE passed to sort can be anything you want · remember, it is expected to return -1, 0 or 1 based on the relative ordinality · it can use other information to sort your elements · applying a function to $a and $b during sort is common · sort based on transformed values # recall length() returns the length of a = sort { length($a) length($b) # for some function f() sort { f($a) f($b)

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 29 Shuffling · you can short circuit the sort algorithm by feeding it random results · here relative ordinality is not based on the value of sorted elements, but determined based on two random numbers · since CODE should return -1, 0, 1 all you need is to return one of these values, at random · rand(3) returns a random float in the range [0,3) · int(rand(3)) truncates the decimal, resulting in random value from [0,1,2] · int(rand(3))-1 therefore maps randomly onto [-1,0,1] sort { rand() rand() sort { int( rand(3) ) - 1

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 30 Sorting Based on Hash Values · frequently you want to iterate through an array or hash in an ordered fashion based on array or hash contents · we iterated through the hash using keys, but remember that this was done in no order in particular (hashes aren’t ordered data structures) · recall the %sequence_counts hash · how do we iterate across it from most to least frequently seen sequence? · we want the keys to be sorted based on their associated values · first key points to largest value · second key points to second-largest value, etc # this iteration is in no particular order for $seq (keys %sequence_count) { print qq(sequence $seq seen $sequence_count{$seq} times); }

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 31 Sorting Based on Hash Values # this iteration is from most to least common sequence for $seq (sort { $sequence_count{$b} $sequence_count{$a} } keys %sequence_count) { print qq(sequence $seq seen $sequence_count{$seq} times); } sequence tcca seen 11 times sequence ctcg seen 11 times sequence aagc seen 10 times sequence tatc seen 10 times sequence cgcg seen 10 times sequence ggga seen 8 times sequence cccg seen 8 times sequence tata seen 8 times sequence gagc seen 8 times sequence ccga seen 8 times sequence cttt seen 8 times sequence gtga seen 7 times sequence tgct seen 7 times...

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting 32 Sorting Based on Array Values · consider an array of 10 random numbers for (1..10) { rand() } # iterate across the index of the array in the order it was created for $i ( ) { print qq(index $i value $random_numbers[$i]); } # sort across the index based on array values – ascending, numerical order for $i ( sort { $random_numbers[$a] $random_numbers[$b] } ) { print qq(index $i value $random_numbers[$i]); } index 4 value index 3 value index 9 value index 1 value index 2 value index 6 value index 7 value index 8 value index 0 value index 5 value index 0 value index 1 value index 2 value index 3 value index 4 value index 5 value index 6 value index 7 value index 8 value index 9 value (A) (B) (A) (B)

– Introduction to Perl 10/24/ Introduction to Perl - Hashes and Sorting Introduction to Perl Session 4 · you now know · all about hashes · declaring and initializing a hash · iterating across keys and values of a hash · checking for existence of a key · checking for definition of a value · numerical and asciibetical sorting · changing sort order · random shuffling · sorting based on complex conditions (and that's a lot!)