
1 Topic 5: Hashes
CSE2395/CSE3395 Perl Programming
Learning Perl 3rd edition chapter 5, pages 73-85
Programming Perl 3rd edition pages 76-78, 697-700, 703-704, 733-734
perldata manpage

2 In this topic
Original Slides by Debbie Pickett, Modified by David Abramson, 2006, Copyright Monash University
• Hashes
  ► aka associative arrays
• Hash variables
• Functions which use hashes
• Uses of hashes
• Accessing Perl’s environment

3 Arrays
• Arrays are
  ► ordered
  ► indexed by a number (integer)
  ► dense
    – if element n exists, so do elements 0 to n-1

  indices:  0    1      2     3      4    5
  @array:   42   "dog"  -0.2  undef  42   0

4 Arrays
• Arrays aren’t always the best data structure
• Imagine an array of students’ marks
  ► indexed by 8-digit student ID number

  indices:  0  ...  12345678  12345679  12345680  12345681
  @marks:      ...  89        43        undef     70
  (Ten million empty elements in here!)

5 Arrays
• Student ID numbers aren’t really numbers anyway
  ► can’t do arithmetic on them
  ► order of two student IDs not really important
  ► really just strings that happen to contain digits
• Want some data structure where indices are strings
  ► usually called associative arrays
    – or dictionary
    – or (lookup) table
    – or hash table

6 Associative arrays
• Associative array is an array where
  ► can locate an array element’s value given index
  ► indices are strings
  ► indices are unique
  ► indices are unordered
• For example, to look up capital cities of countries

  index:  Peru  Japan  UK      Russia  Canada  Egypt
  value:  Lima  Tokyo  London  Moscow  Ottawa  Cairo

In Perl, associative arrays are called “hashes” (because they’re implemented using hash tables)

7 Hashes in Perl
• Indices called keys
  ► strings
  ► must be unique
  ► e.g., country names
• Contents called values
  ► any scalar
  ► may be duplicated
  ► e.g., capital city names
• Can look up value given key, but not vice versa
  ► What’s the capital of Egypt? (easy)
  ► What country is Monrovia the capital of? (hard)
• Unordered
  ► You can’t sort a hash!
  ► Perl stores elements in an order optimized for fast lookup
Llama3 pages 73-74; Camel3 pages 51, 76-77; perldata manpage
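The "easy" and "hard" lookups above can be sketched as follows. This is a minimal illustration, not part of the original slides; the three-country hash and the names $country and %country_of are assumptions made for the example. The inverted hash only works when every value is unique.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo", Egypt => "Cairo");

# Forward lookup (easy): one hash access.
my $city = $capital{"Egypt"};

# Reverse lookup (hard): scan every key until the value matches.
my $country;
foreach my $nation (keys %capital) {
    if ($capital{$nation} eq "Tokyo") {
        $country = $nation;
        last;
    }
}

# Alternative: build an inverted hash once.
# Only safe if no two keys share the same value.
my %country_of = reverse %capital;

print "The capital of Egypt is $city\n";
print "Tokyo is the capital of $country\n";
print "Lima is the capital of $country_of{'Lima'}\n";
```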

8 Hash elements
• Hash key written inside { curly braces }
  ► contrast with normal arrays using [ square brackets ]
  ► $capital{"Egypt"} # Equal to "Cairo"
  ► $capital{$nation} # Depends on $nation
• Can assign to a hash element
  ► overwrites the old value, if there was one
    – or creates a new element, if there wasn’t
  ► doesn’t change any other element
  ► $capital{"Australia"} = "Canberra";
• Using nonexistent key returns undef
  ► $capital{"Atlantis"} # No such country
Llama3 pages 76-78; Camel3 page 67
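A short sketch of the element operations listed on this slide. The starting hash contents and the temporary wrong value "Sydney" are assumptions chosen only to show an overwrite.

```perl
use strict;
use warnings;

my %capital = (Egypt => "Cairo");
my $nation  = "Egypt";

# Look up by literal key and by a key held in a variable.
my $by_literal  = $capital{"Egypt"};
my $by_variable = $capital{$nation};

# Assigning to a new key creates the element.
$capital{"Australia"} = "Sydney";      # Oops, wrong city.

# Assigning to an existing key overwrites only that element.
$capital{"Australia"} = "Canberra";

# Reading a nonexistent key returns undef and does not create it.
my $missing = $capital{"Atlantis"};

print "Egypt: $by_literal, Australia: $capital{'Australia'}\n";
print "Atlantis is ", (defined $missing ? "defined" : "undef"), "\n";
```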

9 Testing hash elements
• Can determine if hash key exists using exists function
  ► exists $capital{"Canada"} # True
  ► exists $capital{"Atlantis"} # False
• Not same as using defined
  ► key can exist, but value can be undefined
  ► exists $capital{"Vatican City"} # True
  ► defined $capital{"Vatican City"} # False
Llama3 page 83; Camel3 pages 697-698, 710-711; perlfunc manpage
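The difference between exists and defined can be seen side by side in a small sketch. It assumes, as the slide's comments imply, that the "Vatican City" key was stored with an undefined value.

```perl
use strict;
use warnings;

# "Vatican City" is present as a key, but its value is undef.
my %capital = (Canada => "Ottawa", "Vatican City" => undef);

my @report;
foreach my $nation ("Canada", "Vatican City", "Atlantis") {
    my $exists  = exists  $capital{$nation} ? "yes" : "no";
    my $defined = defined $capital{$nation} ? "yes" : "no";
    push @report, "$nation: exists=$exists defined=$defined";
}

print "$_\n" foreach @report;
```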

10 Deleting hash elements
• To remove an entry from a hash, use delete function
  ► delete $capital{"Czechoslovakia"};
  ► exists will now return false for that key
• To clear a hash, assign empty list to entire hash
  ► %capital = (); # World anarchy
Llama3 pages 76-77, 83-84; Camel3 pages 699-700; perlfunc manpage
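A minimal sketch of delete and of clearing a whole hash. The two-entry hash is an assumption for the example; it also relies on two standard behaviours not stated on the slide: delete returns the removed value, and keys in scalar context gives the number of entries.

```perl
use strict;
use warnings;

my %capital = (Czechoslovakia => "Prague", Peru => "Lima");

# delete removes the key and returns the value it held.
my $removed     = delete $capital{"Czechoslovakia"};
my $still_there = exists $capital{"Czechoslovakia"} ? 1 : 0;

# keys in scalar context gives the number of entries.
my $count_before_clear = keys %capital;

# Assigning the empty list empties the hash.
%capital = ();
my $count_after_clear = keys %capital;

print "Removed $removed; $count_before_clear left, then $count_after_clear\n";
```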

11 Entire hashes
• To refer to an entire hash, use %hash
  ► % instead of $
  ► no curly braces
• Can copy hashes
  ► %clone = %hash;
• Can initialize hash with many elements by assigning list to it
  ► for each element, write key followed by value
  ► order of key/value pairs not important
  ► %capital = ("Peru", "Lima", "Japan", "Tokyo", "UK", "London", "Russia", "Moscow", "Canada", "Ottawa", "Egypt", "Cairo");
• Hashes flatten back into lists when used in list context
  ► e.g., when passed to a subroutine
Llama3 pages 78-79; Camel3 pages 76-78
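Copying and flattening can be sketched as below. The helper count_args is hypothetical and exists only to show that a subroutine receives a hash as a flat key/value list.

```perl
use strict;
use warnings;

# Initialize from a list: key, value, key, value, ...
my %capital = ("Peru", "Lima", "Japan", "Tokyo");

# Copying makes an independent hash.
my %clone = %capital;
$clone{"UK"} = "London";               # Does not affect %capital.

# Hypothetical helper: reports how many arguments it was given.
sub count_args { return scalar @_; }

# In list context the hash flattens to key/value pairs.
my $flattened = count_args(%capital);  # Two pairs, so four arguments.
my @pairs     = %capital;

print "Sub received $flattened arguments\n";
```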

12 Hash elements
• Hashes, subroutines, arrays and scalars occupy different namespaces
  ► %x, $x{... } refer to hash %x
  ► @x, $x[... ] refer to array @x
  ► &x, x(... ) refer to subroutine &x
  ► $x refers to scalar $x
• Hash elements interpolate into double-quoted strings
  ► print "The capital of $nation is $capital{$nation}\n";
• Entire hashes don’t interpolate at all.
  ► print "%capital"; # Prints "%capital"
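The separate namespaces can be demonstrated with four unrelated things that share one name. The name item is an assumption used here in place of the slide's x.

```perl
use strict;
use warnings;

# Four different variables that happen to share the name "item".
my $item = "scalar";
my @item = ("array element");
my %item = (key => "hash element");
sub item { return "subroutine" }

# Scalars, array elements and hash elements all interpolate.
my $line = "$item / $item[0] / $item{key} / " . &item();

# An entire hash does not interpolate: the string stays as written.
my $literal = "%item";

print "$line\n";
print "$literal\n";
```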

13 Functions that use hashes
• How do you print out the contents of a hash?
  ► need to know what keys a hash has
    – from each key, can get value with $hash{key}
• keys function returns a list of all keys in a hash
  ► order is indeterminate, but the same on every call as long as the hash is not modified (from Perl 5.18 it may differ between runs of the program)
  ► every key is unique
    – by definition of hash
  ► keys %capital # Returns list ("Canada", "UK", "Egypt", "Japan", "Peru", "Russia") (maybe)
• values function returns a list of all values in a hash
  ► order is same as from keys function
  ► values may be duplicated
    – values may be any scalar
  ► values %capital # Returns list ("Ottawa", "London", "Cairo", "Tokyo", "Lima", "Moscow")
Llama3 pages 80-81; Camel3 pages 733-734, 824; perlfunc manpage
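The claim that values comes back in the same order as keys can be checked directly. This is a small sketch with an assumed three-entry hash; $consistent is a name introduced for the example.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo", UK => "London");

my @nations = keys   %capital;
my @cities  = values %capital;

# Position i of @cities should hold the value for position i of @nations.
my $consistent = 1;
foreach my $i (0 .. $#nations) {
    $consistent = 0 unless $capital{$nations[$i]} eq $cities[$i];
}

# In scalar context, keys gives the number of entries.
my $count = keys %capital;

print "keys and values line up\n" if $consistent;
print "The hash has $count entries\n";
```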

14 Timeout

    # Printing an entire hash using keys function.
    # Initialize the hash.
    # The => notation is just a pretty-looking
    # synonym for the , (comma) operator that also quotes
    # the word on the left side. Great for hashes.
    %capital = (Peru => "Lima", Japan => "Tokyo",
                UK => "London", Russia => "Moscow",
                Canada => "Ottawa", Egypt => "Cairo");
    # Iterate over the hash, once per nation.
    # Order is indeterminate.
    foreach $nation (keys %capital)
    {
        print "Capital of $nation is $capital{$nation}\n";
    }

15 Timeout

    # Printing an entire hash, sorted by country.
    # Initialize the hash.
    %capital = (Peru => "Lima", Japan => "Tokyo",
                UK => "London", Russia => "Moscow",
                Canada => "Ottawa", Egypt => "Cairo");
    # Iterate over the hash, once per nation.
    # Note that this isn't sorting the hash,
    # nor even iterating over the hash, but
    # iterating over a sorted list of the hash's keys.
    foreach $nation (sort keys %capital)
    {
        print "Capital of $nation is $capital{$nation}\n";
    }

16 Functions that use hashes
• keys may return a very large list
  ► perhaps inefficient if you need only one hash element at a time
• each function iterates over a hash
  ► one element at a time
  ► on first call, returns a two-element list containing one key/value pair
  ► subsequent calls return other key/value pairs
    – order indeterminate, but guaranteed not to repeat any pairs
  ► when all key/value pairs have been returned once, returns empty list
  ► state is kept by Perl with hidden attribute on hash variable
  ► much more space-efficient than using keys
  ► typical use
    – while (($key, $value) = each %hash) {... }
Llama3 pages 81-82; Camel3 pages 703-704; perlfunc manpage

17 Timeout

    # Printing an entire hash, using each function.
    # Initialize the hash.
    %capital = (Peru => "Lima", Japan => "Tokyo",
                UK => "London", Russia => "Moscow",
                Canada => "Ottawa", Egypt => "Cairo");
    # Iterate over the hash, once per nation.
    # No provision for sorting the output here,
    # because order returned by each function
    # is indeterminate.
    while (($nation, $city) = each %capital)
    {
        print "Capital of $nation is $city\n";
    }

18 Uses of hashes
• Hashes useful for
  ► implementing sparse arrays
  ► implementing lookup tables/databases
  ► counting strings
  ► removing duplicates from a list
  ► passing named parameters to subroutines
Llama3 pages 75-76

19 Hashes: sparse arrays
• Normal arrays are dense
  ► creating $a[10000] creates @a[0..9999] too.
• Hash keys are independent
  ► creating $h{"10000"} creates no other elements
    – only elements that exist need to take up memory
  ► just have to pretend that keys (really strings) are integers
    – like student ID numbers
  ► may have to write some code to fake “order” of elements
    – foreach $element (sort {$a <=> $b} keys %h)
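A sketch of a hash used as a sparse array of marks, showing why the numeric comparison is needed when faking order. The first two IDs and marks come from the earlier @marks slide; the 7-digit ID 9999999 is a hypothetical entry added to make the two sort orders differ.

```perl
use strict;
use warnings;

# Only three elements are stored, however large the "indices" are.
my %marks;
$marks{"12345678"} = 89;
$marks{"12345681"} = 70;
$marks{"9999999"}  = 55;    # Hypothetical 7-digit ID.

# Numeric sort treats the keys as integers.
my @by_number = sort { $a <=> $b } keys %marks;

# Default sort compares the keys as strings, so "9999999" comes last.
my @by_string = sort keys %marks;

print "Numeric order: @by_number\n";
print "String order:  @by_string\n";
```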

20 Hashes: lookup table
• Using hash, can look up string (value) given string (key)
  ► look up the capital of a country
    – capital of Malaysia is Kuala Lumpur
  ► look up a word in a dictionary
    – definition of dog is “domestic canine”
  ► look up the IP address of machine
    – slashdot.org’s IP address is 66.35.250.150
  ► look up the value of a variable in an interpreter
    – value of variable x is 5
  ► look up the title of a book
    – book with ISBN 0-596-00027-8 is “Programming Perl”
  ► look up the real name of a student
    – student 11111111 is Bart Simpson
• Any relationship where each key maps to exactly one value (one-to-one or many-to-one) is perfect for a hash

21 Timeout

    # Using the program's environment
    # All processes have a set of names and values which
    # they inherit from their parents. These can be
    # set in the shell by typing NAME=VALUE.
    print "Your home directory is $ENV{'HOME'}\n";
    if ($ENV{'SHELL'} eq "/bin/csh")
    {
        # Commiserate with user.
        print "Your shell is csh. Yuck!\n";
    }
    print "Commands are looked for in these dirs:\n";
    print " $_\n" foreach (split /:/, $ENV{'PATH'});    # split: Topic 7

22 Hashes: counting strings
• Use hash to count frequency of strings
  ► key is the string (“dog”)
  ► value (integer) is the count (has been seen 3 times so far)
  ► increment the value every time a key is read
• Can be used to find intersection (common elements) between two arrays
  ► iterate over first array: count elements found
  ► iterate over second array: include element in result only if it was seen in the first array
  ► can compute union and difference similarly

23 Timeout

    # Counting strings.
    %seen = ();    # Nothing has been seen so far.
    while (<>)     # Read lines from input.
    {
        chomp;
        # Increment the counter with line's text as key.
        $seen{$_}++;
        print "$_ has been seen $seen{$_} times so far\n";
    }
    # Final report.
    while (($line, $count) = each %seen)
    {
        print "$line was seen $count times overall\n";
    }

24 Timeout

    # Intersection of two arrays.
    %seen = ();
    @intersection = ();
    foreach (@one)    # Iterate through first array.
    {
        # Remember which elements have been seen.
        $seen{$_} = 1;    # Any true value will do.
    }
    foreach (@two)    # Now iterate through second array.
    {
        # Only add to result if was seen in @one.
        push @intersection, $_ if $seen{$_};
    }
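The earlier slide notes that union and difference can be computed similarly. One possible sketch follows; the sample arrays and the names %in_one, %in_two, @union and @difference are assumptions for illustration, and this is only one of several ways to do it.

```perl
use strict;
use warnings;

my @one = qw(apple banana cherry);
my @two = qw(banana cherry date);

# Record membership of each array in its own hash.
my %in_one = map { $_ => 1 } @one;
my %in_two = map { $_ => 1 } @two;

# Union: merge both hashes; duplicate keys collapse automatically.
my %union = (%in_one, %in_two);
my @union = sort keys %union;

# Difference: elements of @one that were never seen in @two.
my @difference = grep { !$in_two{$_} } @one;

print "Union:      @union\n";
print "Difference: @difference\n";
```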

25 Hashes: removing duplicates
• An extension of counting elements in a list
  ► if this is the first time element seen, include in result
  ► otherwise, skip this element
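The two rules above can be written out as an explicit loop. This is a minimal sketch with an assumed word list; it keeps the first occurrence of each element and preserves the original order.

```perl
use strict;
use warnings;

my @words = qw(dog cat dog bird cat dog);

my %seen;
my @unique;
foreach my $word (@words) {
    # The post-increment returns the old count: 0 (false) the first time.
    next if $seen{$word}++;
    push @unique, $word;    # First time this element was seen.
}

print "Unique words: @unique\n";
print "dog appeared $seen{dog} times\n";
```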

26 Timeout

    # Simple implementation of Unix sort and sort -u
    # Was -u (unique) switch given?
    if (@ARGV and $ARGV[0] eq "-u")
    {
        $unique = 1;
        shift;    # Remove -u argument.
    }
    # Read all input lines and sort them.
    @result = sort <>;
    if ($unique)
    {
        # Filter out anything already seen.
        @result = grep { !$seen{$_}++ } @result;
    }
    print @result;    # Output remaining lines.

27 Hashes: named parameters
• Calling subroutines with many parameters is messy
  ► printformatted(56, '$', 8, 2, "decimal");
    – what did the 8 mean again?
  ► especially when some parameters are optional and have a reasonable default anyway
• Can use hash to identify optional parameters and give them values
  ► printformatted(56, prefix => '$', format => "decimal", precision => 8, places => 2);
    – self-documenting code
    – order of parameters no longer matters
  ► printformatted(56, format => "hex");
    – only need to name the parameters with non-default values
  ► subroutines require a little code to handle this

28 Timeout

    # Map formats to printf percent-things.
    %format = (decimal => "d", hex => "x", octal => "o");
    # Print a number with a certain format.
    sub printformatted
    {
        my $number = shift;    # Value to print.
        my %param = (
            prefix    => "",           # Defaults.
            format    => "decimal",
            precision => "6",
            places    => "0",
            @_                         # Rest of sub params.
        );
        printf(    # Build up printf format string.
            ($param{"prefix"} . "%" . $param{"precision"} . "." .
             $param{"places"} . $format{$param{"format"}}),
            $number);
    }
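To see how caller-supplied pairs override the defaults, here is a sketch of a variant that returns the string instead of printing it. The subroutine name formatted and the default of 1 for places are assumptions made for this example, not the slide's code.

```perl
use strict;
use warnings;

my %format = (decimal => "d", hex => "x", octal => "o");

# Hypothetical variant of the slide's subroutine that returns the string.
sub formatted {
    my $number = shift;
    my %param = (
        prefix    => "",          # Defaults come first ...
        format    => "decimal",
        precision => 6,
        places    => 1,
        @_                        # ... so the caller's pairs override them.
    );
    my $template = $param{prefix} . "%" . $param{precision} . "."
                 . $param{places} . $format{ $param{format} };
    return sprintf($template, $number);
}

my $default = formatted(56);                                  # Template "%6.1d"
my $hex     = formatted(255, format => "hex", precision => 4); # Template "%4.1x"
my $money   = formatted(56, prefix => '$', precision => 8, places => 2);

print "[$default] [$hex] [$money]\n";
```

Because later pairs in a list assignment replace earlier ones with the same key, putting @_ last is what lets the caller's values win.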

29 Covered in this topic
• Hashes
• Hash variables
  ► $hash{key}, %hash
• Functions which use hashes
  ► keys, values
  ► each
• Uses of hashes
  ► data lookup
  ► sparse arrays
  ► counting elements in a list
  ► removing duplicates from a list
  ► accessing a process’ environment
  ► subroutines with optional parameters

30 Going further
• Tying
  ► treat an external file (or any other object) like an internal hash (or any other type)
  ► Camel3 pages 363-398
• Databases
  ► talking to databases with Perl
  ► Programming the Perl DBI by Alligator Descartes and Tim Bunce, O’Reilly 2000
• Shells
  ► the Unix command-line interface
  ► man sh

31 Next topic
• Regular expressions
  ► pattern matching
Llama3 chapters 7-9, pages 98-127
Camel3 pages 139-195
perlre manpage

