1 8ex.1 References and complex data structures

2 8ex.2 Hash: an associative array

An associative array (or simply a hash) is an unordered set of key=>value pairs. Each key is associated with a value. A hash variable name always starts with a "%":

    my %h = ("a"=>5, "bob"=>"zzz", 50=>"Johnny");

You can access a value by its key:

    print $h{50}.$h{"a"};    # prints: Johnny5
    $h{"bob"} = "aaa";       # modifying an existing value
    $h{555} = "z";           # adding a new key-value pair
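The hash operations above can be tried as one small script. This is a minimal sketch that reuses the slide's %h; the exists check and the key count are additions for illustration and are not part of the slide.

```perl
use strict;
use warnings;

# Same hash as on the slide.
my %h = ("a" => 5, "bob" => "zzz", 50 => "Johnny");

# Look up two values by key and concatenate them.
my $joined = $h{50} . $h{"a"};

$h{"bob"} = "aaa";   # modify an existing value
$h{555}   = "z";     # add a new key-value pair

# exists tests for a key without creating it (addition, not on the slide).
my $has_bob = exists $h{"bob"} ? 1 : 0;
my $count   = scalar keys %h;    # number of key-value pairs

print "$joined $h{bob} $has_bob $count\n";
```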

3 8ex.3 Iterating over hash elements

To iterate over the keys in %h:

    foreach my $key (keys(%h)) ...

For example:

    foreach my $key (keys(%h)) {
        print "The key is $key\n";
        print "The value is $h{$key}\n";
    }

The elements are returned in an arbitrary order, so if you want a certain order use sort:

    foreach my $key (sort(keys(%h))) ...
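As a rough illustration of sorted iteration, the sketch below collects one line per key in sorted order. It assumes the %h from the previous slide; the @lines array is a hypothetical helper introduced only for this example.

```perl
use strict;
use warnings;

my %h = ("a" => 5, "bob" => "zzz", 50 => "Johnny");

# sort without a comparison block compares keys as strings,
# so the key "50" comes before "a" and "bob".
my @lines;
foreach my $key (sort keys %h) {
    push @lines, "$key => $h{$key}";
}

print "$_\n" for @lines;
```

For numeric keys, a comparison block such as sort { $a <=> $b } would give numeric order instead.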

4 8ex.4 Why do we need complex data structures?

So far, we know two types of data structures:

An array is an ordered list of scalar values:

    my @names = ("Shmuel", "Moti", "Rahel");

A hash is an unordered set of pairs of scalar values:

    my %phoneBook = ("Shmuel"=>5820, "Moti"=>2745);

However, in many situations we need to store more complex data records. For example, how do we keep the phone number, address and list of grades for each student in a course? We would like a data record that looks like this:

    "Shmuel" => (5820, "34 HaShalom St.", (85,91,67))

Written like this, Perl would flatten the nested lists into one flat list. For this to work we are going to need references.

5 8ex.5 Variable types in Perl

[Diagram: the three variable types. Scalar: $number (-3.54), $string ("hi\n"), $reference (0x225d14). Array: @array. Hash: %hash, drawn as key => value pairs.]

6 8ex.6 References

A reference to a variable is a scalar value that "points" to the variable:

    $nameRef = \$name;
    @grades = (85,91,67);
    $gradesRef = \@grades;
    $phoneBookRef = \%phoneBook;

[Diagram: $nameRef points to $name, $gradesRef points to @grades, and $phoneBookRef points to %phoneBook.]

7 8ex.7 References

A reference to a variable is a scalar value that "points" to the variable:

    $nameRef = \$name;
    @grades = (85,91,67);
    $gradesRef = \@grades;
    $phoneBookRef = \%phoneBook;

We can also make a reference to an anonymous array or hash, without first creating a named variable: [ITEMS] creates a new, anonymous array and returns a reference to it; {ITEMS} does the same for a hash:

    $arrayRef = [85,91,67];
    $hashRef = {85=>4,91=>3};

(The array and the hash created this way have no variable name; they can be reached only through the reference.)

[Diagram: $gradesRef points to the named array @grades; $arrayRef points to an unnamed array.]
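One way to check what an anonymous constructor returned is Perl's built-in ref function, which is not mentioned on the slide. A minimal sketch, with $array_kind and $hash_kind as hypothetical names:

```perl
use strict;
use warnings;

my $arrayRef = [85, 91, 67];          # reference to an anonymous array
my $hashRef  = {85 => 4, 91 => 3};    # reference to an anonymous hash

# ref() returns a string naming the kind of thing a reference points to.
my $array_kind = ref($arrayRef);
my $hash_kind  = ref($hashRef);

print "$array_kind $hash_kind\n";
```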

8 8ex.8 De-referencing

    $nameRef = \$name;
    $gradesRef = \@grades;
    $phoneBookRef = \%phoneBook;

    print $gradesRef;            # prints: ARRAY(0x225d14)

To access the data from a reference we need to dereference it:

    print $$nameRef;             # prints: Yossi
    print "@$gradesRef";         # prints: 85 91 67
    $$gradesRef[3] = 100;
    print "@grades";             # prints: 85 91 67 100
    $phoneNumber = $$phoneBookRef{"Yossi"};

100 was added to the original array @grades!
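The point that writing through a reference changes the original variable can be seen in a self-contained script. This sketch assumes sample values: the name and grades follow the slide, while the phone number 3744 is borrowed from a later slide.

```perl
use strict;
use warnings;

my $name      = "Yossi";
my @grades    = (85, 91, 67);
my %phoneBook = ("Yossi" => 3744);   # assumed sample entry

my $nameRef      = \$name;
my $gradesRef    = \@grades;
my $phoneBookRef = \%phoneBook;

# Writing through the reference changes @grades itself.
$$gradesRef[3] = 100;

my $all_grades  = "@grades";
my $phoneNumber = $$phoneBookRef{"Yossi"};

print ${$nameRef}, ": $all_grades, phone $phoneNumber\n";
```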

9 8ex.9 De-referencing

    $gradesRef = \@grades;
    $phoneBookRef = \%phoneBook;

    print "@$gradesRef";         # prints: 85 91 67
    $$gradesRef[3] = 100;
    $phoneNumber = $$phoneBookRef{"Yossi"};

The following notation is equivalent, and is sometimes more readable:

    $gradesRef->[3] = 100;
    $phoneNumber = $phoneBookRef->{"Yossi"};
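To confirm that the two notations address the same data, the sketch below compares them directly. The sample phone book entry and the variables $same_element, $same_value and $last_index are assumptions made for the example.

```perl
use strict;
use warnings;

my @grades       = (85, 91, 67);
my %phoneBook    = ("Yossi" => 3744);   # assumed sample entry
my $gradesRef    = \@grades;
my $phoneBookRef = \%phoneBook;

# Both spellings read the same element.
my $same_element = ($$gradesRef[1] == $gradesRef->[1]) ? 1 : 0;
my $same_value   = ($$phoneBookRef{"Yossi"} == $phoneBookRef->{"Yossi"}) ? 1 : 0;

# Assigning with the arrow form extends the original array.
$gradesRef->[3] = 100;
my $last_index = $#grades;

print "$same_element $same_value $last_index\n";
```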

10 8ex.10 References allow complex structures

Because a reference is a scalar value, we can store a reference to an array or hash as an element in another array or hash:

    @grades = (85,91,67);
    %students = ("Yossi" => \@grades);
    $students{"Yossi"} = \@grades;
    $students{"Shmuel"} = [83,76];

Now the key "Yossi" is paired with a reference value:

    print $students{"Yossi"};            # prints: ARRAY(0x22e714)
    print "@{$students{'Yossi'}}";       # prints: 85 91 67
    print ${$students{"Yossi"}}[1];      # prints: 91
    print $students{"Yossi"}->[1];       # prints: 91

The last form (with the arrow) is more readable, and we strongly recommend it.

Structure:

    %students
        NAME => [GRADES]
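A hash of arrays like this can also be extended after it is built. The following sketch assumes the slide's data; the pushed grade 90 is an arbitrary value chosen for the example.

```perl
use strict;
use warnings;

my @grades   = (85, 91, 67);
my %students = ("Yossi" => \@grades);
$students{"Shmuel"} = [83, 76];

# Read one element and count the elements of a stored array.
my $second   = $students{"Yossi"}->[1];
my $how_many = scalar @{ $students{"Shmuel"} };

# push works on the array itself, so dereference with @{ ... }.
push @{ $students{"Shmuel"} }, 90;    # 90 is an arbitrary grade
my $shmuel = join(" ", @{ $students{"Shmuel"} });

print "$second $how_many $shmuel\n";
```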

11 8ex.11 References allow complex structures

Now we can do it: "how to keep the phone number, address and list of grades for each student in a course?"

    $students{"Yossi"} = {"phone"=>3744,
                          "address"=>"34 HaShalom St.",
                          "grades"=>[93,72,87]};
    $students{"Rahel"} = {"phone"=>5732,
                          "address"=>"5 Bazel St.",
                          "grades"=>[91,86,88]};

Structure:

    %students
        NAME => {"phone"   => PHONE
                 "address" => ADDRESS
                 "grades"  => [GRADES]}

12 8ex.12 References allow complex structures

Now we can do it: "how to keep the phone number, address and list of grades for each student in a course?"

    $students{"Yossi"} = {"phone"=>3744,
                          "address"=>"34 HaShalom St.",
                          "grades"=>[93,72,87]};

    print $students{"Yossi"}->{"grades"}->[2];    # prints: 87

It is more convenient to use a shorthand notation:

    print $students{"Yossi"}{"grades"}[2];

But remember that there are references in there!

Structure:

    %students
        NAME => {"phone"   => PHONE
                 "address" => ADDRESS
                 "grades"  => [GRADES]}
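The long and shorthand notations can be compared on the same record. This sketch uses the slide's data for "Yossi"; the average calculation is an added illustration of dereferencing the inner grades array, not something the slide does.

```perl
use strict;
use warnings;

my %students;
$students{"Yossi"} = {
    "phone"   => 3744,
    "address" => "34 HaShalom St.",
    "grades"  => [93, 72, 87],
};

# Explicit arrows and the shorthand reach the same element.
my $long  = $students{"Yossi"}->{"grades"}->[2];
my $short = $students{"Yossi"}{"grades"}[2];

# Dereference the inner array to work on the whole list of grades.
my @g   = @{ $students{"Yossi"}{"grades"} };
my $sum = 0;
$sum += $_ for @g;
my $average = $sum / @g;    # an array in numeric context gives its length

print "$long $short $average\n";
```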

13 8ex.13 References allow complex structures

The following code is an example of iterating over two levels of the structure: the top hash (each student) and the internal arrays (lists of grades):

    foreach my $name (keys(%students)) {
        foreach my $grade (@{$students{$name}->{"grades"}}) {
            print $grade;
        }
    }

Structure:

    %students
        NAME => {"phone"   => PHONE
                 "address" => ADDRESS
                 "grades"  => [GRADES]}
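A runnable variant of the two-level loop is sketched below, using the two student records from the earlier slides. Sorting the names and collecting the output in @report are choices made here so the result is predictable; they are not part of the slide's code.

```perl
use strict;
use warnings;

my %students = (
    "Yossi" => {"phone" => 3744, "address" => "34 HaShalom St.", "grades" => [93, 72, 87]},
    "Rahel" => {"phone" => 5732, "address" => "5 Bazel St.",     "grades" => [91, 86, 88]},
);

my @report;
# Outer loop: one pass per student, in sorted name order.
foreach my $name (sort keys %students) {
    my $line = "${name}:";
    # Inner loop: dereference the grades array of this student.
    foreach my $grade (@{ $students{$name}->{"grades"} }) {
        $line .= " $grade";
    }
    push @report, $line;
}

print "$_\n" for @report;
```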

14 8ex.14 The REUSED_ADDRESS problem

When building a complex data structure in a loop, you may run into a problem if you insert a named (non-anonymous) array or hash into the data structure:

    my ($line, $id, @grades, %students);
    while ($line = <...>) {
        ...
        @grades = ...
        $students{$id} = \@grades;
    }

Let's see what happens when we enter the lines:

    a 86 73 89
    b 79 90 87
    c 100 90 93

15 8ex.15 The REUSED_ADDRESS problem

The debugger will show you that there is a problem:

[Figure: debugger output.]

16 8ex.16 The REUSED_ADDRESS problem

The problem is that for every student we store a reference to the same array. We have to create a new array in every iteration:

1. By using an anonymous array reference:

    $students{$id} = {GRADES=>[...],...

2. Or by declaring the array (with my) inside the loop, so that a new one is created in every iteration:

    while ($line = <...>) {
        my @grades = ...
        $students{$id} = \@grades;
    }

(You may have this problem with the multiple #RP fields in ex5.5)
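The problem and the second fix can be reproduced side by side. In this sketch the three input lines come from the slide, but reading them from an in-memory list and parsing them with split are assumptions, since the slides elide the input handling; %shared, %fixed and @fresh are hypothetical names.

```perl
use strict;
use warnings;

# Input lines from the slide, held in a list instead of read from a file.
my @input = ("a 86 73 89", "b 79 90 87", "c 100 90 93");

# Buggy version: one @grades is declared outside the loop and reused.
my (%shared, @grades);
foreach my $line (@input) {
    my $id;
    ($id, @grades) = split ' ', $line;
    $shared{$id} = \@grades;         # every key gets the same address
}

# Fixed version: "my" inside the loop creates a new array each time.
my %fixed;
foreach my $line (@input) {
    my ($id, @fresh) = split ' ', $line;
    $fixed{$id} = \@fresh;
}

print "shared a: @{$shared{a}}\n";   # shows the grades of the last line read
print "fixed  a: @{$fixed{a}}\n";    # shows the grades of line a
```

In the buggy version all three keys hold the same reference, so each of them shows the grades of the last line that was read.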

17 8ex.17 Class exercise 10

1. Read the adenovirus genome file and build a hash of genes, where the key is the "product" name. For each gene store a hash with the protein ID. Print all keys (names) in the hash.

    %genes
        PRODUCT => {"protein_id" => PROTEIN_ID}

2. Add to the hash the strand of the gene on the genome: "+" for the sense strand and "-" for the antisense strand. Print all antisense genes.

    %genes
        PRODUCT => {"protein_id" => PROTEIN_ID
                    "strand"     => STRAND}

3. Add to the hash an array of two coordinates: the start and end of the CDS. Print genes shorter than 500bp.

    %genes
        PRODUCT => {"protein_id" => PROTEIN_ID
                    "strand"     => STRAND
                    "CDS"        => [START, END]}

4. Print the product name of all genes on the sense strand whose CDS spans more than 1kbp, and all genes on the antisense strand whose CDS spans less than 500bp.

18 8ex.18 Two dimensional arrays

Now we can also create a 2-dimensional array (a table or a matrix):

    @table = ([1,2,3],[4,5,6],[7,8,9]);

    print $table[1]->[0];    # prints: 4

Or:

    print $table[1][0];      # prints: 4

[Diagram: @table drawn as a 3x3 grid]

    1 2 3
    4 5 6
    7 8 9
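Since each row of the table is an array reference, rows and columns can be walked with the same dereferencing rules. A small sketch using the slide's @table; the @rows and @column variables are hypothetical helpers for the example.

```perl
use strict;
use warnings;

my @table = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);

# Row 1, column 0 (indices start at 0).
my $cell = $table[1][0];

# Each element of @table is a reference to one row.
my @rows;
foreach my $row (@table) {
    push @rows, join(" ", @$row);
}

# Pick the element at index 1 from every row to get the middle column.
my @column = map { $_->[1] } @table;

print "$cell\n";
print "$_\n" for @rows;
print "@column\n";
```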

