9.1 Hash revision
9.2 Variable types in Perl
Scalar: $number, $string
Array: @array => $array[0]
Hash: %hash => $hash{key}
9.3 Hash – an associative array
An associative array (or simply, a hash) is an unordered set of key-value pairs. Each key is associated with a value. A hash variable name always starts with "%":
    my %hash;
Inserting values:
    $hash{"a"} = 5;
    $hash{"bob"} = "zzz";
    $hash{50} = "John";
Accessing: you can access a value by its key:
    print $hash{50};    # prints John
Tip: you can reset the hash (to an empty one) with %hash = ();
%hash: "a" => 5, "bob" => "zzz", 50 => "John"
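The statements on this slide can be run together as a short script. This is a minimal sketch; the counter variables $count and $countAfterReset are illustrative additions and are not part of the original slide.

```perl
use strict;
use warnings;

# Declare an empty hash, then insert three key-value pairs.
my %hash;
$hash{"a"}   = 5;
$hash{"bob"} = "zzz";
$hash{50}    = "John";

# Look up a value by its key.
print $hash{50}, "\n";

# keys() in scalar context gives the number of pairs stored.
my $count = scalar(keys %hash);
print "pairs: $count\n";

# Assigning an empty list resets the hash.
%hash = ();
my $countAfterReset = scalar(keys %hash);
print "pairs after reset: $countAfterReset\n";
```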
9.4 Hash – an associative array
Modifying an existing value:
    $hash{bob} = "aaa";
You can ask whether a certain key exists in a hash:
    if (exists $hash{50}) ...
You can delete a certain key-value pair in a hash:
    delete($hash{50});
%hash at the start: "a" => 5, "bob" => "zzz", 50 => "John"
%hash after the modification: "a" => 5, "bob" => "aaa", 50 => "John"
%hash after the delete: "a" => 5, "bob" => "aaa"
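A short sketch of the three operations in sequence, assuming the hash contents from the previous slide. The flag variables $existsBefore and $existsAfter are hypothetical names introduced only to make the effect of delete visible.

```perl
use strict;
use warnings;

my %hash = ("a" => 5, "bob" => "zzz", 50 => "John");

# Overwrite the value stored under an existing key.
$hash{bob} = "aaa";

# exists() tests for the key itself, regardless of its value.
my $existsBefore = exists $hash{50} ? 1 : 0;

# delete() removes the whole key-value pair.
delete($hash{50});
my $existsAfter = exists $hash{50} ? 1 : 0;

print "bob is now $hash{bob}\n";
print "key 50 exists before delete: $existsBefore, after: $existsAfter\n";
```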
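9.5 Iterating over hash elements
To iterate over all the values in %hash:
    my @hashVals = values(%hash);
    foreach my $value (@hashVals) ...
To iterate over the keys in %hash:
    my @hashKeys = keys(%hash);
    foreach my $key (@hashKeys) ...
Both keys() and values() return their lists in no particular order.
%hash: "a" => 5, "bob" => "zzz", 50 => "John"
Values: 5, "zzz", "John"
Keys: "a", "bob", 50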
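Because the order returned by keys() and values() is not guaranteed, a script that needs a reproducible listing usually sorts first. The sketch below assumes the same example hash; @lines and @sortedValues are illustrative names.

```perl
use strict;
use warnings;

my %hash = ("a" => 5, "bob" => "zzz", 50 => "John");

# Sort the keys (as strings) so the listing is the same on every run.
my @lines;
foreach my $key (sort keys %hash) {
    push @lines, "$key => $hash{$key}";
}
print "$_\n" foreach @lines;

# The values can be collected and sorted in the same way.
my @sortedValues = sort values %hash;
print "@sortedValues\n";
```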
9.6 References & Complex Data Structures
9.7 References are your friends…
9.8 Variable types in Perl
Scalar: $number, $string, $reference
Array: @array
Hash: %hash
9.9 References
A reference to a variable is a scalar value that "points" to another variable. [@array] creates a copy of the array and returns a reference to this copy:
    my @grades = (85,91,67);
    my $arrayRef = [@grades];
9.10 References
A reference to a variable is a scalar value that "points" to another variable. [@array] creates a copy of the array and returns a reference to this copy:
    my @grades = (85,91,67);
    my %gradeHash;
    $gradeHash{"Eyal"} = [@grades];
%gradeHash: "Eyal" => reference to a copy of (85,91,67)
9.11 References example
A reference to a variable is a scalar value that "points" to another variable. [@array] creates a copy of the array and returns a reference to this copy:
    my @grades = (85,91,67);
    my %gradeHash;
    $gradeHash{"Eyal"} = [@grades];
    @grades = (100,82);
    $gradeHash{"Neta"} = [@grades];
    @grades = (56,99,77);
    $gradeHash{"Era"} = [@grades];
%gradeHash: "Eyal" => (85,91,67), "Neta" => (100,82), "Era" => (56,99,77), each key pointing to its own copy
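The example can be checked by printing every list after the hash is built. This is a sketch that assumes the array is named @grades (the name is inferred from the slides, not stated in this one).

```perl
use strict;
use warnings;

# Assumption: the working array is called @grades.
my @grades = (85, 91, 67);
my %gradeHash;

# [@grades] copies the current contents, so later changes to
# @grades do not affect what is already stored in the hash.
$gradeHash{"Eyal"} = [@grades];
@grades = (100, 82);
$gradeHash{"Neta"} = [@grades];
@grades = (56, 99, 77);
$gradeHash{"Era"} = [@grades];

foreach my $name (sort keys %gradeHash) {
    print "$name: ", join(" ", @{ $gradeHash{$name} }), "\n";
}
```

Each student keeps a separate list because every assignment stored a fresh copy.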
9.12 References
A reference to a variable is a scalar value that "points" to another variable. \@array returns a reference to the array itself. THIS MIGHT BE DANGEROUS.
    my @grades = (85,91,67);
    my $arrayRef = \@grades;
    my %gradeHash;
    $gradeHash{"Eyal"} = \@grades;
Both $arrayRef and $gradeHash{"Eyal"} point to @grades itself, not to a copy.
9.13 References (bad) example
A reference to a variable is a scalar value that "points" to another variable. \@array returns a reference to the array itself. THIS MIGHT BE DANGEROUS.
    my @grades = (85,91,67);
    my %gradeHash;
    $gradeHash{"Eyal"} = \@grades;
    @grades = (100,82);
    $gradeHash{"Neta"} = \@grades;
    @grades = (56,99,77);
    $gradeHash{"Era"} = \@grades;
%gradeHash: "Eyal", "Neta" and "Era" all point to the same array, which now holds (56,99,77).
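The danger is easy to demonstrate: every key ends up showing whatever @grades held last. A minimal sketch, again assuming the array name @grades; $sameArray is a hypothetical helper variable.

```perl
use strict;
use warnings;

# Assumption: the working array is called @grades.
my @grades = (85, 91, 67);
my %gradeHash;

# \@grades stores a reference to the array itself, not a copy.
$gradeHash{"Eyal"} = \@grades;
@grades = (100, 82);
$gradeHash{"Neta"} = \@grades;

# Both keys now see the latest contents of @grades.
my $eyalGrades = join(" ", @{ $gradeHash{"Eyal"} });
my $netaGrades = join(" ", @{ $gradeHash{"Neta"} });
print "Eyal: $eyalGrades\n";
print "Neta: $netaGrades\n";

# Comparing two references with == tests whether they point
# to the same underlying array.
my $sameArray = ($gradeHash{"Eyal"} == $gradeHash{"Neta"}) ? 1 : 0;
print "same array: $sameArray\n";
```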
9.14 De-referencing
To access the data from a reference we need to dereference it:
    my @grades = (85,91,67);
    my $arrayRef = [@grades];
    print $arrayRef;       # prints something like ARRAY(0x225d14)
    print @{$arrayRef};
To get the array, use @{$reference}.
9.15 De-referencing
To access the data from a reference we need to dereference it:
    my @grades = (85,91,67);
    my $arrayRef = [@grades];
    my $firstGrade = $arrayRef->[0];
    print $firstGrade;     # prints 85
Use ->[x] to get element x of the referenced array.
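Both forms of dereferencing can be tried side by side. A small sketch under the same assumption that the array is named @grades; @copy, $count and $lastGrade are illustrative names.

```perl
use strict;
use warnings;

my @grades   = (85, 91, 67);
my $arrayRef = [@grades];

# @{$ref} gives back the whole array.
my @copy  = @{$arrayRef};
my $count = scalar(@{$arrayRef});

# ->[x] gives a single element; negative indices count from the end.
my $firstGrade = $arrayRef->[0];
my $lastGrade  = $arrayRef->[-1];

print "all: @copy\n";
print "count: $count, first: $firstGrade, last: $lastGrade\n";
```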
9.16 De-referencing examples
Get all the grades of Eyal:
    my @EyalGrades = @{$gradeHash{"Eyal"}};
Get second grade of Neta:
    my $Neta2 = $gradeHash{"Neta"}->[1];
Change first grade of Era:
    $gradeHash{"Era"}->[0] = 72;
9.17 More de-referencing examples
Get sorted grades of Eyal:
    my @EyalSorted = sort { $a <=> $b } @{$gradeHash{"Eyal"}};
Push another grade to Neta:
    my $grade = 97;
    push(@{$gradeHash{"Neta"}}, $grade);
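A runnable sketch of these two operations, assuming %gradeHash holds the grades from the earlier example. The numeric comparison block { $a <=> $b } is an assumption about the intended sort: Perl's default sort compares strings, which would place 100 before 82.

```perl
use strict;
use warnings;

# Anonymous arrays written directly with [ ... ].
my %gradeHash = (
    "Eyal" => [85, 91, 67],
    "Neta" => [100, 82],
);

# sort returns a new list; the stored array is left unchanged.
my @EyalSorted = sort { $a <=> $b } @{ $gradeHash{"Eyal"} };

# push works on the dereferenced array and modifies it in place.
my $grade = 97;
push(@{ $gradeHash{"Neta"} }, $grade);

print "Eyal sorted: @EyalSorted\n";
print "Neta: ", join(" ", @{ $gradeHash{"Neta"} }), "\n";
```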
9.18 Referencing – Dereferencing Arrays
Referencing an array:
    $arrayRef = [@grades];
    $gradesRef = \@grades;    (careful)
Dereferencing an array:
    @arr = @{$arrRef};
    $element1 = $arrRef->[0];
With $arrRef pointing to the array (A, B, C): $element1 = $arrRef->[0] = A
9.19 Class exercise 9a
1. Write a script that reads a file with a list of protein names and their levels measured at different time points, such as:
    AP_ ,0.54,0.90,0.04,0.04
    AP_ ,0.20,0.50
Store the information in a hash, with the names of the proteins as hash keys and the protein levels as referenced arrays.
a) Ask the user for a protein name and print out the sorted array of the levels measured for that protein. For example, if the user enters AP_000155, the script should print that protein's levels in sorted order.
b) Ask the user for a protein name and a protein level, add this level as the last measurement of the appropriate protein, and print out the updated array of levels.
2. Read the adenovirus genome file and build a hash of genes, where the key is the "product" name and the CDS start and end coordinates are an array referenced by that key. Ask the user for a product and print its coordinates. For example, if the user types "E1B 19K", the script should print the start and end coordinates of that CDS. (Note that the CDS lines appear before the product line…)
9.20 References
A reference to a variable is a scalar value that "points" to another variable. {%hash} creates a copy of the hash and returns a reference to this copy:
    my %details;
    $details{"phone"} = 5012;
    $details{"address"} = "Swiss";
    my $hashRef = {%details};
%details: phone => 5012, address => "Swiss"
$hashRef points to a copy: phone => 5012, address => "Swiss"
9.21 References
A reference to a variable is a scalar value that "points" to another variable. {%hash} creates a copy of the hash and returns a reference to this copy:
    my %details;
    $details{"phone"} = 5012;
    $details{"address"} = "Swiss";
    my %bookHash;
    $bookHash{"Eyal"} = {%details};
%details: phone => 5012, address => "Swiss"
%bookHash: "Eyal" => { phone => 5012, address => "Swiss" }
9.22 References example
    my %details;
    $details{"phone"} = 5012;
    $details{"address"} = "Swiss";
    my %bookHash;
    $bookHash{"Eyal"} = {%details};
    $details{"phone"} = 6023;
    $details{"address"} = "Yavne";
    $bookHash{"Neta"} = {%details};
%details (after the last assignments): phone => 6023, address => "Yavne"
%bookHash: "Eyal" => { phone => 5012, address => "Swiss" }, "Neta" => { phone => 6023, address => "Yavne" }
9.23 References example
Another way to build the same data structure:
    $bookHash{"Eyal"}->{"phone"} = 5012;
    $bookHash{"Eyal"}->{"address"} = "Swiss";
    $bookHash{"Neta"}->{"phone"} = 6023;
    $bookHash{"Neta"}->{"address"} = "Yavne";
%bookHash: "Eyal" => { phone => 5012, address => "Swiss" }, "Neta" => { phone => 6023, address => "Yavne" }
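To see that the two construction styles give the same structure, one entry can be built each way and compared field by field. This is a sketch; %bookCopy, %bookDirect and $same are hypothetical names used only for the comparison.

```perl
use strict;
use warnings;

# Style 1: fill a temporary hash, then store a copy of it.
my %details = ("phone" => 5012, "address" => "Swiss");
my %bookCopy;
$bookCopy{"Eyal"} = {%details};

# Style 2: assign through the arrow; Perl creates the inner
# hash automatically the first time it is used.
my %bookDirect;
$bookDirect{"Eyal"}->{"phone"}   = 5012;
$bookDirect{"Eyal"}->{"address"} = "Swiss";

# Compare the two inner hashes key by key.
my $same = 1;
foreach my $key (keys %details) {
    $same = 0 if $bookCopy{"Eyal"}->{$key} ne $bookDirect{"Eyal"}->{$key};
}
print "same contents: $same\n";
```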
9.24 De-referencing
To access the data from a reference we need to dereference it:
    my $hashRef;
    $hashRef->{"Phone"} = 5012;
    $hashRef->{"Address"} = "Swiss";
    my %details = %{$hashRef};
    my @vals = values(%details);
    print "@vals";         # prints 5012 Swiss (the order may vary)
To get the hash, use %{$reference}.
$hashRef points to: Phone => 5012, Address => "Swiss"
%details is a copy: Phone => 5012, Address => "Swiss"
9.25 De-referencing
To access the data from a reference we need to dereference it:
    my $hashRef;
    $hashRef->{"Phone"} = 5012;
    $hashRef->{"Address"} = "Swiss";
    my $phone = $hashRef->{"Phone"};
    print $phone;          # prints 5012
Use ->{key} to get the value of key in the referenced hash.
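One point worth checking by hand is that %{$reference} produces a copy, so editing the copy leaves the referenced hash alone. A minimal sketch; the changed phone value is arbitrary illustration data.

```perl
use strict;
use warnings;

my $hashRef;
$hashRef->{"Phone"}   = 5012;
$hashRef->{"Address"} = "Swiss";

# %{$hashRef} copies the referenced hash into %details.
my %details = %{$hashRef};

# ->{key} reads a single value through the reference.
my $phone = $hashRef->{"Phone"};

# Changing the copy does not change the referenced hash.
$details{"Phone"} = 6023;

print "via reference: $hashRef->{'Phone'}\n";
print "in the copy:   $details{'Phone'}\n";
```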
9.26 De-referencing examples
Get all the details of Neta:
    my %NetaDetails = %{$bookHash{"Neta"}};
Get the phone of Eyal:
    my $EyalPhone = $bookHash{"Eyal"}->{"phone"};
%bookHash: "Eyal" => { phone => 5012, address => "Swiss" }, "Neta" => { phone => 6023, address => "Yavne" }
9.27 De-referencing examples
Change Neta's address:
    $bookHash{"Neta"}->{"address"} = "Tel-Aviv";
Get all the phones:
    foreach my $name (keys(%bookHash)) {
        print "Phone of $name: ";
        print $bookHash{$name}->{"phone"}."\n";
    }
%bookHash: "Eyal" => { phone => 5012, address => "Swiss" }, "Neta" => { phone => 6023, address => "Tel-Aviv" }
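The same loop extends naturally to every field of every entry by nesting a second foreach over the inner hash. The sketch below assumes lowercase "phone" and "address" keys, as used when the phone book was built; @report is an illustrative name.

```perl
use strict;
use warnings;

# Anonymous hashes written directly with { ... }.
my %bookHash = (
    "Eyal" => { "phone" => 5012, "address" => "Swiss" },
    "Neta" => { "phone" => 6023, "address" => "Yavne" },
);

$bookHash{"Neta"}->{"address"} = "Tel-Aviv";

# Outer loop over names, inner loop over each person's fields.
my @report;
foreach my $name (sort keys %bookHash) {
    foreach my $field (sort keys %{ $bookHash{$name} }) {
        push @report, "$name / $field = " . $bookHash{$name}->{$field};
    }
}
print "$_\n" foreach @report;
```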
9.28 Referencing – Dereferencing Hashes
Referencing a hash:
    $hashRef = {%phoneBook};
    $bookRef = \%phoneBook;    (careful)
Dereferencing a hash:
    %hash = %{$hashRef};
    $myVal = $hashRef->{"A"};
%phoneBook: A => X, B => Y, C => Z ($bookRef points to %phoneBook itself)
$hashRef points to a copy: A => X, B => Y, C => Z
%hash: A => X, B => Y, C => Z
$myVal = $hashRef->{"A"} = "X"
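The "(careful)" note can be made concrete by changing the original hash after taking both kinds of reference. A short sketch using the A/B/C contents from the diagram; the replacement value "changed" is arbitrary.

```perl
use strict;
use warnings;

my %phoneBook = ("A" => "X", "B" => "Y", "C" => "Z");

my $hashRef = {%phoneBook};   # reference to a copy
my $bookRef = \%phoneBook;    # reference to the hash itself

# Modify the original hash after both references were taken.
$phoneBook{"A"} = "changed";

# The copy keeps the old value; the direct reference sees the new one.
my $fromCopy   = $hashRef->{"A"};
my $fromDirect = $bookRef->{"A"};
print "copy: $fromCopy, direct: $fromDirect\n";
```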
9.29 Class exercise 9b
1. Write a script that reads a file with a list of protein names, lengths and locations (such as in proteinLengthsAndLocation.txt), with lines such as:
    AP_ Nuc
    AP_ Cyt
Store the names of the sequences as hash keys, and use "length" and "location" as keys in an internal hash for each protein. For example:
    $proteins{"AP_000081"}->{"length"} should be 181
    $proteins{"AP_000081"}->{"location"} should be "Nuc"
a) Ask the user for a protein name and print its length and location.
b) Print for each protein its name and location.
2. Read the adenovirus GenBank file and build a hash of genes, where the key is the product name. For each gene, store an internal hash with two keys: one contains the protein_id and the other contains the db_xref.
a) Ask the user for a product, and print its protein_id and db_xref.
b) Use the CDS line to decide whether the coding sequence is on the positive or negative strand ("complement" before the coordinates marks a sequence coded on the negative strand). Add a key "strand" to the hash of each gene that contains "+" if the coding sequence is on the positive strand or "-" if it is on the negative strand. Print all the product names of the proteins coded on the negative strand.