1 96-Summer Bioinformatics Programming Practicum (II): Bioinformatics with Perl
    8/13~8/22 蘇中才
    8/24~8/29 張天豪
    8/31 曾宇鳯

2 Schedule
    Date         Time          Subject                                             Speaker
    8/13 (Mon)   13:30~17:30   Perl Basics                                         蘇中才
    8/15 (Wed)   13:30~17:30   Programming Basics                                  蘇中才
    8/17 (Fri)   13:30~17:30   Regular expression                                  蘇中才
    8/20 (Mon)   13:30~17:30   Retrieving Data from Protein Sequence Database      蘇中才
    8/22 (Wed)   13:30~17:30   Perl combines with Genbank, BLAST                   蘇中才
    8/24 (Fri)   13:30~17:30   PDB database and structure files                    張天豪
    8/27 (Mon)   8:30~12:30    Extracting ATOM information                         張天豪
    8/27 (Mon)   13:30~17:30   Mapping of Protein Sequence IDs and Structure IDs   張天豪
    8/31 (Fri)   13:30~17:30   Final and Examination                               曾宇鳳

3 Reference Books
    Learning Perl (Chinese edition: Perl 學習手冊)
    Beginning Perl for Bioinformatics
    Bioinformatics, Biocomputing and Perl: An Introduction to Bioinformatics Computing Skills and Practice

4

5 Learning Perl

6 Perl
    Practical Extraction and Report Language
    Created by Larry Wall in the mid-1980s (Perl 1.0 was released in 1987)
    Suitable for "quick-and-dirty" programs
    Suitable for string handling
    Powerful regular expressions

7 Preparation
    Downloading putty.exe / pietty.exe
    Getting materials for this course:
        Server:
        ssh
        Id: course1 ~ course20
        Password:

8 Installing Perl on Windows
    Download package from
        5.8/ActivePerl MSWin32-x msi
    Versions of Perl
        Unix, Linux, Windows (ActivePerl), Mac (MacPerl)

9 Text Editors
    A convenient (text) editor for programming
    Ultraedit: good for me
    Notepad: just an editor
    Vim: for UNIX/Linux lovers
        _menu.html
    Joe: easy to use for Unix beginners

10 Finding Help
    Best resource finding tool: On-line Resources, use
    HTML Help in ActivePerl
    Command Line (highly recommended)
        perldoc -f FUNCTION    # search function
        perldoc -q KEYWORD     # search FAQ
        perldoc MODULE         # search module
        perldoc perldoc

11 Perl Basic Starting

12 Program: run thyself!
    $ vi welcome
        #! /usr/bin/perl -w
        print "Hello, world\n";
    $ chmod +x welcome
    $ ./welcome
    Hello, world
    $ perl welcome
    Hello, world

    perl]$ ls -al
    -rw-rw-r-- 1 sbb sbb 20 Jul 2 15:27 welcome
    perl]$ chmod +x welcome
    perl]$ ls -al
    -rwxrwxr-x 1 sbb sbb 20 Jul 2 15:27 welcome

13 Using the Perl while construct
    #! /usr/bin/perl -w
    # The 'forever' program - a (Perl) program,
    # which does not stop until someone presses Ctrl-C.
    use constant TRUE => 1;
    use constant FALSE => 0;
    while ( TRUE )
    {
        print "Welcome to the Wonderful World of Bioinformatics!\n";
        sleep 1;
    }

14 Running forever...
    $ chmod +x forever
    $ ./forever
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    ...

15 Perl Basic Variables

16 Variables
    Scalar ($)
        Number: 1; 1.23; 12e34
        String: "abc"; 'ABC'; "Hello, world!";
    Array / List (@)
    Hash (%)

17 Introducing variable containers
    The simplest type of variable container is the scalar. In Perl, a scalar can hold, for example, a number, a word, a sentence or a disk-file.
        $name
        $_address
        $programming_101
        $z
        $abc
        $swissprot_to_interpro_mapping
        $SwissProt2InterProMapping
    Variable naming is an ART!

18 scalar
    #!/usr/bin/perl -w
    # lower case for user defined; upper case for system default
    my $ARGV = "example.pl";
    my $number = 1.2;
    my $string = "Hello, world!";
    # my $123 = 123;    # error: not a legal name for a my variable
    my $abc = "123";
    my $_123 = '123';
    my $O000OoO00 = 1;
    my $OO00Oo000 = 2;
    my $OO00OoOOO = 3;
    $abc = $O000OoO00 * $OO00Oo000 - $OO00OoOOO;
    print $abc x 4 . "\n";    # repetition: the string value of $abc four times
    print 5 x 4 . "\n";       # repetition: 5555
    print 5 * 4 . "\n";       # multiplication: 20

19 Number
    Format (floating-point range: roughly 1e-308 ~ 1e308 on builds that use IEEE double precision)
        2000
        1.25
        -6.5e45 (-6.5*10^45)
        123456789
        123_456_789
    Other format
        0377          # octal (decimal 255)
        0xFF          # hexadecimal
        0b11111111    # binary

20 number
    $integer = 12;
    $real = 12.34;
    $oct = 0377;
    $bin = 0b11111111;
    $hex = 0xff;
    $long = 123456789;
    $long_ = 123_456_789;
    $large = 1E100;     # 1E200
    $small = 1E-100;    # 1E-200
    print "integer : $integer\n";
    print "real : $real\n";
    print "oct=$oct bin=$bin hex=$hex\n";
    #printf("oct=0%o bin=0b%b hex=0x%x\n",$oct,$bin,$hex);

21 parameters of printf (ref : number)
    specifier   Output                                                         Example
    c           Character                                                      a
    d or i      Signed decimal integer                                         392
    e           Scientific notation (mantissa/exponent) using e character      3.9265e+2
    E           Scientific notation (mantissa/exponent) using E character      3.9265E+2
    f           Decimal floating point                                         392.65
    g           Use the shorter of %e or %f                                    392.65
    G           Use the shorter of %E or %f                                    392.65
    o           Signed octal                                                   610
    s           String of characters                                           sample
    u           Unsigned decimal integer                                       7235
    x           Unsigned hexadecimal integer                                   7fa
    X           Unsigned hexadecimal integer (capital letters)                 7FA
    p           Pointer address                                                B800:0000
    n           Nothing printed. The argument must be a pointer to a signed int, where the number of characters written so far is stored.
    %           A % followed by another % character will write % to stdout.
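The commented-out printf line on the number slide can be tried with sprintf, which takes the same format string and returns the result instead of printing it. This is a minimal sketch; the variable names $formatted and $padded and the sample values in the second call are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $oct = 0377;          # octal literal, decimal 255
my $bin = 0b11111111;    # binary literal, decimal 255
my $hex = 0xff;          # hexadecimal literal, decimal 255

# Plain interpolation always shows the decimal value.
print "oct=$oct bin=$bin hex=$hex\n";

# %o, %b and %x convert the number back to octal, binary and hexadecimal.
my $formatted = sprintf( "oct=0%o bin=0b%b hex=0x%x", $oct, $bin, $hex );
print "$formatted\n";

# Field width and precision (illustrative values):
#   %8.3f  eight columns wide, three digits after the decimal point
#   %-6s   left-justified string in six columns
#   %05d   integer padded with zeros to five columns
my $padded = sprintf( "%8.3f|%-6s|%05d", 12.34, "abc", 42 );
print "$padded\n";
```

The first print shows 255 three times, because a scalar stores the value and not the notation used to write it.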

22 operator
    2 + 3         # 5
    5.1 - 2.4     # 2.7
    3 * 12        # 36
    14 / 2        # 7
    10.2 / 0.3    # 34
    10 / 3        # 3.333...
    10 % 3        # 1

23
    Operator   Function
    +          Addition
    -          Subtraction, Negative Numbers, Unary Negation
    *          Multiplication
    /          Division
    %          Modulus
    **         Exponent

    Operator   Function
    =          Normal Assignment
    +=         Add and Assign
    -=         Subtract and Assign
    *=         Multiply and Assign
    /=         Divide and Assign
    %=         Modulus and Assign
    **=        Exponent and Assign

    $number = $number + 100;    is the same as    $number += 100;

24 Take a break ...
    modulus
        10.5 % 3.2 = ?
    exponentiation
        2^3 = ?
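The two questions above can be checked directly. This is a small sketch; the variable names are illustrative, and fmod from the core POSIX module is shown only as one way to get a floating-point remainder.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(fmod);

# The % operator works on integers: both operands are truncated first,
# so 10.5 % 3.2 is evaluated as 10 % 3.
my $mod = 10.5 % 3.2;

# A floating-point remainder needs fmod from the POSIX module.
my $fmod = fmod( 10.5, 3.2 );

# Exponentiation is **, not ^.
my $pow = 2 ** 3;

# In Perl, ^ is bitwise exclusive-or: binary 10 XOR binary 11 is binary 01.
my $xor = 2 ^ 3;

print "10.5 % 3.2      = $mod\n";
print "fmod(10.5, 3.2) = $fmod\n";
print "2 ** 3          = $pow\n";
print "2 ^ 3           = $xor\n";
```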

25 string
    Format
        Single quotes: 'hello'   'hello\nhello'   'hello,$name'
        Double quotes: "hello"   "hello\nhello"   "hello,$name"
    Exceptions
        '\'\\'
        "\"\\"

    #!/usr/bin/perl -w
    print 'hello';
    print "hello";

26 Backslash escapes
    Escape Sequence   Description or Character
    \b                Backspace
    \e                Escape
    \f                Form Feed
    \n                New line
    \r                Carriage Return
    \t                Tab
    \v                Vertical Tab (a C escape; Perl string literals do not recognize it)
    \$                Dollar Sign
                      Ampersand
    \0nnn             Any Octal byte
    \xnn              Any Hexadecimal byte
    \cn               Any Control character
    \l                Change the next character to lowercase
    \u                Change the next character to uppercase
    \\                Backslash

27 conversion between String and number
    $answer = "Hello " . " " . " world\n";
    $answer = "12" . "3";
    $answer = "12" * "3";
    $answer = "12Hello34" * "3";    # warning !!!
    $answer = "A" . 3*5;
    $answer = "A" x (3*5);
    $answer = "12" x "3";
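The values those assignments produce can be seen by giving each result its own variable. This is a sketch with illustrative variable names; the numeric warning is switched off for one statement only so that the script runs silently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $concat  = "12" . "3";      # . joins strings: "123"
my $product = "12" * "3";      # * converts both strings to numbers: 36
my $repeat  = "12" x "3";      # x repeats the left string 3 times: "121212"
my $mixed   = "A" . 3 * 5;     # * binds tighter than . so this is "A" . 15
my $many_a  = "A" x ( 3 * 5 ); # fifteen copies of "A"

my $partial;
{
    # "12Hello34" is converted using only its leading digits (12).
    # Remove the next line to see Perl warn that the string is not numeric.
    no warnings 'numeric';
    $partial = "12Hello34" * "3";
}

print "$concat $product $repeat $mixed $many_a $partial\n";
```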

28 Variable containers and loops
    #! /usr/bin/perl -w
    # The 'tentimes' program - a (Perl) program,
    # which stops after ten iterations.
    use constant HOWMANY => 10;
    $count = 0;
    while ( $count < HOWMANY )
    {
        print "Welcome to the Wonderful World of Bioinformatics!\n";
        $count++;
    }

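29 Running tentimes...
    $ chmod +x tentimes
    $ ./tentimes
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!
    Welcome to the Wonderful World of Bioinformatics!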

30 Using the Perl if construct
    #! /usr/bin/perl -w
    # The 'fivetimes' program - a (Perl) program,
    # which stops after five iterations.
    use constant TRUE => 1;
    use constant FALSE => 0;
    use constant HOWMANY => 5;
    $count = 0;
    while ( TRUE )
    {
        $count++;
        print "Welcome to the Wonderful World of Bioinformatics!\n";
        if ( $count == HOWMANY )
        {
            last;    # leave the while loop
        }
    }

31 The oddeven program
    #! /usr/bin/perl -w
    # The 'oddeven' program.
    use constant HOWMANY => 4;
    $count = 0;
    while ( $count < HOWMANY )
    {
        $count++;
        if ( $count % 2 == 0 )
        {
            print "$count : even\n";
        }
        else    # $count % 2 is not zero.
        {
            print "$count : odd\n";
        }
    }

32 Comparison operator
    Comparison               Number   String
    Equal                    ==       eq
    Not equal                !=       ne
    Less than                <        lt
    Greater than             >        gt
    Less than or equal       <=       le
    Greater than or equal    >=       ge
    Comparison               <=>      cmp
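Which column to use depends on whether the values should be treated as numbers or as text. The sketch below compares the same values both ways; the list @ids and the other variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Numerically 10 is not less than 9, but as text "10" sorts before "9"
# because the character "1" comes before "9".
my $numeric_less = ( "10" <  "9" ) ? 1 : 0;
my $string_less  = ( "10" lt "9" ) ? 1 : 0;

# <=> and cmp return -1, 0 or 1, which is what sort expects from its block.
my @ids       = ( 10, 9, 100, 1 );
my @by_number = sort { $a <=> $b } @ids;
my @by_string = sort { $a cmp $b } @ids;

print "numeric: $numeric_less  string: $string_less\n";
print "by number: @by_number\n";
print "by string: @by_string\n";
```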

33 Variable Interpolation
    #! /usr/bin/perl -w
    # The 'interpolation' program - interpolates variables into strings.
    $language = "Perl";
    $string = "I love $language";
    print $string . "\n";
    $string = 'I love $language';
    print $string . "\n";
    $string = 'I love ' . $language;
    print $string . "\n";
    $string = "I love \$language";
    print $string . "\n";
    $string = "I love $languages";    # looks up $languages; write ${language}s instead
    print $string . "\n";
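The same cases can be run under strict, with the last one written as ${language}s so that Perl knows where the variable name ends. This is a sketch; the variable names other than $language are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $language = "Perl";

my $double  = "I love $language";      # double quotes interpolate
my $single  = 'I love $language';      # single quotes keep the text as typed
my $escaped = "I love \$language";     # a backslash stops interpolation
my $plural  = "I love ${language}s";   # braces mark the end of the name

print "$double\n$single\n$escaped\n$plural\n";
```

Under use strict, writing "I love $languages" would stop compilation, because no variable called $languages has been declared.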

34 Arrays: Associating Data With Numbers
    ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' )
    @list_of_sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' );

35 Array

36 Working with array elements
    print "$list_of_sequences[1]\n";
    GCTCAGTTCT
    $list_of_sequences[1] = 'CTATGCGGTA';
    $list_of_sequences[3] = 'GGTCCATGAA';

37 The Array

38 How big is the array?
    print "The array size is: ", $#list_of_sequences+1, ".\n";
    print "The array size is: ", scalar @list_of_sequences, ".\n";
    The array size is: 4.
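Both ways of getting the array size can be compared side by side. This is a sketch; the four-element array follows the earlier slides, and the variable names $last_index, $size, $count and @empty are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @list_of_sequences = qw( TTATTATGTT CTATGCGGTA GACCTCTTAA GGTCCATGAA );

# $#array is the index of the last element, so the size is one more.
my $last_index = $#list_of_sequences;

# An array used in scalar context gives the number of elements.
my $size  = scalar @list_of_sequences;
my $count = @list_of_sequences;    # assignment to a scalar is scalar context too

# For an empty array the last index is -1 and the size is 0.
my @empty      = ();
my $empty_last = $#empty;
my $empty_size = scalar @empty;

print "last index: $last_index, size: $size, count: $count\n";
print "empty array: last index $empty_last, size $empty_size\n";
```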

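39 Adding elements to an array
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' );
    @sequences = ( @sequences, 'CTATGCGGTA' );
    print "@sequences\n";
    TTATTATGTT GCTCAGTTCT GACCTCTTAA CTATGCGGTA

    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' );
    @sequences = ( 'CTATGCGGTA' );
    print "@sequences\n";
    CTATGCGGTA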

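40 Adding more elements to an array
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' );
    @sequences = ( @sequences, ( 'CTATGCGGTA', 'CTATTATGTC' ) );
    print "@sequences\n";
    TTATTATGTT GCTCAGTTCT GACCTCTTAA CTATGCGGTA CTATTATGTC

    @sequence_1 = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' );
    @sequence_2 = ( 'GCTCAGTTCT', 'GACCTCTTAA' );
    @combined_sequences = ( @sequence_1, @sequence_2 );
    print "@combined_sequences\n";
    TTATTATGTT GCTCAGTTCT GACCTCTTAA GCTCAGTTCT GACCTCTTAA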

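41 Removing elements from an array
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA', 'TTATTATGTT' );
    @removed_elements = splice @sequences, 1, 2;
    print "@removed_elements\n";
    print "@sequences\n";
    GCTCAGTTCT GACCTCTTAA
    TTATTATGTT TTATTATGTT

    # clean all elements of an array
    @sequences = ();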

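42 The slices program
    #! /usr/bin/perl -w
    # The 'slices' program - slicing arrays.
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA', 'CTATGCGGTA', 'ATCTGACCTC' );
    print "@sequences\n";
    @seq_slice = @sequences[ 1 .. 3 ];
    print "@seq_slice\n";
    print "@sequences\n";
    @removed = splice @sequences, 1, 3;
    print "@sequences\n";
    print "@removed\n";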

43 Results from slices...
    TTATTATGTT GCTCAGTTCT GACCTCTTAA CTATGCGGTA ATCTGACCTC
    GCTCAGTTCT GACCTCTTAA CTATGCGGTA
    TTATTATGTT GCTCAGTTCT GACCTCTTAA CTATGCGGTA ATCTGACCTC
    TTATTATGTT ATCTGACCTC
    GCTCAGTTCT GACCTCTTAA CTATGCGGTA

44 Processing every element in an array
    #! /usr/bin/perl -w
    # The 'iterateW' program - iterate over an entire array
    # with while.
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA', 'CTATGCGGTA', 'ATCTGACCTC' );
    $index = 0;
    $last_index = $#sequences;
    while ( $index <= $last_index )
    {
        print "$sequences[ $index ]\n";
        ++$index;
    }

45 Results from iterateW...
    TTATTATGTT
    GCTCAGTTCT
    GACCTCTTAA
    CTATGCGGTA
    ATCTGACCTC

46 The iterateF program
    #! /usr/bin/perl -w
    # The 'iterateF' program - iterate over an entire array
    # with foreach.
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA', 'CTATGCGGTA', 'ATCTGACCTC' );
    foreach $value ( @sequences )
    {
        print "$value\n";
    }

47 Making lists easier to work with
    @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA', 'CTATGCGGTA', 'ATCTGACCTC' );
    @sequences = ( TTATTATGTT, GCTCAGTTCT, GACCTCTTAA, CTATGCGGTA, ATCTGACCTC );
    @sequences = qw( TTATTATGTT GCTCAGTTCT GACCTCTTAA CTATGCGGTA ATCTGACCTC );

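48 Quoted words
    #!/usr/bin/perl -w
    # The 'quoted_words' program
    @list_of_sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA' );
    @list_of_sequences = qw/TTATTATGTT GCTCAGTTCT GACCTCTTAA/;
    @list_of_sequences = qw{TTATTATGTT GCTCAGTTCT GACCTCTTAA};
    @list_of_sequences = qw!TTATTATGTT GCTCAGTTCT GACCTCTTAA!;
    @list_of_sequences = qw[TTATTATGTT GCTCAGTTCT GACCTCTTAA];
    @list_of_sequences = qw<TTATTATGTT GCTCAGTTCT GACCTCTTAA>;
    @list_of_sequences = qw#TTATTATGTT GCTCAGTTCT GACCTCTTAA#;
    print "@list_of_sequences\n";
    print "The array size is: ", $#list_of_sequences+1, ".\n";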

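49 pop/push/shift/unshift
    #!/usr/bin/perl -w
    # The "array_operator" program - pop / push / shift / unshift
    @array = 5..9;
    print "array = [@array]\n";
    print "==========pop==========\n";
    $item = pop @array;             # remove the last element
    print "item = [$item]\n";
    print "array = [@array]\n";
    print "==========push 9==========\n";
    push @array, 9;                 # append to the end
    print "array = [@array]\n";
    print "==========shift==========\n";
    $item = shift @array;           # remove the first element
    print "item = [$item]\n";
    print "array = [@array]\n";
    print "==========unshift 1..5==========\n";
    unshift @array, 1..5;           # insert at the front
    print "array = [@array]\n";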

50 pop/push/shift/unshift
    array = [5 6 7 8 9]
    ==========pop==========
    item = [9]
    array = [5 6 7 8]
    ==========push 9==========
    array = [5 6 7 8 9]
    ==========shift==========
    item = [5]
    array = [6 7 8 9]
    ==========unshift 1..5==========
    array = [1 2 3 4 5 6 7 8 9]

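51 reverse / sort
    #!/usr/bin/perl -w
    # The "array_operator1" program - reverse / sort
    @array = qw/ /;
    print "array = [@array]\n";
    print "=" x 40, "\n";
    @result = reverse @array;
    print "reverse array = [@result]\n";
    print "=" x 40, "\n";
    @result = sort @array;
    print "sort array = [@result]\n";
    print "=" x 40, "\n";
    @result = reverse sort @array;
    print "reverse sort array = [@result]\n";
    print "=" x 40, "\n";
    @result = sort reverse @array;
    print "sort reverse array = [@result]\n";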

52 reverse / sort
    array = [ ]
    ========================================
    reverse array = [ ]
    ========================================
    sort array = [ ]
    ========================================
    reverse sort array = [ ]
    ========================================
    sort reverse array = [ ]

53 split/join
    #!/usr/bin/perl -w
    # The "array_operator2" program - join / split
    $string = "5 4 9 8 1 3 6 2 7 10";
    @array = split / /, $string;
    print "array = [@array]\n";
    $string = join ",", @array;
    print "array = [$string]\n";

    array = [5 4 9 8 1 3 6 2 7 10]
    array = [5,4,9,8,1,3,6,2,7,10]

54 How to map between IP and domain name?
    IP      Domain name
            gene.csie.ntu.edu.tw
            biominer.csie.ntu.edu.tw
            knn.csie.ntu.edu.tw

55 Use two arrays to map between IP and domain name
    Index   IP array   Domain name array
    [0]                gene.csie.ntu.edu.tw
    [1]                biominer.csie.ntu.edu.tw
    [2]                knn.csie.ntu.edu.tw

56 How to search for a certain IP or domain name?
    Index   IP array   Domain name array
    [0]                gene.csie.ntu.edu.tw
    [1]                biominer.csie.ntu.edu.tw
    [2]                knn.csie.ntu.edu.tw

57 Why Hash?
    %Domain_name
    Key     Value
    [ ]     gene.csie.ntu.edu.tw
    [ ]     biominer.csie.ntu.edu.tw
    [ ]     knn.csie.ntu.edu.tw

58 How to get a certain domain name?
    %Domain_name
    Key     Value
    [ ]     gene.csie.ntu.edu.tw
    [ ]     biominer.csie.ntu.edu.tw
    [ ]     knn.csie.ntu.edu.tw

    $Domain_name{" "}
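A hash turns the search into a single lookup by key. The sketch below shows this for the three domain names; the IP addresses are hypothetical placeholders from the 192.0.2.x documentation range, not the real addresses of these hosts, and the names %IP_of, $name, $ip and $known are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Key: IP address (hypothetical placeholder values), value: domain name.
my %Domain_name = (
    '192.0.2.1' => 'gene.csie.ntu.edu.tw',
    '192.0.2.2' => 'biominer.csie.ntu.edu.tw',
    '192.0.2.3' => 'knn.csie.ntu.edu.tw',
);

# Look up a domain name directly by its key, with no loop over indexes.
my $name = $Domain_name{'192.0.2.2'};
print "192.0.2.2 is $name\n";

# reverse swaps keys and values, giving a lookup in the other direction.
# This only works cleanly when every value is unique.
my %IP_of = reverse %Domain_name;
my $ip    = $IP_of{'knn.csie.ntu.edu.tw'};
print "knn.csie.ntu.edu.tw is $ip\n";

# exists tests for a key without creating it.
my $known = exists $Domain_name{'192.0.2.9'} ? 'yes' : 'no';
print "192.0.2.9 known? $known\n";
```

With two parallel arrays the same lookup needs a loop that compares every element until the index is found.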

59 Examples of Hash

60 Hashes: Associating Data With Words
    %nucleotide_bases
    %nucleotide_bases = ( A, Adenine, T, Thymine );
    %nucleotide_bases = ( A => Adenine, T => Thymine );
    (key => value)

61 Working with hash entries
    print "The expanded name for 'A' is $nucleotide_bases{ 'A' }\n";
    The expanded name for 'A' is Adenine

62 How big is the hash?
    %nucleotide_bases = ( A, Adenine, T, Thymine );
    @hash_names = keys %nucleotide_bases;
    print "The names in the %nucleotide_bases hash are: @hash_names\n";
    The names in the %nucleotide_bases hash are: A T

    %nucleotide_bases = ( A, Adenine, T, Thymine );
    $hash_size = keys %nucleotide_bases;
    print "The size of the %nucleotide_bases hash is: $hash_size\n";
    The size of the %nucleotide_bases hash is: 2

63 Adding entries to a hash
    $nucleotide_bases{ 'G' } = 'Guanine';
    $nucleotide_bases{ 'C' } = 'Cytosine';

    %nucleotide_bases = ( A => Adenine, T => Thymine, G => Guanine, C => Cytosine );

64 The Grown %nucleotide_bases Hash

65 Removing entries from a hash
    delete $nucleotide_bases{ 'C' };
    $nucleotide_bases{ 'C' } = undef;
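The two statements are not equivalent: delete removes the key, while assigning undef keeps the key and only clears its value. The sketch below applies one to 'C' and the other to 'G' so that the difference is visible; the variable names holding the results are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %nucleotide_bases = (
    A => 'Adenine',
    T => 'Thymine',
    G => 'Guanine',
    C => 'Cytosine',
);

delete $nucleotide_bases{'C'};      # the key C is gone
$nucleotide_bases{'G'} = undef;     # the key G stays, with no value

my $c_exists  = exists $nucleotide_bases{'C'}  ? 1 : 0;
my $g_exists  = exists $nucleotide_bases{'G'}  ? 1 : 0;
my $g_defined = defined $nucleotide_bases{'G'} ? 1 : 0;
my $hash_size = keys %nucleotide_bases;

print "C exists: $c_exists\n";
print "G exists: $g_exists, G defined: $g_defined\n";
print "size: $hash_size\n";
```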

66 Slicing hashes
    #! /usr/bin/perl -w
    # The 'slicing_hashes' program - extract a certain subset of a hash.
    %gene_counts = ( Human => 31000,
                     'Thale cress' => 26000,
                     'Nematode worm' => 18000,
                     'Fruit fly' => 13000,
                     Yeast => 6000,
                     'Tuberculosis microbe' => 4000 );
    @counts = @gene_counts{ Human, "Fruit fly", 'Tuberculosis microbe' };
    print "@counts\n";

67 Working with hash entries: a complete example
    #! /usr/bin/perl -w
    # The 'bases' program - a hash of the nucleotide bases.
    %nucleotide_bases = ( A => Adenine, T => Thymine, G => Guanine, C => Cytosine );
    $sequence = 'CTATGCGGTA';
    print "\nThe sequence is $sequence, which expands to:\n\n";
    while ( $sequence =~ /(.)/g )
    {
        print "\t$nucleotide_bases{ $1 }\n";
    }

68 Results from bases...
    The sequence is CTATGCGGTA, which expands to:

        Cytosine
        Thymine
        Adenine
        Thymine
        Guanine
        Cytosine
        Guanine
        Guanine
        Thymine
        Adenine

69 Processing every entry in a hash
    #! /usr/bin/perl -w
    # The 'genes' program - a hash of gene counts.
    use constant LINE_LENGTH => 60;
    %gene_counts = ( Human => 31000,
                     'Thale cress' => 26000,
                     'Nematode worm' => 18000,
                     'Fruit fly' => 13000,
                     Yeast => 6000,
                     'Tuberculosis microbe' => 4000 );

70 The genes program, cont.
    print '-' x LINE_LENGTH, "\n";
    while ( ( $genome, $count ) = each %gene_counts )
    {
        print "`$genome' has a gene count of $count\n";
    }
    print '-' x LINE_LENGTH, "\n";
    foreach $genome ( sort keys %gene_counts )
    {
        print "`$genome' has a gene count of $gene_counts{ $genome }\n";
    }
    print '-' x LINE_LENGTH, "\n";

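71 Results from genes...
    ------------------------------------------------------------
    `Human' has a gene count of 31000
    `Tuberculosis microbe' has a gene count of 4000
    `Fruit fly' has a gene count of 13000
    `Nematode worm' has a gene count of 18000
    `Yeast' has a gene count of 6000
    `Thale cress' has a gene count of 26000
    ------------------------------------------------------------
    `Fruit fly' has a gene count of 13000
    `Human' has a gene count of 31000
    `Nematode worm' has a gene count of 18000
    `Thale cress' has a gene count of 26000
    `Tuberculosis microbe' has a gene count of 4000
    `Yeast' has a gene count of 6000
    ------------------------------------------------------------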

72 How to sort by the values?
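One common answer is to sort the keys, but compare the values they point to inside the sort block. This is a sketch using the gene counts from the genes program; the array name @by_count is illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %gene_counts = (
    Human                  => 31000,
    'Thale cress'          => 26000,
    'Nematode worm'        => 18000,
    'Fruit fly'            => 13000,
    Yeast                  => 6000,
    'Tuberculosis microbe' => 4000,
);

# sort hands two keys to the block as $a and $b. Looking both up in the
# hash and comparing with <=> orders the keys by value. Putting $b first
# gives largest-to-smallest order.
my @by_count = sort { $gene_counts{$b} <=> $gene_counts{$a} } keys %gene_counts;

foreach my $genome (@by_count) {
    print "`$genome' has a gene count of $gene_counts{$genome}\n";
}
```

Swapping $a and $b in the block gives smallest-to-largest order.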

73 Exercise Protein sequences

74 FASTA format
    >P53_HUMAN (P04637) Cellular tumor antigen p53 (Tumor suppressor p53) (Phosphoprotein p53) (Antigen NY-CO-13) - Homo sapiens (Human).
    MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
    DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
    SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
    RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
    SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
    PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
    GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

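75 Read a FASTA file
    #!/usr/bin/perl -w
    my ( $line, $queryname, $queryseq );
    $queryseq = '';    # start empty so the first append does not warn under -w
    while ( $line = <> )
    {
        if ( $line =~ />(.+?)\s.+/ )
        {
            $queryname = $1;    # header line: keep the text up to the first whitespace
        }
        else
        {
            chomp $line;
            $queryseq = $queryseq . $line;
        }
    }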

76 Exercise
    Read more than one sequence
    Store the protein names and sequences from disorder.fa in two arrays
    Show all of the protein names and sequences
    Show the number of proteins and residues ($len = length $seq;)

77 Exercise
    Read more than one sequence
    Store the protein names and sequences from disorder.fa in a hash
    Show the protein names and sequences sorted by protein name
    Find the longest sequence
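One possible starting point for the hash version is sketched below. It is not a reference solution. The file disorder.fa is not available here, so two made-up records in an in-memory string stand in for it, and the names %sequence_of, $fasta_text, $total_residues and $longest are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two invented records stand in for disorder.fa (hypothetical data).
my $fasta_text = <<'END_FASTA';
>SEQ_B example protein B
MEEPQSDPSV
EPPLSQETFS
>SEQ_A example protein A
MKTAYIAK
END_FASTA

# Open the string as if it were a file. To read a real file instead:
#   open my $fh, '<', 'disorder.fa' or die "Cannot open disorder.fa: $!";
open my $fh, '<', \$fasta_text or die "Cannot open in-memory FASTA: $!";

my %sequence_of;    # protein name => full sequence
my $name;           # name of the record currently being read

while ( my $line = <$fh> ) {
    chomp $line;
    if ( $line =~ /^>(\S+)/ ) {
        # Header line: the name is the text after > up to the first space.
        $name = $1;
        $sequence_of{$name} = '';
    }
    elsif ( defined $name ) {
        # Sequence line: append it to the current record.
        $sequence_of{$name} .= $line;
    }
}
close $fh;

# Show names and sequences sorted by protein name, and count residues.
my $total_residues = 0;
foreach my $protein ( sort keys %sequence_of ) {
    print "$protein\t$sequence_of{$protein}\n";
    $total_residues += length $sequence_of{$protein};
}
print "proteins: ", scalar( keys %sequence_of ), ", residues: $total_residues\n";

# Longest sequence: sort the names by sequence length, longest first.
my ($longest) = sort { length( $sequence_of{$b} ) <=> length( $sequence_of{$a} ) }
                keys %sequence_of;
print "longest: $longest\n";
```

The two-array version of the exercise can reuse the same loop, pushing each name onto one array and building each sequence in the matching position of the other.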