Published by Gladys Bennett. Modified over 9 years ago.
Computer Programming for Biologists
Class 2
Oct 31st, 2014
Karsten Hokamp
http://bioinf.gen.tcd.ie/GE3M25/programming

Overview
- recap
- project
- scalar variables
- built-in functions
- exercises

Recap
Topics covered in the first class:
1. Unix
2. Perl

Recap
1. Unix details
- commands: mkdir, cd, ls, pwd, rm, chmod
- command parameters (ls -l)
- command-line completion with the TAB key
- command history (arrow up or down)
- special treatment of spaces (quotes or backslash)
- information and help in manual pages (man ls)

Recap
2. Perl details
- variables ($repeat, $message)
- statements ($repeat = 4;)
- print function (print 'Hello world!';)
- newline (print "Hello world!\n";)
- x operator (print $message x 4;)
- reading user input: from a command-line argument ($repeat = shift;) or from a line of input ($repeat = <>;)
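
The recap items can be combined into one short script. This is a minimal sketch rather than the script from the first class; the default of 4 repeats when no argument is given is an assumption made for illustration.

```perl
use strict;
use warnings;

# Double quotes interpolate \n as a newline; single quotes would print it literally.
my $message = "Hello world!\n";

# shift takes the first command-line argument (from @ARGV at file level).
my $repeat = shift;
$repeat = 4 unless defined $repeat;   # assumed default, not from the slides

# The x operator repeats the string on its left.
my $output = $message x $repeat;
print $output;
```

Saved under a hypothetical name such as hello.pl, it would be run as `perl hello.pl 2` to print the message twice.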

Course project
Overall goal: development of a Perl script for sequence analysis
Input: file with sequences in FASTA format and command line options
Output:
- number and length of sequences
- GC content
- base / amino acid composition
- reverse complement
- translations
- virtual enzyme digest

FASTA Format
headers (starting with ‘>’ followed by sequence ID)
>124249405
ATGCCACCGAAGTTCGACCCCAACGAGATCAAGGTCGTATACCTGAGGTGCACCAGAGGTGAAGTCGGTG
CCACTTCTGCCCTGGCCCCCAAGATCGGCCCCCTGGGTCTGTCTCCAAAAAAGGTTGGTGATGACATTGC
CAAGGCAACGGGTGACTGGAAGGGCCTGAGGATTACAGTGAAACTGACCATTGAGAACAGACAGGCCCAG
AACAGAAAAACATTAAACACAATGGGAATATCACTTTTGATGAGATCGTCAACATTGCTCGACAGATGCG
CTAGTTAA
>124249383
ATGCCACCGAAGTTCGACCCCAACGAGATCAAGGTCGTATACCTGAGGTGCACCAGAGGTGAAGTCGGTG
CCACTTCTGCCCTGGCCCCCAAGATCGGCCCCCTGGGTCTGTCTCCAAAAAAGGTTGGTGATGACATTGC
CAAGGCAACGGGTGACTGGAAGGGCCTGAGGATTACGGTGAAACTGACCATTGAGAACAGACAGGCCCAG
AACAGAAAAACATTAAACACAATGGGAATATCACTTTTGATGAGATCGTCAACATTGCTCGACAGATGCG
CTAGTTAA
>110350667
ATGCCACCGAAGTTCGACCCCAACGAGATCAAGGTCGTATACCTGAGGTGCACCAGAGGTGAAGTCGGTG
CCACTTCTGCCCTGGCCCCCAAGATCGGCCCCCTGGGTCTGTCTCCAAAAAAGGTTGGTGATGACATTGC
CAAGGCAACGGGTGACTGGAAGGGCCTGAGGATTACGGTGAAACTGACCATTGAGAACAGACAGGCCCAG
AACAGAAAAACATTAAACACAATGGGAATATCACTTTTGATGAGATCGTCAACATTGCTCGACAGATGCG
CTAGTTAA

FASTA Format
sequence (split across multiple lines; same three records as on the previous slide)
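
The two properties of the format (a header line starting with '>' and a sequence wrapped over several lines) are enough to read a file record by record. The following is a sketch only: the sub name parse_fasta and the IDs seq1 and seq2 are hypothetical, the input is a string rather than a file to keep the example self-contained, and it uses a regular expression, which goes beyond what this class covers.

```perl
use strict;
use warnings;

# Parse FASTA text into a list of [id, sequence] pairs.
# Wrapped sequence lines are joined into one string per record.
sub parse_fasta {
    my ($text) = @_;
    my (@ids, %seq_for);
    my $id;
    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;              # drop trailing whitespace
        next if $line eq '';            # skip empty lines
        if ($line =~ /^>(\S+)/) {       # header: '>' followed by the ID
            $id = $1;
            push @ids, $id;
            $seq_for{$id} = '';
        }
        elsif (defined $id) {           # sequence line of the current record
            $seq_for{$id} .= $line;
        }
    }
    return map { [ $_, $seq_for{$_} ] } @ids;
}

# Hypothetical two-record input with wrapped sequence lines.
my $fasta = ">seq1\nATGCC\nACCGA\n>seq2\nTTAA\n";
for my $record (parse_fasta($fasta)) {
    my ($id, $seq) = @$record;
    print "$id\t", length($seq), "\n";
}
```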

Course project
Example usage: sequence length and GC content
[Screenshot: command and output]
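
GC content is the number of G and C bases divided by the sequence length. A minimal sketch of that calculation follows; the sub name gc_content and the output format are assumptions, since the command and output shown on the slide are not available here.

```perl
use strict;
use warnings;

# Return GC content as a percentage of the sequence length.
sub gc_content {
    my ($seq) = @_;
    my $len = length $seq;
    return 0 unless $len;               # avoid division by zero
    # tr with an empty replacement list counts matching characters
    # without changing the string.
    my $gc = ($seq =~ tr/GCgc//);
    return 100 * $gc / $len;
}

my $seq = 'ATGCCACCGAAGTTCGACCC';       # first 20 bases of the example record
printf "length: %d, GC content: %.1f%%\n", length($seq), gc_content($seq);
```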

Course project
Example usage: base composition and reverse complement
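
Base composition can be sketched by counting each character of the sequence in a hash. The sub name base_composition is hypothetical, and hashes are only introduced later in the course, so this is an illustration rather than the expected solution.

```perl
use strict;
use warnings;

# Count how often each base occurs, ignoring case.
sub base_composition {
    my ($seq) = @_;
    my %count;
    $count{$_}++ for split //, uc $seq;   # split // yields single characters
    return %count;
}

my %count = base_composition('ATGCCACCGAAGTTCGACCC');
for my $base (sort keys %count) {
    print "$base: $count{$base}\n";
}
```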

Course project
Example usage: translation and digest
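
A virtual digest cuts the sequence at every occurrence of an enzyme's recognition site. The sketch below covers only the digest, not the translation. The sub name digest and its parameters are hypothetical, and the example site is that of EcoRI (GAATTC, cut after the G); how the course script actually handles enzymes is not shown on the slide.

```perl
use strict;
use warnings;

# Cut $seq at every occurrence of $site, $offset bases into the site.
# Returns the list of fragments in order.
sub digest {
    my ($seq, $site, $offset) = @_;
    my @fragments;
    my $start = 0;                      # start of the current fragment
    my $pos   = 0;                      # where to continue searching
    while (($pos = index($seq, $site, $pos)) != -1) {
        my $cut = $pos + $offset;
        push @fragments, substr($seq, $start, $cut - $start);
        $start = $cut;
        $pos++;                         # move past this match
    }
    push @fragments, substr($seq, $start);   # remainder after the last cut
    return @fragments;
}

my @fragments = digest('AAGAATTCTTGAATTCGG', 'GAATTC', 1);
print join("\n", @fragments), "\n";
```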

Variables
Basic elements for programming:
- hold information
- allow changing of information
- organize data
- complex constructs possible
- special operations for data handling

Variables
Three different types in Perl:
1. scalar
2. array
3. hash
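
The three types can be told apart by their leading symbol. A small sketch with made-up variable names:

```perl
use strict;
use warnings;

my $id         = '124249405';                                 # scalar: one value
my @bases      = ('A', 'C', 'G', 'T');                        # array: ordered list
my %complement = (A => 'T', C => 'G', G => 'C', T => 'A');    # hash: key/value pairs

# A single element of an array or hash is itself a scalar, hence the $.
print "Sequence $id\n";
print "Second base: $bases[1]\n";
print "Complement of G: $complement{G}\n";
```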

Variables
1) Scalars:
- Content: a number or a string of characters
- Variable name starts with a dollar sign ($) followed by a letter or underscore; the rest of the name can contain letters, digits and underscores
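
A short sketch of scalars holding numbers and strings, with variable names chosen for illustration only. Perl converts between the two as needed, depending on the operator used.

```perl
use strict;
use warnings;

my $count       = 3;         # integer
my $gc_fraction = 0.5;       # floating-point number
my $sequence_1  = 'ATGC';    # string; digits and underscores are fine after the first character
my $_label      = 'test';    # a leading underscore is allowed too

my $next   = $count + 1;           # numeric addition
my $longer = $sequence_1 . 'AA';   # string concatenation with the . operator
my $sum    = '10' + 5;             # a numeric string is treated as a number by +

print "$next $longer $sum\n";
```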

Variables
Special scalars:
$_        default scalar
$a, $b    placeholders for comparisons (e.g. in sort)
$0        name of program
$!        system error message
See man perlvar for many more special variables
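
Each of these special scalars can be seen at work in a few lines. This is a sketch; the file path is a made-up name that is assumed not to exist, so that open fails and sets $!.

```perl
use strict;
use warnings;

# $_ : the loop variable when none is named; uc without an argument uses it.
my @upper;
for ('acgt', 'ttga') {
    push @upper, uc;
}

# $a and $b : the two values being compared inside a sort block.
my @sorted = sort { $a <=> $b } (10, 9, 100);

# $0 : the name of the running program.
print "Running $0\n";

# $! : the system error message after a failed system call.
my $error = '';
open(my $fh, '<', '/no/such/file.fa') or $error = "$!";   # hypothetical path
print "Cannot open file: $error\n" if $error;

print "@upper\n@sorted\n";
```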

Scalars
Practical session:
Go to http://bioinf.gen.tcd.ie/GE3M25/class2 and try the ‘Recap’ exercises

Built-in functions for scalars
- lc (change letters in string to lower case)
- uc (change letters in string to upper case)
- chop (remove last character)
- chomp (remove trailing newline, if present)
- reverse (reverse list or string)
- length (calculate length of a string)
- split (split a string into a list)
- substr (extract parts of a string)
- tr (character-by-character translation of text)
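
A sketch that applies each of the listed functions to one short, made-up sequence string:

```perl
use strict;
use warnings;

my $seq = "AcgTTga\n";
chomp $seq;                           # removes the trailing newline only

my $lower  = lc $seq;                 # all lower case
my $upper  = uc $seq;                 # all upper case
my $len    = length $seq;             # number of characters
my $rev    = scalar reverse $seq;     # scalar context reverses the string
my $first3 = substr($seq, 0, 3);      # 3 characters starting at position 0
my @bases  = split //, $seq;          # one list element per character

my $chopped = $seq;
chop $chopped;                        # removes the last character, whatever it is

my $rna = $upper;
$rna =~ tr/T/U/;                      # replaces every T with U

print "$lower $upper $len $rev $first3 $chopped $rna\n";
```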

Built-in functions for scalars
Different ways of using functions:
$out = uc($in);
$out = uc $in;
$out = uc; # works on default variable $_
Combination of functions:
$out = uc(reverse($in));
$out = uc reverse $in;
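
The calling styles above give the same result, which the following sketch checks on a made-up input. It also shows one point worth knowing about reverse: it reverses a string only in scalar context, while in list context it reverses the order of the list elements.

```perl
use strict;
use warnings;

my $in = 'acgt';

my $with_parens    = uc($in);
my $without_parens = uc $in;

my $from_default;
for ($in) {                       # inside the loop, $_ is $in
    $from_default = uc;           # no argument: works on $_
}

my $nested  = uc(reverse($in));   # uc evaluates its argument in scalar context
my $chained = uc reverse $in;     # same call without parentheses

my @as_list   = reverse $in;      # list context: a one-element list, string unchanged
my $as_string = reverse $in;      # scalar context: the string is reversed

print "$with_parens $without_parens $from_default $nested $chained\n";
print "@as_list $as_string\n";
```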

Built-in functions
online help: http://perldoc.perl.org/
more help available on the command line:
man perlfunc          overview of all built-in functions
perldoc -f command    information on a specific command only

Built-in functions
Practical session:
Go to http://bioinf.gen.tcd.ie/GE3M25/class2 and try the ‘Functions’ exercises

Exercises
1) Write a program that takes some text from the command line, prints it out in capital letters and also reports the length of the text, e.g.:
caps.pl 'Hello World!'
returns:
HELLO WORLD! (length: 12)
2) Write a program that takes a DNA sequence from the command line and prints out the reverse complement. Make sure that it works with both lower-case and capital letters, e.g.:
revcomp.pl aatTTgggcca
returns:
TGGCCCAAATT
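
One possible approach to both exercises, written as two subs so that the logic can be tried on the example inputs. This is a sketch, not the model solution from the course; the sub names are hypothetical, and in the actual scripts the input would come from the command line via shift.

```perl
use strict;
use warnings;

# Exercise 1: upper-case text plus its length.
sub caps_report {
    my ($text) = @_;
    return uc($text) . ' (length: ' . length($text) . ')';
}

# Exercise 2: reverse complement, accepting lower- and upper-case input.
sub revcomp {
    my ($dna) = @_;
    my $rc = uc(scalar reverse $dna);   # reverse the string, then upper-case it
    $rc =~ tr/ACGT/TGCA/;               # swap each base for its complement
    return $rc;
}

print caps_report('Hello World!'), "\n";
print revcomp('aatTTgggcca'), "\n";
```

Converting to upper case before tr means only the four capital letters need to be listed in the translation.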