9/28/2015BCHB524 - 2015 - Edwards Basic Python Review BCHB524 2015 Lecture 8.

Basic Python Review: BCHB524 Lecture 8

Python Data-Structures
Mutable and changeable storage of many items:
  Lists: access by index or iteration
  Dictionaries: access by key or iteration
  Sets: access by iteration, membership test
  Files: access by iteration, as string
Lists of numbers (range)
Strings → List (split), List → String (join)
Reading sequences, parsing codon table.
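
As a quick reminder of the access patterns listed on this slide, here is a minimal sketch. It is written in Python 3 syntax (the slides themselves use Python 2), and all variable names are illustrative rather than taken from the lecture.

```python
# Lists: access by index or by iteration
bases = ['A', 'C', 'G', 'T']
first = bases[0]

# Dictionaries: access by key
complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
partner = complement['A']

# Sets: membership test
valid = set('ACGT')
is_valid = 'N' in valid

# Lists of numbers with range (every third position)
positions = list(range(0, 9, 3))

# String -> list (split), list -> string (join)
words = "ATG GCC TAA".split()
seq = ''.join(words)
```

The split/join pair is the same idiom the sequence-reading code in this lecture uses to strip whitespace from a file's contents.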

Class Review Exercises
1. DNA sequence length *
2. Are all DNA symbols valid? *
3. DNA sequence composition *
4. Pretty-print codon table **
5. Compute codon usage **
6. Read chunk format sequence from file *
7. Parse and print NCBI taxonomy names **

DNA Sequence Length
Write a program to determine the length of a DNA sequence provided in a file.

DNA Sequence Length

    # Import the required modules
    import sys

    # Check there is user input
    if len(sys.argv) < 2:
        print "Please provide a DNA sequence file on the command-line."
        sys.exit(1)

    # Assign the user input to a variable
    seqfile = sys.argv[1]

    # and read the sequence
    seq = ''.join(file(seqfile).read().split())

    # Compute the sequence length
    seqlen = len(seq)

    # Output a summary of the user input and the result
    print "Input DNA sequence:",seq
    print "Input DNA sequence length:",seqlen

Valid DNA Symbols
Write a program to determine if a DNA sequence provided in a file contains any invalid symbols.
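
One possible approach, sketched in Python 3 syntax, is to compare the set of symbols in the sequence against the set of valid symbols. The function name `invalid_symbols` and the choice of `ACGT` as the valid alphabet are assumptions made for illustration; the sequence is passed in as a string, and reading it from a file would work as in the sequence-length program.

```python
def invalid_symbols(seq, valid='ACGT'):
    """Return the sorted list of distinct symbols in seq that are not in valid."""
    return sorted(set(seq.upper()) - set(valid))

# Hypothetical example sequence containing two invalid symbols
bad = invalid_symbols("ACGTNNACGTX")
if bad:
    print("Invalid symbols found: " + ' '.join(bad))
else:
    print("All symbols are valid.")
```

Using a set difference avoids an explicit loop over the sequence, although a loop with a membership test per symbol would be equally reasonable.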

DNA Composition
Write a program to count the proportion of each symbol in a DNA sequence, provided in a file.
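
A minimal sketch of the counting step, in Python 3 syntax, might use a dictionary keyed by symbol. The function name `composition` is hypothetical, and the sequence is assumed to be already read into a string.

```python
def composition(seq):
    """Return a dictionary mapping each symbol to its proportion of seq."""
    counts = {}
    for sym in seq:
        # get() supplies 0 the first time a symbol is seen
        counts[sym] = counts.get(sym, 0) + 1
    total = len(seq)
    return {sym: float(n) / total for sym, n in counts.items()}

# Hypothetical example sequence
props = composition("AACG")
for sym in sorted(props):
    print(sym, props[sym])
```

An empty sequence yields an empty dictionary, so no division by zero occurs.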

Pretty-print codon table
Write a program which takes a codon table file (standard.code) as input, and prints the codon table in the format shown.
Hint: Use 3 (nested) loops through the nucleotide values.

Pretty-print codon table

    # read codons from a file
    def readcodons(codonfile):
        f = open(codonfile)
        data = {}
        for l in f:
            sl = l.split()
            key = sl[0]
            value = sl[2]
            data[key] = value
        f.close()

        b1 = data['Base1']
        b2 = data['Base2']
        b3 = data['Base3']
        aa = data['AAs']
        st = data['Starts']

        codons = {}
        init = {}
        n = len(aa)
        for i in range(n):
            codon = b1[i] + b2[i] + b3[i]
            codons[codon] = aa[i]
            init[codon] = (st[i] == 'M')
        return codons,init

Pretty-print codon table

    # Import the required modules
    import sys

    # Check there is user input
    if len(sys.argv) < 2:
        print "Please provide a codon-table on the command-line."
        sys.exit(1)

    # Assign the user input to variables
    codonfile = sys.argv[1]

    # Call the appropriate functions to get the codon table and the sequence
    codons,init = readcodons(codonfile)

    # Loop through the nucleotides (position 2 changes across the row).
    # Bare print starts a new line
    for n1 in 'TCAG':
        for n3 in 'TCAG':
            for n2 in 'TCAG':
                codon = n1+n2+n3
                print codon,codons[codon],
                if init[codon]:
                    print "i ",
                else:
                    print "  ",
            print
        print

Codon usage
Write a program to compute the codon usage of a gene whose DNA sequence is provided in a file.
Assume translation starts with the first symbol of the provided gene sequence.
Use a dictionary to count the number of times each codon appears, and then output the codon counts in amino-acid order.
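
The counting and ordering steps could be sketched as follows, in Python 3 syntax. To keep the example self-contained, the codon table is built from the NCBI standard genetic code string in TCAG order instead of being read from standard.code with readcodons; that substitution, the function names, and the example sequence are assumptions made for illustration.

```python
BASES = 'TCAG'
# Amino acids of the standard genetic code, with codons enumerated in TCAG order
AAS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'

def standard_codons():
    """Build a codon -> amino-acid dictionary for the standard code."""
    codons = {}
    i = 0
    for b1 in BASES:
        for b2 in BASES:
            for b3 in BASES:
                codons[b1 + b2 + b3] = AAS[i]
                i += 1
    return codons

def codon_usage(seq):
    """Count each complete codon, starting at the first symbol of seq."""
    counts = {}
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        counts[codon] = counts.get(codon, 0) + 1
    return counts

codons = standard_codons()
usage = codon_usage("ATGGCCGCCTAA")  # hypothetical gene sequence

# Sort by amino-acid symbol first, then by codon; unknown codons sort as 'X'
for codon in sorted(usage, key=lambda c: (codons.get(c, 'X'), c)):
    print(codon, codons.get(codon, 'X'), usage[codon])
```

Any trailing one or two symbols that do not form a complete codon are ignored by the range bounds.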

Chunk format sequence
Write a program to compute the sequence composition from a DNA sequence file in "chunk" format.
Download these files from the data-directory:
  SwissProt_Format_Ns.seq
  SwissProt_Format.seq
Check that your program correctly reads these sequences.
Download and check these files from the data-directory, too: chunk.seq, chunk_ns.seq
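
The slide does not define the "chunk" format, so the following Python 3 sketch rests on an assumption: that the files contain blocks of sequence symbols separated by whitespace, possibly with numeric position labels. Under that assumption, keeping only alphabetic characters recovers the sequence. The function name `clean_sequence` and the sample text are hypothetical, and any header or description lines in the real files would need separate handling.

```python
def clean_sequence(text):
    """Keep only alphabetic characters, dropping whitespace, digits, and punctuation."""
    return ''.join(ch for ch in text if ch.isalpha()).upper()

# Hypothetical chunked input: position labels followed by blocks of 10 symbols
chunked = """        1 gatcctccat atacaacggt
       21 atctccacct caggtttaga
"""
seq = clean_sequence(chunked)
print(len(seq))
```

The cleaned string can then be passed to whatever composition code was written for the earlier exercise.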

Taxonomy names
Write a program to list all the scientific names from an NCBI taxonomy file.
Download the names.dmp file from the data-directory.
Look at the file and figure out how to parse it.
Read the file, line by line, and print out only those names that represent scientific names of species.
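
A rough Python 3 sketch of the parsing step is shown below. It assumes the usual layout of names.dmp in the NCBI taxonomy dump: fields separated by tab-pipe-tab, in the order tax id, name, unique name, name class. The function name `scientific_names` and the two sample lines are made up for illustration. Note that names.dmp does not record taxonomic rank (that lives in nodes.dmp in the NCBI dump), so this sketch selects every entry whose name class is "scientific name" without restricting to species.

```python
def scientific_names(lines):
    """Return the names whose name-class field is 'scientific name'."""
    names = []
    for line in lines:
        # Split on the pipe delimiter and strip the surrounding tabs
        fields = [f.strip() for f in line.rstrip('\n').split('|')]
        if len(fields) >= 4 and fields[3] == 'scientific name':
            names.append(fields[1])
    return names

# Hypothetical sample lines in the assumed names.dmp layout
sample = [
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n",
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n",
]
for name in scientific_names(sample):
    print(name)
```

Because the function accepts any iterable of lines, an open file object can be passed directly to process names.dmp line by line.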

Exercise 1
a) Modify your DNA translation program to translate in each forward frame (1,2,3).
b) Modify your DNA translation program to translate in each reverse translation frame too.
c) Modify your translation program to handle 'N' symbols in the third position of a codon. If all four codons represented correspond to the same amino-acid, then output that amino-acid. Otherwise, output 'X'.
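
The three parts of this exercise can be sketched together in Python 3 syntax. This is one possible shape for a solution, not the course's reference answer: the codon table is hard-coded from the NCBI standard genetic code string (the course programs read it from a file instead), input is assumed to be upper-case A, C, G, T, and N only, and all function names are hypothetical.

```python
BASES = 'TCAG'
# Amino acids of the standard genetic code, with codons enumerated in TCAG order
AAS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'

CODONS = {}
idx = 0
for b1 in BASES:
    for b2 in BASES:
        for b3 in BASES:
            CODONS[b1 + b2 + b3] = AAS[idx]
            idx += 1

COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return ''.join(COMPLEMENT[b] for b in reversed(seq))

def translate_codon(codon):
    """Translate one codon, resolving 'N' in the third position when possible."""
    if codon in CODONS:
        return CODONS[codon]
    if codon[2] == 'N' and 'N' not in codon[:2]:
        # Collect the amino acids of the four codons the 'N' could represent
        aas = set(CODONS.get(codon[:2] + b, 'X') for b in BASES)
        if len(aas) == 1:
            return aas.pop()
    return 'X'

def translate(seq, frame):
    """Translate seq in forward frame 1, 2, or 3."""
    start = frame - 1
    return ''.join(translate_codon(seq[i:i + 3])
                   for i in range(start, len(seq) - 2, 3))

def six_frames(seq):
    """Return translations keyed by frame: 1..3 forward, -1..-3 reverse."""
    rc = reverse_complement(seq)
    result = {}
    for frame in (1, 2, 3):
        result[frame] = translate(seq, frame)
        result[-frame] = translate(rc, frame)
    return result

frames = six_frames("ATGGCNTAA")  # hypothetical input sequence
for frame in sorted(frames):
    print(frame, frames[frame])
```

Reverse frames are obtained by translating the reverse complement in the three forward frames, which keeps the translation function itself unchanged.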