1
Introduction to Python
BCHB524 Lecture 5 BCHB524 - Edwards
2
Outline
Homework #2 Solutions
Homework #1 Notes
DNA as a string
  Extracting codons in DNA
  Counting in-frame codons in DNA
  Reverse Complement
Program Input/Output
  raw_input, command-line arguments
  standard-input, standard-output, redirection
3
Homework #1 Notes
Python programs:
  Upload .py files
  Don't paste into comment box
  Don't paste into your writeup
Writeup:
  Upload .txt files
  Text document preferred
4
Homework #1 Notes
Multiple submissions:
  OK, but…
  …I'll ignore all except the last one
  Make each (re-)submission complete
Grading:
  Random grading order
  Comments
  Grading "curve"
5
Review
Printing and execution
Variables and basic data-types:
  integers, floats, strings
  Arithmetic with, conversion between
  String characters and chunks, string methods
Functions, using/calling and defining:
  Use in any expression
  Parameters as input, return for output
Control Flow:
  if statements – conditional execution
  for statements – iterative execution
6
DNA as a string

    seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"
    seqlen = len(seq)
    # set i to 0, 3, 6, 9, ..., 36
    for i in range(0,seqlen,3):
        # extract the codon as a string
        codon = seq[i:i+3]
        print codon
    print "Number of Met. amino-acids", seq.count("atg")
7
DNA as a string
What about upper and lower case?
  ATG vs atg?
Differences between DNA and RNA sequence?
  Substitute U for each T?
How about ambiguous nucleotide symbols?
  What should we do with ‘N’ and other ambiguity codes (R, Y, W, S, M, K, H, B, V, D)?
Strings don’t know any biology!
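One way to approach these questions is sketched below. It is written for Python 3 (where print is a function), and the function names normalize, dna_to_rna and unexpected_symbols are illustrative, not part of the lecture. The sketch assumes that upper case is the chosen canonical form and that ambiguity codes are accepted but left untouched.

```python
# Python 3 sketch; all names here are illustrative assumptions.
DNA_SYMBOLS = "ACGT"
AMBIGUITY_CODES = "NRYWSMKHBVD"

def normalize(seq):
    """Return the sequence in upper case so 'atg' and 'ATG' compare equal."""
    return seq.upper()

def dna_to_rna(seq):
    """Substitute U for each T after normalizing the case."""
    return normalize(seq).replace("T", "U")

def unexpected_symbols(seq):
    """List distinct symbols that are neither A, C, G, T nor an ambiguity code."""
    found = []
    for symbol in normalize(seq):
        known = symbol in DNA_SYMBOLS or symbol in AMBIGUITY_CODES
        if not known and symbol not in found:
            found.append(symbol)
    return found

print(normalize("atg") == "ATG")         # True
print(dna_to_rna("gcatgacgtt"))          # GCAUGACGUU
print(unexpected_symbols("ACGTNRYx-"))   # ['X', '-']
```

The string itself never enforces any of this; each program has to decide which symbols it accepts and what to do with the rest.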
8
DNA as a string

    seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"

    def inFrameMet(seq):
        seqlen = len(seq)
        count = 0
        for i in range(0,seqlen,3):
            codon = seq[i:i+3]
            if codon.upper() == "ATG":
                count = count + 1
        return count

    print "Number of Met. amino-acids", inFrameMet(seq)
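The difference between seq.count("atg") and the in-frame count can be seen on the example sequence itself. The sketch below is Python 3 and generalizes inFrameMet with two hypothetical parameters, codon and frame, which are not in the lecture code.

```python
# Python 3 sketch; count_in_frame and its parameters are illustrative.
def count_in_frame(seq, codon, frame=0):
    """Count codon at positions frame, frame+3, frame+6, ... (case-insensitive)."""
    target = codon.upper()
    count = 0
    for i in range(frame, len(seq), 3):
        if seq[i:i+3].upper() == target:
            count += 1
    return count

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"
print(seq.count("atg"))               # 1: found anywhere in the string
print(count_in_frame(seq, "atg"))     # 0: the only atg starts at index 2
print(count_in_frame(seq, "atg", 2))  # 1: in frame when reading starts at index 2
```

For this sequence, the single "atg" sits at index 2, so it is counted by seq.count but not by the frame-0 loop.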
9
DNA as a string

    input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        comp = complements[i]
        return comp

    def reverseComplement(seq):
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print "Reverse complement:", reverseComplement(input_seq)
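This version of complement quietly assumes that every character is an upper-case A, C, G or T. A short Python 3 sketch of what happens otherwise: find returns -1 for a character it cannot locate, and index -1 selects the last character of complements.

```python
# Python 3 sketch of the lookup used in complement(), applied to three inputs.
nucleotides = "ACGT"
complements = "TGCA"

for nuc in "ANa":
    i = nucleotides.find(nuc)
    # find() returns -1 when nuc is absent; complements[-1] is then 'A'
    print(nuc, i, complements[i])

# Output:
# A 0 T
# N -1 A
# a -1 A
```

Because the example input_seq is lower case, every lookup misses and each nucleotide is reported as 'A'.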
10
DNA as a string

    input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        if i >= 0:
            comp = complements[i]
        else:
            comp = nuc
        return comp

    def reverseComplement(seq):
        seq = seq.upper()
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print "Reverse complement:", reverseComplement(input_seq)
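For comparison, the same behavior can be sketched with a translation table instead of the character-by-character loop. This is a different technique from the one on the slide, written for Python 3; it relies on str.maketrans, str.translate and the reversing slice [::-1], and the names COMPLEMENT_TABLE and reverse_complement are illustrative.

```python
# Python 3 sketch; an alternative to the loop-based reverseComplement above.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Upper-case the sequence, complement A/C/G/T, then reverse it.

    Characters missing from the table (such as N) pass through unchanged,
    matching the 'else: comp = nuc' branch of the slide's complement().
    """
    return seq.upper().translate(COMPLEMENT_TABLE)[::-1]

print(reverse_complement("catgacgtt"))  # AACGTCATG
print(reverse_complement("acgtn"))      # NACGT
```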
11
Creating reusable programs
Need to get input data and options from the user
  …often us, but sometimes others, or us later.
Sometimes, want completely new inputs
  …but often, want the same or similar input.
Sometimes, typing the input is OK
  …but often, want to use data in a file.
Sometimes, output to the screen is OK
  …but often, want the result to go into a file.
12
Interactive input

    input_seq = raw_input("Type your codon: ")

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        if i >= 0:
            comp = complements[i]
        else:
            comp = nuc
        return comp

    def reverseComplement(seq):
        seq = seq.upper()
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print "Reverse complement:", reverseComplement(input_seq)
13
Command-line input

    import sys
    input_seq = sys.argv[1]

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        if i >= 0:
            comp = complements[i]
        else:
            comp = nuc
        return comp

    def reverseComplement(seq):
        seq = seq.upper()
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print "Reverse complement:", reverseComplement(input_seq)
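If the program is run with no argument, sys.argv[1] does not exist and Python stops with an IndexError. A minimal Python 3 sketch of a guard follows; the main function, the usage text and the exit codes are assumptions for illustration, not part of the lecture code.

```python
# Python 3 sketch; main() and the usage message are illustrative.
import sys

def main(argv):
    """Check for the sequence argument before touching argv[1]."""
    if len(argv) != 2:
        # argv[0] is the script name; argv[1] should be the sequence.
        sys.stderr.write("Usage: python " + argv[0] + " <sequence>\n")
        return 1
    input_seq = argv[1]
    print("Sequence from the command line:", input_seq)
    return 0

# Simulated command lines, so the sketch runs without real arguments.
print(main(["rc.py", "catgac"]))  # prints the sequence line, then 0
print(main(["rc.py"]))            # usage message on standard error, then 1
```

A real script would end with sys.exit(main(sys.argv)) instead of the two simulated calls.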
14
Interactive and file input

    import sys
    input_seq = sys.stdin.read()

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        if i >= 0:
            comp = complements[i]
        else:
            comp = nuc
        return comp

    def reverseComplement(seq):
        seq = seq.upper()
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print "Reverse complement:", reverseComplement(input_seq)
15
File input only

    import sys
    seq_file = sys.argv[1]
    # MAGIC: open file, read contents, and remove whitespace
    input_seq = ''.join(open(seq_file).read().split())

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        if i >= 0:
            comp = complements[i]
        else:
            comp = nuc
        return comp

    def reverseComplement(seq):
        seq = seq.upper()
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print "Reverse complement:", reverseComplement(input_seq)
16
Input Summary
raw_input provides interactive values from the user (also copy-and-paste)
sys.stdin.read() provides interactive or file-based values from the user (also copy-and-paste)
sys.argv[1] provides command-line values from the user (also copy-and-paste)
  value can be a filename that provides user-input
Terminal standard-input redirection "<" can be used to send a file's contents to raw_input or sys.stdin.read()
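The input styles summarized above can be combined in one helper. The Python 3 sketch below assumes one possible convention, not prescribed by the lecture: if a filename is given as the first argument, read that file; otherwise read standard input. The names clean and read_sequence are illustrative, and io.StringIO stands in for standard input so the example runs on its own.

```python
# Python 3 sketch; clean() and read_sequence() are illustrative helpers.
import io

def clean(text):
    """Remove all whitespace, including newlines, from the text."""
    return "".join(text.split())

def read_sequence(argv, stdin):
    """Read from the file named in argv[1] if present, else from stdin."""
    if len(argv) > 1:
        with open(argv[1]) as handle:
            return clean(handle.read())
    return clean(stdin.read())

# io.StringIO plays the role of redirected standard input here.
fake_stdin = io.StringIO("catgac\ngttatt\n")
print(read_sequence(["rc.py"], fake_stdin))  # catgacgttatt
```

In a real script the call would be read_sequence(sys.argv, sys.stdin), after import sys.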
17
Output is easy…
Just use print, right?
Print statements go to the terminal's standard-output.
  We can redirect to a file using ">"
  Errors still get printed to the terminal.
We can also link programs together – standard-output to standard-input using "|"
  Also, cat just writes its file to standard out
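The split between results and errors can be made explicit in a program. In the Python 3 sketch below, results go to standard output and warnings go to standard error through sys.stderr; the function write_result and its warning text are assumptions for illustration.

```python
# Python 3 sketch; write_result() is an illustrative helper.
import sys

def write_result(seq, out, err):
    """Write the sequence to out; write a warning to err if it is empty."""
    if seq == "":
        err.write("warning: empty sequence\n")
        return False
    out.write(seq + "\n")
    return True

write_result("AACGTCATG", sys.stdout, sys.stderr)  # result on standard output
write_result("", sys.stdout, sys.stderr)           # warning on standard error
```

If this were saved in a script and run with ">" redirection, the sequence would land in the file while the warning would still appear on the terminal.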
18
Connect reverse complement w/ codon counting…
Create and test rc.py from earlier slides:
  Sequence from standard-input
  Reverse complement sequence to standard-output
Create and test codons.py from earlier slides:
  Count to standard-output
Place example sequence in file: test.seq
Execute:
    cat test.seq | python rc.py | python codons.py
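A rough Python 3 sketch of how the two stages fit together is shown below. It assumes that rc.py writes only the sequence (no "Reverse complement:" label, which would otherwise be fed to the next program as if it were sequence) and that codons.py counts in-frame ATG codons. The stage functions and the io.StringIO stand-ins for the pipe are illustrative, not the lecture's files.

```python
# Python 3 sketch of the pipeline; both stage functions are illustrative.
import io

def rc_stage(stdin, stdout):
    """rc.py: sequence on standard input, reverse complement on standard output."""
    seq = "".join(stdin.read().split()).upper()
    newseq = ""
    for nuc in seq:
        i = "ACGT".find(nuc)
        if i >= 0:
            nuc = "TGCA"[i]
        newseq = nuc + newseq
    stdout.write(newseq + "\n")

def codons_stage(stdin, stdout):
    """codons.py: sequence on standard input, in-frame ATG count on standard output."""
    seq = "".join(stdin.read().split()).upper()
    count = 0
    for i in range(0, len(seq), 3):
        if seq[i:i+3] == "ATG":
            count += 1
    stdout.write(str(count) + "\n")

# Stand-ins for: cat test.seq | python rc.py | python codons.py
test_seq = io.StringIO("catgacgtt\n")
pipe = io.StringIO()
rc_stage(test_seq, pipe)
pipe.seek(0)                      # rewind so the next stage reads from the start
result = io.StringIO()
codons_stage(pipe, result)
print(result.getvalue().strip())  # 1
```

The reverse complement of catgacgtt is AACGTCATG, whose third in-frame codon is ATG, so the count is 1.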
19
In general
Windows and OS X have similar facilities
  cmd in Windows, terminal in OS X
Powerful mechanism for making reusable programs
  No knowledge of Python required for use!
Most bioinformatics software is used from the command-line w/ command-line arguments:
  Files provide sequence data, etc.
I'll promote this style of program I/O.
20
Exercise 1
Use NCBI Probe (“google NCBI Probe”) to look up PCR markers for your favorite gene
Write a command-line program to compute the reverse complement sequence for the forward and reverse primer.
21
Exercise 2
Write a command-line program to test whether a PCR primer is a reverse complement palindrome.
  Such a primer might fold and self-hybridize!
Test your program on at least the following primers:
  TTGAGTAGACGCGTCTACTCAA
  TTGAGTAGACGTCGTCTACTCAA
  ATATATATATATATAT
  ATCTATATATATGTAT