1 Essential Computing for Bioinformatics Bienvenido Vélez UPR Mayaguez Lecture 5 High-level Programming with Python Part II: Container Objects Reference:

Presentation transcript:

1 Essential Computing for Bioinformatics
Bienvenido Vélez, UPR Mayaguez
Lecture 5: High-level Programming with Python
Part II: Container Objects
Reference: How to Think Like a Computer Scientist: Learning with Python (Ch 3-6)

2 Outline
Lists
Matrices
Tuples
Dictionaries

3 List Values
[10, 20, 30, 40]
['spam', 'bungee', 'swallow']
['hello', 2.0, 5, [10, 20]]    Lists can be heterogeneous and nested
[]                             The empty list

4 Generating Integer Sequences
>>> range(1,5)
[1, 2, 3, 4]
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(1, 10, 2)
[1, 3, 5, 7, 9]
In general: range(first, last+1, step)
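The outputs on this slide come from Python 2, where range returns a list. A minimal sketch of the same three calls, assuming a Python 3 interpreter (an assumption, not part of the original slides), where range is lazy and list() is needed to see the values:

```python
# Python 3: range() returns a lazy range object, so list() materializes it.
one_to_four = list(range(1, 5))      # first=1, last=4, so the stop value is last+1
zero_to_nine = list(range(10))       # first defaults to 0
odd_numbers = list(range(1, 10, 2))  # step of 2

print(one_to_four)
print(zero_to_nine)
print(odd_numbers)
```

The stop value is always excluded, which is why the slide writes the general form as range(first, last+1, step).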

5 Accessing List Elements
>>> words = ['hello', 'my', 'friend']
>>> words[1]                 # single element
'my'
>>> words[1:3]               # slices
['my', 'friend']
>>> words[-1]                # negative index
'friend'
>>> 'friend' in words        # testing list membership
True
>>> words[0] = 'goodbye'     # lists are mutable
>>> print words
['goodbye', 'my', 'friend']

6 More List Slices
The slicing operator always returns a new list
>>> numbers = range(1,5)
>>> numbers[1:]
[2, 3, 4]
>>> numbers[:3]
[1, 2, 3]
>>> numbers[:]
[1, 2, 3, 4]
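A short sketch of why it matters that a slice is a new list: changing the copy leaves the original untouched. It assumes Python 3 (hence the list() around range), and the variable names are illustrative only.

```python
numbers = list(range(1, 5))   # [1, 2, 3, 4]

tail = numbers[1:]            # everything from index 1 onward
head = numbers[:3]            # everything before index 3
whole = numbers[:]            # a full (shallow) copy

whole[0] = 99                 # modify the copy only
print(tail, head, whole, numbers)
```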

7 Modifying Slices of Lists
Replacing slices
>>> list = ['a', 'b', 'c', 'd', 'e', 'f']
>>> list[1:3] = ['x', 'y']
>>> print list
['a', 'x', 'y', 'd', 'e', 'f']
Deleting slices
>>> list[1:3] = []
>>> print list
['a', 'd', 'e', 'f']
Inserting slices
>>> list = ['a', 'd', 'f']
>>> list[1:1] = ['b', 'c']
>>> print list
['a', 'b', 'c', 'd', 'f']
>>> list[4:4] = ['e']
>>> print list
['a', 'b', 'c', 'd', 'e', 'f']
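The three slice operations can be chained on one list, as in this sketch. It assumes Python 3 and uses the name letters instead of list, since a variable called list hides the built-in list type for the rest of the session.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

letters[1:3] = ['x', 'y']      # replace two elements
after_replace = list(letters)  # snapshot (works because 'list' is not shadowed)

letters[1:3] = []              # delete the same two positions
after_delete = list(letters)

letters[1:1] = ['b', 'c']      # an empty slice marks an insertion point
after_insert = list(letters)

print(after_replace, after_delete, after_insert)
```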

8 Traversing Lists (2 ways)
Way 1:
for VARIABLE in LIST:
    BODY
Way 2:
i = 0
while i < len(LIST):
    VARIABLE = LIST[i]
    BODY
    i = i + 1
Which one do you prefer? Why?
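To compare the two traversal styles on this slide, here is a small sketch that runs both over the same list and collects what each one visits. The list of codons and the variable names are made up for illustration.

```python
codons = ['atg', 'gaa', 'taa']

# Way 1: the for loop hands us each element directly.
visited_for = []
for codon in codons:
    visited_for.append(codon.upper())

# Way 2: the while loop manages the index by hand.
visited_while = []
i = 0
while i < len(codons):
    codon = codons[i]
    visited_while.append(codon.upper())
    i = i + 1

print(visited_for == visited_while)
```

Both loops visit the same elements in the same order; the for loop simply has fewer places to make an off-by-one mistake.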

9 Traversal Examples
for number in range(20):
    if number % 2 == 0:
        print number

for fruit in ['banana', 'apple', 'quince']:
    print 'I like to eat ' + fruit + 's!'

10 Python Sequence Types
Type         Description                  Elements                       Mutable
StringType   Character string             Characters only                no
UnicodeType  Unicode character string     Unicode characters only        no
ListType     List                         Arbitrary objects              yes
TupleType    Immutable list               Arbitrary objects              no
XRangeType   Returned by xrange()         Integers                       no
BufferType   Buffer returned by buffer()  Arbitrary objects of one type  yes/no
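The "Mutable" column can be checked directly. The sketch below assumes Python 3 and covers only strings, tuples and lists; try_assign is a hypothetical helper written for this illustration.

```python
def try_assign(seq):
    """Return True if seq[0] can be reassigned, False if seq is immutable."""
    try:
        seq[0] = seq[0]
        return True
    except TypeError:
        # Immutable sequences reject item assignment with a TypeError.
        return False

results = {
    'str': try_assign('spam'),
    'tuple': try_assign((1, 2)),
    'list': try_assign([1, 2]),
}
print(results)
```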

11 Operations on Sequences
Operator/Function        Action                    Action on Numbers
[...], (...), '...'      creation
s + t                    concatenation             addition
s * n                    repetition n times        multiplication
s[i]                     indexing
s[i:k]                   slice
x in s                   membership
x not in s               absence
for a in s               traversal
len(s)                   length
min(s)                   return smallest element
max(s)                   return greatest element
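A sketch that exercises each operation in the table on a small list of strings (the same operators also work on strings and tuples). The data and variable names are illustrative.

```python
s = ['atg', 'gaa']
t = ['taa']

concatenated = s + t               # ['atg', 'gaa', 'taa']
repeated = t * 3                   # ['taa', 'taa', 'taa']
first = concatenated[0]            # indexing
middle = concatenated[1:2]         # a slice is itself a list
present = 'taa' in concatenated    # membership
absent = 'ccc' not in concatenated # absence
length = len(concatenated)
smallest = min(concatenated)       # strings compare alphabetically
largest = max(concatenated)

print(concatenated, repeated, first, middle, present, absent,
      length, smallest, largest)
```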

12 Exercises
Design and implement Python functions to satisfy the following contracts:
Return the list of codons in a DNA sequence for a given frame
Return the list of restriction sites for an enzyme in a DNA sequence
Return the list of restriction sites for a list of enzymes in a DNA sequence
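One possible reading of the first two contracts is sketched below; it is a starting point, not the intended solution. The function names, the 0/1/2 frame convention, the choice to drop incomplete trailing codons, and reporting sites as 0-based start positions are all assumptions. The test sequence is invented; gaattc is the EcoRI recognition sequence.

```python
def codons(seq, frame=0):
    """Return the complete codons of seq, reading from offset frame (0, 1 or 2)."""
    # Stopping at len(seq) - 2 skips any incomplete codon at the end.
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def restriction_sites(seq, site):
    """Return the 0-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping occurrences are also found.
        start = seq.find(site, start + 1)
    return positions


print(codons('atggaat', 0))
print(codons('atggaat', 1))
print(restriction_sites('ccgaattcgggaattc', 'gaattc'))
```

The third contract could then be met by calling restriction_sites once per enzyme and collecting the results, for example in a dictionary keyed by enzyme name.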

13 Dictionaries Dictionaries are mutable unordered collections which may contain objects of different sorts. The objects can be accessed using a key.

14 A Codon -> AminoAcid Dictionary
>>> code = { 'ttt': 'F', 'tct': 'S', 'tat': 'Y', 'tgt': 'C',
...          'ttc': 'F', 'tcc': 'S', 'tac': 'Y', 'tgc': 'C',
...          'tta': 'L', 'tca': 'S', 'taa': '*', 'tga': '*',
...          'ttg': 'L', 'tcg': 'S', 'tag': '*', 'tgg': 'W',
...          'ctt': 'L', 'cct': 'P', 'cat': 'H', 'cgt': 'R',
...          'ctc': 'L', 'ccc': 'P', 'cac': 'H', 'cgc': 'R',
...          'cta': 'L', 'cca': 'P', 'caa': 'Q', 'cga': 'R',
...          'ctg': 'L', 'ccg': 'P', 'cag': 'Q', 'cgg': 'R',
...          'att': 'I', 'act': 'T', 'aat': 'N', 'agt': 'S',
...          'atc': 'I', 'acc': 'T', 'aac': 'N', 'agc': 'S',
...          'ata': 'I', 'aca': 'T', 'aaa': 'K', 'aga': 'R',
...          'atg': 'M', 'acg': 'T', 'aag': 'K', 'agg': 'R',
...          'gtt': 'V', 'gct': 'A', 'gat': 'D', 'ggt': 'G',
...          'gtc': 'V', 'gcc': 'A', 'gac': 'D', 'ggc': 'G',
...          'gta': 'V', 'gca': 'A', 'gaa': 'E', 'gga': 'G',
...          'gtg': 'V', 'gcg': 'A', 'gag': 'E', 'ggg': 'G' }
>>>

15 A DNA Sequence
>>> cds = ("atgagtgaacgtctgagcattaccccgctggggccgtatatcggcgcacaaa"
...        "tttcgggtgccgacctgacgcgcccgttaagcgataatcagtttgaacagctttaccatgcggtg"
...        "ctgcgccatcaggtggtgtttctacgcgatcaagctattacgccgcagcagcaacgcgcgctggc"
...        "ccagcgttttggcgaattgcatattcaccctgtttacccgcatgccgaaggggttgacgagatca"
...        "tcgtgctggatacccataacgataatccgccagataacgacaactggcataccgatgtgacattt"
...        "attgaaacgccacccgcaggggcgattctggcagctaaagagttaccttcgaccggcggtgatac"
...        "gctctggaccagcggtattgcggcctatgaggcgctctctgttcccttccgccagctgctgagtg"
...        "ggctgcgtgcggagcatgatttccgtaaatcgttcccggaatacaaataccgcaaaaccgaggag"
...        "gaacatcaacgctggcgcgaggcggtcgcgaaaaacccgccgttgctacatccggtggtgcgaac"
...        "gcatccggtgagcggtaaacaggcgctgtttgtgaatgaaggctttactacgcgaattgttgatg"
...        "tgagcgagaaagagagcgaagccttgttaagttttttgtttgcccatatcaccaaaccggagttt"
...        "caggtgcgctggcgctggcaaccaaatgatattgcgatttgggataaccgcgtgacccagcacta"
...        "tgccaatgccgattacctgccacagcgacggataatgcatcgggcgacgatccttggggataaac"
...        "cgttttatcgggcggggtaa")
>>>

16 CDS Sequence -> Protein Sequence
>>> def translate(cds, code):
...     prot = ""
...     for i in range(0,len(cds),3):
...         codon = cds[i:i+3]
...         prot = prot + code[codon]
...     return prot
...
>>> translate(cds, code)
'MSERLSITPLGPYIGAQ*'
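A self-contained sketch of the same translation idea, assuming Python 3. Three things here are not in the slides and are assumptions made for brevity: the codon table is built from the standard genetic code written as a 64-letter string in t, c, a, g order rather than typed out; the loop stops early enough to ignore an incomplete trailing codon; and the input is only the first 17 codons of the slide's sequence followed by a taa stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order
# (ttt, ttc, tta, ttg, tct, ...). '*' marks a stop codon.
bases = 'tcag'
amino_acids = ('FFLLSSSSYY**CC*W' 'LLLLPPPPHHQQRRRR'
               'IIIMTTTTNNKKSSRR' 'VVVVAAAADDEEGGGG')
code = {''.join(codon): aa
        for codon, aa in zip(product(bases, repeat=3), amino_acids)}


def translate(cds, code):
    """Translate cds codon by codon, ignoring an incomplete trailing codon."""
    prot = ''
    for i in range(0, len(cds) - 2, 3):
        prot += code[cds[i:i + 3]]
    return prot


# Shortened input: first 17 codons of the slide's sequence plus a stop codon.
short_cds = 'atgagtgaacgtctgagcattaccccgctggggccgtatatcggcgcacaa' + 'taa'
print(translate(short_cds, code))  # MSERLSITPLGPYIGAQ*
```

A lookup with code[codon] raises KeyError for anything that is not a lowercase a/c/g/t triplet, so input containing whitespace or uppercase letters needs cleaning first.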

17 Dictionary Methods and Operations
Table 9.3. Dictionary methods and operations
Method or Operation         Action
d[key]                      get the value of the entry with key key in d
d[key] = val                set the value of the entry with key key to val
del d[key]                  delete the entry with key key
d.clear()                   removes all entries
len(d)                      number of items
d.copy()                    makes a shallow copy
d.has_key(key)              returns 1 if key exists, 0 otherwise
d.keys()                    gives a list of all keys
d.values()                  gives a list of all values
d.items()                   returns a list of all items as tuples (key, value)
d.update(new)               adds all entries of dictionary new to d
d.get(key [, otherwise])    returns the value of the entry with key key if it exists; otherwise returns otherwise
d.setdefault(key [, val])   same as d.get(key), but if key does not exist, sets d[key] to val
d.popitem()                 removes an arbitrary item and returns it as a tuple (key, value)
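A few of these methods in action, counting bases in a short made-up sequence. The sketch assumes Python 3, where d.has_key(key) no longer exists and the in operator is used instead, and where keys(), values() and items() return views rather than lists.

```python
counts = {}
for base in 'atgagtgaa':
    # get() supplies 0 the first time a base is seen.
    counts[base] = counts.get(base, 0) + 1

has_t = 't' in counts                    # Python 3 replacement for has_key
c_count = counts.setdefault('c', 0)      # adds 'c' with value 0 if missing
sorted_items = sorted(counts.items())    # list of (key, value) tuples

del counts['c']                          # remove the entry again
remaining = len(counts)

print(has_t, c_count, sorted_items, remaining)
```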