Advanced Python Data Structures

Advanced Python Data Structures
BCHB524 Lecture 7
BCHB524 - Edwards

Outline
- Review of list data-structures
- Advanced data-structures
    - Dictionaries, Sets, Files
    - Reading, parsing files (codon tables)
- Exercises

Data-structures: Lists
- Compound data-structure: many objects, in order, numbered from 0.
- [] indicates a list.
- Item access and iteration: same as for strings.
    - "l[i]" for item i
    - "for item in l" for each item of the list
- List modification: items can be changed, added, or deleted.
- range() returns a list (in Python 2, which this lecture uses).
- String ↔ List conversion.
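
The list operations reviewed above can be sketched as follows. This example is not from the lecture: the variable names and data are made up for illustration, and it assumes Python 3, where print is a function and range() must be wrapped in list() to produce an actual list.

```python
# Hypothetical example data (not from the lecture).
bases = ['A', 'C', 'G']

# Item access: positions are numbered from 0.
first = bases[0]

# Modification: change, add, and delete items.
bases[0] = 'a'
bases.append('T')
del bases[1]            # removes 'C'

# Iteration works the same way as for strings.
lowered = []
for item in bases:
    lowered.append(item.lower())

# In Python 3, range() is lazy; list() turns it into a list.
indices = list(range(len(bases)))

# String <-> list conversion.
chars = list("GATTACA")
joined = ''.join(chars)

print(first, bases, lowered, indices, joined)
```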

Python Data-structures: Dictionaries
- Compound data-structure that stores any number of arbitrary key-value pairs.
    - Keys and/or values can be of different types.
    - Can be empty.
- Values can be accessed by key.
- Keys, values, or pairs can be accessed by iteration.
- Values can be changed.
- Key-value pairs can be added.
- Key-value pairs can be deleted.

Dictionaries: Syntax and item access

    # Simple dictionary
    d = {'a': 1, 'b': 2, 'acdef': 3}
    print d

    # Access value using its key
    print d['a']

    # Change value associated with a key
    d['acdef'] = 5
    print d

    # Add value by assigning to a dictionary key
    d['newkey'] = 10
    print d

Dictionaries: Iteration

    # Initialize
    d = {'a': 1, 'b': 2, 'acdef': 5, 'newkey': 10}

    # keys from d
    print d.keys()

    # values from d
    print d.values()

    # key-value pairs from d
    print d.items()

    # Iterate through the keys of d
    for k in d.keys():
        print k,
    print

    # Iterate through the key-value pairs of d
    for k,v in d.items():
        print k,"=",v,
    print

Dictionaries: Different from lists?

    # Initialize
    d = {}

    # Add some values, integer keys!
    d[0] = 1
    d[1] = 2
    d[10] = 1000

    # See how the dictionary looks
    print d

    # Test whether a key is in the dictionary
    print "Is key 15 in d?",d.has_key(15)

    # Access value with key 15 with default -1
    print "Value for key 15, or -1:",d.get(15,-1)

    # Access value with key 15 - error!
    print "Value for key 15:",d[15]
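
The slide above uses has_key(), which exists only in Python 2. As a hedged sketch, assuming Python 3, the same checks can be written with the "in" operator. The sketch also shows deleting a key-value pair, which the dictionary overview mentions but the slides do not demonstrate. The key 500 and its value are made up for illustration.

```python
# Same integer-keyed dictionary as in the slide.
d = {0: 1, 1: 2, 10: 1000}

# Python 3 has no has_key(); the "in" operator tests for a key.
has_15 = 15 in d

# get() returns a default value instead of raising KeyError.
value_or_default = d.get(15, -1)

# Plain indexing with a missing key raises KeyError.
try:
    d[15]
    missing = False
except KeyError:
    missing = True

# Delete a key-value pair.
del d[10]

# Unlike list positions, keys need not be consecutive integers.
d[500] = 'five hundred'

print(has_15, value_or_default, missing, sorted(d.keys()))
```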

Python Data-structures: Sets
- Compound data-structure that stores any number of arbitrary distinct data-items.
    - Data-items can be of different types.
    - Can be empty.
- Items can be accessed by iteration only.
- Items can be tested for membership.
- Items can be added.
- Items can be deleted.

Sets: Add and Test Elements

    # Make an empty set
    s = set()
    print s

    # Add an element, and then a list of elements
    s.add('a')
    s.update(['b','c','d'])
    print s

    # Test for membership
    print "e is in s",('e' in s)
    print "e is not in s",('e' not in s)
    print "c is in s",('c' in s)
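
The set overview also says that items are distinct and can be deleted, neither of which the slide code shows. A minimal sketch, assuming Python 3 and using made-up data, might look like this:

```python
# Build a set from a list of elements.
s = set(['a', 'b', 'c', 'd'])

# Adding an element that is already present leaves the set unchanged.
s.add('a')
size = len(s)

# remove() deletes an element and raises KeyError if it is absent.
s.remove('b')

# discard() deletes an element if present and is silent otherwise.
s.discard('z')

# Sets have no order, so sort the items for a predictable listing.
remaining = sorted(s)

# A set built from a string keeps each distinct character once.
distinct = set("GATTACA")

print(size, remaining, sorted(distinct))
```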

Python Data-structures: Files
- Read strings from a file, or write strings to a file.
- Get access to lines as strings by iteration,
- or get the entire contents of the file as a string.
- Write by printing strings to the file.
- MUST open and close files.
    - Need to indicate whether we want to read or write.
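
The slides that follow only read files. As an illustrative sketch of both directions, assuming Python 3, the example below writes to a file and reads it back. The file name and contents are hypothetical, and a temporary directory is used so that nothing in the working directory is touched.

```python
import os
import tempfile

# Hypothetical file in a temporary directory (not from the lecture).
path = os.path.join(tempfile.mkdtemp(), 'example.nuc')

# Open for writing ('w'), write two lines, and close.
f = open(path, 'w')
f.write('GATTACA\n')
print('ACGT', file=f)   # Python 3 spelling of Python 2's "print >>f, 'ACGT'"
f.close()

# Open for reading (the default mode) and iterate line by line.
f = open(path)
lines = []
for line in f:
    lines.append(line.strip())
f.close()

# Read the entire contents as a single string.
f = open(path)
contents = f.read()
f.close()

print(lines)
print(repr(contents))
```

An alternative to calling close() explicitly is the "with open(path) as f:" form, which closes the file automatically when the block ends.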

Files: Reading

    # Open a file, store "handle" in f
    f = open('anthrax_sasp.nuc')

    # MAGIC!
    print ''.join(f.read().split())

    # Close the file.
    f.close()

Slowly, now...

    f = open('anthrax_sasp.nuc')

    # Store the entire file's contents in s (as string)
    s = f.read()
    print s

    # Split s at whitespace
    sl = s.split()
    print sl

    # Join split s with nothing in between
    jl = ''.join(sl)
    print jl

    # Close the file
    f.close()

Files: Reading

    # Open a file
    f = open('anthrax_sasp.nuc')

    # Iterate line-by-line
    for line in f:
        print line

    # Close the file
    f.close()

    # Open a file
    f = open('anthrax_sasp.nuc')

    # Iterate line-by-line, and accumulate the sequence
    seq = ""
    for line in f:
        seq += line.strip()
    print "The sequence is",seq

    # Close the file
    f.close()

DNA Translation
- First, read a codon table from a file.
    - Codon table from NCBI's on-line taxonomy resource.
    - Read the file line by line, using the first word of each line as the key under which to store the third word.

    AAs    = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
    Starts = ---M---------------M---------------M----------------------------
    Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
    Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
    Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

DNA Translation

    # Read the codon table: first word is the key, third word is the value
    f = open('standard.code')
    data = {}
    for l in f:
        sl = l.split()
        key = sl[0]
        value = sl[2]
        data[key] = value
    # Close the file only after the loop has read every line
    f.close()

    b1 = data['Base1']
    b2 = data['Base2']
    b3 = data['Base3']
    aa = data['AAs']
    st = data['Starts']

    # Map each codon to its amino-acid, and to whether it is a start codon
    codons = {}
    init = {}
    n = len(aa)
    for i in range(n):
        codon = b1[i] + b2[i] + b3[i]
        codons[codon] = aa[i]
        init[codon] = (st[i] == 'M')

DNA Translation

    # Uses the codons dictionary built from standard.code on the previous slide
    f = open('anthrax_sasp.nuc')
    seq = ''.join(f.read().split())
    f.close()

    # Translate the sequence one codon (three bases) at a time
    seqlen = len(seq)
    aaseq = []
    for i in range(0,seqlen,3):
        codon = seq[i:i+3]
        aa = codons[codon]
        aaseq.append(aa)
    print ''.join(aaseq)
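
The two translation slides depend on the files standard.code and anthrax_sasp.nuc. As a self-contained sketch of the same idea, assuming Python 3, the example below embeds the codon table text from the lecture in a string and translates a short made-up sequence (not the anthrax SASP gene). Stopping before an incomplete trailing codon is an addition here, not something the lecture code does.

```python
# Codon table text as shown in the lecture (NCBI standard code).
table_text = """\
AAs = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
Starts = ---M---------------M---------------M----------------------------
Base1 = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
Base2 = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
Base3 = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
"""

# First word of each line is the key, third word is the value.
data = {}
for line in table_text.splitlines():
    words = line.split()
    data[words[0]] = words[2]

# Build codon -> amino-acid and codon -> is-start-codon dictionaries.
codons = {}
init = {}
for i in range(len(data['AAs'])):
    codon = data['Base1'][i] + data['Base2'][i] + data['Base3'][i]
    codons[codon] = data['AAs'][i]
    init[codon] = (data['Starts'][i] == 'M')

# Hypothetical sequence with two leftover bases at the end.
seq = 'ATGGCCTAAGG'

# Translate whole codons only; an incomplete codon would not be a key.
aaseq = []
for i in range(0, len(seq) - len(seq) % 3, 3):
    aaseq.append(codons[seq[i:i+3]])
protein = ''.join(aaseq)

print(protein, init[seq[:3]])
```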

Exercise 1
- Using just the concepts introduced so far, find as many ways as possible to code DNA reverse complement (at least 3!).
    - You may use any built-in function or string or list method.
    - You may use only basic data-types and lists and dictionaries.
- Compare and critique each technique for robustness, speed, and correctness.

Exercise 2
- Write a program that takes a codon table file (such as standard.code from the lecture) and a file containing a nucleotide sequence (anthrax_sasp.nuc) as command-line arguments, and outputs the amino-acid sequence.
- Modify your program to indicate whether or not the initial codon is consistent with the codon table's start codons.
- Use NCBI's taxonomy resource to look up and download the correct codon table for the anthrax bacterium. Re-run your program using the correct codon table. Is the initial codon of the anthrax SASP gene a valid translation start site?

Homework 4
- Due Monday, October 2nd.
- Submit using Canvas.
- Use only the techniques introduced so far.
- Make sure you can run the programs demonstrated in lecture(s).
- Exercises 1, 2 from Lecture 7.