Python Comprehension and Generators
Peter Wad Sackett

List comprehension 1
Creating a new list with another list as basis:

    primes = [2, 3, 5, 7]
    doubleprimes = [2*x for x in primes]

The same as:

    doubleprimes = list()
    for x in primes:
        doubleprimes.append(2*x)

Any expression (the part before the for loop) can be used.
Example: A tab-separated line of numbers is read from a file; convert the numbers from strings to floats.

    for line in datafile:
        numbers = [float(no) for no in line.split()]
        # Do something with the list of numbers

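The file example above can be tried without a real file. The sketch below assumes the data is tab-separated numbers, one record per line, and uses io.StringIO as a stand-in for datafile; the sample values are made up for illustration.

```python
import io

# Stand-in for an open file; the contents are made-up sample data.
datafile = io.StringIO("1.5\t2\t3.25\n4\t5.5\t6\n")

rows = []
for line in datafile:
    # split() without arguments splits on any whitespace,
    # so tabs and the trailing newline are both handled.
    numbers = [float(no) for no in line.split()]
    rows.append(numbers)

print(rows)
```

Calling split() with no argument is a convenience here; line.split('\t') would be stricter if fields could contain spaces.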
List comprehension 2
Filtering with comprehension, using if:

    odd = [no for no in range(20) if no % 2 == 1]
    numbers = [1, 3, -5, 7, -9, 2, -3, -1]
    positives = [no for no in numbers if no > 0]

Nested for loops in comprehension.
Example: Creating all combinations in tuples of two numbers, where no number is repeated in a combination.

    combi = [(x, y) for x in range(10) for y in range(10) if x != y]

Flatten a list of lists (matrix) into a simple list:

    matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    flatList = [no for row in matrix for no in row]

A list does not have to form the basis of the comprehension; any iterable will do, like sets or dicts.

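To illustrate the last point, here is a small sketch with made-up data that builds lists from a string, a dict and a set. Iterating a dict gives its keys, and a set has no defined order, which is why the result is sorted.

```python
# Any iterable can feed a comprehension; the data here is made up.
word = "comprehension"
vowels = [ch for ch in word if ch in "aeiou"]   # a string iterates over its characters

lengths = {"ATG": 3, "TGA": 3, "AT": 2}
keys = [key for key in lengths]                 # a dict iterates over its keys

squares = sorted([x * x for x in {3, 1, 2}])    # set order is arbitrary, so sort the result

print(vowels)
print(keys)
print(squares)
```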
Set and dict comprehension
It works like list comprehension, just use {} instead of [].
Create a set of all codons:

    nuc = ['A', 'T', 'C', 'G']
    codons = { x+y+z for x in nuc for y in nuc for z in nuc }

Turning a dict inside out:

    myDict = {'a': 1, 'b': 2, 'c': 3}
    reverseDict = {value:key for key, value in myDict.items()}

Result: {1: 'a', 2: 'b', 3: 'c'}
This only works if the values of the dictionary are hashable (immutable types such as numbers, strings and tuples of these). If two keys share the same value, only one of them survives in the result.

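The restriction on the dict values can be seen in a short sketch. The dictionaries scores and groups are made-up examples, not part of the slides.

```python
# Duplicate values: a later key overwrites an earlier one.
scores = {'a': 1, 'b': 2, 'c': 1}
inverted = {value: key for key, value in scores.items()}
print(inverted)   # 'a' is lost because 'c' has the same value

# Unhashable values (such as lists) cannot be used as dict keys.
groups = {'a': [1, 2]}
try:
    bad = {value: key for key, value in groups.items()}
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```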
Generators
Generators are iterators you define yourself; in a loop they are used much like range.
Generators look like functions, but they keep the state of their variables between calls, and they use yield instead of return. Asking a generator for its next value resumes execution right after the yield statement.
Generators help with possible memory issues, as values are generated on the fly instead of being stored all at once.
Example: range(10) returns the numbers from 0 to 9, both inclusive; myrange(10) returns the numbers from 1 to 10.

    def myrange(number):
        result = 1
        while result <= number:
            yield result
            result += 1

    for i in myrange(10):
        print(i)

More info: http://www.programiz.com/python-programming/generator

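The way a generator keeps its state can be made visible by stepping through it by hand with the built-in next(). The sketch below repeats myrange so that it runs on its own; the generator expression on the last line is an extra illustration of on-the-fly values, using comprehension syntax with () instead of [].

```python
def myrange(number):
    result = 1
    while result <= number:
        yield result
        result += 1

gen = myrange(3)       # Nothing runs yet; this only creates the generator object
first = next(gen)      # Runs the body until the first yield
second = next(gen)     # Resumes after the yield; result is remembered
third = next(gen)
try:
    next(gen)          # The while loop ends, so the generator is exhausted
    finished = False
except StopIteration:
    finished = True

print(first, second, third, finished)

# Generator expression: no list of 1000 numbers is ever built in memory.
total = sum(2 * x for x in myrange(1000))
print(total)
```

A for loop does the same thing behind the scenes: it calls next() repeatedly and stops quietly when StopIteration is raised.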
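Example: Generating a random gene sequence
Lengths are counted in codons, including the start and stop codon.

    import random

    def randomgene(minlength, maxlength):
        yield 'ATG'
        # The start codon and the final stop codon are already counted
        counter = 2
        while counter < maxlength:
            codon = random.choice('ATCG') + random.choice('ATCG') + random.choice('ATCG')
            if codon in ['TGA', 'TAG', 'TAA']:
                # A stop codon ends the gene only if it is long enough,
                # otherwise the codon is discarded
                if counter >= minlength:
                    yield codon
                    return
            else:
                # Ordinary codon: emit it and count it
                yield codon
                counter += 1
        # Maximum length reached without a stop codon, so add one
        yield random.choice(['TGA', 'TAG', 'TAA'])

    # Finally using it
    print(''.join(randomgene(40,50)))
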
Example: Generating a random gene sequence, take 2

    import random

    def randomgene(minlength, maxlength):
        if minlength < 2 or minlength > maxlength:
            raise ValueError('Wrong minlength and/or maxlength')
        yield 'ATG'
        stopcodons = ('TGA', 'TAG', 'TAA')
        # Decide the length up front; subtract the start and stop codon
        countdown = random.randrange(minlength, maxlength+1) - 2
        while countdown > 0:
            codon = random.choice('ATCG') + random.choice('ATCG') + random.choice('ATCG')
            if codon not in stopcodons:
                yield codon
                countdown -= 1
        yield random.choice(stopcodons)

    # Finally using it
    print(''.join(randomgene(40,50)))
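
One consequence of placing the argument check inside a generator: the body does not start running until the first value is requested, so the ValueError appears when iteration begins, not when the generator is called. A minimal sketch with a hypothetical generator checked_count (not part of the slides) that mirrors the validation pattern:

```python
def checked_count(minimum, maximum):
    # Hypothetical helper that mirrors the validation pattern above.
    if minimum > maximum:
        raise ValueError('Wrong minimum and/or maximum')
    for value in range(minimum, maximum + 1):
        yield value

gen = checked_count(5, 1)    # No error yet: the body has not started running
try:
    next(gen)                # The check runs here, on the first request
    raised = False
except ValueError:
    raised = True

print(raised)
print(list(checked_count(1, 3)))
```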