CS190/295 Programming in Python for Life Sciences: Lecture 7 Instructor: Xiaohui Xie University of California, Irvine
Classes and Object Oriented Programming (OOP)
Introduction We've seen that Python is useful for –Simple scripts –Module design This lecture discusses Object Oriented Programming –Better program design –Better modularization
What is an object? An object is an active data type that knows stuff and can do stuff. More precisely, an object consists of: 1. A collection of related information. 2. A set of operations to manipulate that information.
Objects The information is stored inside the object in instance variables. The operations, called methods, are functions that “live” inside the object. Collectively, the instance variables and methods are called the attributes of an object.
Example: Multi-Sided Dice A normal die is a cube with six faces, each with a number from one to six. Some games use special dice with a different number of sides. Let’s design a generic class MSDie to model multi-sided dice.
Example: Multi-Sided Dice Each MSDie object will know two things: –How many sides it has. –Its current value. When a new MSDie is created, we specify n, the number of sides it will have.
Example: Multi-Sided Dice We have three methods that we can use to operate on the die: –roll – set the die to a random value between 1 and n, inclusive. –setValue – set the die to a specific value (i.e. cheat) –getValue – see what the current value is.
Example: Multi-Sided Dice
>>> die1 = MSDie(6)
>>> die1.getValue()
1
>>> die1.roll()
>>> die1.getValue()
5
>>> die2 = MSDie(13)
>>> die2.getValue()
1
>>> die2.roll()
>>> die2.getValue()
9
>>> die2.setValue(8)
>>> die2.getValue()
8
Example: Multi-Sided Dice Using our object-oriented vocabulary, we create a die by invoking the MSDie constructor and providing the number of sides as a parameter. Our die objects will keep track of this number internally as an instance variable. Another instance variable is used to keep the current value of the die. We initially set the value of the die to be 1 because that value is valid for any die. That value can be changed by the roll and setValue methods, and returned by the getValue method.
Example: Multi-Sided Dice
# msdie.py
# Class definition for an n-sided die.
import random

class MSDie:
    def __init__(self, sides):
        self.sides = sides   # number of faces on the die
        self.value = 1       # 1 is a valid value for any die

    def roll(self):
        # randrange excludes the upper bound, hence sides+1
        self.value = random.randrange(1, self.sides+1)

    def getValue(self):
        return self.value

    def setValue(self, value):
        self.value = value
Example: Multi-Sided Dice Class definitions have the form class <class-name>: <method-definitions> Methods look a lot like functions! Placing the function inside a class makes it a method of the class, rather than a stand-alone function. The first parameter of a method is, by convention, named self; it is a reference to the object on which the method is acting.
Example: Multi-Sided Dice Suppose we have a main function that executes die1.setValue(8). Just as in function calls, Python executes the following four-step sequence: –main suspends at the point of the method application. Python locates the appropriate method definition inside the class of the object to which the method is being applied. Here, control is transferred to the setValue method in the MSDie class, since die1 is an instance of MSDie.
Example: Multi-Sided Dice –The formal parameters of the method get assigned the values supplied by the actual parameters of the call. In the case of a method call, the first formal parameter refers to the object: self = die1, value = 8 –The body of the method is executed.
Example: Multi-Sided Dice –Control returns to the point just after where the method was called. In this case, it is immediately following die1.setValue(8). Here the method is called with one parameter, but the method definition itself has two: the self parameter as well as the value parameter.
Example: Multi-Sided Dice The self parameter is a bookkeeping detail. We can refer to the first formal parameter as the self parameter and other parameters as normal parameters. So, we could say setValue uses one normal parameter.
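The role of self can be made concrete with a small sketch. It repeats a trimmed MSDie so that it runs on its own, and is written in Python 3 syntax (print is a function). The variable names via_method and via_class are only illustrative.

```python
class MSDie:
    """Trimmed copy of the MSDie class from msdie.py (roll omitted)."""

    def __init__(self, sides):
        self.sides = sides
        self.value = 1

    def getValue(self):
        return self.value

    def setValue(self, value):
        self.value = value


die1 = MSDie(6)

# Usual method call: one normal parameter, Python supplies die1 as self.
die1.setValue(4)
via_method = die1.getValue()

# Equivalent explicit form: fetch the function from the class
# and pass the object as the first (self) argument ourselves.
MSDie.setValue(die1, 5)
via_class = die1.getValue()

print(via_method, via_class)
```

Both calls run the same function body; the dotted form simply fills in the first formal parameter for us.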
Object Oriented Design (OOD) Object Oriented Design focuses on –Encapsulation: dividing the code into a public interface, and a private implementation of that interface –Polymorphism: the ability to overload standard operators so that they have appropriate behavior based on their context –Inheritance: the ability to create subclasses that contain specializations of their parents
Example: Atom class
class atom:
    def __init__(self, symbol, x, y, z):
        self.symbol = symbol
        self.position = (x, y, z)

    def getsym(self):      # an instance method (accessor)
        return self.symbol

    def __repr__(self):    # controls how the object is printed
        return '%s %10.4f %10.4f %10.4f' % (self.getsym(),
            self.position[0], self.position[1], self.position[2])

>>> at = atom('C', 0.0, 1.0, 2.0)
>>> print(at)
C     0.0000     1.0000     2.0000
>>> at.getsym()
'C'
Atom class Defined our own constructor (__init__) in place of the default one Defined instance variables (symbol, position) that are persistent and local to each atom object Good way to manage shared data: –instead of passing long lists of arguments, encapsulate some of this data into an object, and pass the object. –much cleaner programs result Defined __repr__ to control how an atom is printed We now want to use the atom class to build molecules...
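As a rough illustration of passing an object instead of a long argument list, the sketch below defines a hypothetical helper, distance, that is not part of the lecture code. It takes two atom objects rather than six separate coordinates. Python 3 syntax; the coordinates are arbitrary.

```python
import math


class atom:
    """Minimal copy of the atom class, enough for this sketch."""

    def __init__(self, symbol, x, y, z):
        self.symbol = symbol
        self.position = (x, y, z)


def distance(a, b):
    """Hypothetical helper: Euclidean distance between two atoms."""
    return math.sqrt(sum((p - q) ** 2
                         for p, q in zip(a.position, b.position)))


first = atom('O', 0.0, 0.0, 0.0)
second = atom('H', 0.0, 3.0, 4.0)   # arbitrary coordinates, chosen for easy arithmetic

d = distance(first, second)
print(d)
```

Without the class, the same call would need six coordinate arguments, and every caller would have to keep them in the right order.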
Molecule Class
class molecule:
    def __init__(self, name='Generic'):
        self.name = name
        self.atomlist = []

    def addatom(self, atom):
        self.atomlist.append(atom)

    def __repr__(self):
        # build the text line by line; repr(atom) reuses atom.__repr__
        text = 'This is a molecule named %s\n' % self.name
        text = text + 'It has %d atoms\n' % len(self.atomlist)
        for atom in self.atomlist:
            text = text + repr(atom) + '\n'
        return text
Using Molecule Class
>>> mol = molecule('Water')
>>> at = atom('O', 0., 0., 0.)
>>> mol.addatom(at)
>>> mol.addatom(atom('H', 0., 0., 1.))
>>> mol.addatom(atom('H', 0., 1., 0.))
>>> print(mol)
This is a molecule named Water
It has 3 atoms
O     0.0000     0.0000     0.0000
H     0.0000     0.0000     1.0000
H     0.0000     1.0000     0.0000

Note that the molecule's __repr__ calls the atom's __repr__
–Code reuse: only have to type the code that prints an atom once; this means that if you change the atom specification, you only have one place to update.
Inheritance
class organic_molecule(molecule):
    def countCarbon(self):
        n = 0
        for atom in self.atomlist:
            if atom.getsym() == 'C':
                n = n + 1
        return n    # hand the count back to the caller

__init__, __repr__, and addatom are inherited from the parent class (molecule)
Added a new function countCarbon() to count the number of carbons
Another example of code reuse
–Basic functions don't have to be retyped, just inherited
–Less to rewrite when specifications change
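A short usage sketch of the subclass follows, in Python 3 syntax. The classes are trimmed copies so the block runs alone, and the molecule name and coordinates are placeholders rather than real geometry.

```python
class atom:
    def __init__(self, symbol, x, y, z):
        self.symbol = symbol
        self.position = (x, y, z)

    def getsym(self):
        return self.symbol


class molecule:
    def __init__(self, name='Generic'):
        self.name = name
        self.atomlist = []

    def addatom(self, new_atom):
        self.atomlist.append(new_atom)


class organic_molecule(molecule):
    def countCarbon(self):
        n = 0
        for a in self.atomlist:
            if a.getsym() == 'C':
                n = n + 1
        return n


# __init__ and addatom come from molecule; only countCarbon is new.
mol = organic_molecule('Example')
mol.addatom(atom('C', 0.0, 0.0, 0.0))   # placeholder coordinates
mol.addatom(atom('C', 1.0, 0.0, 0.0))
mol.addatom(atom('H', 0.0, 1.0, 0.0))

carbons = mol.countCarbon()
print(mol.name, carbons, isinstance(mol, molecule))
```

An organic_molecule is still a molecule, so any code written for the parent class accepts it unchanged.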
Public and Private Data Currently everything in atom/molecule is public, thus we could do something really stupid like >>> at = atom('C',0.,0.,0.) >>> at.position = 'Grape Jelly' that would break any function that used at.position We therefore need to protect at.position and provide accessors to this data –Encapsulation or Data Hiding –accessors are "getters" and "setters" Encapsulation is particularly important when other people use your class
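Public and Private Data, Cont. In Python any attribute with two leading underscores (and no trailing ones) is treated as private: __a, __my_variable. Python enforces this by name mangling: inside class C, __a is stored as _C__a, so code outside the class cannot reach it by its plain name. Anything with one leading underscore is semi-private, and you should feel guilty accessing this data directly: _b –Sometimes useful as an intermediate step to making data private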
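The effect of the two-underscore rule can be checked directly. The sketch below uses a made-up class name, Example, and made-up attribute names, in Python 3 syntax. It shows that the plain name fails outside the class while the mangled name still works, so this is a convention with a guard rail rather than strict access control.

```python
class Example:
    """Hypothetical class used only to show the naming rules."""

    def __init__(self):
        self.public = 1
        self._semi = 2        # semi-private by convention only
        self.__private = 3    # stored as _Example__private

    def get_private(self):
        # Inside the class body the plain name works.
        return self.__private


obj = Example()

# Outside the class the plain name is not found.
try:
    obj.__private
    blocked = False
except AttributeError:
    blocked = True

# The mangled name is still reachable if you insist.
mangled = obj._Example__private

print(blocked, mangled, obj.get_private(), obj._semi)
```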
Encapsulated Atom
class atom:
    def __init__(self, symbol, x, y, z):
        self.symbol = symbol
        self.__position = (x, y, z)    # position is private

    def getposition(self):
        return self.__position

    def setposition(self, x, y, z):
        self.__position = (x, y, z)    # typecheck first!

    def translate(self, x, y, z):
        x0, y0, z0 = self.__position
        self.__position = (x0+x, y0+y, z0+z)
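The "typecheck first!" reminder could be filled in along the following lines. This is one possible check, not the lecture's own code: it assumes that coordinates should be plain ints or floats and raises TypeError otherwise. Python 3 syntax.

```python
class atom:
    def __init__(self, symbol, x, y, z):
        self.symbol = symbol
        self.setposition(x, y, z)    # reuse the checked setter

    def getposition(self):
        return self.__position

    def setposition(self, x, y, z):
        # Assumed rule: every coordinate must be an int or float.
        # bool is excluded explicitly because it is a subclass of int.
        for c in (x, y, z):
            if isinstance(c, bool) or not isinstance(c, (int, float)):
                raise TypeError('coordinate must be a number, got %r' % (c,))
        self.__position = (float(x), float(y), float(z))


at = atom('C', 0, 0, 0)

try:
    at.setposition('Grape Jelly', 0, 0)
    rejected = False
except TypeError:
    rejected = True

print(rejected, at.getposition())
```

Because the bad value is rejected before the assignment, the atom keeps its last valid position.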
Why Encapsulate? By defining a specific interface you can keep other modules from doing anything incorrect to your data By limiting the functions you are going to support, you leave yourself free to change the internal data without messing up your users –Write to the Interface, not the Implementation –Makes code more modular, since you can change large parts of your classes without affecting other parts of the program, so long as they only use your public functions
Classes that look like arrays
Overload __getitem__(self,index) to make a class act like an array
class molecule:
    # __init__, addatom, and __repr__ as before
    def __getitem__(self, index):
        return self.atomlist[index]

>>> mol = molecule('Water')    # defined as before
>>> for atom in mol:           # use like a list!
...     print(atom)
>>> mol[0].translate(1.,1.,1.)
Previous lectures defined molecules to be arrays of atoms. This allows us to use the same routines, but using the molecule class instead of the old arrays. An example of focusing on the interface!
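The sketch below shows why defining only __getitem__ is enough for a for loop: Python calls it with 0, 1, 2, ... until the underlying list raises IndexError. To keep the block short, plain strings stand in for atom objects, which is a simplification. Python 3 syntax.

```python
class molecule:
    def __init__(self, name='Generic'):
        self.name = name
        self.atomlist = []

    def addatom(self, new_atom):
        self.atomlist.append(new_atom)

    def __getitem__(self, index):
        # Delegating to the list also passes IndexError through,
        # which is what ends a for loop over the molecule.
        return self.atomlist[index]


mol = molecule('Water')
for symbol in ('O', 'H', 'H'):
    mol.addatom(symbol)        # strings stand in for atom objects

symbols = [a for a in mol]     # iteration uses __getitem__
first = mol[0]                 # indexing uses __getitem__
last = mol[-1]                 # negative indices work via the list

print(symbols, first, last)
```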
Classes that look like functions
Overload __call__(self,arg) to make a class behave like a function
import math

class gaussian:
    def __init__(self, exponent):
        self.exponent = exponent

    def __call__(self, arg):
        # evaluates exp(-exponent * arg**2)
        return math.exp(-self.exponent*arg*arg)

>>> func = gaussian(1.)
>>> func(3.)
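One reason to prefer a callable object over a plain function is that each instance carries its own parameter. The sketch below creates two gaussians with different exponents and passes one to map like any other function. The names narrow, wide, and points are illustrative only. Python 3 syntax.

```python
import math


class gaussian:
    def __init__(self, exponent):
        self.exponent = exponent

    def __call__(self, arg):
        return math.exp(-self.exponent * arg * arg)


narrow = gaussian(1.0)
wide = gaussian(0.1)     # same code, different stored exponent

points = [0.0, 1.0, 2.0]
narrow_values = list(map(narrow, points))   # used like a function
wide_values = [wide(x) for x in points]

print(callable(narrow), narrow_values, wide_values)
```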
Other things to overload __setitem__(self,index,value) –Another function for making a class look like an array/dictionary –a[index] = value __add__(self,other) –Overload the "+" operator –molecule = molecule + atom __mul__(self,number) –Overload the "*" operator –zeros = [0]*3
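A minimal sketch of these three hooks on a molecule-like class follows. The optional atoms argument to the constructor is an assumption added here so that + and * can return a new object. Strings again stand in for atom objects. Python 3 syntax.

```python
class molecule:
    def __init__(self, name='Generic', atoms=None):
        # 'atoms' is a hypothetical extra argument for this sketch.
        self.name = name
        self.atomlist = list(atoms) if atoms is not None else []

    def __getitem__(self, index):
        return self.atomlist[index]

    def __setitem__(self, index, value):
        # Enables: mol[index] = value
        self.atomlist[index] = value

    def __add__(self, other):
        # Enables: mol + atom, returning a new molecule.
        return molecule(self.name, self.atomlist + [other])

    def __mul__(self, number):
        # Enables: mol * n, repeating the atom list n times.
        return molecule(self.name, self.atomlist * number)


mol = molecule('Water', ['O', 'H'])
mol = mol + 'H'          # __add__
mol[0] = 'S'             # __setitem__
doubled = mol * 2        # __mul__

print(mol.atomlist, doubled.atomlist)
```

Returning a new molecule from __add__ and __mul__ mirrors how lists behave, so the operands are left unchanged.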
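Other things to overload, cont. __len__(self) –Overload the len() command –natoms = len(mol) __getslice__(self,low,high) –Overload slicing –glycine = protein[0:9] –Python 2 only; in Python 3 a slice object is passed to __getitem__ instead __cmp__(self,other): –On comparisons (<, ==, etc.) returns -1, 0, or 1, like C's strcmp –Python 2 only; in Python 3 define __eq__, __lt__, and the other rich comparison methods instead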
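For readers on Python 3, the same three ideas can be sketched with __len__, a slice-aware __getitem__, and rich comparisons. Ordering molecules by atom count is an arbitrary choice made for this example, as is the optional atoms constructor argument. functools.total_ordering fills in the remaining comparison operators from __eq__ and __lt__.

```python
from functools import total_ordering


@total_ordering
class molecule:
    def __init__(self, name='Generic', atoms=None):
        # 'atoms' is a hypothetical extra argument for this sketch.
        self.name = name
        self.atomlist = list(atoms) if atoms is not None else []

    def __len__(self):
        return len(self.atomlist)

    def __getitem__(self, index):
        # In Python 3, mol[0:9] arrives here as a slice object.
        if isinstance(index, slice):
            return molecule(self.name, self.atomlist[index])
        return self.atomlist[index]

    # Assumed ordering: compare molecules by number of atoms.
    def __eq__(self, other):
        if not isinstance(other, molecule):
            return NotImplemented
        return len(self) == len(other)

    def __lt__(self, other):
        if not isinstance(other, molecule):
            return NotImplemented
        return len(self) < len(other)


protein = molecule('Protein', ['N', 'C', 'C', 'O'] * 3)   # placeholder atoms
glycine = protein[0:9]

natoms = len(protein)
print(natoms, len(glycine), glycine < protein, protein >= glycine)
```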
Acknowledgement Some of the slides are from Richard Muller and John Zelle