Relational Databases: Object Relational Mappers – SQLObject II
BCHB Lecture 23
11/20/2013 BCHB Edwards

Relational Databases
Store information in a table
Rows represent items
Columns represent items' properties or attributes

Name           Continent      Region                     Surface Area  Population  GNP
Brazil         South America  South America
Indonesia      Asia           Southeast Asia
India          Asia           Southern and Central Asia
China          Asia           Eastern Asia
Pakistan       Asia           Southern and Central Asia
United States  North America

... as Objects
Objects have data members or attributes.
Store objects in a list or iterable.
Abstract away details of underlying RDBMS

    c1 = Country()
    c1.name = 'Brazil'
    c1.continent = 'South America'
    c1.region = 'South America'
    c1.surfaceArea = ...
    c1.population = ...
    c1.gnp = ...
    # initialize c2, ..., c6
    countryTable = [ c1, c2, c3, c4, c5, c6 ]
    for cnty in countryTable:
        if cnty.population > ...:
            print cnty.name, cnty.population
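
A minimal runnable sketch of the rows-as-objects idea, using a plain Python class rather than an ORM. The Country constructor, the larger_than helper, and the population figures are assumptions made for illustration; the figures are placeholders, not real statistics.

```python
class Country(object):
    """One row of the country table, held as a plain Python object."""
    def __init__(self, name, continent, population):
        self.name = name
        self.continent = continent
        self.population = population


# Placeholder populations, chosen only so the filter has something to do.
countryTable = [
    Country('Brazil', 'South America', 3),
    Country('India', 'Asia', 7),
    Country('Pakistan', 'Asia', 2),
]


def larger_than(countries, threshold):
    """Return the countries whose population exceeds threshold."""
    return [c for c in countries if c.population > threshold]


big = larger_than(countryTable, 2)
for cnty in big:
    print("%s %d" % (cnty.name, cnty.population))
```

The loop body never mentions tables or SQL; that is the abstraction an object relational mapper aims to provide over a real database.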

Taxonomy Database, from scratch
Specify the model
  Tables: Taxonomy and Name
Populate basic data-values in the Taxonomy table from "small_nodes.dmp"
Populate the Name table from "small_names.dmp"
  Insert basic data-values
  Insert relationship with Taxonomy table
Fix Taxonomy parent relationship
Fix Taxonomy derived information
Use in a program…

Taxonomy Database: model.py

    from sqlobject import *
    import os.path, sys

    dbfile = 'small_taxa.db3'

    def init(new=False):
        # Magic formatting for database URI
        conn_str = os.path.abspath(dbfile)
        conn_str = 'sqlite:' + conn_str
        # Connect to database
        sqlhub.processConnection = connectionForURI(conn_str)
        if new:
            # Create new tables (remove old ones if they exist)
            Taxonomy.dropTable(ifExists=True)
            Name.dropTable(ifExists=True)
            Taxonomy.createTable()
            Name.createTable()

Taxonomy Database: model.py

    # model.py continued…

    class Taxonomy(SQLObject):
        taxid = IntCol(alternateID=True)
        scientific_name = StringCol()
        rank = StringCol()
        parent = ForeignKey("Taxonomy")

    class Name(SQLObject):
        taxonomy = ForeignKey("Taxonomy")
        name = StringCol()
        name_class = StringCol()
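
Taxonomy Database structure
[Figure: the Taxonomy and Name tables. A foreign key holds the id number of some other row: a Name row's taxonomy field (e.g. taxonomy: 4) points to a Taxonomy row, and a Taxonomy row's parent field (e.g. parent: 2) points to another Taxonomy row.]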
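
To make "a foreign key is the id number of some other row" concrete, here is a sketch using Python's built-in sqlite3 module instead of SQLObject. The table and column names (taxonomy, name, parent_id, taxonomy_id) are an assumption about the schema the model produces, and the row ids 2 and 4 are chosen to mirror the figure.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE taxonomy (
    id INTEGER PRIMARY KEY,
    taxid INTEGER UNIQUE,
    scientific_name TEXT,
    rank TEXT,
    parent_id INTEGER      -- foreign key: id of another taxonomy row
);
CREATE TABLE name (
    id INTEGER PRIMARY KEY,
    taxonomy_id INTEGER,   -- foreign key: id of a taxonomy row
    name TEXT,
    name_class TEXT
);
""")

# Row 2 is the genus, row 4 is the species whose parent is row 2.
conn.executemany(
    "INSERT INTO taxonomy (id, taxid, scientific_name, rank, parent_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [(2, 9605, 'Homo', 'genus', None),
     (4, 9606, 'Homo sapiens', 'species', 2)])
# This name row points at taxonomy row 4.
conn.execute(
    "INSERT INTO name (taxonomy_id, name, name_class) VALUES (?, ?, ?)",
    (4, 'human', 'genbank common name'))

# name -> taxonomy: follow taxonomy_id to the row with that id.
taxon = conn.execute(
    "SELECT t.scientific_name FROM name n "
    "JOIN taxonomy t ON n.taxonomy_id = t.id WHERE n.name = ?",
    ('human',)).fetchone()

# taxonomy -> parent: follow parent_id to another row of the same table.
parent = conn.execute(
    "SELECT p.scientific_name FROM taxonomy t "
    "JOIN taxonomy p ON t.parent_id = p.id WHERE t.taxid = ?",
    (9606,)).fetchone()

print(taxon[0])
print(parent[0])
```

With SQLObject the same two hops are written as n.taxonomy and t.parent; the joins are generated for you.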

Populate Taxonomy table: load_taxa.py

    import sys
    from model import *
    init(new=True)

    # Read in the taxonomy nodes, populate taxid and rank
    h = open(sys.argv[1])
    for l in h:
        l = l.strip('\t|\n')
        sl = l.split('\t|\t')
        taxid = int(sl[0])
        rank = sl[2]
        t = Taxonomy(taxid=taxid, rank=rank,
                     scientific_name=None, parent=None)
    h.close()
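
The strip/split idiom in the loader can be tried on its own. The sketch below wraps it in a hypothetical helper, parse_dmp_line, and runs it on an abbreviated, hand-written line in the same tab-pipe-tab layout; real lines in the .dmp files carry more fields.

```python
def parse_dmp_line(line):
    """Split one tab-pipe-tab delimited line into a list of fields."""
    # strip() removes any run of tab, pipe, and newline characters from
    # both ends; split() then cuts on the full "\t|\t" separator.
    return line.strip('\t|\n').split('\t|\t')


# Abbreviated sample line: taxid, parent taxid, rank.
sample = "9606\t|\t9605\t|\tspecies\t|\n"
fields = parse_dmp_line(sample)

taxid = int(fields[0])
parent_taxid = int(fields[1])
rank = fields[2]
print("%d %d %s" % (taxid, parent_taxid, rank))

# Caveat: because strip() works on characters, empty trailing fields vanish.
short = parse_dmp_line("9606\t|\t\t|\n")
print(len(short))
```

The caveat matters only if a script indexes fields near the end of a line; the loaders here read the first few fields, which are always present.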

Populate Name table: load_names.py

    import sys
    from model import *
    init()

    # Read in the names, populate name, class, and id of
    # taxonomy row
    h = open(sys.argv[1])
    for l in h:
        l = l.strip('\t|\n')
        sl = l.split('\t|\t')
        taxid = int(sl[0])
        name_class = sl[3]
        name = sl[1]
        t = Taxonomy.byTaxid(taxid)
        n = Name(name=name, name_class=name_class, taxonomy=t)
    h.close()

Fix up the Taxonomy table: fix_taxa.py

    import sys
    from model import *
    init()

    # Read in the taxonomy nodes, get self and parent taxonomy objects,
    # and fix the parent field appropriately
    h = open(sys.argv[1])
    for l in h:
        l = l.strip('\t|\n')
        sl = l.split('\t|\t')
        taxid = int(sl[0])
        parent_taxid = int(sl[1])
        t = Taxonomy.byTaxid(taxid)
        p = Taxonomy.byTaxid(parent_taxid)
        t.parent = p
    h.close()

    # Find all scientific names and fix their taxonomy objects'
    # scientific name fields appropriately
    for n in Name.select(Name.q.name_class == 'scientific name'):
        n.taxonomy.scientific_name = n.name
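
Why set the parent in a separate pass? A child can appear in the file before its parent, so the parent row may not exist yet when the child is first read. The sketch below shows the two-pass pattern with plain Python objects; the Node class and the three records are assumptions for illustration, and the chain is simplified (the real parent of 9605 is not the root).

```python
# (taxid, parent taxid) pairs; the child is deliberately listed first.
records = [(9606, 9605), (9605, 1), (1, 1)]


class Node(object):
    """Hypothetical stand-in for a Taxonomy row."""
    def __init__(self, taxid):
        self.taxid = taxid
        self.parent = None


# Pass 1: create every node, leaving parent unset.
by_taxid = {}
for taxid, _ in records:
    by_taxid[taxid] = Node(taxid)

# Pass 2: every node now exists, so each parent lookup succeeds.
for taxid, parent_taxid in records:
    by_taxid[taxid].parent = by_taxid[parent_taxid]

print(by_taxid[9606].parent.taxid)
```

Note that the root ends up as its own parent, which is what lets a tree walk detect that it has reached the top.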

Back to the Taxonomy example
Each taxonomy entry can have multiple names
  Many names can point (ForeignKey) to a single taxonomy entry
  name → taxonomy is easy...
  taxonomy → list of names requires a select statement

    from model import *
    init()

    hs = Taxonomy.byTaxid(9606)
    for n in Name.select(Name.q.taxonomy == hs):
        print n.name

Taxonomy table relationships
This relationship (one-to-many) is called a multiple join.
  Related joins (many-to-many) are available too.

    class Taxonomy(SQLObject):
        # other data members
        names = MultipleJoin("Name")
        children = MultipleJoin("Taxonomy", joinColumn='parent_id')

    from model import *
    init()

    hs = Taxonomy.byTaxid(9606)
    for n in hs.names:
        print n.name
    for c in hs.children:
        print c.scientific_name
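
A multiple join is the foreign key read backwards: "which rows point at me?". The sketch below imitates that with plain Python properties. Taxon and TaxonName are hypothetical stand-ins, not SQLObject classes, and they scan an in-memory list where SQLObject would run a select.

```python
class Taxon(object):
    """Hypothetical stand-in for a Taxonomy row."""
    registry = []

    def __init__(self, taxid, scientific_name, parent=None):
        self.taxid = taxid
        self.scientific_name = scientific_name
        self.parent = parent
        Taxon.registry.append(self)

    @property
    def children(self):
        # Reverse of the parent link: every taxon whose parent is self.
        return [t for t in Taxon.registry if t.parent is self]

    @property
    def names(self):
        # Reverse of the name -> taxonomy link: every name pointing at self.
        return [n for n in TaxonName.registry if n.taxonomy is self]


class TaxonName(object):
    """Hypothetical stand-in for a Name row."""
    registry = []

    def __init__(self, taxonomy, name, name_class):
        self.taxonomy = taxonomy
        self.name = name
        self.name_class = name_class
        TaxonName.registry.append(self)


homo = Taxon(9605, 'Homo')
hs = Taxon(9606, 'Homo sapiens', parent=homo)
TaxonName(homo, 'Homo', 'scientific name')
TaxonName(hs, 'Homo sapiens', 'scientific name')
TaxonName(hs, 'human', 'genbank common name')

for n in hs.names:
    print(n.name)
for c in homo.children:
    print(c.scientific_name)
```

Nothing is stored on the "one" side of the relationship; names and children are computed from the links held on the "many" side, which is why the join only needs to be told which column to look in.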

SQLObject Exceptions
What happens when the row isn't in the table?

    from model import *
    init()

    try:
        hs = Taxonomy.get(7921)        # look up by row id
        hs = Taxonomy.byTaxid(9606)    # look up by taxid
    except SQLObjectNotFound:
        # if row id 7921 / Tax id 9606 is not in table...
        pass

    results = Taxonomy.selectBy(taxid=9606)
    if results.count() == 0:
        # No rows satisfy the constraint!
        pass

    try:
        first_item = results[0]
    except IndexError:
        # No first item in the results
        pass

Example Program

    import sys
    from model import *
    init()

    try:
        taxid = int(sys.argv[1])
    except IndexError:
        print >>sys.stderr, "Need a taxonomy id argument"
        sys.exit(1)
    except ValueError:
        print >>sys.stderr, "Taxonomy id should be an integer"
        sys.exit(1)

    # Get taxonomy row
    try:
        t = Taxonomy.byTaxid(taxid)
    except SQLObjectNotFound:
        print >>sys.stderr, "Taxonomy id", taxid, "does not exist"
        sys.exit(1)

    for n in t.names:
        print "Organism", t.scientific_name, "has name", n.name
    for c in t.children:
        print "Organism", t.scientific_name, "has child", c.scientific_name, c.taxid
    print "Organism", t.scientific_name, "has parent", t.parent.scientific_name, t.parent.taxid

Example Program

    # Continued...

    # Iterate up through the taxonomy tree from t, to find its genus.
    # The root is its own parent, so the loop stops there.
    r = t
    g = None
    while r != r.parent:
        if r.rank == 'genus':
            g = r
            break
        r = r.parent

    if g is None:
        print "Organism", t.scientific_name, "has no genus"
    else:
        print "Organism", t.scientific_name, "has genus", g.scientific_name
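
The upward walk can be tried without a database. The sketch below uses a small dictionary in place of the Taxonomy table; the helper name find_ancestor_with_rank and the tree itself are assumptions for illustration, with intermediate ranks omitted and the root stored as its own parent.

```python
# taxid -> (parent taxid, rank, scientific name); simplified chain.
tree = {
    1:    (1,    'no rank', 'root'),
    9604: (1,    'family',  'Hominidae'),
    9605: (9604, 'genus',   'Homo'),
    9606: (9605, 'species', 'Homo sapiens'),
}


def find_ancestor_with_rank(taxid, rank):
    """Walk up from taxid; return the first taxid with the given rank, or None."""
    current = taxid
    while True:
        parent, current_rank, _ = tree[current]
        if current_rank == rank:
            return current
        if parent == current:
            # Reached the root (its own parent) without a match.
            # Unlike the loop above, this version also tests the root's rank.
            return None
        current = parent


genus = find_ancestor_with_rank(9606, 'genus')
if genus is None:
    print("no genus")
else:
    print(tree[genus][2])
```

Starting the walk at the organism itself, rather than at its parent, means a taxid that already is a genus is reported as its own genus.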

Exercises
Write a Python program using SQLObject to find the taxonomic lineage of a user-supplied organism name.
  Make sure you use the small_taxa.db3 file from the course data-folder.