The GUS 3.0 Perl Object Layer CBIL Jonathan Schug June

Outline
–Overview
–Objects
–Using Objects
–Superclasses
–GA - GusApplication
–Plugins

A Programmer's View of GUS3.0
–The application programmer interacts with GUS via the GusApplication (GA) Perl program.
–The GA is a general framework for connecting to GUS30.
–Specific tasks are performed by individual plugins.
–Plugins use either table-specific classes or SQL access.
–Low-level database access is provided by DBI classes.
[Diagram labels: RAD, TESS, DoTS, Core, SRes; DBI; Plugin; Class; SuperClasses; SQL; GusApplication]

A GUS3.0 Table
–Primary key
–Table-specific attributes
–GUS overhead attributes
–Parents - pointed to by this table
–Children - point to this table

A GUS3.0 Perl Object Layer Class
GUS30/DoTS/Clone.pm
    package DoTS::Clone;
    use strict;
    use GUS30::DoTS::gen::Clone_gen;
    use vars qw(@ISA);
    @ISA = qw(DoTS::Clone_gen);
    1;
–Relies on the _gen class for accessor methods.
–This is a stub for hand-edited, domain-specific methods.

The _gen Class - I
GUS30/DoTS/gen/Clone_gen.pm
    package DoTS::Clone_gen;
    use strict;
    use GUS30::dbiperl_utils::RelationalRow;
    use vars qw(@ISA);
    @ISA = qw(RelationalRow);
    sub setDefaultParams {... }
–Inherits from RelationalRow.
–setDefaultParams determines whether the table is versionable and updateable.

The _gen Class - II
GUS30/DoTS/gen/Clone_gen.pm
    sub setCloneId {... }        sub getCloneId {... }
    sub setLibraryId {... }      sub getLibraryId {... }
    sub setImageId {... }        sub getImageId {... }
    sub setDbestCloneUid {... }  sub getDbestCloneUid {... }
    sub setWashuName {... }      sub getWashuName {... }
    sub setGdbId {... }          sub getGdbId {... }
    sub setMgiId {... }          sub getMgiId {... }
    sub setDbestLength {... }    sub getDbestLength {... }
    sub setWashuLength {... }    sub getWashuLength {... }
–There is an accessor for each column.
–Note the case change and loss of underscores.
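The naming rule noted above (case change, underscores dropped) can be illustrated with a small sketch. The helper name accessor_stem is hypothetical and is not part of GUS; the slides do not show how the _gen classes are actually generated.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of GUS): turn a column name such as
# 'washu_length' into the CamelCase stem used by the accessors.
sub accessor_stem {
    my ($column) = @_;
    return join '', map { ucfirst(lc($_)) } split /_/, $column;
}

# Print the accessor pair that corresponds to each column name.
for my $column (qw(clone_id dbest_clone_uid washu_length)) {
    my $stem = accessor_stem($column);
    print $column, ': set', $stem, ' / get', $stem, "\n";
}
```

For the column washu_length this prints the pair setWashuLength / getWashuLength, matching the accessor names listed on the slide.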

The _gen Class - III
GUS30/DoTS/gen/Clone_gen.pm
    sub setModificationDate {... }    sub getModificationDate {... }
    sub setUserRead {... }            sub getUserRead {... }
    sub setUserWrite {... }           sub getUserWrite {... }
    sub setGroupRead {... }           sub getGroupRead {... }
    sub setGroupWrite {... }          sub getGroupWrite {... }
    sub setOtherRead {... }           sub getOtherRead {... }
    sub setOtherWrite {... }          sub getOtherWrite {... }
    sub setRowUserId {... }           sub getRowUserId {... }
    sub setRowGroupId {... }          sub getRowGroupId {... }
    sub setRowProjectId {... }        sub getRowProjectId {... }
    sub setRowAlgInvocationId {... }  sub getRowAlgInvocationId {... }
–There is an accessor for each column.
–Note the case change and loss of underscores.

Hand Edited Methods
–Edit the main class file, e.g., GUS30/DoTS/Clone.pm
–The file is typically placed in GUS30/DoTS/hand_edited/ with a symlink in GUS30/DoTS.
–Mostly used in the DoTS section.
    DoTS/AAFeature.pm:4
    DoTS/AASequence.pm:2
    DoTS/Assembly.pm:76
    DoTS/AssemblySequence.pm:29
    DoTS/Evidence.pm:6
    DoTS/GeneFeature.pm:4
    DoTS/Gene.pm:9
    DoTS/IndexWordSimLink.pm:2
    DoTS/NAFeature.pm:5
    DoTS/NASequence.pm:4
    DoTS/RNAFeature.pm:4
    DoTS/RNA.pm:3
    DoTS/Similarity.pm:9
    DoTS/SimilaritySpan.pm:6
    DoTS/SplicedNASequence.pm:1
    DoTS/TranslatedAAFeature.pm:3
    DoTS/TranslatedAAFeatureSegment.pm:2
    DoTS/TranslatedAASequence.pm:1
    DoTS/VirtualSequence.pm:1
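A minimal sketch of what a hand-edited method might look like. The packages My::Clone_gen and My::Clone are stand-ins so that the example runs without GUS, and getLongestLength is a hypothetical method invented for illustration; only the accessor names getWashuLength and getDbestLength come from the slides.

```perl
use strict;
use warnings;

# Stand-in for a generated class. The real DoTS::Clone_gen inherits from
# RelationalRow and is not reproduced here.
package My::Clone_gen;

sub new {
    my ($class, $attributes) = @_;
    return bless { %{ $attributes || {} } }, $class;
}
sub getWashuLength { return $_[0]->{washu_length} }
sub getDbestLength { return $_[0]->{dbest_length} }

# Stand-in for the hand-edited stub class.
package My::Clone;
our @ISA = ('My::Clone_gen');

# Hypothetical domain-specific method built on the generated accessors:
# return the larger of the two recorded lengths, ignoring missing values.
sub getLongestLength {
    my ($self) = @_;
    my ($longest) = sort { $b <=> $a }
        grep { defined $_ } ($self->getWashuLength, $self->getDbestLength);
    return $longest;
}

package main;

my $clone = My::Clone->new({ washu_length => 5, dbest_length => 12 });
print $clone->getLongestLength, "\n";
```

The point of the split is that the generated file can be regenerated without overwriting methods like this one, since they live in the stub class.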

Creating Objects
    # get the class
    use GUS30::DoTS::Clone;
    …
    # create new object
    my $clone_gus = DoTS::Clone->new({
        washu_length => 5,
    });
    # adjust a column value
    $clone_gus->setDbestCloneUid('A123456');
    # print some values
    print $clone_gus->getDbestCloneUid, "\n";
    print $clone_gus->toXML, "\n";
    # submit to database
    $clone_gus->submit;

Connecting Objects
    use GUS30::DoTS::Clone;
    use GUS30::DoTS::CloneLibrary;
    my $clone_lib_gus = DoTS::CloneLibrary->new({…});
    while (<>) {
        chomp;
        my @fields = split /\t/;
        my $clone_gus = DoTS::Clone->new({…});
        # this
        $clone_lib_gus->addChild($clone_gus);
        # or this
        $clone_gus->setParent($clone_lib_gus);
    }
    $clone_lib_gus->submit;
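The two linking calls and the single submit on the parent can be modelled with a toy class. Toy::Row is a hypothetical stand-in, not RelationalRow, and the assumption that submit on a parent also submits its children is inferred from the slide, where only the library object is submitted.

```perl
use strict;
use warnings;

package Toy::Row;   # hypothetical stand-in, not RelationalRow

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, children => [], parent => undef }, $class;
}

# Link a child to this row; linking the same child twice has no effect.
sub addChild {
    my ($self, $child) = @_;
    return if grep { $_ == $child } @{ $self->{children} };
    push @{ $self->{children} }, $child;
    $child->{parent} = $self;
    return;
}

# Setting the parent creates the same link from the other side.
sub setParent {
    my ($self, $parent) = @_;
    $parent->addChild($self);
    return;
}

# Assumed behaviour: submitting a row also submits its children.
# Here "submitting" only records the row name in a log.
sub submit {
    my ($self, $log) = @_;
    push @$log, $self->{name};
    $_->submit($log) for @{ $self->{children} };
    return;
}

package main;

my $library = Toy::Row->new('library');
my $clone_a = Toy::Row->new('clone_a');
my $clone_b = Toy::Row->new('clone_b');
$library->addChild($clone_a);     # link from the parent side
$clone_b->setParent($library);    # or from the child side
my @submitted;
$library->submit(\@submitted);
print join(', ', @submitted), "\n";
```

In this toy model either call produces the same parent/child link, so only one of them is needed per child.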

Retrieving Objects
    use GUS30::DoTS::CloneLibrary;
    my $clone_lib_gus = DoTS::CloneLibrary->new({
        clone_library_id => …,
    });
    if ($clone_lib_gus->retrieveFromDB) {
        $clone_lib_gus->set…(…);
        $clone_lib_gus->submit;
        print "found it!\n";
    }
    else {
        print "did not find any unique row!\n";
    }

Traversing Object Relations - I
    use GUS30::DoTS::CloneLibrary;
    use GUS30::DoTS::Clone;
    my $clone_lib_gus = DoTS::CloneLibrary->new({
        clone_library_id => …,
    });
    if ($clone_lib_gus->retrieveFromDB) {
        my @clones_gus = $clone_lib_gus->getChildren('DoTS.Clone',1);
        foreach my $clone_gus (@clones_gus) {
            …
        }
    }

Traversing Object Relations - II
    use GUS30::DoTS::CloneLibrary;
    use GUS30::DoTS::Clone;
    my $clone_gus = DoTS::Clone->new({
        clone_id => …,
    });
    if ($clone_gus->retrieveFromDB) {
        my $clone_lib_gus = $clone_gus->getParent('DoTS.CloneLibrary',1);
        ...
    }

Deleting Objects
    use GUS30::DoTS::CloneLibrary;
    use GUS30::DoTS::Clone;
    my $clone_lib_gus = DoTS::CloneLibrary->new({
        clone_library_id => …,
    });
    if ($clone_lib_gus->retrieveFromDB) {
        $clone_lib_gus->markDeleted;
        my @clones_gus = $clone_lib_gus->getChildren('DoTS.Clone',1);
        foreach (@clones_gus) {
            $_->markDeleted;
        }
        $clone_lib_gus->submit;
    }
–Recursively deletes children as well.

The Object Cache
–A cache of objects is maintained so that getParent and getChildren always return the same instance of a row.
–The cache is limited in size to avoid large memory requirements.
–The cache is cleared with the undefPointerCache method on an object or plugin.
–The cache size is increased with the setMaximumNumberOfObjects method.
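The idea of a size-limited cache that always hands back the same instance can be sketched as follows. Toy::Cache is hypothetical and is not the GUS implementation; it borrows the method names undefPointerCache and setMaximumNumberOfObjects from the slide only to show their roles, and dying when the cache is full is an assumption.

```perl
use strict;
use warnings;

package Toy::Cache;   # hypothetical; not the GUS implementation

sub new {
    my ($class, $max) = @_;
    return bless { max => $max, objects => {} }, $class;
}

# Return the cached instance for a table and primary key, building it
# with the supplied code reference on a miss.
sub fetch {
    my ($self, $table, $id, $builder) = @_;
    my $key = join ':', $table, $id;
    return $self->{objects}{$key} if exists $self->{objects}{$key};
    # Assumed behaviour: refuse to grow past the configured limit.
    die "cache is full\n"
        if scalar(keys %{ $self->{objects} }) >= $self->{max};
    return $self->{objects}{$key} = $builder->();
}

# Empty the cache (role of undefPointerCache on the slide).
sub undefPointerCache { $_[0]{objects} = {}; return }

# Raise or lower the limit (role of setMaximumNumberOfObjects).
sub setMaximumNumberOfObjects { $_[0]{max} = $_[1]; return }

package main;

my $cache  = Toy::Cache->new(2);
my $first  = $cache->fetch('DoTS.CloneLibrary', 7, sub { return { id => 7 } });
my $second = $cache->fetch('DoTS.CloneLibrary', 7, sub { return { id => 7 } });
print "same instance\n" if $first == $second;
```

Because both fetch calls return the same reference, a change made through one handle is visible through the other, which is the property the cache exists to guarantee.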

Dbiperl_utils
Support and base classes for object classes.
–RelationalRow
–DbiRow
–DbiTable
–DbiDatabase

RelationalRow.pm
Contains 176 methods in these categories:
–Accessors for default overhead values
–Accessors for debugging and verbose modes
–Pointer cache maintenance
–Class information
–Parent/child information
–XML management
–Deletion marking
–Submission management
–Similarity and Evidence management
Isa DbiRow

DbiRow.pm
Contains 43 methods in these categories:
–Get/Set methods to support class-specific accessors
–Accessors for table and class names
–Attribute information
–Tracking attribute value changes
–retrieveFromDB
–IdentityInsert management
–Get DbHandle, MetaHandle, and Database

DbiTable.pm
76 methods for
–Various table names
–Attribute information
–Relations information
–Primary keys and ids
–Others

DbiDatabase.pm
103 methods covering these areas:
–Database handles
–Login information
–Database and section names
–Transaction management
–Table and view names
–Object cache
–Counters

Overhead Columns
Contain information about:
–History
–Ownership
–Access permissions
–Data provenance
Who manages these columns?

GusApplication (GA)
The purpose is to standardize how applications access the database.
Provides:
–Database login
–Default ownership and permissions
–Algorithm and parameter tracking
–Command line access

Algorithms & Stuff
–Algorithm
–AlgorithmImplementation
–AlgorithmInvocation
–AlgorithmParamKey
–AlgorithmParamKeyType
–AlgorithmParam
These tables track which programs, implementing which algorithms, were run with which parameters.
GA populates these tables.

GA Usage
    ga [<mode>] <plugin> [<options>]
<mode> is one of
–+create : creates Algorithm, AlgorithmImplementation, and AlgorithmParamKey
–+update : creates AlgorithmImplementation and AlgorithmParamKey
–+history : lists invocations
–+run : runs the plugin (default)
<plugin>
–From hierarchical namespace, e.g., Utils::UpdateGusFromXML
<options>
–Defined by plugin plus some generic GA options.
–E.g., --file data.tab --commit --verbose
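A rough sketch of how such a command line could be split into its parts. The helper split_ga_args is hypothetical and is not how GA itself is implemented; it assumes that the mode, when given, is the first argument, that the plugin name follows it, and that everything after the plugin name is an option.

```perl
use strict;
use warnings;

# Hypothetical helper, not GA code: an optional +mode, then the plugin
# name, then whatever options remain.
sub split_ga_args {
    my @args = @_;
    my $mode = 'run';    # +run is the default mode
    if (@args && $args[0] =~ /^\+(create|update|history|run)$/) {
        $mode = $1;
        shift @args;
    }
    my $plugin = shift @args;
    return ($mode, $plugin, [@args]);
}

my ($mode, $plugin, $options) = split_ga_args(
    qw(Utils::UpdateGusFromXML --file data.tab --commit --verbose)
);
print join(' ', $mode, $plugin, @$options), "\n";
```

With no explicit mode the sketch falls back to run, mirroring the default stated on the slide.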

Plugins
A plugin is just a package that inherits from GUS30::GA_plugins::Plugin.
    package …;
    @ISA = qw(GUS30::GA_plugins::Plugin);
It must have two methods:
–new - to create and initialize the plugin object
–run - to perform the actions of the plugin
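The shape of a plugin can be sketched with stand-in packages so that it runs without GUS. Toy::Plugin replaces GUS30::GA_plugins::Plugin, Toy::CountLines is a hypothetical plugin, and the signature of run (a list of lines in, a summary string out) is an assumption; the slides do not say what GA passes to run.

```perl
use strict;
use warnings;

# Stand-in base class; the real one is GUS30::GA_plugins::Plugin.
package Toy::Plugin;
sub setUsage { $_[0]{usage} = $_[1]; return }
sub getUsage { return $_[0]{usage} }

# Hypothetical plugin: a package that inherits from the base class and
# supplies the two required methods, new and run.
package Toy::CountLines;
our @ISA = ('Toy::Plugin');

sub new {
    my ($class) = @_;
    my $self = bless {}, $class;
    $self->setUsage('counts the lines it is given');
    return $self;
}

# Assumed signature: takes a list of lines, returns a summary string.
sub run {
    my ($self, @lines) = @_;
    return scalar(@lines) . ' lines';
}

package main;

my $plugin = Toy::CountLines->new;
print $plugin->getUsage, ': ', $plugin->run("a\n", "b\n"), "\n";
```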

The new Method
Must initialize certain important plugin attributes:
    sub new {
        my $Class = shift;
        my $m = bless {}, $Class;
        $m->setUsage('what this algorithm does');
        $m->setVersion('2.0');
        $m->setRequiredDbVersion({ Core => '3', DoTS => '3' });
        $m->setDescription('what is new in implementation');
        $m->setEasyCspOptions(…);   # command line options
        return $m;
    }

Command Line Options
A hash describing a parameter:
–h => hint for user
–t => parameter data type (boolean, string, integer, float)
–d => default value
–l => is a list if true
–e => list of legal reg-exps
–r => required if true
–o => command line flag
E.g.,
    {
        h => 'start label ordinals with this value',
        t => 'integer',
        d => 0,
        o => 'FirstOrdinal',
    },
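To show how such a description hash could drive command-line parsing, here is a sketch that maps the t, d, r, and o keys onto the core Getopt::Long module. This is not how GA processes the hash (the slides do not show that); parse_options and %TYPE_SUFFIX are hypothetical names, and the h, l, and e keys are ignored here.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Assumed mapping from the 't' key to Getopt::Long type suffixes.
my %TYPE_SUFFIX = (boolean => '!', string => '=s', integer => '=i', float => '=f');

# Hypothetical helper: parse @argv according to a list of description
# hashes and return a hash reference of option values keyed by flag name.
sub parse_options {
    my ($descriptions, @argv) = @_;
    my (%values, @spec);
    for my $d (@$descriptions) {
        my $suffix = $TYPE_SUFFIX{ $d->{t} } // die "unknown type $d->{t}\n";
        $values{ $d->{o} } = $d->{d};    # default value, undef if none given
        push @spec, $d->{o} . $suffix => \$values{ $d->{o} };
    }
    GetOptionsFromArray(\@argv, @spec) or die "bad command line\n";
    for my $d (grep { $_->{r} } @$descriptions) {
        die "missing required option --$d->{o}\n"
            unless defined $values{ $d->{o} };
    }
    return \%values;
}

my $options = parse_options(
    [
        {
            h => 'start label ordinals with this value',
            t => 'integer',
            d => 0,
            o => 'FirstOrdinal',
        },
    ],
    qw(--FirstOrdinal 5),
);
print $options->{FirstOrdinal}, "\n";
```

When the flag is omitted, the value falls back to the d default; an option marked with r and left undefined causes the sketch to die.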

GA-Supplied Command-line Options
GA adds these options:
–commit
–verbose
–debug
–user
–group
–project
–comment
–database
–server
–implementation
–algoinvo
The options shown in pink on the original slide are also read from the config file .gus30.cfg

Example: TESS::LoadMultinomialLabelSet
TESS::MultinomialLabelSet    TESS::MultinomialLabel
–Task is to maintain entries in these two tables.
–MultinomialLabelSet stores sets of labels for multinomial observations, e.g., DNA, AA, or dimer gaps. Can also be DNA or AA dimers, trimers, etc.
–MultinomialLabel stores individual names.