The EuPathDB / GUS-WDK Search Strategy System


Cristina Aurrecoechea¹, Brian P. Brunk², Steve Fischer², Xin Gao², Omar S. Harb², Mark Heiges¹, Jessica C. Kissinger¹, Eileen T. Kraemer¹, Cary Pennington¹, David S. Roos², Chris Ross¹, Christian J. Stoeckert² & Charles Treatman²
¹ Univ. Georgia, Athens GA, & ² Univ. Pennsylvania, Philadelphia PA

User perspectives on Strategies
Computer-human interaction (CHI) studies during prototyping drove the design, and showed high user enthusiasm.
Usage stats show a 3-fold increase in use of Booleans in the two months since release.
User feedback very positive.

WDK Implementation
Runs on any relational database schema.
Model: configured by you in XML.
  Abstracts the DB to high-level Records (Genes, ORFs, etc.).
  Also specifies queries and returned columns.
  Automated sanity testing.
  Can talk to processes (BLAST) via a Web Services (WS) Framework.
View: Tomcat, JSP, tag library, JavaScript, Ajax, CSS.
  You embed JSP tags in your site and style them with CSS.
Controller: Struts.

WDK Upcoming features
Add genes to a “basket” to generate a report, add to a strategy as a step, or send to a tool (e.g., multiple sequence alignment).
Web services access to queries.
Assign weights to results from individual steps for improved filtering.
Transform a set of one type into another type based on genome span relations.

The EuPathDB suite of genome database web sites recently introduced a graphical search interface that motivates users to undertake dynamic computational experiments, exploring relationships across datasets to identify biologically meaningful genes and other entities. For example, users seeking novel therapeutic targets may wish to prioritize putative enzymes that distinguish pathogens from their hosts, and are expressed during appropriate developmental stages.
Strategies are initiated by running one of 80+ queries, and extended by adding additional searches, linked via Boolean operators represented graphically as Venn diagrams. Sub-strategies allow modular construction and tree structures, and searches may be extended using filters (e.g. by strain or species) and transforms (e.g. orthologs). A graphical display makes the overall logic obvious, and facilitates revision of individual steps, with changes propagated forward through the strategy. Users may name and save their strategies, creating protocols that can be shared with colleagues. (See, e.g., http://plasmodb.org/plasmo/im.do?s=2aa0454db6a6cca0.)

The strategy system has been subjected to extensive usability studies, and deployed on all EuPathDB databases (CryptoDB, GiardiaDB, PlasmoDB, ToxoDB, TrichDB and TriTrypDB). Although these sites have offered text-based Boolean operations for many years, usability analysis indicated that most users were not taking full advantage of that feature. Following release of the graphical Search Strategy system, the number of searches per visit dramatically increased. Response from our user community has been extremely positive, as investigators have discovered the power of combining datasets and making dynamic adjustments to define optimal parameters and highlight biologically-relevant relationships.

With the accelerating growth in diversity and scale of available datasets, the potential for exploiting interrelationships increases dramatically, and we expect this interface to have a significant impact in bringing “genomic thinking” to a broad audience.
This system was developed using the GUS Web Development Kit (WDK), a schema-independent middleware system for generating genomics websites.

The EuPathDB suite of databases covers genomic and functional genomics datasets for a variety of eukaryotic pathogens. Shown here is PlasmoDB, which covers the genus Plasmodium, including P. falciparum, the malaria parasite.

Use Case
Use data in PlasmoDB to find parasite (Plasmodium) drug target genes.
This panel shows a schematic of a strategy, using queries and Booleans. The actual strategy is built below.

Transferases (E.C.)
[union] Kinase activity (GO)
[intersect] ---------------------------------------------------------------------------
[intersect] present in Haemosporida, not Mammals
[intersect] not under diversifying selection (SNPs)
[transform] orthology to any Plasmodium genes

Build a Strategy
1. Run a query (choose from menu).
2. Add a step (another query).
3. Add more steps…
4. Revise steps at any time. Changes propagate forward.

A strategy can integrate data from genome annotation, expression, SNPs, proteomics, etc.
Nest strategies to add complexity.
View results from all or any species.
Use orthology to transform results to other species.
Download customized reports of results.
Choose from many available columns. Sort and move columns.
Dynamically revise, add or delete steps.
Email a strategy link to colleagues.
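
The use-case strategy above can be read as a chain of set operations over gene lists. The following is a minimal sketch of that reading, not the WDK implementation: the `Step` class, `run_strategy` function, gene IDs, query results and ortholog table are all hypothetical and invented for illustration. The step that appears only as a dashed line in the schematic is left out because its query is not recoverable here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical query results: query name -> set of made-up gene IDs.
QUERY_RESULTS = {
    "Transferases (E.C.)": {"g1", "g2", "g3"},
    "Kinase activity (GO)": {"g3", "g4", "g5"},
    "present in Haemosporida, not Mammals": {"g2", "g3", "g4", "g6"},
    "not under diversifying selection (SNPs)": {"g3", "g4", "g6"},
}

# Hypothetical ortholog table: gene ID -> orthologous Plasmodium gene IDs.
ORTHOLOGS = {"g3": {"p3a", "p3b"}, "g4": {"p4"}}

# Boolean operators that combine the running result with a new query result.
BOOLEANS = {
    "union": lambda a, b: a | b,
    "intersect": lambda a, b: a & b,
    "minus": lambda a, b: a - b,
}


@dataclass
class Step:
    operator: Optional[str]  # None for the first step, else a Boolean or "transform"
    query: str               # name of the query (or a label for a transform)


def run_strategy(steps, results=QUERY_RESULTS, orthologs=ORTHOLOGS):
    """Evaluate steps left to right and return the result set after each step."""
    current = set()
    history = []
    for step in steps:
        if step.operator is None:
            current = set(results[step.query])
        elif step.operator == "transform":
            # Replace every gene by its orthologs; genes without orthologs drop out.
            current = set().union(*(orthologs.get(g, set()) for g in current))
        else:
            current = BOOLEANS[step.operator](current, results[step.query])
        history.append(current)
    return history


USE_CASE = [
    Step(None, "Transferases (E.C.)"),
    Step("union", "Kinase activity (GO)"),
    Step("intersect", "present in Haemosporida, not Mammals"),
    Step("intersect", "not under diversifying selection (SNPs)"),
    Step("transform", "orthology to any Plasmodium genes"),
]

if __name__ == "__main__":
    for step, result in zip(USE_CASE, run_strategy(USE_CASE)):
        print(step.operator or "start", step.query, sorted(result))
```

Revising a step and calling `run_strategy` again recomputes every later step, which is one simple way to picture "changes propagate forward": changing the second step from `union` to `intersect` shrinks each downstream result in this toy data.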

It’s Easy to Build a Strategy… …Strategies are Powerful
Save and browse strategies.

Challenge: exploit the power of integrated genome annotation, expression data, proteomics data, SNPs, etc.
Solution: Strategies… A Graphical Query Interface for Genomics Databases

Nested Strategy
P.f. transcript expr. at 24 hours +/- 8
[union] P.f. transcript expr. in Trophozoites
[union] P.f. protein expr. in Trophozoites

Different types of strategies: Genes, Isolates, SNPs, Transcript assemblies, Chromosomes, Array Elements, ORFs, etc.

EuPathDB is an NIAID Bioinformatics Resource Center. Supported by NIAID Contract No. HHSN266200400037C and The Bill & Melinda Gates Foundation.

Strategies Web Dev Kit (WDK): www.gusdb.org/wdk

[Figure: WDK architecture diagram. Labels: Genomics Database; WDK Engine; Query Cache; Genomics Data Denormalized For Query Speed; Genomics Data; User Login and Search History; WDK Model (Java Objects); WDK Model (XML); WDK Query Engine (Java); Web Services Framework; JavaBeans (JSP compatible); JSP Tag Library; Struts controller; WDK Sanity Test; Processes (eg, BLAST); JSP and CSS. Legend: You provide; WDK provides; Optional.]
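
The architecture labels above (a model that names queries and their returned columns, a query engine, and a query cache) can be pictured with a small sketch. It rests on several assumptions: the real WDK is written in Java and configured in XML, so the Python dict `MODEL`, the `QueryEngine` class, the `genes` table and the in-memory cache below are hypothetical stand-ins, not WDK APIs, and the actual cache may work quite differently.

```python
import sqlite3

# Hypothetical model: each named query declares its record type, its SQL and
# the columns it returns. In WDK this role is played by an XML model file.
MODEL = {
    "GenesByProduct": {
        "record": "Gene",
        "sql": "SELECT gene_id, product FROM genes "
               "WHERE product LIKE :pattern ORDER BY gene_id",
        "columns": ["gene_id", "product"],
    },
}


class QueryEngine:
    """Runs model-defined queries and caches results by (query, parameters)."""

    def __init__(self, connection, model):
        self.connection = connection
        self.model = model
        self.cache = {}
        self.cache_hits = 0

    def run(self, query_name, **params):
        key = (query_name, tuple(sorted(params.items())))
        if key in self.cache:
            self.cache_hits += 1
            return self.cache[key]
        query = self.model[query_name]
        rows = self.connection.execute(query["sql"], params).fetchall()
        # Expose rows as high-level records keyed by the declared column names.
        result = [dict(zip(query["columns"], row)) for row in rows]
        self.cache[key] = result
        return result


def demo_database():
    """Build a tiny in-memory database with made-up genes for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (gene_id TEXT PRIMARY KEY, product TEXT)")
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?)",
        [
            ("g1", "protein kinase"),
            ("g2", "hypothetical protein"),
            ("g3", "serine/threonine kinase"),
        ],
    )
    return conn


if __name__ == "__main__":
    engine = QueryEngine(demo_database(), MODEL)
    print(engine.run("GenesByProduct", pattern="%kinase%"))
    engine.run("GenesByProduct", pattern="%kinase%")  # served from the cache
    print("cache hits:", engine.cache_hits)
```

Because the engine only sees query names, SQL text and column lists, pointing it at a different schema means editing the model rather than the engine, which is the sense of "schema-independent" this sketch tries to convey.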

