The CIPRES Portal: Current Status and Future Plans
Mark A. Miller
Principal Investigator, Biology
San Diego Supercomputer Center
CIPRES Software Libraries were created to enable fine-grained communication between programs
CIPRES Portal V 1.0 was created to expose these libraries to the user community
The CIPRES Portal V 1.X parses uploaded input files and provides users with appropriate tool selections for the data set. Results are stored temporarily for download.
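The "parse the upload, then offer matching tools" step can be sketched roughly as follows. This is only an illustration, not the portal's actual parser: the function names, the first-line heuristics, and the `TOOLS_BY_FORMAT` table are all assumptions made for the example.

```python
# Hypothetical format-to-tool table; the portal's real mapping is not given in the slides.
TOOLS_BY_FORMAT = {
    "nexus": ["PAUP", "MrBayes", "GARLI"],
    "phylip": ["RAxML"],
    "hennig86": ["PAUP"],
}


def detect_format(text: str) -> str:
    """Guess the data format from the first non-blank line of an upload."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return "unknown"
    first = lines[0]
    # Nexus files begin with the #NEXUS token.
    if first.upper().startswith("#NEXUS"):
        return "nexus"
    # Assumption: a Hennig86 matrix begins with the xread command.
    if first.lower().startswith("xread"):
        return "hennig86"
    # Phylip files begin with two integers: taxon count and character count.
    parts = first.split()
    if len(parts) == 2 and all(p.isdigit() for p in parts):
        return "phylip"
    return "unknown"


def tools_for(text: str) -> list:
    """Return the (illustrative) tool choices for an uploaded data set."""
    return TOOLS_BY_FORMAT.get(detect_format(text), [])
```

A caller would pass the uploaded file's text to `tools_for` and show the returned names as the selectable tools; an unrecognized file yields an empty list.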
Portal 1 Tools
- PAUP (limited options)
- RAxML + bootstrapping (“black box” options)
- GARLI (limited options)
- MrBayes (limited options)
- ClustalW

Portal 1 Features
- RecIDCM3 boosting for PAUP and RAxML
- ReST accessible
- Supports Nexus, Phylip, and Hennig86 formats
Problem: The CORBA architecture has a very high overhead for adding new data types/services. This makes it hard to expose new command line options, even when they are already available in the command line tool.
CIPRES Portal V 2: New design goals
- Decrease the overhead by creating a new architecture that still has “knowledge” about the data but treats the data as text strings, not as CORBA objects.
- Add new user-requested features:
  - Access to most or all native command line options
  - Add new tools more quickly
  - Provide personal user space for storing results
CIPRES Portal V 2.0 was built on a generic portal architecture called The Workbench Framework
All command line parameters can be set.
With the release of Portal V 2.0:
- Rate of job submission doubles
- Rate of GARLI use increases 9-fold
- Rate of MrBayes use increases 5-fold
To expose command line tools quickly, the Workbench Framework uses the PISE XML standard:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE pise SYSTEM "http://www.phylo.org/dev/rami/PARSER/pise.dtd" [
  <!ENTITY nucdbs SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/nucdbs.xml">
  <!ENTITY protdbs SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/protdbs.xml">
  <!ENTITY blastDBpath SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/blastDBpath.xml">
  <!ENTITY fastaDBpath SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/fastaDBpath.xml">
  <!ENTITY blocksDBpath SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/blocksDBpath.xml">
  <!ENTITY nucDBfasta SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/nucDBfasta.xml">
  <!ENTITY protDBfasta SYSTEM "http://www.phylo.org/dev/rami/XMLDIR/protDBfasta.xml">
]>
<pise>
  <head>
    <title>TFASTY</title>
    <version>34t10d3</version>
    <description>Compare PS to Translated NS Or NS-DB</description>
    <authors>W. Pearson</authors>
    <reference>Pearson, W. R. (1999) Flexible sequence similarity searching with the FASTA3 program package. Methods in Molecular Biology</reference>
    <reference>W. R. Pearson and D. J. Lipman (1988), Improved Tools for Biological Sequence Analysis, PNAS 85:2444-2448</reference>
    <reference>W. R. Pearson (1998) Empirical statistical estimates for sequence similarity searches. In J. Mol. Biol. 276:71-84</reference>
    <reference>Pearson, W. R. (1996) Effective protein sequence comparison. In Meth. Enz., R. F. Doolittle, ed. (San Diego: Academic Press) 266:227-258</reference>
    <category>Protein Sequence</category>
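As a rough illustration of how a tool description like this can be consumed as plain text, the sketch below reads the `<head>` metadata with Python's standard `xml.etree.ElementTree`. It is not the Workbench Framework's code. The `read_head` helper is a hypothetical name, and the sample string is a trimmed copy of the excerpt: the DOCTYPE is omitted and closing tags are added (both assumptions) so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the TFASTY excerpt; closing tags added so the sample is well-formed.
PISE_SAMPLE = """<pise>
  <head>
    <title>TFASTY</title>
    <version>34t10d3</version>
    <description>Compare PS to Translated NS Or NS-DB</description>
    <authors>W. Pearson</authors>
    <reference>W. R. Pearson and D. J. Lipman (1988), Improved Tools for Biological Sequence Analysis, PNAS 85:2444-2448</reference>
    <category>Protein Sequence</category>
  </head>
</pise>"""


def read_head(xml_text: str) -> dict:
    """Collect the <head> metadata of a PISE-style tool description."""
    root = ET.fromstring(xml_text)
    head = root.find("head")
    if head is None:
        raise ValueError("no <head> element found")
    return {
        "title": head.findtext("title"),
        "version": head.findtext("version"),
        "description": head.findtext("description"),
        "authors": head.findtext("authors"),
        # A tool description may list several references.
        "references": [(r.text or "").strip() for r in head.findall("reference")],
        "category": head.findtext("category"),
    }


info = read_head(PISE_SAMPLE)
print(info["title"], info["version"], info["category"])
```

An interface generator could use a dictionary like this to label the tool's page before building form fields from the rest of the description.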
The webtooldev server was created so anyone can create and test new interfaces…
The scalable interface generator allows us to add new tools quickly…
Problem: Many users cannot complete their jobs within the 72-hour limit.
Under current development: connect the portal to scalable resources
- The TeraGrid resources are faster, and they are scalable.
- The CIPRES Portal has been awarded 400,000 TeraGrid CPU hours and is now a TeraGrid Science Gateway project.
- We are building the infrastructure to support this access.
- We have implemented parallel MrBayes, GARLI, and RAxML.
- We are working with developers to deploy restart options, so that a job that times out can be restarted.
[Diagram labels: Workbench Framework; TeraGrid (group allocation)]
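The restart idea can be sketched as a small simulation: a job that needs more wall-clock time than one submission allows is resubmitted from its last checkpoint until it finishes. Nothing here calls a real scheduler or any of the tools named above. The `Checkpoint` class, the function names, and the `max_submissions` cap are hypothetical; only the 72-hour limit comes from the slides.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    """Progress saved between submissions (hypothetical stand-in for a restart file)."""
    hours_done: float = 0.0


def run_segment(checkpoint: Checkpoint, hours_needed: float, limit_hours: float) -> bool:
    """Advance the job by at most limit_hours; return True once it is finished."""
    remaining = hours_needed - checkpoint.hours_done
    checkpoint.hours_done += min(remaining, limit_hours)
    return checkpoint.hours_done >= hours_needed


def run_with_restarts(hours_needed: float, limit_hours: float = 72.0,
                      max_submissions: int = 10) -> int:
    """Resubmit from the checkpoint until done; return the number of submissions used."""
    checkpoint = Checkpoint()
    for submission in range(1, max_submissions + 1):
        if run_segment(checkpoint, hours_needed, limit_hours):
            return submission
        # The segment hit the time limit: the next loop pass restarts from the checkpoint.
    raise RuntimeError("job did not finish within the allowed number of submissions")
```

Under these assumptions, a job needing 200 hours of compute would finish on its third 72-hour submission instead of being lost at the first timeout.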