Bioinformatics Data and the Grid: The GeneGrid Data Manager
Noel Kelly

GeneGrid Architecture
[Architecture diagram. Components labelled in the figure:]
- GeneGrid Environment
- GeneGrid Portal
- GeneGrid Workflow Manager Service
- GeneGrid Process Manager Service
- GeneGrid Application Management Registry
- GeneGrid Data Manager Registry
- GDM Services for Workflow Definition, Workflow Status, and Input & Results Parameters
- GAM Services and GDM Services at remote resources
- Applications: Blast, mpiBlast, TMHMM, SignalP
- Databases: EMBL, SwissProt
- Sites: BeSC, iGAP, EBI, SDSC

GeneGrid Data Manager Objectives
- Integrate specialised public biological data into the Grid
- Integrate proprietary data into the Grid
- Access and storage of user input parameters
- Experiment tracking
- Access and storage of experiment results

GeneGrid Databases
- GeneGrid Workflow Definition Database
- GeneGrid Workflow Status Database
- GeneGrid Results & Input Parameter Database
Storage shown in the figure: Xindice 1.0 collection, file system

Biological Databases
- Databases: EMBL Bank, SwissProt, TrEMBL, TrEMBL_new, GenBank, DDBJ, ENSEMBL, Fusion Proprietary, Amtec Proprietary
- Storage formats: structured file, MySQL, Oracle, T.B.C.

Public Biological Data Integration
[Diagram: GeneGrid Data Manager Service accessing SwissProt]
- Using BioPerl modules
- JDBC driver
- Perl scripts

Public Biological Data Integration (continued)
[Diagram labels: GeneGrid Data Manager Service, BeSC, Perl script, record, SwissProt, EBI]

Fusion Antibodies Commercial Use Case
[Workflow diagram]
- Input: Fasta file, e.g. MQNSHSGVNQLGGVFVNGRPLPDSTRQKIVELAHSGARPCDISRILQVSNGCVSKILGRY…………
- Stages: BlastP, SwissProt Query, Blast Formatter, TMHMM, SignalP, Eliminator, Bl2Seq, Eliminator
- Data passed between stages: Blast format, accession numbers, multiple Fasta records, multiple TMHMM format, multiple SignalP format, Fasta records

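The "multiple Fasta records" handed from stage to stage, and the filtering done by an Eliminator stage, can be illustrated with a small Python sketch. This is not GeneGrid code: the slides name Perl and BioPerl as the project's tools, and they do not say what criterion the Eliminator applies. The sketch assumes, purely for illustration, that an earlier stage has produced a list of record identifiers to drop. The names parse_fasta, eliminate and to_fasta are hypothetical.

```python
from typing import Iterable, List, Tuple

# A record is (header without the leading '>', sequence).
FastaRecord = Tuple[str, str]


def parse_fasta(text: str) -> List[FastaRecord]:
    """Split multi-record FASTA text into (header, sequence) pairs."""
    records: List[FastaRecord] = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def record_id(header: str) -> str:
    """Use the first whitespace-delimited token of the header as the ID."""
    parts = header.split()
    return parts[0] if parts else ""


def eliminate(records: Iterable[FastaRecord],
              flagged_ids: Iterable[str]) -> List[FastaRecord]:
    """Assumed Eliminator behaviour: drop records whose ID was flagged."""
    flagged = set(flagged_ids)
    return [r for r in records if record_id(r[0]) not in flagged]


def to_fasta(records: Iterable[FastaRecord], width: int = 60) -> str:
    """Serialise records back to FASTA text, wrapping sequence lines."""
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n" if lines else ""
```

In such a sketch, a prediction stage (for example TMHMM or SignalP) would supply flagged_ids, and the output of to_fasta would become the input file of the next stage.
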
Fusion Use Case – GDM Perspective
Workflow stages: BlastP, SwissProt Query, Blast Formatter, TMHMM, SignalP, Eliminator, Bl2Seq, Eliminator

Querying SwissProt
[Diagram labels:]
- Multiple accession numbers
- Task params
- Fasta record
- GeneGrid Data Manager Service (for SwissProt), SwissProt
- GeneGrid Data Manager Service (for GRIP), GRIP

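A lookup of several accession numbers that returns Fasta records can be sketched as follows. This is an illustrative Python stand-in, not the GeneGrid Data Manager Service itself, which according to the slides relies on Perl scripts and BioPerl modules. The sketch assumes the SwissProt data is available as flat-file text (ID, AC and SQ lines, entries terminated by "//"). The function names, the sample entries and their accession numbers are all made up for the example.

```python
from typing import Dict, Iterable, List, Tuple

# Made-up flat-file entries; not real SwissProt records.
SAMPLE = """ID   EXAMPLE1_TEST   Reviewed;   8 AA.
AC   X00001; X00002;
DE   Made-up entry for illustration.
SQ   SEQUENCE   8 AA;
     MQNSHSGV
//
ID   EXAMPLE2_TEST   Reviewed;   12 AA.
AC   X00003;
SQ   SEQUENCE   12 AA;
     NQLGGVFVNG RP
//
"""


def parse_swissprot(text: str) -> Dict[str, Tuple[str, str]]:
    """Index flat-file entries as accession -> (entry name, sequence)."""
    index: Dict[str, Tuple[str, str]] = {}
    accessions: List[str] = []
    entry_name = ""
    seq_parts: List[str] = []
    in_seq = False
    for line in text.splitlines():
        if line.startswith("//"):
            # End of entry: register it under every accession it lists.
            sequence = "".join(seq_parts)
            for acc in accessions:
                index[acc] = (entry_name, sequence)
            accessions, entry_name, seq_parts, in_seq = [], "", [], False
        elif in_seq:
            # Sequence lines are space-separated blocks of residues.
            seq_parts.append(line.replace(" ", ""))
        elif line.startswith("ID"):
            parts = line.split()
            entry_name = parts[1] if len(parts) > 1 else ""
        elif line.startswith("AC"):
            accessions += [a.strip() for a in line[2:].split(";") if a.strip()]
        elif line.startswith("SQ"):
            in_seq = True
    return index


def fetch_fasta(index: Dict[str, Tuple[str, str]],
                accessions: Iterable[str]) -> Tuple[str, List[str]]:
    """Return FASTA text for known accessions plus the list of unknown ones."""
    lines: List[str] = []
    missing: List[str] = []
    for acc in accessions:
        if acc in index:
            name, seq = index[acc]
            lines.append(f">{acc} {name}")
            lines.append(seq)
        else:
            missing.append(acc)
    return "\n".join(lines), missing
```

Reporting the accession numbers that were not found, rather than silently skipping them, is a design choice of the sketch; the slides do not describe how the service handles that case.
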
Fusion Use Case – GeneGrid Perspective
Workflow stages: BlastP, SwissProt Query, Blast Formatter, TMHMM, SignalP, Eliminator, Bl2Seq, Eliminator

Executing Bioinformatics Applications
[Diagram labels:]
- Task params
- Input file
- Result file
- Multiple Fasta records
- GeneGrid Application Manager Service (for SignalP)
- GeneGrid Data Manager Service (for GRIP), GRIP

GeneGrid Landmarks
- One year into a two-year project
- Successfully integrated a number of bioinformatics applications
- Successfully integrated a number of bioinformatics data sets
- A number of papers accepted at various conferences (Computing & Bioinformatics)
- International collaboration with the EOL project (SDSC)

GeneGrid at All Hands
- A Practical Workflow Implementation for a Grid Based Virtual Bioinformatics Laboratory
  Session 4.4, Thur 2nd Sep, 14:10 – 15:50
- Bioinformatics Application Integration and Management in GeneGrid: Experiments and Experiences
  Session 6.4, Fri 3rd Sep, 11:05 – 13:10

GeneGrid Demonstrations
- Tuesday, 1st September: 18:15 – 20:15
- Thursday, 2nd September: 10:00 – 11:30 and 17:30 – 19:30
- Friday, 3rd September: 13:00 – 14:30