AHM 2002 Tutorial on Scientific Data Mediation Example 1
SCENARIO WORKFLOW
[Figure: workflow diagram linking Clusfavor, NCBI GenBank, BLAST, and MatInspector through the wrappers wrap1, wrap2, wrap3, wrap4 and pwrap1, pwrap2, pwrap3. Data labels on the arrows: accession number; sequence to search; resulting sequences & similarity scores; the top match. Files shown: wrap2.xml, wrap3.xml, wrap4.xml, bin/matSearch/matsearch.pl.]
The results can then go 1) to an external program to build a model, 2) back to BLAST to find additional matches, or 3) to Clustal to determine a consensus sequence, which is then sent to BLAST.
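The chaining in this scenario (accession number, then sequence, then similarity hits, then top match) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the in-memory dictionaries, and the toy similarity score are all hypothetical stand-ins, and none of them reproduce the tutorial's actual wrappers or call GenBank or BLAST.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (sequence name, similarity score)


def lookup_sequence(accession: str, genbank: Dict[str, str]) -> str:
    """Hypothetical stand-in for the GenBank lookup: accession -> cDNA sequence."""
    return genbank[accession]


def similarity_search(query: str, database: Dict[str, str]) -> List[Hit]:
    """Hypothetical stand-in for the BLAST step.

    The score is a toy value (fraction of identical positions over the
    shorter sequence), NOT a BLAST score. Hits are returned best-first.
    """
    hits: List[Hit] = []
    for name, seq in database.items():
        overlap = min(len(query), len(seq))
        matches = sum(1 for a, b in zip(query, seq) if a == b)
        hits.append((name, matches / overlap if overlap else 0.0))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def run_scenario(accession: str,
                 genbank: Dict[str, str],
                 database: Dict[str, str]) -> Hit:
    """Chain the steps: accession -> sequence -> similar sequences -> top match."""
    sequence = lookup_sequence(accession, genbank)
    hits = similarity_search(sequence, database)
    return hits[0]


if __name__ == "__main__":
    # Made-up data, for illustration only.
    genbank = {"X1": "ACGTAC"}
    database = {"a": "ACGTAA", "b": "TTTTTT"}
    print(run_scenario("X1", genbank, database))
```

Keeping each step as a function with a plain input and output mirrors the slide's idea that each source sits behind a wrapper, so the top match could be routed onward to any of the three follow-up options.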
CLUSFAVOR
CLUSFAVOR: CLUSter and Factor Analysis with Varimax Orthogonal Rotation
A standalone program whose output consists of several clusters of named sequences that have similar expression characteristics in the current experiment.
GOAL: Given gene expression data, obtain a set of related sequences from which to build a model.
INPUT: gene expression data
OUTPUT: collection of clustered cDNA fragments
NCBI GenBank
GOAL: Given the name (or, better, the accession number) of a cDNA string from the Clusfavor results, do a name lookup in GenBank to obtain the cDNA sequence.
INPUT: the accession number or the name of a cDNA string
OUTPUT: the cDNA sequence for the input cDNA string
BLAST
BLAST: Basic Local Alignment Search Tool
A set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or DNA.
INPUT: the cDNA sequence returned by GenBank
OUTPUT: a set of similar sequences
MatInspector V2.2 (based on TRANSFAC)
MatInspector: Matrix Inspector
TRANSFAC: The Transcription Factor Database
Search for potential transcription factor binding sites in your own sequences and detect consensus matches in nucleotide sequence data using the TRANSFAC 4.0 matrices.
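To make the idea of matrix-based binding-site search concrete, here is a minimal sketch of sliding a position weight matrix along a nucleotide sequence. It is a generic illustration, not MatInspector's scoring method: the function `scan_matrix`, the normalisation by the best possible score, and `TOY_MATRIX` (a made-up matrix with consensus TATA) are all assumptions and are not taken from TRANSFAC.

```python
from typing import Dict, List, Tuple

Column = Dict[str, float]  # base -> weight at one matrix position


def scan_matrix(sequence: str,
                matrix: List[Column],
                threshold: float) -> List[Tuple[int, str, float]]:
    """Slide the matrix along the sequence and report windows scoring
    at or above the threshold.

    The similarity is the window score divided by the best score the
    matrix can give, so it lies in [0, 1]. This is a simplification
    chosen for illustration only.
    """
    if not matrix:
        raise ValueError("matrix must have at least one column")
    width = len(matrix)
    max_score = sum(max(col.values()) for col in matrix)
    hits: List[Tuple[int, str, float]] = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(base, 0.0) for col, base in zip(matrix, window))
        similarity = score / max_score
        if similarity >= threshold:
            hits.append((start, window, similarity))
    return hits


# Made-up 4-column matrix whose best-scoring word is "TATA".
TOY_MATRIX: List[Column] = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.8, "C": 0.1, "G": 0.05, "T": 0.05},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]

if __name__ == "__main__":
    for start, window, similarity in scan_matrix("GGTATACC", TOY_MATRIX, 0.9):
        print(start, window, round(similarity, 3))
```

Lowering the threshold reports weaker candidate sites as well, which is the usual trade-off between sensitivity and false positives in this kind of scan.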