Compile and Run BLAST Locally from Source Code

Preston Cofield
Advisor: Gang Qian
Department of Computer Science
College of Mathematics & Science

ABSTRACT: BLAST is a widely used search tool for homology detection in large biological sequence databases. In this presentation, we provide guidance on locating the BLAST source code on NCBI and downloading it to a local computer. We then show the compilation and execution of BLAST programs on both Linux and Windows platforms. Running BLAST locally allows study of the structure and algorithms of the BLAST source programs, so that comparative research on improving search performance on biological sequence databases can be conducted.

Introduction

Basic Local Alignment Search Tool (BLAST) [1,2,3] is a popular search algorithm in bioinformatics, used to detect homology between biological sequences.

BLAST can be run in two ways:
1. Through a Web interface provided by the National Center for Biotechnology Information (NCBI)
2. On a local computer

Running BLAST locally offers great flexibility to its users.

BLAST’s source code link: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
Database download link: ftp://ftp.ncbi.nlm.nih.gov/blast/db/

Linux Platform

Download the source tarball from the BLAST source code link: ncbi-blast src.tar.gz

Compilation:
    cd /BLASTdirectory/c++
    ./configure --without-debug --with-mt --with-build-root=ReleaseMT
    cd ReleaseMT/build
    make all_r

After compilation:
Run perl update_blastdb.pl database_name to download a selected database (e.g., htgs, refseq_rna).
Test the BLAST installation with a standard nucleotide similarity search:

Type ./blastdbcmd -db database_name -entry nm_ -outfmt "%f" -out test_query.txt
― blastdbcmd takes a selected database (-db), a search string parameter (-entry), an output format (-outfmt), and an output file (-out)
― It finds a sequence in -db that matches the search criteria, then writes the sequence to the output file in the given format

Type ./blastn -query my_query.txt -db refseq_rna -out output.txt
― blastn takes a sequence input file (-query; for example, the test_query.txt file produced by blastdbcmd), a selected database (-db), and an output file (-out)
― It runs a nucleotide query search on the given -db, then saves the results in the output file

Windows Platform

Download an MSI from the download link: Windows (32-bit x86, MSI installer)

After installation:
Windows must be able to run Perl scripts, so a Perl interpreter has to be installed.
Run perl update_blastdb.pl database_name to download a selected database (e.g., htgs, refseq_rna).
All BLAST programs are run from the command prompt.
Test the BLAST installation with a standard nucleotide similarity search:

Create a new OS environment variable holding the full path to BLAST’s bin directory
― This makes it easier to enter BLAST commands

In an open command prompt, enter the BLAST directory:

Type blastdbcmd -db database_name -entry nm_ -outfmt "%f" -out test_query.txt
― blastdbcmd takes a selected database (-db), a search string parameter (-entry), an output format (-outfmt), and an output file (-out)
― It finds a sequence in -db that matches the search criteria, then writes the sequence to the output file in the given format

Type blastn -query my_query.txt -db refseq_rna -out output.txt
― blastn takes a sequence input file (-query; for example, the test_query.txt file produced by blastdbcmd), a selected database (-db), and an output file (-out)
― It runs a nucleotide query search on the given -db, then saves the results in the output file

Results and Conclusion

Linux Platform: creation of an output file containing the search results from the blastn program
Windows Platform: creation of an output file containing the search results from the blastn program

Because BLAST can be compiled and run locally, the user gains the ability to study BLAST’s heuristic algorithms further and to improve upon them.

References

[1] BLAST Main Web site:
[2] Altschul S, Gish W, Miller W, Myers E and Lipman D. Basic local alignment search tool. J. Molecular Biology 1990; 215(3):
[3] Altschul S, Madden T, Schäffer A, Zhang J, Zhang Z, Miller W and Lipman D. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 1997; 25(17):
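
The post-installation test described for both platforms (download a database, extract one sequence with blastdbcmd, search with blastn) can be gathered into one script. What follows is a minimal sketch, not part of the original poster. The variable names, the `run` helper, and the `DRY_RUN` switch are illustrative assumptions, and the script assumes that `update_blastdb.pl`, `blastdbcmd`, and `blastn` are on `PATH`. Unlike the poster's blastn example, it feeds test_query.txt straight into the search. The `-entry` value is truncated in the source ("nm_") and is left that way, so a complete accession has to be supplied before a real run.

```shell
#!/usr/bin/env bash
# Sketch of the local BLAST installation test described in the poster.
# Assumptions (not from the original): variable names, the run() helper,
# the DRY_RUN switch, and that the BLAST+ programs are on PATH.
set -eu

DB="${DB:-refseq_rna}"      # database name used in the poster's example
ENTRY="${ENTRY:-nm_}"       # truncated in the original; supply a full accession
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them

# Print a command line, and execute it only when DRY_RUN=0.
run() {
    printf '%s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

blast_smoke_test() {
    # 1. Download the selected database.
    run update_blastdb.pl "$DB"
    # 2. Extract one sequence from the database in FASTA format (%f).
    run blastdbcmd -db "$DB" -entry "$ENTRY" -outfmt %f -out test_query.txt
    # 3. Run a nucleotide search with that sequence as the query.
    run blastn -query test_query.txt -db "$DB" -out output.txt
}

blast_smoke_test
```

By default the script only prints the three command lines, so they can be checked first. Setting `DRY_RUN=0` and `ENTRY` to a complete accession in the environment executes them.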