Display of Near Optimal Sequence Alignments

M. Smoot (1), W. Pearson (2) and S. Guerlain (1)
University of Virginia
(1) Department of Systems and Information Engineering, School of Engineering and Applied Science
(2) Department of Biochemistry and Molecular Genetics, School of Medicine
Charlottesville, VA 22904-4747
Contact: {mes5k,wrp,guerlain}@virginia.edu
www.sys.virginia.edu/hci

1. Introduction

The biological meaningfulness of alignments generated by optimal sequence alignment algorithms (e.g. Smith-Waterman) has been questioned for years. It has been speculated that the biologically optimal alignment of two sequences lies somewhere near the algorithmically optimal solution. While various algorithms have been proposed for generating near-optimal solutions, none of these methods provides a mechanism for effectively finding the most meaningful alignment. The primary problem is that these methods often generate a large number of similar solutions.

We have developed a web-based software system for the creation and display of near-optimal or alternative alignments of two protein or DNA sequences. The tool is designed to help investigators identify the most biologically meaningful alignment of two sequences.

2. Current Display

- Hard to compare sequences at the base/amino acid level.
- Hard to know which alignments are most important.
- Hard to know how many alignments exist.

3. System Prototype

- Web-based interface.
- Alignments displayed sequentially, like a movie.
- Multiple optimal and near-optimal alignments generated.
- Multiple combinations of scoring matrices, gap parameters and neighborhood thresholds allow users to generate comprehensive sets of alignments.
- User-specifiable and controllable highlighting.
- Steady display keeps aligned regions in the same screen location, so that the parts of the alignments that are relatively invariant appear that way.
- Works with DNA and protein sequences.
- Allows investigators to use their expertise instead of relying on algorithms alone.
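
One way to quantify which parts of a set of alternative alignments are relatively invariant is to count how often each aligned residue pair recurs across the set. The Python sketch below illustrates that idea only; it is not the prototype's implementation. The function names (aligned_pairs, pair_frequencies, shade), the use of "-" as the gap character, the number of shade levels, and the assumption that every alignment row starts at the first residue of its sequence are all hypothetical choices made for the example.

```python
from collections import Counter


def aligned_pairs(row_a, row_b):
    """Return the set of (i, j) index pairs matched in one gapped alignment.

    row_a and row_b are equal-length strings using "-" for gaps; i and j are
    0-based positions in the ungapped sequences (assumed to start at 0).
    """
    i = j = 0
    pairs = set()
    for a, b in zip(row_a, row_b):
        if a != "-" and b != "-":
            pairs.add((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


def pair_frequencies(alignments):
    """Return, for each aligned pair, the fraction of alignments containing it."""
    counts = Counter()
    for row_a, row_b in alignments:
        counts.update(aligned_pairs(row_a, row_b))
    total = len(alignments)
    return {pair: count / total for pair, count in counts.items()}


def shade(freq, levels=4):
    """Map a frequency in (0, 1] to a shade index.

    0 means the pair is present in every alignment; levels - 1 means it is
    among the most variable. The number of levels is an arbitrary choice.
    """
    return min(levels - 1, int((1.0 - freq) * levels))


# Two hypothetical alternative alignments of ACGT against AGT.
alignments = [("ACGT", "A-GT"), ("ACGT", "AG-T")]
for pair, freq in sorted(pair_frequencies(alignments).items()):
    print(pair, freq, shade(freq))
```

In this toy example the pairs (0, 0) and (3, 2) occur in both alignments and receive shade 0, while (1, 1) and (2, 1) each occur in only one and receive a more variable shade. A display could map such shade indices to colors of increasing intensity.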
- Followed a user-centered design process in developing the interface.
- System prototype developed using C++, Java and Perl.
- Can fetch sequences from NCBI, accept uploaded sequences, or accept pasted sequences.
- Will integrate annotation information from external databases (NCBI, SwissProt, etc.).
- Evaluation in progress.

[Figure: screenshot of the prototype interface, with callouts: movie controls; shades of orange indicating variation in different regions of the alignment; display option selection; different highlights, including user-configurable highlights; sequence and alignment information; alignment generation information.]

Prototype available at: www.sys.virginia.edu/hci/research.html

5. Acknowledgements

This work was supported by the University of Virginia Biotechnology Training Program (sponsored by NIH).