Lecture: Foundations of Bioinformatics



Sequence alignment in molecular data analysis (M. Brudno):
Information from a single sequence alone
Multi-organism high-quality sequences

Tools for multiple sequence alignment
    seq1  T Y I M R E A Q Y E
    seq2  T C I V M R E A Y E
    seq3  Y I M Q E V Q Q E
    seq4  Y I A M R E Q Y E

Tools for multiple sequence alignment
    seq1  T Y I - M R E A Q Y E
    seq2  T C I V M R E A - Y E
    seq3  Y - I - M Q E V Q Q E
    seq4  Y - I A M R E - Q Y E
Functionally important regions are more conserved than non-functional regions.
Local sequence conservation indicates functionality!

Tools for multiple sequence alignment
Astronomical number of possible alignments! For example:
    seq1  T Y I - M R E A Q Y E
    seq2  T C I V M R E A - Y E
    seq3  - Y I - M Q E V Q Q E
    seq4  Y - I A M R E - Q Y E
or:
    seq1  T Y I - M R E A Q Y E
    seq2  T C I V - M R E A Y E
    seq3  - Y I - M Q E V Q Q E
    seq4  Y - I A M R E - Q Y E
Which one is the best?
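To make "astronomical" concrete, the following minimal sketch counts the possible alignments of just two sequences. It assumes that every alignment column is either a residue pair, a gap in the first sequence, or a gap in the second, which gives the Delannoy recurrence; alignments that differ only in the order of adjacent gap columns are counted separately. The function name `count_pairwise_alignments` is an illustrative choice, not something from the lecture.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_pairwise_alignments(m, n):
    """Number of gapped alignments of two sequences with m and n residues."""
    if m == 0 or n == 0:
        # Only one possibility left: align the remaining residues to gaps.
        return 1
    return (
        count_pairwise_alignments(m - 1, n - 1)  # last column: residue pair
        + count_pairwise_alignments(m - 1, n)    # last column: gap in sequence 2
        + count_pairwise_alignments(m, n - 1)    # last column: gap in sequence 1
    )


if __name__ == "__main__":
    # Two sequences as short as seq1 and seq2 above (10 residues each).
    print(count_pairwise_alignments(10, 10))
    # The count grows exponentially with sequence length.
    print(count_pairwise_alignments(50, 50))
```

With four sequences instead of two, the number of column types grows from 3 to 15, so the count rises even faster.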

Tools for multiple sequence alignment
Questions in the development of alignment programs:
(1) What is a good alignment? This requires an objective function ("score").
(2) How to find a good alignment? This requires an optimization algorithm.
The first question is far more important!

Tools for multiple sequence alignment Most important scoring scheme for multiple alignment: Sum-of-pairs score for global alignment.
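As a minimal sketch of the sum-of-pairs idea: the score of a multiple alignment is the sum, over all columns and all pairs of rows, of a pairwise score. The match, mismatch and gap values below are arbitrary assumptions chosen for illustration (real programs use substitution matrices and usually affine gap costs), and the function names are hypothetical.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols from the same alignment column."""
    if a == "-" and b == "-":
        return 0  # gap against gap carries no information
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment, **scores):
    """Sum-of-pairs score of a multiple alignment given as equal-length rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    return sum(
        pair_score(a, b, **scores)
        for column in zip(*alignment)
        for a, b in combinations(column, 2)
    )


if __name__ == "__main__":
    # The four-sequence example alignment from the slides above.
    rows = ["TYI-MREAQYE", "TCIVMREA-YE", "Y-I-MQEVQQE", "Y-IAMRE-QYE"]
    print(sum_of_pairs(rows))
```

Changing the `gap` value changes which of two candidate alignments scores higher, which is one reason results of this scoring scheme depend on the gap penalty.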

Divide-and-Conquer Alignment (DCA)
J. Stoye, A. Dress (Bielefeld)
Approximates the optimal global multiple alignment:
Divide the sequences into small subsequences
Use MSA to calculate the optimal alignment for the subsequences
Concatenate the sub-alignments

Divide-and-Conquer Alignment (DCA)

Tools for multiple sequence alignment
Problems with the traditional approach:
Results depend on the gap penalty
A heuristic guide tree determines the alignment, yet the alignment is then used for phylogeny reconstruction
The algorithm produces global alignments

First step in sequence comparison: alignment
global alignment (Needleman and Wunsch, 1970; Clustal W)
    atc--taatagttaat--actcgtccaagtat
    |||  || || | ||   ||| || | | ||
    atctgtattact-aaacaactggtgctacta-
local alignment (Smith and Waterman, 1981)
    atc--taatagttaatactcgtccaagtat
         || || | ||
    gcgtgtattact-aaacggttcaatctaacat
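The difference between the two alignment types can be sketched with one dynamic-programming routine. This is a simplified illustration under assumed parameters (match +1, mismatch -1, linear gap cost -1), not the scoring used by any particular program. In the local variant, cell values are never allowed to drop below zero and the best cell anywhere in the matrix is reported.

```python
def alignment_score(s, t, local=False, match=1, mismatch=-1, gap=-1):
    """Best global (Needleman-Wunsch) or local (Smith-Waterman) score."""
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [0 if local else j * gap for j in range(len(t) + 1)]
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(t) + 1):
            diagonal = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cell = max(diagonal, prev[j] + gap, curr[j - 1] + gap)
            if local:
                cell = max(cell, 0)     # a local alignment may start anywhere
                best = max(best, cell)  # and may end anywhere
            curr.append(cell)
        prev = curr
    return best if local else prev[-1]


if __name__ == "__main__":
    a = "atctaatagttaatactcgtccaagtat"
    b = "gcgtgtattactaaacggttcaatctaacat"
    print("global:", alignment_score(a, b))
    print("local: ", alignment_score(a, b, local=True))
```

Only scores are returned here; recovering the alignment itself would require storing the full matrix and tracing back from the final (global) or best (local) cell.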

New question: sequence families with multiple local similarities. Neither local nor global methods are applicable.

New question: sequence families with multiple local similarities Alignment possible if order conserved

The DIALIGN approach
Morgenstern, Dress, Werner (1996), PNAS 93
Combination of global and local methods
Assemble the multiple alignment from gap-free local pair-wise alignments ("fragments")

The DIALIGN approach
Input sequences:
    atctaatagttaaactcccccgtgcttag
    cagtgcgtgtattactaacggttcaatcgcg
    caaagagtatcacccctgaattgaataa
Alignment assembled fragment by fragment (residues in aligned fragments shown in upper case):
    atc------TAATAGTTAaactccccCGTGC-TTag
    cagtgcGTGTATTACTAAc          GG-TTCAATcgcg
    caaa--GAGTATCAcc          CCTGaaTTGAATaa
Consistency!

The DIALIGN approach
Score of an alignment: define the score of a fragment f:
l(f) = length of f
s(f) = sum of matches (similarity values) in f
P(f) = probability of finding a fragment with length l(f) and at least s(f) matches in random sequences that have the same length as the input sequences
Score: w(f) = -ln P(f)
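The weight w(f) = -ln P(f) can be illustrated with a deliberately simplified probability model. The sketch below assumes DNA with independent positions and a match probability of 0.25, and it takes P(f) to be the binomial probability of at least s(f) matches in one gap-free segment pair of length l(f). It ignores the dependence on the lengths of the input sequences that the definition above includes, so the numbers are illustrative only; `fragment_weight` and `p_match` are hypothetical names.

```python
from math import comb, log


def fragment_weight(length, matches, p_match=0.25):
    """-ln of the binomial probability of at least `matches` matches."""
    tail = sum(
        comb(length, k) * p_match**k * (1 - p_match) ** (length - k)
        for k in range(matches, length + 1)
    )
    return -log(tail)


if __name__ == "__main__":
    # Longer and better-matching fragments are less likely by chance,
    # so they receive a higher weight.
    for length, matches in [(5, 5), (10, 8), (10, 10), (20, 18)]:
        print(length, matches, round(fragment_weight(length, matches), 2))
```

Because the weight depends only on how surprising a fragment is, short or poorly matching fragments contribute almost nothing, which is what allows the approach to work without a gap penalty.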

The DIALIGN approach
Score of an alignment: the sum of the scores of the fragments involved. No gap penalty!

The DIALIGN approach
Goal in the fragment-based alignment approach: find a consistent collection of fragments with maximum sum of weight scores.

The DIALIGN approach
Pair-wise alignment:
    atctaatagttaaaccccctcgtgcttagagatccaaac
    cagtgcgtgtattactaacggttcaatcgcgcacatccgc
A recursive algorithm finds the optimal chain of fragments:
    atctaatagttaaaccccctcgtgcttag agatccaaac
    cagtgcgtgtattactaac ggttcaatcgcgcacatccgc--
Optimal pairwise alignment: the chain of fragments with maximum sum of weights, found by dynamic programming:
Standard fragment-chaining algorithm
Space-efficient algorithm
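A minimal sketch of fragment chaining by dynamic programming, assuming the fragments and their weights are already known. Each fragment is a tuple `(start1, start2, length, weight)`, a representation chosen here for illustration. A fragment may follow another in a chain only if it starts after the other ends in both sequences. This is the plain quadratic-time version, not the space-efficient algorithm mentioned above.

```python
def best_chain(fragments):
    """Return (total_weight, chain) for the heaviest chain of fragments."""
    # Sort by end position in sequence 1 so that every possible
    # predecessor of a fragment appears before it in the list.
    frags = sorted(fragments, key=lambda f: (f[0] + f[2], f[1] + f[2]))
    if not frags:
        return 0.0, []
    best = []  # best[i]: weight of the heaviest chain ending with frags[i]
    back = []  # back[i]: index of the predecessor in that chain, or None
    for i, (start1, start2, _, weight) in enumerate(frags):
        score, prev = weight, None
        for j in range(i):
            p1, p2, plen, _ = frags[j]
            fits = p1 + plen <= start1 and p2 + plen <= start2
            if fits and best[j] + weight > score:
                score, prev = best[j] + weight, j
        best.append(score)
        back.append(prev)
    # Trace back from the best chain end.
    i = max(range(len(frags)), key=best.__getitem__)
    total, chain = best[i], []
    while i is not None:
        chain.append(frags[i])
        i = back[i]
    return total, chain[::-1]


if __name__ == "__main__":
    candidates = [(0, 0, 3, 2.0), (4, 5, 2, 1.5), (2, 7, 3, 3.0), (6, 1, 2, 3.0)]
    print(best_chain(candidates))
```

In the example, the two heaviest fragments each overlap or cross the others, so the best chain combines the two lighter but compatible fragments.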

The DIALIGN approach
Multiple alignment:
    atctaatagttaaactcccccgtgcttag
    cagtgcgtgtattactaacggttcaatcgcg
    caaagagtatcacccctgaattgaataa
(1) Calculate all optimal pair-wise alignments

The DIALIGN approach
Fragments from optimal pair-wise alignments might be inconsistent:
    atctaatagttaaactcccccgtgcttag
    cagtgcgtgtattactaacggttcaatcgcg
    caaagagtatcacccctgaattgaataa
1. Sort fragments according to their scores
2. Include them one by one into the growing multiple alignment, as long as they are consistent (greedy algorithm, comparable to the knapsack problem)
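The greedy step can be sketched as follows, under the assumption that "consistent" means a single alignment exists that contains all accepted fragments. The check below is a brute-force illustration, not the efficient bookkeeping used by the program: it merges aligned sites into columns with union-find, rejects any column that holds two positions of one sequence, and then tests whether the columns can be linearly ordered. The tuple layout `(seq_a, start_a, seq_b, start_b, length)` and both function names are hypothetical.

```python
def is_consistent(fragments):
    """True if a single alignment can contain all given fragments."""
    parent = {}

    def find(site):
        parent.setdefault(site, site)
        while parent[site] != site:
            parent[site] = parent[parent[site]]  # path halving
            site = parent[site]
        return site

    # Merge aligned sites into columns (transitive closure via union-find).
    for seq_a, start_a, seq_b, start_b, length in fragments:
        for k in range(length):
            root_a = find((seq_a, start_a + k))
            root_b = find((seq_b, start_b + k))
            parent[root_a] = root_b

    # Within each sequence, columns must follow one another in position order.
    sites_by_seq = {}
    for site in list(parent):
        sites_by_seq.setdefault(site[0], []).append(site)
    successors = {find(site): set() for site in list(parent)}
    for sites in sites_by_seq.values():
        columns = [find(site) for site in sorted(sites)]
        if len(set(columns)) != len(columns):
            return False  # two positions of one sequence in the same column
        for before, after in zip(columns, columns[1:]):
            successors[before].add(after)

    # The columns can be ordered iff this graph has no cycle (Kahn's algorithm).
    indegree = {col: 0 for col in successors}
    for targets in successors.values():
        for col in targets:
            indegree[col] += 1
    ready = [col for col, deg in indegree.items() if deg == 0]
    ordered = 0
    while ready:
        col = ready.pop()
        ordered += 1
        for nxt in successors[col]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return ordered == len(successors)


def greedy_assemble(weighted_fragments):
    """Accept (weight, fragment) pairs by decreasing weight while consistent."""
    accepted = []
    for _, fragment in sorted(weighted_fragments, key=lambda wf: wf[0], reverse=True):
        if is_consistent(accepted + [fragment]):
            accepted.append(fragment)
    return accepted


if __name__ == "__main__":
    a = (0, 0, 1, 0, 3)  # sequence 0, positions 0-2, with sequence 1, positions 0-2
    b = (1, 0, 2, 5, 3)
    c = (2, 5, 0, 4, 2)  # conflicts with a and b through transitivity
    d = (0, 5, 2, 9, 2)
    print(greedy_assemble([(5.0, a), (4.0, b), (3.0, c), (2.0, d)]))
```

Re-checking the whole fragment set for every candidate is far too slow for real data, but it shows why a fragment can be rejected even though it does not directly overlap any accepted fragment.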

The DIALIGN approach
Consistency problem:
    atctaatagttaaactcccccgtgcttag
    cagtgcgtgtattactaacggttcaatcgcg
    caaagagtatcacccctgaattgaataa

The DIALIGN approach
Upper and lower bounds for alignable positions:
    atctaatagttaaactcccccgtgcttag
    cagtgcgtgtattactaacggttcaatcgcg
    caaagagtatcacccctgaattgaataa
Site x = [i,p] (sequence i, position p)
Calculate the lower bound b_l(x,i) and the upper bound b_u(x,i) for each site x and each sequence i
b_l(x,i) and b_u(x,i) are updated for each new fragment in the alignment

The DIALIGN approach
Consistency bounds have to be updated for each new fragment that is included into the growing alignment
Efficient algorithm (Abdeddaim and Morgenstern, 2002)

The DIALIGN approach
Advantages of the segment-based approach:
The program can produce global and local alignments!
Sequence families can be aligned that cannot be aligned with standard methods

Program input
Program usage: > dialign2-2 [options] followed by the input file, a multi-sequence file in FASTA format

Program output

DIALIGN
*************
Program code written by Burkhard Morgenstern and Said Abdeddaim
contact:
Published research assisted by DIALIGN 2 should cite:
Burkhard Morgenstern (1999). DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15,
For more information, please visit the DIALIGN home page at
program call: ./dialign2-2 -nt -anc s
Aligned sequences:   length:
==================   =======
1) dog_il
) bla 200
3) blu 200
Average seq. length:
Please note that only upper-case letters are considered to be aligned.

Program output
Alignment (DIALIGN format):
===========================
dog_il4 1 cagg GTTTGA atctgataca ttgc
bla 1 ctga GC CAAGTGGGAA
blu 1 ttttgatatg agaaGTGTGA aacaagctat cctatattGC TAAGTGGCAG

dog_il ATGGCACT GGGGTGAATG AGGCAGGCAG CAGAATGATC
bla 17 ggtgtgaata catgggtttc cagtaccttc tgaggtccag agtacc----
blu 51 ccctggcttt ctATGTGCAC AGAATGGGAG GAAAGTGCCT GCTAGTGAGC

dog_il4 63 GTACTGCAGC CCTGAGCTTC CACTGGCCCA TGTTGGTATC CTTGTATTTT
bla TTTCCCA TGTGCTCCAT GGTGGAATGG
blu 101 CAGGGACTCA GAGAGAATGG AGTATAGGGG TCAGGGCat

dog_il4 113 TCCGCCCCTT CCCAGCACca gcattatcct ---GGGATTG GAGAAGGGGG
bla 90 ACCACTCCTT CTCAGCACaa caaagcccaa gaaGGTGTTG CGTTCTAGAC
blu GGGGTGG CCTTAGGCTC


Alignment of large genomic sequences
The fragment-based alignment approach is useful for the alignment of genomic sequences. Possible applications:
Detection of regulatory elements
Identification of pathogenic microorganisms
Gene prediction

DIALIGN alignment of human and murine genomic sequences

DIALIGN alignment of tomato and Arabidopsis thaliana genomic sequences

Alignment of large genomic sequences
Gene-regulatory sites identified by multiple sequence alignment (phylogenetic footprinting)

Alignment of large genomic sequences

Performance of long-range alignment programs for exon discovery (human - mouse comparison)

Performance of long-range alignment programs for exon discovery (A. thaliana - tomato comparison)