Centre for Integrative Bioinformatics VU
Walter Pirovano, 24 Oct 2007
Genome analysis
Lecture 11: literature discussion
[2] Papers
  - Consensus sequences improve PSI-BLAST through mimicking profile-profile alignments. Dariusz Przybylski and Burkhard Rost. Nucleic Acids Research, 2007.
  - Heads or Tails: A Simple Reliability Check for Multiple Sequence Alignments. Giddy Landan and Dan Graur. Molecular Biology and Evolution, 2007.
[3] 1st paper
[4] BLAST and PSI-BLAST
  - BLAST is a sequence-sequence method: Sequence (query) – Sequence (nr database)
  - PSI-BLAST is a profile-sequence method:
      RUN 1: just like normal BLAST
      RUN 2: Profile (query) – Sequence (nr database)
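To make the difference between the two runs concrete, here is a minimal sketch of profile-sequence scoring. It is not PSI-BLAST's actual procedure: the toy hits, the uniform background, the flat pseudocount and the ungapped scoring are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Return one log-odds column (dict residue -> score) per alignment position."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies
    profile = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score_against_profile(profile, seq):
    """Ungapped profile-sequence score: sum the column score of each residue."""
    if len(seq) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(col[res] for col, res in zip(profile, seq))


# Toy stand-in for the hits collected in run 1 (hypothetical data).
hits = ["FATNMGTSDPPT", "FVTNMNNSDGPT"]
profile = build_profile(hits)

# Run 2 idea: database sequences are scored against the profile, not the query.
related_score = score_against_profile(profile, "FLTNMSASDAPT")
unrelated_score = score_against_profile(profile, "WWWWWWWWWWWW")
print(related_score > unrelated_score)
```

A position-specific profile rewards residues that were seen at that position in earlier hits, which is what lets run 2 pick up relatives that a single query sequence would miss.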
[5] Accuracy vs. Speed: the usual dilemma …
  [Figure: Sequence – Sequence, Profile – Sequence and Profile – Profile methods placed along ACCURACY and SPEED axes]
[6] Consensus sequences - 1
  - “1-D simplification of the sequence profile”
  - Compromise between accuracy and speed
  [Figure: profile with one column per amino acid (A, C, D, .., Y)]

  Sequence 1   F A T N M G T S D P P T
  Sequence 2   F V T N M N N S D G P T
  Consensus    F * T N M * * S D * P T
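The toy example on this slide can be reproduced in a few lines. This sketch only follows the slide's convention (keep a residue where all sequences agree, write `*` elsewhere); how the paper itself derives consensus residues from the profile is not shown here, and the function name `consensus` is hypothetical.

```python
def consensus(aligned_seqs, wildcard="*"):
    """Keep a residue where every sequence agrees; otherwise emit the wildcard."""
    lengths = {len(seq) for seq in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else wildcard
        for column in zip(*aligned_seqs)
    )


result = consensus(["FATNMGTSDPPT", "FVTNMNNSDGPT"])
print(result)  # F*TNM**SD*PT
```

The result is a single string, so it can be fed to any sequence-based search tool, which is where the speed advantage over a full profile comes from.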
[7] Consensus sequences - 2
  How can we display consensus sequences?
  - Replace the complete sequence by the consensus sequence (100%)
  - Replace only local parts by consensus segments (top 50% & low 50%)
  Tests on:
  - Sequence – Consensus
  - Consensus – Consensus
  - Profile – Consensus
[8] Method
[9] Evaluation of results
  - Ability to identify functionally related proteins
  - Correctly align them based on structural alignments
  - Function is more conserved than Structure
[10] Functional evaluation: SCOP
  [Figure: SCOP hierarchy levels: classes, folds, superfamilies, families]
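SCOP identifiers of the form `a.1.1.2` encode class, fold, superfamily and family in that order, so the deepest level two domains share can be read off by comparing the identifiers from the left. The sketch below assumes such identifiers as input; the function name is hypothetical, and which level the paper counts as "related" is not stated on the slide.

```python
LEVELS = ("class", "fold", "superfamily", "family")


def deepest_shared_level(sccs_a, sccs_b):
    """Return the deepest SCOP level shared by two identifiers, or None."""
    shared = None
    for level, part_a, part_b in zip(LEVELS, sccs_a.split("."), sccs_b.split(".")):
        if part_a != part_b:
            break  # levels are nested, so stop at the first mismatch
        shared = level
    return shared


print(deepest_shared_level("a.1.1.2", "a.1.1.3"))  # superfamily
print(deepest_shared_level("a.1.1.2", "a.2.1.1"))  # class
print(deepest_shared_level("a.1.1.1", "b.1.1.1"))  # None
```

A search hit can then be counted as correct or incorrect depending on whether the shared level reaches the threshold chosen for the evaluation.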
[11] Structural evaluation: 3D model quality
  [Figure: query and template sequences, residue labels MAGFWILMLGKSLL]
  - Making the model: simply copy coordinates
  - Test model quality through LGA superposition (query model with query structure)
  - ‘Gold standard’: structural alignment of the known structures of query & template with MAMMOTH
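"Simply copy coordinates" can be sketched as follows, assuming the coordinates are taken from the template at every aligned, ungapped position. The input format (two gapped strings plus one coordinate tuple per template residue) and the function name are assumptions for illustration; the superposition step with LGA is not covered.

```python
def build_model(query_aln, template_aln, template_coords):
    """Map query residue index -> template coordinate for aligned positions."""
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned strings must have the same length")
    model = {}
    q_idx = t_idx = 0  # running residue indices, gaps excluded
    for q_char, t_char in zip(query_aln, template_aln):
        if q_char != "-" and t_char != "-":
            model[q_idx] = template_coords[t_idx]
        if q_char != "-":
            q_idx += 1
        if t_char != "-":
            t_idx += 1
    return model


# Hypothetical template with five residues (M, G, K, F, W) and made-up coordinates.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
model = build_model("MAG-FW", "M-GKFW", coords)
print(sorted(model))  # [0, 2, 3, 4]
```

Query residue 1 (the A) faces a gap in the template and therefore gets no coordinates, so a poor alignment directly produces a poor or incomplete model. That is why model quality works as a measure of alignment quality.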
[12] Final sets for alignment test
  - Set 1: most related, non-trivial pairs (no. = 1647)
  - Set 2: more difficult, most diverged (no. = 5551)
[13] Results functional analysis
[14] Results structural analysis
[15] 2nd paper
[16] There are quite a few multiple alignment methods ... PRALINE ... but what about accuracy?
[17] Benchmarking: usually on structural alignments
  - There are several alignment benchmarks, such as BAliBASE, HOMSTRAD or SABmark
  - But they can only tell us the alignment quality on their predefined sets
  - Alignment methods need to define quality and consistency criteria
[18] Heads-or-Tails method?
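The slide leaves the method open, so the following is only a sketch of the general idea as it is usually described: align the sequences as given ("heads"), align the reversed sequences ("tails"), and measure how far the two alignments agree. The alignments are supplied by hand here because no aligner is assumed, and the agreement measure (fraction of heads residue pairs also found in tails) and all function names are illustrative choices rather than the paper's exact scores.

```python
from itertools import combinations


def residue_pairs(msa):
    """Set of aligned residue pairs ((seq, pos), (seq, pos)) over all columns."""
    counters = [0] * len(msa)  # next residue index per sequence, gaps excluded
    pairs = set()
    for column in zip(*msa):
        present = []
        for seq_idx, char in enumerate(column):
            if char != "-":
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        pairs.update(combinations(present, 2))
    return pairs


def unreverse(tails_msa):
    """Flip each row so residues are indexed as in the original sequences."""
    return [row[::-1] for row in tails_msa]


def hot_agreement(heads_msa, tails_msa):
    """Fraction of heads residue pairs that the tails alignment reproduces."""
    heads = residue_pairs(heads_msa)
    tails = residue_pairs(unreverse(tails_msa))
    return len(heads & tails) / len(heads) if heads else 1.0


heads = ["AC-GT", "ACAGT"]                          # alignment of the original sequences
print(hot_agreement(heads, ["TG-CA", "TGACA"]))     # 1.0
print(hot_agreement(heads, ["TGC-A", "TGACA"]))     # 0.75
```

No reference alignment is needed: the score is computed from the sequences and the aligner alone, so it applies outside the predefined benchmark sets.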