Published by Osborne Gregory. Modified over 9 years ago.
1
Orthology & Paralogy; Alignment & Assembly. Alastair Kerr, Ph.D., WTCCB Bioinformatics Core [many slides borrowed from various sources]
2
Overview. Orthology & Paralogy: definitions and examples; ways to determine an ortholog; pre-calculations: resources. Alignment & Assembly: differences; key programs for each; Jalview example.
3
Homologs: have common origins but may or may not have common activity. Homologous or not? Often decided by an arbitrary similarity threshold applied to an alignment.
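To make the "arbitrary threshold" point concrete, here is a minimal sketch that computes percent identity from two already-aligned sequences and compares it with a cut-off. The function names and the 30% value are assumptions chosen for illustration, not a recommended standard; real pipelines usually rely on alignment statistics such as E-values rather than raw identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical cut-off: an arbitrary value, exactly the kind the slide warns about.
IDENTITY_THRESHOLD = 30.0


def looks_homologous(aligned_a: str, aligned_b: str,
                     threshold: float = IDENTITY_THRESHOLD) -> bool:
    """Call two aligned sequences 'homologous' if identity reaches the cut-off."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACGT-A", "ACCTTA"))
print(looks_homologous("ACGT-A", "ACCTTA"))
```

Moving the threshold changes the answer for borderline pairs, which is why the homologous-or-not call is a judgement rather than a measurement.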
4
Homologs: …have common ancestry, but the way they are related can vary (i.e. the reasons they have diverged into different sequences can vary). Orthologs: homologs produced by speciation; they tend to have similar function. Paralogs: homologs produced by gene duplication; they tend to have differing functions.
5
Orthologous or paralogous homologs. [Figure: globin gene tree. An early globin gene undergoes gene duplication into an α-chain gene and a β-chain gene; each is shown for mouse, human and cattle. Labels: orthologs (α), orthologs (β), paralogs (cattle), homologs.] Orthologs: diverged after speciation; tend to have similar function. Paralogs: diverged after gene duplication; some functional divergence occurs. Therefore, for linking similar genes between species, or performing "annotation transfer", identify orthologs.
6
True or False? A1x is the ortholog in species x of A1y? A1x is a paralog of A2x? A1x is a paralog of A2y?
7
Identifying Gene/Protein Relationships from Phylogenies. Orthologs: homologs produced by speciation; gene phylogeny matches organismal phylogeny. Paralogs: homologs produced by gene duplication; multiple copies of homologs in a given species, or evidence from phylogenetic analysis that gene duplication was involved; lack of match to organismal phylogeny.
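One simple way to apply these rules to a gene tree is the "species overlap" heuristic: an internal node whose two child clades share a species is treated as a duplication, otherwise as a speciation. The sketch below assumes a rooted binary tree written as nested tuples and gene names of the form `<species>_<gene>`; both conventions, and the globin tree itself, are illustrative assumptions rather than anything taken from the slides.

```python
def leaves(tree):
    """Return the gene names at the tips of a nested-tuple gene tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def species_of(gene):
    """Assumed naming convention: '<species>_<gene>'."""
    return gene.split("_", 1)[0]


def classify_pairs(tree, result=None):
    """Label each gene pair by the node where the two genes last shared an ancestor."""
    if result is None:
        result = {}
    if isinstance(tree, str):
        return result
    left, right = tree
    left_genes, right_genes = leaves(left), leaves(right)
    shared = {species_of(g) for g in left_genes} & {species_of(g) for g in right_genes}
    # Species on both sides of the split imply a duplication at this node.
    label = "paralogs" if shared else "orthologs"
    for a in left_genes:
        for b in right_genes:
            result[frozenset((a, b))] = label
    classify_pairs(left, result)
    classify_pairs(right, result)
    return result


# Illustrative tree: an ancestral duplication into alpha and beta copies,
# followed by speciation within each copy.
globin_tree = (
    (("human_alpha", "mouse_alpha"), "cattle_alpha"),
    (("human_beta", "mouse_beta"), "cattle_beta"),
)
relationships = classify_pairs(globin_tree)
for pair, label in sorted((sorted(p), l) for p, l in relationships.items()):
    print(pair, label)
```

Note that under this heuristic an alpha gene in one species and a beta gene in another are still paralogs, because their last common ancestor is the duplication node.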
8
Gene Orthology: How to detect? Most common approach: identify reciprocal best BLAST hits (EGO, COGs, …). Example problem: when comparing human and bovine, for example, the bovine gene dataset is still quite incomplete. The current best hit may therefore be a paralog, with the true ortholog not yet sequenced. [Figure labels: cattle, human, cattle, mouse.]
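The reciprocal-best-hit idea can be sketched in a few lines. The example assumes the BLAST results have already been reduced to a dictionary of `(query, subject) -> bit score`; the gene labels and scores are invented for illustration and mimic the incomplete-dataset problem, with the cattle β-globin gene missing.

```python
def best_hits(scores):
    """scores: {(query, subject): bit_score}. Return {query: best-scoring subject}."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Made-up scores: only the cattle alpha gene has been sequenced.
human_vs_cattle = {
    ("human_alpha", "cattle_alpha"): 250.0,
    ("human_beta", "cattle_alpha"): 110.0,  # best available hit is a paralog
}
cattle_vs_human = {
    ("cattle_alpha", "human_alpha"): 250.0,
    ("cattle_alpha", "human_beta"): 110.0,
}

print(best_hits(human_vs_cattle))
print(reciprocal_best_hits(human_vs_cattle, cattle_vs_human))
```

Here the one-way best hit for the human β gene is a paralog, and the reciprocal check rejects it. The check only helps when the true ortholog of the competing gene is present; if both datasets are missing genes, a reciprocal best hit can still be a paralog.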
9
2 Forms in 1 Species. [Figure: tree with gene presence marked "+" four times.] Slides from Jonathan Eisen.
10
2 Forms in 1 Species: Gene Loss. [Figure: tree with gene presence marked "+"; annotations: "Gene duplicated in common ancestor" and "Loss".]
11
Unusual Distribution Pattern. [Figure: tree with gene presence marked "+" in two places.]
12
Unusual Distribution: Gene Loss. [Figure: tree with gene presence marked "+" in two places; annotations: "Gene present in ancestor" and "Gene lost here".]
13
Unusual Distribution: Evolutionary Rate Variation. [Figure: tree with gene presence marked "+" in two places and "-?" in another; annotation: "Gene too diverged to be found".]
14
Ortholog guess via synteny. [Figure: genes A, B and C in one genome; A, C and an unknown gene "?" in the other.]
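A minimal sketch of the synteny guess: if an unassigned gene sits next to genes whose orthologs are already known, the reference gene in the matching position is a candidate ortholog. The function name, gene labels and the assumption that both genomes share the same gene orientation are all illustrative; real synteny tools handle inversions, gaps and larger blocks.

```python
def guess_orthologs_by_synteny(ref_order, target_order, known):
    """
    ref_order, target_order: gene names in chromosomal order.
    known: {target_gene: ref_gene} orthologs already established.
    Returns {target_gene: guessed_ref_gene} for unassigned target genes
    whose neighbours point to exactly one unassigned reference gene.
    Assumes the two regions are in the same orientation.
    """
    ref_index = {gene: i for i, gene in enumerate(ref_order)}
    assigned = set(known.values())
    guesses = {}
    for i, gene in enumerate(target_order):
        if gene in known:
            continue
        candidates = set()
        # Left neighbour: the candidate is the next gene in the reference.
        if i > 0 and known.get(target_order[i - 1]) in ref_index:
            j = ref_index[known[target_order[i - 1]]] + 1
            if j < len(ref_order):
                candidates.add(ref_order[j])
        # Right neighbour: the candidate is the previous gene in the reference.
        if i + 1 < len(target_order) and known.get(target_order[i + 1]) in ref_index:
            j = ref_index[known[target_order[i + 1]]] - 1
            if j >= 0:
                candidates.add(ref_order[j])
        candidates -= assigned
        if len(candidates) == 1:
            guesses[gene] = candidates.pop()
    return guesses


reference = ["A", "B", "C"]
target = ["a", "unknown", "c"]
print(guess_orthologs_by_synteny(reference, target, {"a": "A", "c": "C"}))
```

The result is only a guess: conserved gene order supports orthology but does not prove it, so it is best combined with sequence similarity.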
15
Syntenic blocks
16
Alignments and Assemblies. Alignment: ALL sequences from the SAME region; therefore can be useless for non-overlapping contigs; PCR probes/oligos; good for paralogs/orthologs; basis for phylogeny; more dissimilar sequences. Assembly: good for near-identical sequences. Read length: short read [next-gen sequencing]; long read [Sanger and 3rd-gen sequencing?]. Reference? De novo; guided [reference sequence].
17
Ensembl calculations: http://www.ensembl.org (demo)
18
OMA Browser: http://omabrowser.org (demo)
19
Alignment: implicit statement. The residues aligned in a column are each assumed to derive from the same residue in the last common ancestor [LCA]. Therefore it is OK to look only at conserved regions, or to mask non-conserved regions, especially for phylogeny.
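Keeping only conserved columns can be sketched as follows. The conservation score used here (fraction of sequences carrying the most common residue in a column) and the 0.75 cut-off are assumptions for illustration; dedicated trimming tools use more refined criteria.

```python
from collections import Counter


def column_conservation(column):
    """Fraction of sequences carrying the column's most common residue; gaps never count."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def keep_conserved_columns(sequences, min_conservation=0.75):
    """Drop alignment columns whose conservation falls below the cut-off."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i, column in enumerate(zip(*sequences))
        if column_conservation(column) >= min_conservation
    ]
    return ["".join(seq[i] for i in keep) for seq in sequences]


# Toy alignment: the third column is variable and contains a gap.
alignment = ["MKTAY", "MKSAY", "MRTAY", "MK-GY"]
print(keep_conserved_columns(alignment))
```

Removing columns shortens every sequence by the same amount, so the result is still a valid alignment that can be passed to a tree-building step.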
20
Alignment Tools. Faster but less accurate (some better with gaps): MUSCLE, ClustalW/X, MAFFT. Slow but more accurate: *-Coffee (T: original; 3D: uses PDB as guide (structural); M: uses multiple methods), ProbCons.
21
Alignment Edit Tools. NEVER use a word processor or Excel to edit alignments. Jalview (Java Alignment Viewer): good for editing; DAS capable.
22
[Diagram of Jalview capabilities. Data: sequences, alignments, trees (Newick), structures (PDB), annotation and features (Distributed Annotation System, GFF, Jalview features, Jalview annotation). "Standard" formats: FASTA, MSF, CLUSTAL, PILEUP, BLC, PFAM. Functions: multiple sequence alignment, secondary structure prediction, analysis (consensus, conservation & clustering), visualization. Figure generation: clickable HTML, images, line art.]
25
Jalview DAS Client Functionality. [Screenshot labels: DAS annotation servers; query matches ID to authority; map to local reference frame; mouse over for feature name, links and scores; group features by source; type == colour; highlight start-end; select specific sources; filtered list; add user-defined sources.]
26
Assemblers. Many free options; examples below. Long reads: STADEN (staden.sf.net). Next-gen sequencing: guided: Bowtie, Novoalign, MAQ; de novo: Velvet. 3rd-generation sequencing: ????
27
Post-Assembly Correction. Reads mapping to multiple places; PCR amplification prior to mapping. Tools and workflows available in our Galaxy platform (demo).
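Both issues can be illustrated with a small filter over mapped reads. The sketch assumes each alignment has been reduced to a `(read_id, chromosome, position, strand)` tuple, treats any read reported at more than one location as ambiguous, and treats reads with identical coordinates as likely PCR duplicates. The record layout and both rules are simplifying assumptions; production workflows work on SAM/BAM files with dedicated tools.

```python
from collections import Counter


def filter_alignments(alignments):
    """
    alignments: list of (read_id, chrom, pos, strand) tuples.
    1. Drop reads that map to more than one place.
    2. Keep only the first read at each (chrom, pos, strand), treating
       the rest as likely PCR duplicates.
    """
    hits_per_read = Counter(read_id for read_id, *_ in alignments)
    uniquely_mapped = [a for a in alignments if hits_per_read[a[0]] == 1]

    seen_positions = set()
    kept = []
    for read_id, chrom, pos, strand in uniquely_mapped:
        key = (chrom, pos, strand)
        if key in seen_positions:
            continue  # same coordinates as an earlier read
        seen_positions.add(key)
        kept.append((read_id, chrom, pos, strand))
    return kept


# Invented records for illustration.
mapped = [
    ("r1", "chr1", 100, "+"),
    ("r2", "chr1", 100, "+"),  # same coordinates as r1
    ("r3", "chr2", 50, "-"),
    ("r3", "chr5", 900, "+"),  # r3 maps to two places
    ("r4", "chr1", 250, "-"),
]
print(filter_alignments(mapped))
```

Whether to discard multi-mapped reads or assign them some other way depends on the experiment; dropping them is the most conservative choice.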