1 Orthology and paralogy A practical approach Searching the primaries Searching the secondaries Significance of database matches DB Web addresses Software.

Presentation transcript:

1 Orthology and paralogy: A practical approach. Searching the primaries. Searching the secondaries. Significance of database matches. DB Web addresses. Software Web addresses.

2 Why Search Databases? To find out whether a new DNA sequence is already deposited in the databanks. To find proteins homologous to a putative coding ORF.

3 Why Search Databases? To find similar non-coding DNA stretches in the database (for example, repeat elements or regulatory sequences). To locate false priming sites for a set of PCR oligonucleotides.
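As a rough illustration of the last point, a check for possible false priming sites can be sketched as a search for near-matches of an oligonucleotide on both strands of a target sequence. The function names, the mismatch threshold and the toy sequences below are assumptions made for this example. Real primer checks also weigh 3'-end matches and melting temperature, which this sketch ignores.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_priming_sites(primer, target, max_mismatches=1):
    """List (strand, start, mismatches) for every near-match of the primer.

    start is the 0-based position on the forward strand of target.
    "+" means the forward strand reads like the primer; "-" means it reads
    like the primer's reverse complement.
    """
    primer = primer.upper()
    target = target.upper()
    hits = []
    for strand, probe in (("+", primer), ("-", reverse_complement(primer))):
        for start in range(len(target) - len(probe) + 1):
            window = target[start:start + len(probe)]
            mismatches = sum(1 for a, b in zip(probe, window) if a != b)
            if mismatches <= max_mismatches:
                hits.append((strand, start, mismatches))
    return hits


# Toy data (hypothetical): one exact site on each strand.
sites = find_priming_sites("GACCGT", "AAGACCGTTTTACGGTCTT", max_mismatches=0)
print(sites)
```

A primer that reports more sites than the one intended is a candidate for redesign; raising max_mismatches makes the scan more conservative.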

4 What Databases Are Available? DNA (nucleotide sequences): The big databases: GenBank, EMBL, DDBJ and their weekly updates. These databases exchange information routinely. Genomic databases such as Human (GDB), Mouse (MGD), Yeast (SGD), etc. Special databases: ESTs (expressed sequence tags), STSs (sequence-tagged sites), EPD (eukaryotic promoter database), REPBASE (repetitive sequence database) and many others.

5 What Databases Are Available? Protein (amino acid sequences): The big databases are: Swiss-Prot (high level of annotation), PIR (protein identification resource). Translated databases like: SPTREMBL (translated EMBL), GenPept (translation of coding regions in GenBank). Special databases like: PDB (sequences derived from the 3D structures in the Brookhaven PDB).

6 Web Addresses – ch&DB=nucleotidehttp:// ch&DB=nucleotide – htmlhttp:// html –

7 Let us go

8 What is GenBank? http:// Overview.html GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences …

9 Access to GenBank ew.html GenBank is available for searching at NCBI via several methods. The GenBank database is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive DNA sequence information. Therefore, NCBI places no restrictions on the use or distribution of the GenBank data.
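One programmatic route, not mentioned on the slide, is NCBI's E-utilities service. The sketch below only builds a request URL for its efetch endpoint and does not contact the server. The helper name efetch_url is hypothetical, and the accession string is just an example value; the slides do not prescribe any of this.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, db="nuccore", rettype="fasta", retmode="text"):
    """Build (but do not send) an efetch request URL for one record.

    Hypothetical helper: db, rettype and retmode are passed through as
    query parameters; "nuccore" is the nucleotide collection.
    """
    query = urlencode({
        "db": db,
        "id": accession,
        "rettype": rettype,
        "retmode": retmode,
    })
    return f"{EFETCH_BASE}?{query}"


# Example accession chosen only for illustration.
url = efetch_url("U00096")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client; NCBI's usage guidelines on request rates apply to automated access.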

10 NCBI databases x.html Let us try a tutorial

11 Web Addresses – – – arrayexpress.htmlhttp:// arrayexpress.html

12 Homology and Analogy It is important to understand a concept that underpins sequence analysis: homology. The term homology is confounded and abused in the literature. Put simply, sequences are said to be homologous if they are related by divergence from a common ancestor.

13 What Is Homology? (from the Technion course) Similarity or likeness between properties in species. Before Darwin, homology was defined morphologically. Example:

14 Homology Bats and butterflies fly, but are different. Bats fly and whales swim, yet the bones in a bat's wing and a whale's flipper are strikingly alike. Bat wings and butterfly wings are not homologous. Bat wings and whale flippers are homologous.

15 Homology Interpretation from Darwin to 21st Century Darwin (1859) explained homology as the result of descent with modification from a common ancestor. Modern genetics: Homology information is in the genes. Two sequences are homologous if they are both similar and have a common ancestor.

16 When Does Similarity Imply Homology? Similarity by itself is not enough: for example, similarity between short sequences can arise by chance, without common ancestry. Sufficiently extensive similarity typically implies homology (usually we have no direct evidence of descent). Sequence similarity comes with a significance measure.
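The slide does not say which significance measure is meant. One widely used measure is the expectation value (E-value) reported by BLAST, based on Karlin-Altschul statistics: E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length and n the database length. The sketch below assumes that model; the values used for lambda and K are illustrative only, since in practice they depend on the scoring matrix and gap penalties.

```python
import math


def expected_hits(raw_score, query_len, db_len, lam, k):
    """Karlin-Altschul E-value: chance hits expected with score >= raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam, k):
    """Normalised score in bits, so that E = m * n * 2 ** (-bits)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Illustrative parameters (assumptions, not taken from the slides).
LAM, K = 0.318, 0.134
QUERY_LEN, DB_LEN = 250, 1_000_000

for score in (40, 80):
    e_value = expected_hits(score, QUERY_LEN, DB_LEN, LAM, K)
    bits = bit_score(score, LAM, K)
    print(f"raw score {score}: {bits:.1f} bits, E = {e_value:.3g}")
```

The E-value falls exponentially as the score rises and grows linearly with database size, which is why the same alignment can be significant in a small database and unremarkable in a large one.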

17 Homology and Analogy Understanding homology allows us to appreciate the concept of analogy; this is encountered in protein structures that share similar folds but have no demonstrable sequence similarity; or that share groups of catalytic residues with almost exactly equivalent spatial geometries, but otherwise have neither sequence nor structural similarity. Such relationships are thought to result from convergence to similar biological solutions from different evolutionary starting points.

18 Homology and Analogy The essence of sequence analysis is the inference of homology. Homology is not a measure of similarity, but an absolute statement that sequences have a divergent rather than a convergent relationship. Thus, phrases that quantify homology are meaningless.
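What can be quantified is similarity or identity, not homology. As a minimal sketch, percent identity over an existing pairwise alignment might be computed as follows. The function name and the toy alignment are assumptions, and the choice to ignore gap columns is only one of several conventions in use for the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (hypothetical): 6 identical residues out of 7 compared columns.
identity = percent_identity("ACDEFG-H", "ACDQFGKH")
print(f"{identity:.1f}% identity")
```

A statement such as "the sequences are 86% identical" is meaningful; whether they are homologous is a separate inference drawn from that evidence.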

19 Orthology and Paralogy Homologous proteins may perform the same function in different species (orthologues) or different but related functions within one organism (paralogues). Comparison of orthologues allows study of molecular palaeontology, while paralogues have provided deeper insights into the underlying mechanisms of evolution.

20 Orthology and Paralogy Paralogues arose from single genes via successive duplication events. The duplicated genes followed separate evolutionary pathways, and new specificities evolved through variation and adaptation.
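In practice, candidate orthologues between two genomes are often proposed with a reciprocal-best-hit heuristic: two genes are paired when each is the other's highest-scoring match. The slides do not describe this procedure, so the sketch below is an illustration under that assumption, with hypothetical gene names and scores. A reciprocal best hit suggests orthology but does not prove it.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: dict mapping (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores (hypothetical): a1 and a2 are paralogues in genome A.
a_vs_b = {("a1", "b1"): 250, ("a1", "b2"): 90,
          ("a2", "b1"): 120, ("a2", "b2"): 80}
b_vs_a = {("b1", "a1"): 245, ("b1", "a2"): 118,
          ("b2", "a1"): 88, ("b2", "a2"): 79}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here a2 also matches b1 best, but b1 prefers a1, so only a1 and b1 are paired; a2 is left as a likely paralogue of a1.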

21 Complete genomes cgi?db=Genomehttp:// cgi?db=Genome Let us walk around among genomes

22 COGs Phylogenetic classification of proteins encoded in complete genomes. Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in 43 complete genomes, representing 30 major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain. Proteins from two eukaryotic genomes (Drosophila melanogaster and Caenorhabditis elegans) were assigned to COGs and can be reached from each individual COG page.

23 COGs Cognitor COG Help ml#tophttp:// ml#top »FTP ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Mycobacterium_leprae/