Draft sequencing and assembly of the genome of the world’s largest fish, the whale shark: Rhincodon typus Smith 1828 Timothy D. Read, Robert A. Petit III,


1 Draft sequencing and assembly of the genome of the world’s largest fish, the whale shark: Rhincodon typus Smith 1828
Timothy D. Read, Robert A. Petit III, Sandeep J. Joseph, Md. Tauqeer Alam, M. Ryan Weil, Maida Ahmad, Ravila Bhimani, Jocelyn S. Vuong, Chad P. Haase, D. Harry Webb, Milton Tan, Alistair D.M. Dove

2 Introduction
Largest extant species of fish, and largest non-mammalian vertebrate (largest confirmed individual about 42 ft long and 47,000 lbs).
Sole member of its genus (Rhincodon) and family (Rhincodontidae).
First appeared 60 million years ago.
Found in tropical oceans worldwide.
Usually pelagic.

3 Phylogeny
Sharks (Selachimorpha) are cartilaginous fish (more distantly related to tetrapods than bony fish are, but more closely related than jawless fish are).
Whale sharks are in the order Orectolobiformes (carpet sharks).
[Figure: phylogeny of Selachimorpha showing Squalimorphii, Lamniformes, Orectolobiformes, and Carcharhiniformes]

4 Phylogeny
[Figure: phylogeny within Selachimorpha showing the families Rhincodontidae, Orectolobidae, Hemiscylliidae, and Stegostomatidae]

5 Rationale
Endangered: a flagship species for marine conservation.
Few publications on genetics/genomics: population genetics (conflicting conclusions) and the mitochondrial genome (2014).
Chondrichthyes as a whole are important model species for comparative studies of human evolution.
First elasmobranch to have its complete nuclear genome published: the elephant shark (Callorhinchus milii), while a cartilaginous fish, is in the Holocephali (the other cartilaginous fish group, the chimaeras).

6 Methods
Tissue samples (liver and spleen) collected postmortem from a male whale shark at the Georgia Aquarium in 2007.
Sequenced with 454 and Illumina technologies.
Low-quality reads filtered out; remaining reads assembled using SOAPdenovo.
Assemblies built with different k-mer sizes for the de Bruijn graph.
Assembly statistics generated using a script from Assemblathon.
K-mer 63 was chosen because it produced the largest contig (86,048 bp) and an NG50 similar to those of the other top-scoring assemblies.
Final version: contigs below 200 bp excluded.
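The contig statistics named on this slide can be sketched as follows. This is a minimal illustration of the standard N50 and NG50 definitions, not the Assemblathon script itself; the function names `n50` and `filter_short` and the toy contig lengths are assumptions made for the example.

```python
def n50(lengths, total=None):
    """Return the length L such that sequences of length >= L together
    cover at least half of `total`.

    With total=None the summed assembly length is used (N50).
    Passing an estimated genome size as `total` gives NG50.
    """
    ordered = sorted(lengths, reverse=True)
    target = (sum(ordered) if total is None else total) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    return 0  # the sequences never reach half of `total` (possible for NG50)


def filter_short(lengths, min_len=200):
    """Drop contigs shorter than min_len, as done for the final assembly."""
    return [x for x in lengths if x >= min_len]


# Toy contig lengths in bp (hypothetical, not from the whale shark assembly).
contigs = [100, 150, 300, 400, 500, 900]
kept = filter_short(contigs)
contig_n50 = n50(kept)               # relative to the assembly length
contig_ng50 = n50(kept, total=4000)  # relative to an assumed genome size
```

Because NG50 is computed against a fixed genome size rather than each assembly's own length, it is the fairer number for comparing assemblies built with different k-mer sizes.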

7 Methods
Proteins predicted de novo on assembled contigs using AUGUSTUS.
BLASTP used to match proteins to the NCBI nr database.
KronaTools used to create taxonomic visualizations of the results.
Annotated using the INTERPRO profile database.
KOG (eukaryotic orthologous group) annotations: BLASTP against the KOG database.
Predicted whale shark proteome compared against proteomes from 11 other fishes.
All proteomes combined into one database and searched against itself with BLASTP.

8 Methods
Phylogenomics:
Core genes (protein-coding gene clusters shared by all genomes in the study).
MUSCLE used to align core genes; protein alignments filtered using GBLOCKS (removed gaps and highly divergent regions).
Core gene sequences concatenated.
Maximum likelihood tree inferred with RAxML.
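The core-gene selection and concatenation steps above can be sketched like this. It is a simplified illustration that assumes the clusters used for the tree are those with exactly one protein per genome; the function names, genome labels, and toy sequences are hypothetical, and the alignment and filtering done by MUSCLE and GBLOCKS are taken as already finished.

```python
from collections import Counter


def core_single_copy(clusters, genomes):
    """Return ids of clusters with exactly one protein from every genome.

    clusters: dict of cluster id -> list of (genome, protein id) pairs.
    """
    core = []
    for cluster_id, members in clusters.items():
        counts = Counter(genome for genome, _ in members)
        if set(counts) == set(genomes) and all(c == 1 for c in counts.values()):
            core.append(cluster_id)
    return sorted(core)


def concatenate(alignments, genomes):
    """Join per-cluster aligned sequences into one sequence per genome.

    alignments: dict of cluster id -> dict of genome -> aligned sequence.
    Clusters are joined in sorted id order so every genome uses the same order.
    """
    return {
        genome: "".join(alignments[cid][genome] for cid in sorted(alignments))
        for genome in genomes
    }


# Toy data (hypothetical genomes, proteins, and sequences).
genomes = ["whale_shark", "elephant_shark", "zebrafish"]
clusters = {
    "c1": [("whale_shark", "p1"), ("elephant_shark", "q1"), ("zebrafish", "r1")],
    "c2": [("whale_shark", "p2"), ("whale_shark", "p3"),
           ("elephant_shark", "q2"), ("zebrafish", "r2")],   # paralog, not single copy
    "c3": [("elephant_shark", "q3"), ("zebrafish", "r3")],   # missing in whale shark
    "c4": [("whale_shark", "p4"), ("elephant_shark", "q4"), ("zebrafish", "r4")],
}
core = core_single_copy(clusters, genomes)

alignments = {
    "c1": {"whale_shark": "MK-L", "elephant_shark": "MKAL", "zebrafish": "MRAL"},
    "c4": {"whale_shark": "GG", "elephant_shark": "GA", "zebrafish": "-A"},
}
supermatrix = concatenate(alignments, genomes)
```

The concatenated sequences would then be written out as a single alignment file for the tree-building step.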

9 Results
Genome assembly statistics:
Genome size: 3.44 Gbp
Coding DNA: 10,400,226 bp (0.41%)
DNA G+C: 1,059,229,091 bp (41.3%)
Number of scaffolds: 997,976
Scaffold N50 (bp): 5,425
Number of contigs: 1,213,000
Contig N50 (bp): 5,304
Protein coding genes: 19,384
Genes with function prediction: 5,380 (27.8%)
Genes assigned to KOGs: 7,038 (36.3%)
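As a quick arithmetic check, the percentages on this slide can be recomputed from the counts. The variable names are made up for the example. One observation, offered tentatively: the G+C and coding percentages appear to be relative to a total of roughly 2.56 Gbp (the length implied by the G+C count and its percentage), not to the 3.44 Gbp genome size, which may mean they refer to the assembled sequence.

```python
# Figures copied from the slide.
genes = 19_384
with_function = 5_380
with_kog = 7_038
coding_bp = 10_400_226
gc_bp = 1_059_229_091
gc_fraction = 0.413
genome_size_bp = 3.44e9

# Gene-level percentages.
pct_function = round(100 * with_function / genes, 1)
pct_kog = round(100 * with_kog / genes, 1)

# Total length implied by the G+C count and its stated percentage.
implied_total_bp = gc_bp / gc_fraction

# Coding fraction against the implied total and against the stated genome size.
pct_coding_implied = round(100 * coding_bp / implied_total_bp, 2)
pct_coding_genome = round(100 * coding_bp / genome_size_bp, 2)

print(pct_function, pct_kog, pct_coding_implied, pct_coding_genome)
```

The gene-level percentages match the slide, and the coding percentage matches only when the implied total is used as the denominator.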

10 Results
Predicted proteins:
Majority of proteins were shorter than 200 amino acids.
Largest was 4,709 amino acids.
14,736 (76%) of proteins had a BLASTP match.
>99% of best matches were to eukaryotes; 82% to Chordata; 34% to the elephant shark (the most frequent best match).
Ortholog analysis:
1,846 ortholog groups with at least one protein member present from the eleven fish genomes.
Of these, 155 ortholog groups had exactly one member from each genome.

11 Results
Ortholog analysis (continued):
Phylogeny from concatenated core genes reflected the currently accepted phylogeny of fish.
865 protein families present in other genomes were missing in the whale shark.
This was the highest number of missing families among all eleven genomes, higher even than the lamprey (764) and the elephant shark (108).
543 families were missing in both the lamprey and the whale shark but present in all others.
Is this due to the draft nature of the genome, the de novo annotation, or evolutionary divergence?

12 Results
Larger genome than the elephant shark.
Bone deposition proteins (SCP and SIBLING proline-glutamine families, which have homologs in humans) are missing in the whale shark. This makes sense, as it is a cartilaginous fish.
Innate immunity proteins: Toll-like receptors 13 and 21. An ancient homolog may be represented in the whale shark, which is worth a look.
More work on this genome is expected.

13 Questions
Phylogenomics vs. phylogenetics.
Ways to interpret/analyze evolutionary relationships using genomics, including discrepancies (e.g., lamprey vs. elephant shark vs. whale shark).
Higher number of individuals per species sampled?
Higher number of species per clade sampled?

