Toward Next Generation Biodiversity Research


Morne Du Plessis (1,2), Monica Mwale (1), Essa Suleman (1), Emily Mitchell (1), Kim Labuschagne (1), Desire Dalton (1,3) and Antoinette Kotze (1,4)
(1) National Zoological Gardens of South Africa, Research and Scientific Services Department, Pretoria, 0001
(2) Department of Biotechnology, University of the Western Cape, Bellville, 7530
(3) Department of Zoology, University of Venda, Thohoyandou, South Africa
(4) Genetics Department, University of the Free State, Bloemfontein, South Africa

Introduction
Merging NGS + eDNA + bioinformatics = next generation biodiversity assessment
Techniques (variations of metagenomics approaches):
- Microbiome analysis: 16S
- Barcoding
  - Animals: COI
  - Plants: rbcL, matK, trnH-psbA, ITS
- Shotgun sequencing: direct environmental sequencing
- Transcriptome analysis
eDNA sources:
- Direct from the environment (soil, plant matter, animal matter, water)
- Indirect from the environment: fecal matter (the host and what it consumes)
- Indirect: parasites or insects that feed on other animals (e.g. blood meal analysis)

The Technology – Why the hype
- Capillary electrophoresis: the gold standard for sequencing, generating on average 800 bp per sequence
- NGS on the Ion S5: massively parallel sequencing that can generate 200 bp, 400 bp or 600 bp reads
- Can therefore generate up to 15 Gb of sequence data (15 000 000 000 bp)
- Additionally has significant capacity for multiplexing
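As a rough back-of-the-envelope check on these figures, the sketch below converts the run yield into approximate read counts. The 15 Gb yield and the read lengths come from the slide. The function names and the 24-sample pool are illustrative assumptions only, and real runs lose some reads to filtering.

```python
# Figures taken from the slide.
RUN_YIELD_BP = 15_000_000_000      # up to 15 Gb per run
CAPILLARY_READ_BP = 800            # average capillary read length


def reads_per_run(total_bp: int, read_length_bp: int) -> int:
    """Approximate number of reads needed to reach total_bp at a given read length."""
    return total_bp // read_length_bp


def bp_per_sample(total_bp: int, n_samples: int) -> int:
    """Even share of the run yield per sample when n_samples are multiplexed."""
    return total_bp // n_samples


for read_len in (200, 400, 600):
    print(f"{read_len} bp reads: ~{reads_per_run(RUN_YIELD_BP, read_len):,} reads per run")

# How many 800 bp capillary reads would carry the same amount of sequence?
print(f"Equivalent capillary reads: {reads_per_run(RUN_YIELD_BP, CAPILLARY_READ_BP):,}")

# Hypothetical pool of 24 barcoded samples sharing one run evenly.
print(f"Per-sample yield (24 samples): {bp_per_sample(RUN_YIELD_BP, 24):,} bp")
```

The comparison with capillary sequencing is the point of the slide: one run carries as much sequence as many millions of individual capillary reads.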

The Technology – How is this ridiculously large volume of data possible
- The chip contains millions of wells
- How it actually works:
  https://biosci-batzerlab.biology.lsu.edu/Genomics/documentation/S5_vs_S5xl_LinkedIn_post.pdf
  http://www.anthonybaldor.com/semiconductor-sequencing-ion-torrent/

Data Analysis – What sort of data is generated and how much

The Technology – Getting maximum value
- Start with DNA from each site: Site 1, Site 2, Site 3
- Shear the DNA, for each sample separately
- Size select, for each sample separately
- Prepare the library, for each sample separately
- Add a unique barcode to each sample: barcode A to Site 1, barcode B to Site 2, barcode C to Site 3
- Barcodes = short sequences that uniquely tag your respective experiments
- Merge the samples and sequence them together (NGS)
- Separate the reads according to barcode: seqs from Site 1, seqs from Site 2, seqs from Site 3
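The final "separate according to barcode" step can be sketched as follows. This is a minimal illustration, not the instrument's own demultiplexing software. The barcode sequences, sample names and the `demultiplex` function are hypothetical, and the sketch assumes exact-match barcodes of equal length at the start of each read.

```python
from collections import defaultdict

# Hypothetical 4-base barcodes; real kits supply their own (longer) sequences.
BARCODES = {
    "ACGT": "site_1",  # barcode A
    "TGCA": "site_2",  # barcode B
    "GATC": "site_3",  # barcode C
}


def demultiplex(reads, barcodes):
    """Group pooled reads by their leading barcode and strip the barcode off.

    Reads whose prefix matches no barcode are kept whole under 'unassigned'.
    Assumes all barcodes have the same length and must match exactly.
    """
    barcode_length = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_length], read[barcode_length:]
        if tag in barcodes:
            by_sample[barcodes[tag]].append(insert)
        else:
            by_sample["unassigned"].append(read)
    return dict(by_sample)


pooled_reads = ["ACGTTTTGGG", "TGCAAAACCC", "GATCGGGTTT", "ACGTCCCAAA", "NNNNACGTAC"]
for sample, seqs in demultiplex(pooled_reads, BARCODES).items():
    print(sample, seqs)
```

Production tools usually tolerate a small number of barcode mismatches. Exact matching is used here only to keep the idea visible.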

The Technology – What does the data look like

Data Analysis – What happens downstream
Example: shotgun sequencing of environments (getting an idea of the diversity we might encounter)
- Perform QC of sequences
- Trim sequences
- Redo QC
- Optimize assembly of sequences
- Generate assemblies
- Merge into larger scaffolds
- Group by similarity
- Generate a reference database for comparison
- Align scaffolds to references
- Annotate the aligned hits
- Categorize
- Evaluate abundance and diversity
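The last step, evaluating abundance and diversity, can be illustrated with a small sketch. The slides do not name a diversity measure. The Shannon index used here is an assumption chosen because it is a common one, and the function names and taxon labels are hypothetical.

```python
import math
from collections import Counter


def relative_abundance(assignments):
    """Fraction of annotated hits assigned to each taxon."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(assignments):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxon proportions p_i.

    Assumption: natural logarithm; some studies use log base 2 instead.
    """
    proportions = relative_abundance(assignments).values()
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical taxon assignments for the annotated hits from one site.
hits = ["Organism A", "Organism A", "Organism C", "Organism E"]
print(relative_abundance(hits))
print(round(shannon_index(hits), 3))
```

A community dominated by one taxon scores close to 0, while evenly spread taxa score higher, which is one simple way of comparing sites.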

Data Analysis – Checking quality and cleaning
[Figure: sequence quality before and after trimming]
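One simple form of trimming can be sketched as below: bases are removed from the 3' end of a read until a base of acceptable quality is reached. This is an illustration only. The Phred+33 encoding, the threshold of 20 and the function names are assumptions, and dedicated trimming tools typically use more robust sliding-window rules.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Drop trailing bases whose quality is below min_quality.

    Returns the trimmed sequence and its matching quality string.
    The threshold of 20 is an illustrative assumption.
    """
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


# 'I' encodes Phred 40 (good); '#' encodes Phred 2 (poor).
print(trim_3prime("ACGTACGT", "IIIII###"))
```

Rerunning QC after this step, as the workflow suggests, shows whether the low-quality tail has actually been removed.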

The Analysis - Assembly of sequence reads

The Analysis – Making biological sense
Assembled sequences:
- Contig A
- Contig B
- Contig C
Reference database:
- Gene K from Org A
- Gene L from Org B
- Whole genome from Org C
- Mitochondrial genome from Org D
- Raw sequence data from Org E
Annotated sequences:
- Contig A = Organism A
- Contig B = Organism C
- Contig C = Organism E
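The idea of matching assembled contigs against a reference database can be sketched with a toy example. Shared k-mers stand in here for a real sequence aligner, which is what would be used in practice. The sequences, the k-mer size, the 0.5 cut-off and the function names are all hypothetical.

```python
def kmers(sequence, k=4):
    """Set of all overlapping substrings of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def best_reference(contig, references, k=4, min_shared=0.5):
    """Name of the reference sharing the largest fraction of the contig's k-mers.

    Returns None if the contig is shorter than k or if no reference reaches
    min_shared. A toy stand-in for alignment, not a real aligner.
    """
    contig_kmers = kmers(contig, k)
    if not contig_kmers:
        return None
    best_name, best_score = None, 0.0
    for name, ref_sequence in references.items():
        shared = len(contig_kmers & kmers(ref_sequence, k))
        score = shared / len(contig_kmers)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_shared else None


# Hypothetical reference sequences and contigs.
references = {
    "Organism A": "ACGTACGTGGCCTTAA",
    "Organism C": "TTGACCATGCAAGGTC",
}
contigs = {"Contig A": "ACGTGGCC", "Contig B": "CATGCAAG", "Contig X": "GGGGGGGG"}
for name, seq in contigs.items():
    print(name, "=", best_reference(seq, references))
```

Contigs with no acceptable match stay unannotated, which is the usual outcome for organisms missing from the reference database.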

The Analysis – Bioinformatics resources
- Parallel processing
- NZG bioinformatics server and storage server
- Also use the Centre for High Performance Computing (CHPC) - DST

What happens to all of the data
- The raw sequence data is typically stored in the NCBI SRA (Sequence Read Archive)
- All assembled genes, molecules (e.g. mitochondria) and genomes go to the NCBI nucleotide / genome databases
- The incidental assembled barcode data will feed back into the relevant barcode projects
- Specialised databases are evolving, e.g. Qiita for managing microbial studies (microbiomes)
- We also keep back-ups of all datasets on our server at NZG

Summary
What do we have / what can we supply:
- Access to NGS resources
- Access to environmental samples / sampling
- Access to bioinformatics resources
- Bioinformatics training for students
- Expertise in related studies and techniques
What we need next:
- Understand the requirements of SANBI in terms of the diversity assessment
- Evaluate which next generation strategies are possible and feasible
- Strengthen partnerships: shared environments and shared spp.
- Build bioinformatics capacity in terms of students
- Benchmark the next generation strategies against traditional ones
- Adequate mathematical and statistical models to accurately reflect biodiversity

Acknowledgements
NRF, SANBI, DST, NZG