The Genome Diversity in Africa Project


The Genome Diversity in Africa Project
Manjinder Sandhu, Wellcome Trust Sanger Institute

Why study genetic variation in Africa?
- Genetic and linguistic diversity
- Populations have been subjected to various environmental selection forces over time and in different regions
- High disease burden: genetic determinants of disease poorly understood
- Few large-scale genomic studies in the region

Complexities in the study of genomics in Africa
- Population structure and differentiation
- Gene flow from European, Asian and Middle Eastern populations
- Gene flow within Africa
- Differences in LD structure between populations
- Allelic differentiation between populations due to local adaptation

Key questions
- How do we best capture common and rare variation from African populations?
- What designs are best for large-scale genomic studies in Africa, and how can these be optimised for cost efficiency?
- How much of the variation in African populations is private?
- Can data from different parts of Africa be combined?
- Can Africa-specific reference panels improve imputation into African populations?
- How should such reference panels be curated in light of the variation between populations?
- Do we need an Africa-specific chip to capture the variation across Africa?

African Genome Variation Project
- Aim: to study genetic variation in Africa to inform large-scale studies in African genomics
- Study of 16 ethno-linguistic groups across sub-Saharan Africa (SSA), from populations relevant to medical genomics
- 100 individuals with dense (2.5M) genotype data in each group
- Largest diversity panel from Africa so far

Genome Diversity in Africa Project
- APCDR: 14 centres in 10 countries across Africa
- H3Africa
- Wellcome Trust Sanger Institute
- MRC, Uganda
- TrypanoGEN

Aims and objectives
- To characterise genetic variation in populations across Africa
- To study population structure and differentiation between populations, with the objective of informing large-scale studies in African genomics
- To extend our understanding of human origins and population history
- To provide a global resource to help design, implement and interpret genomic studies in Africa and globally
- To develop capacity for analysis and storage of data in Africa

Scientific objectives and cross-cutting activities of the GDAP

Sampling to date
- 4x sequencing of 3 populations completed (320 samples)
- 2100 more in the pipeline
- Collections ongoing in East Africa
- Map legend: GDAP, 1000 G

Data processing and output
- Illumina HiSeq, 4x
- Read mapping (BWA)
- BAM improvement (sample level)
- Variant calling (Unified Genotyper)
- VQSR filtering
- Genotype refinement (Beagle with 1000 G reference)

Populations   r2     Concordance
Baganda       0.96   0.99
Ethiopia      0.95
Zulu
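The r2 and concordance columns on this slide are accuracy measures for refined genotypes. The slide does not say how they were computed, so the following is only a minimal sketch under stated assumptions: genotypes are coded as alternate-allele counts (0, 1, 2), the "truth" set comes from an independent source such as array genotypes, concordance is the fraction of exact matches, and r2 is the squared Pearson correlation. The function names and the toy data are hypothetical.

```python
def concordance(called, truth):
    """Fraction of samples whose called genotype equals the truth genotype."""
    if len(called) != len(truth) or not called:
        raise ValueError("need two non-empty genotype lists of equal length")
    matches = sum(1 for c, t in zip(called, truth) if c == t)
    return matches / len(called)


def r_squared(called, truth):
    """Squared Pearson correlation between called and truth allele counts."""
    if len(called) != len(truth) or not called:
        raise ValueError("need two non-empty genotype lists of equal length")
    n = len(called)
    mean_c = sum(called) / n
    mean_t = sum(truth) / n
    cov = sum((c - mean_c) * (t - mean_t) for c, t in zip(called, truth))
    var_c = sum((c - mean_c) ** 2 for c in called)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    if var_c == 0 or var_t == 0:
        # Correlation is undefined when either vector has no variation.
        raise ValueError("r2 is undefined at a monomorphic site")
    return cov * cov / (var_c * var_t)


# Toy data for one site across ten samples (illustrative only, not from the study).
truth = [0, 1, 2, 0, 1, 2, 0, 0, 1, 2]
called = [0, 1, 2, 0, 1, 1, 0, 0, 1, 2]  # one homozygote miscalled as a heterozygote

print(concordance(called, truth))  # 0.9
print(round(r_squared(called, truth), 3))
```

Concordance is dominated by the common homozygous genotype at low-frequency sites, which is one reason r2 is often reported alongside it.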

Functional mutations enriched among rare variants (figure; axis: MAF)

Population structure
- Moderate differentiation between Ethiopians and other populations
- Very little differentiation between Zulu and Baganda

Fst differentiation
            Baganda   Zulu
Zulu        0.008
Ethiopia    0.028     0.035
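For readers unfamiliar with Fst, the sketch below shows one common way to compute it for a pair of populations. The slide does not name the estimator used, so this is an assumption: Wright's definition, Fst = (H_T - H_S) / H_T, applied to biallelic allele frequencies from two equally weighted populations and combined across sites as a ratio of sums. The function name and the toy frequencies are hypothetical.

```python
def fst_two_pops(freqs_a, freqs_b):
    """Wright's Fst for two equally weighted populations.

    freqs_a, freqs_b: alternate-allele frequencies at the same sites.
    Sites are combined as sum(H_T - H_S) / sum(H_T).
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("frequency lists must have the same length")
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2
        h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
        h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2  # mean within-population
        numerator += h_total - h_within
        denominator += h_total
    if denominator == 0:
        raise ValueError("no polymorphic sites")
    return numerator / denominator


# Toy frequencies (illustrative only, not from the study).
print(fst_two_pops([0.5], [0.5]))  # identical frequencies give 0.0
print(fst_two_pops([0.0], [1.0]))  # fixed differences give 1.0
print(fst_two_pops([0.4], [0.6]))  # a modest frequency difference gives a small value
```

On this scale, values such as 0.008 indicate very similar allele frequencies, while values around 0.03 indicate modest but clear differentiation.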

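Preliminary results: sharing of variants

Population   N     Total variants   Private variants   Novel variants
Zulu         100   18.7 M           2.5 M (13.3%)      9%
Ethiopia     120   17 M             1.6 M (9.4%)       15%
Baganda      100   19 M             2.4 M (12.6%)      7%

- Shared by all three populations: 14 M
- Shared by Zulu and Ethiopia only: 0.5 M
- Shared by Zulu and Baganda only: 1.7 M
- Shared by Ethiopia and Baganda only: 0.9 M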
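The figures on this slide come from a three-way Venn diagram whose layout is lost in the transcript, so which number belongs to which region is an assumption. The sketch below checks one assignment by arithmetic: each population's total should equal its private count, plus the count shared by all three, plus its two pairwise-only counts. The dictionary names are hypothetical; the numbers are those on the slide, in millions of variants.

```python
# Assumed assignment of slide figures to Venn regions (millions of variants).
totals = {"Zulu": 18.7, "Ethiopia": 17.0, "Baganda": 19.0}
private = {"Zulu": 2.5, "Ethiopia": 1.6, "Baganda": 2.4}
shared_all = 14.0
pairwise_only = {
    frozenset(("Zulu", "Ethiopia")): 0.5,
    frozenset(("Zulu", "Baganda")): 1.7,
    frozenset(("Ethiopia", "Baganda")): 0.9,
}


def reconstructed_total(pop):
    """Sum every Venn region that contains the given population."""
    pair_sum = sum(v for pair, v in pairwise_only.items() if pop in pair)
    return private[pop] + shared_all + pair_sum


def private_percent(pop):
    """Private variants as a percentage of the population's total."""
    return 100 * private[pop] / totals[pop]


for pop in totals:
    print(pop, round(reconstructed_total(pop), 1), round(private_percent(pop), 2))
```

Under this assignment the reconstructed totals agree with the slide's 18.7 M, 17 M and 19 M, and the private percentages come out close to the 13.3%, 9.4% and 12.6% shown. Small differences are expected because the slide's counts are rounded.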

Generating a new reference panel for imputation
Several questions about how best to process data:
- Single-population calling vs multiple-population calling
- Can these approaches be combined?
- Best algorithms for genotype refinement
- Will this panel improve imputation compared to the 1000 Genomes Project panel?

Assessing the utility of extremely low coverage designs
- What coverage is ideal for large-scale genomic studies in Africa?
- How well can we capture common genetic variation with 2x, 1x and 0.5x coverage data in Africa?
- Can we improve capture with a better reference panel?
- Trade-off between sample size and accuracy
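To give a feel for what 4x, 2x, 1x and 0.5x mean per sample, the sketch below uses an idealised model that is not from the slides: read depth at a site is assumed to follow a Poisson distribution with mean equal to the average coverage. Real sequencing data are usually more variable than this, so the numbers are optimistic. The function name is hypothetical.

```python
from math import exp, factorial


def prob_depth_at_least(mean_coverage, k):
    """P(depth >= k) at a site, assuming Poisson-distributed read depth."""
    if mean_coverage < 0 or k < 0:
        raise ValueError("mean_coverage and k must be non-negative")
    below = sum(
        exp(-mean_coverage) * mean_coverage ** i / factorial(i) for i in range(k)
    )
    return 1 - below


# Chance that a site in one sample is covered by at least one, or at least two, reads.
for coverage in (4.0, 2.0, 1.0, 0.5):
    at_least_1 = prob_depth_at_least(coverage, 1)
    at_least_2 = prob_depth_at_least(coverage, 2)
    print(f"{coverage}x: P(>=1 read) = {at_least_1:.3f}, P(>=2 reads) = {at_least_2:.3f}")
```

Under this model a sizeable share of sites in any one sample have no reads at 1x or 0.5x, which is why such designs depend on genotype refinement and imputation across many samples with a good reference panel.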

Next steps
- Large-scale sequencing effort: expand collections to Central, West and North Africa
- Characterise variation in populations representative of most regions in Africa, including hunter-gatherer (HG) populations
- Develop a large reference panel for imputation
- Develop a chip array specific to Africa

Acknowledgements
- WTSI: Deepti Gurdasani, Tommy Carstensen, Elizabeth Young, Cristina Pomilla, Eleftheria Zeggini
- MRC Uganda: Anatoli Kamali, Janet Seeley, Pontiano Kaleebu
- Oxford: Jonathan Marchini
- South Africa: Ayesha Motala, Fraser Pirie, Brenna Henn, Eileen Hoal
- Ethiopia (WTSI): Chris Tyler-Smith, Luca Pagani
- Sanger pipeline teams
- All participants