1
The Genome Diversity in Africa Project
Manjinder Sandhu, Wellcome Trust Sanger Institute
2
Why study genetic variation in Africa?
Genetic and linguistic diversity
Has been subjected to various environmental selection forces over time and in different regions
High disease burden; genetic determinants of disease poorly understood
Few large-scale genomic studies in the region
3
Complexities in the study of genomics in Africa
Population structure and differentiation
Gene flow from European, Asian and Middle Eastern populations
Gene flow within Africa
Differences in LD structure between populations
Allelic differentiation between populations due to local adaptation
4
Key questions
How do we best capture common and rare variation from African populations?
What designs are best for large-scale genomic studies in Africa, and how can these be optimised for cost efficiency?
How much of the variation in African populations is private?
Can data from different parts of Africa be combined?
Can Africa-specific reference panels improve imputation into African populations?
How should such reference panels be curated in light of the variation between populations?
Do we need an Africa-specific chip to capture the variation across Africa?
5
African Genome Variation Project
Aim: to study genetic variation in Africa to inform large-scale studies in African genomics
Study of 16 ethno-linguistic groups across sub-Saharan Africa (SSA) from populations relevant to medical genomics
100 individuals with dense (2.5M) genotype data in each
Largest diversity panel from Africa so far
6
Genome Diversity in Africa Project
APCDR: 14 centres in 10 countries across Africa
H3A Africa
Wellcome Trust Sanger Institute
MRC, Uganda
Trypanogen
7
Aims and objectives
To characterise genetic variation in populations across Africa
To study population structure and differentiation between populations, with the objective of informing large-scale studies in African genomics
To extend our understanding of human origins and population history
To provide a global resource to help design, implement and interpret genomic studies in Africa and globally
To develop capacity for analysis and storage of data in Africa
8
Scientific objectives and cross-cutting activities of the GDAP
9
Sampling to date
4x sequencing of 3 populations completed (320 samples)
2100 more in the pipeline
Collections ongoing in East Africa
Figure legend: GDAP, 1000 G
10
Data processing and output
Pipeline:
Illumina HiSeq 4x
Read mapping (BWA)
BAM improvement (sample level)
Variant calling (Unified Genotyper)
VQSR filtering
Genotype refinement (Beagle with 1000 G reference)

Populations: r2, Concordance
Baganda: 0.96, 0.99
Ethiopia: 0.95
Zulu:
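The r2 and concordance columns on this slide summarise how well the refined genotypes agree with a comparison call set. The slide does not say what the comparison set was or how the metrics were defined, so the following is only a minimal sketch under common definitions: concordance as the fraction of exactly matching genotypes, and r2 as the squared Pearson correlation of alternate-allele counts. The function names and the toy genotypes are hypothetical.

```python
# Toy genotypes coded as alternate-allele counts (0, 1 or 2); None marks a missing call.
# All values here are hypothetical and are not taken from the presentation.

def _paired(called, truth):
    """Keep only positions where both call sets have a genotype."""
    return [(c, t) for c, t in zip(called, truth) if c is not None and t is not None]

def concordance(called, truth):
    """Fraction of compared genotypes that match exactly."""
    pairs = _paired(called, truth)
    if not pairs:
        return float("nan")
    return sum(c == t for c, t in pairs) / len(pairs)

def r_squared(called, truth):
    """Squared Pearson correlation between called and comparison allele counts."""
    pairs = _paired(called, truth)
    if not pairs:
        return float("nan")
    n = len(pairs)
    mean_c = sum(c for c, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    cov = sum((c - mean_c) * (t - mean_t) for c, t in pairs)
    var_c = sum((c - mean_c) ** 2 for c, _ in pairs)
    var_t = sum((t - mean_t) ** 2 for _, t in pairs)
    if var_c == 0 or var_t == 0:
        return float("nan")  # correlation is undefined without variation
    return cov * cov / (var_c * var_t)

truth = [0, 1, 2, 1, 0, 2, 1, 0]
called = [0, 1, 2, 1, 0, 2, 2, None]
print("concordance:", concordance(called, truth))
print("r2:", r_squared(called, truth))
```

Concordance is dominated by the common homozygous genotype at rare variants, which is one reason r2 is usually reported alongside it.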
11
Functional mutations are enriched among rare (low-MAF) variants
12
Population structure
Moderate differentiation between Ethiopians and other populations
Very little differentiation between Zulu and Baganda
Pairwise Fst (Baganda, Zulu, Ethiopia): 0.008, 0.028, 0.035
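The slide does not state which Fst estimator was used. As a rough illustration only, the sketch below computes Wright's Fst for two populations from allele frequencies, as a ratio of sums over sites of (H_T - H_S) to H_T. The estimator choice, the function name and the toy frequencies are all assumptions, and the sketch ignores sample-size corrections that published estimators apply.

```python
def fst_two_populations(freqs_a, freqs_b):
    """Wright's Fst from per-site alternate-allele frequencies in two populations.

    H_S is the mean expected heterozygosity within populations and H_T is the
    expected heterozygosity at the mean allele frequency. Numerator and
    denominator are summed over sites before dividing (ratio of sums).
    """
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_mean = (p_a + p_b) / 2
        h_total = 2 * p_mean * (1 - p_mean)
        h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
        numerator += h_total - h_within
        denominator += h_total
    if denominator == 0:
        return float("nan")  # no polymorphic sites to compare
    return numerator / denominator

# Hypothetical allele frequencies at four sites; not data from the project.
pop_a = [0.10, 0.50, 0.30, 0.80]
pop_b = [0.12, 0.45, 0.35, 0.78]
print("Fst:", fst_two_populations(pop_a, pop_b))
```

Values near zero mean the two populations have almost the same allele frequencies, which is the sense in which the slide describes Zulu and Baganda as showing very little differentiation.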
13
Preliminary results: sharing of variants
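Venn diagram of variants shared between the three sequenced populations:
Total variants: Zulu 18.7 M (N=100), Ethiopia 17 M (N=120), Baganda 19 M (N=100)
Found in one population only: Zulu 2.5 M (13.3%), Ethiopia 1.6 M (9.4%), Baganda 2.4 M (12.6%)
Shared by two populations only: Zulu and Ethiopia 0.5 M, Zulu and Baganda 1.7 M, Ethiopia and Baganda 0.9 M
Shared by all three populations: 14 M

Pop: Novel variants
Zulu: 9%
Baganda: 7%
Ethiopia: 15%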
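The numbers on this slide are consistent with a three-way Venn diagram in which 14 M variants are shared by all populations. That reading is an interpretation of the extracted figure text, not something the slide states, and the assignment of each value to a region below is an assumption. Under it, the per-population totals and single-population percentages can be checked with a few lines of arithmetic:

```python
# Region sizes in millions of variants, keyed by the set of populations that
# carry them. The assignment of values to regions is an assumption (see above).
regions = {
    frozenset({"Zulu"}): 2.5,
    frozenset({"Ethiopia"}): 1.6,
    frozenset({"Baganda"}): 2.4,
    frozenset({"Zulu", "Ethiopia"}): 0.5,
    frozenset({"Zulu", "Baganda"}): 1.7,
    frozenset({"Ethiopia", "Baganda"}): 0.9,
    frozenset({"Zulu", "Ethiopia", "Baganda"}): 14.0,
}

def total_variants(pop):
    """Sum every Venn region that includes the given population."""
    return sum(size for pops, size in regions.items() if pop in pops)

def private_percent(pop):
    """Percentage of a population's variants seen in that population only."""
    return 100 * regions[frozenset({pop})] / total_variants(pop)

for pop in ("Zulu", "Ethiopia", "Baganda"):
    print(pop, round(total_variants(pop), 1), "M total,",
          round(private_percent(pop), 1), "% in this population only")
```

Because the inputs are rounded to one decimal place, the recomputed percentages match the slide's 13.3%, 9.4% and 12.6% only to within about 0.1 percentage points.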
14
Generating a new reference panel for imputation
Several questions about how best to process data:
Single-population calling vs multiple-population calling
Can these approaches be combined?
Best algorithms for genotype refinement
Will this panel improve imputation compared to the 1000 Genomes Project panel?
15
Assessing the utility of extremely low coverage designs
What coverage is ideal for large-scale genomic studies in Africa?
How well can we capture common genetic variation with 2x, 1x and 0.5x coverage data in Africa?
Can we improve capture with a better reference panel?
Trade-off between sample size and accuracy
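One way to get a feel for these coverage levels is a simple Poisson model of read depth, which is a textbook approximation and not the project's own analysis. The sketch below gives the chance that a site in one individual is covered by at least one read, the number of samples a fixed sequencing budget buys at each coverage, and the commonly used approximation that association power scales with N times the genotype r2. The budget of 400 fold-coverage units and all function names are hypothetical.

```python
from math import exp, factorial

def prob_depth(k, mean_coverage):
    """Poisson probability that a site is covered by exactly k reads."""
    return mean_coverage ** k * exp(-mean_coverage) / factorial(k)

def prob_at_least(k_min, mean_coverage):
    """Probability that a site is covered by k_min or more reads."""
    return 1.0 - sum(prob_depth(k, mean_coverage) for k in range(k_min))

def samples_for_budget(total_fold_coverage, mean_coverage):
    """Whole samples affordable when the budget is counted in fold-coverage units."""
    return int(total_fold_coverage // mean_coverage)

def effective_sample_size(n_samples, r2):
    """Approximate effective sample size when genotypes have accuracy r2."""
    return n_samples * r2

# Hypothetical budget: 400 fold-coverage units (e.g. 100 samples at 4x).
for coverage in (4, 2, 1, 0.5):
    print(coverage, "x:",
          "P(at least 1 read) =", round(prob_at_least(1, coverage), 3),
          "| samples =", samples_for_budget(400, coverage))
```

Lower coverage leaves more sites with no reads in any one individual, so accuracy then depends on how well a reference panel can fill the gaps. In this simple model, that is the trade-off the slide sets against the larger sample size.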
16
Next steps
Large-scale sequencing effort: expand collections to Central, West and North Africa
Characterise variation in populations representative of most regions in Africa, including hunter-gatherer (HG) populations
Develop a large reference panel for imputation
Develop a chip array specific to Africa
17
Acknowledgements
WTSI: Deepti Gurdasani, Tommy Carstensen, Elizabeth Young, Cristina Pomilla, Eleftheria Zeggini
MRC Uganda: Anatoli Kamali, Janet Seeley, Pontiano Kaleebu
Oxford: Jonathan Marchini
South Africa: Ayesha Motala, Fraser Pirie, Brenna Henn, Eileen Hoal
Ethiopia (WTSI): Chris Tyler-Smith, Luca Pagani
Sanger pipeline teams
All participants