IBD Estimation in Pedigrees


IBD Estimation in Pedigrees Gonçalo Abecasis University of Oxford

3 Stages of Genetic Mapping Are there genes influencing this trait? Epidemiological studies Where are those genes? Linkage analysis What are those genes? Association analysis

Relationship Checking

Where are those genes?

Tracing Chromosomes

Sometimes it is easy…

Sharing, or Not?

Data: Polymorphic markers (e.g., microsatellite repeats, SNPs); Allele frequency; Location. Task: Phase markers; Place recombinants

Complexity of the Problem For each meiosis In a pedigree with $n$ non-founders, there are $2n$ meioses, each with 2 possible outcomes For each location One for each of $m$ markers Up to $4^{nm}$ distinct outcomes
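
As a quick check of the counting argument above, the sketch below multiplies out the outcomes: 2 per meiosis, 2n meioses per location, m locations. The function name and arguments are illustrative and do not come from the slides.

```python
def n_gene_flow_outcomes(n_nonfounders: int, n_markers: int) -> int:
    """Count joint meiosis outcomes across all markers (hypothetical helper)."""
    meioses = 2 * n_nonfounders        # one paternal and one maternal meiosis per non-founder
    per_location = 2 ** meioses        # 2 outcomes per meiosis, i.e. 4**n per location
    return per_location ** n_markers   # independent choice at each marker: 4**(n*m)


# Two sibs (n = 2) typed at 3 markers: 4**(2*3) = 4096 outcomes
print(n_gene_flow_outcomes(2, 3))
```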

Elston-Stewart Algorithm Factorize likelihood by individual Each step assigns phase for all markers for one individual Complexity $\propto n \cdot 4^m$ Small number of markers Large pedigrees With little inbreeding

Lander-Green Algorithm Factorize likelihood by marker Each step assigns phase For one marker For all individuals in the pedigree Complexity $\propto m \cdot 4^n$ Large number of markers Assumes no interference Relatively small pedigrees
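
To see why the two factorizations suit different data sets, the stated scalings can be compared directly. This is only a rough illustration: the functions return the proportionality terms with constants omitted, and their names are hypothetical.

```python
def elston_stewart_cost(n: int, m: int) -> int:
    """Scaling term n * 4**m: linear in pedigree size, exponential in markers."""
    return n * 4 ** m


def lander_green_cost(n: int, m: int) -> int:
    """Scaling term m * 4**n: linear in markers, exponential in pedigree size."""
    return m * 4 ** n


# Large pedigree, few markers: factorizing by individual is far cheaper.
print(elston_stewart_cost(100, 5) < lander_green_cost(100, 5))
# Small pedigree, many markers: factorizing by marker is far cheaper.
print(lander_green_cost(5, 100) < elston_stewart_cost(5, 100))
```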

Markov-Chain Monte-Carlo Approximate solutions Explore only most likely outcomes Remove restrictions Pedigree size Number of markers Inbreeding Assuming no interference Computationally intensive

Popular Packages Elston-Stewart Algorithm: LINKAGE / FASTLINK (Lathrop et al, 1985), VITESSE (O’Connell and Weeks, 1995). Lander-Green Algorithm: Genehunter (Kruglyak et al, 1995), Allegro (Gudbjartsson et al, 2000). MCMC: Simwalk2 (Sobel et al, 1996), LOKI (Heath, 1998)

1. Enumerate Possibilities Enumerate gene-flow patterns Gene-flow pattern: Sets transmitted allele for each meiosis Implies founder allele for each individual

2. Founder Allele Sets For each gene flow pattern $v$ Enumerate set $A(G,v)$ All allele states $a = [a_1, \ldots, a_{2f}]$ Compatible with both: Gene flow $v$ Genotypes $G$ The likelihood is $L(v \mid G) = 2^{-2n} \sum_{a \in A(G,v)} \prod_i f(a_i)$ where $f(a_i)$ is the frequency of allele $a_i$

For example ... Genotypes Gene Flow Founder Alleles Four meioses. Three "1" alleles required. Likelihood = $(1/2)^4 f(a_1)^3$
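
The founder-allele enumeration can be sketched by brute force. The slide's figure is not available, so the configuration below is assumed for illustration: two untyped founders, two sibs both typed 1/1, and a gene-flow pattern in which the sibs share one paternal allele but receive different maternal alleles. All names (`gene_flow_likelihood`, `carried`, and so on) are hypothetical.

```python
from itertools import product
from math import prod


def gene_flow_likelihood(carried, genotypes, freqs, n_founder_alleles, n_meioses):
    """Sum founder-allele assignments compatible with one gene-flow pattern.

    carried:   person -> (i, j), indices of the founder alleles that person
               carries under this gene-flow pattern
    genotypes: person -> observed unordered genotype (untyped people omitted)
    freqs:     allele label -> population frequency
    """
    total = 0.0
    for a in product(freqs, repeat=n_founder_alleles):   # every allele state a
        compatible = all(
            sorted((a[i], a[j])) == sorted(genotypes[p])
            for p, (i, j) in carried.items()
            if p in genotypes
        )
        if compatible:
            total += prod(freqs[x] for x in a)            # product of f(a_i)
    return 0.5 ** n_meioses * total                       # one factor 1/2 per meiosis


freqs = {1: 0.3, 2: 0.7}
carried = {"sib1": (0, 2), "sib2": (0, 3)}    # founder alleles 0,1 paternal; 2,3 maternal
genotypes = {"sib1": (1, 1), "sib2": (1, 1)}
lik = gene_flow_likelihood(carried, genotypes, freqs, n_founder_alleles=4, n_meioses=4)
print(lik)   # three founder alleles forced to "1", the fourth unconstrained
```

In this assumed configuration three founder alleles must be "1" and the untransmitted one sums out to 1, which reproduces the form $(1/2)^4 f(1)^3$. Real implementations avoid full enumeration; this is only meant to make the sum concrete.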

Single Marker Probabilities We now have ... Likelihood for each gene flow pattern Conditional on genotypes Conditional on allele frequencies Conditional on a single marker Probability for each gene-flow pattern $P(v) = L(v) / \sum_{v'} L(v')$
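
Turning single-marker likelihoods into probabilities is a plain normalization over all gene-flow patterns. A minimal sketch, with an illustrative function name:

```python
def gene_flow_probabilities(likelihoods):
    """Normalize per-pattern likelihoods L(v) so that they sum to 1."""
    total = sum(likelihoods)
    if total == 0:
        # No gene-flow pattern is compatible with the genotypes,
        # which usually signals a Mendelian inconsistency.
        raise ValueError("all gene-flow patterns have zero likelihood")
    return [lik / total for lik in likelihoods]


print(gene_flow_probabilities([1.0, 3.0]))
```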

3. Allowing for Recombination Transition Probability $T(v_a \to v_b, \theta) = (1-\theta)^{n - r(v_a, v_b)} \, \theta^{r(v_a, v_b)}$ Transition Matrix Location A Location B
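
The transition matrix can be built explicitly for small pedigrees. The sketch below makes two assumptions that the slide does not spell out: gene-flow patterns are encoded as integers with one bit per meiosis, and the exponent total is the number of meioses (so that each row sums to 1). The recombinant count r is then the number of bits that differ between two patterns.

```python
def transition_matrix(n_meioses: int, theta: float):
    """T[a][b] = (1 - theta)**(n_meioses - r) * theta**r, r = differing meioses."""
    k = 1 << n_meioses                       # number of gene-flow patterns
    matrix = []
    for a in range(k):
        row = []
        for b in range(k):
            r = bin(a ^ b).count("1")        # meioses whose outcome changes from a to b
            row.append((1 - theta) ** (n_meioses - r) * theta ** r)
        matrix.append(row)
    return matrix


T = transition_matrix(2, 0.1)
print(T[0])   # probabilities of moving from pattern 00 to 00, 01, 10, 11
```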

Moving along chromosome Input Vector $v$ of likelihoods at location A Matrix $T$ of transition probabilities A→B Output Vector $v'$ of likelihoods at location B Conditional on likelihoods at A For a vector of $k$ elements, requires $k^2$ operations
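
The direct way to move from location A to location B is a vector-matrix product, which is where the quadratic cost comes from. A minimal sketch, assuming the same bit encoding of gene-flow patterns (one bit per meiosis) and a hypothetical function name:

```python
def propagate(v, theta):
    """Return v' with v'[b] = sum over a of v[a] * T(a -> b, theta).

    len(v) must be a power of two (one entry per gene-flow pattern).
    """
    k = len(v)
    n_meioses = (k - 1).bit_length()
    out = []
    for b in range(k):                       # k output entries ...
        total = 0.0
        for a in range(k):                   # ... each summing k terms: k**2 work
            r = bin(a ^ b).count("1")        # recombinant meioses between a and b
            total += v[a] * (1 - theta) ** (n_meioses - r) * theta ** r
        out.append(total)
    return out


# All likelihood on pattern 00 at A spreads to its neighbours at B.
print(propagate([1.0, 0.0, 0.0, 0.0], 0.1))
```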

Elston and Idury Algorithm Requires $k \log_2 k$ operations
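
The saving comes from the structure of the transition matrix: because meioses recombine independently, the full matrix factorizes into one 2×2 mixing step per meiosis. The sketch below applies those steps one bit at a time. It is written in the spirit of the divide-and-conquer idea and is not claimed to match the published algorithm line for line; the function name is hypothetical.

```python
def fast_propagate(v, theta):
    """Apply the transition matrix in k*log2(k) operations, one meiosis at a time."""
    k = len(v)
    if k == 0 or k & (k - 1):
        raise ValueError("length must be a power of two")
    out = list(v)
    bit = 1
    while bit < k:                           # log2(k) passes, one per meiosis
        for i in range(k):                   # each pass touches all k entries once
            if not i & bit:
                j = i | bit                  # partner pattern differing in this meiosis
                x, y = out[i], out[j]
                out[i] = (1 - theta) * x + theta * y
                out[j] = theta * x + (1 - theta) * y
        bit <<= 1
    return out


print(fast_propagate([1.0, 0.0, 0.0, 0.0], 0.1))
```

For a pedigree with 20 meioses this replaces roughly 10^12 multiply-adds per interval with about 2 × 10^7.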

Moving Along Chromosome

Markov-Chains Single Marker Left Conditional Right Conditional Full Likelihood
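
The slide names the pieces without giving formulas, so the sketch below assumes the standard hidden-Markov combination: at each marker, the full likelihood for a gene-flow pattern is the product of its single-marker likelihood, a left conditional (carrying information from markers to the left) and a right conditional (from markers to the right). Function names and the list-of-lists layout are illustrative.

```python
def step(v, theta):
    """Propagate a likelihood vector across one interval (direct k**2 version)."""
    k = len(v)
    n_meioses = (k - 1).bit_length()
    return [
        sum(v[a] * (1 - theta) ** (n_meioses - bin(a ^ b).count("1"))
            * theta ** bin(a ^ b).count("1") for a in range(k))
        for b in range(k)
    ]


def multipoint(single, thetas):
    """single[j][v]: single-marker likelihoods; thetas[j]: between markers j and j+1."""
    m, k = len(single), len(single[0])
    left = [[1.0] * k for _ in range(m)]
    right = [[1.0] * k for _ in range(m)]
    for j in range(1, m):                    # sweep left to right
        prev = [l * s for l, s in zip(left[j - 1], single[j - 1])]
        left[j] = step(prev, thetas[j - 1])
    for j in range(m - 2, -1, -1):           # sweep right to left
        nxt = [r * s for r, s in zip(right[j + 1], single[j + 1])]
        right[j] = step(nxt, thetas[j])
    return [[s * l * r for s, l, r in zip(single[j], left[j], right[j])]
            for j in range(m)]


single = [[0.2, 0.1, 0.4, 0.3], [0.5, 0.0, 0.25, 0.25], [0.1, 0.6, 0.2, 0.1]]
full = multipoint(single, thetas=[0.1, 0.2])
print([sum(row) for row in full])            # the same total at every marker
```

Summing the full likelihood over gene-flow patterns gives the same value at every marker, which is a useful sanity check on an implementation.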

MERLIN Fast multipoint calculations Non-parametric linkage analyses Error detection e.g., unlikely obligate recombinants Haplotyping most likely, exhaustive lists, sampling

Sparse Gene Flow Trees

Dense maps Computational challenge: Require more memory; Require Lander-Green algorithm; Limited pedigree size. Computational advantages: Reduced recombination between markers; Approximate solutions possible if steps with many recombinants are ignored

MERLIN: Example Pedigrees

MERLIN: Timings

MERLIN: Memory Usage

Command Line Options

Effect of Genotyping Error Modest levels are likely Up to 1% may be typical Mendelian inheritance checks Detect up to 30% of errors for SNPs Effect on power Linkage vs. Association SNPs vs. Microsatellites

Affected Sib Pair Sample

Unselected Sample

Association Analysis

Error Detection Genotype errors can introduce unlikely recombinants Change likelihood Replace $(1-\theta)$ with $\theta$ Test sensitivity of likelihood to each genotype Detects errors that have largest effect on linkage

Practical Exercise Lon Cardon Stacey Cherny