Imputation Matt Spangler University of Nebraska-Lincoln.

Slides:



Advertisements
Similar presentations
Linkage analysis Nasir Mehmood Roll No: Linkage analysis Linkage analysis is statistical method that is used to associate functionality of genes.
Advertisements

Why this paper Causal genetic variants at loci contributing to complex phenotypes unknown Rat/mice model organisms in physiology and diseases Relevant.
Understanding Conventional and Genomic EPDs
2007 Paul VanRaden 1, Jeff O’Connell 2, George Wiggans 1, Kent Weigel 3 1 Animal Improvement Programs Lab, USDA, Beltsville, MD, USA 2 University of Maryland.
Matt Spangler University of Nebraska- Lincoln DEVELOPMENT OF GENOMIC EPD: EXPANDING TO MULTIPLE BREEDS IN MULTIPLE WAYS.
Perspectives from Human Studies and Low Density Chip Jeffrey R. O’Connell University of Maryland School of Medicine October 28, 2008.
Linkage and Gene Mapping. Mendel’s Laws: Chromosomes Locus = physical location of a gene on a chromosome Homologous pairs of chromosomes often contain.
Chapter 10 Mendel and Meiosis.
Basics of Linkage Analysis
Bob Weaber, Ph.D. Cow-Calf Extension Specialist Assistant Professor Dept. of Animal Sciences and Industry
Understanding GWAS Chip Design – Linkage Disequilibrium and HapMap Peter Castaldi January 29, 2013.
Meiosis and Genetic Variation
From sequence data to genomic prediction
BrownBagger October National Program for Genetic Improvement of Feed Efficiency in Beef Cattle
Microevolution Chapter 18 contined. Microevolution  Generation to generation  Changes in allele frequencies within a population  Causes: Nonrandom.
CS177 Lecture 9 SNPs and Human Genetic Variation Tom Madej
Non-Linear Problems General approach. Non-linear Optimization Many objective functions, tend to be non-linear. Design problems for which the objective.
How Genomics is changing Business and Services of Associations Dr. Josef Pott, Weser-Ems-Union eG, Germany.
Introduction to Animal Breeding & Genomics Sinead McParland Teagasc, Moorepark, Ireland.
Genetic Mapping Oregon Wolfe Barley Map (Szucs et al., The Plant Genome 2, )
Non-Mendelian Genetics
An Efficient Method of Generating Whole Genome Sequence for Thousands of Bulls Chuanyu Sun 1 and Paul M. VanRaden 2 1 National Association of Animal Breeders,
Bovine Genomics The Technology and its Applications Gerrit Kistemaker Chief Geneticist, Canadian Dairy Network (CDN) Many slides were created by.
Genomics. Finding True Genetic Merit 2 Dam EPD Sire EPD Pedigree Estimate EPD TRUE Progeny Difference Mendelian Sampling Effect Adapted from Dr. Bob Weaber.
2007 Paul VanRaden Animal Improvement Programs Lab, USDA, Beltsville, MD, USA 2009 Mixing Different SNP Densities Mixing Different.
Gene Hunting: Linkage and Association
National Taiwan University Department of Computer Science and Information Engineering Pattern Identification in a Haplotype Block * Kun-Mao Chao Department.
 Linked Genes Learning Objective DOT Point: predict the difference in inheritance patterns if two genes are linked Sunday, June 05,
1 Population Genetics Basics. 2 Terminology review Allele Locus Diploid SNP.
Jeff O’ConnellInterbull annual meeting, Orlando, FL, July 2015 (1) J. R. O’Connell 1 and P. M. VanRaden 2 1 University of Maryland School of Medicine,
Finnish Genome Center Monday, 16 November Genotyping & Haplotyping.
Methods in genome wide association studies. Norú Moreno
Genes in human populations n Population genetics: focus on allele frequencies (the “gene pool” = all the gametes in a big pot!) n Hardy-Weinberg calculations.
INTRODUCTION TO ASSOCIATION MAPPING
Discovery of a rare arboreal forest-dwelling flying reptile (Pterosauria, Pterodactyloidea) from China Wang et al. PNAS Feb. 11, 2008.
Linear Reduction Method for Tag SNPs Selection Jingwu He Alex Zelikovsky.
Mendel, Genes and Gene Interactions §The study of inheritance is called genetics. A monk by the name of Gregor Mendel suspected that heredity depended.
Council on Dairy Cattle Breeding April 27, 2010 Interpretation of genomic breeding values from a unified, one-step national evaluation Research project.
2007 Paul VanRaden and Melvin Tooker* Animal Improvement Programs Laboratory 2010 Gains in reliability from combining subsets.
The International Consortium. The International HapMap Project.
2007 Paul VanRaden 1, Jeff O’Connell 2, George Wiggans 1, Kent Weigel 3 1 Animal Improvement Programs Lab, USDA, Beltsville, MD, USA 2 University of Maryland.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
In The Name of GOD Genetic Polymorphism M.Dianatpour MLD,PHD.
2007 Paul VanRaden 1, Jeff O’Connell 2, George Wiggans 1, Kent Weigel 3 1 Animal Improvement Programs Lab, USDA, Beltsville, MD, USA 2 University of Maryland.
Linkage Disequilibrium and Recent Studies of Haplotypes and SNPs
Ch. 10.4: Meiosis & Mendel’s Principles Objectives: 1.Summarize the chromosome theory of inheritance. 2.Explain how genetic linkage provides exceptions.
G.R. Wiggans 1, T. A. Cooper 1 *, K.M. Olson 2 and P.M. VanRaden 1 1 Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville,
2007 Paul VanRaden Animal Improvement Programs Laboratory, USDA Agricultural Research Service, Beltsville, MD, USA 2008 New.
G.R. Wiggans Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD Select Sires‘ Holstein.
G.R. Wiggans Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD 2011 National Breeders.
Jack Ward American Hereford Association October 31, 2012.
Mendel & Genetic Variation Chapter 14. What you need to know! The importance of crossing over, independent assortment, and random fertilization to increasing.
Genome-Wides Association Studies (GWAS) Veryan Codd.
Lecture 17: Model-Free Linkage Analysis Date: 10/17/02  IBD and IBS  IBD and linkage  Fully Informative Sib Pair Analysis  Sib Pair Analysis with Missing.
My vision for dairy genomics
Part 2: Genetics, monohybrid vs. Dihybrid crosses, Chi Square
Population Genetics As we all have an interest in genomic epidemiology we are likely all either in the process of sampling and ananlysising genetic data.
Recombination (Crossing Over)
6.2-Inheritance of Linked Genes
Genes may be linked or unlinked and are inherited accordingly.
PLANT BIOTECHNOLOGY & GENETIC ENGINEERING (3 CREDIT HOURS)
Linkage: Statistically, genes act like beads on a string
Linkage Genes that are physically located on the same chromosome are said to be “linked”. Linked genes are said to be “mapped” to the same chromosome.
Using recombination to maintain genetic diversity
Genomic Evaluations.
Linkage Genes that are physically located on the same chromosome are said to be “linked”. Linked genes are said to be “mapped” to the same chromosome.
Genomic Selection in Dairy Cattle
Perspectives from Human Studies and Low Density Chip
Using Haplotypes in Breeding Programs
Recombinant Inbred Strains: Step 1: Initial Mendelian Cross
Presentation transcript:

Imputation Matt Spangler University of Nebraska-Lincoln

Imputation Imputation creates data that were not actually collected Imputation allows us to retain observations that would otherwise be left out of an analysis

Imputation—Not New  Imputation is common when data are analyzed with regression analysis  Like genome-wide association studies  Observations only included in the model if they have values for all variables (e.g. SNP)  If there are a lot of variables (e.g. SNP)the probability of missing values increases  If we have 50,000 SNP there is a very good chance some of them are missing or “not called”

General Methods for Imputation  Random  Assigns values randomly based on a desired distribution  Deterministic  Assigns values based on existing knowledge  Mean of all non-missing SNP at a locus from a population  Missing values replaced with a set of likely values

Example AnimalHerdSexBirth WeightCalving Difficulty 1MineM1003 2MineF802 3YoursM751 4YoursF701 5MineM115? 6MineF902 7YoursM701 8YoursM65?

Example AnimalHerdSexBirth WeightCalving Difficulty 1MineM1003 2MineF802 3YoursM751 4YoursF701 5MineM1153 6MineF902 7YoursM701 8YoursM651

Imputation--Genomics  Method of assigning missing genotypes based on actual genotypes of related animals  Requires relatives to have been genotyped with higher density assays  More animals genotyped with a higher density  Closely related animals genotyped with higher density  Allows other animals (younger) to be genotyped with lower density (cheaper) assays

Imputation--Genomics  Based on dividing the genotype into individual chromosomes (maternal and paternal contributions)  Missing SNP assigned by tracking inheritance from ancestors and/or descendants  Allows for the merger of various SNP densities by imputing SNP not on LD (or non-target) panels.

SNP Densities Not Static  50K was/is the backbone  HD; 770K  GGP-HD; 9K (8K in common with 50K)  GGP-LD; 77K (27K in common with 50K)  More changes coming  Current approaches to MBV require common SNP density

Practical Need  Alternative is to genotype every animal at high-density  As SNP platforms change, animals would have to be re- genotyped  Allows two opportunities:  Going from lower density to higher density  Allow wide-spread use of cheaper panels  Going from higher density to lower density  Allows use of large training sets built on the lower density  First use of imputation in NCE

Linkage The tendency of certain loci to be inherited together Loci that are close to each other on chromosome tend to stay together during meiosis Crossing over (recombination) breaks up linkage. Slide Courtesy of Bob Weaber

“Blocks” of Alleles Inherited Together paternal maternal Chromosome pair Sometimes entire chromosome inherited intact More often a crossover produces a new recombinant There may be two or more (rarely) crossovers

Inheritance Example paternal maternal Chromosome pair Consider a small window of 1 Mb (1% of the genome) Example Adapted from Garrick

Single Parent Commonality in Small Sections paternal maternal Chromosome pair Offspring mostly segregate green or red Green haplotype (paternal chromosome) Red haplotype (maternal chromosome)

…..TCACCGCTGAG….. …..CAGATAGGATT….. …..??G??????A??…. …..??T??????T??….. Sire (HD) Offspring (LD) Imputation ….CAGATAGGATT….. …..??T??????T??….. Offspring (Imputed)

…..TCACCGCTGAG….. …..CAGATAGGATT….. …..??C??????T??…. …..??T??????T??….. Sire (HD) Offspring (LD) Imputation—Parentage Check

…..TCACCGCTGAG….. …..CAGATAGGATT….. …..??G??????A??…. …..??T??????T??….. Sire (HD) Offspring (LD) Imputation MGS Dam

LD HD (770K) 50k chip Imputation Scenarios 80k All seek to reduce cost --Genotyping animals with cheaper panels --Leveraging existing training sets

Trait50KImputed 50K (GGP-LD) BWT WWT YWT MILK MARB REA HEREFORD DATA (N=1,081) Saatchi et al., 2013

HEREFORD DATA (N=1,081) Trait50KImputed 50K (GGP-HD) BWT WWT0.40 YWT0.20 MILK MARB REA Saatchi et al., 2013

Practical Concerns  When to shift to LD panels  Must develop adequate training population (50K or 80K) first  After migration, how to get HD to allow for imputation in the future  Genotype moderate to high accuracy sires at higher density  Without this imputation accuracy will erode overtime  Breeds subsidize this in some form

Optimize Future HD genotyping?  Likely to be ad-hoc right now  Once a sire reaches x accuracy regenotype with higher density  Need to move forward with an optimal approach  Ex: Does it make sense to do full sibs?  Conditional on limited resources ($) choice matters  Impute variants from sequence into existing SNP platforms

Conclusions  Imputation works and can decrease the cost of genotyping  Does not negate the need for a training set with higher density  Plan must be in place to ensure imputation accuracy persists overtime  Add “new” animals with higher density genotypes

Thank You!   