DNA Analysis Techniques for Molecular Genealogy Luke Hutchison Project Supervisor: Scott R. Woodward.



Mission: The BYU Center for Molecular Genealogy
- To establish the world’s most comprehensive genetic and genealogical database.
- To create tools for reconstruction of genealogies from DNA.
- To establish genetic links between families throughout the world.

Molecular Genealogy: Process
- 100,000 DNA samples and genealogies are being collected from 500 different populations.
- Common ancestors and population structure are inferred [population and quantitative genetics].
- A searchable database is being produced for DNA-based genealogical research.


Common Ancestor Unknown genealogy (Suffolk, England, ca. 1893: 33%; Glasgow, Scotland, ca. 1905: 12%;...)

What is the Basis of Molecular Genealogy?
- Each individual carries within their DNA a record of who they are and how they are related to all other people.
- You received all of your DNA from your two parents (50% from each).
- Specific regions of DNA have properties that can:
  - Identify an individual
  - Link them to a family
  - Identify extended family groups (tribes or clans)

3 major types of genetic data
- Y Chromosome
  - Males only, paternal inheritance
  - Haploid, none or little recombination
  - 0.51% of an individual's total genetic information
- Mitochondrial DNA
  - Both males and females, maternal inheritance
  - Haploid, none or little recombination
  - 0.0006% of an individual's total genetic information
- Autosomal (Nuclear)
  - Both males and females, inherited equally from both parents
  - Diploid, undergoes recombination at each generation
  - >99% of your genetic information

Y chromosome Mitochondrial Autosomal (nuclear)

Genotypic and Genealogical data

Sequence and Length polymorphisms

Types of DNA Data Extracted
- Pair of alleles (numbers of repeats) for a locus (e.g. 121,123)
- Linked loci (close together on the chromosome)
- Unlinked loci (distant enough from each other to be genetically unrelated, due to the high probability of a crossover occurring between the markers; the presence of one does not imply the presence of the other)

Linked Loci: “Haplotypes”
- The probability of a crossover event occurring in the middle of a haplotype is low, since the loci are tightly linked.
- Haplotypes are therefore likely to be passed down intact for many generations.
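As a rough illustration of why tightly linked loci tend to travel together, the sketch below computes the chance that a haplotype survives a number of generations without a crossover inside it. It assumes a fixed, independent per-generation crossover probability and ignores mutation; the function name and the numeric values are hypothetical and do not come from the slides.

```python
def prob_intact(crossover_prob: float, generations: int) -> float:
    """Chance that no crossover splits the haplotype in any generation.

    Assumes each generation is an independent trial with the same
    crossover probability (an illustrative simplification).
    """
    return (1.0 - crossover_prob) ** generations


# Hypothetical rates: a tightly linked haplotype versus a loosely linked one.
for rate in (0.001, 0.1):
    print(rate, prob_intact(rate, 20))
```

Under these assumptions the tightly linked haplotype is still very likely to be intact after 20 generations, while the loosely linked one is not.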

Haplotyping
- Problem: Correct order of the genetic information in a pair is unknown (which allele came from which parental chromosome?): 121,123 or 123,121?
- The problem compounds for linked loci:
    121|123    121|123    123|121
    142|144    144|142    142|144    ... (x 2³ = 8)
    115|119    115|119    115|119
- Finding which alleles occur together on the same chromosome for linked loci (the haplotypes) is called haplotyping. The alignment is called the phase.
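The 2³ = 8 count for three linked loci can be checked by listing every ordered assignment of alleles to the two chromosomes. This is a minimal sketch; the function name `enumerate_phasings` is hypothetical, and the allele values are the ones shown on the slide.

```python
from itertools import product


def enumerate_phasings(genotype):
    """Return every ordered (chromosome 1, chromosome 2) haplotype pair.

    genotype: list of (a, b) allele pairs, one pair per linked locus.
    """
    phasings = []
    # One flip decision per locus: keep the pair as given, or swap it.
    for flips in product([False, True], repeat=len(genotype)):
        hap1 = tuple(b if flip else a for (a, b), flip in zip(genotype, flips))
        hap2 = tuple(a if flip else b for (a, b), flip in zip(genotype, flips))
        phasings.append((hap1, hap2))
    return phasings


genotype = [(121, 123), (142, 144), (115, 119)]
phasings = enumerate_phasings(genotype)
print(len(phasings))
for hap1, hap2 in phasings:
    print(hap1, hap2)
```

The count doubles with every additional locus, which is why the phase cannot simply be guessed for longer haplotypes.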

Properties of Haplotypes
- Populations which do not inter-breed each develop a distinctive distribution of haplotypes.
- Haplotypes may eventually appear (due to mutation and/or crossover) that do not exist in any other population.
- Haplotypes give much more discerning power than alleles alone, since there are many possible haplotypes given a set of possible alleles at each locus.
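The gain in discerning power comes from multiplication: the number of possible haplotypes is the product of the allele counts at each locus, whereas the alleles taken one locus at a time only add up. A small sketch, with hypothetical allele counts chosen purely for illustration:

```python
import math


def count_haplotypes(alleles_per_locus):
    """Number of distinct haplotypes possible across the linked loci."""
    return math.prod(alleles_per_locus)


# Hypothetical example: three linked loci with 6, 4 and 5 known alleles.
alleles_per_locus = [6, 4, 5]
print(sum(alleles_per_locus))               # distinct single-locus alleles
print(count_haplotypes(alleles_per_locus))  # distinct possible haplotypes
```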

Haplotyping: A Cyclic Problem
- We could figure out the most likely phase for the alleles in a haplotype if we knew the haplotype distributions of the parent populations.
- We could figure out the haplotype distributions of the parent populations if we knew the correct phase of the alleles.

Haplotyping: A Cyclic Solution
- (1) First guess for phase probabilities: all equal (0.125)
- (2), (4), ... Estimate population haplotype probabilities based on the current estimate of phase probabilities
- (3), (5), ... Estimate phase probabilities based on the current estimate of population haplotype probabilities
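The alternating steps above can be sketched as a generic expectation-maximization style loop. This is not the presenters' algorithm, whose details the slides do not give. It assumes random mating, so that the weight of a phasing is the product of its two haplotype frequencies, and it treats the two chromosomes as ordered so that three heterozygous loci yield 8 equally weighted starting phasings. All names and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import product


def phasings(genotype):
    """All ordered (hap1, hap2) pairs consistent with an unphased genotype."""
    result = []
    for flips in product([False, True], repeat=len(genotype)):
        hap1 = tuple(b if f else a for (a, b), f in zip(genotype, flips))
        hap2 = tuple(a if f else b for (a, b), f in zip(genotype, flips))
        result.append((hap1, hap2))
    return result


def estimate_haplotypes(genotypes, iterations=20):
    """Alternate between haplotype-frequency and phase-probability estimates."""
    candidates = [phasings(g) for g in genotypes]
    # Step (1): first guess, every phasing of an individual equally likely.
    phase_probs = [[1.0 / len(c)] * len(c) for c in candidates]
    hap_freqs = {}
    for _ in range(iterations):
        # Steps (2), (4), ...: haplotype frequencies from phase probabilities.
        counts = defaultdict(float)
        for cand, probs in zip(candidates, phase_probs):
            for (h1, h2), p in zip(cand, probs):
                counts[h1] += p
                counts[h2] += p
        total = sum(counts.values())
        hap_freqs = {h: c / total for h, c in counts.items()}
        # Steps (3), (5), ...: phase probabilities from haplotype frequencies.
        phase_probs = []
        for cand in candidates:
            weights = [hap_freqs[h1] * hap_freqs[h2] for h1, h2 in cand]
            norm = sum(weights)
            phase_probs.append([w / norm for w in weights])
    return hap_freqs, phase_probs


# Toy data: two individuals homozygous at both loci, one heterozygous at both.
genotypes = [
    [(121, 121), (142, 142)],
    [(121, 121), (142, 142)],
    [(121, 123), (142, 144)],
]
hap_freqs, phase_probs = estimate_haplotypes(genotypes)
for hap, freq in sorted(hap_freqs.items(), key=lambda kv: -kv[1]):
    print(hap, round(freq, 4))
```

In this toy data the homozygous individuals make the (121, 142) haplotype common, which pulls the ambiguous individual toward the phasing that pairs (121, 142) with (123, 144). A fixed iteration count is used here for simplicity; a convergence check on the change in frequencies would be a natural replacement.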

Haplotyping: Results
- Convergence typically achieved in 3-7 iterations
- Difficult to judge accuracy since nobody knows how to get the true correct answer!
- Previous researchers’ attempts on simulated data: 50-60% accuracy
- Our algorithm on (different) simulated data: 97%
- Our algorithm on real data (accuracy measured by 'spiking' with CEPH data): 88-93%

How to Contact Us
Phone:
Mail: 788 WIDB BYU Provo, UT
Web: