Mapping analysis software Dr Ian Carr PhD. MCSD. Leeds Institute of Molecular Medicine, St James's University Hospital
Autozygosity But! LA = local (common) ancestor LI = local inheritance
Autozygosity You only know part of the picture, and what you don't know can be more important than what you do know. DA = distant (common) ancestor DI = distant inheritance
Analysis New way: Send DNA off with £300 per sample. Wait three weeks. Stare at a million uninformative SNPs' worth of data and wonder what to do with it! Old way: Spend 1.5 years mapping a family with highly informative microsatellites. Analyse data as you go. Hope you find something!
AutoSNPa What is it: It's one big database which draws pretty pictures. There is no maths, because there is no complete knowledge of the system. Assumptions: All affecteds are consanguineous and have the same mutation and hence a common haplotype.
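The assumption above (every affected shares one homozygous haplotype) can be illustrated with a short sketch. This is not AutoSNPa's own code, which the slides describe as a database that draws pictures; the function name, the "AA"/"AB"/"BB" genotype coding and the `min_snps` threshold are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def shared_homozygous_runs(
    genotypes: Dict[str, List[str]], min_snps: int = 3
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP index ranges in which every
    affected individual is homozygous for the same allele.

    Assumed input: one list of calls per affected, in chromosome order,
    each call being "AA", "AB" or "BB" (hypothetical coding).
    """
    individuals = list(genotypes.values())
    n_snps = len(individuals[0])
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current shared run began, if any
    for i in range(n_snps):
        calls = {person[i] for person in individuals}
        # Shared only if everyone has the identical homozygous call.
        shared = len(calls) == 1 and next(iter(calls)) in ("AA", "BB")
        if shared and start is None:
            start = i
        elif not shared and start is not None:
            if i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that reaches the end of the chromosome.
    if start is not None and n_snps - start >= min_snps:
        runs.append((start, n_snps - 1))
    return runs
```

Adding a further affected individual or pedigree to `genotypes` can only shrink the returned runs, which is the narrowing effect the three pedigree slides show. The sketch ignores missing calls and genotyping errors.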
AutoSNPa: Pedigree one First family. Results: 135Mb region on chromosome 4. Outcome: Too many genes: Move on.
AutoSNPa: Pedigree two Two families. New results: 45Mb region on chromosome 4. Outcome: Still too many genes: Move on.
AutoSNPa: Pedigree three Three families. New new results: 4.5Mb region on chromosome 4. Outcome: 8 genes, one good candidate: Sequenced it and published.
The problem with AutoSNPa It requires a large family with multiple affected people who will give a DNA sample, or a number of families with the same founder mutation. In reality large families are as rare as hens' teeth and each family tends to have its own mutation.
IBDfinder What is it: It's another big database which draws pretty pictures. Again no maths. Assumptions: The affecteds are consanguineous and most have mutations in the same gene.
Molar pregnancies and IBDfinder Disease has social stigma, so no pedigree data. Most are unrelated to each other. 2 have mutations in a different gene. 2 have an IBD region of one SNP in the data set. [Figure: number of patients homozygous for the region, plotted from 19p-tel to 19q-tel]
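The plot described on this slide (number of patients homozygous at each position) can be approximated with a sketch like the one below. It is an illustration only, not IBDfinder's implementation; the function names, the "AA"/"AB"/"BB" coding and the `min_run` cut-off are assumptions. Unlike the shared-haplotype case, patients here do not need to be homozygous for the same allele, since each may carry a different mutation in the same gene.

```python
from typing import Dict, List


def homozygous_run_mask(calls: List[str], min_run: int) -> List[bool]:
    """Mark SNPs lying inside a run of at least min_run consecutive
    homozygous calls (any call other than "AB" counts as homozygous)."""
    mask = [False] * len(calls)
    start = 0  # first index of the current homozygous stretch
    for i in range(len(calls) + 1):
        at_end = i == len(calls)
        if at_end or calls[i] == "AB":
            if i - start >= min_run:
                for j in range(start, i):
                    mask[j] = True
            start = i + 1
    return mask


def patients_homozygous_per_snp(
    genotypes: Dict[str, List[str]], min_run: int = 3
) -> List[int]:
    """Count, for each SNP, how many patients sit in a long homozygous run."""
    n_snps = len(next(iter(genotypes.values())))
    counts = [0] * n_snps
    for calls in genotypes.values():
        for i, inside in enumerate(homozygous_run_mask(calls, min_run)):
            if inside:
                counts[i] += 1
    return counts
```

A peak in the returned counts marks a region that many patients are homozygous for, even when a few patients (such as those with mutations in a different gene) do not contribute to it.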
Milk drinkers and IBDfinder The ability of adults to drink milk is relatively new and there are only a few genotypes that have the phenotype. Therefore most of us are homozygous across the LCT gene on chromosome 2.
Problems with IBDfinder DNA from affecteds is not always easy to come by.
SAMPLE Shadow Autozygosity MaPping by Linkage Exclusion What is it: A program that finds disease genes without the DNA of an affected patient, using only DNA from the parents and siblings of affecteds. Assumptions: An inbred family is 3 times more likely to have an unaffected kid than an affected one, and none of the unaffected kids will be homozygous for the disease-causing allele.
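One way to read the exclusion logic is sketched below for a single SNP in a single nuclear family. It is a simplified illustration and not SAMPLE's actual algorithm, which the slides do not spell out. It assumes both parents are obligate carriers of the same ancestral disease haplotype, genotypes are coded "AA"/"AB"/"BB", and there are no genotyping errors; the function name is hypothetical.

```python
from typing import Iterable


def snp_excluded(father: str, mother: str, unaffected_children: Iterable[str]) -> bool:
    """Return True if this SNP cannot lie in the autozygous disease region.

    Assumptions (not taken from the slides): both parents carry the same
    ancestral disease haplotype, and no unaffected child is homozygous
    for it.
    """
    homozygous = ("AA", "BB")
    # Parents homozygous for opposite alleles cannot share one
    # disease haplotype at this SNP.
    if father in homozygous and mother in homozygous and father != mother:
        return True
    if father == "AB" and mother == "AB":
        kids = set(unaffected_children)
        # With two heterozygous parents, an unaffected AA child rules out
        # A as the disease allele and an unaffected BB child rules out B.
        # Seeing both leaves no candidate allele.
        return "AA" in kids and "BB" in kids
    # Every other combination is uninformative in this simple sketch.
    return False
```

Because each SNP is judged on its own, a sketch like this shares the limitation noted for SAMPLE: it does not use extended haplotypes, so regions that survive exclusion still need checking by other means.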
Meckel-Gruber Syndrome (MKS3) DNA available from individuals with yellow symbols. No data from affected individuals.
SAMPLE test data SAMPLE excludes most of the genome (~98%) and the remaining regions can be checked using microsatellites.
Problems with SAMPLE All the pedigrees have to have a mutation in the same gene. It works at the level of individual SNPs and does not consider extended haplotypes.
Phaser What is it: A program that uses logic to determine the phase of the genotypes of the SNPs on each chromosome. It can then calculate how autozygous each person is and how related one pedigree is to another, and find common haplotypes in affecteds. Requirements: It needs SNP data for parents and at least two children, and ideally a number of pedigrees.
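The kind of logic involved in phasing can be shown with the simplest case: one SNP in one parent-parent-child trio. This is a minimal sketch under Mendelian inheritance, not Phaser's actual rules; the function name and the two-letter genotype strings are assumptions for the example.

```python
from typing import Optional, Tuple


def phase_child_snp(father: str, mother: str, child: str) -> Optional[Tuple[str, str]]:
    """Return (paternal_allele, maternal_allele) for the child at one SNP,
    or None when the phase is ambiguous or the trio is inconsistent.

    Genotypes are two-letter strings such as "AA", "AB" or "BB".
    """
    a, b = child[0], child[1]
    if a == b:
        # Homozygous child: the allele must be present in both parents.
        return (a, a) if a in father and a in mother else None
    # Heterozygous child: try both parental assignments and keep the
    # one that both parents could have transmitted.
    options = [(p, m) for p, m in ((a, b), (b, a)) if p in father and m in mother]
    return options[0] if len(options) == 1 else None
```

When all three individuals are heterozygous the function returns None, because a single trio cannot resolve that SNP. Information from neighbouring SNPs and further children is what makes such positions tractable, which fits the stated requirement for at least two children.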
Meckel-Gruber Syndrome (MKS3) Phaser identifies segments of chromosomes present in individuals, allowing the user to analyse dominant and recessive diseases.
Degree of relatedness
By knowing how related two pedigrees are, it is possible to judge how likely they are to have a common haplotype.
The problem with Phaser It has not been tested exhaustively and so may not work!
Sequence analysis Sanger sequencing mutation detection. Next generation clonal sequencing mutation detection.
Genescreen Rapid detection and annotation of sequence variants.
Annotation of simple mutations Single base mutations are automatically annotated with genomic, cDNA and protein information.
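As an illustration of what annotating a single base change involves, the sketch below derives cDNA-level and protein-level descriptions in an HGVS-like style from a coding sequence. It is not Genescreen's code or output format; the function name, the one-letter amino acid notation and the input conventions are assumptions. Genomic coordinates are omitted because they would need exon positions that are not given here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def annotate_substitution(cds: str, pos: int, alt: str) -> str:
    """Describe a single base substitution at 1-based position pos of a
    coding sequence (start codon first, no introns or UTRs)."""
    cds, alt = cds.upper(), alt.upper()
    ref = cds[pos - 1]
    if alt == ref:
        raise ValueError("alternative base equals the reference base")
    codon_index = (pos - 1) // 3          # 0-based codon number
    codon_start = codon_index * 3
    ref_codon = cds[codon_start:codon_start + 3]
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete final codon")
    offset = (pos - 1) % 3                # position within the codon
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    # "=" marks a synonymous change; "*" is a stop codon.
    change = "=" if ref_aa == alt_aa else alt_aa
    return f"c.{pos}{ref}>{alt} p.{ref_aa}{codon_index + 1}{change}"
```

For example, with the toy coding sequence `ATGGCCTGA`, `annotate_substitution("ATGGCCTGA", 5, "T")` describes a change in the second codon from alanine to valine.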
Annotation of complex mutations Heterozygous indels are deconvoluted and annotated. This window also annotates indels and homozygous insertions and deletions.
Exporting data Plain text, LOVD import file or a web page. The web page is updatable and so acts as both a data display and a database.
Clonal sequencing Nothing lasts forever, so the current sequencing project is to create a program that analyses Illumina sequence data. At the moment the base program analyses data at a rate of 3.6 billion bases an hour or 320Mb of data a minute.
Underlying data for a heterozygous base change
Underlying data for a heterozygous base pair insertion
All released programs can be obtained from: