PRIORITIZING REGIONS OF CANDIDATE GENES FOR EFFICIENT MUTATION SCREENING.

Outline
- Abstract
- Background
- Materials and Methods
- Results
- Discussion
- Conclusion

Abstract
- The complete sequence of the human genome has altered the search process for disease-causing mutations
- Previously, mostly rare diseases were studied, and analyzing the data took years
- Now the rate-limiting step is screening patients and interpreting results
- The study tests the hypothesis that disease-causing mutations are not uniformly distributed and can be predicted bioinformatically
- The authors developed the prioritization of annotated regions (PAR) technique

Abstract
- Tested by analyzing 710 genes with 4,498 previously identified mutations
- Nearly 50% of disease-associated genes were found after analyzing only 9% of the complete coding sequence
- PAR identified 90% of genes as containing at least one mutation using less than 40% of screening resources

Background
- When screening for mutations, researchers usually focus on the coding sequence
- This alone is not always enough to show a relationship between mutation and disease
  - Ex. Age-related macular degeneration
- Today's techniques:
  - Single strand conformational polymorphism analysis (SSCP)
  - Denaturing high-performance liquid chromatography
  - Automated DNA sequencing

Background
- SSCP
  - Compares conformational differences in strands of DNA of the same length (1)
- Denaturing high-performance liquid chromatography
  - Compares two or more chromosomes as a mixture of denatured and reannealed PCR amplicons, revealing the presence of a mutation by the differential retention of homo- and heteroduplex DNA on reversed-phase chromatography supports under partial denaturation (2)

Background
- Through their own work, the authors found that disease-causing variations are not uniformly distributed throughout the sequence
  - Ex. Bardet-Biedl syndrome: restrict to patients with retinitis pigmentosa and ulnar polydactyly
- Disease-causing mutations more likely lie in structural and functional regions

Materials and Methods
- List of 710 genes obtained via OMIM
- Cross-referenced with transcripts in Ensembl Release NCBI31
- Gene structure and annotated protein domains obtained from Ensembl
- Information on mutation locations obtained from OMIM
- Secondary structure prediction performed by nnPredict

Materials and Methods
- x = nucleotide position
- W_s = PAR window size
- N_x = number of distinct annotation elements
- W(i) = PAR window function
- A_f(x,j) = annotation function for the jth annotation at the xth position
- A_s(x,j) = annotation score for the jth annotation at the xth position
- A_o(x,j) = annotation scalar offset
- A_m(j) = annotation multiplier for the jth annotation feature
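The symbols above suggest a window-weighted sum of per-annotation scores. The PAR equation itself is not present in this transcript, so the following is only a minimal sketch under an assumed form: A_f(x,j) = A_m(j) * (A_s(x,j) + A_o), with PAR(x) summing W(i) * A_f(x+i,j) over a window of W_s positions and all N_x annotations. The function names, the uniform window, the per-annotation (rather than per-position) offset, and the toy tracks are all illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, Sequence


def uniform_window(i: int) -> float:
    """Assumed window function W(i): equal weight at every offset."""
    return 1.0


def par_scores(
    scores: Sequence[Sequence[float]],  # scores[j][x] stands in for A_s(x, j)
    offsets: Sequence[float],           # assumed constant offset per annotation (A_o)
    multipliers: Sequence[float],       # A_m(j)
    window_size: int,                   # W_s
    window: Callable[[int], float] = uniform_window,
) -> list[float]:
    """Window-weighted sum of annotation values at every nucleotide position."""
    n_annotations = len(scores)         # N_x
    length = len(scores[0])
    half = window_size // 2

    # Assumed annotation function: A_f(x, j) = A_m(j) * (A_s(x, j) + A_o(j))
    a_f = [
        [multipliers[j] * (scores[j][x] + offsets[j]) for x in range(length)]
        for j in range(n_annotations)
    ]

    par = []
    for x in range(length):
        total = 0.0
        for i in range(-half, window_size - half):
            pos = x + i
            if 0 <= pos < length:       # positions outside the transcript add nothing
                for j in range(n_annotations):
                    total += window(i) * a_f[j][pos]
        par.append(total)
    return par


# Toy annotation tracks (invented): 1 where the feature covers the position.
domain = [0, 0, 1, 1, 1, 0, 0, 0]
helix = [0, 0, 0, 1, 1, 1, 0, 0]
values = par_scores([domain, helix], offsets=[0.0, 0.0],
                    multipliers=[2.0, 1.0], window_size=3)
print(values)
```

In this toy case the score peaks where the two annotations overlap, which is the behavior a prioritization score of this kind is meant to have.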

Materials and Methods
- Computing PAR values is impractical to perform manually for every gene in a candidate set
- Figure: graphic representation of the gene structure of the EFEMP1 gene and the corresponding PAR values

Materials and Methods
- Regions that maximized the PAR function were identified in each gene
- Primer pair positions were selected, consistent with the default parameters of Primer3, until at least one mutation was flanked
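Finding the region that maximizes the PAR function can be sketched as a sliding-window search. This is a hedged illustration only: it assumes a fixed region length and sums per-position PAR values, whereas the actual primer placement followed Primer3's default parameters, which are not modeled here. The function name `best_region` and the toy values are hypothetical.

```python
def best_region(par, region_length):
    """Return (start, end, total) for the fixed-length window with the highest summed PAR.

    `end` is exclusive. Ties keep the leftmost window.
    """
    if region_length <= 0 or region_length > len(par):
        raise ValueError("region_length must be between 1 and len(par)")

    current = sum(par[:region_length])
    best_start, best_total = 0, current
    for start in range(1, len(par) - region_length + 1):
        # Slide the window by one: add the entering value, drop the leaving one.
        current += par[start + region_length - 1] - par[start - 1]
        if current > best_total:
            best_start, best_total = start, current
    return best_start, best_start + region_length, best_total


# Toy per-position PAR values (invented for illustration).
par = [0, 2, 5, 8, 7, 4, 1, 0]
region = best_region(par, 3)
print(region)
```

Repeating the search after masking the chosen window would give the second-best region, and so on, which is one way to rank several regions per gene.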

Materials and Methods
- Other methods used for comparison
  - Serial
    - Generates minimally overlapping primer pair positions for each exon with the same PCR product size requirements
    - Models the traditional screening approach
    - Examines the complete coding sequence
  - Random
    - Selects a region from any transcript without replacement
    - Continues to select with minimal overlap
- Complete screening with laboratory information management system (LIMS)
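The two comparison methods can be approximated in a few lines. This is a simplified sketch, not the procedure used in the study: exons are assumed to be half-open (start, end) coordinate pairs, amplicons are tiled back to back with no overlap or primer margins, and the random method is modeled as visiting the same tiles in shuffled order. `serial_amplicons`, `random_order`, and `max_product` are hypothetical names.

```python
import random


def serial_amplicons(exons, max_product):
    """Tile every exon with back-to-back amplicons no longer than max_product."""
    amplicons = []
    for start, end in exons:
        pos = start
        while pos < end:
            stop = min(pos + max_product, end)
            amplicons.append((pos, stop))
            pos = stop
    return amplicons


def random_order(amplicons, seed=0):
    """Visit each amplicon exactly once in random order (selection without replacement)."""
    rng = random.Random(seed)
    shuffled = list(amplicons)
    rng.shuffle(shuffled)
    return shuffled


# Toy transcript with two exons (coordinates invented).
exons = [(0, 250), (400, 1000)]
serial = serial_amplicons(exons, max_product=300)
shuffled = random_order(serial)
print(serial)
```

Both orderings eventually cover the complete coding sequence; they differ only in the order in which regions are screened, which is what the efficiency comparison measures.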

Results - Efficiency
- PAR
  - Found 90% of mutations with 60% coverage
- Serial
  - Linear: 90% of mutations at 90% coverage, 100% at 100%
- Random
  - Fell short of identifying 100% of mutations
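The efficiency comparison amounts to asking how much sequence must be screened, in a given order, before a target fraction of mutations has been found. The sketch below shows that bookkeeping on invented toy data; the function name `coverage_needed` and the region lists are assumptions for illustration and do not reproduce the study's data.

```python
def coverage_needed(regions, total_length, total_mutations, target):
    """Fraction of sequence screened when `target` fraction of mutations is first reached.

    `regions` is an ordered list of (length, mutations_found) pairs.
    Returns None if the target is never reached.
    """
    screened = 0
    found = 0
    for length, hits in regions:
        screened += length
        found += hits
        if found / total_mutations >= target:
            return screened / total_length
    return None


# Toy data: five equal-length regions holding 10 mutations in total.
prioritized = [(100, 5), (100, 3), (100, 1), (100, 1), (100, 0)]
unprioritized = list(reversed(prioritized))

early = coverage_needed(prioritized, 500, 10, 0.9)
late = coverage_needed(unprioritized, 500, 10, 0.9)
print(early, late)
```

Screening mutation-rich regions first reaches the target at a lower coverage, while an order that leaves them for last needs the whole sequence.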

Results – Figure 2
- PAR
  - 819 mutations were identified in 350 distinct genes using a single best PAR-selected region per gene
  - This corresponds to 18% of mutations in approximately half the transcripts
  - Of 1,908,911 nucleotides, PAR selected only 168,980
  - At least one mutation was identified in 50% of genes with only 9% of the total transcript sequence screened
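The percentages on this slide follow directly from the counts reported in the presentation (710 genes and 4,498 mutations from the abstract; 819 mutations, 350 genes, and the two nucleotide totals from this slide). A short check of the arithmetic, with variable names chosen here for illustration:

```python
total_genes = 710
total_mutations = 4498
found_mutations = 819
genes_with_hit = 350
total_nucleotides = 1_908_911
selected_nucleotides = 168_980

mutation_fraction = found_mutations / total_mutations        # share of mutations found
gene_fraction = genes_with_hit / total_genes                 # share of genes with a hit
sequence_fraction = selected_nucleotides / total_nucleotides # share of sequence screened

print(f"mutations found:   {mutation_fraction:.1%}")
print(f"genes with a hit:  {gene_fraction:.1%}")
print(f"sequence screened: {sequence_fraction:.1%}")
```

The results round to the slide's figures: about 18% of mutations, about half of the genes, and about 9% of the sequence.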

Results – Figure 3
- Serial
  - Linear relationship between screening resource utilization and number of genes
- PAR
  - Identified 90% of genes with a 60% reduction in screening resources
  - When only one primer pair in each transcript was evaluated, nearly 40% of transcripts were found to contain at least one mutation

Discussion
- History of genetic screening
  - PCR
  - Lengthy clinical work
  - Therefore, the entire coding sequence was always evaluated in all patients
  - Explains the current use of serial screening

Discussion
- Changes
  - More common diseases being analyzed
    - More available patients
  - Availability of genomic sequence
    - Develop a PCR-based assay in less than a day with algorithms
  - More involvement from other professions (engineers, statisticians)
    - Supply tools to keep track of experiments
  - Realization that many disease-causing mutations do not affect coding sequences

Discussion
- Advantages of PAR
  - Effective use of gene annotation
    - Prioritizes gene segments for screening
    - Conservation of protein structure
  - Focus on gene segments vs. entire gene
    - The likelihood of finding a disease-causing variation in a gene falls with each exon screened with no positive result
    - The serial approach screens all exons no matter what
    - PAR screens a section with an average chance of finding a mutation

Conclusion
- Consideration of parameters resulted in significantly more discoveries per unit of effort
- The algorithm can be easily modified and expanded
- Most useful for a large number of candidate genes in a large number of patients
- Select the best two or four regions in each candidate gene
- Screen all of them as the initial screening strategy
- Base additional screening on findings from the first round and the PAR algorithm
- The PAR approach is clearly preferable to serial screening

References
(1) "Single Strand Conformation Polymorphism." Wikipedia. 28 May Sept
(2) "Single Strand Conformation Polymorphism." Wikipedia. 28 May Sept