EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Genome Wide Haplotype analyses of human.

Slides:



Advertisements
Similar presentations
Lecture 2 Strachan and Read Chapter 13
Advertisements

applications of genome sequencing projects
CZ5225 Methods in Computational Biology Lecture 9: Pharmacogenetics and individual variation of drug response CZ5225 Methods in Computational Biology.
Ivan P. Gorlov, Olga Y. Gorlova & Christopher I. Amos.
Major insights from the HGP on Nature (2001) 15 th Feb Vol 409 special issue; pgs 814 & )Gene content 2)Proteome content 3)SNP identification.
Ferdinand van ’t Hooft Cardiovascular Genetics and Genomics Group Karolinska Institutet, Stockholm, Sweden Genome-Wide Association Study GWAS
Association Mapping David Evans. Outline Definitions / Terminology What is (genetic) association? How do we test for association? When to use association.
1 Cladistic Clustering of Haplotypes in Association Analysis Jung-Ying Tzeng Aug 27, 2004 Department of Statistics & Bioinformatics Research Center North.
CS177 Lecture 9 SNPs and Human Genetic Variation Tom Madej
Dr. Almut Nebel Dept. of Human Genetics University of the Witwatersrand Johannesburg South Africa Significance of SNPs for human disease.
Single nucleotide polymorphisms and applications Usman Roshan BNFO 601.
Computational Tools for Finding and Interpreting Genetic Variations Gabor T. Marth Department of Biology, Boston College
MSc GBE Course: Genes: from sequence to function Genome-wide Association Studies Sven Bergmann Department of Medical Genetics University of Lausanne Rue.
The Extraction of Single Nucleotide Polymorphisms and the Use of Current Sequencing Tools Stephen Tetreault Department of Mathematics and Computer Science.
Single nucleotide polymorphisms Usman Roshan. SNPs DNA sequence variations that occur when a single nucleotide is altered. Must be present in at least.
Genotyping of James Watson’s genome from Low-coverage Sequencing Data Sanjiv Dinakar and Yözen Hernández.
Polymorphisms – SNP, InDel, Transposon BMI/IBGP 730 Victor Jin, Ph.D. (Slides from Dr. Kun Huang) Department of Biomedical Informatics Ohio State University.
Single nucleotide polymorphisms and applications Usman Roshan BNFO 601.
Large-Scale Copy Number Polymorphism in the Human Genome J. Sebat et al. Science, 305:525 Luana Ávila MedG 505 Feb. 24 th /24.
Modes of selection on quantitative traits. Directional selection The population responds to selection when the mean value changes in one direction Here,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Romanian SA1 report Alexandru Stanciu ICI.
Single Nucleotide Polymorphisms Mrs. Stewart Medical Interventions Central Magnet School.
A Primer on Genetic Variation Variety Lawrence Brody - NHGRI.
Case(Control)-Free Multi-SNP Combinations in Case-Control Studies Dumitru Brinza and Alexander Zelikovsky Combinatorial Search (CS) for Disease-Association:
SNPs Daniel Fernandez Alejandro Quiroz Zárate. A SNP is defined as a single base change in a DNA sequence that occurs in a significant proportion (more.
National Taiwan University Department of Computer Science and Information Engineering Haplotype Inference Yao-Ting Huang Kun-Mao Chao.
A single-nucleotide polymorphism tagging set for human drug metabolism and transport Kourosh R Ahmadi, Mike E Weale, Zhengyu Y Xue, Nicole Soranzo, David.
Biology 101 DNA: elegant simplicity A molecule consisting of two strands that wrap around each other to form a “twisted ladder” shape, with the.
CS177 Lecture 10 SNPs and Human Genetic Variation
SNPs and the Human Genome Prof. Sorin Istrail. A SNP is a position in a genome at which two or more different bases occur in the population, each with.
Gene Hunting: Linkage and Association
C Reactive Protein Coronary Heart Disease Genetics Collaboration BMJ 2011;342:d548.
Two RANTES gene polymorphisms and their haplotypes in patients with myocardial infarction from two Slavonic populations Two RANTES gene polymorphisms and.
National Taiwan University Department of Computer Science and Information Engineering Pattern Identification in a Haplotype Block * Kun-Mao Chao Department.
BGRS 2006 SEARCH FOR MULTI-SNP DISEASE ASSOCIATION D. Brinza, A. Perelygin, M. Brinton and A. Zelikovsky Georgia State University, Atlanta, GA, USA 123.
Genes in human populations n Population genetics: focus on allele frequencies (the “gene pool” = all the gametes in a big pot!) n Hardy-Weinberg calculations.
Julia N. Chapman, Alia Kamal, Archith Ramkumar, Owen L. Astrachan Duke University, Genome Revolution Focus, Department of Computer Science Sources
Prof. Dr. H. Schunkert Medizinische Klinik II UK S-H Campus Lübeck Genome-wide association for myocardial infarction.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE’09 SSC Workshop SAFE Project Proposal.
Genetic Testing Amniocentesis Until recently, most genetic testing occurred on fetuses to identify gender and genetic diseases. Amniocentesis is one technique.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks DSA1.4 – Objectives and Status Ioannis Liabotis.
The International Consortium. The International HapMap Project.
Design and Validation of Methods Searching for Risk Factors in Genotype Case- Control Studies Dumitru Brinza Alexander Zelikovsky Department of Computer.
Lectures 7 – Oct 19, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday 12:00-1:20 Johnson Hall.
Computational Biology and Genomics at Boston College Biology Gabor T. Marth Department of Biology, Boston College
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks User Support for Distributed Computing Infrastructures.
Linkage. Announcements Problem set 1 is available for download. Due April 14. class videos are available from a link on the schedule web page, and at.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Bioinformatics activity Christophe BLANCHET.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Nov. 18, EGEE and gLite are registered trademarks High-End Computing - Clusters.
Chemokine RANTES –403 G/A polymorphism in two Slavonic populations with myocardial infarction Tereshchenko IP 1,2, Petrkova J 2,3, Mrazek F 2, Navratilova.
Genome-Wides Association Studies (GWAS) Veryan Codd.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Astrophysical Cluster Session Claudio Vuerli,
Single Nucleotide Polymorphisms (SNPs
SNPs and complex traits: where is the hidden heritability?
Genetics of common complex diseases: a view from Iceland
X Chromosome Inactivation
Design of the Coronary ARtery DIsease Genome-Wide Replication And Meta-Analysis (CARDIoGRAM) StudyClinical Perspective by Michael Preuss, Inke R. König,
Arterioscler Thromb Vasc Biol
Introduction to bioinformatics lecture 11 SNP by Ms.Shumaila Azam
By Michael Fraczek and Caden Boyer
Haplotype Inference Yao-Ting Huang Kun-Mao Chao.
Exercise: Effect of the IL6R gene on IL-6R concentration
Haplotype Inference Yao-Ting Huang Kun-Mao Chao.
Figure 2 Association between coronary artery disease polygenic risk score and the presence of migraine Results are given as odds ratios with 95% confidence.
DNA methylation and body-mass index: a genome-wide analysis
Hunting for Celiac Disease Genes
SNPs and CNPs By: David Wendel.
The principles of genetic association
Haplotype Inference Yao-Ting Huang Kun-Mao Chao.
Presentation transcript:

EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Genome Wide Haplotype analyses of human complex diseases with the EGEE grid Tregouet David – INSERM UMRS937 – UPMC – Paris - France

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 2 Genome Wide Association Studies (GWAS) Testing the association between a large number (~500K) of single nucleotide polymorphisms (SNPs) and a variable of interest (e.g: a disease) in a large cohort of individuals Principle Estimate the SNP allele frequencies in cases and controls and calculate the corresponding statistical test yielding a pvalue How ? SNP definition Genetic variation in a DNA sequence that occurs when a single nucleotide (~ base: A,C,G,T ) in a genome is altered. Often considered as a binary 0/1 variable

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 3 GWAS' main limits Only single SNP associations are tested May miss 'haplotypic' interaction between SNPs located in the same gene (or region) –Haplotype: Combination of alleles on a given chromosome –For example, with 2 SNPs (C/T & G/A) → 4 haplotypes C G C A T G T A One may want to test for difference in haplotype frequencies between cases and controls It may happen that only one haplotype is at risk

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 4 Genome Wide Haplotype Analysis (GWHAS) Is it possible ? 2 SNPs : up to 4 haplotypes (i.e 00|01|10|11) 3 SNPs : up to 8 haplotypes (i.e 000|001|010|011|100|101|110|111) In a window (eg a gene or a region) of n SNPs, up to 2 n haplotypes a large number of tests / comparisons have to be carried out to identify which combination of SNPs is the best predictor for the disease ? Yes...but

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 5 Genome Wide Haplotype Analysis (GWHAS) Is it possible ? 2 SNPs : up to 4 haplotypes (i.e 00|01|10|11) 3 SNPs : up to 8 haplotypes (i.e 000|001|010|011|100|101|110|111) In a window (eg a gene or a region) of n SNPs, up to 2 n haplotypes Example: In a window of 10 adjacent SNPs, restricting the haplotypes of length 4 lead to 375 combinations to be tested: [SNP1 + SNP2] [SNP1 + SNP3] [SNP1 + SNP10] [SNP2 + SNP3] [SNP2 + SNP10] [SNP9 + SNP10] [SNP1 + SNP2 + SNP3] [SNP1 + SNP2 + SNP4] [SNP1 + SNP9 + SNP10] [SNP2 + SNP3 + SNP4] [SNP3 + SNP6 +SNP8] [SNP8 + SNP9 + SNP10] [SNP1 + SNP2 + SNP3 +SNP4] [SNP1 + SNP6 + SNP7 +SNP10] [SNP7 + SNP8 + SNP9 + SNP10]

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 6 Genome Wide Haplotype Analysis (GWHAS) GWHAS are possible but are extremely computationnally demanding !!!! Distribution of the haplotypic calculations on EGEE –Development of an easygLite interface –Python & Perl script for results ' visualization

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 7 GWHAS on Coronary Artery Disease (CAD) WTCCC data: 1926 CAD patients & 2938 healthy controls 378,000 SNPs Sliding windows approach on each chromosome Windows of size 10 Haplotype composed of up to 4 SNPs 1 to 102 to 113 to 12(n-10) to n..... Search for regions where haplotypes are stronger predictors of CAD risk than SNP alone

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 8 GWHAS on Coronary Artery Disease 8.1 millions of combinations tested in less than 45 days (instead of more than 10 years on a single Pentium 4) 29 regions where haplotypes could be better predictors than SNPs alone were identified To control for false positives, replication was investigated in about 7000 CAD patients and 7000 controls One region on chromosome 6 was confirmed

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 9 Nature Genetics doi: /ng.314

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 10 Conclusions Genome Wide Haplotype Association Studies are now a reality thanks to the use of Grid technology Using EGEE, we were able to identify a cluster of 3 genes where haplotypes are strongly associated with CAD risk (Tregouet et al. Nature Genetics March 2009) Possibility to apply such tool to other human diseases (Diabetes, Cancer....) Possibility to use EGEE to investigate interactions between SNPs that are not necesseraly in the same gene/region

Enabling Grids for E-sciencE EGEE-III INFSO-RI To change: View -> Header and Footer 11 Credits UMRS 937 François Cambien Alexandru Munteanu Laurence Tiret Claire Perret Nilesh Samani Heribert Schunkert Inke König Jeannette Erdmann Andreas Ziegler.... UMR 8623 LRI Cécile Germain