
1 Something related to genetics? Dr. Lars Eijssen

2 Bioinformatics to understand studies in genomics – São Paulo – June 9-11 2011 2 Image: http://www.bio.georgiasouthern.edu

3 Contents: 1. Basics of genetic variation 2. Technology to measure variation 3. Linking SNPs to traits

4 Part 1: Basics of genetic variation

5 Variations in genes. Only about 0.1% of the bases differ between two individuals (DNA 1 vs. DNA 2)! These differences affect unique traits, but also susceptibility to disease.

6 Effects of variations. Variations can be: –Harmless –Harmful –Latent. A variation is called a mutation if a disadvantageous, disease-related effect has been proven.

7 Basic types of genetic variation

8 Definition. A SNP (single nucleotide polymorphism) is a single base change in a DNA sequence that occurs in a significant proportion (more than 1 percent) of a large population –SNPs occur about once every 500-1000 bases –Currently, dbSNP at NCBI (build 132) has about 6.9M human SNPs (4.5M validated)

9 Variations other than SNPs. Larger variations: –Hypervariable regions. Repeat length polymorphism: –Differences in the number of repeats within a repetitive sequence, e.g. ATATATATAT / ATATATATATATATATATAT / ATATATATATATAT

10 Alleles – Genotypes – Inheritance

11 Red dominant – Green recessive

12 Green dominant – Red recessive

13 A gene can have more than 2 alleles

14 Allele frequency: 55%, 35%, 10%
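As a rough illustration of how allele frequencies such as those on slide 14 could be obtained, the sketch below counts alleles in a list of diploid genotypes. The function name, the allele labels A/B/C and the ten sample genotypes are all hypothetical; they were chosen only so that the result reproduces the 55% / 35% / 10% split.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return the relative frequency of each allele in a list of diploid genotypes."""
    # Every individual contributes two alleles to the pool.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical sample: 10 individuals (20 alleles) for a gene with three alleles.
sample = [
    ("A", "A"), ("A", "A"), ("A", "A"),
    ("A", "B"), ("A", "B"), ("A", "B"), ("A", "B"),
    ("A", "C"), ("B", "B"), ("B", "C"),
]

frequencies = allele_frequencies(sample)
for allele in sorted(frequencies):
    print(allele, f"{frequencies[allele]:.0%}")
```

Note that the frequencies are computed over alleles, not over individuals, so a homozygous individual counts twice for the same allele.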

15 Definition. Penetrance: the proportion of people with a certain genotype who also develop the associated phenotype. Red: 75%

16 Haplotype: the combination of alleles (SNPs) one has

17 Recombination and cross-over

18 Recombination and cross-over: haplotype block

19 Types of SNP in a gene. Figure labels: Gene, Exon, Intron, Coding SNP, Non-coding SNP

20 (Coding) SNPs in a protein. Figure labels: mRNA, Protein, Coding SNP (Synonymous SNP, Non-synonymous SNP, Truncating SNP)

21 Effect of SNPs on protein composition
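The three kinds of coding SNP named on slide 20 can be told apart by translating the codon before and after the base change. The following is a minimal sketch, assuming the standard genetic code and DNA codons on the coding strand; the function name and the example codons are illustrative and not taken from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snp(codon, position, new_base):
    """Classify a single base change inside a codon (position is 0, 1 or 2)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"       # same amino acid, protein unchanged
    if after == "*":
        return "truncating"       # new stop codon, protein is cut short
    return "non-synonymous"       # different amino acid (or a lost stop codon)

print(classify_coding_snp("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
print(classify_coding_snp("GAG", 1, "T"))  # GAG (Glu) -> GTG (Val)
print(classify_coding_snp("GAA", 0, "T"))  # GAA (Glu) -> TAA (stop)
```

Whether a non-synonymous change actually matters for protein function is a separate question that this classification does not answer.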

22 SNPs in NCBI (Entrez SNP)

23 Functional effects of SNPs. When an amino acid (AA) changes, is the change relevant? –Type of AA –Site of the change –Functional domain –Conservation in other species –Truncating mutation http://avonapbio.pbworks.com

24 Are non-coding SNPs relevant? Non-coding SNPs may also have an effect: –Effect on target sites of Transcription Factors (regulation of transcription) –Effect on target sites of miRNAs (regulation of transcript decay) –Effect on splice donor or acceptor sites (regulation of – alternative – splicing) –Other…

25 Non-genetic variations. Apart from variations in the sequence of the genes, other heritable variations occur –These are called epigenetic variations

26 Part 2: Technology to measure variation

27 Sanger sequencing: terminates the chain by incorporation of a ddNTP (dideoxynucleotide) http://www.mrc-lmb.cam.ac.uk

28 Pyrosequencing: detects the pyrophosphate released when a nucleotide is incorporated (converted to light). Images from: http://www.har.mrc.ac.uk (left) and http://www.ercim.eu (right)

29 Large scale measurement of SNPs. Affymetrix SNP chip: 500,000 or 1M SNPs. Genome wide study of SNPs. Data analysis? (SM Carr et al. 2008. Comp. Biochem. Physiol. D, 3:11)

30 (after SM Carr et al. 2008. Comp. Biochem. Physiol. D, 3:11) http://www.mun.ca/biology/scarr/DNA_Chips.html

31 Resequencing chips. Another type of chip allows sequencing genes or genomic regions of interest –Similar technology –One can design the chips depending on the genes of interest –As such one can measure all known mutations related to a disease

32 Sequencing the whole genome. Next Generation Sequencing (NGS) has made it possible to sequence the whole genome of an organism –In principle, all variations between individuals can be determined –Methodological details will not be covered in this course (several platforms available) –In any case: massive amounts of data are generated (Gbs per sample) http://seqanswers.com

33 Sequencing the whole genome. Data analysis is not that easy: –Aligning –Calling (‘peak’ calling) –Real changes or sequencing errors? Error file. Same issue with ‘regular’ sequencing, but there one can evaluate by eyesight –How many fold coverage is needed? http://seqanswers.com http://www.genomics.agilent.com
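The coverage question on slide 33 can be made concrete with simple arithmetic: mean fold coverage is the total number of sequenced bases divided by the genome size. The sketch below is only a back-of-the-envelope aid. The genome size, read length and read count are hypothetical round numbers, and the estimate of uncovered bases assumes reads land uniformly at random (a Poisson idealisation that real data only approximate).

```python
import math

def mean_coverage(n_reads, read_length, genome_size):
    """Mean fold coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size

def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)

def expected_uncovered_fraction(coverage):
    """Fraction of bases with zero reads, assuming Poisson-distributed depth."""
    return math.exp(-coverage)

# Hypothetical numbers: a 3 Gb genome sequenced with 100-base reads.
GENOME_SIZE = 3_000_000_000
READ_LENGTH = 100

print(mean_coverage(900_000_000, READ_LENGTH, GENOME_SIZE))
print(reads_needed(30, READ_LENGTH, GENOME_SIZE))
print(expected_uncovered_fraction(5))
```

Higher coverage makes it easier to separate real changes from sequencing errors, because a true variant should appear in many independent reads while a random error usually appears in only one.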

34 Part 3: Linking SNPs to traits

35 SNPs as markers. SNPs close to a particular gene act as genetic polymorphic markers for that gene –No functional connection needed!

36 More on markers. The more variation the better: –Equally likely alleles –SNPs with more than two alleles –Repeat length variations –Longer variable sequences of DNA. SNPs are still very useful: –Abundant and easy to measure on a large scale

37 SNP maps Example:

38 SNP profiles: personalized medicine, personalized nutrition

39 Trait. A trait is just a characteristic –Length, weight, eye color, sex, … Traits can be discrete (sex, …) or continuous (weight, …). Discrete = ‘qualitative’. Continuous = ‘quantitative’. http://phe.rockefeller.edu

40 Heritability: often MISinterpreted. The heritability of a trait means how much of its variation can be explained by genetic variation –…in the population in which it is measured –Thus high heritability does not mean that the trait is genetically determined in general –It only tells whether the population is informative to study genetic contributions to a phenotype
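The point that heritability depends on the population in which it is measured can be shown with a small calculation. The sketch below assumes the simplest possible model, in which phenotypic variance is the sum of a genetic and an environmental component with no interaction or covariance between them; the function name and the variance values are hypothetical.

```python
def broad_sense_heritability(genetic_variance, environmental_variance):
    """Share of phenotypic variance explained by genetic variance.

    Assumes phenotypic variance = genetic variance + environmental variance.
    """
    return genetic_variance / (genetic_variance + environmental_variance)

# Same (hypothetical) genetic variance, measured in two different populations.
uniform_environment = broad_sense_heritability(4.0, 1.0)
varied_environment = broad_sense_heritability(4.0, 12.0)

print(uniform_environment)  # high: environment hardly varies here
print(varied_environment)   # low: environment varies a lot here
```

The genetic contribution is identical in both cases; only the amount of environmental variation differs, which is why a heritability estimate cannot be transferred from one population to another.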

41 Images: various sources. Heritability > Heritability

42 Genome wide association studies. GWAS (‘association’) tries to link SNPs to traits (diseases) in a genome wide way. Makes use of unrelated individuals –So no family members. Tries to find which allelic variants correlate with the phenotype of interest. If a complete haplotype goes together with the phenotype, this is considered association
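One common way to test whether an allelic variant correlates with a phenotype is a chi-square test on allele counts in cases versus controls. The slides do not say which test is used, so the following is only a sketch of that one option, written without external libraries; the function name and the counts are hypothetical.

```python
import math

def allelic_chi_square(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Each argument is a pair: (count of allele 1, count of allele 2).
    Returns the chi-square statistic and its p-value.
    """
    table = [case_counts, control_counts]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical allele counts for one SNP in 50 patients and 50 controls.
chi2, p_value = allelic_chi_square((60, 40), (40, 60))
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

In a genome wide study this test is repeated for every SNP on the chip, so the significance threshold has to be corrected for the very large number of tests.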

43 Figure: haplotypes of Patient 1 to Patient 6 and Control 1 to Control 3. Each color indicates a different haplotype in the study population. Region of interest (determine in more detail, or check genes it contains)

44 http://www.htbiology.com

45 Linkage studies. Linkage makes use of related individuals –family members. The advantage is higher power compared to GWAS. But one needs large enough families with enough (informative) ‘cross overs’ and preferably several generations. The principle is the same as with GWAS, using markers or SNPs

46 http://www.molvis.org

47 Computations. The linkage disequilibrium (D or LD) indicates the deviation of a haplotype’s frequency from its expected frequency. The LOD score (log10 of the odds) indicates the likelihood of obtaining the data given that the loci are indeed linked, versus obtaining the data by chance –A score higher than 3 (which means 1000:1 odds) is considered evidence of linkage
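Both quantities on slide 47 can be computed in a few lines. The sketch below uses the textbook two-locus definition D = p(AB) - p(A)·p(B) and the simple two-point LOD score for phase-known meioses; the slides do not spell out formulas, so these specific forms, the function names and the example numbers are assumptions made for illustration.

```python
import math

def linkage_disequilibrium(p_ab, p_a, p_b):
    """D: observed haplotype frequency minus the frequency expected
    if alleles A and B were independent."""
    return p_ab - p_a * p_b

def lod_score(recombinants, non_recombinants, theta):
    """Two-point LOD score for phase-known meioses.

    theta is the assumed recombination fraction, with 0 < theta <= 0.5;
    theta = 0.5 corresponds to unlinked loci.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    n = recombinants + non_recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1 - theta))
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked

# Hypothetical haplotype AB seen at 40% while A and B are each at 50%.
print(linkage_disequilibrium(0.40, 0.50, 0.50))

# Hypothetical family data: 1 recombinant among 20 informative meioses.
print(lod_score(1, 19, theta=0.05))
```

Because the score is a base-10 logarithm, a LOD of 3 corresponds to odds of 10^3 = 1000:1 in favour of linkage.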

48 Limitations. A very large sample size is needed but also population uniformity –Trade-off. To most common diseases, many SNPs/genes contribute for a few percent each –Difficult to detect. Often many genes in haplotype blocks. Rare alleles make sampling even more difficult, so there is often discrepancy even between large studies

49 What’s more? Always realise that the SNPs linked to the phenotype are not (necessarily) the functional or causal SNPs; they are just close enough to be markers. So far we only discussed genetic contributions to a phenotype. Other aspects to study: –Genes modifying the effects of other genes (epistasis) –Gene-environment interactions. Specific study of interactions is very difficult: –Even more possibilities –Even smaller effects. Images: http://theosophical.wordpress.com (left) and http://www.foodfacts.info (right)

50 THANK YOU! Questions?

