Marjana Westergren, Tine Grebenc. MOLECULAR MARKERS


1 Marjana Westergren, Tine Grebenc

2 MOLECULAR MARKERS. Molecular marker = a fragment of DNA that is associated with a certain location within the genome or with another characteristic of an organism. Heritable DNA sequence differences (polymorphisms); phenotypically neutral, developmentally and environmentally stable; detectable. Level of resolution?

3 DNA extraction. Fungal cell components: nucleic acids; lipids; proteins; sugars and other water-soluble components. Lysis of cells and cell walls (buffer, detergent, 65 ºC); precipitation of proteins and removal of lipids (chloroform + ethanol / phenol); physical separation of cell walls and the water-phase solution; precipitation in 2-propanol; washing of DNA and resuspending; storage (4 ºC or -80 ºC). Sample in buffer; organic solvent; sedimentation; removal of water phase; sedimentation; resuspending of DNA. Kits: Manual procedure:

4 PCR – Polymerase Chain Reaction. PCR – the principle. Nuclear ribosomal ITS region – commonly used in identification and phylogeny. http://www.mun.ca/biology/scarr/PCR_sketch_3.gif

5 DNA sequencing - "reading" of the nucleotide sequence. Sequencing principle: http://www.newscientist.com/... Analysis of raw sequence (above): commercial programs: Sequencher (demonstration), Sequencing Analysis software (Applied Biosystems); freely available: FinchTV. Output: sequence in "FASTA" format (= text format).

6 "FASTA" format (= text format):
>Unknown sample
CATTACCAATATCTGGGATGCCAAAGACACAGGCTCCCGATAAAACACATTTATGCGTATCCTCCCATGTTGCTTTCCCAGGCCAGCGGCCACTGCTGCCAGC
CATGCCGTTTTTCGGTTACATGGTTGAGGTGCTTGGGGAAGGGCTAATTATCAAACTTTACTTCACCTTATTGTCTGAGAAGGCCATGTGCCGTAATCTTTAAA
CATGTTAAAACTTTCAACAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAA
TCTTTGAACGCACATTGCGCCCTTTGGTATTCCTTAGGGCATGCCTGTTCGAGCGTCGCAAAAACCCAGATCACCTAGAGTGTGGTATTGGCAGAAGTGGCC
GGGGCTATCAGCGCTGCTGCCACTCTGCTGGAATGAATAGGCTGGAAAAGTAGATCATAGCAACAGACTTTCACAGTATTTTGAAATGCTAAATTAGTTTGAAG
CTGATCGGAACCTAAGCCATTTGACCCCCATCCTGCGTAAAGCAGTAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGAACTTAAGCATAT
A base-to-base comparison of the nucleotide sequence with available databases: http://www.ncbi.nlm.nih.gov/ (general, for all organisms and markers); https://unite.ut.ee/ (only for the ITS region in fungi)
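Because FASTA is plain text (a header line starting with ">" followed by one or more sequence lines), it can be read with very little code. The sketch below is a minimal illustration, not part of the original slides; the function name `parse_fasta` and the shortened example record are assumptions made for the example.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # sequence lines belonging to the current record are concatenated
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


# Shortened, illustrative record (first bases of the sequence on the slide)
example = """>Unknown sample
CATTACCAATATCTGG
GATGCCAAAGACACAG
"""
records = parse_fasta(example)
print(len(records["Unknown sample"]), "bases read")
```

The resulting string is what would be pasted into a database search form such as those listed above.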

7 STEP 1 - ALIGNMENT OF SEQUENCES. Grouping together sequences with the fewest evolutionary differences. Pairwise and multiple alignment; most programs use both approaches. Specialised tools available for multiple alignment: http://www.ebi.ac.uk/Tools/msa/ Online: + faster, no need for a powerful computer (processor power); + usually more user-friendly (in terms of input/output); - limited number of characters/sequences. Local: + more flexible in terms of amount of data; - less user-friendly, some with poor error messages; - computationally demanding.
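To make "pairwise alignment" concrete, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). The slides do not name an algorithm, so the choice of algorithm and the scoring values (match +1, mismatch -1, gap -2) are assumptions for illustration; real alignment tools use more refined scoring.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)
print(aligned_b)
print("score:", total)
```

Multiple alignment programs typically build on many such pairwise comparisons, which is why most tools combine both approaches.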

8 STEP 2 - PHYLOGENETIC ANALYSIS. Which phylogenetic method to use? Assess similarity based on the multiple alignment: a. strong similarity -> Maximum Parsimony; b. weak (distant) similarity -> Distance methods; c. very weak similarity -> Maximum Likelihood. Regardless of the method used, always check the validity of the results (statistics or comparison of topologies among different approaches).

9 STEP 2 - PHYLOGENETIC ANALYSIS. A. MAXIMUM PARSIMONY. A non-parametric approach based on the principle of minimum evolutionary change. Good for similar sequences. It builds a phylogenetic tree that explains the evolution to the present state with the fewest changes required, and groups sequences with similar amounts of variation. Only involves parsimony-informative characters (i.e. positions with at least two different states, each present in at least two sequences). Less suitable for larger datasets. It does not give branch lengths, only topology. Not statistically consistent.
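As a small illustration of which alignment columns a parsimony analysis actually uses, the sketch below flags parsimony-informative sites under the usual definition (at least two character states, each occurring in at least two sequences). The function names and the toy alignment are assumptions for the example; gaps and ambiguity codes are simply ignored here.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two nucleotide states each occur in two or more sequences."""
    counts = Counter(c for c in column if c in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(alignment):
    """Return the 0-based indices of parsimony-informative columns."""
    return [i for i, column in enumerate(zip(*alignment))
            if is_parsimony_informative(column)]


# Toy alignment of four equal-length sequences (hypothetical data)
alignment = ["AAGTC",
             "AAGTC",
             "ATGCC",
             "ATACC"]
print(informative_sites(alignment))
```

In this toy alignment the third column varies, but only one sequence carries the variant, so it cannot favour one topology over another.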

10 STEP 2 - PHYLOGENETIC ANALYSIS. B. DISTANCE METHODS. A non-parametric method. Based on a distance matrix of all compared sequences, followed by construction of a guide tree (clustering of distances) and subsequent iterative build-up of branches and nodes. Simple algorithms exist to construct a tree directly from pairwise distances (UPGMA - unweighted pair group method with arithmetic mean, and NJ - neighbor joining).
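The two stages described here, a distance matrix followed by clustering, can be sketched in a few lines. This is an illustrative UPGMA implementation only: it uses the simple p-distance (proportion of differing sites) as an assumed distance measure, returns topology without branch lengths, and the sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(names, seqs):
    """Cluster sequences by UPGMA; return the tree as nested tuples of names."""
    trees = list(names)            # current clusters (leaf name or nested tuple)
    sizes = [1] * len(names)       # number of leaves in each cluster
    d = [[p_distance(a, b) for b in seqs] for a in seqs]
    while len(trees) > 1:
        # pick the closest pair of clusters (i < j)
        i, j = min(combinations(range(len(trees)), 2),
                   key=lambda p: d[p[0]][p[1]])
        # size-weighted average distance from the merged cluster to all others
        merged = [(sizes[i] * d[i][k] + sizes[j] * d[j][k]) / (sizes[i] + sizes[j])
                  for k in range(len(trees))]
        trees[i] = (trees[i], trees[j])
        sizes[i] += sizes[j]
        for k in range(len(trees)):
            d[i][k] = d[k][i] = merged[k]
        d[i][i] = 0.0
        # drop cluster j from all bookkeeping structures
        del trees[j], sizes[j], d[j]
        for row in d:
            del row[j]
    return trees[0]


names = ["A", "B", "C", "D"]
seqs = ["ACGTACGT", "ACGTACGA", "TCGAACCA", "TCGAACCT"]
print(upgma(names, seqs))
```

Neighbor joining follows the same matrix-then-cluster pattern but corrects for unequal rates among lineages, which UPGMA does not.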

11 STEP 2 - PHYLOGENETIC ANALYSIS. C. MAXIMUM LIKELIHOOD (ML). A parametric method. A statistical method that employs an explicit model of character evolution. The dominant approach in molecular evolution analyses. It requires a reliable MODEL (model = list of probabilities for various evolutionary changes), which can be selected with MEGA or jModelTest. Implemented in phylogenetic programs such as MEGA and PhyML; MrBayes applies the same kind of models in a Bayesian framework.
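To show what "an explicit model" means in practice, the sketch below computes the likelihood of the distance between two aligned sequences under the Jukes-Cantor (JC69) model and finds the distance that maximises it. JC69 is chosen here only because it is the simplest substitution model; the slides do not name it, and the counts used are hypothetical.

```python
import math


def jc69_log_likelihood(n_same, n_diff, d):
    """Log-likelihood of distance d (substitutions per site) under JC69."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability a site shows the same base
    p_diff = 0.25 - 0.25 * e   # probability of one specific different base
    # 0.25 is the equilibrium frequency of the base at the first sequence
    return (n_same * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


def jc69_ml_distance(n_same, n_diff, step=0.0001, max_d=2.0):
    """Grid search for the distance with the highest likelihood."""
    candidates = (k * step for k in range(1, int(max_d / step) + 1))
    return max(candidates, key=lambda d: jc69_log_likelihood(n_same, n_diff, d))


# Hypothetical pair of sequences: 90 identical sites, 10 differing sites
estimate = jc69_ml_distance(90, 10)
closed_form = -0.75 * math.log(1.0 - 4.0 * 0.10 / 3.0)
print(round(estimate, 4), round(closed_form, 4))
```

The grid search agrees with the known closed-form JC69 distance; ML tree programs do the same kind of optimisation over branch lengths and topologies, with richer models.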

12 STEP 3 - STATISTICAL APPROACHES. Aim: to evaluate the significance of the obtained phylogenetic relationships (trees). Approaches: Bootstrapping: build the initial tree; iteratively resample the columns of the original alignment (with replacement) and build a new tree from each replicate; each node/branch of the initial tree is checked for its presence in the replicate trees. The bootstrap value of a branch is the percentage of replicates in which it appears. Available in most ML programs. Approximate likelihood-ratio test (aLRT): an alternative to nonparametric bootstrap and Bayesian estimation of branch support; it tests the null hypothesis that the inferred branch has length 0; fast, but not directly comparable with bootstrap values. Available in the PhyML program.
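The resampling step of the bootstrap can be sketched as follows. To keep the example short, the "tree" is reduced to a single grouping (the closest pair of sequences), which is a deliberate simplification and an assumption of this sketch; real programs rebuild a full tree for every replicate. Names, data, and the number of replicates are hypothetical.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement to build one bootstrap replicate."""
    n_sites = len(alignment[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def closest_pair(names, alignment):
    """Stand-in for tree building: the pair of sequences with fewest differences."""
    def differences(a, b):
        return sum(x != y for x, y in zip(a, b))
    i, j = min(combinations(range(len(names)), 2),
               key=lambda p: differences(alignment[p[0]], alignment[p[1]]))
    return frozenset((names[i], names[j]))


def bootstrap_support(names, alignment, replicates=200, seed=1):
    """Percentage of replicates that recover the grouping found in the original data."""
    rng = random.Random(seed)
    reference = closest_pair(names, alignment)
    hits = sum(closest_pair(names, resample_columns(alignment, rng)) == reference
               for _ in range(replicates))
    return reference, 100.0 * hits / replicates


names = ["A", "B", "C"]
alignment = ["ACGTACGTAC", "ACGTACGTAC", "TGCATGCATG"]
group, support = bootstrap_support(names, alignment)
print(sorted(group), support)
```

The same counting logic applies to every branch of a real tree: support is the share of replicate trees that contain that branch.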

13 STEP 4 – PRESENTING PHYLOGENY DATA: PHYLOGENETIC TREES. A phylogenetic tree (also called a dendrogram) is a presentation of the evolutionary relationships among organisms. A phylogenetic tree is composed of: nodes (representing a relationship among taxa/sequences as a special event in the past that remained fixed in evolution); branches (their length represents the number of changes in sequences); leaves (representing the recent taxa). It may be rooted or unrooted. Rooted: the root is the common ancestor of all sequences and internal nodes, and the distance from root to leaf corresponds to evolutionary time.
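Trees are usually exchanged between programs as text. The sketch below writes a tree, held as nested tuples of leaf names, in Newick notation, a common text format that tree viewers read. Newick is not mentioned on the slide, so its use here is an assumption; branch lengths are omitted for brevity and the taxa are hypothetical.

```python
def to_newick(tree):
    """Serialise a tree of nested tuples (leaves are strings) as a Newick string."""
    def render(node):
        if isinstance(node, str):
            return node  # a leaf: just its name
        # an internal node: its children in parentheses, separated by commas
        return "(" + ",".join(render(child) for child in node) + ")"
    return render(tree) + ";"


# Rooted tree with two internal nodes and four leaves (hypothetical taxa)
tree = (("A", "B"), ("C", "D"))
print(to_newick(tree))
```

Each pair of parentheses corresponds to one node of the tree, and the names inside it are the leaves descending from that node.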


15 MICROSATELLITES ARE BEST FOR… Microsatellite loci provide excellent resolution of recent and ongoing microevolutionary processes (Wang 2010). Average mutation rate of microsatellites: 5 × 10⁻⁴ (Goldstein & Schlotterer 1999; Whittaker et al. 2003). DNA fragments of different sizes are detected by initial amplification using polymerase chain reaction (PCR) and visualization via electrophoresis -> size polymorphism reflects variation in the number of repeats of a simple DNA sequence.

16 HOW MANY TREES TO SAMPLE? Obtaining accurate allele frequencies and accurate estimates of diversity is much more important than detecting all of the alleles, given that very rare alleles (i.e. new mutations) are not very informative for assessing genetic diversity within a population or genetic structure among populations (Hale et al. 2012). 25 to 30 individuals per population suffice for population genetic studies based on microsatellite allele frequencies (Hale et al. 2012).

17 ANALYSIS OF MICROSATELLITE DATA. Deviations from Hardy-Weinberg equilibrium; null alleles; linkage disequilibrium (measurement of proximal genomic space); allelic indices (Na, Ne, Ho, He, Ar); F statistics (Fis, Fst, Fit); genetic structure & genetic distances; spatial genetic structure; …
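Several of the allelic indices listed here follow directly from allele frequencies at a locus. The sketch below computes Na (number of alleles), Ne (effective number of alleles), Ho (observed heterozygosity), He (expected heterozygosity) and Fis for one locus in one population, using the standard textbook formulas. The function name and the genotypes are hypothetical; Ar (allelic richness) needs rarefaction and is left out, and no small-sample correction is applied.

```python
from collections import Counter


def allelic_indices(genotypes):
    """Compute Na, Ne, Ho, He and Fis from a list of diploid genotypes (a, b)."""
    alleles = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(alleles.values())
    freqs = {allele: n / total for allele, n in alleles.items()}
    homozygosity = sum(p * p for p in freqs.values())  # sum of squared frequencies
    ho = sum(a != b for a, b in genotypes) / len(genotypes)
    he = 1.0 - homozygosity
    return {
        "Na": len(freqs),
        "Ne": 1.0 / homozygosity,
        "Ho": ho,
        "He": he,
        "Fis": 1.0 - ho / he if he > 0 else float("nan"),
    }


# Hypothetical microsatellite genotypes (fragment sizes in base pairs)
genotypes = [(100, 100), (100, 102), (102, 104), (100, 102)]
for name, value in allelic_indices(genotypes).items():
    print(name, round(value, 4))
```

A negative Fis, as in this toy sample, indicates more heterozygotes than expected; with only four individuals it carries no statistical weight.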

18 PRACTICAL EXAMPLE. 3 European beech populations from Slovenia (partial, but real data); 5 loci; data in GenAlEx format.

19 SOFTWARE NEEDED. GenAlEx (Peakall & Smouse 2012): http://biology-assets.anu.edu.au/GenAlEx/Welcome.html Genepop (Raymond & Rousset 1995) or Genepop on the web: http://genepop.curtin.edu.au/

20 DATA # loci # all trees # populations # trees per population

21 EXPORT TO GENEPOP Leave default options Save as txt file
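For orientation, the exported text file has a simple layout: a title line, one locus name per line, then a "Pop" line before each population, with one individual per line written as an identifier, a comma, and one genotype per locus. The sketch below writes that layout with three-digit allele codes; it reflects the Genepop input format as commonly documented, and the function name and data are hypothetical. GenAlEx's own export should be preferred in practice.

```python
def to_genepop(title, loci, populations):
    """Build Genepop-style text.

    populations: list of populations, each a list of
    (individual_id, [(allele1, allele2), ...]) with one genotype per locus.
    Missing data are written as allele 0.
    """
    lines = [title] + list(loci)
    for individuals in populations:
        lines.append("Pop")  # marks the start of a new population
        for individual_id, genotypes in individuals:
            coded = " ".join(f"{a:03d}{b:03d}" for a, b in genotypes)
            lines.append(f"{individual_id} , {coded}")
    return "\n".join(lines) + "\n"


# Two hypothetical populations with one individual each, two loci
populations = [
    [("pop1_tree1", [(100, 102), (98, 98)])],
    [("pop2_tree1", [(0, 0), (100, 104)])],   # first locus missing
]
print(to_genepop("Beech example (hypothetical data)", ["L1", "L2"], populations))
```

Opening the exported file in a text editor and checking it against this layout is a quick way to catch export problems before submitting it to Genepop.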

22 DEVIATIONS FROM HWE. Null hypothesis = random union of gametes. Open Genepop on the web: http://genepop.curtin.edu.au/ Select option 1 (Hardy Weinberg Exact Tests). Copy and paste the data into the form at the bottom, select "HTML - plain text" as the results format, and otherwise leave the default options. Press submit.

23 DEVIATIONS FROM HWE - RESULTS. Results are given per locus & per population. Note: adjust probability values according to the Bonferroni procedure for multiple comparisons when comparing multiple populations or loci (Rice 1989).

24 ADJUSTING FOR MULTIPLE COMPARISONS. Bonferroni correction procedure (from Genepop output, pop 2): 1. Sort by P value. 2. Divide alpha (0.05) by the number of comparisons (tests) to get the adjusted alpha. 3. If P value < adjusted alpha, the null hypothesis can be rejected. No deviations from HWE in our dataset.
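The three steps of the Bonferroni procedure translate directly into code. The sketch below is illustrative only: the P values are hypothetical, not the Genepop output from the practical, and the function name is an assumption.

```python
def bonferroni(p_values, alpha=0.05):
    """Apply a simple Bonferroni correction.

    p_values: dict mapping a test label (e.g. locus name) to its P value.
    Returns the adjusted alpha and a list of (label, P, rejected) sorted by P.
    """
    adjusted_alpha = alpha / len(p_values)                  # step 2
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])  # step 1
    decisions = [(label, p, p < adjusted_alpha)              # step 3
                 for label, p in ranked]
    return adjusted_alpha, decisions


# Hypothetical P values for five loci in one population
p_values = {"L1": 0.41, "L2": 0.03, "L3": 0.72, "L4": 0.015, "L5": 0.20}
adjusted_alpha, decisions = bonferroni(p_values)
print("adjusted alpha:", adjusted_alpha)
for label, p, rejected in decisions:
    print(label, p, "reject H0" if rejected else "do not reject H0")
```

Note that two of these hypothetical P values fall below 0.05 but above the adjusted threshold, so neither leads to rejection once the correction is applied.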

25 LINKAGE DISEQUILIBRIUM. Null hypothesis = genotypes at one locus are independent from genotypes at the other locus. Open Genepop on the web: http://genepop.curtin.edu.au/ Select option 2 (Linkage Disequilibrium). Copy and paste the data into the form at the bottom, select "HTML - plain text" as the results format, and otherwise leave the default options. Press submit.

26 LINKAGE DISEQUILIBRIUM - RESULTS No deviations from linkage equilibrium in our dataset

27 NULL ALLELES. Maximum likelihood estimation of null allele frequency. Open Genepop on the web: http://genepop.curtin.edu.au/ Select option 8 (Miscellaneous Utilities). Copy and paste the data into the form at the bottom, select "HTML - plain text" as the results format, and otherwise leave the default options. Press submit. Note: consider also INEST, MicroChecker and FreeNA for checking null alleles and their significance.

28 NULL ALLELES - RESULTS. Null alleles are present at locus L5 with high frequency (significant - CI): omit locus L5 from further analysis or adjust the allele frequencies.

29 BACK TO GENALEX – ALLELIC INDICES. Add-Ins → GenAlEx → Frequency. Go through the results. Populations have genetic diversity estimates of similar values.

