Data analyses Course code: ZOO560 Week 3 Advanced molecular biology (ZOO560) by Rania M. H. Baleela is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
This lecture content: Data types; Install software (BioEdit, GenAlEx, MEGA, FigTree); Start analyses
Data types: sequences (DNA, RNA, protein), gene expression profiles, structures, biochemical pathways, gene maps, SNPs, RFLP, fragments, etc.
HIV-1 V3 analyses
HIV-1 env gene: HIV-1 is completely dependent upon the Env protein to enter cells. The env gene contains several antigenic determinants, the majority of which are found in the five hypervariable regions (V1 to V5) of the portion encoding gp120. The third hypervariable region of env, V3, carries the principal neutralization-specific epitope and also a cytotoxic T cell epitope. Sequence analysis of a haemophiliac cohort in Edinburgh suggested that sequence variation in the V3, V4 and V5 regions was in part generated by natural selection.
Host range variants of HIV-1 as determined by the Env protein
1. R5 T cell-tropic virus: HIV-1 spends most of its time replicating in activated CD4+ T cells. These cells are activated in the context of the immune response and are also metabolically active, which makes viral replication more efficient.
2. R5 macrophage-tropic virus: Some isolates of HIV-1 that can infect CD4+ T cells can also infect macrophages (i.e. are M-tropic).
3. X4 T cell-tropic virus: Variants that can use CXCR4 are a distinctive and consistent feature of the viral population in a significant fraction of those infected with the virus. There are continuing discussions as to whether these variants are selected against at the time of transmission, whether they can survive only in an immunodeficient host, whether they contribute to or are a marker for more rapid disease progression, and whether different subtypes have a different propensity to evolve these variants. Thus, while we know a great deal about the evolution of these variants and their properties, we cannot yet place them with confidence in the context of viral pathogenesis.
Software installation BIOEDIT MEGA JAVA
V3 REGION ANALYSES
Terminology
Directional selection: Under directional selection, the advantageous allele increases in frequency as a consequence of differences in survival and reproduction among different phenotypes. The increase is independent of the dominance of the allele: even if the allele is recessive, it will eventually become fixed. Example: sockeye salmon in the waters of Bristol Bay, Alaska, have recently undergone directional selection on the timing of migration.
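The claim that an advantageous allele rises to fixation whatever its dominance can be illustrated with the standard one-locus, two-allele viability selection recursion. This is a minimal sketch, not part of the original slides: the function names and the fitness values (a 10% advantage) are hypothetical, and the model assumes an infinite, randomly mating population with no drift or mutation.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """One generation of viability selection at a single diploid locus.

    p is the frequency of allele A; w_AA, w_Aa, w_aa are relative fitnesses.
    """
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def trajectory(p0, w_AA, w_Aa, w_aa, generations):
    """Return the allele frequencies from generation 0 onward."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs


# Hypothetical fitness values: A is advantageous but fully recessive.
recessive = trajectory(p0=0.1, w_AA=1.1, w_Aa=1.0, w_aa=1.0, generations=2000)
# Same advantage, but A is fully dominant.
dominant = trajectory(p0=0.1, w_AA=1.1, w_Aa=1.1, w_aa=1.0, generations=2000)

print(f"recessive A after 2000 generations: {recessive[-1]:.4f}")
print(f"dominant A after 2000 generations:  {dominant[-1]:.4f}")
```

In both runs the frequency of A moves toward 1; dominance changes the speed of the increase, not its direction.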
Balancing selection: A number of selective processes by which multiple alleles are actively maintained in the gene pool of a population at frequencies larger than expected from genetic drift alone. This can happen by various mechanisms. Examples: sickle-shaped red blood cells; grove snail shell colours.
Negative selection (purifying selection): The selective removal of alleles that are deleterious. This can result in stabilizing selection through the purging of deleterious variations that arise.
The neutral theory of molecular evolution: Neutrality = evolving randomly (Kimura, 1968)
A hypothesis which states that “Most polymorphisms observed at the molecular level are selectively neutral so that their frequency dynamics in a population are determined by a balance between the effects of mutation and random genetic drift” Also known as the theory of selective neutrality.
Theoretical principles and implications (Kimura, 1983)
If a population contains a neutral allele at frequency $p_0$, the probability that the allele eventually becomes fixed is $p_0$. For a newly arising mutant, $p_0 = 1/(2N)$, so a mutant allele arising in a smaller population has a higher chance of fixation.
The steady-state rate at which neutral mutations are fixed in a population is $(1/(2N))(2N\mu) = \mu$, where $2N\mu$ is the average number of new neutral mutations per generation.
The average time between neutral substitutions is $1/\mu$ generations.
Among newly arising neutral alleles destined to be fixed, the average time to fixation is $4N_e$ generations, where $N_e$ is the effective population size.
Among newly arising neutral alleles destined to be lost, the average time to loss is $(2N_e/N)\ln(2N)$ generations.
If each neutral mutation creates an allele that is different from all existing alleles (infinite alleles model, IAM), then at equilibrium the expected homozygosity is $1/(4N_e\mu + 1)$.
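The quantities listed above can be evaluated directly for a given population. The following is a small sketch for illustration only: the function name and the example values of N, Ne and μ are hypothetical and not taken from the lecture.

```python
import math


def neutral_theory_summary(N, Ne, mu):
    """Evaluate the neutral-theory expectations for one population.

    N  : census population size (diploid, so 2N gene copies)
    Ne : effective population size
    mu : neutral mutation rate per generation
    """
    p_fix = 1.0 / (2 * N)                  # fixation probability of a new mutant
    new_mutations = 2 * N * mu             # new neutral mutations per generation
    substitution_rate = new_mutations * p_fix  # simplifies to mu
    return {
        "p_fix": p_fix,
        "substitution_rate": substitution_rate,
        "time_between_substitutions": 1.0 / substitution_rate,
        "mean_time_to_fixation": 4 * Ne,
        "mean_time_to_loss": (2 * Ne / N) * math.log(2 * N),
        "expected_homozygosity": 1.0 / (4 * Ne * mu + 1),
    }


# Hypothetical example values.
summary = neutral_theory_summary(N=1000, Ne=500, mu=1e-6)
for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```

Note that the substitution rate does not depend on N: the larger number of new mutations in a big population is cancelled exactly by their lower individual chance of fixation.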
Theta (θ): the population parameter
$\theta = 4N_e\mu$, where $\mu$ is the neutral mutation rate. The average homozygosity at equilibrium between mutation and genetic drift is then $1/(\theta + 1)$, and heterozygosity = 1 - homozygosity. In the IAM, at equilibrium, the average heterozygosity is $1 - 1/(\theta + 1) = \theta/(\theta + 1)$.
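As a quick numerical check of these relationships, the sketch below computes θ and the expected heterozygosity, and also inverts the formula to recover θ from an observed heterozygosity (θ = H/(1 - H), a simple rearrangement of the expression above). The function names and example values are hypothetical.

```python
import math


def theta(Ne, mu):
    """Population parameter theta = 4 * Ne * mu."""
    return 4 * Ne * mu


def expected_heterozygosity(th):
    """Equilibrium heterozygosity under the infinite alleles model."""
    return th / (th + 1)


def theta_from_heterozygosity(H):
    """Rearranged IAM formula: theta = H / (1 - H), valid for 0 <= H < 1."""
    if not 0 <= H < 1:
        raise ValueError("H must satisfy 0 <= H < 1")
    return H / (1 - H)


# Hypothetical example: Ne = 10,000 and mu = 2.5e-5 give theta = 1.
th = theta(Ne=10_000, mu=2.5e-5)
H = expected_heterozygosity(th)
print(f"theta = {th:.3f}, expected heterozygosity = {H:.3f}")
print(f"theta recovered from H = {theta_from_heterozygosity(H):.3f}")
```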
Detection of natural selection Different tests were developed to test for neutrality. Some approaches to determine whether molecular variation is consistent with the neutral theory and to potentially detect purifying and positive selection: Tajima (1989) Genetics 123:585-595. Fu (1997) Genetics 147:915-925. Fu & Li (1993) Genetics 133:693-709. Fu (1996) Genetics 143:557-570.
Tajima’s D test: Among the most powerful and widely used neutrality tests to date. It explicitly accounts for mutational events and can detect selection, population bottlenecks and population subdivision. Under neutrality, the estimate of θ based on nucleotide diversity ($\theta_\pi$) equals the estimate based on the number of segregating sites ($\theta_S$). $\theta_\pi$ is only weakly influenced by rare alleles, whereas $\theta_S$ is strongly influenced by them.
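Tajima’s D test is defined as $D = \dfrac{d}{\sqrt{\hat{V}(d)}}$, where $d = \theta_\pi - \theta_S = k - S/a_1$, $\hat{V}(d)$ is the estimated variance of $d$, $k$ is the average number of nucleotide sites that differ between pairs of sequences, $S$ is the number of segregating sites, and $a_1 = \sum_{i=1}^{n-1} 1/i$ for a sample of $n$ sequences. For sequence data, the test can be run in MEGA.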
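To make the definition concrete, the statistic can also be computed by hand from a small alignment. The sketch below follows the constants given in Tajima (1989); it is not MEGA's implementation, the function names are hypothetical, and it assumes aligned sequences of equal length with no gaps or ambiguous bases. The four toy sequences are invented for illustration.

```python
from itertools import combinations
import math


def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct base (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences; None if no site varies."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    S = segregating_sites(seqs)
    if S == 0:
        return None
    k = mean_pairwise_differences(seqs)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = k - S / a1  # theta_pi estimate minus theta_S estimate
    return d / math.sqrt(e1 * S + e2 * S * (S - 1))


# Invented toy alignment, for illustration only.
toy = ["ATGCA", "ATGCA", "ATGTA", "ACGTA"]
print("S =", segregating_sites(toy))
print("k =", round(mean_pairwise_differences(toy), 4))
print("D =", round(tajimas_d(toy), 4))
```

With so few sequences and sites the value carries no statistical weight; the example only shows how k, S and the variance term combine.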
Tajima’s D values: In populations at neutral equilibrium, Tajima’s D is expected to equal zero. When slightly deleterious variants are present, $\theta_S$ will be greater than $\theta_\pi$ and D will be negative. After demographic events such as population growth from an equilibrium situation, S will grow faster than π, leading to a negative D as well. In cases of heterozygote advantage, S will be reduced and the estimate of $\theta_\pi$ will be the greater, so D will be positive. Distinguishing between negative values of D due to selection and those due to demographic events may prove difficult.