DATA ANALYSIS Module Code: CA660 Lecture Block 2.

PROBABILITY – Inferential Basis
COUNTING RULES: permutations, combinations.
BASICS: sample space, event, probabilistic experiment.
DEFINITION / probability types.
AXIOMS (basic rules).
ADDITION RULE, general and special, from the union (OR) of events or sets of points in space:
general: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
special (mutually exclusive events): $P(A \cup B) = P(A) + P(B)$
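
The counting rules and the addition rule can be checked numerically. The following is a minimal sketch, assuming one roll of a fair six-sided die as the probabilistic experiment; the die and the events `A`, `B`, `C` are illustrative choices, not taken from the slides.

```python
from fractions import Fraction
from math import comb, perm

# Counting rules: ordered vs unordered selections of r = 2 items from n = 5.
n_permutations = perm(5, 2)   # n! / (n - r)!  ordered arrangements
n_combinations = comb(5, 2)   # n! / (r! (n - r)!)  unordered selections

# Sample space for one roll of a fair die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: favourable outcomes / all equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # event "even"
B = {4, 5, 6}   # event "greater than 3"
C = {1}         # event "one", mutually exclusive with A

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
general = prob(A) + prob(B) - prob(A & B)

# Special addition rule (mutually exclusive events): P(A or C) = P(A) + P(C).
special = prob(A) + prob(C)

print(n_permutations, n_combinations, general, special)
```

Using `Fraction` keeps the probabilities exact, so each rule can be compared directly with the probability of the union computed by counting.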

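Basics contd.
CONDITIONAL PROBABILITY (reduction in sample space): $P(A \mid B) = P(A \cap B) / P(B)$, for $P(B) \neq 0$.
MULTIPLICATION RULE, general and special, from the intersection of events or sets of points in space:
general: $P(A \cap B) = P(A)\,P(B \mid A)$
special (independent events): $P(A \cap B) = P(A)\,P(B)$
Chain rule for multiple intersections: $P(A_1 \cap A_2 \cap \dots \cap A_k) = P(A_1)\,P(A_2 \mid A_1) \cdots P(A_k \mid A_1 \cap \dots \cap A_{k-1})$.
Probability distributions arise from sets of possible outcomes.
Examples: come up with one of each.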
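
As one possible example of each rule, the sketch below enumerates the outcomes of two fair dice. The dice and the events `A`, `B`, `C` are assumptions chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice (illustrative assumption).
sample_space = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a subset of the sample space."""
    return Fraction(len(event), len(sample_space))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B): B acts as the reduced sample space."""
    return prob(a & b) / prob(b)

A = {o for o in sample_space if o[0] + o[1] == 8}   # total of 8
B = {o for o in sample_space if o[0] == 6}          # first die shows 6
C = {o for o in sample_space if o[1] % 2 == 0}      # second die is even

# General multiplication rule: P(A and B) = P(B) * P(A | B).
general = prob(B) * cond_prob(A, B)

# Special multiplication rule: B and C concern different dice, so they are independent.
special = prob(B) * prob(C)

# Chain rule for three events: P(A and B and C) = P(B) * P(C | B) * P(A | B and C).
chain = prob(B) * cond_prob(C, B) * cond_prob(A, B & C)

print(general, special, chain)
```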

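Conditional Probability: BAYES
A move towards "Likelihood" statistics.
More formally, the Theorem of Total Probability (Rule of Elimination): if the events $B_1, B_2, \dots, B_k$ constitute a partition of the sample space $S$, such that $P\{B_i\} \neq 0$ for $i = 1, 2, \dots, k$, then for any event $A$ of $S$
$$P\{A\} = \sum_{i=1}^{k} P\{B_i \cap A\} = \sum_{i=1}^{k} P\{B_i\}\,P\{A \mid B_i\}$$
So (Bayes' rule), if the events $B_i$ partition the space as above, then for any event $A$ in $S$ where $P\{A\} \neq 0$,
$$P\{B_r \mid A\} = \frac{P\{B_r\}\,P\{A \mid B_r\}}{\sum_{i=1}^{k} P\{B_i\}\,P\{A \mid B_i\}}, \quad r = 1, 2, \dots, k$$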

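Example - Bayes
40,000 people in a population of 2 million carry a particular virus, so $P\{\text{Virus}\} = P\{V_1\} = 40{,}000 / 2{,}000{,}000 = 0.02$.
Tests to show presence/absence of the virus give results:
$P\{T \mid V_1\} = 0.99$ and $P\{T \mid V_2\} = 0.01$
$P\{N \mid V_2\} = 0.98$ and $P\{N \mid V_1\} = 0.02$
where $V_2$ is the event virus absent, $T$ the event positive test, $N$ the event negative test. (All a priori probabilities.)
So, where the events $V_i$ partition the sample space,
$$P\{V_1 \mid T\} = \frac{P\{V_1\}\,P\{T \mid V_1\}}{P\{V_1\}\,P\{T \mid V_1\} + P\{V_2\}\,P\{T \mid V_2\}} = \frac{(0.02)(0.99)}{(0.02)(0.99) + (0.98)(0.01)} \approx 0.669$$
with the denominator being the total probability $P\{T\}$.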
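
The arithmetic of the virus example can be reproduced with a few lines of code. This is a minimal sketch using the probabilities exactly as quoted on the slide; the function names `total_probability` and `bayes_posterior` are hypothetical helpers introduced here for illustration.

```python
def total_probability(priors, likelihoods):
    """Theorem of total probability: P(A) = sum_i P(B_i) * P(A | B_i)."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes_posterior(priors, likelihoods):
    """Bayes' rule: P(B_r | A) for every event B_r of the partition."""
    total = total_probability(priors, likelihoods)
    return [p * l / total for p, l in zip(priors, likelihoods)]

# Prior: 40,000 carriers in a population of 2 million.
p_v1 = 40_000 / 2_000_000          # P{V1}, virus present
priors = [p_v1, 1 - p_v1]          # [P{V1}, P{V2}]

# Likelihood of a positive test under each state, as quoted on the slide.
p_t_given_v = [0.99, 0.01]         # [P{T | V1}, P{T | V2}]

p_t = total_probability(priors, p_t_given_v)      # total probability of a positive test
posterior = bayes_posterior(priors, p_t_given_v)  # [P{V1 | T}, P{V2 | T}]

print(f"P(T) = {p_t:.4f}, P(V1 | T) = {posterior[0]:.4f}")
```

Even with a test that detects 99% of carriers, roughly one positive result in three comes from a non-carrier, because the virus is rare in the population.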

BAYES Bioinformatics Example: Accuracy of Assembled DNA Sequences
Want an estimate of the probability that the ith letter of an assembled sequence is A, C, G, T or - (unknown).
Assume each fragment assembly is correct, all portions are equally reliable, and sequencing errors are independent and uniform throughout the sequence. Assume letters in the sequence are IID.
Let $F^* = \{f_1, f_2, \dots, f_N\}$ be the set of fragments.
Fragments are aligned into an assembled sequence: positions in the assembled sequence correspond to columns $i$ of a matrix, while fragments correspond to rows $j$.
Matrix elements $x_{ij}$ are members of $B^* = \{A, C, G, T, -, 0\}$.
The true sequence (in $n$ columns) is $s = \{s_1, s_2, \dots, s_n\}$, where each $s_i$ is contained in $A^* = \{A, C, G, T, -\}$.

BAYES contd.
Track fragment orientation. Thus need an estimate of the probability that the ith letter is from molecule "M", given the matrix elements (of fragments).
Assuming knowledge of the sequencing error rates, Bayes gives this probability.
Equation annotations: total probability of b; context = M; summed options for b over M.
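
The slide's own equations are not recoverable from the transcript, so the following is only a sketch of how Bayes' rule could be applied to one column of the fragment matrix. It rests on assumptions that are not in the slides: a uniform prior over the letters of A*, a single known error rate shared by all fragments, errors spread evenly over the other four symbols, and the symbol `0` taken to mean that a fragment does not cover the column. Fragment orientation is ignored. The function name `column_posterior` is hypothetical.

```python
ALPHABET = ["A", "C", "G", "T", "-"]   # A* from the slides

def column_posterior(column, error_rate=0.05, prior=None):
    """Posterior P(s_i = b | column i) for each letter b in A*.

    column: observed symbols x_ij for one column i (one entry per fragment j).
    error_rate: assumed probability that a fragment shows a wrong symbol.
    prior: optional dict of prior probabilities over A* (uniform if omitted).
    """
    if prior is None:
        prior = {b: 1.0 / len(ALPHABET) for b in ALPHABET}

    joint = {}
    for b in ALPHABET:
        likelihood = 1.0
        for x in column:
            if x == "0":          # assumption: fragment does not cover this column
                continue
            if x == b:
                likelihood *= 1.0 - error_rate
            else:
                # assumption: an error is equally likely to give any other symbol
                likelihood *= error_rate / (len(ALPHABET) - 1)
        joint[b] = prior[b] * likelihood

    total = sum(joint.values())   # total probability of the observed column
    return {b: joint[b] / total for b in ALPHABET}

# Hypothetical column: three fragments read A, one reads C, one gives no coverage.
post = column_posterior(["A", "A", "C", "0", "A"])
print(max(post, key=post.get), post)
```

Independence of sequencing errors, as assumed on the earlier slide, is what allows the likelihood to be written as a product over fragments.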

Example: probability in other Bioinformatic problems, e.g. POPULATION GENETICS
Counts: genotypic "frequencies". A GENE with $n$ alleles has $n(n+1)/2$ possible genotypes.
Population equilibrium, HARDY-WEINBERG: gene and genotypic "frequencies" are constant from generation to generation (so there are simple relationships between genotypic and allelic frequencies).
E.g. 2-allele model: $p_A$, $p_a$ are the allelic frequencies of A, a respectively, so the genotypic "frequencies" are $p_{AA}$, $p_{Aa}$, $p_{aa}$, with
$p_{AA} = p_A p_A = p_A^2$
$p_{Aa} = p_A p_a + p_a p_A = 2 p_A p_a$
$p_{aa} = p_a^2$
$(p_A + p_a)^2 = p_A^2 + 2 p_A p_a + p_a^2$
One generation of random mating gives H-W at a single locus.
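
The two-allele relationships above are easy to verify in code. A minimal sketch follows; the function names and the example frequency 0.3 are illustrative assumptions.

```python
def n_genotypes(n_alleles):
    """Number of possible genotypes for a gene with n alleles: n(n+1)/2."""
    return n_alleles * (n_alleles + 1) // 2

def hw_genotype_freqs(p_A):
    """Hardy-Weinberg genotypic frequencies for a 2-allele model."""
    if not 0.0 <= p_A <= 1.0:
        raise ValueError("allelic frequency must lie in [0, 1]")
    p_a = 1.0 - p_A
    return {"AA": p_A ** 2, "Aa": 2 * p_A * p_a, "aa": p_a ** 2}

def allele_freq_from_genotypes(freqs):
    """Frequency of allele A: all of AA plus half of Aa."""
    return freqs["AA"] + 0.5 * freqs["Aa"]

freqs = hw_genotype_freqs(0.3)                  # illustrative p_A = 0.3
p_A_next = allele_freq_from_genotypes(freqs)    # unchanged under H-W

print(freqs, p_A_next, n_genotypes(2))
```

Recovering the same allelic frequency from the genotypic frequencies is the sense in which the frequencies stay constant from generation to generation.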

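POPULATION PICTURE at one locus under H-W equilibrium
NB: the heterozygote "frequency" is at its maximum when both allelic frequencies = 0.5 (see Fig.).
Also, if A is a rare allele, then $p_a \approx 1$ and the share of A copies found in heterozygotes is $2 p_A p_a / (2 p_A^2 + 2 p_A p_a) = p_a$.
So a rare allele has a high probability of being carried in the heterozygous state: e.g. a 99% chance for $p_A = 0.01$, say.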
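
Both remarks can be checked numerically. In this sketch the 99% figure is read as the share of copies of allele A that sit in heterozygotes, which is one interpretation consistent with the slide's number; the function names and the grid of frequencies are assumptions made for illustration.

```python
def heterozygote_freq(p_A):
    """H-W heterozygote frequency 2 * p_A * p_a for a 2-allele locus."""
    return 2 * p_A * (1 - p_A)

def share_of_A_in_heterozygotes(p_A):
    """Fraction of all copies of allele A that are carried by Aa individuals."""
    p_a = 1 - p_A
    copies_in_het = 2 * p_A * p_a   # each Aa individual carries one copy of A
    copies_in_hom = 2 * p_A ** 2    # each AA individual carries two copies of A
    return copies_in_het / (copies_in_het + copies_in_hom)

# Scan allelic frequencies 0.00, 0.01, ..., 1.00 for the heterozygote maximum.
grid = [i / 100 for i in range(101)]
p_at_max = max(grid, key=heterozygote_freq)

rare_share = share_of_A_in_heterozygotes(0.01)

print(p_at_max, heterozygote_freq(p_at_max), rare_share)
```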

Extended: Multiple Alleles, Single Locus
$p_1, p_2, \dots, p_i, \dots, p_n$ = "frequencies" of alleles $A_1, A_2, \dots, A_i, \dots, A_n$.
Possible genotypes = $A_{11}, A_{12}, \dots, A_{ij}, \dots, A_{nn}$.
Under H-W equilibrium, the expected genotype frequencies are the terms of
$(p_1 + p_2 + \dots + p_i + \dots + p_n)(p_1 + p_2 + \dots + p_j + \dots + p_n) = p_1^2 + 2 p_1 p_2 + \dots + 2 p_i p_j + \dots + 2 p_{n-1} p_n + p_n^2$
e.g. for 4 alleles, have 10 genotypes.
The proportion of heterozygosity in the population is clearly $P_H = 1 - \sum_i p_i^2$, used in screening of genetic markers.
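
The expansion generalises directly to code. The sketch below computes the expected genotype frequencies and the heterozygosity for a 4-allele locus; the allelic frequencies 0.4, 0.3, 0.2, 0.1 are made-up illustrative values, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

def hw_expected_genotypes(allele_freqs):
    """Expected H-W genotype frequencies, keyed by allele index pair (i, j), i <= j."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allelic frequencies must sum to 1")
    expected = {}
    n = len(allele_freqs)
    for i, j in combinations_with_replacement(range(n), 2):
        product = allele_freqs[i] * allele_freqs[j]
        # Homozygotes A_ii: p_i^2; heterozygotes A_ij: 2 * p_i * p_j.
        expected[(i + 1, j + 1)] = product if i == j else 2 * product
    return expected

def heterozygosity(allele_freqs):
    """Proportion of heterozygotes: P_H = 1 - sum_i p_i^2."""
    return 1.0 - sum(p * p for p in allele_freqs)

allele_freqs = [0.4, 0.3, 0.2, 0.1]    # illustrative values only
genotypes = hw_expected_genotypes(allele_freqs)
p_h = heterozygosity(allele_freqs)

# Summing the heterozygous genotypes directly should agree with P_H.
p_h_direct = sum(f for (i, j), f in genotypes.items() if i != j)

print(len(genotypes), round(p_h, 4), round(p_h_direct, 4))
```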

Example revisited: expected genotypic frequencies for a 4-allele system under H-W equilibrium; proportion of heterozygosity in F2 progeny.

GENERALISING: PROBABILITY RULES and PROPERTIES – Other Examples in brief
For multiple loci, the number of genotypes is $\prod_i n_i (n_i + 1)/2$, where $n_i$ = number of alleles for locus $i$.
Changes in gene frequency: from migration, mutation, selection.
Suppose the native population has allelic frequency $p_{n0}$. A proportion $m_i$ (relative to the native population) migrates from the ith of k populations to the native population every generation, the immigrants having allelic frequency $p_i$. So the allelic frequency in a mixed population: