
What is the probability that of 10 newborn babies at least 7 are boys? p(girl) = p(boy) = 0.5 Lecture 10 Important statistical distributions Bernoulli.


1 Lecture 10: Important statistical distributions. Bernoulli distribution. What is the probability that of 10 newborn babies at least 7 are boys? p(girl) = p(boy) = 0.5
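The question on this slide can be answered by summing binomial terms. The following is a minimal sketch; the function names `binomial_pmf` and `prob_at_least` are illustrative and not part of the lecture.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k_min: int, n: int, p: float) -> float:
    """Probability of k_min or more events in n trials."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

# At least 7 boys among 10 newborns with p(boy) = 0.5
p_at_least_7_boys = prob_at_least(7, 10, 0.5)
print(p_at_least_7_boys)  # (120 + 45 + 10 + 1) / 1024 = 0.171875
```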

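2 Bernoulli or binomial distribution. The Bernoulli or binomial distribution comes from the expansion of the binomial (p + q)^n, where q = 1 − p: the term C(n, k) p^k q^(n−k) is the probability of exactly k events in n trials.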

3 Assume the probability to find a certain disease in a tree population is 0.01. A biomonitoring program surveys 10 stands of trees and takes a random sample of 100 trees in each. What is the probability that 1, 2, 3, and more than 3 cases of this disease occur in these stands? Mean: np. Variance: np(1 − p). Standard deviation: √(np(1 − p)).
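A possible Bernoulli (binomial) solution is sketched below. It assumes that the 10 samples are pooled into n = 10 × 100 = 1000 trees; the slide does not say this explicitly, and for a single stand one would set `n_trees = 100` instead. All names are illustrative.

```python
from math import comb, sqrt

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_disease = 0.01
n_trees = 10 * 100  # assumption: all 10 stands pooled into one sample

# Probabilities of exactly 1, 2, and 3 cases
p_cases = {k: binomial_pmf(k, n_trees, p_disease) for k in (1, 2, 3)}
# More than 3 cases = 1 - P(0) - P(1) - P(2) - P(3)
p_more_than_3 = 1 - sum(binomial_pmf(k, n_trees, p_disease) for k in range(4))

# Moments of the binomial distribution
mean = n_trees * p_disease
variance = n_trees * p_disease * (1 - p_disease)
std_dev = sqrt(variance)

print(p_cases, p_more_than_3, mean, variance, std_dev)
```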

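4 Poisson distribution: the distribution of rare events. What happens if the number of trials n becomes larger and larger and the event probability p becomes smaller and smaller? In this limit, with λ = np held constant, the binomial approaches the Poisson distribution p(k) = λ^k e^(−λ) / k!.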
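The limit can be checked numerically. In this sketch the mean λ = np is held at a hypothetical value of 1 while n grows, and the largest difference between the binomial and Poisson probabilities is recorded for k = 0 to 10.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return lam**k * exp(-lam) / factorial(k)

def max_abs_gap(n: int, lam: float, k_max: int = 10) -> float:
    """Largest |binomial - Poisson| over k = 0..k_max with p = lam / n."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(k_max + 1))

lam = 1.0  # hypothetical mean number of events
gaps = [max_abs_gap(n, lam) for n in (10, 100, 1000, 10000)]
print(gaps)  # the gap shrinks as n grows and p = lam / n shrinks
```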

5 Assume the probability to find a certain disease in a tree population is 0.01. A biomonitoring program surveys 10 stands of trees and takes a random sample of 100 trees in each. What is the probability that 1, 2, 3, and more than 3 cases of this disease occur in these stands? Poisson solution and Bernoulli solution: the probability that no infected tree will be detected, and the probability of more than three infected trees.
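The two solutions can be compared side by side. As a sketch only, the code assumes the pooled sample of n = 1000 trees (an assumption, not stated on the slide), so that λ = np = 10.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return lam**k * exp(-lam) / factorial(k)

n_trees, p_disease = 1000, 0.01  # assumption: 10 stands x 100 trees pooled
lam = n_trees * p_disease

# Probability that no infected tree is detected
poisson_p0 = poisson_pmf(0, lam)
binomial_p0 = binomial_pmf(0, n_trees, p_disease)

# Probability of more than three infected trees
poisson_more_than_3 = 1 - sum(poisson_pmf(k, lam) for k in range(4))
binomial_more_than_3 = 1 - sum(binomial_pmf(k, n_trees, p_disease) for k in range(4))

print(poisson_p0, binomial_p0)
print(poisson_more_than_3, binomial_more_than_3)
```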

6 Mean and variance of the Poisson distribution: both equal λ. Skewness: 1/√λ.

7 What is the probability in Duży Lotek of three successive cumulations (rollovers, that is, draws without a winner) if 14 000 000 people bet the first time, 20 000 000 the second time, and 30 000 000 the third time? The probability to win is 1/C(49, 6) = 1/13 983 816. The events are independent. The zero term of the Poisson distribution gives the probability of no event, e^(−λ); the probability of at least one event is 1 − e^(−λ).
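A sketch of the calculation follows. It assumes that every bettor plays exactly one independent ticket in a 6-of-49 draw, so that the number of winners per draw is approximately Poisson with λ = (number of bets) × (probability to win).

```python
from math import comb, exp

p_win = 1 / comb(49, 6)  # one ticket, six numbers out of 49
bets_per_draw = [14_000_000, 20_000_000, 30_000_000]

def prob_no_winner(n_bets: int, p: float) -> float:
    """Zero term of the Poisson distribution with lambda = n_bets * p."""
    return exp(-n_bets * p)

# The three draws are independent, so the probabilities multiply
p_three_cumulations = 1.0
for n_bets in bets_per_draw:
    p_three_cumulations *= prob_no_winner(n_bets, p_win)

# Probability of at least one winner in the first draw
p_winner_first_draw = 1 - prob_no_winner(bets_per_draw[0], p_win)

print(p_three_cumulations, p_winner_first_draw)
```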

8 The construction of evolutionary trees from DNA sequence data: probabilities of DNA substitution. We assume equal substitution probabilities. Let p be the probability of each particular substitution between the nucleotides A, T, C, and G. The probability that A mutates to T, C, or G is P(¬A) = p + p + p = 3p. The probability of no mutation is P(A) = 1 − 3p, so that p(A→T) + p(A→C) + p(A→G) + p(A→A) = 1. Independent events: the probability that A mutates to T and C mutates to G is P(AC) = p × p.

9 The probability matrix P, with rows and columns ordered A, T, C, G, has 1 − 3p on the diagonal and p in every other cell. What is the probability that after 5 generations A did not change? The Jukes-Cantor model (JC69) assumes that all substitution probabilities are equal.
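The five-generation question can be read in two ways, and the sketch below computes both: the probability that A never changed, (1 − 3p)^5, and the probability that the site is A again after 5 generations, which is the A-to-A entry of the fifth power of the matrix. The value p = 0.01 is hypothetical, and the helper names are illustrative.

```python
def jc_matrix(p: float) -> list[list[float]]:
    """4 x 4 substitution matrix: 1 - 3p on the diagonal, p elsewhere."""
    return [[1 - 3 * p if i == j else p for j in range(4)] for i in range(4)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_pow(m, n: int):
    """n-th power of a 4 x 4 matrix by repeated multiplication (n >= 1)."""
    result = m
    for _ in range(n - 1):
        result = mat_mul(result, m)
    return result

p = 0.01  # hypothetical substitution probability per generation
P5 = mat_pow(jc_matrix(p), 5)

p_A_is_A_after_5 = P5[0][0]          # includes changes that reverted to A
p_A_never_changed = (1 - 3 * p) ** 5  # no substitution in any generation
closed_form = 0.25 + 0.75 * (1 - 4 * p) ** 5  # eigenvalue form of P5[0][0]

print(p_A_is_A_after_5, p_A_never_changed, closed_form)
```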

10 Arrhenius model. The Jukes-Cantor model assumes equal substitution probabilities among the 4 nucleotides A, T, G, and C. Substitution probability after time t, from the transition matrix and the substitution matrix, in four steps: the probability that nothing changes, which is the zero term of the Poisson distribution; the probability of at least one substitution; the probability to reach a nucleotide from any other; and the probability that a nucleotide doesn't change after time t.
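The formulas of this slide did not survive extraction. As a sketch, the code below uses the standard continuous-time JC69 results, assuming a rate p per particular substitution: with substitution events (including replacement by the same nucleotide) occurring at a total rate 4p, the Poisson zero term is e^(−4pt), and the four steps combine to P(same) = 1/4 + 3/4 e^(−4pt). This parameterization is an assumption and may differ from the one used in the lecture.

```python
from math import exp

def p_no_event(p: float, t: float) -> float:
    """Zero term of the Poisson distribution with lambda = 4 p t."""
    return exp(-4 * p * t)

def p_specific_other(p: float, t: float) -> float:
    """Probability to end at one particular different nucleotide."""
    return 0.25 * (1 - p_no_event(p, t))

def p_same(p: float, t: float) -> float:
    """Probability that a site shows the same nucleotide after time t."""
    return p_no_event(p, t) + 0.25 * (1 - p_no_event(p, t))

p, t = 0.01, 7.0  # hypothetical rate and time
print(p_same(p, t), p_specific_other(p, t))
```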

11 What is the probability of n differences after time t? We use the principle of maximum likelihood and the Bernoulli distribution. Probability for a single difference. The result is the mean time to get x different sites from a sequence of n nucleotides. It is also a measure of distance that depends only on the number of substitutions.
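The slide's own expression is not preserved. A commonly used form of such a distance is the standard JC69 estimate d = −(3/4) ln(1 − (4/3) x/n), where x of n sites differ; the sketch below implements that formula as an assumption about what the slide shows. The function name is illustrative.

```python
from math import log

def jc69_distance(x_different: int, n_sites: int) -> float:
    """Expected substitutions per site under JC69 for x of n differing sites."""
    share = x_different / n_sites
    if share >= 0.75:
        # At 75% differences the sequences are saturated; the log is undefined
        raise ValueError("share of different sites must be below 0.75")
    return -0.75 * log(1 - 4 * share / 3)

# Hypothetical example: 10 differing sites in a sequence of 100 nucleotides
print(jc69_distance(10, 100))  # slightly larger than the raw share 0.10
```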

12 Phylogenetic trees are the basis of any systematic classification. [Figure: phylogenetic tree of Gorilla, Pan paniscus, Pan troglodytes, Homo sapiens, and Homo neandertalensis; axes: time and divergence (number of substitutions).]

13 A pile model to generate the binomial. If the number of steps is very, very large, the binomial becomes smooth. The normal distribution is the continuous equivalent of the discrete Bernoulli distribution. Abraham de Moivre (1667-1754)
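How closely the binomial follows the normal curve can be checked numerically. This sketch compares the binomial probabilities for a hypothetical n = 100 and p = 0.5 with a normal density of the same mean np and variance np(1 − p).

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

n, p = 100, 0.5  # hypothetical number of steps and step probability
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Largest difference between the two curves over all k
largest_gap = max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
                  for k in range(n + 1))
print(binomial_pmf(50, n, p), normal_pdf(50, mu, sigma), largest_gap)
```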

14 The central limit theorem: if we have a series of independent random variates X_n, a new random variate Y_n that is the sum of all X_n will, for n→∞, be asymptotically normally distributed.
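A small simulation illustrates the theorem. As a sketch with hypothetical settings, it sums 30 uniform random numbers, repeats this 20 000 times, standardizes the sums, and counts how many fall within ±1.96, which should be close to 95% if the sums are approximately normal.

```python
import random
from statistics import fmean

rng = random.Random(0)  # fixed seed so that the run is repeatable

n_terms = 30     # number of variates X added into each sum Y
n_sums = 20_000  # number of simulated sums

sums = [sum(rng.random() for _ in range(n_terms)) for _ in range(n_sums)]

# A uniform(0, 1) variate has mean 1/2 and variance 1/12
mu = n_terms * 0.5
sigma = (n_terms / 12) ** 0.5
z_values = [(y - mu) / sigma for y in sums]

share_within_1_96 = sum(abs(z) < 1.96 for z in z_values) / n_sums
print(fmean(z_values), share_within_1_96)
```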

15 The normal or Gaussian distribution: f(x) = 1/(σ√(2π)) e^(−(x − μ)²/(2σ²)). Mean: μ. Variance: σ².

16 Important features of the normal distribution. The function is defined for every real x. The frequency at x = μ is given by f(μ) = 1/(σ√(2π)). The distribution is symmetrical around μ. The points of inflection are given by the second derivative; setting it to zero gives x = μ ± σ.

17 [Figure: normal curve; the area between μ − σ and μ + σ is 0.68, and the area between μ − 2σ and μ + 2σ is 0.95.] Many statistical tests compare observed values with those of the standard normal distribution and assign the respective probabilities to H1.

18 The Z-transform: Z = (X − μ) / σ. The variate Z has a mean of 0 and a variance of 1. A Z-transform standardizes every statistical distribution. Tables of statistical distributions are always given as Z-transforms. The standard normal. The 95% confidence limit: μ ± 1.96σ.
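A minimal sketch of the Z-transform on a small hypothetical data set; it uses the population standard deviation, so the transformed values have mean 0 and variance 1 up to rounding.

```python
from statistics import fmean, pstdev

def z_transform(values: list[float]) -> list[float]:
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

sample = [4.1, 5.0, 5.6, 6.3, 7.0, 8.2]  # hypothetical measurements
z = z_transform(sample)
print(fmean(z), pstdev(z))  # approximately 0 and 1
```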

19 The Z-transformed (standardized) normal distribution. The Fisherian significance levels: P(μ − σ < X < μ + σ) = 68%; P(μ − 1.65σ < X < μ + 1.65σ) = 90%; P(μ − 1.96σ < X < μ + 1.96σ) = 95%; P(μ − 2.58σ < X < μ + 2.58σ) = 99%; P(μ − 3.29σ < X < μ + 3.29σ) = 99.9%. [Figure: normal curve with the areas 0.68 (μ ± σ) and 0.95 (μ ± 2σ) marked.]
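These percentages can be reproduced with the error function, since for a normal variate P(μ − kσ < X < μ + kσ) = erf(k/√2). The function name `coverage` is illustrative.

```python
from math import erf, sqrt

def coverage(k: float) -> float:
    """Probability that a normal variate lies within k standard deviations."""
    return erf(k / sqrt(2))

for k in (1.0, 1.65, 1.96, 2.58, 3.29):
    print(k, round(coverage(k), 4))
```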

20 Why is the normal distribution so important? (1) The normal distribution is often at least approximately found in nature. Many additive or multiplicative processes generate distributions of patterns that are normal. Examples are body sizes, intelligence, abundances, phylogenetic branching patterns, metabolism rates of individuals, plant and animal organ sizes, or egg numbers. Indeed, following the Belgian statistician Adolphe Quetelet (1796-1874), the normal distribution was long even held to be a natural law. However, newer studies showed that the normal distribution is most often only an approximation and that real distributions frequently follow more complicated asymmetrical distributions, for instance skewed normals. (2) The normal distribution follows from the binomial. Hence, if we take samples out of a large population of discrete events, we expect the frequency distribution of events to be normal. (3) The central limit theorem holds that means of additive variables should be normally distributed. This is a generalization of the second argument. In other words, the normal is the expectation when dealing with a large number of influencing variables. (4) Gauß derived the normal distribution from the distribution of errors within his treatment of measurement errors. If we measure the same thing many times, our measurements will not always give the same value. Because many factors might influence our measurement errors, the central limit theorem points again to a normal distribution of errors around the mean. In the next lecture we will see that the normal distribution can be approximated by a number of other important distributions that form the basis of important statistical tests.

21 The estimation of the population mean μ from a series of samples with mean x̄ and standard deviation s. The n samples come from an additive random variate. Z is asymptotically normally distributed. Standard error: s/√n. Confidence limit of the estimate of a mean from a series of samples: x̄ ± z_α s/√n, where α is the desired probability level.
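A sketch of the estimate on hypothetical data: the standard error is s/√n, and the confidence limits are the sample mean plus or minus the normal quantile times the standard error. For small samples a t quantile would usually replace the normal one; the normal quantile is used here because the slide refers to Z.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(values: list[float], level: float = 0.95):
    """Return (lower, upper, standard_error) for the mean of values."""
    n = len(values)
    mean = fmean(values)
    standard_error = stdev(values) / sqrt(n)  # s / sqrt(n)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error, standard_error

# Hypothetical sample of 10 measurements
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.9, 10.6, 10.0, 10.3]
lower, upper, se = mean_confidence_interval(sample)
print(lower, upper, se)
```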

22 How to apply the normal distribution. Intelligence is approximately normally distributed with a mean of 100 (by definition) and a standard deviation of 16 (in North America). For an intelligence study we need 100 persons with an IQ above 130. How many persons do we have to test to find this number if we take random samples (and do not test university students only)?
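One way to answer the question is to compute the share of the population above 130 and divide the required 100 persons by it. The sketch below gives the expected number of persons to test; the number actually needed in any one study varies around it.

```python
from math import ceil
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)

# Share of the population with an IQ above 130, i.e. Z > 30/16 = 1.875
p_above_130 = 1 - iq.cdf(130)

# Expected number of persons to test to find 100 such persons
n_to_test = ceil(100 / p_above_130)

print(p_above_130, n_to_test)
```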

23

24 One- and two-sided tests. We measure blood sugar concentrations and know that our method estimates the concentration with an error of about 3%. What is the probability that our measurement deviates from the real value by more than 5%?
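A sketch of both variants, assuming that the stated error of 3% is one standard deviation of normally distributed measurement errors (the slide does not say so explicitly). A deviation of more than 5% in either direction is the two-sided case; a deviation of more than 5% in one given direction is the one-sided case.

```python
from statistics import NormalDist

error_sd = 3.0   # assumption: the 3% error is one standard deviation
deviation = 5.0  # deviation of interest, in percent

z = deviation / error_sd
one_sided = 1 - NormalDist().cdf(z)  # more than 5% too high (or too low)
two_sided = 2 * one_sided            # more than 5% in either direction

print(one_sided, two_sided)
```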

25 Albinos are rare in human populations. Assume their frequency is 1 per 100 000 persons. What is the probability to find 15 albinos among 1 000 000 persons? =KOMBINACJE(1000000,15)*0.00001^15*(1-0.00001)^999985 = 0.0347 (KOMBINACJE is the Polish-locale name of the Excel function COMBIN.)
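The spreadsheet formula can be reproduced outside Excel. This sketch evaluates the same binomial term and, for comparison, the Poisson term with λ = np = 10.

```python
from math import comb, exp, factorial

n_persons = 1_000_000
p_albino = 0.00001
k = 15

# Same expression as the spreadsheet formula on the slide
binomial_p15 = comb(n_persons, k) * p_albino**k * (1 - p_albino) ** (n_persons - k)

# Poisson approximation with lambda = n * p
lam = n_persons * p_albino
poisson_p15 = lam**k * exp(-lam) / factorial(k)

print(round(binomial_p15, 4), round(poisson_p15, 4))
```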

26 Homework and literature. Refresh: Bernoulli distribution, Poisson distribution, normal distribution, central limit theorem, confidence limits, one- and two-sided tests, Z-transform. Prepare for the next lecture: χ² test, Mendel rules, t-test, F-test, contingency table, G-test. Literature: Łomnicki: Statystyka dla biologów. Mendel: http://en.wikipedia.org/wiki/Mendelian_inheritance Pearson Chi2 test: http://en.wikipedia.org/wiki/Pearson's_chi-square_test

