Lecture 10: Important statistical distributions

Presentation transcript:

Lecture 10: Important statistical distributions. Bernoulli distribution. What is the probability that at least 7 of 10 newborn babies are boys? Assume p(girl) = p(boy) = 0.5.
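
The opening question can be worked through as a short Python sketch. It assumes independent births with p(boy) = 0.5 as stated; the function names are illustrative and not part of the lecture.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_at_least(k_min: int, n: int, p: float) -> float:
    """Probability of k_min or more successes in n independent trials."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

# At least 7 boys among 10 newborns: (120 + 45 + 10 + 1) / 1024
p_at_least_7_boys = binomial_at_least(7, 10, 0.5)
print(p_at_least_7_boys)
```

Summing the terms for 7, 8, 9, and 10 boys gives 176/1024, a little over 17%.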

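Bernoulli or binomial distribution. The Bernoulli or binomial distribution comes from the expansion of the binomial $(p + q)^n$ with $q = 1 - p$: $(p + q)^n = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} = 1$. The k-th term is the probability of exactly k events in n trials.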

Assume a known probability p of finding a certain disease in a tree population. A bio-monitoring program surveys 10 stands of trees and takes a random sample of 100 trees in each. What is the probability that 1, 2, 3, and more than 3 cases of this disease occur in these stands? Mean: $\mu = np$. Variance: $\sigma^2 = npq$. Standard deviation: $\sigma = \sqrt{npq}$.
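
The moments of the binomial can be checked numerically. The sketch below computes mean and variance directly from the probabilities and compares them with np and np(1 - p); the values n = 100 and p = 0.1 are illustrative assumptions, not values from the lecture.

```python
from math import comb, sqrt

def binomial_moments(n: int, p: float):
    """Mean, variance and standard deviation computed from the pmf itself."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * w for k, w in enumerate(pmf))
    variance = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, variance, sqrt(variance)

n, p = 100, 0.1  # illustrative values only
mean, variance, sd = binomial_moments(n, p)
print(mean, n * p)                 # both close to 10
print(variance, n * p * (1 - p))   # both close to 9
```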

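Poisson distribution: the distribution of rare events. What happens if the number of trials n becomes larger and larger while the event probability p becomes smaller and smaller? With the mean $\lambda = np$ held constant, the binomial approaches the Poisson distribution $p(k) = \frac{\lambda^k}{k!} e^{-\lambda}$.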
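
The limit can be watched numerically. This sketch holds the mean np fixed at an assumed value of 2 and lets n grow; the gap between the binomial and the Poisson probability of k = 3 events shrinks with every step. All numbers are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return lam**k * exp(-lam) / factorial(k)

lam = 2.0  # assumed mean number of events, kept constant
k = 3      # number of events whose probability is compared
errors = []
for n in (10, 100, 1000, 10000):
    p = lam / n  # p shrinks as n grows
    errors.append(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)))
print(errors)
```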

Assume a known probability p of finding a certain disease in a tree population. A bio-monitoring program surveys 10 stands of trees and takes a random sample of 100 trees in each. What is the probability that 1, 2, 3, and more than 3 cases of this disease occur in these stands? The problem can be solved with the Poisson distribution (Poisson solution) or with the binomial (Bernoulli solution). Both give the probability that no infected tree will be detected and the probability of more than three infected trees.
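
A sketch of both solutions side by side. The disease probability did not survive in the transcript, so p = 0.001 below is a purely hypothetical value, and pooling the 10 stands into one sample of 1000 trees is likewise an assumption.

```python
from math import comb, exp, factorial

p = 0.001      # hypothetical disease probability per tree (not from the lecture)
n = 10 * 100   # 10 stands with 100 trees each, pooled into one sample
lam = n * p    # Poisson mean

def binomial_pmf(k: int) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int) -> float:
    return lam**k * exp(-lam) / factorial(k)

binom = [binomial_pmf(k) for k in range(4)]  # 0, 1, 2, 3 infected trees
pois = [poisson_pmf(k) for k in range(4)]
binom_more_than_3 = 1 - sum(binom)
pois_more_than_3 = 1 - sum(pois)

for k in range(4):
    print(k, round(binom[k], 4), round(pois[k], 4))
print(">3", round(binom_more_than_3, 4), round(pois_more_than_3, 4))
```

With n this large and p this small the two columns agree closely, which is why the Poisson solution is the convenient one for rare events.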

Poisson distribution. Mean: $\mu = \lambda$. Variance: $\sigma^2 = \lambda$. Skewness: $\gamma = 1/\sqrt{\lambda}$.

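What is the probability of three successive rollovers (cumulations) in Duży Lotek, given the number of people who bet in the first, the second, and the third draw and the probability to win with a single bet? The events are independent, so the probabilities of the three draws multiply. The zero term of the Poisson distribution gives the probability of no event: $p(0) = e^{-\lambda}$. The probability of at least one event is $p(k \ge 1) = 1 - e^{-\lambda}$.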
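
A sketch of the lottery calculation. The numbers of bets are missing from the transcript, so the three values below are hypothetical; the winning probability assumes a 6-of-49 draw and one bet per person, which are also assumptions.

```python
from math import comb, exp

p_win = 1 / comb(49, 6)  # assumption: 6 numbers drawn out of 49
bets_per_draw = [10_000_000, 20_000_000, 30_000_000]  # hypothetical numbers of bets

def p_no_winner(n_bets: int, p: float) -> float:
    """Zero term of the Poisson distribution with mean n_bets * p."""
    return exp(-n_bets * p)

# Independent draws: the probabilities of "no winner" multiply.
p_three_rollovers = 1.0
for n_bets in bets_per_draw:
    p_three_rollovers *= p_no_winner(n_bets, p_win)

p_at_least_one_winner_first_draw = 1 - p_no_winner(bets_per_draw[0], p_win)
print(p_three_rollovers, p_at_least_one_winner_first_draw)
```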

The construction of evolutionary trees from DNA sequence data. Probabilities of DNA substitution: we assume equal substitution probabilities among the four nucleotides A, T, C, and G. If the probability of each specific substitution is p, the probability that A mutates to T, C, or G is $P_{\neg A} = p + p + p = 3p$, and the probability of no mutation is $p_A = 1 - 3p$, so that $p(A \to T) + p(A \to C) + p(A \to G) + p(A \to A) = 1$. For independent events, the probability that A mutates to T and C mutates to G is $P_{AC} = p \times p = p^2$.

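The probability matrix has rows and columns A, T, C, G; the entry in row X and column Y is the probability that X is replaced by Y within one generation. The Jukes-Cantor model (JC69) assumes that all substitution probabilities are equal, so every off-diagonal entry is p and every diagonal entry is 1 - 3p. What is the probability that after 5 generations A did not change?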
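
The five-generation question can be answered by raising the probability matrix to the fifth power. The sketch below assumes equal substitution probabilities and a hypothetical p = 0.01 per generation. It separates two readings of "did not change": A is observed as A again after 5 generations (back mutations allowed), and A never mutated at all.

```python
def jc_matrix(p: float):
    """4x4 one-generation matrix: p off the diagonal, 1 - 3p on it."""
    return [[1 - 3 * p if i == j else p for j in range(4)] for i in range(4)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_pow(m, n: int):
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result

p = 0.01          # hypothetical substitution probability per generation
generations = 5
P5 = mat_pow(jc_matrix(p), generations)

p_a_is_a_again = P5[0][0]                      # A -> A, back mutations included
p_a_never_changed = (1 - 3 * p) ** generations  # no substitution in any generation
closed_form = 0.25 + 0.75 * (1 - 4 * p) ** generations
print(p_a_is_a_again, closed_form, p_a_never_changed)
```

The matrix has eigenvalues 1 and 1 - 4p, which is where the closed form for the diagonal entry comes from.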

Arrhenius model. The Jukes-Cantor model assumes equal substitution probabilities among the 4 nucleotides A, T, G, and C. Quantities derived for time t: the substitution probability after time t (transition matrix and substitution matrix); the probability that nothing changes, which is the zero term of the Poisson distribution; the probability of at least one substitution; the probability of reaching a nucleotide from any other; and the probability that a nucleotide does not change after time t.

Probability for a single difference. This is the mean time to get x different sites in a sequence of n nucleotides. It is also a measure of distance that depends only on the number of substitutions. What is the probability of n differences after time t? We use the principle of maximum likelihood and the Bernoulli distribution.
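
As a sketch of such a distance measure, the code below uses the standard JC69 distance estimate, d = -(3/4) ln(1 - (4/3) x/n), for x differing sites among n aligned nucleotides. Whether the lecture uses exactly this parametrization is an assumption; the function name and the example counts are illustrative.

```python
from math import log

def jc69_distance(n_diff: int, n_sites: int) -> float:
    """Estimated number of substitutions per site under JC69."""
    frac = n_diff / n_sites
    if frac >= 0.75:
        # At 75% differing sites the sequences are saturated; no finite estimate.
        raise ValueError("proportion of differing sites must be below 0.75")
    return -0.75 * log(1 - 4 * frac / 3)

d = jc69_distance(10, 100)  # hypothetical: 10 differences in 100 sites
print(d)  # slightly above 0.10, because hidden multiple substitutions are corrected for
```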

Phylogenetic trees are the basis of any systematic classification. [Figure: phylogenetic tree of Gorilla, Pan paniscus, Pan troglodytes, Homo sapiens, and Homo neanderthalensis; axes: time and divergence (number of substitutions).]

A pile model to generate the binomial. If the number of steps is very, very large, the binomial becomes smooth. The normal distribution is the continuous equivalent of the discrete Bernoulli distribution. Abraham de Moivre.
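
How smooth the binomial becomes can be checked by comparing it with a normal density that has the same mean np and standard deviation sqrt(np(1 - p)). The values n = 400 and p = 0.5 are illustrative assumptions.

```python
from math import comb, exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

n, p = 400, 0.5  # illustrative number of steps and step probability
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Largest gap between a binomial probability and the normal density at the same k
max_diff = max(
    abs(comb(n, k) * p**k * (1 - p) ** (n - k) - normal_pdf(k, mu, sigma))
    for k in range(n + 1)
)
print(max_diff)
```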

The central limit theorem. If we have a series of independent random variates $X_i$ with finite variance, a new random variate $Y_n$ that is the sum of the first n variates $X_i$ will for n→∞ be asymptotically normally distributed.
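
A small simulation illustrates the theorem. The sketch sums 30 uniform random numbers, standardizes the sums, and checks how many fall within ±1.96, which should be about 95% for a normal variate. The choice of uniform variates, the number of terms, and the seed are all assumptions made for the illustration.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

n_terms = 30     # number of variates X_i in each sum Y_n
n_sums = 20000   # number of simulated sums
mu_x = 0.5               # mean of a uniform(0, 1) variate
sigma_x = sqrt(1 / 12)   # its standard deviation

z_values = []
for _ in range(n_sums):
    y = sum(random.random() for _ in range(n_terms))
    z_values.append((y - n_terms * mu_x) / (sigma_x * sqrt(n_terms)))

fraction_within_95 = sum(abs(z) < 1.96 for z in z_values) / n_sums
print(fraction_within_95)
```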

The normal or Gaussian distribution: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. Mean: $\mu$. Variance: $\sigma^2$.

Important features of the normal distribution. The function is defined for every real x. The frequency at $x = \mu$ is given by $f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$. The distribution is symmetrical around $\mu$. The points of inflection are given by the second derivative. Setting this to zero gives $x = \mu \pm \sigma$.

++ --  -2  0.95 Many statistical tests compare observed values with those of the standard normal distribution and assign the respective probabilities to H 1.

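The Z-transform: $Z = \frac{X - \mu}{\sigma}$. The variate Z has a mean of 0 and a variance of 1. A Z-transform standardizes any statistical distribution to this mean and variance without changing its shape. Tables of the normal distribution are usually given for the standardized variate Z. The standard normal: $f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$. The 95% confidence limit: $\mu \pm 1.96\sigma$.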
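
A minimal sketch of the Z-transform applied to a sample. The data values are made up for the illustration, and the population standard deviation is used so that the transformed values have a variance of exactly 1.

```python
from statistics import fmean, pstdev

def z_transform(values):
    """Subtract the mean and divide by the standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

body_sizes = [1.2, 1.9, 2.4, 2.6, 3.1, 3.8, 4.4]  # hypothetical measurements
z = z_transform(body_sizes)
print(fmean(z), pstdev(z))  # close to 0 and 1
```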

The Fisherian significance levels. The Z-transformed (standardized) normal distribution.
P(μ - σ < X < μ + σ) = 68%
P(μ - 1.65σ < X < μ + 1.65σ) = 90%
P(μ - 1.96σ < X < μ + 1.96σ) = 95%
P(μ - 2.58σ < X < μ + 2.58σ) = 99%
P(μ - 3.29σ < X < μ + 3.29σ) = 99.9%
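
These central probabilities follow from the standard normal distribution and can be recomputed with the error function, since P(-z < Z < z) = erf(z / sqrt(2)). The multipliers in the sketch are the usual rounded quantiles; the function name is illustrative.

```python
from math import erf, sqrt

def central_probability(z: float) -> float:
    """P(-z < Z < z) for a standard normal variate Z."""
    return erf(z / sqrt(2))

levels = {z: central_probability(z) for z in (1.0, 1.645, 1.96, 2.576, 3.291)}
for z, prob in levels.items():
    print(z, round(prob, 4))
```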

Why is the normal distribution so important?
The normal distribution is often found in nature, at least approximately. Many additive or multiplicative processes generate distributions of patterns that are normal. Examples are body sizes, intelligence, abundances, phylogenetic branching patterns, metabolic rates of individuals, plant and animal organ sizes, or egg numbers. Indeed, following the Belgian statistician Adolphe Quetelet, the normal distribution was long held to be a natural law. However, newer studies showed that most often the normal distribution is only an approximation and that real distributions frequently follow more complicated asymmetrical distributions, for instance skewed normals.
The normal distribution follows from the binomial. Hence, if we take samples out of a large population of discrete events, we expect the distribution of events (their frequency) to be normally distributed.
The central limit theorem holds that means of additive variables should be normally distributed. This is a generalization of the second argument. In other words, the normal is the expectation when dealing with a large number of influencing variables.
Gauß derived the normal distribution from the distribution of errors within his treatment of measurement errors. If we measure the same thing many times, our measurements will not always give the same value. Because many factors might influence our measurement errors, the central limit theorem points again to a normal distribution of errors around the mean.
In the next lecture we will see that the normal distribution can be approximated by a number of other important distributions that form the basis of important statistical tests.

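The estimation of the population mean from a series of samples. Take n samples, with sample mean $\bar{x}$ and standard deviation s, from an additive random variate with population mean $\mu$ and standard deviation $\sigma$. $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is asymptotically normally distributed. Confidence limits of the estimate of a mean from a series of samples: $\bar{x} \pm z\,\sigma/\sqrt{n}$, where z is the standard normal value for the desired probability level $\alpha$. Standard error: $SE = \sigma/\sqrt{n}$.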
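
A sketch of the standard error and the 95% confidence limits of a mean. The sample values are made up, and the sample standard deviation s stands in for the unknown σ, which is an assumption that is reasonable only for larger samples.

```python
from math import sqrt
from statistics import fmean, stdev

def mean_confidence_limits(sample, z: float = 1.96):
    """Return (lower, upper, standard_error) for the mean of the sample."""
    mean = fmean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    return mean - z * standard_error, mean + z * standard_error, standard_error

sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0]  # hypothetical measurements
lower, upper, se = mean_confidence_limits(sample)
print(lower, upper, se)
```

Quadrupling the number of samples halves the standard error, because it shrinks with the square root of n.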

How to apply the normal distribution. Intelligence is approximately normally distributed with a mean of 100 (by definition) and a standard deviation of 16 (in North America). For an intelligence study we need 100 persons with an IQ above 130. How many persons do we have to test to find this number if we take random samples (and do not test university students only)?
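
The question can be worked through with the Z-transform: z = (130 - 100) / 16 = 1.875, and the upper tail of the standard normal beyond this value is the share of persons with an IQ above 130. The sketch assumes an exactly normal distribution and reports the number of persons to test on average.

```python
from math import ceil, erf, sqrt

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for a normal variate with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

p_above_130 = 1 - normal_cdf(130, 100, 16)   # about 3% of the population
persons_to_test = ceil(100 / p_above_130)    # expected number to find 100 such persons
print(p_above_130, persons_to_test)
```

Roughly 3% of a random sample exceeds an IQ of 130, so about 3300 persons have to be tested on average.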

One- and two-sided tests. We measure blood sugar concentrations and know that our method estimates the concentration with an error of about 3%. What is the probability that our measurement deviates from the real value by more than 5%?
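
A sketch of the blood sugar example. It assumes that the 3% error is the standard deviation of normally distributed measurement errors, so a deviation of 5% corresponds to z = 5/3. The two-sided probability covers deviations in either direction; the one-sided probability covers only values that are too high (or only too low).

```python
from math import erf, sqrt

def standard_normal_cdf(z: float) -> float:
    return 0.5 * (1 + erf(z / sqrt(2)))

z = 5 / 3  # deviation of 5% in units of the assumed 3% standard deviation
one_sided = 1 - standard_normal_cdf(z)  # deviation of more than +5%
two_sided = 2 * one_sided               # deviation of more than 5% in either direction
print(one_sided, two_sided)
```

Under these assumptions a little under 10% of all measurements deviate by more than 5% from the real value.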

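Albinos are rare in human populations. Assume their frequency p is 1 per a given number of persons. What is the probability of finding 15 albinos among n persons? The binomial term in Excel (KOMBINACJE is the Polish name of COMBIN): =KOMBINACJE(n,15)*p^15*(1-p)^(n-15)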

Homework and literature. Refresh: Bernoulli distribution, Poisson distribution, normal distribution, central limit theorem, confidence limits, one- and two-sided tests, Z-transform. Prepare for the next lecture: χ² test, Mendel's rules, t-test, F-test, contingency table, G-test. Literature: Łomnicki: Statystyka dla biologów. Mendel: ce. Pearson Chi2 test: square_test.