Bayesian vs. frequentist inference
Frequentist:
1) Deductive hypothesis testing of Popper: ruling out alternative explanations. Falsification: contradicting evidence can prove that a theory is false, but no evidence can confirm a theory (a better theory, or a falsification, may still be found).
2) Statistical methods of Fisher: everything you have learned this semester.
Bayesian inference:
1) Requires explicit assignment of prior probabilities, based on existing information, to the outcomes of experiments.
2) Allows for assignment of probabilities to specific outcomes.
The frequentist assumption: there is a true, fixed value for each parameter of interest (e.g., average weight of yellow perch in Lake Erie, average height of college students), and the expected value of the parameter equals the average value obtained by infinitely repeated random sampling.
Some problems:
-- Populations change constantly, so nothing is really fixed
-- A truly random sample is hard to obtain (convenience samples are common)
-- Experiments are rarely repeated
Confidence in (frequentist) parameter estimates:
-- K% confidence interval around Xbar (usually 95%)
-- Tells us that the true mean (mu) will lie within (Xbar ± 1.96 SE) in 95% of the infinite number of possible samples we could collect
-- It does not state that there is a 95% probability that the true mean lies within the one interval we actually calculated
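The interval on this slide can be computed in a few lines. This is only a sketch: the perch weights are made-up values, the function name is hypothetical, and 1.96 is the large-sample normal multiplier (a small sample like this one would normally use a t multiplier instead).

```python
import math
import statistics


def confidence_interval_95(sample):
    """Return (lower, upper) for the approximate 95% CI of the mean: Xbar ± 1.96 SE."""
    n = len(sample)
    xbar = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    return xbar - 1.96 * se, xbar + 1.96 * se


# Hypothetical perch weights in grams (illustrative values, not real data).
weights = [112.0, 98.5, 105.2, 120.1, 101.7, 109.3, 95.8, 115.4]
lower, upper = confidence_interval_95(weights)
print(f"95% CI for the mean: ({lower:.1f}, {upper:.1f})")
```

The 95% describes the procedure: across many repeated samples, about 95% of intervals built this way would contain the true mean.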
Frequentist hypothesis testing:
-- The P-value of a statistical test is the probability of observing the given result (or one more extreme), conditional on H0: P(x|H0) = probability of x, given H0
-- The P-value does not say how probable the null is given the data, i.e., P(H0|x)
-- "Proving" that the null is false does not prove the alternative to be true.
P(HA|x)
-- Probability of the alternative hypothesis given the observed data
-- Likelihood that the treatment caused the effect (for experiments)
-- What we really want to know!!
Many researchers consider that a P-value < 0.05 means that P(HA|x) is high; in other words, they think it likely that the alternative hypothesis is true ... when they really tested P(x|H0).
Bayesian parameter estimation:
Begins with the joint probability of two events being equal to the product of the probability of the first event and the conditional probability of the second event, given the first event. It is usually expressed as Bayes' Theorem:
P(A|B) = P(B|A) * P(A) / P(B)
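The step from the joint probability to Bayes' Theorem can be written out as follows. The intersection symbol for "A and B" is notation added here, and the last step assumes P(B) > 0.

```latex
\begin{align*}
P(A \cap B) &= P(A)\,P(B \mid A) \\
P(A \cap B) &= P(B)\,P(A \mid B) \\
\Rightarrow\quad P(B)\,P(A \mid B) &= P(A)\,P(B \mid A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{align*}
```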
Rev. Thomas Bayes (b. 1702, London - d. 1761, Tunbridge Wells, Kent), mathematician who first used probability inductively and established a mathematical basis for probability inference (a means of calculating, from the number of times an event has not occurred, the probability that it will occur in future trials). He set down his findings on probability in "Essay Towards Solving a Problem in the Doctrine of Chances" (1763), published posthumously in the Philosophical Transactions of the Royal Society of London.
Consider 2 racehorses:
-- Buttercup and Muffin
-- Buttercup has won 7 of the last 12 races (58.3%) between the two horses
-- Muffin has won 5 of the last 12 (41.7%)
-- So... there is a greater likelihood of Buttercup winning, right?
Unless it is raining...
-- On 3 of Muffin's previous 5 wins, it was raining
-- So, if it is raining, you could estimate Muffin's chance of a win at 3/5 or 60%
-- But then you ignore the fact that, overall, Buttercup has won more often.
-- You must combine the two pieces of information
-- Look at 4 possible situations
                Raining   Not raining
Muffin wins        3           2
Muffin loses       1           6

-- If it is raining on the day you place your bet, you want to know the probability of Muffin winning in the rain.
-- # times X happened / # times X could have happened
-- 3 / 4 = 75% chance of Muffin winning
P(A|B) = P(B|A) * P(A) / P(B)
P(Muffin wins|rain) = P(rain|Muffin wins) * P(Muffin wins) / P(rain)
-- It was raining on 3 of the 5 days Muffin won; therefore, P(rain|Muffin wins) = 3/5 = 0.6
-- P(Muffin wins) = 5/12 = 0.417
-- P(rain) = 4/12 = 0.333
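P(Muffin wins|rain) = P(rain|Muffin wins) * P(Muffin wins) / P(rain)
P(Muffin wins|rain) = 0.6 * (5/12) / (4/12) = 0.75
-- This is the same 75% obtained by counting races in the rain (3 wins out of 4). Rounding the inputs to 0.417 and 0.33 before dividing gives about 0.758.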
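The racehorse calculation can be checked with exact fractions, which avoids rounding error in the intermediate values. This is a sketch: the dictionary layout and variable names are illustrative choices, and the counts are the ones from the race table.

```python
from fractions import Fraction

# Counts from the race table: (Muffin's result, weather) -> number of races.
races = {
    ("win", "rain"): 3,
    ("win", "dry"): 2,
    ("lose", "rain"): 1,
    ("lose", "dry"): 6,
}
total = sum(races.values())  # 12 races in all

wins = races[("win", "rain")] + races[("win", "dry")]         # 5 Muffin wins
rainy = races[("win", "rain")] + races[("lose", "rain")]      # 4 rainy races

p_win = Fraction(wins, total)                                 # P(Muffin wins) = 5/12
p_rain = Fraction(rainy, total)                               # P(rain) = 4/12
p_rain_given_win = Fraction(races[("win", "rain")], wins)     # P(rain|Muffin wins) = 3/5

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_win_given_rain = p_rain_given_win * p_win / p_rain

# Direct count: wins in the rain / races in the rain.
direct = Fraction(races[("win", "rain")], rainy)

print(p_win_given_rain, float(p_win_given_rain))  # 3/4 0.75
```

Both routes give 3/4, so the theorem and the direct count from the table agree.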
P(A|B) = P(B|A) * P(A) / P(B)
-- In Bayesian terminology, P(A) is the prior probability of obtaining a specific parameter value
-- P(A) is the probability of observing A expected by the investigator before the experiment is conducted. It can come from:
   -- existing data
   -- an objective statement without data
   -- a subjective measure of belief
-- Priors in ecology are usually based on previous data, information usually reported in the introduction of papers.
Non-mathematical explanation of prior probabilities
-- You want to estimate the whole-season batting average of Player X based on the first 2 weeks of the season
-- Player X batted .550 during the first two weeks
-- Any player is very unlikely to bat > .500 for the entire season
-- The player's, or the entire league's, batting average from last year could be used as a prior probability
-- Common sense dictates that Player X's average will tend down from .550 and toward last season's total average
-- Prior probabilities can be "non-informative" in the absence of any information
-- In that case P(A) is a uniform distribution, with all values equally likely
-- Posterior probability is what you are trying to figure out: the probability of the event in which you are interested.
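For anyone who wants numbers behind the batting example, here is one possible sketch. Everything specific in it is an assumption, not something from the slides: the Beta prior with a binomial likelihood (a standard conjugate pairing), the 22-for-40 start, the .260 league average, and the choice to weight the prior like 300 at-bats.

```python
def posterior_mean(hits, at_bats, prior_a, prior_b):
    """Posterior mean batting average under a Beta(prior_a, prior_b) prior.

    With a binomial likelihood the posterior is
    Beta(prior_a + hits, prior_b + at_bats - hits).
    """
    return (prior_a + hits) / (prior_a + prior_b + at_bats)


# Hypothetical start to the season: 22 hits in 40 at-bats = .550.
hits, at_bats = 22, 40

# Informative prior (assumed): league average .260, weighted like 300 at-bats.
informative = posterior_mean(hits, at_bats, prior_a=78, prior_b=222)

# Non-informative prior: Beta(1, 1) is uniform, all averages equally likely.
uniform = posterior_mean(hits, at_bats, prior_a=1, prior_b=1)

print(f"informative prior: {informative:.3f}")
print(f"uniform prior:     {uniform:.3f}")
```

Under these assumed numbers the informative prior pulls the estimate down to about .294, while the uniform prior leaves it near the raw .550 (about .548).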
Bayesian hypothesis testing:
-- Can generalize: Bayes' theorem can be extended to assess the relative probabilities of alternative hypotheses (alternative prior probability distributions)
-- Can determine the likelihood of a hypothesis:
posterior odds = prior odds * Bayes factor
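The odds relation is simple to apply once a Bayes factor is in hand. In the sketch below, the Bayes factor is taken to be P(x|HA) / P(x|H0), the usual definition. The function names and the example values (even prior odds, a Bayes factor of 3) are hypothetical.

```python
def posterior_odds(prior_odds, bayes_factor):
    """posterior odds = prior odds * Bayes factor."""
    return prior_odds * bayes_factor


def odds_to_probability(odds):
    """Convert odds in favor of HA into a probability, P(HA|x)."""
    return odds / (1 + odds)


# Hypothetical example: HA and H0 equally likely beforehand (prior odds 1:1),
# and the data are 3 times more probable under HA than under H0.
odds = posterior_odds(prior_odds=1.0, bayes_factor=3.0)
print(odds, odds_to_probability(odds))  # 3.0 0.75
```

Unlike a P-value, the result is a statement about P(HA|x), the quantity researchers usually want.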
-- Different scales of the Bayes factor have been proposed to say whether the data favor HA
-- More intuitive and satisfying than frequentist statistics, but harder to do!
-- The math is harder (integration), and little software is currently available
?? As N increases, it becomes less likely to accept the null at a fixed alpha level. What guidelines should we use to adjust alpha as sample size increases?
?? How can policy makers use a rejection of a null hypothesis at the p = 0.05 level to make a decision? What does rejection at the 0.05 level mean anyway?
?? Why do researchers act as if they have tested P(HA|x) when they really tested P(x|H0)? Do most researchers realize what they are testing? Or do they realize and not care... They just want to support that ol' alternative hypothesis?
?? What will motivate statisticians to write easy-to-use Bayesian software?
All students: be prepared to answer these questions.