Review
Common probability distributions. Discrete: binomial, Poisson, negative binomial, multinomial. Continuous: normal, lognormal, beta, gamma, (negative binomial). Experiment with the distributions in the Excel sheet “7 distributions.xlsx”.
Beta distribution [Figure: beta density curves; panel labels (parameter pairs): 0.5,0.5; 1,1; 1.3,1.3; 4,4; 50,50; 2,60.5,2] Beta function
Beta: key notes. Values are confined to 0 < x < 1. Can mimic almost any shape within those bounds. Although bounded, the bounds can be changed by rescaling (multiplying or dividing) the x values. E.g. survival parameters.
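The rescaling note can be illustrated with a minimal Python sketch. The function names (`beta_pdf`, `scaled_beta_pdf`) and the target interval (0, 10) are hypothetical choices for illustration and do not come from the spreadsheet; the sketch assumes the usual two shape parameters a and b.

```python
import math

def beta_pdf(x, a, b):
    """Density of the beta distribution with shape parameters a and b on 0 < x < 1."""
    if not 0.0 < x < 1.0:
        return 0.0
    beta_fn = math.gamma(a) * math.gamma(b) / math.gamma(a + b)  # beta function B(a, b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_fn

def scaled_beta_pdf(y, a, b, lo, hi):
    """Density of y = lo + (hi - lo) * x, where x follows a beta(a, b) distribution."""
    width = hi - lo
    # Map y back onto (0, 1), then divide by the width so the density still integrates to 1.
    return beta_pdf((y - lo) / width, a, b) / width

uniform_height = beta_pdf(0.3, 1, 1)                      # beta(1, 1) is flat on (0, 1)
peak_height = beta_pdf(0.5, 4, 4)                         # beta(4, 4) is symmetric about 0.5
rescaled_height = scaled_beta_pdf(5.0, 4, 4, 0.0, 10.0)   # same shape stretched to (0, 10)
```

Stretching the bounds by a factor of 10 lowers the curve by the same factor, which is why the rescaled density divides by the interval width.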
Gamma distribution [Figure: gamma density curves; panel labels (parameter pairs): 4, 1 4, 2 1.1, , , 5]
Gamma: key notes. 0 ≤ x < ∞. Can look somewhat like an exponential, lognormal, or normal distribution. Offers flexibility without being bounded above like the beta distribution. E.g. salmon arrival numbers plotted over time. The Excel function gamma.dist() assumes parameters α* = α and β* = 1/β.
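The parameter conversion in the last note can be checked with a short Python sketch. It assumes that β in these notes is a rate parameter and that the spreadsheet function expects a scale parameter (β* = 1/β); both function names are hypothetical.

```python
import math

def gamma_pdf_rate(x, alpha, beta):
    """Gamma density with shape alpha and rate beta (mean = alpha / beta)."""
    if x <= 0:
        return 0.0  # the density at exactly x = 0 is ignored for simplicity
    return beta ** alpha * x ** (alpha - 1) * math.exp(-beta * x) / math.gamma(alpha)

def gamma_pdf_scale(x, alpha, scale):
    """Gamma density with shape alpha and scale parameter (mean = alpha * scale)."""
    if x <= 0:
        return 0.0  # the density at exactly x = 0 is ignored for simplicity
    return x ** (alpha - 1) * math.exp(-x / scale) / (math.gamma(alpha) * scale ** alpha)

# Shape 4 with rate 2 should match shape 4 with scale 1/2.
rate_value = gamma_pdf_rate(2.0, 4, 2)
scale_value = gamma_pdf_scale(2.0, 4, 1 / 2)
```

If the two values disagree in a spreadsheet, the second parameter has most likely been entered in the wrong form.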
Likelihood case studies
Reading: Ecological detective, Chapter 7: Likelihood and maximum likelihood
Probability: If I flip a fair coin 10 times, what is the probability of it landing heads up every time? Given the fixed parameter (p = 0.5), what is the probability of different outcomes? Probabilities add up to 1.
Likelihood: I flipped a coin 10 times and obtained 10 heads. What is the likelihood that the coin is fair? Given the fixed outcomes (data), what is the likelihood of different parameter values? Likelihoods do not add up to 1. Hypotheses (parameter values) are compared using likelihood values (higher = better).
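The coin example can be sketched in Python to show the two directions of the same binomial formula. The candidate values of p are arbitrary illustrative choices, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability: p is fixed at 0.5 and the outcome varies.
prob_ten_heads = binomial_pmf(10, 10, 0.5)
total_probability = sum(binomial_pmf(k, 10, 0.5) for k in range(11))  # sums to 1

# Likelihood: the data are fixed at 10 heads in 10 flips and p varies.
candidate_p = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
likelihoods = {p: binomial_pmf(10, 10, p) for p in candidate_p}
total_likelihood = sum(likelihoods.values())  # no reason for this to equal 1
```

Only the ranking of the likelihood values matters: here larger p is better supported by ten heads in a row.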
Probability: What is the probability that 5 ≤ x ≤ 10 given a normal distribution with µ = 13 and σ = 4? Answer: the area under the curve between 5 and 10 (about 0.20). What is the probability that –1000 ≤ x ≤ 1000 given a normal distribution with µ = 13 and σ = 4? Answer: essentially 1.
Likelihood: What is the likelihood that µ = 13 and σ = 4 if you observed a value of (a) x = 10 (answer: the likelihood is 0.075, the height of the curve at x = 10) (b) x = 14 (answer: the likelihood is 0.097, the height of the curve at x = 14)? Conclusion: if the observed value was 14, it is more likely that the parameters are µ = 13 and σ = 4, because 0.097 is higher than 0.075.
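A minimal Python sketch of the same distinction for the normal distribution: probabilities are areas under the density curve, likelihoods are heights of the curve at the observed value. The helper names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """Area under the normal density curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Probability: area under the curve between two x values.
prob_5_to_10 = normal_cdf(10, 13, 4) - normal_cdf(5, 13, 4)
prob_wide = normal_cdf(1000, 13, 4) - normal_cdf(-1000, 13, 4)

# Likelihood: height of the curve at the observed value.
like_x10 = normal_pdf(10, 13, 4)
like_x14 = normal_pdf(14, 13, 4)
```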
Notation. Probability: what is the probability of observing a variety of data Y_i given a fixed parameter value p? Likelihood: what is the likelihood of different hypotheses about a variety of parameter values p_m given that you observed fixed data Y?
Probability (binomial): You are studying birth rates in a sea otter population with N = 30 females. During a study season each female either gives birth or does not give birth. Probability: it is (somehow) known that birth rates are p = 0.7. What is the probability that ≤10 of the females give birth? Answer: sum the binomial probabilities for x = 0, 1, …, 10 (Examples.xlsx, sheet Sea otter prob).
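The summation described in the answer can be sketched in Python as follows. This is an independent check under the stated values (N = 30, p = 0.7), not a copy of the spreadsheet; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of k births among n females when each gives birth with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_females = 30
birth_rate = 0.7

# Sum the probabilities of 0, 1, ..., 10 births.
prob_at_most_10 = sum(binomial_pmf(k, n_females, birth_rate) for k in range(11))

# Complement: probability of 11 or more births.
prob_more_than_10 = sum(binomial_pmf(k, n_females, birth_rate) for k in range(11, n_females + 1))
```

With a birth rate of 0.7 the expected number of births is 21, so ten or fewer births is a very unlikely outcome.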
Likelihood (binomial): You are studying birth rates in a sea otter population with N = 30 females. During a study season each female either gives birth or does not give birth. Likelihood: 10 females give birth; 20 do not. What is the likelihood that p = 0.3? 0.5? 0.7? Answer: evaluate the binomial likelihood L of observing 10 births out of 30 at each value of p. The MLE is p = 10/30 ≈ 0.33 (Examples.xlsx, sheet Sea otter like).
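A minimal Python sketch of this likelihood calculation, assuming the likelihood includes the binomial coefficient (the spreadsheet may or may not omit it). The grid search is only a sanity check on the closed-form MLE; all names are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Binomial likelihood of p given k births among n females."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

births, n_females = 10, 30

# Likelihood of each hypothesised birth rate given the fixed data.
likelihoods = {p: binomial_pmf(births, n_females, p) for p in (0.3, 0.5, 0.7)}

# Closed-form MLE for a binomial proportion: observed successes / trials.
p_mle = births / n_females
like_mle = binomial_pmf(births, n_females, p_mle)

# Brute-force check: search a fine grid of p values for the highest likelihood.
grid = [i / 1000 for i in range(1, 1000)]
p_grid_best = max(grid, key=lambda p: binomial_pmf(births, n_females, p))
```

Of the three hypotheses, p = 0.3 has the highest likelihood because it lies closest to the observed proportion.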
Review of logarithms
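Negative log likelihood (normal): −ln L = ln(σ) + ½ ln(2π) + (x − µ)² / (2σ²), where x is the data, µ is the model-predicted mean, and σ is the model-predicted standard deviation. Minimizing the last term is minimizing the sum of squares, weighted by the variance.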
[Figure: NLL curves with constants omitted and with constants included; the minimum occurs at the same x in both] Examples.xlsx, sheet NLL vs. Like. Guess the likelihood that x = 25. Answer: 6.6× Guess the likelihood that x = 51. Answer: 6.8×10^-82
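The point about constants can be sketched in Python: dropping the ½ ln(2π) term shifts the negative log likelihood curve up or down without moving its minimum. The observation (10), the standard deviation (4), and the grid of candidate means are hypothetical illustrative values, not those in the spreadsheet.

```python
import math

def normal_nll(x, mu, sigma, include_constants=True):
    """Negative log likelihood of one observation x under a normal(mu, sigma) model."""
    nll = math.log(sigma) + (x - mu) ** 2 / (2 * sigma ** 2)
    if include_constants:
        nll += 0.5 * math.log(2 * math.pi)  # constant: does not depend on mu or sigma
    return nll

observation, sigma = 10.0, 4.0                 # hypothetical data point and fixed sigma
mu_grid = [m / 10 for m in range(0, 201)]      # candidate means 0.0, 0.1, ..., 20.0

best_with = min(mu_grid, key=lambda m: normal_nll(observation, m, sigma, True))
best_without = min(mu_grid, key=lambda m: normal_nll(observation, m, sigma, False))

# Exponentiating the full negative log likelihood recovers the likelihood itself.
likelihood_back = math.exp(-normal_nll(10, 13, 4))
```

Working on the log scale also avoids underflow when likelihoods become extremely small.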
Multiple observations: if observations (data points) are independent, then we can multiply the likelihoods together, or add up the negative log likelihoods.
Multiple observations: The sea otter study has been extended to three field seasons, based on the same N = 30 females. In these three seasons, 10, 20, and 13 females gave birth respectively. Assuming independence between years, what is the overall likelihood that p = 0.3? 0.5? 0.7?
Table (Examples.xlsx, sheet Sea otter 3 seasons): columns are Year, Observed, p = 0.3, p = 0.5, p = 0.7, and p = MLE. The year rows give the binomial likelihood for p in that year given the observed data. The summary rows are L1×L2×L3 (product of the likelihoods), Total NLL, and Scaled L (likelihood divided by the MLE likelihood).
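The three-season calculation can be sketched in Python. The sketch assumes independent seasons and a binomial likelihood with the coefficient included, and it uses the pooled proportion of births as the MLE; function and variable names are hypothetical.

```python
import math
from math import comb

def binomial_pmf(k, n, p):
    """Binomial likelihood of p given k births among n females in one season."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

births_by_year = [10, 20, 13]
n_females = 30

def total_likelihood(p):
    """Product of the per-season likelihoods (L1 x L2 x L3)."""
    product = 1.0
    for births in births_by_year:
        product *= binomial_pmf(births, n_females, p)
    return product

def total_nll(p):
    """Sum of the per-season negative log likelihoods."""
    return -sum(math.log(binomial_pmf(births, n_females, p)) for births in births_by_year)

# With a common p across seasons, the MLE is the pooled proportion of births.
p_mle = sum(births_by_year) / (n_females * len(births_by_year))

# Scaled likelihood: each hypothesis relative to the best-supported value of p.
scaled = {p: total_likelihood(p) / total_likelihood(p_mle) for p in (0.3, 0.5, 0.7)}
```

Multiplying likelihoods and adding negative log likelihoods give the same ranking of hypotheses; the scaled values make that ranking easy to read.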
Mark-recapture example: We tagged 100 fish, went back a few days later (after mixing etc.), and recaptured 100 fish, of which k = 5 were tagged. We use the Poisson distribution to explore the likelihood of different population sizes (N). The question is: what is N?
What we need. Data: number marked, number recaptured, tags recaptured. If the population size is N: prop. tagged = num. marked / N; predicted recoveries = prop. tagged × num. recaptured. The predicted number of tags recovered is the λ of the Poisson, i.e. if λ = 1, 2, …, 100, then what is the likelihood of observing 5 recaptures?
Table (Examples.xlsx, sheet Mark-recapture): columns are Population size, Num. tagged, Prop. tagged, Num. recaptured, Pred. tags recovered, Likelihood. Observed data: k = 5
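The mark-recapture calculation can be sketched in Python as follows. The grid of candidate population sizes (500 to 10000 in steps of 100) is a hypothetical choice, and the function names are illustrative rather than taken from the spreadsheet.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing k events when the Poisson mean is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def predicted_tags(pop_size, num_marked=100, num_recaptured=100):
    """Predicted tags recovered: proportion tagged times number recaptured."""
    prop_tagged = num_marked / pop_size
    return prop_tagged * num_recaptured

def likelihood(pop_size, tags_observed=5):
    """Poisson likelihood of a population size given the observed tagged recaptures."""
    return poisson_pmf(tags_observed, predicted_tags(pop_size))

# Hypothetical grid of candidate population sizes.
candidate_sizes = range(500, 10001, 100)
n_best = max(candidate_sizes, key=likelihood)
```

The best-supported population size is the one whose predicted number of recovered tags equals the observed count.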
Examples.xlsx, sheet Mark-recapture
Multiple observations: We go out twice more and capture 100 animals each time; 3 of these are tagged the second time, and 4 are tagged the third time.
Examples.xlsx, sheet Mark-recapture
Multiply the three likelihoods together. The result is much narrower (Examples.xlsx, sheet Mark-recapture MLE).
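Combining the three surveys can be sketched in Python. The sketch assumes the surveys are independent and that 100 fish remain tagged in each; the candidate grid and all names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing k events when the Poisson mean is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

tag_counts = [5, 3, 4]            # tagged fish among 100 recaptured in each survey
num_marked = num_recaptured = 100

def single_likelihood(pop_size, tags_observed):
    """Poisson likelihood of a population size from one survey."""
    return poisson_pmf(tags_observed, num_marked / pop_size * num_recaptured)

def joint_likelihood(pop_size):
    """Product of the three survey likelihoods, assuming independent surveys."""
    product = 1.0
    for tags_observed in tag_counts:
        product *= single_likelihood(pop_size, tags_observed)
    return product

candidate_sizes = range(500, 10001, 100)   # hypothetical grid of population sizes
n_best = max(candidate_sizes, key=joint_likelihood)

# Relative support for N = 5000: first survey alone versus all three surveys.
scaled_single = single_likelihood(5000, 5) / single_likelihood(2000, 5)
scaled_joint = joint_likelihood(5000) / joint_likelihood(n_best)
```

A population size far from the MLE keeps less relative support once all three surveys are combined, which is what a narrower likelihood curve means.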
The likelihood profile: The mark-recapture example only had one parameter, so the likelihood profile is just the likelihood. For problems with more than one parameter, fix the parameter of interest at discrete values and, at each value, find the maximum likelihood by searching over all other parameters. You can use the likelihood profile to calculate a confidence interval: find the two parameter values where the negative log-likelihood is 1.92 units higher than the MLE value.
[Figure: negative log-likelihood profile; labels: 1.92 units from the MLE; confidence interval for x]
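Likelihood ratio test and 1.92: The 95% confidence interval for a model parameter is the set of values whose NLL is within 1.92 units of the NLL at the MLE; values more than 1.92 units away fall outside the interval. The value 1.92 is half of 3.84, the 95th percentile of the chi-square distribution with one degree of freedom (holds asymptotically for large sample size, when well behaved, etc.).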
Mark-recapture example: the MLE and the 95% CI bounds, found where the NLL is 1.92 units higher than at the MLE (Examples.xlsx, sheet MR CI calcs).
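The 1.92 rule can be sketched in Python for the mark-recapture example. Because it is not stated which surveys the spreadsheet uses for this calculation, the sketch assumes only the first survey (5 tagged fish among 100 recaptured); the grid of population sizes and all names are hypothetical.

```python
import math

def poisson_nll(k, lam):
    """Negative log likelihood of a Poisson mean lam given an observed count k."""
    return lam - k * math.log(lam) + math.lgamma(k + 1)

def nll(pop_size, tags_observed=5, num_marked=100, num_recaptured=100):
    """Negative log likelihood of a population size from one survey."""
    predicted = num_marked / pop_size * num_recaptured
    return poisson_nll(tags_observed, predicted)

candidate_sizes = range(500, 20001, 10)    # hypothetical grid of population sizes
n_mle = min(candidate_sizes, key=nll)
threshold = nll(n_mle) + 1.92              # 1.92 units above the minimum NLL

# The approximate 95% CI is the range of N whose NLL lies at or below the threshold.
inside = [n for n in candidate_sizes if nll(n) <= threshold]
ci_low, ci_high = inside[0], inside[-1]
```

The interval is asymmetric around the MLE: the NLL rises slowly for large N, so the upper bound lies much further from the MLE than the lower bound.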
The concept of support: The relative likelihood is the amount of support the data offer for one hypothesis compared to another hypothesis. The absolute likelihood has no particular meaning without reference to other hypotheses.