Statistical Methods
2 Concepts and Notations
Sample unit – the basic landscape unit at which we wish to establish the presence/absence of the species.
Population – the complete set of sample units of interest.
Sample – the set of sample units on which data are collected.
3 Concepts and Notations
Parameter – a characteristic of the population we're interested in, e.g., $\theta$.
Estimator – an equation or process used to produce a parameter estimate from observed data, e.g., $\hat{\theta}$.
4 Concepts and Notations Survey – an opportunity to encounter (sign of) the species.
5 Concepts and Notations Summation shorthand: $\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$
6 Concepts and Notations Multiplication shorthand: $\prod_{i=1}^{n} x_i = x_1 \times x_2 \times \cdots \times x_n$
7 Concepts and Notations
Pr(A) – probability of event A
Pr(A|B) – probability of event A conditional upon event B
Events A and B are independent if $\Pr(A \cap B) = \Pr(A)\Pr(B)$, i.e., $\Pr(A \mid B) = \Pr(A)$.
8 Concepts and Notations Pr(X|θ) – probability of observing data X given population parameters θ
9 Concepts and Notations Methods of inference: maximum likelihood or Bayesian.
10 Example: a coin tossing experiment Suppose a coin is tossed 5 times with the result: X = HHTHT. Assuming each toss is independent, $\Pr(X \mid p) = p \cdot p \cdot (1-p) \cdot p \cdot (1-p) = p^3 (1-p)^2$, where p is the probability of a head.
11 Concepts and Notations Probability statements could be used to estimate parameters using either maximum likelihood or Bayesian methods of inference.
12 Likelihood For discrete data, a 'likelihood' is simply the probability of observing the data. In a probability statement, the data is conditional upon the parameters. This is reversed in the likelihood, e.g., $L(\theta \mid X) = \Pr(X \mid \theta)$. Find the parameter values that maximize the likelihood function (i.e., the MLE).
13 Likelihood The degree of curvature in the likelihood function about the MLE reflects the estimator variance. Technically, variances are obtained by inverting the negative of the matrix of second partial derivatives of the log-likelihood function, evaluated at the MLE.
14 Example: a coin tossing experiment Likelihood: $L(p \mid \text{HHTHT}) = p^3 (1-p)^2$. Find the value of p that maximizes the likelihood of observing 3 heads from 5 coin tosses (i.e., the MLE); here the maximum is at $\hat{p} = 3/5 = 0.6$. [Figure: the likelihood function plotted against p]
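A minimal numerical sketch of this maximization (my illustration, assuming NumPy/SciPy are available; the analytic answer $\hat{p} = 3/5$ serves as a check):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Negative log-likelihood for 3 heads in 5 tosses: L(p) = p^3 (1 - p)^2
def neg_log_lik(p):
    return -(3 * np.log(p) + 2 * np.log(1 - p))

# Maximize the likelihood by minimizing its negative log over (0, 1)
result = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(result.x)  # ~0.6, matching the analytic MLE
```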
15 Bayesian Application of Bayes' Theorem:
$\pi(\theta \mid X) = \dfrac{\Pr(X \mid \theta)\,\pi(\theta)}{\Pr(X)}$
where $\pi(\theta)$ is the prior distribution of $\theta$ and $\pi(\theta \mid X)$ is the posterior distribution of $\theta$. The denominator $\Pr(X)$ is generally unknown, but not required with Markov chain Monte Carlo approaches.
16 Bayesian Resulting inferences may be sensitive to choice of prior distribution, particularly when the data contains little information about the parameter(s).
17 Example: a coin tossing experiment Bayesian: use the data to update prior 'knowledge' of the distribution to obtain the posterior distribution of p. [Figure: prior and posterior distributions for p]
20 Example: a coin tossing experiment Bayesian: data with high information content can overwhelm even stronger priors, e.g., 30 heads from 50 coin tosses.
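A small sketch of this conjugate updating (my own construction; the deck does not state which priors it used). A Beta(a, b) prior with binomial data gives a Beta(a + heads, b + tails) posterior, so both the weak-prior and strong-prior cases can be computed directly:

```python
from scipy import stats

# Beta(a, b) prior on p; data: `heads` successes in `n` tosses.
# Conjugacy gives the posterior in closed form: Beta(a + heads, b + tails).
def posterior(a, b, heads, n):
    return stats.beta(a + heads, b + (n - heads))

weak = posterior(1, 1, 3, 5)        # uniform prior, 3 heads in 5 tosses
strong = posterior(20, 20, 30, 50)  # strong prior centred on 0.5, more data

print(weak.mean(), weak.interval(0.95))      # posterior mean and 95% interval
print(strong.mean(), strong.interval(0.95))  # the data pull the strong prior toward 0.6
```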
21 Max. Likelihood vs. Bayesian
Maximum likelihood
– Pros: widely used; computationally straightforward.
– Cons: numerical issues (has the 'true' maximum been found?); 'unnatural' interpretation.
Bayesian
– Pros: more natural interpretation; easier to implement complex models; can incorporate prior knowledge.
– Cons: how should prior knowledge be expressed?; computationally intensive; difficult to spot inestimable parameters.
22 Incorporating Covariates Often we may wish to allow parameters to be functions of covariates or other factors of interest (e.g., habitat type, rainfall in the last 24 hours, distance from a road). This generally requires a link function to transform linear relationships onto the 0–1 interval.
23 The Logit Link
$\text{logit}(\theta_i) = \ln\!\left(\dfrac{\theta_i}{1-\theta_i}\right) = \beta_0 + \beta_1 x_i$
which can be rearranged as
$\theta_i = \dfrac{\exp(\beta_0 + \beta_1 x_i)}{1 + \exp(\beta_0 + \beta_1 x_i)}$
24 The Logit Link Interpreting the effect of a covariate on the probability θ can be difficult due to the non-linear relationship.
25 The Logit Link Alternatively, we suggest that effects are interpreted in terms of the odds and odds ratios.
26 Example Suppose the odds of horse A losing are 2:1, and the odds of horse B losing are 10:1. Which horse is more likely to lose (i.e., which is the 'long shot')? The odds for horse B are 5 times greater than for horse A: OR = 5 (B/A).
27 The Logit Link
$\exp(\beta_j)$ is the odds ratio for a 1-unit increase in the covariate $x_j$. An approximate 95% CI:
$\exp\!\left(\hat{\beta}_j \pm 1.96\,\widehat{SE}(\hat{\beta}_j)\right)$
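A small numeric sketch of turning an estimated slope into an odds ratio and CI (the estimate and standard error are example values, not from the deck):

```python
import numpy as np

# Odds ratio and approximate 95% CI from a logistic slope estimate
beta_hat, se = 0.8, 0.25

odds_ratio = np.exp(beta_hat)  # OR for a 1-unit increase in x_j
ci = np.exp(beta_hat + np.array([-1.96, 1.96]) * se)
print(odds_ratio, ci)  # OR ~ 2.23, CI ~ (1.36, 3.63)
```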
[Two figure slides: The Logit Link]
30 The Logit Link Not the only possibility, but commonly used, and in this context analogous to logistic regression. Any function that constrains values to between 0 and 1 could be used. The logit is the only transformation function used in PRESENCE (for probabilities).
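For completeness, a minimal sketch of the transformation in both directions (using SciPy's logit/expit for convenience; nothing here is specific to PRESENCE):

```python
from scipy.special import logit, expit

theta = 0.3
x = logit(theta)  # ln(theta / (1 - theta)) = -0.847...
back = expit(x)   # exp(x) / (1 + exp(x)) = 0.3, the inverse transform
print(x, back)

# Any linear predictor passed through expit() lands in (0, 1)
print(expit(-5), expit(0), expit(5))  # ~0.0067, 0.5, ~0.9933
```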
31 Design Matrices Design matrices are used to relate the probabilities that feature in the model ('real' parameters) to beta parameters.
32 Design Matrices [Figure: design matrix linking the 'real' parameters to the beta parameters]
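A sketch of the idea (my illustration, not the slide's actual matrix): with two site types, the 'real' probabilities can be tied to beta parameters through a design matrix on the logit scale:

```python
import numpy as np

# Hypothetical design matrix: both site types share an intercept (b0);
# type 2 gets an additional effect (b1). Rows = 'real' parameters,
# columns = beta parameters, so logit(theta) = X @ beta.
X = np.array([[1.0, 0.0],   # logit(theta_type1) = b0
              [1.0, 1.0]])  # logit(theta_type2) = b0 + b1

beta = np.array([0.5, -1.0])               # example beta values
theta = 1.0 / (1.0 + np.exp(-(X @ beta)))  # inverse-logit back to probabilities
print(theta)  # 'real' parameters implied by the betas
```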
33 Delta Method A method for calculating the variance of transformed parameters. Involves some calculus and matrix multiplication. Based on large-sample approximations
34 Delta Method
$\widehat{\text{Var}}\!\left(h(\hat{\theta})\right) = \left[h'(\hat{\theta})\right] \hat{V} \left[h'(\hat{\theta})\right]^{T}$
where $h(\cdot)$ is the transformation of the parameters, $h'(\hat{\theta})$ is the vector of the derivatives of $h(\theta)$ with respect to $\theta$ (evaluated at $\hat{\theta}$), and $\hat{V}$ is the variance-covariance matrix of $\hat{\theta}$.
35–36 Delta Method – example [two worked-example slides]
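The slides' worked example is not in the transcript, so here is a stand-in illustration with example numbers: the delta-method variance for the back-transformation $\theta = h(\beta) = \exp(\beta)$, where $h'(\beta) = \exp(\beta)$:

```python
import numpy as np

# Delta method for a scalar transformation theta = h(beta) = exp(beta):
# Var(h(beta_hat)) ~ h'(beta_hat) * Var(beta_hat) * h'(beta_hat)
beta_hat = -0.5  # example estimate on the log scale
var_beta = 0.04  # example Var(beta_hat), i.e., SE = 0.2

deriv = np.exp(beta_hat)              # h'(beta) = exp(beta)
var_theta = deriv * var_beta * deriv  # scalar case of h' V h'^T
print(np.exp(beta_hat), np.sqrt(var_theta))  # theta_hat and its approximate SE
```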
37 Hypotheses, Theories and Models A hypothesis is simply a story about how the world (or a part of it) works. A theory is a hypothesis that has become widely accepted after surviving repeated attempts to falsify it.
38 Hypotheses, Theories and Models A model is an abstraction of a system, and may include hypotheses and theories about the system. Models are used to describe and predict system behavior. A model may be conceptual, verbal or mathematical.
39 Hypotheses, Theories and Models Mathematical models are often developed to project the consequences of hypotheses. Competing hypotheses can be represented as different models. Model-based prediction of system behavior under competing hypotheses represents a key step in the conduct of science and management.
40 Comparing models
Hypothesis testing – seek 'sufficient' evidence to falsify a null hypothesis; e.g., likelihood ratio tests.
Information-theoretic approaches – rank models in order of relative distance from 'truth'; model weights can be calculated; e.g., AIC, DIC, etc.
Bayesian approaches – Bayes factors; directly estimate model probabilities given the data.
41 Comparing models Important to distinguish between strength of inference and strength of evidence. Strength of inference comes from study design. Strength of evidence comes from the collected data. May require greater evidence of an ‘effect’ from studies with little inferential power.
42 Likelihood Ratio Tests
1. Fit 2 models to the data, with and without the effect of interest (θ).
2. Calculate the test statistic as $-2\ln\!\left(\dfrac{L(\hat{\theta}_0 \mid X)}{L(\hat{\theta} \mid X)}\right)$, the ratio of the maximized likelihoods without and with the effect.
3. Compare to a chi-square distribution with degrees of freedom equal to the number of parameters required to estimate θ.
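A hedged sketch of steps 2–3 (the two maximized log-likelihoods are example values):

```python
from scipy.stats import chi2

# Likelihood ratio test from two maximized log-likelihoods
# (logL_full is the model including theta; df = parameters added for theta)
logL_reduced, logL_full, df = -120.3, -117.1, 1

lrt = -2 * (logL_reduced - logL_full)  # test statistic
p_value = chi2.sf(lrt, df)             # upper-tail chi-square probability
print(lrt, p_value)
```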
43 Information Theoretic Methods
Relative measure of distance from 'truth' based on Kullback-Leibler information:
$AIC = -2\ln L(\hat{\theta} \mid X) + 2K$
K = number of parameters. Models with small values are preferred. Parsimony is a useful by-product.
44 Information Theoretic Methods
A small-sample adjustment can be used:
$AIC_c = AIC + \dfrac{2K(K+1)}{n - K - 1}$
There is debate over the 'effective' sample size n in occupancy models.
45 Information Theoretic Methods
AIC can be adjusted for overdispersion/poor model fit:
$QAIC = \dfrac{-2\ln L(\hat{\theta} \mid X)}{\hat{c}} + 2K$
where $\hat{c}$ is an estimate of the overdispersion factor.
46 Information Theoretic Methods
$\Delta AIC_j = AIC_j - \min(AIC)$ is what is important.
47 Information Theoretic Methods
Likelihood of model j, given the data:
$L(\text{model}_j \mid X) \propto \exp(-\Delta AIC_j / 2)$
48 Information Theoretic Methods
Model weights can be calculated that are ad hoc measures of support for each model in the candidate set:
$w_j = \dfrac{\exp(-\Delta AIC_j / 2)}{\sum_k \exp(-\Delta AIC_k / 2)}$
These can be used to obtain model-averaged estimates, or summed across models to evaluate the importance of specific factors.
49 Information Theoretic Methods
Evidence ratios: $ER_{j,k} = w_j / w_k$, a measure of the degree of support for model j compared to model k.
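A compact sketch tying slides 43–49 together (example log-likelihoods and parameter counts, not real results):

```python
import numpy as np

# AIC comparison for a candidate model set
logL = np.array([-117.1, -118.4, -120.3])  # maximized log-likelihoods
K = np.array([4, 3, 2])                    # parameters per model

aic = -2 * logL + 2 * K
delta = aic - aic.min()                            # Delta-AIC
w = np.exp(-delta / 2) / np.exp(-delta / 2).sum()  # Akaike weights
print(delta, w, w[0] / w[1])                       # evidence ratio, model 1 vs 2
```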
50 Bayes Factors
Compares the expected probability of observing the data under two models, based upon prior knowledge about the model parameters:
$BF_{12} = \dfrac{\Pr(X \mid M_1)}{\Pr(X \mid M_2)}$, where $\Pr(X \mid M) = \int \Pr(X \mid \theta, M)\,\pi(\theta \mid M)\,d\theta$
51 Bayes Factors A comparison of how consistent the observed data are with both the assumed model structure and the set of prior distributions.
52 Bayes Factors
Can be used to derive the posterior model probabilities:
$\Pr(M_j \mid X) = \dfrac{\Pr(X \mid M_j)\,\Pr(M_j)}{\sum_k \Pr(X \mid M_k)\,\Pr(M_k)}$
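A hedged illustration using the coin example (my construction, not the deck's): with conjugate Beta priors, the marginal likelihood $\Pr(X \mid M)$ is available in closed form (the beta-binomial), so a Bayes factor comparing two priors can be computed exactly:

```python
import numpy as np
from scipy.special import betaln, comb

# Log marginal likelihood of the coin data under a Beta(a, b) prior:
# Pr(X | M) = C(n, k) * B(a + k, b + n - k) / B(a, b), k = number of heads
def log_marginal(k, n, a, b):
    return np.log(comb(n, k)) + betaln(a + k, b + n - k) - betaln(a, b)

# Two 'models' = two priors for p: uniform vs. strongly centred on 0.5
m1 = log_marginal(3, 5, 1, 1)
m2 = log_marginal(3, 5, 20, 20)
print(np.exp(m1 - m2))  # Bayes factor BF(M1/M2)
```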
53 Bayes Factors Can be very sensitive to assumed prior distributions, particularly ‘non-informative’ ones. Difficult to extract required information from Markov chain Monte Carlo analysis. BF (and posterior model probabilities) are not based upon the posterior distributions of parameters for each model, only the priors.
54 Model Averaging
May want to make inferences from multiple models, e.g., weighting each model's estimate by its model weight:
$\hat{\bar{\theta}} = \sum_j w_j \hat{\theta}_j$
55 Model Averaging Care must be taken when model averaging parameters to ensure that the respective parameters in different models have the same interpretation. Estimated parameters can be sensitive to the other covariates included in a model (e.g., interactions or correlations). We suggest model averaging be performed on the response variable (i.e., the probabilities).
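A minimal sketch of model averaging a probability (illustrative per-model estimates and weights, not taken from the deck):

```python
import numpy as np

# Model-averaged estimate of a probability using model weights
theta_hat = np.array([0.62, 0.58, 0.49])  # estimate of theta under each model
w = np.array([0.55, 0.30, 0.15])          # model weights (sum to 1)

theta_avg = np.sum(w * theta_hat)  # weighted average across models
print(theta_avg)
```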