1
Statistical Distributions
2
Why do we care about probability?
Probability is the foundation of the theory of statistics. It describes the uncertainty associated with random variables: measurement error and process error. It is needed to understand estimating model parameters, model validation, and prediction.
3
Types of random variables
Discrete random variables take only integer values (e.g. counts, memberships in categories). They are represented by probability mass functions. Continuous random variables are represented by probability density functions.
4
Terminology
This notation is equivalent: [z], p(z), f(z). You will see all of it in this course. These are general stand-ins for distributions.
5
Probability Mass Functions
For a discrete random variable Z, the probability that Z takes on the value z is given by a discrete density function f(z), also known as the probability mass or distribution function. S = support: the domain of [z].
[Figure: example probability mass function; probability (0 to 0.2) of each event z = 1, ..., 20.]
6
Probability Density Functions: Continuous variables
A probability density function [z] gives the probability that a random variable Z takes on values within a range:
f(z) ≥ 0 (non-negative)
Pr(a ≤ z ≤ b) = ∫_a^b f(z) dz
∫_{−∞}^{∞} f(z) dz = 1
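A quick check of both properties in R, using the standard normal density dnorm() as a stand-in for f(z):

integrate(dnorm, -Inf, Inf)              # total probability: ~1
integrate(dnorm, lower = -1, upper = 1)  # Pr(-1 <= z <= 1): ~0.683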
7
Probability Mass Functions: First Moment
The expectation of a random variable z is the weighted average of the possible values that z can take, each value weighted by the probability that z assumes it. For discrete variables: E[z] = Σ_{z∈S} z p(z). Analogous to “center of gravity”. First moment.
8
Probability Mass Functions: Second Central Moment
The variance of a random variable reflects the spread of Z values around the expected value. For discrete variables: Var[z] = E[(z − E[z])²] = Σ_{z∈S} (z − E[z])² p(z). Second central moment of a distribution.
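A minimal sketch in R of both moments for a small discrete distribution, computed as weighted sums (the values and probabilities here are made up for illustration):

z  <- c(-1, 0, 1, 2)
p  <- c(0.10, 0.25, 0.30, 0.35)  # probabilities must sum to 1
Ez <- sum(z * p)                 # first moment: E[z]
Vz <- sum((z - Ez)^2 * p)        # second central moment: Var[z]
c(mean = Ez, variance = Vz)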
9
Continuous distributions
10
Cumulative distribution and quantile function
The cumulative distribution function F(u) = Pr(Z ≤ u) gives the probability that Z takes on a value less than or equal to u; the quantile function is its inverse.
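In R, for the standard normal, these are pnorm() (CDF) and qnorm() (quantile function), and they are inverses of one another:

pnorm(1.96)   # F(1.96)     ~ 0.975
qnorm(0.975)  # F^-1(0.975) ~ 1.96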
11
Probability Distributions & Stochasticity
To build stochastic models for ecological data, we need a toolbox of [z]’s. This toolbox contains probability mass functions and probability density functions that describe the way in which different types of data arise. The [z]’s link our deterministic model with the data in a way that reveals uncertainty.
12
A toolbox of [z]’s
Discrete: Bernoulli (binary outcome of one trial); binomial (outcome of multiple trials); Poisson (counts); negative binomial (overdispersed counts); multinomial (multiple categorical outcomes).
Continuous: normal; lognormal; exponential; gamma; beta; others (uniform, Dirichlet, Wishart).
13
Binomial distribution: Number of successes in n trials (Discrete events can take one of two values)
n = number of trials; p = probability of success. E[z] = np; Var[z] = np(1 − p). Example: probability of survival derived from population data.
[Figure: binomial probability mass function with p = 0.5; probability of each outcome z = 1, ..., 20.]
14
Binomial distribution: Number of successes in n trials (Discrete events can take one of two values)
What are the data? What parameters do we want to estimate? What terms control the uncertainty? E[z] = np; Var[z] = np(1 − p); n = number of trials; p = probability of success, with 0 ≤ p ≤ 1. The data are counts (z = y), and we estimate the probability of success, p.
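A sketch in R (the values of n and p are illustrative):

n <- 10
p <- 0.5
dbinom(5, size = n, prob = p)  # Pr(y = 5 successes in 10 trials)
n * p                          # E[y]   = np       = 5
n * p * (1 - p)                # Var[y] = np(1-p)  = 2.5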
15
Binomial distribution
16
Bernoulli=Binomial with n=1 trial
17
Multinomial distribution: Number of successes in n trials with > 2 possible outcomes for “success”
18
Poisson Distribution: Counts (or getting hit in the head by a horse)
y = number of units per sampling effort; λ = average number of units. Alternative parameterization: λ = rt (a rate r times sampling effort t).
[Figure: histogram of number of seedlings per quadrat, with fitted Poisson distribution.]
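A sketch in R (lambda is illustrative, e.g., a mean of 2 seedlings per quadrat):

lambda <- 2
dpois(0:5, lambda)  # Pr(y = 0), ..., Pr(y = 5)
# alternative parameterization: lambda <- r * t  (rate x sampling effort)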
19
Poisson distribution
20
Exercise What are the data? What parameters do we want to estimate?
What terms control the uncertainty? The data are the z’s (counts); we want to estimate λ.
21
Clustering in space or time (overdispersion)
Poisson process: E[z] = Var[z]. Negative binomial: E[z] < Var[z], i.e., overdispersed, clumped, or patchy.
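A simulation sketch in R of the two cases (parameter values are illustrative); the negative binomial draws have the same mean but a much larger variance:

set.seed(1)
y.pois <- rpois(1e4, lambda = 2)
y.nb   <- rnbinom(1e4, mu = 2, size = 0.5)  # clumped counts
c(mean(y.pois), var(y.pois))  # ~2, ~2  : E[z] = Var[z]
c(mean(y.nb),   var(y.nb))    # ~2, ~10 : Var = mu + mu^2/size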
22
Negative binomial: Table 4.2 & 4.3 in H&M Bycatch Data
E[Z] = 0.279, Var[Z] = 1.56. The variance far exceeds the mean, which suggests temporal or spatial aggregation in the data.
23
Negative Binomial: Counts
N = total number of trials; R = total number of successes; P = probability of success.
[Figure: histogram of number of seeds (0 to 50), with fitted negative binomial distribution.]
25
Negative binomial
26
Normal Distribution: E[x] = μ, Var[x] = σ².
[Figure: normal PDFs with mean 0 and variances 0.25, 0.5, 1, 2, 5, and 10.]
27
Normal Distribution How do the data arise?
Often, the process we model represents the sum of continuous variables (e.g., growth increment in trees as a function of soil C content). Sums of random variables tend to be normally distributed: the central limit theorem shows that the mean of random variables will be approximately normally distributed even when the variables themselves follow other distributions. With large sample sizes, many other distributions (Poisson, gamma, etc.) can be approximated by a normal.
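A short demonstration of the central limit theorem in R: means of skewed exponential samples are approximately normal.

set.seed(1)
xbar <- replicate(1e4, mean(rexp(30, rate = 1)))
hist(xbar, breaks = 50)  # roughly bell-shaped around 1, despite skewed data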
28
Normal Distribution What are the data?
What parameters do we want to estimate? What terms control the uncertainty?
29
A JAGS aside Note that in JAGS, the normal distribution is parameterized in terms of precision, τ (tau). Precision is the inverse of the variance, so when we use a small precision in a prior, it is equivalent to a large variance (i.e., the uncertainty for that parameter is large).
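For example, converting a large prior variance into the precision JAGS expects:

sigma2 <- 1000        # large variance = vague prior
tau    <- 1 / sigma2  # small precision; in JAGS: y ~ dnorm(mu, 0.001)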
30
Lognormal: One tail and no negative values
y is always positive. Arises as the product of processes.
[Figure: lognormal probability densities over y.]
31
Lognormal: Radial growth data
[Figure: histograms of radial growth (cm/yr) for red cedar and hemlock, with fitted lognormal distributions. Data from Date Creek, British Columbia.]
32
Exponential
Used for weather events: rainfall, probability of extreme events. The exponential is a special case of the gamma.
[Figure: histogram with fitted exponential density.]
33
Exponential: Growth data (negatives assumed 0)
[Figure: histogram of growth (mm/yr) for Beilschmiedia pendula, with fitted exponential distribution. Data from BCI, Panama.]
34
Gamma: One tail and flexibility
mean.gamma <- mean(sapling$Growth)
var.gamma  <- var(sapling$Growth)
# Match the sample moments to the gamma parameters
# (mean = a/b, variance = a/b^2), e.g., to get the shape
# parameters for a gamma prior:
a <- mean.gamma^2 / var.gamma  # shape parameter
b <- mean.gamma / var.gamma    # rate parameter
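A quick visual check of the match (assuming the sapling data frame from the code above is loaded):

hist(sapling$Growth, freq = FALSE, breaks = 30)
curve(dgamma(x, shape = a, rate = b), add = TRUE)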
35
Gamma: “raw” growth data
[Figure: histograms of raw growth (mm/yr) for Alseis blackiana and Cordia bicolor, with fitted gamma distributions. Data from BCI, Panama.]
36
Beta distribution: proportions
Survival, proportions.
37
Beta: Light interception by crown trees
(Data from Luquillo, PR)
38
Others. Dirichlet: multivariate version of the beta. Multivariate normal.
39
Bolker 2007
40
Bolker 2007
41
The Method of Moments You can calculate the sample values of the moments of a distribution and match them to the theoretical moments. Recall, for example, that for the gamma distribution the mean is a/b and the variance is a/b². The MOM is a good way to get a first (but biased) estimate of the parameters of a distribution.
42
Why is MOM important? We often need to account for uncertainty in our models, but we only have data on the sample mean and variance. How can we make predictions when the mean and variance are not themselves the parameters of the distribution (unlike the normal distribution, where they are)? The MOM gives us initial estimates of distribution parameters by matching their shape parameters to the sample mean and variance.
43
Haul & capture data (ED)
What distribution? What are the parameters? What are the moments of this distribution?
[Figure: histogram of catch per haul.]
44
MOM: Negative binomial
We can get these from the data…
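For the negative binomial with mean mu and clumping parameter size, the variance is mu + mu^2/size, so size = mean^2 / (variance − mean). A sketch in R using the bycatch moments reported above:

m <- 0.279
v <- 1.56
k <- m^2 / (v - m)              # size (clumping) parameter: ~0.061
dnbinom(0:3, size = k, mu = m)  # moment-matched pmf for counts 0..3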
45
MOM Gamma Model biomass production (μ) as a function of rainfall (x).
Production cannot be negative, so you need a distribution for data that are continuous and strictly positive. Moreover, a plot of the data shows that the spread of the residuals increases with increasing production, casting doubt on the assumption that the variance is constant. How do you represent uncertainty in production? Why is the normal not appropriate?
46
MOM: Gamma Match the moments (mean and variance) to the parameters of the gamma distribution: shape (α) and rate (β).
47
Mixture models What do you do when your data don’t fit any known distribution? Add covariates, or use mixture models (discrete or continuous).
48
Zero-inflated models Zero-inflated models are a common type of finite mixture model. They combine a standard discrete probability distribution (e.g., binomial, Poisson, or negative binomial), which typically includes some probability of sampling zero counts even when individuals are present, with an additional process that can also lead to a zero count (e.g., complete absence of the species, or trap failure). Extremely useful in ecology.
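A minimal sketch in R of a zero-inflated Poisson pmf (the function name dzipois and the parameter pz, the probability of a structural zero, are ours):

dzipois <- function(y, lambda, pz) {
  ifelse(y == 0,
         pz + (1 - pz) * dpois(0, lambda),  # structural or sampling zero
         (1 - pz) * dpois(y, lambda))
}
dzipois(0:4, lambda = 2, pz = 0.3)  # note the inflated mass at zero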
49
An example: Seed predation
y = number of seeds taken out of N available between times t1 and t2. Assume each seed has an equal probability (p) of being taken; then y follows a binomial distribution, whose binomial coefficient is the normalization constant.
50
Zero-inflated binomial
51
Discrete mixtures
52
Discrete mixture: Zero-inflated binomial
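A sketch of the corresponding pmf in R for the seed-predation setup (the function name and parameter values are illustrative): with probability pz no seeds are ever taken; otherwise y ~ Binomial(N, p).

dzibinom <- function(y, N, p, pz) {
  ifelse(y == 0,
         pz + (1 - pz) * dbinom(0, N, p),
         (1 - pz) * dbinom(y, N, p))
}
dzibinom(0:5, N = 5, p = 0.4, pz = 0.5)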
53
Continuous mixture Seed retention times Turdus albicollis
Regurgitation vs. defecation: a combination of a lognormal and a truncated normal distribution for retention time (min). Uriarte et al. 2011.
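A sketch in R of such a two-component density (all parameter values are illustrative, not those of Uriarte et al. 2011): a lognormal for one process plus a normal truncated at zero for the other, mixed with weight w.

dmix <- function(t, w, meanlog, sdlog, mu, sigma) {
  dtnorm0 <- dnorm(t, mu, sigma) / (1 - pnorm(0, mu, sigma))  # truncated at 0
  w * dlnorm(t, meanlog, sdlog) + (1 - w) * dtnorm0
}
curve(dmix(x, w = 0.4, meanlog = 2, sdlog = 0.5, mu = 40, sigma = 10),
      from = 0, to = 80, xlab = "Retention time (min)", ylab = "Density")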
54
Many other distributions…..
Example: the von Mises distribution for circular data (rvm(100, 30, 2), visualized with rose.diag). Zimmerman et al. 2007.
55
Some intuition for likelihood and stochasticity
What is the probability of obtaining the observations (z) conditional on the value of θ?
[Figure: P(z|θ) plotted against observations of z.]
56
Some intuition for likelihood and stochasticity
“Unlikely” values for θ.
[Figure: P(z|θ) plotted against observations of z, with improbable values of θ highlighted.]
57
Some intuition for likelihood and stochasticity
The most “likely” values for θ.
[Figure: P(z|θ) plotted against observations of z, with the most probable values of θ highlighted.]
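The same intuition in R: evaluate P(z | θ) for a few observations over a grid of candidate θ values (here θ is a normal mean with known sd; all values are illustrative). The most “likely” θ is the one that makes the observed z most probable.

z      <- c(-2, 0, 1, 3)          # observations
theta  <- seq(-10, 10, by = 0.1)  # candidate parameter values
loglik <- sapply(theta,
                 function(m) sum(dnorm(z, mean = m, sd = 3, log = TRUE)))
theta[which.max(loglik)]          # the sample mean, 0.5
plot(theta, exp(loglik), type = "l")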