Chapter 4 – Distance methods


Distance sampling (also called plotless sampling) is widely used in forestry and ecology to study the spatial patterns of plants. Numerous mathematical models based on distance sampling have been developed since the 1950s. These models depend partly or wholly on distances from randomly selected points to the nearest plant, or from a randomly selected plant to its nearest neighbor. Most of the models assume that (1) the population of interest is randomly distributed (Poisson distribution) within an infinitely large area and (2) an observed distribution is a realization (or part) of the theoretical population. Distance methods make use of precise information on the locations of events and have the advantage of not depending on arbitrary choices of quadrat size or shape.

Two types of distance measures: from tree to tree and from point to tree. In general, a buffer zone is needed to eliminate edge effects.
[Figure: map of tree locations (x and y axes) with nearest-neighbor distances r_i marked.]
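The two distance measures can be sketched in a few lines of Python. This is a minimal brute-force illustration, not the course's own code: the function names and the toy coordinates are hypothetical, and no buffer zone is applied, so edge effects are ignored here.

```python
import math

def nearest_distance(origin, events, skip_index=None):
    """Distance from origin to the closest event, optionally skipping one event."""
    best = math.inf
    for j, (x, y) in enumerate(events):
        if j == skip_index:
            continue  # do not match a tree with itself
        best = min(best, math.hypot(x - origin[0], y - origin[1]))
    return best

def tree_to_tree(events, sample_indices):
    """Nearest-neighbor distance for each sampled tree (event-to-event)."""
    return [nearest_distance(events[i], events, skip_index=i) for i in sample_indices]

def point_to_tree(points, events):
    """Distance from each sampling point to its nearest tree (point-to-event)."""
    return [nearest_distance(p, events) for p in points]

# Toy data (hypothetical coordinates)
trees = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 10.0)]
tt = tree_to_tree(trees, range(len(trees)))
pt = point_to_tree([(0.0, 1.0), (9.0, 10.0)], trees)
```

For real plots a spatial index would be preferable to the quadratic loop, and sampled trees or points would be restricted to the interior of the buffer zone.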

Nearest neighbor distance index
This index is the simplest one, based on the distance from a tree to its nearest neighbor. It was first developed by Clark and Evans (1954). It is defined as
$R = \bar{r}_A / \bar{r}_E$
where R = the nearest neighbor index, $\bar{r}_A$ = average distance from randomly selected plants to their nearest neighbors, and $\bar{r}_E$ = expected mean distance between nearest neighbors. Under the Poisson distribution with intensity $\lambda$, we have
$\bar{r}_E = \frac{1}{2\sqrt{\lambda}}$
* Clark, P.J. and Evans, F.C. 1954. Distance to nearest neighbor as a measure of spatial relationships in populations. Ecology 35:445-453.

Testing the nearest neighbor distance index
The ratio R measures the degree to which the observed distance departs from random expectation. In a regular distribution, R would be significantly greater than 1, whereas in an aggregated distribution R would be significantly less than 1. To test the null hypothesis (H0) that the observed distance is from a randomly distributed population, we have
$z = \frac{\bar{r}_A - \bar{r}_E}{s_r}$
where $\bar{r}_E = \frac{1}{2\sqrt{\lambda}}$ and $s_r = \frac{0.26136}{\sqrt{n\lambda}}$, with n the number of sampled distances.
Two-tail test: p-value = $p(|z| \ge |z_{obs}|)$. A large $|z_{obs}|$ value has a small p-value, evidence against H0, suggesting an aggregated or regular pattern.
One-tail test: p-value = $p(z \ge z_{obs})$ for testing regularity, or p-value = $p(z \le z_{obs})$ for testing an aggregated pattern.

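Derivation of the nearest neighbor distance
We now show how the distribution of the nearest neighbor distance is derived. Assume a population of organisms randomly distributed with intensity $\lambda$. The probability of x individuals falling in any area of unit size is
$p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
Then the number of individuals in a circle of radius r follows a Poisson distribution with mean $\lambda\pi r^2$:
$p(x) = \frac{(\lambda\pi r^2)^x e^{-\lambda\pi r^2}}{x!}$
Similarly, the number of individuals in the annulus between the concentric circles of radii r and $r_1$ follows a Poisson distribution with mean $\lambda\pi(r_1^2 - r^2)$:
$p(x) = \frac{[\lambda\pi(r_1^2 - r^2)]^x e^{-\lambda\pi(r_1^2 - r^2)}}{x!}$
[Figure: concentric circles of radii r and r_1 centered on a point.]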

The probability for the nearest neighbor distance r can be derived as follows.
p(r) = p(circle r is empty, but individuals occur in the annulus) = p(circle r is empty) × p(individuals occur in the annulus)
The product form holds because the circle and the annulus do not overlap, so their counts are independent under the Poisson model. It is straightforward to compute the two probabilities.
The first probability is: $e^{-\lambda\pi r^2}$
The second probability is: $1 - e^{-\lambda\pi(r_1^2 - r^2)}$
Therefore, $p(r) = e^{-\lambda\pi r^2}\left[1 - e^{-\lambda\pi(r_1^2 - r^2)}\right]$

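The probability for the nearest neighbor distance r is obtained by letting $r_1 \to r$. Writing $dr = r_1 - r$, we have $1 - e^{-\lambda\pi(r_1^2 - r^2)} \approx 2\lambda\pi r\,dr$, so
$p(r) = 2\lambda\pi r\, e^{-\lambda\pi r^2}\,dr$
Thus, the pdf for the nearest neighbor distance r is a Weibull distribution:
$f(r) = 2\lambda\pi r\, e^{-\lambda\pi r^2}, \quad r \ge 0$
Mean: $E(r) = \frac{1}{2\sqrt{\lambda}}$
Variance: $Var(r) = \frac{4 - \pi}{4\pi\lambda}$
Computing these moments requires the gamma function: $\Gamma(a) = \int_0^\infty t^{a-1} e^{-t}\,dt$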
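The mean and variance can be worked out explicitly. The following is a sketch under the assumption that the pdf is $f(r) = 2\lambda\pi r\,e^{-\lambda\pi r^2}$; the substitution variable t is introduced here for illustration only.

```latex
% Substitute t = \lambda\pi r^2, so that 2\lambda\pi r\,dr = dt and r = \sqrt{t/(\lambda\pi)}.
\begin{align*}
E(r)   &= \int_0^\infty r \cdot 2\lambda\pi r\, e^{-\lambda\pi r^2}\,dr
        = \frac{1}{\sqrt{\lambda\pi}} \int_0^\infty t^{1/2} e^{-t}\,dt
        = \frac{\Gamma(3/2)}{\sqrt{\lambda\pi}}
        = \frac{\sqrt{\pi}/2}{\sqrt{\lambda\pi}}
        = \frac{1}{2\sqrt{\lambda}} \\
E(r^2) &= \int_0^\infty r^2 \cdot 2\lambda\pi r\, e^{-\lambda\pi r^2}\,dr
        = \frac{1}{\lambda\pi} \int_0^\infty t\, e^{-t}\,dt
        = \frac{\Gamma(2)}{\lambda\pi}
        = \frac{1}{\lambda\pi} \\
Var(r) &= E(r^2) - [E(r)]^2
        = \frac{1}{\lambda\pi} - \frac{1}{4\lambda}
        = \frac{4 - \pi}{4\pi\lambda}
\end{align*}
```

For the mean of n independent distances, the standard error is then $\sqrt{(4-\pi)/(4\pi n\lambda)} \approx 0.26136/\sqrt{n\lambda}$, which is the constant used in the Clark and Evans test.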

An example for the nearest neighbor distance
We test the spatial pattern for the western hemlock in the Victoria Watershed plot. There are 982 hemlock stems in the 10387 m² plot. The procedure is as follows.
1. Randomly choose 200 stems.
2. Measure the distance from each of these 200 stems to its nearest neighbor.
3. Average these 200 distances (= 1.0458).
4. Compute the density $\lambda$ = 0.1096.
5. Calculate the expected mean distance (= 1.5104).
6. Calculate the nearest neighbor index R = 0.6924.
7. Calculate the standard error $s_r$ = 0.05582.
8. Calculate the z-value = (1.0458 - 1.5104)/0.05582 = -8.3232.
9. p-value = $p(z \le z_{obs})$ = $p(z \le -8.3232)$ ≈ 0.
10. Conclusion: reject the null hypothesis of random distribution; there is strong evidence for an aggregated spatial pattern.
R: distance.main(hl.xy,200,"event.event")
[Figure: Clark & Evans Nearest Neighbor Index.]
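The arithmetic of this example can be checked with a short Python sketch. It assumes the Clark and Evans expectation $1/(2\sqrt{\lambda})$ and standard error $0.26136/\sqrt{n\lambda}$; the function name `clark_evans` is hypothetical and is not the course's R function `distance.main`.

```python
import math

def clark_evans(mean_obs, n, density):
    """Return (R, z, lower-tail p-value) from summary statistics."""
    expected = 1.0 / (2.0 * math.sqrt(density))    # mean distance under randomness
    se = 0.26136 / math.sqrt(n * density)          # standard error of the mean distance
    R = mean_obs / expected                        # nearest neighbor index
    z = (mean_obs - expected) / se
    p_lower = 0.5 * math.erfc(-z / math.sqrt(2.0)) # P(Z <= z), test for aggregation
    return R, z, p_lower

# Summary values quoted in the hemlock example
R, z, p = clark_evans(mean_obs=1.0458, n=200, density=0.1096)
print(f"R = {R:.4f}, z = {z:.4f}, p = {p:.3g}")
```

With these inputs the sketch gives R close to 0.692 and z close to -8.32; small differences from the slide's figures come from rounding of the intermediate values.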

The nth nearest neighbor distance
Thompson (1956) proved that the mean distance to the nth nearest neighbor is
$E(r_n) = \frac{1}{\sqrt{\lambda}} \cdot \frac{(2n)!\,n}{(2^n n!)^2}$
For the Victoria hemlock (HL) data:
[Figure: observed hemlock distances compared with the CSR expectation.]
Thompson, H.R. 1956. Distribution of distance to nth neighbour in a population of randomly distributed individuals. Ecology 37:391-394.
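As a quick numerical illustration, the expected distance to the nth neighbor under complete spatial randomness can be tabulated. This sketch assumes Thompson's result takes the form $E(r_n) = \frac{1}{\sqrt{\lambda}}\frac{(2n)!\,n}{(2^n n!)^2}$; the function name is hypothetical.

```python
import math

def mean_nth_nn_distance(n, density):
    """Expected distance to the nth nearest neighbor under a Poisson pattern."""
    coef = math.factorial(2 * n) * n / (2 ** n * math.factorial(n)) ** 2
    return coef / math.sqrt(density)

# Coefficients for unit density: n = 1 reproduces the Clark and Evans value 1/2
for n in range(1, 6):
    print(n, mean_nth_nn_distance(n, 1.0))
```

Comparing observed mean distances for n = 1, 2, 3, ... against these expectations shows at which neighborhood scale a pattern departs from randomness.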

Hubbell, S.P. et al. 2008. How many tree species are there in the Amazon and how many of them will go extinct? PNAS 105:11498-11504

Index of point to plant distances
This index, first proposed by Pielou (1959), is based on the distances from randomly chosen points to their respective nearest events (trees). The index is defined as
$\alpha = \pi\lambda\bar{\omega}$
where $\alpha$ = Pielou's index of non-randomness, $\lambda$ = average density of events per unit area, and $\bar{\omega}$ = mean squared distance from randomly chosen points to their nearest neighbors.
For a randomly distributed population, it is $E(\omega) = \frac{1}{\pi\lambda}$, so $\alpha$ is expected to equal 1.
For observed distances, it is calculated as $\bar{\omega} = \frac{1}{n}\sum_{i=1}^{n} r_i^2$ ($r_i$ is the distance from the ith point to its nearest neighbor).
* Pielou, E.C. 1959. The use of point-to-plant distances in the study of the pattern of plant populations. Journal of Ecology 47:607-613.

Test statistics for Pielou's index
It can be shown that $2n\alpha \sim \chi^2_{2n}$. (Sketch of the derivation: from the Weibull distribution of the nearest neighbor distance r, it is easy to show that $\omega = r^2$ has an exponential distribution, $f(\omega) = \lambda\pi\, e^{-\lambda\pi\omega}$, where $\lambda\pi$ is the density per unit circle. The sum of the $\omega$'s then follows a gamma distribution, of which $\chi^2$ is a special case.)
Test for the hypothesis of random pattern:
p-value = $p(\chi^2_{2n} > 2n\alpha)$ for testing an aggregated pattern. A large $2n\alpha$ value has a small p-value, evidence against H0, suggesting an aggregated pattern.
p-value = $p(\chi^2_{2n} < 2n\alpha)$ for testing regularity. A small $2n\alpha$ value leads to a small p-value, evidence suggesting a regular pattern.
(Unbiased estimator)
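The test can be sketched in Python without external libraries. This assumes the index is $\alpha = \pi\lambda\bar{\omega}$ with $\bar{\omega}$ the mean squared point-to-plant distance, and that $2n\alpha$ follows a chi-square distribution with 2n degrees of freedom under randomness. Because the degrees of freedom are even, the upper tail has a closed form as a finite Poisson sum. All function names are hypothetical.

```python
import math

def pielou_alpha(distances, density):
    """Pielou's index: pi * density * mean squared point-to-plant distance."""
    n = len(distances)
    omega_bar = sum(r * r for r in distances) / n
    return math.pi * density * omega_bar

def chi2_sf_even_df(x, n):
    """P(chi-square with 2n df > x), using the finite sum valid for even df."""
    half = x / 2.0
    term = 1.0      # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return math.exp(-half) * total

def pielou_test(distances, density):
    """Return (alpha, 2n*alpha, p for aggregation, p for regularity)."""
    n = len(distances)
    alpha = pielou_alpha(distances, density)
    stat = 2 * n * alpha
    p_aggregated = chi2_sf_even_df(stat, n)   # upper tail
    p_regular = 1.0 - p_aggregated            # lower tail
    return alpha, stat, p_aggregated, p_regular
```

A call such as `pielou_test(distances, density)` takes the measured point-to-plant distances and a density estimate obtained separately, for example from a full census of the plot.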

Hopkins and Skellam's coefficient of aggregation
This test is based on the assumption that a population is randomly distributed if the distribution of distances from a random point to its nearest neighbor is identical to the distribution of distances from a random plant to its nearest neighbor. The index is defined as the ratio of the sum of the squared distances from point to plant ($\omega_1$) to the sum of the squared distances from plant to plant ($\omega_2$):
$A = \frac{\sum \omega_1}{\sum \omega_2}$
A = 1 for a randomly distributed population
A > 1 for an aggregated population
A < 1 for a regular population
To test whether A departs significantly from its expectation of 1, the sampling distribution of the following statistic is derived:
$x = \frac{A}{1 + A} = \frac{\sum \omega_1}{\sum \omega_1 + \sum \omega_2}$
Hopkins, B. (with an appendix by Skellam, J.G.) 1954. A new method for determining the type of distribution of plant individuals. Ann. Bot., London, N.S. 18:213.

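x ~ Beta distribution
It is not difficult to show that x follows a beta distribution. That is, $x \sim \mathrm{Beta}(u, v)$, where $u = v = n$ is the number of distances of each type.
Standard beta distribution:
$f(x) = \frac{\Gamma(u+v)}{\Gamma(u)\Gamma(v)}\, x^{u-1}(1-x)^{v-1}, \quad 0 \le x \le 1$
The mean and variance of the beta distribution are:
$E(x) = \frac{u}{u+v} = \frac{1}{2}, \quad Var(x) = \frac{uv}{(u+v)^2(u+v+1)} = \frac{1}{4(2n+1)}$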

Test for x
x = 0.5 for a random distribution
x > 0.5 for an aggregated distribution
x < 0.5 for a regular distribution
For a large sample size n, x tends towards normality. We have
$z = \frac{x - 0.5}{\sqrt{Var(x)}} = 2(x - 0.5)\sqrt{2n + 1}$
Therefore, a statistical decision can be made based on the size of the p-value:
p-value = $p(z > z_{obs})$ for testing an aggregated pattern. A large $z_{obs}$ value has a small p-value, suggesting an aggregated pattern.
p-value = $p(z < z_{obs})$ for testing regularity. A small $z_{obs}$ value leads to a small p-value, evidence for a regular pattern.
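The large-sample version of this test is easy to sketch. The code below assumes equal numbers n of point-to-plant and plant-to-plant distances, the statistic $x = \sum\omega_1 / (\sum\omega_1 + \sum\omega_2)$ with $\omega$ denoting squared distances, and the normal approximation $z = 2(x - 0.5)\sqrt{2n+1}$. The function name is hypothetical.

```python
import math

def hopkins_skellam(point_to_plant, plant_to_plant):
    """Return (A, x, z, p for aggregation, p for regularity)."""
    if len(point_to_plant) != len(plant_to_plant):
        raise ValueError("both samples must contain the same number of distances")
    n = len(point_to_plant)
    s1 = sum(r * r for r in point_to_plant)   # sum of squared point-to-plant distances
    s2 = sum(r * r for r in plant_to_plant)   # sum of squared plant-to-plant distances
    A = s1 / s2
    x = s1 / (s1 + s2)                        # equals A / (1 + A)
    z = 2.0 * (x - 0.5) * math.sqrt(2 * n + 1)
    p_aggregated = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z)
    p_regular = 1.0 - p_aggregated                      # P(Z < z)
    return A, x, z, p_aggregated, p_regular
```

For small n the exact Beta(n, n) distribution would be a better reference than the normal approximation used in this sketch.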

Spatial relationships between two species
[Figure: point maps of unsegregated and segregated species, shown for a random pattern and an aggregated pattern.]

Index of species segregation
Segregation is the degree to which the individuals of two (or more) species tend to separate from one another. We have learned that quadrat counts can be used to test the association of two species, but the results are strongly influenced by quadrat size. An alternative approach that overcomes this problem is based on distance sampling. Assume there are two species. We randomly select an individual plant, locate its nearest neighbor, and record the species of both. This process is repeated N times. The data can be summarized in a contingency table similar to the one for the quadrat counts.

                     Nearest neighbor
Base species    Species A    Species B    Total
Species A       a            b            m = a+b
Species B       c            d            n = c+d
Total           r = a+c      s = b+d      N

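Index of segregation (Kappa statistic)
Pielou (1961): $S = 1 - \frac{\text{observed number of mixed pairs}}{\text{expected number of mixed pairs}} = 1 - \frac{N(b + c)}{ms + nr}$
Cohen (1960): $\kappa = \frac{p_o - p_e}{1 - p_e}$
where $p_o = \frac{x_{11} + x_{22}}{N}$ and $p_e = \frac{x_{1+}x_{+1} + x_{2+}x_{+2}}{N^2}$
Note: the two indices are algebraically identical. With a large sample size, $\kappa$ divided by its standard error is approximately N(0,1).

                     Nearest neighbor
Base species    Species A        Species B        Total
Species A       a (x11)          b (x12)          m = a+b (x1+)
Species B       c (x21)          d (x22)          n = c+d (x2+)
Total           r = a+c (x+1)    s = b+d (x+2)    N

* Pielou, E.C. 1961. Segregation and symmetry in two-species population as studied by nearest-neighbor relationships. Journal of Ecology 49:255-269.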
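Both indices can be computed from the four cell counts of the nearest-neighbor table. The sketch below assumes Pielou's coefficient in the form $S = 1 - N(b+c)/(ms+nr)$ and Cohen's kappa as $(p_o - p_e)/(1 - p_e)$; the function name and the example counts are hypothetical.

```python
def segregation_indices(a, b, c, d):
    """Return (Pielou's S, Cohen's kappa) for a 2x2 nearest-neighbor table.

    a, d: base plant and nearest neighbor are the same species
    b, c: base plant and nearest neighbor are different species (mixed pairs)
    """
    m, n = a + b, c + d        # row totals (base species)
    r, s = a + c, b + d        # column totals (nearest-neighbor species)
    N = m + n
    S = 1.0 - N * (b + c) / (m * s + n * r)
    p_o = (a + d) / N                  # observed proportion of same-species pairs
    p_e = (m * r + n * s) / N ** 2     # proportion expected under independence
    kappa = (p_o - p_e) / (1.0 - p_e)
    return S, kappa

# Hypothetical counts: 60 same-species pairs, 20 mixed pairs
S, kappa = segregation_indices(30, 10, 10, 30)
```

Values near 0 indicate unsegregated species, positive values indicate segregation, and the two formulas return the same number for any table with nonzero margins.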

What have we learned?
The concept of nearest neighbor distances
Tree-to-tree (event-to-event) distances (Clark & Evans 1954)
Point-to-tree distances (Pielou 1959)
Hopkins and Skellam's index of aggregation (Hopkins 1954)
Index of species segregation (kappa statistic)