Tables, Figures, and Equations


Tables, Figures, and Equations From: McCune, B. & J. B. Grace. 2002. Analysis of Ecological Communities. MjM Software Design, Gleneden Beach, Oregon http://www.pcord.com

Figure 21.1. Decision tree for using CCA for community data. Assume that we have a site × species matrix and a site × environment matrix and that chi-square distances are acceptable. RDA is a constrained ordination method based on a linear model (see “Variations” below).

Table 21.1. Questions about the community (A) and environmental or experimental design (E) matrices that are appropriate for using CCA.

The basic method

The species data matrix Y contains nonnegative abundances, yij, for i = 1 to n sample units and j = 1 to p species. y+j indicates species totals and yi+ indicates sample unit (site) totals. The environmental matrix Z contains values for n sites by q environmental variables.

1. Start with arbitrary but unequal site scores, x.

2. Calculate species scores, u, by weighted averaging of the site scores. Here uj is the score for species j, xi is the score (weight) for site i, and α is a user-selected scaling constant as described later.

3. Calculate new site scores, x*, by weighted averaging of the species scores. Here xi* is the score for site i, uj is the score (weight) for species j, and α is a user-selected scaling constant as described later.

4. Obtain regression coefficients, b, by weighted least-squares multiple regression of the site scores (the WA scores) on the environmental variables (the environmental matrix). The weights are the site totals stored in the diagonal of the otherwise empty, n × n square matrix R.

5. Calculate new site scores that are the fitted values from the preceding regression. These are the "LC scores" of Palmer (1993), which are linear combinations of the environmental variables.

6. Adjust the site scores by making them uncorrelated with previous axes by weighted least squares multiple regression of the current site scores on the site scores of the preceding axes (if any). The adjusted scores are the residuals from this regression.

7. Center and standardize the site scores to a mean = 0 and variance = 1.

8. Check for convergence on a stable solution by summing the squared differences in site scores from those in the previous iteration. If the convergence criterion (detailed below) has not been reached, return to step 2.

9. Save site scores and species scores, then construct additional axes as desired by going to step 1.
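The iteration in steps 1 to 8 can be sketched in NumPy for the first axis. This is a minimal illustration under stated assumptions, not the program's actual implementation: the function names `weighted_standardize` and `cca_first_axis` and the example data are hypothetical, an intercept column is added to the regression, and the eigenvalue and α scaling factors of steps 2 and 3 are left out because a constant multiplier is absorbed by the standardization in step 7. Step 6 is skipped because there are no preceding axes.

```python
import numpy as np


def weighted_standardize(x, w):
    """Rescale x to weighted mean 0 and weighted variance 1 (w must sum to 1)."""
    mean = np.sum(w * x)
    sd = np.sqrt(np.sum(w * (x - mean) ** 2))
    return (x - mean) / sd


def cca_first_axis(Y, Z, tol=1e-12, max_iter=1000):
    """Iterate steps 1-8 for the first axis.

    Returns LC site scores, WA site scores, species scores,
    regression coefficients, and the number of iterations used.
    """
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    n = Y.shape[0]
    site_tot = Y.sum(axis=1)               # y_i+
    species_tot = Y.sum(axis=0)            # y_+j
    w = site_tot / Y.sum()                 # site weights, sum to 1
    R = np.diag(site_tot)                  # site totals on the diagonal of R
    Z1 = np.column_stack([np.ones(n), Z])  # assumption: intercept column added

    x = weighted_standardize(np.arange(n, dtype=float), w)   # step 1
    for iteration in range(1, max_iter + 1):
        u = (Y.T @ x) / species_tot                          # step 2: species scores
        x_wa = (Y @ u) / site_tot                            # step 3: WA site scores
        b = np.linalg.solve(Z1.T @ R @ Z1, Z1.T @ R @ x_wa)  # step 4: weighted regression
        x_lc = Z1 @ b                                        # step 5: fitted values (LC scores)
        x_new = weighted_standardize(x_lc, w)                # step 7: center and standardize
        residual = np.sum((x_new - x) ** 2)                  # step 8: convergence check
        x = x_new
        if residual < tol:
            break
    return x, x_wa, u, b, iteration


# Hypothetical example: 12 sites, 5 species with unimodal responses to gradient g.
g = np.linspace(0.0, 1.0, 12)
Z = np.column_stack([g, np.cos(7.0 * g)])
optima = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
Y = 10.0 * np.exp(-((g[:, None] - optima[None, :]) ** 2) / (2 * 0.2 ** 2)) + 0.1

x_lc, x_wa, u, b, n_iter = cca_first_axis(Y, Z)
print(n_iter, np.round(x_lc, 3))
```

The returned LC scores have weighted mean 0 and weighted variance 1 by construction; the coefficients `b` refer to the fitted values before that final standardization.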

Axis scaling

Centered with Unit Variance. The site scores are rescaled such that the mean is zero and the variance is one. This takes three steps, where xi* is the new site score and wi* is the weight for site i (wi* = yi+ / y++).
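One common way to write the centering and standardizing with site weights is sketched here as an assumption, not as the slide's own equations. The symbols z (weighted mean) and s (weighted standard deviation) are names introduced for this sketch; x_i is the current site score and the weights w_i* sum to one.

```latex
z = \sum_{i=1}^{n} w_i^{*} x_i, \qquad
s^{2} = \sum_{i=1}^{n} w_i^{*} \left( x_i - z \right)^{2}, \qquad
x_i^{*} = \frac{x_i - z}{s}
```

After the third step the weighted mean of the x_i* is zero and their weighted variance is one.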

Hill's scaling standardizes the scores. In CCA, Hill's scaling is accomplished by multiplying the scores by a constant based on λ^α / (1 - λ) (see below). Thus it is a linear rescaling of the axis scores.

Table 21.2. Constants used for rescaling site and species scores in CCA. Combining the choices for axis scaling and optimizing species or sites results in the following constants used to rescale particular axes. Lambda (λ) is the eigenvalue for the given axis. Alpha (α) is selected as described in the text.

Interpreting output

1. Correlations among explanatory variables

Table 21.3. Correlations among the environmental variables.

2. Iteration report.

ITERATION REPORT
-----------------------------------------------------------------
Calculating axis 1
Residual = 0.53E+04 at iteration 1
Residual = 0.96E-01 at iteration 2
Residual = 0.47E-01 at iteration 3
Residual = 0.19E-01 at iteration 4
Residual = 0.84E-02 at iteration 5
Residual = 0.43E-02 at iteration 6
Residual = 0.24E-02 at iteration 7
Residual = 0.14E-02 at iteration 8
Residual = 0.88E-03 at iteration 9
Residual = 0.54E-03 at iteration 10
Residual = 0.46E-05 at iteration 20
Residual = 0.40E-07 at iteration 30
Residual = 0.34E-09 at iteration 40
Residual = 0.30E-11 at iteration 50
Residual = 0.69E-13 at iteration 58
Solution reached tolerance of 0.100000E-12 after 58 iterations.
Calculating axis 2
Residual = 0.20E+01 at iteration 1
Residual = 0.30E-03 at iteration 2
etc....

3. Total variance in the species data. It is the sum of squared deviations from expected values, which are based on the row and column totals. Let eij = the expected value of species j at site i, y+j = total for species j, yi+ = total for site i, and y++ = community matrix grand total. The variance of species j, var(yj), is

and the total variance is
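The total variance described above can be sketched in NumPy. The sketch assumes the chi-square form: expected values are eij = yi+ y+j / y++, each squared deviation is divided by its expected value, the variance of a species is divided by its total, and the species variances are combined with weights y+j / y++. These normalizations are assumptions, and the function name `total_variance` is hypothetical. Under them the total equals the chi-square statistic of the matrix divided by the grand total.

```python
import numpy as np


def total_variance(Y):
    """Return (per-species variances, total variance) under the assumed chi-square form."""
    Y = np.asarray(Y, dtype=float)
    site_tot = Y.sum(axis=1, keepdims=True)     # y_i+, shape (n, 1)
    species_tot = Y.sum(axis=0, keepdims=True)  # y_+j, shape (1, p)
    grand = Y.sum()                             # y_++
    E = site_tot @ species_tot / grand          # expected values e_ij
    chi_terms = (Y - E) ** 2 / E                # squared deviations scaled by e_ij
    species_var = chi_terms.sum(axis=0) / species_tot.ravel()
    total = np.sum(species_tot.ravel() / grand * species_var)
    return species_var, total


# Two sites that share no species give the maximum for a 2 x 2 matrix.
var_disjoint, total_disjoint = total_variance([[10, 0], [0, 10]])
# Sites with proportional composition show no deviation from expectation.
var_prop, total_prop = total_variance([[1, 2], [2, 4]])
print(total_disjoint, total_prop)
```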

4. Axis summary statistics Table 21.4. Axis summary statistics

5. Multiple regression results Table 21.5. Multiple regression results (regression of sites in species space on environmental variables).

6. Final scores for sites and species. Ordination scores (coordinates on ordination axes) are given for each site, x, and each species, u (Tables 21.6, 21.7, 21.8).

Table 21.6. Sample unit scores that are derived from the scores of species. These are the WA scores. Raw data totals (weights) are also given.

Table 21.7. Sample unit scores that are linear combinations of environmental variables for 100 sites. These are the LC Scores that are plotted in Fig. 21.3.

Table 21.8. Species scores and raw data totals (weights).

From: McCune, B. 1997. Influence of noisy environmental data on canonical correspondence analysis. Ecology 78:2617-2623.

Figure 21.2. Influence of the type and amount of noise in environmental data on LC site scores (left column) and WA site scores (right column) from CCA, based on analysis of simulated responses of 40 species to two independent environmental gradients of approximately equal strength. Panels shown: no noise.

Figure 21.2 (cont.). A small amount of noise added to the two environmental variables. Panels shown: moderate noise added to two otherwise perfect environmental variables.

Figure 21.2 (cont.). The two underlying environmental variables replaced with ten random variables. Panels shown: 10 random environmental variables.

7. Weights for sites and species. Sites and species are weighted by their totals (Table 21.8).

8. Correlations of environmental variables with ordination axes. "Interset correlations" are correlations of environmental variables with x*, the WA scores. "Intraset correlations" are correlations of environmental variables with x, the LC scores.

Table 21.9. Biplot scores and correlations for the environmental variables with the ordination axes. Biplot scores are used to plot the vectors in the ordination diagram. Two kinds of correlations are shown, interset and intraset.

9. Biplot scores for environmental variables. The environmental variables are often represented as lines radiating from the centroid of the ordination. The biplot scores give the coordinates of the tips of the radiating lines (Fig. 21.3).

The coordinates for the environmental points are based on the intraset correlations. These correlations are weighted by a function of the eigenvalue of an axis and the scaling constant (α), where vjk = the biplot score on axis k of environmental variable j, rjk = intraset correlation of variable j with axis k, and α = scaling constant. A separate equation applies if Hill's scaling is used.

10. Monte Carlo tests of significance

H0: No linear relationship between matrices. For this hypothesis, the rows in the second matrix are randomly reassigned within the second matrix.

H0: No structure in main matrix and therefore no linear relationship between matrices. For this hypothesis, elements in the main matrix are randomly reassigned within columns.

To evaluate the significance of the first CCA axis, let
n = the number of randomizations (permutations) with an eigenvalue greater than or equal to the corresponding observed eigenvalue, and
N = the total number of randomizations (permutations).
Then p = (1 + n)/(1 + N), where p = probability of type I error for the null hypothesis that you selected.
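The p-value formula above is simple enough to check by hand, and a short Python sketch makes the counting explicit. The function name `monte_carlo_p` and the eigenvalues in the example are hypothetical; the eigenvalues from the randomized runs are assumed to have been computed already.

```python
def monte_carlo_p(observed, randomized):
    """p = (1 + n) / (1 + N) for an observed eigenvalue and a list of randomized ones."""
    # n: randomizations with an eigenvalue >= the observed eigenvalue
    n = sum(1 for value in randomized if value >= observed)
    # N: total number of randomizations
    N = len(randomized)
    return (1 + n) / (1 + N)


# Made-up values: two of the four randomized eigenvalues reach the observed 0.5.
p_small_example = monte_carlo_p(0.5, [0.1, 0.2, 0.6, 0.5])

# With 999 runs and no randomized eigenvalue reaching the observed one,
# the smallest attainable p is 1/1000.
p_none_exceed = monte_carlo_p(0.9, [0.1] * 999)
print(p_small_example, p_none_exceed)
```

Adding one to both numerator and denominator counts the observed data as one of the possible arrangements, so p can never be exactly zero.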

Table 21.10. Monte Carlo test results for eigenvalues and species-environment correlations based on 999 runs with randomized data.

Table 21.11. Comparison of CCA and NMS of the example data set.

Redundancy analysis

Given a matrix of response variables (A) and a matrix of explanatory variables (E), the basic steps of RDA as applied in community ecology are:
1. Center and standardize columns of A and E.
2. Regress each response variable on E.
3. Calculate fitted values for the response variables from the multiple regressions.
4. Perform PCA on the matrix of fitted values.
5. Use eigenvectors from that PCA to calculate scores of sample units in the space defined by E.
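The RDA steps listed above can be sketched in NumPy. This is an illustration under assumptions, not a reference implementation: the function name `rda` and the random example data are hypothetical, standardization uses the sample standard deviation, and the PCA is done by eigendecomposition of the covariance matrix of the fitted values.

```python
import numpy as np


def rda(A, E):
    """Return (eigenvalues, eigenvectors, sample-unit scores) for a basic RDA."""
    A = np.asarray(A, dtype=float)
    E = np.asarray(E, dtype=float)
    n = A.shape[0]
    # Center and standardize columns of A and E.
    A_std = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)
    E_std = (E - E.mean(axis=0)) / E.std(axis=0, ddof=1)
    # Regress each response variable on E (no intercept needed after centering).
    B, *_ = np.linalg.lstsq(E_std, A_std, rcond=None)
    # Fitted values for the response variables.
    A_fit = E_std @ B
    # PCA on the fitted values via their covariance matrix.
    cov = A_fit.T @ A_fit / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scores of sample units in the space defined by E.
    scores = A_fit @ eigvecs
    return eigvals, eigvecs, scores


# Hypothetical data: 10 sample units, 4 response variables, 2 explanatory variables.
rng = np.random.default_rng(0)
E_demo = rng.normal(size=(10, 2))
A_demo = E_demo @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(10, 4))
rda_eigvals, rda_eigvecs, rda_scores = rda(A_demo, E_demo)
print(np.round(rda_eigvals, 4))
```

With two explanatory variables, at most two eigenvalues are nonzero, because the fitted values lie in the space spanned by the columns of E.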

Regression with multiple dependent variables

In the usual case of regressing a single dependent variable (Y) on multiple independent variables (X), the regression coefficients (B) are found by:
B = (X'X)^-1 X'Y
With multiple dependent variables, Y and B are matrices rather than vectors.
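A small NumPy check illustrates that the matrix form gives the same coefficients as regressing each dependent variable separately. The data are random and purely illustrative, and the variable names are hypothetical; the first column of X is an intercept column added for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(8), rng.normal(size=(8, 2))])  # intercept + 2 predictors
Y = rng.normal(size=(8, 3))                                  # 3 dependent variables

# Matrix form: B = (X'X)^-1 X'Y, solved without forming the inverse explicitly.
B = np.linalg.solve(X.T @ X, X.T @ Y)

# Column-by-column form: one ordinary regression per dependent variable.
B_by_column = np.column_stack(
    [np.linalg.solve(X.T @ X, X.T @ Y[:, j]) for j in range(Y.shape[1])]
)

print(B.shape, np.allclose(B, B_by_column))
```

Each column of B holds the coefficients for one dependent variable, so B has one row per column of X and one column per column of Y.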