PCA of Waimea Wave Climate


PCA of Waimea Wave Climate By Kjersti Johnson

Do the variables that affect wave climate exhibit significant relationships with each other? Are there continuous patterns or environmental gradients in the Waimea swell data? Do certain variables explain more variability in the data than others? The goal of this project is to investigate trends in oceanic wave climate data. A Principal Component Analysis (PCA) will help determine whether significant relationships exist among the environmental variables and which of these variables covary most strongly. My hypothesis was that significant wave height and mean wave period would exhibit the strongest covariance in the data. Generally speaking, a longer wave period indicates larger waves, so these two variables could very likely dominate the first principal component.

Dataset
Obtained from PacIOOS (Pacific Islands Ocean Observing System)
Main matrix variables (n=4):
1) Significant Wave Height (ft)
2) Peak Wave Period (seconds)
3) Mean Wave Period (seconds)
4) Mean Wind Speed (mph)
All main matrix variables are quantitative.
61 samples (each day of December 2017 and January 2018, the peak swell season)

Data Processing
Assumptions of normality/linearity: overall, the skewness looks adequate. The acceptable skewness range is -1 to 1, and significant wave height is the only variable that falls (barely) outside it.
The percentage of empty cells is very low (0.82%).
There were no outliers: all variables were within 2 standard deviations of the mean, so I did not discard any samples or variables.
Looking at the column sums, it appears that peak period will explain the most variance in the data (highest sum).
Although the data were approximately normal, the variables are in different units (seconds, feet, and miles per hour) and the column sums are not roughly equal, so I applied a general relativization by column to give each variable equal weight in the analysis.
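The screening steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not give the exponent used for the general relativization, so division by column totals is assumed here, and the numbers are made up rather than taken from the PacIOOS record. The function names are hypothetical.

```python
import math

def skewness(values):
    """Adjusted Fisher-Pearson sample skewness; None marks an empty cell."""
    x = [v for v in values if v is not None]
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

def relativize_by_column_total(rows):
    """Divide every value by its column sum so that each column sums to 1."""
    n_cols = len(rows[0])
    totals = [sum(r[j] for r in rows) for j in range(n_cols)]
    return [[r[j] / totals[j] for j in range(n_cols)] for r in rows]

# Made-up rows, NOT the real data.
# Columns: wave height (ft), peak period (s), mean period (s), wind speed (mph)
rows = [
    [6.0, 14.0, 9.0, 12.0],
    [9.5, 16.0, 10.5, 8.0],
    [4.2, 11.0, 7.5, 15.0],
    [12.1, 18.0, 11.0, 10.0],
    [7.3, 13.0, 8.8, 9.0],
]

relativized = relativize_by_column_total(rows)
column_sums = [sum(r[j] for r in relativized) for j in range(4)]

# Flag columns whose skewness falls outside the -1 to 1 range used above.
skew_by_column = [skewness([r[j] for r in rows]) for j in range(4)]
out_of_range = [j for j, s in enumerate(skew_by_column) if abs(s) > 1]
```

After this step every column carries the same total weight, which is the stated purpose of the relativization.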

Dataset Exploration: Significant Correlations
Sig. Wave Height vs. Mean Period: p<0.0001
Sig. Wave Height vs. Peak Period: p=0.0077
Peak Period vs. Mean Period: p<0.0001
Mean Wind Speed vs. Mean Period: p=0.002
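The slides do not say which test produced these p-values. As one possible way to obtain a p-value for a pairwise correlation, the sketch below computes Pearson's r and a two-sided permutation p-value. The data are invented and the function names are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=9999, seed=0):
    """Two-sided p-value: how often a shuffled y gives |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    # Count the observed arrangement itself so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)

# Invented values, NOT the Waimea record.
wave_height_ft = [4.2, 5.1, 6.0, 7.3, 8.4, 9.5, 10.8, 12.1]
mean_period_s = [7.5, 8.1, 8.0, 8.8, 9.6, 10.5, 10.2, 11.0]

r = pearson_r(wave_height_ft, mean_period_s)
p = permutation_p_value(wave_height_ft, mean_period_s)
```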

Dataset Analysis
Results needed:
-% variance extracted for the 4 axes
-Eigenvalues vs. broken-stick eigenvalues
-Randomization results with p-values
-Eigenvectors/variable loadings
-Ordination plot
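A minimal sketch of how the first and fourth items in this list could be computed with NumPy. It assumes a PCA on the correlation matrix (the slides do not state which cross-products matrix was used) and runs on synthetic stand-in data with the same shape as the main matrix (61 samples, 4 variables), not the real measurements.

```python
import numpy as np

def pca_correlation(data):
    """PCA on the correlation matrix: eigenvalues, eigenvectors, sample scores."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = z @ eigenvectors  # sample coordinates on each axis
    return eigenvalues, eigenvectors, scores

# Synthetic stand-in data driven by one shared "swell" factor plus noise.
rng = np.random.default_rng(0)
n = 61
swell = rng.normal(size=n)
data = np.column_stack([
    8 + 3.0 * swell + rng.normal(scale=1.0, size=n),   # wave height (ft)
    14 + 2.0 * swell + rng.normal(scale=1.5, size=n),  # peak period (s)
    9 + 1.5 * swell + rng.normal(scale=0.5, size=n),   # mean period (s)
    10 + rng.normal(scale=3.0, size=n),                # wind speed (mph)
])

eigenvalues, eigenvectors, scores = pca_correlation(data)
percent_variance = 100 * eigenvalues / eigenvalues.sum()
cumulative_percent = np.cumsum(percent_variance)
```

Under the correlation-matrix assumption, rescaling a column by a constant leaves its correlations unchanged, so a column relativization would only affect the result of a covariance-based PCA.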

Results Interpretation
1st stopping rule (eigenvalue > broken-stick eigenvalue): 3 axes meet this criterion. The first 3 axes explain more variability than would be expected by chance, because the eigenvalues for only these axes are larger than the corresponding broken-stick eigenvalues (the eigenvalues expected by chance).
2nd stopping rule (eigenvalue > mean eigenvalue from randomization): 2 axes meet this criterion.
3rd stopping rule (p-value < 0.05): 1 axis meets this criterion.
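The three stopping rules can be sketched as follows. This assumes a correlation-matrix PCA, in which case the broken-stick eigenvalue for axis k of p variables is the sum of 1/j for j from k to p, and a randomization that shuffles each column independently. The data are synthetic and the function names are hypothetical; none of this reproduces the actual output shown on the slide.

```python
import numpy as np

def broken_stick(p):
    """Broken-stick eigenvalues for p variables (they sum to p)."""
    return np.array([sum(1.0 / j for j in range(k, p + 1))
                     for k in range(1, p + 1)])

def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def randomization_test(data, n_runs=999, seed=0):
    """Shuffle each column independently and compare eigenvalues axis by axis."""
    rng = np.random.default_rng(seed)
    observed = sorted_eigenvalues(data)
    random_eigs = np.empty((n_runs, data.shape[1]))
    for i in range(n_runs):
        shuffled = np.column_stack([rng.permutation(col) for col in data.T])
        random_eigs[i] = sorted_eigenvalues(shuffled)
    mean_random = random_eigs.mean(axis=0)
    p_values = (1 + (random_eigs >= observed).sum(axis=0)) / (1 + n_runs)
    return observed, mean_random, p_values

# Synthetic stand-in data (61 samples x 4 variables), NOT the real record.
rng = np.random.default_rng(1)
shared = rng.normal(size=61)
data = np.column_stack([shared + rng.normal(scale=0.5, size=61) for _ in range(3)]
                       + [rng.normal(size=61)])

observed, mean_random, p_values = randomization_test(data)
expected = broken_stick(data.shape[1])

# Number of axes passing each rule.
rule1_axes = int((observed > expected).sum())     # eigenvalue > broken-stick
rule2_axes = int((observed > mean_random).sum())  # eigenvalue > randomization mean
rule3_axes = int((p_values < 0.05).sum())         # randomization p < 0.05
```

For four variables the broken-stick eigenvalues are about 2.083, 1.083, 0.583, and 0.250.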

Results Interpretation
Coefficient of Determination and Orthogonality
R-squared values indicate the percentage of the pattern in the original distance matrix that is explained. Orthogonality indicates the independence of the axes. The results show 100% orthogonality for each pair of axes.
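Both quantities can be illustrated with a short sketch. It assumes that orthogonality between two axes is reported as 100(1 - r²), where r is the correlation between the sample scores on those axes, and that the coefficient of determination is the squared correlation between pairwise Euclidean distances in the original space and in the ordination space. Both definitions are assumptions about the software output, and the data are synthetic.

```python
import numpy as np

def axis_orthogonality(scores, i, j):
    """Percent orthogonality of axes i and j, assumed to be 100 * (1 - r^2)."""
    r = np.corrcoef(scores[:, i], scores[:, j])[0, 1]
    return 100.0 * (1.0 - r ** 2)

def pairwise_distances(points):
    """Euclidean distances between all distinct pairs of rows."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    upper = np.triu_indices(len(points), k=1)
    return dist[upper]

def distance_r_squared(z, scores, n_axes):
    """r^2 between original distances and distances on the first n_axes axes."""
    d_original = pairwise_distances(z)
    d_ordination = pairwise_distances(scores[:, :n_axes])
    return np.corrcoef(d_original, d_ordination)[0, 1] ** 2

# Synthetic stand-in data (61 samples x 4 variables), NOT the real record.
rng = np.random.default_rng(2)
shared = rng.normal(size=61)
data = np.column_stack([shared + rng.normal(scale=0.7, size=61) for _ in range(3)]
                       + [rng.normal(size=61)])

# Correlation-matrix PCA (an assumption about the original analysis).
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(data, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
scores = z @ eigenvectors[:, order]

orthogonality_1_2 = axis_orthogonality(scores, 0, 1)
r2_by_axes = [distance_r_squared(z, scores, k) for k in range(1, 5)]
```

PCA axes are uncorrelated by construction, so 100% orthogonality for every pair is the expected result rather than a finding specific to this dataset.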

Results Interpretation
The strongest loadings for each axis are highlighted:
-Peak period and mean period have the largest influence on variation in the data for the first axis
-Significant wave height and mean wind speed are the drivers of the second axis
-Significant wave height and peak period are the drivers of the third axis
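Reading loadings in this way can be sketched as below: for each axis, take the variables with the largest absolute eigenvector entries. The sketch also uses the identity that, for a correlation-matrix PCA, the correlation between a variable and an axis equals the loading times the square root of that axis's eigenvalue. The correlation-matrix choice is an assumption, and the data and variable labels are synthetic stand-ins.

```python
import numpy as np

names = ["sig_wave_height", "peak_period", "mean_period", "mean_wind_speed"]

# Synthetic stand-in data (61 samples x 4 variables), NOT the real record.
rng = np.random.default_rng(3)
swell = rng.normal(size=61)
data = np.column_stack([
    swell + rng.normal(scale=0.8, size=61),
    swell + rng.normal(scale=0.4, size=61),
    swell + rng.normal(scale=0.4, size=61),
    rng.normal(size=61),
])

# Correlation-matrix PCA.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(data, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
scores = z @ eigenvectors

# The two variables with the largest absolute loading on each axis.
strongest = {}
for axis in range(eigenvectors.shape[1]):
    top = np.argsort(np.abs(eigenvectors[:, axis]))[::-1][:2]
    strongest[axis + 1] = [names[j] for j in top]

# Variable-axis correlations from the identity: loading * sqrt(eigenvalue).
variable_axis_corr = eigenvectors * np.sqrt(eigenvalues)
```

The sign of an eigenvector is arbitrary, so only the magnitude of a loading (and the relative signs within one axis) should be interpreted.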

Results Interpretation
Ordination Plot

Results Interpretation
Highlighted Variable Correlations with Individual Axes

Discussion
The results are fairly consistent with my predictions: significant wave height and mean wave period are significantly correlated (p<0.0001), and both have fairly strong loadings on the principal axis. However, when all of the variables are considered, the PCA indicates that mean period and peak period have the strongest influence on the variability in the data along the first axis and are therefore the main contributors to the first principal component (followed by mean wind speed and significant wave height for the second axis). This analysis helped me understand the relationships among the variables that describe wave climate. If historical atmospheric data for this location were easier to obtain, I would like to incorporate more variables, such as barometric pressure, to gain more insight into the wave climate. For my re-analysis, I would like to perform a polar ordination. The next steps involve selecting endpoints for that analysis.