Multivariate Analysis of Stenella longirostris Pathology Reports in the Main Hawaiian Islands
Haley Boyd
Objective
The Marine Mammal Response Program completes necropsies on stranded marine cetaceans and sends samples of internal organs to pathologists on the mainland. The pathology reports identify any diseases that could have led to the death of each individual animal. Through analysis of pathology report data, the goal of this project is to determine what environmental factors make Stenella longirostris most vulnerable to disease. Marine cetaceans are sentinel species that indicate ocean health, and disease information can be used to identify possible threats to humans or other marine mammal species. Applying multivariate statistical tests to Stenella longirostris data can help determine any differences within the populations among locations, between males and females, and among age classes.
Data Description
Species Data
  Samples: 30 Stenella longirostris individuals
  Variables: 14 pathology report findings (septicemia, mastitis, pneumonia, meningitis, Cryptococcus gattii, ulcers, etc.)
Environmental Data
  Samples: 30 Stenella longirostris individuals
  Variables: 3
    Location: Big Island, Maui, Oahu, Kauai
    Sex: Male, Female
    Age: Fetus, Neonate, Calf, Subadult, Adult
Data Processing
Species Data Discarded / Environmental Data Discarded
  No outliers identified in outlier analysis.
  Rule used: a disease seen in only one individual was thrown out of the analysis. Diseases discarded: Mastitis, Morbillivirus, Toxoplasmosis, Brucella, Net Entanglement.
  Using the Jaccard distance measure, 2 samples were discarded because they had > SD.
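The sample-level outlier screen described above can be sketched as follows. This is a minimal illustration, not the software actually used for the project: it computes each sample's average Jaccard distance to every other sample and flags samples whose average sits far above the rest. The standard-deviation cutoff is not stated on the slide, so `cutoff_sd` is a required, hypothetical parameter, and the profiles are made-up presence/absence data.

```python
from statistics import mean, stdev


def jaccard_distance(a, b):
    """Jaccard distance between two presence/absence (0/1) profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-absent profiles have nothing to compare; treat them as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def flag_outliers(samples, cutoff_sd):
    """Return indices of samples whose average distance to all other samples
    is more than cutoff_sd standard deviations above the mean of those
    averages. cutoff_sd is an assumption, not a value taken from the slides."""
    n = len(samples)
    avg_dist = [
        mean(jaccard_distance(samples[i], samples[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    grand_mean = mean(avg_dist)
    spread = stdev(avg_dist)
    if spread == 0:
        return []  # every sample is equally distant; nothing stands out
    return [i for i, d in enumerate(avg_dist) if (d - grand_mean) / spread > cutoff_sd]


# Hypothetical data: nine identical disease profiles and one very different one.
profiles = [[1, 1, 0, 0]] * 9 + [[0, 0, 1, 1]]
outliers = flag_outliers(profiles, cutoff_sd=2.0)  # 2.0 is illustrative only
```

Passing the cutoff explicitly keeps the missing threshold visible instead of hiding a guessed default inside the function.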
Data Exploration
All of the data that I am analyzing are categorical and therefore require a nonparametric test for analysis. The correlation values represent the strength and direction of the relationship between two variables. The relationship between location and sex is positive but not very strong, which makes sense. The correlation between location and age is also very low, and the correlation between sex and age is negative and weak (.02).
Data Analysis
Used NMDS with the Sorensen (Bray-Curtis) distance measure because the data are nonparametric.
Maximum number of axes = 6
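NMDS starts from a sample-by-sample distance matrix, so a small sketch of the Sorensen (Bray-Curtis) measure may help. The code below is illustrative only and uses made-up presence/absence profiles; the function names are hypothetical. It also shows why the two names are paired: on 0/1 data the Bray-Curtis formula and the Sorensen formula give the same value.

```python
def bray_curtis(a, b):
    """Bray-Curtis distance: sum of absolute differences over sum of totals."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return 0.0 if den == 0 else num / den


def sorensen(a, b):
    """Sorensen distance for presence/absence data: 1 - 2*shared / (n_a + n_b)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 0.0 if total == 0 else 1.0 - 2.0 * shared / total


def distance_matrix(samples, metric=bray_curtis):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = metric(samples[i], samples[j])
    return dist


# Hypothetical profiles: rows are individuals, columns are pathology findings.
profiles = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
matrix = distance_matrix(profiles)
```

A matrix like `matrix` is what an NMDS routine would then try to reproduce in a small number of axes.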
NMDS Results
Number of axes:
  Criterion 1: decrease of stress by 5 = 2-axis result
  Criterion 2: p-value < .05 = 2-axis result
Scree Plot
Significance rule: At 2 dimensions the real data line remains below the randomized data. Once 3 dimensions is reached, the line intersects the randomized data.
Stress reduction rule: Stress is reduced by 5 or more for a 2D solution. For 3D the reduction is less than 5.
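The two rules above can be written down as a short sketch. Everything here is illustrative: the stress values are invented, the function names are hypothetical, and the p-value formula is one common way of comparing a real-data stress with randomized runs rather than a confirmed description of what the software reported.

```python
def choose_axes(stress_by_dim, min_drop=5.0):
    """Stress reduction rule: keep adding an axis while it lowers stress by
    at least min_drop. stress_by_dim[k] is the stress of the (k+1)-D solution."""
    chosen = 1
    for k in range(1, len(stress_by_dim)):
        if stress_by_dim[k - 1] - stress_by_dim[k] >= min_drop:
            chosen = k + 1
        else:
            break  # the extra axis does not buy enough stress reduction
    return chosen


def monte_carlo_p(real_stress, randomized_stresses):
    """Share of randomized runs with stress at or below the real-data stress,
    counting the real run itself (an assumed, commonly used formula)."""
    at_or_below = sum(1 for s in randomized_stresses if s <= real_stress)
    return (1 + at_or_below) / (1 + len(randomized_stresses))


# Invented stress values for 1-D through 4-D solutions.
example_stress = [30.0, 12.0, 9.5, 8.0]
n_axes = choose_axes(example_stress)

# Invented randomized-run stresses for one dimensionality.
p_value = monte_carlo_p(12.0, [25.0, 27.5, 24.0, 26.0])
```

With these invented numbers the 1D to 2D drop is 18 and the 2D to 3D drop is 2.5, so the rule stops at two axes.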
Results
Coefficient of determination (% variance):
Orthogonality: measure of independence of the three axes
Result Interpretation
Ordination plot for the NMDS of pathology data with environmental vectors scaled to 100%. All variables are shown as small solid circles and all species as open circles.
Axis 1 and Axis 2 have the lowest stress and highest coefficient of determination. Clarke's rules of thumb classify a stress level of 7.24 as “a good ordination with no real risk of drawing false inferences.”
Next Step: MRPP Results
Grouping variables tested: Location, Sex, Age
A > 0 means that samples are more similar within groups than expected by chance. p < .05 for significance.
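To make the A statistic concrete, here is a minimal MRPP sketch. It is not the procedure used to produce the results above: the slides do not say how the p-value was computed, so this version swaps in a plain permutation test, and the profiles, group labels, and function names are all hypothetical.

```python
import random
from itertools import combinations


def sorensen(a, b):
    """Sorensen distance between two presence/absence (0/1) profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 0.0 if total == 0 else 1.0 - 2.0 * shared / total


def mrpp_delta(dist, groups):
    """Weighted mean within-group distance, with weights n_group / N."""
    members = {}
    for i, g in enumerate(groups):
        members.setdefault(g, []).append(i)
    delta = 0.0
    for idx in members.values():
        pairs = list(combinations(idx, 2))
        if not pairs:
            continue  # a single-member group has no within-group pairs
        mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
        delta += (len(idx) / len(groups)) * mean_d
    return delta


def mrpp(dist, groups, n_perm=999, seed=0):
    """Return (A, p). A = 1 - observed delta / expected delta, where the
    expected delta is estimated by shuffling the group labels."""
    rng = random.Random(seed)
    observed = mrpp_delta(dist, groups)
    shuffled = list(groups)
    perm_deltas = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        perm_deltas.append(mrpp_delta(dist, shuffled))
    expected = sum(perm_deltas) / n_perm
    if expected == 0:
        raise ValueError("all within-group distances are zero")
    a_stat = 1.0 - observed / expected
    p_value = (1 + sum(1 for d in perm_deltas if d <= observed)) / (1 + n_perm)
    return a_stat, p_value


# Hypothetical profiles: two locations with clearly different findings.
profiles = (
    [[1, 1, 0, 0]] * 3 + [[1, 1, 1, 0], [1, 0, 0, 0]]
    + [[0, 0, 1, 1]] * 3 + [[0, 1, 1, 1], [0, 0, 0, 1]]
)
groups = ["A"] * 5 + ["B"] * 5
dist = [[sorensen(p, q) for q in profiles] for p in profiles]
a_stat, p_value = mrpp(dist, groups)
```

Because the two made-up groups barely overlap, the observed within-group distance is well below the shuffled average, which is what a positive A expresses.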
Discussion
The significant MRPP result for the environmental variable location indicates that disease presence differs among location groups. The next step would be to complete an ISA to determine, via pairwise comparison, which groups are different and which variables make them most different.
Lessons Learned: The more data points available, the more closely the sample resembles the population and the more likely the results are to reach significance.
For future research, I would add several other species for comparison to identify true patterns in the data across species.