A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG Terry Neeman, Statistical Consulting Unit, ANU Renee Visser, Fenner School of Environmental Science,


A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG Terry Neeman, Statistical Consulting Unit, ANU Renee Visser, Fenner School of Environmental Science, ANU

Playing it safe with multivariate analysis
- Multivariate analysis for observational data
  - Pattern-seeking
  - Avoids hypothesis testing
  - No searching across thousands of potential covariates for a few interesting “drivers” of response
  - No commitments
  - Data-driven

The western barred bandicoot of Heirisson Prong
- Once common on mainland Australia
- Driven to extinction in 1930s
- Small population on Dorre Island
- Re-introduced to a Western Australian peninsula, Heirisson Prong, in 1995
- Subject of ecological research

Studying the bandicoot diet at Heirisson Prong
- Analysis of faecal samples
- 40 animals captured in summer, 33 animals captured in winter
- Invertebrate and plant matter identified from reference collection
- Relative volume of each diet item
- 7 most common invertebrates used for diet analysis
- Data issues
  - Unidentified material
  - Uninteresting material
  - Classification of material: taxonomic categories, size categories

Sample data: relative volume (%) (subset of total columns)
Columns: ID, Season (W/S), Beetle, Grasshoppers, Spider, Slater, Bugs, Ants, Scorpions
(table values not recoverable in this transcript)

Assessing prey availability at Heirisson Prong
- Pitfall traps used to capture invertebrates
- Sampled 7 days in winter and 7 days in summer
- 14 randomly selected quadrats (50 x 50 m) on 2 sites
- Vegetation type: open ground, nesting area, common shrubs, other
- Counts amalgamated by vegetation type: 8 sets for each season
- 7 most common invertebrates counted

Prey availability: invertebrate counts by area
Columns: Season, Beetles, Ants, Grasshoppers, Slaters, Bugs, Scorpions
(table values not recoverable in this transcript)

A multivariate feast of questions...
- Does bandicoot diet vary between winter and summer?
- What are the patterns of diet observed?
- Does prey availability vary between winter and summer?
- How does prey availability by season influence diet?
- What are the bandicoot diet preferences?

Tools available in multivariate analysis
- Correspondence analysis
  - Decomposition of profile matrix of contingency table
  - Generalised singular value decomposition
- Principal components analysis
  - Decomposition of centred data matrix
  - Row and column analysis not symmetric
- Cluster analysis
  - Non-hierarchical
  - Hierarchical

Data issues: which data do we analyse?
- Relative volume data
  - As compositional data (what about all the zeros?)
  - Aggregate categories?
- Subset of relative volume data
  - Standardised to sum to 100?
- Presence/absence data
- Relative volume data, ranked
  - Rank the diet items within each animal
  - Total ranks across animals
- Massage it to look more multivariate normal?
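A minimal sketch of three of the candidate transformations listed above, applied to one animal's relative volumes. The numbers are invented for illustration and the function names are hypothetical; averaging tied ranks is an assumption, since the slides do not say how ties were handled.

```python
def to_presence_absence(volumes):
    """1 if the diet item was detected at all, else 0."""
    return [1 if v > 0 else 0 for v in volumes]


def close_to_100(volumes):
    """Rescale a subset of relative volumes so that it sums to 100."""
    total = sum(volumes)
    if total == 0:
        return [0.0 for _ in volumes]
    return [100.0 * v / total for v in volumes]


def rank_within_animal(volumes):
    """Rank items within one animal: 1 = largest volume.

    Tied items share the average of the ranks they span (an assumption).
    """
    order = sorted(range(len(volumes)), key=lambda i: -volumes[i])
    ranks = [0.0] * len(volumes)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and volumes[order[end + 1]] == volumes[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


# Hypothetical animal: beetles, grasshoppers, spiders, slaters, bugs, ants, scorpions
animal = [40, 20, 0, 0, 5, 15, 0]
print(to_presence_absence(animal))
print(close_to_100(animal))
print(rank_within_animal(animal))
```

The many zeros all tie for last place under ranking, which is one reason the ranked and presence/absence versions are easier to handle than the raw compositional data.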

Univariate analysis of bandicoot diet: presence/absence data

Item          Summer (n=40)  Winter (n=33)  χ²     P-value
Beetles       83%            100%
Grasshoppers  48%            91%            15.4   <0.001
Ants          98%            33%            34.5   <0.001
Slaters       35%            9%
Bugs          33%            24%            0.6
Spiders       38%            15%
Scorpions     30%            6%
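As a rough illustration of the univariate test above, the sketch below computes a Pearson chi-squared statistic for a 2x2 presence/absence table. The counts are an assumption: they are back-calculated from the rounded percentages (for example, 48% of 40 is taken as 19 animals), and no continuity correction is applied. Under those assumptions the statistic for grasshoppers and ants agrees with the reported 15.4 and 34.5 to one decimal place.

```python
import math


def chi_squared_2x2(present_a, n_a, present_b, n_b):
    """Pearson chi-squared statistic (no continuity correction) for a 2x2 table."""
    observed = [
        [present_a, n_a - present_a],
        [present_b, n_b - present_b],
    ]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [present_a + present_b, total - present_a - present_b]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


def p_value_1df(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Assumed counts, back-calculated from the rounded percentages on the slide
grasshoppers = chi_squared_2x2(19, 40, 30, 33)  # 48% of 40, 91% of 33
ants = chi_squared_2x2(39, 40, 11, 33)          # 98% of 40, 33% of 33
print(round(grasshoppers, 1), p_value_1df(grasshoppers))
print(round(ants, 1), p_value_1df(ants))
```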

Correspondence analysis for invertebrate orders
- We use relative volume data and treat the data as “counts”
- The data form a 73 x 7 matrix (73 animals, 7 invertebrate items)
- Correspondence analysis is a weighted PCA on rows and on columns
- Row and column scores are computed
  - Column scores give a lower-dimensional representation of diet patterns across animals
  - Row scores give a lower-dimensional representation of diet patterns within an individual
  - Ordering the data by the first row score and the first column score gives a visual pattern of association between rows and columns
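The row and column scores described above can be sketched with a singular value decomposition of the standardised residuals of the table. This is a generic textbook formulation using NumPy, not the software or settings used for the talk; the small table is invented, and the function name is hypothetical. It assumes no row or column of the table is entirely zero.

```python
import numpy as np


def correspondence_analysis(table):
    """Return row scores, column scores (principal coordinates) and inertias."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardised residuals from the independence model
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_scores = (U * sv) / np.sqrt(r)[:, None]
    col_scores = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2                    # one value per dimension
    return row_scores, col_scores, inertia


# Hypothetical table: 4 animals (rows) by 3 diet items (columns)
toy = [[50, 30, 5],
       [40, 35, 10],
       [5, 20, 60],
       [10, 15, 55]]
row_scores, col_scores, inertia = correspondence_analysis(toy)

# Share of total inertia carried by each dimension
print(inertia / inertia.sum())
# Reordering rows and columns by their first score exposes the block pattern
print(np.argsort(row_scores[:, 0]), np.argsort(col_scores[:, 0]))
```

The total inertia equals the table's Pearson chi-squared statistic divided by the grand total, which is why a cumulative inertia figure can be read as the share of association captured by the retained dimensions.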

Relative volume of invertebrate item in faeces (volumes <10% removed from table)
Columns: Season, Scorpions, Grasshoppers, Spiders, Beetles, Ants, Bugs, Slaters
(table values not recoverable in this transcript)

... a second summer population ...
Columns: Season, Scorpions, Grasshoppers, Spiders, Beetles, Ants, Bugs, Slaters
(table values not recoverable in this transcript)

Beetles and Grasshoppers in winter...
Columns: Season, Scorpions, Grasshoppers, Spiders, Beetles, Ants, Bugs, Slaters
(table values not recoverable in this transcript)

Correspondence Analysis cumulative inertia - 55%

Principal components analysis (GenStat output)
- Latent roots: components 1 and 2
- Percentage variation: components 1 and 2, ~70% of total variation
- Latent vectors (loadings): components 1 and 2 for Spiders, Beetles, Bugs, Ants, Slaters, Grasshoppers, Scorpions
(numeric output not recoverable in this transcript)

Non-hierarchical clustering using k-means
Figure labels (in slide order): Beetles, Grasshoppers, n=40; Slaters, Bugs, n=14; Ants, Spiders, Scorpions, n=19; WINTER, SUMMER

Hierarchical clustering using complete linkage

Prey availability: counts of captured prey in pit traps

Prey availability: correspondence analysis (counts below 5 removed; 13 of 16 sites)
Columns: Slater, Ants, Scorpion, Grasshopper, Beetle, Spider, Bugs
Rows: 8 winter (W) sites and 5 summer (S) sites
(table values not recoverable in this transcript)

Prey availability: correspondence analysis (counts below 5 removed; summer sites)
Columns: Slater, Ants, Scorpion, Grasshopper, Beetle, Spider, Bugs
Rows: 8 summer (S) sites
(table values not recoverable in this transcript)

Correspondence Analysis - biplot Winter sites: 1 – 8 Summer sites:

Non-hierarchical clustering: standardise each variable first!
Figure labels (in slide order): Spiders, Beetles, Bugs, N=8; Ants, Slaters, N=8; Scorpions, Grasshoppers; WINTER, SUMMER
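Why standardise first: k-means works on Euclidean distances, so a taxon counted in the hundreds would otherwise dominate one counted in single digits. The sketch below is a plain Lloyd's-algorithm k-means written with NumPy for illustration; it is not the routine used for the talk, the site counts are invented, and `standardise` and `k_means` are hypothetical helper names.

```python
import numpy as np


def standardise(X):
    """Centre each column and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def k_means(X, k, n_iter=100, seed=0):
    """Basic Lloyd's algorithm; initial centres are k distinct rows of X."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centre
        dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres


# Hypothetical pitfall counts: 6 sites by [ants, beetles, scorpions]
counts = [[1200, 30, 2], [1500, 25, 1], [1100, 35, 3],
          [300, 80, 9], [250, 90, 8], [400, 70, 7]]
Z = standardise(counts)
labels, centres = k_means(Z, k=2)
print(labels)
```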

Hierarchical cluster analysis of prey availability

A univariate look at matching prey availability to diet: presence/absence in pit-traps and faecal samples (WINTER)

A univariate look at matching prey availability to diet: presence/absence in pit-traps and faecal samples (SUMMER)

Food availability ranked by total count
- SUMMER: 1. Ants 2. Slaters 3. Grasshoppers 4. Beetles 5. Spiders 6. Bugs 7. Scorpions
- WINTER: 1. Ants 2. Beetles 3. Spiders 4. Grasshoppers 5. Bugs 6. Scorpions 7. Slaters
Using relative volumes, diet items are ranked for each individual.
Subtract diet rankings from food availability rankings.
Positive numbers indicate preference for that food item.
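A small worked example of the rank difference for one animal. The summer availability ranks are the ones listed above; the animal's relative volumes are invented, and giving tied diet items the average of their ranks is an assumption, since the slides do not say how ties were treated.

```python
# Summer availability ranks as listed on the slide (1 = most abundant)
SUMMER_AVAILABILITY_RANK = {
    "Ants": 1, "Slaters": 2, "Grasshoppers": 3, "Beetles": 4,
    "Spiders": 5, "Bugs": 6, "Scorpions": 7,
}


def diet_ranks(volumes):
    """Rank diet items for one animal: 1 = largest relative volume.

    Tied items share the average of the ranks they span (an assumption).
    """
    items = sorted(volumes, key=lambda item: -volumes[item])
    ranks = {}
    start = 0
    while start < len(items):
        end = start
        while end + 1 < len(items) and volumes[items[end + 1]] == volumes[items[start]]:
            end += 1
        for item in items[start:end + 1]:
            ranks[item] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def rank_differences(availability_rank, volumes):
    """Availability rank minus diet rank; positive values suggest preference."""
    ranks = diet_ranks(volumes)
    return {item: availability_rank[item] - ranks[item] for item in availability_rank}


# Hypothetical summer animal (relative volumes, %)
animal = {"Beetles": 40, "Grasshoppers": 20, "Spiders": 0, "Slaters": 0,
          "Bugs": 5, "Ants": 15, "Scorpions": 0}
differences = rank_differences(SUMMER_AVAILABILITY_RANK, animal)
print(differences)
```

For this invented animal, beetles are fourth in availability but first in the diet, giving a difference of +3, while ants are first in availability but third in the diet, giving -2.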

Average rank differences
Columns: invertebrate item; rank order preference in summer (n=40) and winter (n=33); average rank difference (n=73)
Rows: Beetles, Ants, Grasshoppers, Spiders, Slaters, Bugs, Scorpions
(table values not recoverable in this transcript)

Double-bootstrap to get confidence intervals
- For each iteration (N=1000):
  - Sample (with replacement) 8 summer traps, 8 winter traps
  - Rank prey using totals from re-sampled data
  - Sample (with replacement) 40 summer animals, 33 winter animals
  - Rank diet items for each re-sampled animal
  - Take difference for each animal: prey rank minus diet item rank
  - For each diet item, calculate average difference across animals
- 5% and 95% quantiles of distribution for each diet item
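The steps above can be sketched for a single season as follows. This is an illustration under stated assumptions, not the authors' code: the trap counts and animal volumes are invented, tied ranks are averaged, the quantiles are taken as simple order statistics, and `double_bootstrap` is a hypothetical function name. The talk resamples summer and winter separately, which would amount to calling the function once per season.

```python
import random


def average_ranks(values):
    """Rank values with 1 = largest; ties share the average rank (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for position in range(start, end + 1):
            ranks[order[position]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def double_bootstrap(traps, animals, n_iter=1000, seed=0):
    """Return a (5%, 95%) interval of the mean rank difference for each item.

    traps: list of per-trap count lists; animals: list of per-animal volume lists.
    """
    rng = random.Random(seed)
    n_items = len(traps[0])
    draws = [[] for _ in range(n_items)]
    for _ in range(n_iter):
        # First bootstrap: resample traps, then rank prey by total count
        trap_sample = [rng.choice(traps) for _ in traps]
        totals = [sum(trap[j] for trap in trap_sample) for j in range(n_items)]
        prey_rank = average_ranks(totals)
        # Second bootstrap: resample animals, rank diet items within each
        animal_sample = [rng.choice(animals) for _ in animals]
        sums = [0.0] * n_items
        for animal in animal_sample:
            diet_rank = average_ranks(animal)
            for j in range(n_items):
                sums[j] += prey_rank[j] - diet_rank[j]
        for j in range(n_items):
            draws[j].append(sums[j] / len(animal_sample))
    intervals = []
    for values in draws:
        values.sort()
        lower = values[int(0.05 * (n_iter - 1))]
        upper = values[int(0.95 * (n_iter - 1))]
        intervals.append((lower, upper))
    return intervals


# Hypothetical data: 4 traps and 5 animals, 3 prey items each
traps = [[30, 5, 2], [25, 8, 1], [40, 3, 0], [20, 10, 4]]
animals = [[10, 60, 30], [0, 80, 20], [50, 50, 0], [20, 70, 10], [5, 90, 5]]
intervals = double_bootstrap(traps, animals)
print(intervals)
```

Resampling the traps as well as the animals lets the interval reflect uncertainty in the availability ranking, not only in the diet ranking.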

Average rank preferences with bootstrapped confidence intervals

A few conclusions
- A multivariate approach gives a richer picture
- Profile animals and diet items: look at diet patterns
- Non-hierarchical cluster analysis, correspondence analysis and PCA elucidate similar patterns
- Hierarchical clustering is too hard to interpret
- Rank preference index used to assess diet selectivity
- Double bootstrapping prey and diet data can give confidence intervals

References
Visser R, Richards J, Neeman T. Diet of the endangered Western Barred Bandicoot Perameles bougainville (Marsupialia: Peramelidae) on Heirisson Prong, Western Australia. Wildlife Research, in press.
Krebs C. Ecological Methodology. Addison-Wesley, 1999.