More complex (multidimensional) methods

More complex (multidimensional) methods: a brief outline of the possibilities of selected methods

Path analysis. See also SEM (structural equation modelling; a somewhat broader concept, available e.g. in the program Statistica), or causal modelling.

Classic (multidimensional) regression: many predictors, one response. In reality there are long causal chains: in nature, many variables influence others and are influenced at the same time, which leads to causal networks.

E.g. a typical hydrobiological model. Variables in the diagram: carnivorous fishes, planktonophagous fishes, zooplankton, phytoplankton, and other random factors (e.g. temperature, water chemistry etc.).
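A chain of this kind can be sketched as a very small path model. The block below is only an illustration under stated assumptions: the data are simulated, the effect sizes (-0.8, -0.7) and the variable names are hypothetical, and the chain structure (fish, then zooplankton, then phytoplankton) is assumed rather than tested. In such a simple chain, each path coefficient equals the correlation between neighbouring variables, and the indirect effect is the product of the path coefficients.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


random.seed(1)  # fixed seed so the simulated data are reproducible
n = 500

# Simulated (hypothetical) data following the ASSUMED causal chain.
fish = [random.gauss(0, 1) for _ in range(n)]
zoo = [-0.8 * f + random.gauss(0, 0.6) for f in fish]
phyto = [-0.7 * z + random.gauss(0, 0.714) for z in zoo]

# Path coefficients of the chain fish -> zoo -> phyto.
p_fish_zoo = pearson(fish, zoo)
p_zoo_phyto = pearson(zoo, phyto)

# Indirect effect of fish on phytoplankton implied by the model ...
indirect = p_fish_zoo * p_zoo_phyto
# ... compared with the correlation actually observed in the data.
observed = pearson(fish, phyto)

print(round(p_fish_zoo, 2), round(p_zoo_phyto, 2))
print(round(indirect, 2), round(observed, 2))
```

If the implied and the observed correlation differ strongly, the assumed chain does not describe the data well. Agreement, however, does not prove that the causal structure is correct.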

Example with representation of Oxalis

“Causal modelling”: but causality is introduced by our assumptions about how the system functions, not proved by experimental manipulation. Approaches differ in the extent to which our initial model can be “corrected” by our data.

The method is especially useful where we cannot manipulate (at least some) variables experimentally. Popular in evolutionary biology, but also in ecology (especially at the level of ecosystems and populations at larger spatial scales). Be careful when interpreting causality.

Described in a way intelligible for biologists:
Bill Shipley 2004. Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference. Cambridge University Press.
James B. Grace 2006. Structural Equation Modeling and Natural Systems. Cambridge University Press.

(Hierarchical) classifications “Cluster analysis”

Goal of classification: form groups of objects that are internally homogeneous but different from other groups.

Typical data (matrix) Vegetation sample number

I can classify:
vegetation samples, according to their similarity in species composition (I get groups of similar vegetation samples, which I can then name, e.g. [Seslerietum]);
species, according to their similarity to each other (correlation) in distribution (I get groups of species with similar ecological requirements).

Typical data: I want to obtain groups of similar individuals. Attention: the data are on different scales and have to be standardized prior to analysis.
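Standardization can be sketched as follows. This is a minimal example assuming the common z-score variant (subtract the mean, divide by the sample standard deviation); other standardizations exist, and the measurements and variable names are hypothetical.

```python
from statistics import mean, stdev


def standardize(column):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(column)
    s = stdev(column)
    return [(x - m) / s for x in column]


# Hypothetical measurements of three individuals on very different scales.
height_cm = [120.0, 150.0, 180.0]
seed_mass_g = [0.002, 0.004, 0.006]

z_height = standardize(height_cm)
z_mass = standardize(seed_mass_g)

# After standardization both variables are unitless and comparable.
print([round(z, 3) for z in z_height])
print([round(z, 3) for z in z_mass])
```

Without this step, a distance between individuals would be dominated by height simply because its numbers are larger.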

Classification: numerical taxonomy, numerical phenetics, cladistic methods. Numerical taxonomy (earlier mainly phenetics) is nowadays a much broader approach. Cladistics (phylogenetics, the construction of phylogenetic trees) is in fact an independent field nowadays.

Classification: with learning vs. without learning; hierarchical vs. non-hierarchical; hierarchical: divisive vs. agglomerative.

Cluster analysis = hierarchical, agglomerative method; the result is a tree. Principle: first I compute the similarity matrix among all pairs of objects, then I construct the tree.

In cluster analysis keep in mind: it is considerably influenced by the so-called (dis)similarity measure (also called resemblance function). If I have data measured on different scales, I have to standardize. Resemblance functions are usually specific to different fields of biology.

In cluster analysis keep in mind: the linkage rule is very important. Defaults in the program Statistica (particularly single linkage, i.e. nearest neighbour) are mostly unsuitable for biological purposes.
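The principle (dissimilarity matrix first, then step-by-step merging) and the effect of the linkage rule can be sketched in a few lines. This is a simplified illustration, not what Statistica does internally: the function `agglomerate`, the Euclidean distance and the six one-dimensional points are assumptions chosen for the example, and merging stops at a requested number of clusters instead of building the whole tree.

```python
from itertools import combinations
from math import dist


def agglomerate(points, linkage="single", n_clusters=2):
    """Merge the two closest clusters until n_clusters remain.

    Returns the clusters as a sorted list of lists of point indices.
    """
    # Step 1: dissimilarity (Euclidean distance) for all pairs of objects.
    d = {frozenset((i, j)): dist(points[i], points[j])
         for i, j in combinations(range(len(points)), 2)}

    # Linkage rule: nearest pair (single) or farthest pair (complete).
    pick = {"single": min, "complete": max}[linkage]

    def cluster_distance(a, b):
        return pick(d[frozenset((i, j))] for i in a for j in b)

    # Step 2: start with one cluster per object and merge step by step.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical objects along a gradient with slowly growing gaps.
points = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,), (6.0,)]

print(agglomerate(points, "single", 3))    # [[0, 1, 2, 3], [4], [5]]
print(agglomerate(points, "complete", 3))  # [[0, 1], [2, 3], [4, 5]]
```

With single linkage the objects are chained into one large cluster plus leftovers, whereas complete linkage gives three compact groups from the same dissimilarity matrix.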

Cluster analysis always forms groups (clusters). If I don't want groups and just want to visualize the similarity structure of a community, ordination is the alternative.

Ordination: an ordination diagram in which similar vegetation samples are close to each other, similar species are close to each other, and species have their optima near the vegetation samples in which they are present.

Ordination diagram: proximity means similarity. [Species shown in the diagram: Urtica, Chenopodium, Cactus, Nymphaea, Menyanthes, Comarum, Aira, Drosera]

Ordination diagram with environmental variables. [Shown in the diagram: Nutrients, Water; Urtica, Chenopodium, Cactus, Nymphaea, Menyanthes, Comarum, Aira, Drosera] I can have independent variables, either displayed ex post or used in so-called constrained ordinations.

Various methods: correspondence analysis, principal component analysis, factor analysis. Popular in ecology, but also in taxonomy (e.g. it can suggest hybridization among species), and in psychology too.
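Of the methods listed, principal component analysis is the easiest to sketch. The block below finds only the first ordination axis, using power iteration on the covariance matrix. It is a minimal sketch, not a full PCA: the function name and the abundance data are hypothetical, and the starting vector is assumed not to be orthogonal to the main axis.

```python
from statistics import mean


def first_principal_axis(rows, iterations=200):
    """Return (axis, scores) for the first principal component.

    rows: list of samples, each a list of variable values.
    """
    n, p = len(rows), len(rows[0])
    centers = [mean(r[j] for r in rows) for j in range(p)]
    centered = [[r[j] - centers[j] for j in range(p)] for r in rows]

    # Covariance matrix of the variables.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]

    # Power iteration: repeated multiplication converges to the
    # eigenvector with the largest eigenvalue (the first axis).
    axis = [1.0] * p
    for _ in range(iterations):
        axis = [sum(cov[a][b] * axis[b] for b in range(p)) for a in range(p)]
        norm = sum(v * v for v in axis) ** 0.5
        axis = [v / norm for v in axis]

    # Sample scores: positions of the samples along the first axis.
    scores = [sum(r[j] * axis[j] for j in range(p)) for r in centered]
    return axis, scores


# Hypothetical abundances of two species in four vegetation samples.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
axis, scores = first_principal_axis(rows)

print([round(v, 3) for v in axis])
print([round(s, 3) for s in scores])
```

Samples with similar scores would be plotted close to each other on the first axis of the ordination diagram.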

Constrained ordinations can be used even for the analysis of experiments. [Figure labels: removed moss and gob removed gob removed control]

Discriminant analysis. Example: some populations are diploid and some tetraploid, but I can't count chromosomes every time. I ask: am I able to find a rule, based on measured morphological characters (as their linear combination), that discriminates between the two ploidy levels?
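Such a rule can be sketched with Fisher's linear discriminant for two groups and two characters. This is a minimal illustration under stated assumptions: the measurements, the function name and the simple halfway cut-off are hypothetical, and the two groups are assumed to have similar within-group variability.

```python
from statistics import mean


def fisher_discriminant(group_a, group_b):
    """Weights w of Z = w[0]*x + w[1]*y and a cut-off between two groups."""
    def centroid(g):
        return [mean(r[0] for r in g), mean(r[1] for r in g)]

    def scatter(g, c):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in g:
            d = [r[0] - c[0], r[1] - c[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    ca, cb = centroid(group_a), centroid(group_b)
    sa, sb = scatter(group_a, ca), scatter(group_b, cb)

    # Pooled within-group scatter matrix and its 2 x 2 inverse.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]

    # w = inverse(Sw) * (centroid A - centroid B)
    diff = [ca[0] - cb[0], ca[1] - cb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]

    # Cut-off halfway between the projected group centroids.
    za = w[0] * ca[0] + w[1] * ca[1]
    zb = w[0] * cb[0] + w[1] * cb[1]
    return w, (za + zb) / 2


# Hypothetical morphological characters (two per plant).
diploid = [(2.0, 1.0), (2.5, 1.3), (3.0, 1.1), (2.2, 1.4)]
tetraploid = [(3.5, 2.0), (4.0, 2.4), (3.8, 1.9), (4.4, 2.3)]

w, cutoff = fisher_discriminant(diploid, tetraploid)


def z_score(plant):
    """Value of the discriminant function Z for one plant."""
    return w[0] * plant[0] + w[1] * plant[1]


def classify(plant):
    """Assign a plant to the group on its side of the cut-off."""
    return "diploid" if z_score(plant) > cutoff else "tetraploid"


print(classify((2.3, 1.2)), classify((4.1, 2.2)))
```

The weights `w` are the coefficients of the linear combination of characters. A new plant is assigned according to which side of the cut-off its Z value falls on.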

[Figures: discriminant function Z plotted against the ratio of height to length of shell × 100, and against the ratio of apico-frontal distance to length of shell × 100]

When applying it, beware of circular reasoning (an expert has determined two species for me [mainly on the basis of anther length, though I don't know this], and then I “prove” that the two species exist and are well differentiated by their anther length).

A similar thing can be achieved by classification trees, which are based on another principle (without additivity of effects).
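The building block of a classification tree is a single split on one variable; a full tree repeats such splits within each resulting group, so effects are not added up as in a linear combination. The sketch below finds only one split (a "decision stump"). The function name, the misclassification criterion and the data are assumptions for illustration; real tree software typically uses other impurity measures and pruning.

```python
def best_split(values, labels):
    """Find the threshold on one variable that misclassifies fewest objects.

    Each side of the threshold is assigned its majority label.
    Returns (threshold, number_of_misclassified_objects).
    """
    ordered = sorted(set(values))
    best = None
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2  # candidate split between two values
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        # Objects not carrying the majority label of their side are errors.
        errors = sum(len(side) - max(side.count(c) for c in set(side))
                     for side in (left, right))
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best


# Hypothetical leaf lengths and known ploidy levels.
leaf_length = [1.0, 1.2, 1.1, 2.0, 2.3, 2.1]
ploidy = ["2x", "2x", "2x", "4x", "4x", "4x"]

threshold, errors = best_split(leaf_length, ploidy)
print(round(threshold, 2), errors)
```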