Multidimensional Space,

Presentation transcript:

Chapter 11: Multivariate Data, Multidimensional Space, and Spatialization. By Martin Bartnes

Chapter 11
Multivariate Data and Multidimensional Space
Distance, Difference, and Similarity
Cluster Analysis
Spatialization
Reducing the Number of Variables

Multivariate Data and Multidimensional Space: Multivariate data are data where more than one item is recorded for each observation. Such data are commonly represented in tabular form, with one row per observation and one column per variable.

One, Two and Three Variables in a Single Plot

Distance, Difference, and Similarity: Distance is a measure of the difference between pairs of observations. Pythagoras's theorem applies: distances in a multidimensional data space are calculated in the same way as geographical distances. Each observation in a multidimensional space has a set of coordinates given by its value on each of the recorded variables. We use these to construct a distance matrix recording the distances between all pairs of observations. Alternatives to the straight-line (Euclidean) distance include the Minkowski and Manhattan distances.
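As a rough illustration of the distance matrix described here, the following Python sketch computes Minkowski distances between observations. The function names (`minkowski`, `distance_matrix`) and the three sample observations are made up for this example and do not come from the slides. With order p = 2 the Minkowski distance is the Euclidean (Pythagorean) distance, and with p = 1 it is the Manhattan distance.

```python
def minkowski(a, b, p=2.0):
    """Minkowski distance of order p between two observations of equal length."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def distance_matrix(observations, p=2.0):
    """Symmetric matrix of distances between all pairs of observations."""
    n = len(observations)
    return [[minkowski(observations[i], observations[j], p) for j in range(n)]
            for i in range(n)]


# Hypothetical data: three observations, each recorded on three variables.
observations = [
    (0.0, 0.0, 0.0),
    (3.0, 4.0, 0.0),
    (3.0, 4.0, 12.0),
]

euclidean = distance_matrix(observations, p=2)  # straight-line distances
manhattan = distance_matrix(observations, p=1)  # sum of absolute differences

for row in euclidean:
    print([round(d, 3) for d in row])
```

In practice the variables are often measured on very different scales, so it may be worth standardizing each column before computing distances; that step is left out of this sketch.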

Cluster Analysis: A cluster is a set of observations that are similar to each other and relatively different from other sets of observations. Cluster analysis can help to identify potential classifications in statistical data. Since we are thinking of each observation as a point in multidimensional space defined by the variables, an obvious step is to look for clusters of observations. This may be an important first step in developing a theory.

Simple Cluster Technique

Hierarchical Cluster Analysis: Works by building a nested hierarchy of clusters. Small clusters consist of observations that are very similar to one another. Such small clusters are grouped into larger, looser associations higher up the hierarchy.
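One way to build such a nested hierarchy is agglomerative clustering, sketched below in Python. This is a minimal illustration and not necessarily the method used in the chapter: it assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest members), and the function names and sample points are hypothetical.

```python
def euclidean(a, b):
    """Straight-line distance between two observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def single_linkage(points):
    """Agglomerative clustering with single linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of indices into `points`.
    """
    clusters = [(i,) for i in range(len(points))]  # start: one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical data: two tight pairs and one distant observation.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (20.0, 20.0)]
merges = single_linkage(points)

for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

The early merges happen at small distances and join very similar observations; the later merges happen at larger distances and form the looser groupings higher up the hierarchy. Reading the merge history from first to last corresponds to reading a dendrogram from the bottom up.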

Spatialization: Mapping Multivariate Data. Data cannot be visualized directly in more than three dimensions, yet multivariate data may be 5-, 10-, or 20-dimensional.

Reducing the Number of Variables: Principal components analysis identifies a set of independent, uncorrelated variates, called the principal components, that can replace the original observed variables. The values of the principal components for each observation are calculated from the original variables. There are as many principal components as there are original variables, but a subset of them that captures most of the variability in the original data can be used instead. Factor analysis looks for hidden factors and attempts to identify them.
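The Python sketch below illustrates the idea on a tiny, made-up data set with two correlated variables. It is a simplified illustration under stated assumptions, not the chapter's procedure: the principal components are taken as eigenvectors of the sample covariance matrix, found here by power iteration with deflation so that only the standard library is needed. All function names are hypothetical, and a numerical library would normally be used for this step.

```python
def covariance_matrix(rows):
    """Center the data and return (centered rows, sample covariance matrix)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(m)]
           for a in range(m)]
    return centered, cov


def dominant_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, by power iteration.

    Assumes the fixed starting vector is not orthogonal to the eigenvector
    being sought; that holds for the sample data below but not in general.
    """
    m = len(matrix)
    v = [1.0 / m ** 0.5] * m
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    value = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m))
    return value, v


def principal_components(rows):
    """Return ([(variance, direction), ...], component scores per observation)."""
    centered, cov = covariance_matrix(rows)
    m = len(cov)
    work = [row[:] for row in cov]
    components = []
    for _ in range(m):
        value, vector = dominant_eigenpair(work)
        components.append((value, vector))
        # Deflation: remove the component just found, then look for the next.
        for i in range(m):
            for j in range(m):
                work[i][j] -= value * vector[i] * vector[j]
    # Component values are linear combinations of the centered variables.
    scores = [[sum(c[j] * vec[j] for j in range(m)) for _, vec in components]
              for c in centered]
    return components, scores


# Hypothetical data: the second variable is roughly twice the first.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
components, scores = principal_components(rows)

total = sum(value for value, _ in components)
explained = [value / total for value, _ in components]
print([round(share, 4) for share in explained])
```

There are two components because there are two original variables, but nearly all of the variability falls on the first one, so for this data the first component alone could stand in for both variables.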

Questions!!! What are the main challenges with multidimensional data? What is the point of cluster analysis?