Dimensionality Reduction with Linear Transformations: project update by Mingyue Tan, March 17, 2004

Domain and Task
Questions to answer:
- What is the shape of the clusters?
- Which clusters are dense or heterogeneous?
- Which data coordinates account for the decomposition into clusters?
- Which data points are outliers?
The data are labeled.

Solution - Dimension Reduction
1. Project the high-dimensional points into a low-dimensional space while preserving the "essence" of the data, i.e. pairwise distances are preserved as well as possible.
2. Solve the problems in the low-dimensional space.

Principal Component Analysis
Intuition: find the axis that shows the greatest variation, and project all points onto this axis.
[Figure: data points with original axes f1, f2 and principal axes e1, e2]
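The project re-implements PCA in Java, but as an illustration of the idea on this slide, a minimal NumPy sketch (the names and details here are mine, not the project code) looks like this: center the data, take the eigenvectors of the covariance matrix with the largest eigenvalues, and project onto them.

```python
import numpy as np

def pca(X, k=2):
    """Project the rows of X (n samples x d features) onto the top-k principal axes."""
    Xc = X - X.mean(axis=0)                      # center each feature
    cov = Xc.T @ Xc / (len(X) - 1)               # d x d sample covariance
    vals, vecs = np.linalg.eigh(cov)             # eigh: ascending eigenvalues, orthonormal vectors
    axes = vecs[:, np.argsort(vals)[::-1][:k]]   # k directions of greatest variance
    return Xc @ axes                             # n x k low-dimensional coordinates
```

With k = 2 this yields exactly the kind of 2-D scatter plot the visualization tool is meant to display.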

Problems with PCA
- Not robust: sensitive to outliers.
- Usually does not show the clustering structure.

New Approach
- PCA seeks a projection that maximizes the sum of squared pairwise distances between the projected points.
- Weighted PCA seeks a projection that maximizes the weighted sum of squared pairwise distances, which adds flexibility: a bigger w_ij makes it more important to place points i and j far apart.
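The formulas on this slide were lost in the transcript; in the paper this project follows [1], the quantity being maximized is the weighted sum of squared pairwise distances between the projected points. A hedged NumPy sketch of that reading, using the standard graph-Laplacian identity, is below (the function name and interface are mine):

```python
import numpy as np

def weighted_pca(X, W, k=2):
    """Weighted PCA sketch: find k orthonormal directions maximizing
    sum_ij W[i, j] * ||proj(x_i) - proj(x_j)||^2.

    X: (n, d) data matrix; W: (n, n) symmetric non-negative weight matrix.
    """
    Xc = X - X.mean(axis=0)
    # The weighted pairwise scatter matrix sum_ij W[i,j] (x_i - x_j)(x_i - x_j)^T
    # equals (up to a constant factor) X^T L X with the graph Laplacian L = D - W,
    # which is cheaper to form than the explicit double sum.
    L = np.diag(W.sum(axis=1)) - W
    S = Xc.T @ L @ Xc
    vals, vecs = np.linalg.eigh(S)
    axes = vecs[:, np.argsort(vals)[::-1][:k]]   # top-k eigenvectors of S
    return Xc @ axes
```

Setting every weight to 1 recovers the ordinary PCA directions, which is why weighted PCA is presented as a strict generalization.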

Weighted PCA
Varying w_ij gives:
- Weights specified by the user.
- Normalized PCA: robust towards outliers.
- Supervised PCA: shows cluster structure.
  - If i and j belong to the same cluster, set w_ij = 0.
  - This maximizes the inter-cluster scatter.
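To make these variants concrete, here is one way the weight matrices could be constructed (a sketch under assumptions: the exact down-weighting used by normalized PCA should be checked against [1]):

```python
import numpy as np

def supervised_weights(labels):
    """Supervised PCA: w_ij = 0 for points in the same cluster, 1 otherwise,
    so the projection maximizes inter-cluster scatter only."""
    labels = np.asarray(labels)
    W = (labels[:, None] != labels[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def normalized_weights(X, eps=1e-12):
    """Normalized PCA (assumed form): weights inversely related to the original
    squared pairwise distances, so one far-away outlier cannot dominate."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    W = 1.0 / (d2 + eps)
    np.fill_diagonal(W, 0.0)
    return W
```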

Comparison – with outliers
- PCA: outliers typically govern the projection direction.
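A small self-contained demonstration of this effect with synthetic, hypothetical data: a single extreme point is enough to swing the leading principal axis away from the direction along which the bulk of the data varies.

```python
import numpy as np

rng = np.random.default_rng(0)
# 100 points stretched along the x-axis, plus one extreme outlier on the y-axis.
bulk = rng.normal(size=(100, 2)) * np.array([5.0, 0.5])
outlier = np.array([[0.0, 200.0]])

def top_axis(X):
    """Leading eigenvector of the sample covariance (the first principal axis)."""
    vals, vecs = np.linalg.eigh(np.cov(X.T))
    return vecs[:, np.argmax(vals)]

print(top_axis(bulk))                          # roughly (+/-1, 0): follows the bulk of the data
print(top_axis(np.vstack([bulk, outlier])))    # roughly (0, +/-1): governed by the single outlier
```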

Comparison – cluster structure
- Projections that maximize scatter ≠ projections that separate clusters.

Summary
Method                  Task
Naïve PCA               Outlier detection
Weights-specified PCA   General view
Normalized PCA          Robustness towards outliers
Supervised PCA          Cluster structure
Ratio optimization      Cluster structure (flexibility)

Interface

Interface - File

Interface - Task

Interface - Method

Interface

Milestones
- Dataset assembled: same dataset as used in the paper [1].
- Got familiar with NetBeans: implemented a preliminary interface (no functionality yet).
- Rewrite PCA in Java (from an existing Matlab implementation): partially done.
- Implement the four new methods.

Reference
[1] Y. Koren and L. Carmel, "Visualization of Labeled Data Using Linear Transformations", Proc. IEEE Symposium on Information Visualization (InfoVis 2003), IEEE, 2003.