Data Visualization and Feature Selection: New Algorithms for Nongaussian Data. Howard Hua Yang and John Moody, NIPS '99.

Contents
- Data visualization: good 2-D projections for interpreting high-dimensional data
- Feature selection: eliminating redundancy
- Joint mutual information (JMI)
- ICA

Introduction
- Visualization of input data and feature selection are intimately related.
- Input variable selection is the most important step in the model selection process.
- We take model-independent approaches that select input variables before the model is specified.
- Data visualization is essential for humans to understand the structural relations among the variables in a system.

Joint mutual information for input/feature selection

Mutual information between an input X and the target Y:

    I(X; Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x) p(y)}

This is the Kullback-Leibler divergence D(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)} between the joint distribution p(x, y) and the product of the marginals p(x) p(y).

Joint mutual information of an input pair (X_i, X_j) with the target:

    I(X_i, X_j; Y) = \sum_{x_i, x_j, y} p(x_i, x_j, y) \log \frac{p(x_i, x_j, y)}{p(x_i, x_j) p(y)}
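
These quantities are simple to estimate for discrete (or discretized) variables. Below is a minimal plug-in estimator sketch in Python/NumPy; the function names are illustrative rather than from the paper, and real-valued inputs would first have to be binned.

    import numpy as np

    def mutual_information(x, y):
        """Plug-in estimate of I(X; Y) in nats for discrete 1-D arrays."""
        xs, x_idx = np.unique(x, return_inverse=True)
        ys, y_idx = np.unique(y, return_inverse=True)
        joint = np.zeros((xs.size, ys.size))
        np.add.at(joint, (x_idx, y_idx), 1.0)   # count co-occurrences
        joint /= joint.sum()                    # p(x, y)
        px = joint.sum(axis=1, keepdims=True)   # p(x)
        py = joint.sum(axis=0, keepdims=True)   # p(y)
        nz = joint > 0                          # avoid log(0)
        return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

    def joint_mutual_information(xi, xj, y):
        """I(X_i, X_j; Y): treat the pair (x_i, x_j) as one discrete variable."""
        pair = np.unique(np.column_stack([xi, xj]), axis=0, return_inverse=True)[1]
        return mutual_information(pair, y)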

Conditional MI

    I(Y; X_i | X_j) = H(Y | X_j) - H(Y | X_i, X_j)

By the chain rule, I(X_i, X_j; Y) = I(X_j; Y) + I(Y; X_i | X_j). When I(Y; X_i | X_j) = 0, the input X_i carries no information about Y beyond what X_j already provides.

Use the joint mutual information, instead of the marginal mutual information, to select inputs for a neural network classifier and for data visualization.
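
Given the estimators sketched above, the conditional MI follows directly from the chain rule; this one-liner is a sketch under the same assumptions, not the paper's implementation.

    def conditional_mutual_information(xi, xj, y):
        """I(Y; X_i | X_j) via the chain rule:
        I(X_i, X_j; Y) = I(X_j; Y) + I(Y; X_i | X_j)."""
        return joint_mutual_information(xi, xj, y) - mutual_information(xj, y)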

Data visualization methods
- Supervised methods based on JMI (cf. CCA)
- Unsupervised methods based on ICA (cf. PCA)
- An efficient method for computing JMI
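
The contrast on this slide is between projections that use only second-order statistics (PCA) and those that exploit the higher-order structure of nongaussian data (ICA). The paper's own algorithms are not reproduced here; as a stand-in, the sketch below compares scikit-learn's PCA and FastICA on synthetic Laplacian data.

    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    rng = np.random.default_rng(0)
    X = rng.laplace(size=(1000, 15))  # synthetic stand-in for 15-D nongaussian data

    # PCA decorrelates (second-order only); FastICA also removes
    # higher-order dependencies, which matters for nongaussian data.
    proj_pca = PCA(n_components=2).fit_transform(X)
    proj_ica = FastICA(n_components=2, random_state=0).fit_transform(X)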

Application to Signal Visualization and Classification
- JMI and visualization of radar pulse patterns
- Each radar pulse pattern is a 15-dimensional vector; there are 3 classes
- Compute the JMIs and select inputs accordingly (a brute-force sketch follows)
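
A brute-force way to carry out the "compute JMIs, select inputs" step, reusing the estimators sketched earlier (inputs discretized first). The exhaustive O(d^2) pair scan is a convenience here, not the paper's efficient JMI method.

    from itertools import combinations

    def rank_pairs_by_jmi(X, y):
        """Score every input pair (i, j) by I(X_i, X_j; Y), highest first."""
        scores = {(i, j): joint_mutual_information(X[:, i], X[:, j], y)
                  for i, j in combinations(range(X.shape[1]), 2)}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)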

Radar pulse classification
- Classifier: a neural network with 7 hidden units
- Experiments: all 15 inputs vs. 4 selected inputs; the 4 inputs with the largest JMI vs. 4 randomly selected inputs
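
The slide does not give the full protocol, so the sketch below is hypothetical throughout: synthetic data in place of the radar pulses, scikit-learn's MLPClassifier with 7 hidden units, and placeholder indices in place of the true largest-JMI inputs.

    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.laplace(size=(600, 15))  # stand-in for 15-D radar pulse features
    y = (X[:, 2] + X[:, 5] > 0).astype(int) + (X[:, 9] > 1)  # 3 synthetic classes

    def cv_accuracy(features, labels):
        clf = MLPClassifier(hidden_layer_sizes=(7,), max_iter=2000, random_state=0)
        return cross_val_score(clf, features, labels, cv=5).mean()

    selected = [2, 5, 9, 13]  # placeholder for the 4 inputs with the largest JMI
    random4 = rng.choice(15, size=4, replace=False)

    print("all 15 inputs  :", cv_accuracy(X, y))
    print("4 JMI-selected :", cv_accuracy(X[:, selected], y))
    print("4 random inputs:", cv_accuracy(X[:, random4], y))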

Conclusions
Advantages of the single JMI measure:
- It can distinguish inputs even when all of them have the same marginal mutual information I(X_i; Y).
- It can eliminate redundancy in the inputs when one input is a function of other inputs.