Data Visualization and Feature Selection: New Algorithms for Nongaussian Data. Howard Hua Yang and John Moody, NIPS '99.

Contents
- Data visualization: good 2-D projections for interpreting high-dimensional data
- Feature selection: eliminating redundancy
- Joint mutual information (JMI)
- ICA

Introduction
- Visualization of input data and feature selection are intimately related.
- Input variable selection is the most important step in the model selection process.
- We take model-independent approaches that select input variables before the model is specified.
- Data visualization is essential for humans to understand the structural relations among the variables in a system.

Joint mutual information for input/feature selection

Mutual information between an input X and the target Y:

    I(X; Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x) p(y)}

This is the Kullback-Leibler divergence D(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)} between the joint distribution p(x, y) and the product of the marginals p(x) p(y).

Joint mutual information of an input pair (X_i, X_j) with the target:

    I(X_i, X_j; Y) = \sum_{x_i, x_j, y} p(x_i, x_j, y) \log \frac{p(x_i, x_j, y)}{p(x_i, x_j) p(y)}
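
These quantities are simple to estimate for discrete (or discretized) variables. Below is a minimal plug-in estimator sketch in Python/NumPy; the function names are illustrative rather than from the paper, and real-valued inputs would first have to be binned.

    import numpy as np

    def mutual_information(x, y):
        """Plug-in estimate of I(X; Y) in nats for discrete 1-D arrays."""
        xs, x_idx = np.unique(x, return_inverse=True)
        ys, y_idx = np.unique(y, return_inverse=True)
        joint = np.zeros((xs.size, ys.size))
        np.add.at(joint, (x_idx, y_idx), 1.0)   # count co-occurrences
        joint /= joint.sum()                    # p(x, y)
        px = joint.sum(axis=1, keepdims=True)   # p(x)
        py = joint.sum(axis=0, keepdims=True)   # p(y)
        nz = joint > 0                          # avoid log(0)
        return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

    def joint_mutual_information(xi, xj, y):
        """I(X_i, X_j; Y): treat the pair (x_i, x_j) as one discrete variable."""
        pair = np.unique(np.column_stack([xi, xj]), axis=0, return_inverse=True)[1]
        return mutual_information(pair, y)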

Conditional MI

    I(Y; X_i | X_j) = H(Y | X_j) - H(Y | X_i, X_j)

By the chain rule, I(X_i, X_j; Y) = I(X_j; Y) + I(Y; X_i | X_j). When I(Y; X_i | X_j) = 0, the input X_i carries no information about Y beyond what X_j already provides.

Use the joint mutual information, instead of the marginal mutual information, to select inputs for a neural network classifier and for data visualization.
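
Given the estimators sketched above, the conditional MI follows directly from the chain rule; this one-liner is a sketch under the same assumptions, not the paper's implementation.

    def conditional_mutual_information(xi, xj, y):
        """I(Y; X_i | X_j) via the chain rule:
        I(X_i, X_j; Y) = I(X_j; Y) + I(Y; X_i | X_j)."""
        return joint_mutual_information(xi, xj, y) - mutual_information(xj, y)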

Data visualization methods
- Supervised methods based on JMI (cf. CCA)
- Unsupervised methods based on ICA (cf. PCA)
- An efficient method for computing JMI
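
The contrast on this slide is between projections that use only second-order statistics (PCA) and those that exploit the higher-order structure of nongaussian data (ICA). The paper's own algorithms are not reproduced here; as a stand-in, the sketch below compares scikit-learn's PCA and FastICA on synthetic Laplacian data.

    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    rng = np.random.default_rng(0)
    X = rng.laplace(size=(1000, 15))  # synthetic stand-in for 15-D nongaussian data

    # PCA decorrelates (second-order only); FastICA also removes
    # higher-order dependencies, which matters for nongaussian data.
    proj_pca = PCA(n_components=2).fit_transform(X)
    proj_ica = FastICA(n_components=2, random_state=0).fit_transform(X)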

Application to Signal Visualization and Classification
- JMI and visualization of radar pulse patterns
- Each radar pulse pattern is a 15-dimensional vector; there are 3 classes
- Compute the JMIs and select inputs accordingly (a brute-force sketch follows)
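
A brute-force way to carry out the "compute JMIs, select inputs" step, reusing the estimators sketched earlier (inputs discretized first). The exhaustive O(d^2) pair scan is a convenience here, not the paper's efficient JMI method.

    from itertools import combinations

    def rank_pairs_by_jmi(X, y):
        """Score every input pair (i, j) by I(X_i, X_j; Y), highest first."""
        scores = {(i, j): joint_mutual_information(X[:, i], X[:, j], y)
                  for i, j in combinations(range(X.shape[1]), 2)}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)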

Radar pulse classification
- Classifier: a neural network with 7 hidden units
- Experiments: all 15 inputs vs. 4 selected inputs; the 4 inputs with the largest JMI vs. 4 randomly selected inputs
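
The slide does not give the full protocol, so the sketch below is hypothetical throughout: synthetic data in place of the radar pulses, scikit-learn's MLPClassifier with 7 hidden units, and placeholder indices in place of the true largest-JMI inputs.

    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.laplace(size=(600, 15))  # stand-in for 15-D radar pulse features
    y = (X[:, 2] + X[:, 5] > 0).astype(int) + (X[:, 9] > 1)  # 3 synthetic classes

    def cv_accuracy(features, labels):
        clf = MLPClassifier(hidden_layer_sizes=(7,), max_iter=2000, random_state=0)
        return cross_val_score(clf, features, labels, cv=5).mean()

    selected = [2, 5, 9, 13]  # placeholder for the 4 inputs with the largest JMI
    random4 = rng.choice(15, size=4, replace=False)

    print("all 15 inputs  :", cv_accuracy(X, y))
    print("4 JMI-selected :", cv_accuracy(X[:, selected], y))
    print("4 random inputs:", cv_accuracy(X[:, random4], y))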

Conclusions
Advantages of the single JMI measure:
- It can distinguish inputs even when all of them have the same marginal mutual information I(X_i; Y).
- It can eliminate redundancy in the inputs when one input is a function of other inputs.