A REVIEW OF FEATURE SELECTION METHODS WITH APPLICATIONS
Alan Jović, Karla Brkić, Nikola Bogunović {alan.jovic, karla.brkic,
Faculty of Electrical Engineering and Computing, University of Zagreb
Department of Electronics, Microelectronics, Computer and Intelligent Systems

CONTENT
- Motivation
- Problem statement
- Classification of FS methods
- Application domains
- Conclusion

MOTIVATION
- Data pre-processing often requires feature set reduction:
  - Too many features for modeling tools to find the optimal model
  - The feature set may not fit into memory (big datasets, streaming features)
  - Many features may be irrelevant or redundant
- Few review papers are available on the subject:
  - Mostly focused on specific topics (e.g. classification, clustering)
  - Application domains are not discussed in detail

PROBLEM STATEMENT
- Effectively, there are four classes of features:
  - Strongly relevant – cannot be removed without affecting the original conditional target distribution; necessary for the optimal model
  - Weakly relevant, but not redundant – may or may not be necessary for the optimal model
  - Irrelevant – not necessary to include; do not affect the original conditional target distribution
  - Redundant – can be completely replaced with a set of other features such that the target distribution is not disturbed (redundancy is always inspected in the multivariate case)
- Goal: develop methods that keep only the strongly and weakly relevant features and remove all the rest
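As a rough illustration of how relevance and redundancy can be estimated in practice (not part of the original slides), the sketch below scores each feature's relevance to the target with mutual information and flags near-perfectly correlated feature pairs as redundancy candidates; the synthetic dataset and the correlation threshold are assumptions.

```python
# Illustrative sketch only: estimating feature relevance and redundancy.
# The synthetic dataset and the 0.95 correlation threshold are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=500, n_features=8,
                           n_informative=3, n_redundant=2, random_state=0)

# Relevance: mutual information between each feature and the target.
relevance = mutual_info_classif(X, y, random_state=0)
print("relevance scores:", np.round(relevance, 3))

# Redundancy (very crude proxy): feature pairs with near-perfect correlation.
corr = np.corrcoef(X, rowvar=False)
redundant_pairs = [(i, j) for i in range(X.shape[1])
                   for j in range(i + 1, X.shape[1])
                   if abs(corr[i, j]) > 0.95]
print("candidate redundant pairs:", redundant_pairs)
```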

CLASSIFICATION OF FEATURE SELECTION METHODS
- Feature extraction (transformation)
  - E.g. PCA, LDA, MDS... (not our focus)
- Feature selection
  - Filters
  - Wrappers
  - Embedded
  - Hybrid
  - Structured features
  - Streaming features

FILTERS
- Select features based on a performance measure, regardless of the employed data modeling algorithm
- Many performance measures described in the literature
- Fast, but not as accurate as wrappers
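For instance, a univariate filter can rank features by a statistic such as the ANOVA F-score and keep the top-scoring ones without ever training the downstream model. A minimal sketch with scikit-learn; the dataset and k = 10 are assumptions:

```python
# Minimal filter example: rank features by ANOVA F-score, keep the top k.
# Purely illustrative; the synthetic dataset and k=10 are assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=300, n_features=50,
                           n_informative=5, random_state=0)

selector = SelectKBest(score_func=f_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print("kept feature indices:", selector.get_support(indices=True))
print("reduced shape:", X_reduced.shape)
```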

WRAPPERS
- Evaluate feature subsets by the performance of a modeling algorithm, which is treated as a black-box evaluator
- The evaluation is repeated for each feature subset
- Very slow, highly accurate
- Dependent on the modeling algorithm; may introduce bias
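One common wrapper strategy is recursive feature elimination, which repeatedly fits the chosen model and discards the weakest features until the desired subset size is reached. The sketch below uses logistic regression as the black-box evaluator; the model choice and the target subset size are assumptions:

```python
# Wrapper sketch: recursive feature elimination (RFE) around a logistic regression.
# The evaluator model and the number of features to keep are assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=30,
                           n_informative=5, random_state=0)

estimator = LogisticRegression(max_iter=1000)
rfe = RFE(estimator=estimator, n_features_to_select=5, step=1)
rfe.fit(X, y)

print("selected feature indices:", rfe.get_support(indices=True))
```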

EMBEDDED METHODS
- Perform feature selection during the modeling algorithm's execution; the selection is embedded in the algorithm either as its normal or as an extended functionality
- Also biased toward the modeling algorithm
- E.g. CART, C4.5, random forest, multinomial logistic regression, Lasso...
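As a concrete case, L1-regularized (Lasso) regression drives some coefficients exactly to zero during training, so the surviving non-zero coefficients act as the selected feature set; random forests similarly expose feature importances as a by-product of fitting. A brief sketch of the Lasso case (synthetic data and the regularization strength are assumptions):

```python
# Embedded-method sketch: Lasso zeroes out some coefficients while fitting,
# so the non-zero coefficients form the selected feature subset.
# The synthetic data and alpha=0.5 are assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=40,
                       n_informative=5, noise=0.1, random_state=0)

lasso = Lasso(alpha=0.5).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print("features kept by Lasso:", selected)
```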

HYBRID METHODS
- Combine the best properties of filters and wrappers
- Usual approach:
  - First, a filter method is used to reduce the dimension of the feature space, possibly obtaining several candidate subsets
  - Then, a wrapper is employed to find the best candidate subset
- Widely used in recent years
- E.g. fuzzy random forest feature selection, hybrid genetic algorithms, mixed gravitational search algorithm...
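The filter-then-wrapper idea can be prototyped as a simple two-stage pipeline: a cheap filter prunes the feature space, then a wrapper searches the remaining candidates. The sketch below is a generic illustration of that pattern, not one of the published hybrid algorithms named above; all subset sizes are assumptions.

```python
# Hybrid (filter + wrapper) sketch: SelectKBest prunes to 20 candidates,
# then RFE with a logistic-regression evaluator picks the final 5.
# Generic illustration only, not a specific published hybrid method.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=400, n_features=100,
                           n_informative=8, random_state=0)

hybrid = Pipeline([
    ("filter", SelectKBest(score_func=mutual_info_classif, k=20)),
    ("wrapper", RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)),
])
hybrid.fit(X, y)
print("final subset size:", hybrid.named_steps["wrapper"].n_features_)
```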

STRUCTURED AND STREAMING FEATURES
- Structured feature selection methods assume that an internal structure (dependency) exists between features (groups, trees, graphs...)
  - Algorithms are mostly based on Lasso regularization
- Streaming feature selection methods assume that features of unknown number and size arrive in the dataset periodically and need to be considered or rejected for model construction
  - Many approaches in recent years, particularly popular for modeling text messages in social networking
  - E.g. the Grafting algorithm, the Alpha-Investing algorithm, the OSFS algorithm
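To make the streaming setting concrete, the sketch below uses a deliberately simplified acceptance rule: each newly arriving feature is kept only if it improves the cross-validated accuracy of the currently selected set by a small margin. This is an illustrative toy, not the Grafting, Alpha-Investing, or OSFS algorithm; the data, model, and improvement margin are assumptions.

```python
# Toy streaming feature selection sketch: accept a newly arrived feature only
# if it improves cross-validated accuracy over the current selection.
# Illustrative only -- not Grafting, Alpha-Investing, or OSFS.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=25,
                           n_informative=5, random_state=0)

selected, best_score, margin = [], 0.0, 0.005
for j in range(X.shape[1]):          # features "arrive" one at a time
    candidate = selected + [j]
    score = cross_val_score(LogisticRegression(max_iter=1000),
                            X[:, candidate], y, cv=5).mean()
    if score > best_score + margin:  # keep only if it adds predictive value
        selected, best_score = candidate, score

print("accepted features:", selected, "cv accuracy:", round(best_score, 3))
```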

APPLICATION DOMAINS

CONCLUSIONS OF THE REVIEW
- Hybrid FS methods show the best results, particularly those based on evolutionary computation heuristics such as swarm intelligence and various genetic algorithms
- Filters based on information theory and wrappers based on greedy stepwise approaches also show strong results
- Application of FS methods is important in areas such as bioinformatics, image processing, industrial applications and text mining, where high-dimensional feature spaces are present – these application areas are mostly the drivers for the development of advanced FS methodologies