
Margin Based Sample Weighting for Stable Feature Selection
Yue Han, Lei Yu
State University of New York at Binghamton

Outline
- Introduction
- Related Work
- Hypothesis-Margin Feature Space Transformation
- Margin Based Sample Weighting
- Experimental Study
- Conclusion and Future Work

Introduction
Data: features (genes or proteins) by samples; p: # of features, n: # of samples.
High-dimensional data: p >> n.
Feature Selection:
- Alleviating the effect of the curse of dimensionality.
- Enhancing generalization capability.
- Speeding up the learning process.
- Improving model interpretability.
[Figure: pipeline from high-dimensional data through feature selection (filter or wrapper) to dimension-reduced data and a learning model, illustrated with a document-term matrix of documents D1…DM, terms T1…TN, and class labels C (e.g., Sports, Travel, Jobs).]

Cont’s D1 D2 Features Samples Given Unlimited Sample Size of D: Feature selection results from D1 and D2 are the same Size of D is limited(n<<p for high dimensional data) Feature selection results from D1 and D2 are different Increasing #of samples could be very costly or impractical Stability of feature selection - the insensitivity of the result of a feature selection algorithm to variations in the training set. Identifying characteristic markers to explain the observed phenomena

Related Work
Bagging-based Ensemble Feature Selection (Saeys et al., ECML 2007):
- Draw different bootstrapped samples of the same training set;
- Apply a conventional feature selection algorithm to each;
- Aggregate the feature selection results (a minimal sketch follows).
Group-based Stable Feature Selection (Yu et al., KDD 2008, KDD 2009):
- Explore the intrinsic feature correlations;
- Identify groups of correlated features;
- Select relevant feature groups.
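
To make the bagging-based approach concrete, here is a minimal sketch (not the authors' exact procedure); the function name and the aggregation-by-mean-rank choice are assumptions, and rank_fn stands in for any conventional feature ranker.

```python
import numpy as np

def ensemble_feature_ranking(X, y, rank_fn, n_bags=20, seed=0):
    """Bagging-based ensemble feature selection: rank features on
    bootstrapped copies of the training set, then aggregate by mean rank.
    rank_fn(X, y) must return ranks[i] = rank of feature i (0 = best)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    rank_sum = np.zeros(p)
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)  # bootstrap: n rows drawn with replacement
        rank_sum += rank_fn(X[idx], y[idx])
    return np.argsort(rank_sum)           # final ordering: best mean rank first
```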

Hypothesis-Margin Feature Space Transformation
A framework of margin based sample weighting for stable feature selection:
- Introduce the concept of the hypothesis-margin feature space;
- Propose the framework of margin based sample weighting for stable feature selection;
- Develop an efficient algorithm under the proposed framework.

Hypothesis-Margin Feature Space Transformation
The hypothesis margin (HM) of a sample X is defined by its nearest neighbor of the same class (nearest hit) and its nearest neighbor of a different class (nearest miss).
X' captures the local profile of feature importance for all features at X.
Multiple nearest neighbors can be used to compute the HM of a sample.
[Figure: a sample X with its nearest hit and nearest miss in the original feature space.]
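
As a reference point, the standard hypothesis-margin definition from the Relief/Simba literature is shown below; the component-wise form is one natural reading of the per-feature profile X' described on this slide, not necessarily the paper's exact notation.

```latex
% Hypothesis margin of a sample x, with nearest hit H(x) and nearest miss M(x):
\theta(x) = \tfrac{1}{2}\bigl(\lVert x - M(x)\rVert - \lVert x - H(x)\rVert\bigr)

% Component-wise profile: the contribution of feature i to the margin at x.
x'_i = \lvert x_i - M(x)_i \rvert - \lvert x_i - H(x)_i \rvert
```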

Cont’s Hypothesis-margin based feature space transformation: (a) original feature space, and (b) hypothesis-margin (HM) feature space.

Margin Based Sample Weighting
Captures the discrepancy among samples w.r.t. their local profiles of feature importance (the HM feature space).
Measure the average distance of X' to all other samples in the HM feature space; a greater average distance indicates a higher outlying degree.
Overall time complexity: O(n^2 q), where n is the number of samples and q is the dimensionality of D.
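
A sketch of the weighting step under stated assumptions: the slide specifies the average-distance measure of outlyingness, but the exact mapping from outlying degree to sample weight is an assumption here (outlying samples are downweighted).

```python
import numpy as np

def margin_based_weights(X_hm):
    """Weight samples by their outlying degree in the HM feature space:
    samples whose local feature-importance profile is far from the rest
    receive lower weight."""
    n = X_hm.shape[0]
    dist = np.linalg.norm(X_hm[:, None, :] - X_hm[None, :, :], axis=2)  # O(n^2 q)
    outlying = dist.sum(axis=1) / (n - 1)  # average distance to all other samples
    w = 1.0 / (1.0 + outlying)             # assumed mapping: more outlying, less weight
    return w / w.sum() * n                 # normalize to mean weight 1
```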

Experimental Study
Stability metrics: the stability of a feature selection algorithm is measured as the average pairwise similarity of the feature selection results produced by the same algorithm from different training sets, with similarity defined over:
- Feature ranking
- Feature subset selection
- Feature correlation
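
In formula form (a generic statement of the metric; the specific similarity function depends on whether rankings, subsets, or correlations are compared):

```latex
% Stability over m selection results R_1, ..., R_m from different training sets:
S = \frac{2}{m(m-1)} \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \mathrm{sim}(R_i, R_j)
```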

Cont’s Experimental Setup SVM-RFE: 10 percent of remaining features eliminated at each iteration. En-RFE: 20 bootstrapped training sets to construct the ensemble. IW-RFE: k = 10 for hypothesis margin transformation. 10tims shuffling and 10 fold cross-validation to generate 100 datasets.

Results: consistent improvement in the stability of feature selection results under all of the different stability measures.

Results: the different feature selection algorithms lead to similarly good classification results, i.e., the stability gains do not come at the cost of predictive accuracy.

Conclusion and Future Work
Conclusion:
- Introduced the concept of the hypothesis-margin feature space.
- Proposed the framework of margin based sample weighting for stable feature selection.
- Developed an efficient algorithm under the framework.
Future work:
- Investigate alternative methods of sample weighting based on the HM feature space.
- Explore strategies to combine margin based sample weighting with group-based stable feature selection.

Questions? Thank you!