Generalized and Heuristic-Free Feature Construction for Improved Accuracy Wei Fan ‡, Erheng Zhong †, Jing Peng*, Olivier Verscheure ‡, Kun Zhang §, Jiangtao.

Presentation transcript:

Generalized and Heuristic-Free Feature Construction for Improved Accuracy
Wei Fan ‡, Erheng Zhong †, Jing Peng*, Olivier Verscheure ‡, Kun Zhang §, Jiangtao Ren †, Rong Yan ‡ and Qiang Yang ¶
‡ IBM T. J. Watson Research Center
† Sun Yat-Sen University
* Montclair State University
§ Xavier University of Louisiana
Facebook, Inc
¶ Hong Kong University of Science and Technology
–Construction works when the original pool is not good enough (feature selection won’t work)
–Too many choices to construct
–Evaluate on local space, not always on all the data points
–Better automated

Feature Construction -- Example
XOR-like problem
–Not linearly separable: use both features to construct a “cross” model
–Linearly separable: one feature F3 is enough
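The XOR-like case can be illustrated with a minimal sketch. It assumes the constructed feature F3 is the product of the two original features; the slide does not say which operator produces F3, and the four data points and the helper `threshold_accuracy` are hypothetical.

```python
# Four XOR-like points: the label is 1 when the two features have opposite signs.
points = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
labels = [0, 1, 1, 0]


def threshold_accuracy(values, labels, threshold=0.0):
    """Accuracy of a single threshold on one feature, in the better orientation."""
    predictions = [1 if v < threshold else 0 for v in values]
    correct = sum(p == y for p, y in zip(predictions, labels))
    # Flipping the predicted classes turns `correct` into `len(labels) - correct`.
    return max(correct, len(labels) - correct) / len(labels)


f1 = [x for x, _ in points]
f2 = [y for _, y in points]
# Assumption: F3 is built with the product operator (not stated in the slides).
f3 = [x * y for x, y in points]

acc_f1 = threshold_accuracy(f1, labels)  # 0.5: F1 alone cannot separate the classes
acc_f2 = threshold_accuracy(f2, labels)  # 0.5: F2 alone cannot either
acc_f3 = threshold_accuracy(f3, labels)  # 1.0: one threshold on F3 is enough
```

On this toy data neither original feature is useful by itself, while a single cut on the constructed feature separates the two classes.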

Main Challenges
To address these, we have 3 main steps
1. Too many ways to construct new features: x y, x-y, x/y, etc.
   Solution: Divide and Conquer
2. Insignificant on the whole data set - highly discriminant in local region
   Solution: Local Feature Construction and Evaluation
3. Automated – not based on domain knowledge
   Solution: Automatically adjusted weighting rules
4 binary operators, 1000 original features up to constructed features
F2 not very useful unless considered with F1

Divide-Conquer
Local Feature Construction and Evaluation
Stopping Criteria:
1. The number of instances in the node is smaller than a threshold
2. The node only contains examples from one class
Constructed Features (original + new)
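The divide-and-conquer procedure with its two stopping criteria can be sketched roughly as below. This is not the authors' implementation: the operator set, the zero-division guard, the number of candidates per node (`n_candidates`), the uniform choice of operators, and the names `build_fctree` and `predict` are all assumptions made for illustration. For brevity the sketch only evaluates constructed features, whereas the slide considers original and new features together.

```python
import math
import random
from collections import Counter

# Assumed operator set (the slides only mention "4 binary operators").
OPERATORS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b if b != 0 else 0.0,  # guard is an assumption
}


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (info_gain, threshold) of the best binary split on one feature."""
    base, n = entropy(labels), len(labels)
    best_gain, best_t = 0.0, None
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_gain, best_t


def build_fctree(rows, labels, min_size=2, n_candidates=8, rng=random):
    majority = Counter(labels).most_common(1)[0][0]
    # Stopping criteria: node smaller than a threshold, or only one class left.
    if len(rows) < min_size or len(set(labels)) == 1:
        return {"label": majority}
    best = None
    for _ in range(n_candidates):
        # Construct a candidate feature from two random original features,
        # then evaluate it on the local data of this node only.
        i, j = rng.sample(range(len(rows[0])), 2)
        name = rng.choice(sorted(OPERATORS))
        values = [OPERATORS[name](r[i], r[j]) for r in rows]
        gain, t = best_split(values, labels)
        if t is not None and (best is None or gain > best[0]):
            best = (gain, name, i, j, t, values)
    if best is None:  # no candidate improved purity
        return {"label": majority}
    _, name, i, j, t, values = best
    node = {"op": name, "i": i, "j": j, "t": t}
    for side, keep in (("left", lambda v: v <= t), ("right", lambda v: v > t)):
        idx = [k for k, v in enumerate(values) if keep(v)]
        node[side] = build_fctree([rows[k] for k in idx], [labels[k] for k in idx],
                                  min_size, n_candidates, rng)
    return node


def predict(node, row):
    while "label" not in node:
        value = OPERATORS[node["op"]](row[node["i"]], row[node["j"]])
        node = node["left"] if value <= node["t"] else node["right"]
    return node["label"]


rows = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
labels = [0, 1, 1, 0]
tree = build_fctree(rows, labels, rng=random.Random(0))
```

Each recursive call sees only the instances that reached its node, so a feature that looks weak on the whole data set can still be picked where it is locally discriminative.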

Every node …
1. Random subset of original features
2. “Weighted random” subset of operators
[Diagram with steps (1) to (4); label (2): Weighting Rule]

An operator's weight is proportional to the info-gain of the features constructed by that operator, i.e. the sum of its past info gain.
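A minimal sketch of such a weighting rule follows, assuming each operator keeps a history of the information gains achieved by features it constructed. The small `smoothing` constant (so that an operator with no history can still be drawn), sampling with replacement, and the names `operator_weights` and `sample_operators` are illustrative assumptions rather than details from the slides.

```python
import random


def operator_weights(past_gains, smoothing=1e-3):
    """Normalized weights proportional to each operator's summed past info gain."""
    totals = {op: sum(gains) + smoothing for op, gains in past_gains.items()}
    norm = sum(totals.values())
    return {op: total / norm for op, total in totals.items()}


def sample_operators(past_gains, k, rng=random):
    """Draw k operators at random (with replacement), biased by their weights."""
    weights = operator_weights(past_gains)
    ops = list(weights)
    return rng.choices(ops, weights=[weights[op] for op in ops], k=k)


# Hypothetical history of info gains per operator.
history = {"mul": [0.9, 0.8], "div": [0.2], "add": [0.1], "sub": []}
weights = operator_weights(history)
drawn = sample_operators(history, k=3, rng=random.Random(0))
```

After each node is built, appending the gains of the newly constructed features to the history updates the weights for the child nodes, so operators that worked well earlier are drawn more often without any domain knowledge.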

Properties
–Number of features is bounded.
–A highly weighted operator is expected to perform better in its two child nodes (see paper)
–FCTree’s error is bounded.
  –also explains why the features are of high quality

Experiment – Data Set
–UCI repository (Balanced)
–Caltech-256 database: An image database of 256 object categories. Each category is processed via a 177-dimensional color correlogram (Balanced)
–Landmine collection: Collected via remote sensing techniques (Skewed)
–Nuclear Ban data source: A nuclear explosion detection problem used in the ICDM’08 contest (Skewed)

Experiment -- Baseline methods
Original Features
TFC:
–enumerates all possible features generated by operators
NB, SVM and C4.5
Operators
FCTree:

Performance -- Balanced Data
Best in 23 out of 33 comparisons

Performance -- Skewed Data
Best in 25 out of 33 comparisons

Scalability Analysis

Strength of Weighting Rule

[Figure panels: Original, FCTree; data: 177-dimensional color correlogram]

Conclusion
Key points
–Divide-conquer to avoid exhaustive enumeration
–Local feature construction and subspace evaluation
–Weighting-rule-based search: domain knowledge free, with provable performance
Code and data available from the authors