OCFS: Optimal Orthogonal Centroid Feature Selection for Text Categorization. Jun Yan, Ning Liu, Benyu Zhang, Shuicheng Yan, Zheng Chen, and Weiguo Fan et al.

OCFS: Optimal Orthogonal Centroid Feature Selection for Text Categorization
Jun Yan, Ning Liu, Benyu Zhang, Shuicheng Yan, Zheng Chen, and Weiguo Fan et al.
Microsoft Research Asia, Peking University, Tsinghua University, Chinese University of Hong Kong, Virginia Polytechnic Institute and State University

Outline Motivation Problem formulation Related works The OCFS algorithm Experiments Conclusion and future work

Motivation
Dimension reduction (DR) is highly desirable for web-scale text data.
DR can improve both efficiency and effectiveness.
Feature selection (FS) is more applicable than feature extraction (FE).
Most FS algorithms are greedy.
Goal: a simple, effective, efficient, and optimal FS algorithm.

Outline Motivation Problem formulation Related works The OCFS algorithm Experiments Conclusion and future work

Problem Formulation
Dimension reduction: map the input space R^d to a lower-dimensional space R^p with p << d.
Consider the linear case: suppose y = W^T x, where x ∈ R^d, W ∈ R^{d×p}, and y ∈ R^p.

Problem Formulation
We denote the discrete solution space as H, the set of d×p selection matrices whose columns are distinct standard basis vectors.
Given a set of labeled training documents X, learn a transformation matrix W ∈ H such that it is optimal according to some criterion J in this space.
The problem is, FS: W* = arg max_{W ∈ H} J(W).

Outline Motivation Problem formulation Related works The OCFS algorithm Experiments Conclusion and future work

Related Works – IG
Information gain (IG) aims to select a group of optimal features by scoring each term t:
G(t) = −Σ_i P(c_i) log P(c_i) + P(t) Σ_i P(c_i | t) log P(c_i | t) + P(t̄) Σ_i P(c_i | t̄) log P(c_i | t̄)
Finding the globally optimal feature subset is NP-hard, so IG is computed greedily, ranking features one at a time.
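Information gain here is the standard text-feature-selection criterion G(t) = H(C) − P(t)H(C|t) − P(t̄)H(C|t̄). A minimal NumPy sketch of that criterion (an illustrative implementation, not the authors' code; function names are mine):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a probability vector."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def info_gain(present, y):
    """G(t) = H(C) - P(t) H(C|t) - P(~t) H(C|~t) for a boolean
    term-presence vector `present` and class labels `y`."""
    classes = np.unique(y)

    def class_dist(mask):
        # class distribution among the documents selected by `mask`
        if int(mask.sum()) == 0:
            return np.zeros(len(classes))
        return np.array([(y[mask] == c).mean() for c in classes])

    h_prior = entropy(class_dist(np.ones(len(y), dtype=bool)))
    pt = float(present.mean())
    return (h_prior
            - pt * entropy(class_dist(present))
            - (1 - pt) * entropy(class_dist(~present)))
```

A term that perfectly predicts the class of a balanced two-class set scores one full bit; an independent term scores zero.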

Related Works – CHI
CHI aims to select a group of features by the χ² statistic, which measures term–class dependence:
χ²(t, c) = N (AD − CB)² / [(A + C)(B + D)(A + B)(C + D)]
where A, B, C, D count documents by presence of term t crossed with membership in class c, and N is the total number of documents; a term's final score aggregates χ²(t, c) over classes, typically by maximum or average.
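The χ² criterion is conventionally computed from the 2×2 contingency table of term presence versus class membership. A hedged NumPy sketch of that statistic (names are mine, not the authors' code):

```python
import numpy as np

def chi2_score(present, in_class):
    """chi2(t, c) = N (AD - CB)^2 / [(A+C)(B+D)(A+B)(C+D)] for boolean
    vectors of term presence and class membership over N documents."""
    n = len(present)
    a = int(np.sum(present & in_class))    # term present, in class
    b = int(np.sum(present & ~in_class))   # term present, not in class
    c = int(np.sum(~present & in_class))   # term absent, in class
    d = int(np.sum(~present & ~in_class))  # term absent, not in class
    den = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / den if den else 0.0
```

A term perfectly correlated with the class gets the maximum score N; an independent term scores zero.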

Outline Motivation Problem formulation Related works The OCFS algorithm Experiments Conclusion and future work

Orthogonal Centroid Algorithm
Orthogonal Centroid (OC) is a feature extraction algorithm, effective for dimension reduction in text classification problems.
Its computation is based on QR matrix decomposition.
Theorem: the solution of the OC algorithm equals the solution of the optimization problem
W* = arg max_{W^T W = I} trace(W^T S_b W)
where S_b = Σ_j (n_j / n)(m_j − m)(m_j − m)^T is the between-class scatter matrix, m_j the centroid of class j with n_j documents, and m the global centroid of all n documents.

Intuition of Our Work
OC from the FE perspective: maximize J(W) = trace(W^T S_b W) over matrices W with orthonormal columns, where S_b is the between-class scatter matrix.

Intuition of Our Work
OC from the FS perspective: how can the same criterion J be optimized in the discrete space? FE optimizes J over continuous orthonormal matrices; FS optimizes J over 0/1 selection matrices.

The OCFS Algorithm
FS problem (toy case): suppose we want to preserve the m-th and n-th features of the input and discard the others; then each column of W is a standard basis vector, with nonzero entries only in rows m and n.

The OCFS Algorithm
Optimization: for a selection matrix, the criterion decomposes feature by feature:
J(K) = Σ_{i ∈ K} Σ_j (n_j / n)(m_j^i − m^i)²
where K is the set of selected feature indices, m_j^i is the i-th component of the centroid of class j, and m^i is the i-th component of the global centroid.

The OCFS Algorithm
Solution: score each feature i by s(i) = Σ_j (n_j / n)(m_j^i − m^i)² and keep the p features with the largest scores.
OCFS: compute the class centroids and the global centroid, score every feature, and select the top p.

The OCFS Algorithm
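The selection rule can be sketched in NumPy: score each feature by the weighted squared distance between class centroids and the global centroid, then keep the p best. This is an illustrative reimplementation under that reading, not the authors' code:

```python
import numpy as np

def ocfs_scores(X, y):
    """Score feature i as sum_j (n_j / n) * (m_j[i] - m[i])**2, the
    weighted squared distance between each class centroid and the
    global centroid, taken per feature."""
    n, d = X.shape
    m = X.mean(axis=0)                        # global centroid
    scores = np.zeros(d)
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                  # class centroid
        scores += (Xc.shape[0] / n) * (mc - m) ** 2
    return scores

def ocfs_select(X, y, p):
    """Indices of the p features with the largest OCFS scores."""
    return np.argsort(ocfs_scores(X, y))[::-1][:p]
```

On a term-document matrix X with labels y, `ocfs_select(X, y, p)` returns the column indices to keep; a constant feature scores exactly zero.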

Algorithm Analysis
The number of selected features p is chosen subject to an energy criterion: E(p), the fraction of the total feature-score mass captured by the top p features, must exceed a preset threshold.
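One plausible reading of the energy criterion (an assumption, since the slide's formula is not fully specified here) is to pick the smallest p whose top-p scores retain a fraction alpha of the total score mass:

```python
import numpy as np

def choose_p(scores, alpha=0.9):
    """Smallest p such that the top-p scores hold at least a fraction
    `alpha` of the total score mass."""
    s = np.sort(scores)[::-1]        # scores in descending order
    energy = np.cumsum(s) / s.sum()  # E(p) for p = 1..d
    return int(np.searchsorted(energy, alpha) + 1)
```

For example, with scores (4, 3, 2, 1), retaining 90% of the mass requires the top three features.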

Algorithm Analysis
Complexity: the time complexity is O(cd) for c classes and d features.
OCFS computes only simple squares, instead of costlier functions such as the logarithms required by IG.

Outline Motivation Problem formulation Related works The OCFS algorithm Experiments Conclusion and future work

Experiments Setup
Datasets:
20 Newsgroups (5 classes; 5,000 documents; 131,072 dimensions)
Reuters Corpus Volume 1 (4 classes; 800,000 documents; 500,000 dimensions)
Open Directory Project (13 classes)
Baselines: IG and CHI
Performance measures: F1 and CPU runtime
Classifier: SVM (SMO)

Experimental Results – 20NG (figures: F1 and CPU runtime on 20NG)

Experimental Results – 20NG

Experimental Results – RCV1 (figures: F1 and CPU runtime on RCV1)

Experimental Results – ODP (figure: F1 on ODP)

Results Analysis
OCFS outperforms IG and CHI while using only about half their runtime.
The gain is largest when the selected dimension is small: with few features, the optimal method clearly beats the greedy ones.
As more features are selected, feature saturation makes additional features less valuable, so the gap narrows.

Outline Motivation Problem formulation Related works The OCFS algorithm Experiments Conclusion and future work

Conclusion
We proposed a novel, efficient, and effective feature selection algorithm for text categorization.
Main advantages: it is optimal under its criterion, delivers better classification performance, and is more efficient than the baselines.

Future Work
Handle unbalanced data.
Combine with other approaches, e.g., OCFS + PCA.

The End Thanks! Q & A