CMU at TDT 2004: Novelty Detection
Jian Zhang and Yiming Yang
Carnegie Mellon University

Outline
Novelty Detection Approaches
  – VSM + Clustering Techniques
  – Learning with multiple features
  – Support Vector Machines
  – Non-parametric Bayesian method
TDT 2004 Evaluation Results

VSM + Clustering Techniques
Vector Space Model (traditional IR technique)
  – Documents are represented as vectors
  – TF-IDF term weighting is used, and a similarity measure is chosen (e.g. cosine)
Clustering
  – Lookahead window: GAC clustering
      Slight improvement (approx. 3%)
  – Past window: incremental clustering
      No improvement
Reference: CMU TDT 2002/2003 report
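A minimal sketch of the vector-space idea, assuming a plain nearest-neighbour rule: each story is scored by one minus its highest cosine similarity to any earlier story. This is not the GAC or incremental clustering used by CMU. The function names, the toy stories, and the smoothed IDF computed over the whole toy corpus (rather than incrementally) are illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized story."""
    n = len(docs)
    # Document frequency: number of stories containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: log-scaled TF times a smoothed IDF.
        vectors.append(
            {t: (1 + math.log(c)) * math.log((n + 1) / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def novelty_scores(docs):
    """Score each story as 1 - max cosine similarity to any earlier story."""
    vectors = tfidf_vectors(docs)
    scores = []
    for i, vec in enumerate(vectors):
        best = max((cosine(vec, prev) for prev in vectors[:i]), default=0.0)
        scores.append(1.0 - best)
    return scores


# Hypothetical toy stories, in arrival order.
stories = [
    "earthquake hits coastal city".split(),
    "coastal city earthquake damage reported".split(),
    "election results announced today".split(),
]
scores = novelty_scores(stories)
```

A story would be flagged as a new event when its score exceeds a threshold tuned on held-out data; the second toy story scores lower than the other two because it shares terms with the first.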

Supervised learning with informative features
Convert novelty detection to a supervised learning task
  – Positive/negative data: novel/non-novel stories
Build the model
  – Choose a learning algorithm (logistic regression, SVM, etc.)
  – Try to select good features (features we tried: cosine score, cluster size, time stamp)
Gave better story-weighted results in past evaluations
Reference: CMU TDT 2002/2003 report
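The supervised formulation can be sketched with a small logistic regression trained by gradient descent. The three feature columns mirror the ones named on the slide (cosine score, cluster size, time stamp), but the scaling of each feature, the toy values, and the function names are assumptions made for illustration, not the features or data used in the CMU system.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias stored last) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            # zip stops at len(xi), so the bias w[-1] is added separately.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict_novel(w, x):
    """Estimated probability that a story with features x is novel."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


# Hypothetical features per story, all scaled to [0, 1]:
# [max cosine to past stories, size of nearest cluster, time gap to nearest story]
X = [
    [0.10, 0.05, 0.9],
    [0.15, 0.10, 0.8],
    [0.20, 0.00, 0.7],
    [0.80, 0.60, 0.1],
    [0.70, 0.90, 0.2],
    [0.90, 0.70, 0.0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = novel story, 0 = non-novel story

weights = train_logistic(X, y)
p_new = predict_novel(weights, [0.12, 0.05, 0.85])
p_old = predict_novel(weights, [0.85, 0.80, 0.10])
```

On this toy data the learned weight on the cosine score is negative, which matches the intuition that a story closely resembling past stories is unlikely to be novel.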

Support Region Estimation with SVM
Treat the problem as “density estimation”
Use one-class Support Vector Machines
  – One of the best-performing supervised learning algorithms
  – Suitable for high-dimensional, sparse data
  – Has been successfully applied to novelty detection in handwritten digits
Performance in novelty detection task
  – Worse than “VSM + clustering”
  – Unsupervised kernel selection is hard
[Reference]

Non-parametric Bayesian method (Dirichlet Process Mixture Model)
Density estimation method in statistics
  – Converges to the empirical distribution asymptotically
  – Recently applied in the machine learning and bioinformatics communities
Advantages:
  – Handles an increasing number of clusters
  – Probabilistic interpretation
Performance:
  – Comparable to “VSM + clustering”
  – More expensive
[Reference]
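One way to see how a Dirichlet process mixture handles a growing number of clusters is its predictive rule, often described as the Chinese restaurant process: a new story joins existing cluster k with prior probability n_k / (n + alpha) and opens a new cluster with probability alpha / (n + alpha). The sketch below combines that prior with per-cluster likelihoods supplied by the caller. The function names and the numeric likelihood values are illustrative assumptions; the slide does not specify the likelihood model.

```python
def crp_prior(cluster_sizes, alpha):
    """Prior over assignments: one entry per existing cluster, new cluster last."""
    n = sum(cluster_sizes)
    probs = [size / (n + alpha) for size in cluster_sizes]
    probs.append(alpha / (n + alpha))
    return probs


def novelty_posterior(cluster_sizes, alpha, likelihoods, new_cluster_likelihood):
    """Posterior probability that the incoming story starts a new cluster.

    likelihoods[k] is p(story | cluster k); new_cluster_likelihood is the
    marginal likelihood of the story under the base distribution.
    """
    prior = crp_prior(cluster_sizes, alpha)
    all_likelihoods = list(likelihoods) + [new_cluster_likelihood]
    joint = [p * l for p, l in zip(prior, all_likelihoods)]
    return joint[-1] / sum(joint)


# Hypothetical state: two clusters holding 3 and 1 stories, alpha = 1.
prior = crp_prior([3, 1], alpha=1.0)
p_novel = novelty_posterior([3, 1], 1.0, [0.001, 0.002], 0.01)
```

The posterior of the "new cluster" option doubles as a novelty score, which is the probabilistic interpretation listed among the advantages.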

TDT 2004 NED Task
Dataset
  – Large size: around 280k documents
  – Time period: Apr to Sep (6 months)
Submitted method:
  – VSM + GACINCR clustering
System parameter tuning:
  – Use TDT3 corpus

TDT 2004 NED Results
[Figure: New Event Detection Cost]

TDT 2004 NED DET-Curve
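For readers unfamiliar with these evaluation plots, the sketch below shows how a detection cost and the points of a DET curve can be computed from novelty scores. It assumes the cost constants commonly used in TDT evaluations (C_miss = 1, C_fa = 0.1, P_target = 0.02); the slides do not state them, and the scores and labels are made-up toy values.

```python
def error_rates(scores, is_novel, threshold):
    """Return (P_miss, P_fa) when stories scoring >= threshold are flagged as novel."""
    targets = [s for s, y in zip(scores, is_novel) if y]
    non_targets = [s for s, y in zip(scores, is_novel) if not y]
    p_miss = sum(s < threshold for s in targets) / len(targets)
    p_fa = sum(s >= threshold for s in non_targets) / len(non_targets)
    return p_miss, p_fa


def normalized_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Detection cost divided by the cost of the better trivial system."""
    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))


def det_points(scores, is_novel):
    """(P_miss, P_fa) pairs obtained by sweeping the threshold over observed scores."""
    return [error_rates(scores, is_novel, t) for t in sorted(set(scores))]


# Hypothetical novelty scores and ground-truth labels.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
is_novel = [True, False, True, False, False, False]

p_miss, p_fa = error_rates(scores, is_novel, threshold=0.5)
cost = normalized_cost(p_miss, p_fa)
curve = det_points(scores, is_novel)
```

With these constants the normalized cost reduces to P_miss + 4.9 * P_fa, so a system that never flags a story as novel scores exactly 1.0. A DET curve plots the swept (P_fa, P_miss) pairs, conventionally on normal-deviate axes.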

References
CMU TDT 2001/2002/2003 report.
Yiming Yang, Tom Pierce and Jaime Carbonell. A Study on Retrospective and Online Event Detection. SIGIR.
Yiming Yang, Jian Zhang, Jaime Carbonell and Chun Jin. Topic-conditioned Novelty Detection. SIGKDD.
Jian Zhang, Yiming Yang and Jaime Carbonell. New Event Detection with Nearest Neighbor, Support Vector Machines and Kernel Regression. CMU Tech. Report CMU-CS (CMU-LTI ).
Jian Zhang, Zoubin Ghahramani and Yiming Yang. A Probabilistic Model for Online Document Clustering with Applications to Novelty Detection. NIPS 2004.