Published by Scott Morton. Modified over 9 years ago.
1
Transductive Inference for Text Classification using Support Vector Machines - Thorsten Joachims (1999)
University of Seoul, School of Electrical and Computer Engineering, Data Mining Lab, G201149027, Junho Noh (노준호)
2
Table of Contents
- Introduction
- Text Classification
- Transductive Support Vector Machines
- Experiments
3
Introduction
Text classification (using SVMs) can be used to:
- organize document databases
- filter spam
- learn users' newsreading preferences
Problem: little training data, large test set
Solution: transductive inference (semi-supervised learning)
4
Text Classification
Text classification using machine learning:
1. Learn a classifier from examples
2. The classifier then assigns categories automatically
Documents are strings of characters (feature: word)
Information retrieval (IR) research suggests that:
- word stems work well: computes, computing, computer → comput
- word ordering can be ignored
5
Text Classification - Representing text as a feature vector
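The representation described on the two slides above (word stems as features, word order ignored) can be sketched in a few lines. This is a minimal illustration, not the preprocessing used in the paper: `crude_stem`, `bag_of_words`, and the suffix list are hypothetical names and choices made for this example, and a real system would use a proper stemmer such as Porter's.

```python
from collections import Counter

# Hypothetical suffix list, chosen only so that the slide's example works.
# A real system would use a proper stemming algorithm instead.
SUFFIXES = ("ing", "es", "er", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text):
    """Map a document to stem counts; word order is deliberately discarded."""
    return Counter(crude_stem(token) for token in text.split())


vec = bag_of_words("computer computes computing text")
print(vec)
```

All three variants of "compute" collapse onto the single feature `comput`, and two documents containing the same words in a different order receive the same vector.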
6
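Text Classification: representation of text
TF-IDF
- TF (term frequency): the number of times a word occurs in a document
- IDF (inverse document frequency): IDF(w) = log(n / DF(w))
- n: total number of documents; DF(w): number of documents in which word w occurs
- The IDF of a word is low if the word occurs in many documents
- The IDF of a word is highest if the word occurs in only one document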
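A small sketch of the TF-IDF weighting may help make the two IDF properties concrete. It assumes raw counts for TF and IDF(w) = log(n / DF(w)); the function name `tfidf_vectors` and the toy documents are made up for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (idf, vectors) for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {word: math.log(n / df[word]) for word in df}
    # Each document becomes a sparse vector {word: TF * IDF}.
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({word: tf[word] * idf[word] for word in tf})
    return idf, vectors


# Toy corpus (hypothetical data): "svm" occurs everywhere, "margin" only once.
docs = [["svm", "text", "text"], ["svm", "margin"], ["svm", "spam"]]
idf, vectors = tfidf_vectors(docs)
print(idf["svm"], idf["margin"])
```

A word that occurs in every document gets IDF log(n / n) = 0, while a word that occurs in a single document gets the maximum value log(n).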
7
Transductive Support Vector Machines
SVM
- Minimize: [formula not preserved in this transcript]
- Subject to: [formula not preserved in this transcript]
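For orientation, the standard soft-margin SVM training problem is written out below. That the slide showed exactly this form (rather than, say, the hard-margin version without slack variables) is an assumption. Here $(\vec{x}_i, y_i)$, $i = 1, \dots, n$, are the training examples with labels $y_i \in \{-1, +1\}$, $\xi_i$ are slack variables, and $C$ trades off margin size against training error.

```latex
\begin{aligned}
\text{minimize over } (\vec{w}, b, \vec{\xi}): \quad
  & \tfrac{1}{2}\lVert \vec{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to:} \quad
  & y_i \,(\vec{w} \cdot \vec{x}_i + b) \ge 1 - \xi_i, \qquad
    \xi_i \ge 0, \qquad i = 1, \dots, n
\end{aligned}
```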
8
Transductive Support Vector Machines
TSVM
Figure (panels: SVM, TSVM): training examples are marked +/-, test examples are shown as dots.
9
Transductive Support Vector Machines
TSVM
- * : marks test data
- C : parameter trading off margin size against training error
- ξ : slack variables measuring the degree of misclassification of the data
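The symbols listed above fit the soft-margin TSVM training problem given in Joachims (1999), reproduced here in standard notation as a reading aid; whether the slide displayed it in exactly this form is an assumption. Starred quantities belong to the $k$ test examples, whose labels $y_j^{*}$ are themselves optimization variables. $C^{*}$, the weight on test-example slack, comes from the paper's formulation and is not mentioned on the slide.

```latex
\begin{aligned}
\text{minimize over } (y_1^{*}, \dots, y_k^{*}, \vec{w}, b, \vec{\xi}, \vec{\xi}^{*}): \quad
  & \tfrac{1}{2}\lVert \vec{w} \rVert^{2}
    + C \sum_{i=1}^{n} \xi_i
    + C^{*} \sum_{j=1}^{k} \xi_j^{*} \\
\text{subject to:} \quad
  & y_i \,(\vec{w} \cdot \vec{x}_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n \\
  & y_j^{*} \,(\vec{w} \cdot \vec{x}_j^{*} + b) \ge 1 - \xi_j^{*}, \qquad \xi_j^{*} \ge 0, \qquad j = 1, \dots, k
\end{aligned}
```

The only structural difference from the inductive SVM is the second constraint set: the hyperplane must also keep a margin around the unlabeled test points, under whichever labeling of them makes that margin largest.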
11
Transductive Support Vector Machines
How can a TSVM be any better?
- It exploits strong co-occurrence patterns
- Training data: D1 (category A), D6 (category B)
- SVM, test data: D3 → ?
- TSVM, test data: D3 → A
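The co-occurrence argument can be illustrated with a toy example. The sketch below is not a TSVM and not Joachims's algorithm: it swaps in a much simpler word-overlap label propagation, purely to show how unlabeled test documents can bridge a test document to a labeled one. The document contents, `inductive_label`, and `transductive_labels` are all hypothetical and invented for this illustration.

```python
def inductive_label(doc, labeled):
    """Pick the category whose labeled documents share the most words with doc.

    Returns None when doc shares no words with any labeled document
    or when the best score is tied between categories.
    """
    scores = {}
    for words, category in labeled:
        scores[category] = scores.get(category, 0) + len(doc & words)
    best = max(scores.values())
    winners = [c for c, s in scores.items() if s == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


def transductive_labels(labeled, test):
    """Repeatedly label one test document and treat it as labeled afterwards."""
    known = list(labeled)
    pending = dict(test)
    result = {}
    progress = True
    while pending and progress:
        progress = False
        for name in sorted(pending):
            category = inductive_label(pending[name], known)
            if category is not None:
                result[name] = category
                known.append((pending.pop(name), category))
                progress = True
                break  # restart the scan with the enlarged labeled set
    for name in pending:
        result[name] = None  # could not be reached through shared words
    return result


# Hypothetical toy corpus: D1 and D6 are labeled, D2 to D5 are test documents.
labeled = [({"nuclear", "physics"}, "A"), ({"basil", "salt"}, "B")]
test = {
    "D2": {"physics", "atom"},
    "D3": {"atom", "reactor"},  # shares no word with D1 or D6
    "D4": {"parsley", "pepper"},
    "D5": {"salt", "parsley"},
}

print(inductive_label(test["D3"], labeled))
print(transductive_labels(labeled, test))
```

Using only the labeled documents, D3 cannot be assigned a category because it shares no words with D1 or D6. Once the test set is taken into account, D3 is linked to D1 through D2 (via the co-occurring word "atom") and ends up in category A, which mirrors the contrast on the slide.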
12
Experiments
Test collections:
- Reuters-21578 dataset
- WebKB collection
- Ohsumed corpus
Performance measure:
- Precision/Recall-Breakeven Point (the point where precision equals recall, so the F1 measure takes the same value)
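One common way to compute a precision/recall breakeven point from classifier scores is sketched below. It assumes a ranking-based definition: predict as positive exactly as many documents as there are true positives, at which point precision and recall coincide. Whether the paper's experiments computed the value in precisely this way is an assumption; `pr_breakeven` and the sample data are hypothetical.

```python
def pr_breakeven(scores, labels):
    """Precision (= recall) when the top-k scored items are predicted positive,
    where k is the number of truly positive items (labels equal to 1)."""
    k = sum(1 for y in labels if y == 1)
    if k == 0:
        raise ValueError("need at least one positive example")
    # Rank documents from highest to lowest classifier score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = sum(1 for _, y in ranked[:k] if y == 1)
    # Predicted positives = actual positives = k, so precision == recall.
    return true_positives / k


# Hypothetical scores and labels for five test documents.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [1, -1, 1, 1, -1]
breakeven = pr_breakeven(scores, labels)
print(breakeven)
```

Because precision and recall are equal at this point, their harmonic mean (F1) has the same value, which is why the slide mentions the two measures together.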
13
Experiments Reuters(Average)
14
Experiments WebKB(category course)
15
Experiments WebKB(category project)
16
Experiments - Reuters - Ohsumed - WebKB