Published by Antony Parks. Modified over 9 years ago.
1 LiveClassifier: Creating Hierarchical Text Classifiers through Web Corpora
Chien-Chung Huang, Shui-Lung Chuang, Lee-Feng Chien
Presented by: Vu LONG
2 Outline
1. Introduction
2. LiveClassifier
3. Evaluation
4. Contribution
5. Future work
3 Introduction
http://140.109.19.252:8080/charles/index.jsp
- Uses Web search-result pages as the corpus source
- Exploits the structural information in the topic hierarchy to train the classifier
- Creates key terms to compensate for the insufficiency of the topic hierarchy
4 LiveClassifier (Demo version)
- Classify documents
- The Computer Science classifier is chosen
- Three classifiers (topics) have been created, based on the Yahoo! directory: Computer Science, Europe, Scientists
5 LiveClassifier
- Classify documents
- Pseudo class
6 LiveClassifier
- Users can create their own classifiers
7 LiveClassifier
Feature Extractor
- Interacts with the search engine and extracts highly ranked search snippets as an effective feature source
- Outputs feature vectors to describe both topic classes and text objects
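The feature-vector step above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the slides do not state the term-weighting scheme, so length-normalized term frequency is assumed here, and the helper names (`tokenize`, `snippets_to_vector`, `cosine`) and the sample snippets are hypothetical. No search engine is called; the snippets are supplied as plain strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def snippets_to_vector(snippets):
    """Build one unit-length term-frequency vector from a list of snippets.

    Assumption: plain term frequency with L2 normalization; the original
    system may weight terms differently.
    """
    counts = Counter()
    for snippet in snippets:
        counts.update(tokenize(snippet))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (dicts)."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(term, 0.0) for term, w in u.items())


# Hypothetical snippets standing in for search results.
class_vector = snippets_to_vector(["neural networks for learning",
                                   "machine learning research"])
object_vector = snippets_to_vector(["learning with neural networks"])
similarity = cosine(class_vector, object_vector)
```

Because topic classes and text objects are described in the same vector space, a text object can be compared against every class vector with the same similarity function.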
8 LiveClassifier
Hier-Concept-Query-Formulation
- Formulates queries through the topic hierarchy
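One way to picture formulating queries through the topic hierarchy is shown below. This is a sketch under stated assumptions, not the Hier-Concept-Query-Formulation procedure itself, whose details the slides do not give: each class is assumed to be queried with its own name plus the names of its ancestors, each as a quoted phrase. The function name `formulate_queries` and the subclass names under "Computer Science" are hypothetical.

```python
def formulate_queries(hierarchy, ancestors=()):
    """Map each class path to a query built from the names along that path.

    Assumption: a class is described by its own name plus its ancestors'
    names, each quoted as a phrase. How a search engine combines the
    phrases is engine-specific.
    """
    queries = {}
    for name, children in hierarchy.items():
        path = ancestors + (name,)
        queries["/".join(path)] = " ".join(f'"{part}"' for part in path)
        queries.update(formulate_queries(children, path))
    return queries


# Illustrative hierarchy; the subclass names are made up for this example.
hierarchy = {
    "Computer Science": {
        "Artificial Intelligence": {},
        "Networks": {},
    }
}

queries = formulate_queries(hierarchy)
for class_path, query in queries.items():
    print(class_path, "->", query)
```

Including ancestor names is one simple way to use the hierarchy's structure to disambiguate a short class name such as "Networks".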
9 LiveClassifier
Text Classifier
10 Evaluation
Overall performance evaluation
11 Evaluation
Granularity & Diversity
- Classifying text objects into different levels of the topic hierarchy gave roughly the same results.
12 Evaluation
Thematic Metadata for Textual Data
13 Evaluation
Paper Title Classification
- Collect data from 4 CS conferences in 2002
- Classify them into 36 second-level CS classes
14 Contribution
- Finds ways to collect and organize corpora effectively
- Creates key terms to compensate for the insufficiency of the topic hierarchy
- Classifies text objects automatically without a pre-labeled training set
- Cooperates easily with Web information services and other systems
- Helps to create more refined data (thematic metadata) for textual data
15 Future work
- Optimize the classifier by focusing on the training stage rather than only on organizing corpora
- Improve response time
- Find appropriate pseudo classes