Published by Hugo Banks. Modified over 9 years ago.
1
Intelligent Database Systems Lab
A clustering technique for news articles using WordNet
Authors: Christos Bouras, Vassilis Tsogkas (2012, KBS)
Presenter: Chang, Chun-Chih
2
Outline
- Motivation
- Objectives
- Methodology
- Experiments
- Conclusions
- Comments
3
Motivation
Document clustering is a powerful technique that has been widely used. However, it still suffers from problems such as synonymy, ambiguity, and the lack of descriptive labels for the generated clusters.
4
Objectives
We propose enhancing the standard k-means algorithm with external knowledge from WordNet hypernyms. The proposed method significantly improves on k-means and also generates useful, high-quality clusters.
5
Methodology - Framework
6
Methodology - Euclidean Distance & City-block Distance
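The formulas on this slide did not survive extraction. As a minimal sketch, assuming the textbook definitions and documents represented as equal-length term-weight vectors (the function names and sample vectors are illustrative, not taken from the slides):

```python
import math


def euclidean_distance(x, y):
    """Straight-line (L2) distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def city_block_distance(x, y):
    """City-block (Manhattan, L1) distance: sum of absolute differences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


# Hypothetical term-count vectors for two articles.
doc_a = [3, 0, 1]
doc_b = [0, 4, 1]
print(euclidean_distance(doc_a, doc_b))   # 5.0
print(city_block_distance(doc_a, doc_b))  # 7
```

City-block distance grows linearly with each coordinate difference, while Euclidean distance penalizes large single-coordinate differences more heavily.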
7
Methodology - Pearson
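The slide's formula is not preserved in this transcript. One common way to turn the Pearson correlation coefficient r into a distance is 1 - r; the sketch below assumes that form, which may differ from the one the slide showed.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient r of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def pearson_distance(x, y):
    """Assumed form: 1 - r, ranging from 0 (r = 1) to 2 (r = -1)."""
    return 1.0 - pearson_correlation(x, y)


print(pearson_distance([1, 2, 3], [3, 2, 1]))  # 2.0
```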
8
Methodology - Cosine Distance
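As a rough illustration, assuming the usual definition of cosine distance as one minus the cosine of the angle between two document vectors (the slide's own formula is missing from the transcript):

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def cosine_distance(x, y):
    """Assumed form: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(x, y)


# Vectors with no shared terms are maximally distant for non-negative weights.
print(cosine_distance([1, 0], [0, 1]))  # 1.0
```

Because only the angle matters, a long article and a short one with the same term proportions are at distance (approximately) zero.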
9
Methodology - Spearman-rank Distance
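Spearman's rank correlation is the Pearson correlation computed on ranks instead of raw values. The sketch below assumes average ranks for ties and the distance form 1 - rho; both choices are assumptions, since the slide's formula is not preserved.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


def spearman_distance(x, y):
    """Assumed form: 1 - rho, where rho is Pearson on the rank vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return 1.0 - pearson(average_ranks(x), average_ranks(y))


print(average_ranks([10, 20, 20, 30]))                   # [1.0, 2.5, 2.5, 4.0]
print(spearman_distance([1, 2, 3, 4], [16, 9, 4, 1]))    # 2.0
```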
10
Methodology - Kendall Distance
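Kendall's tau compares two vectors by counting pairs of positions that are ordered the same way (concordant) or the opposite way (discordant). A minimal sketch, assuming the tau-a variant (tied pairs count as neither) and the distance form 1 - tau, neither of which is confirmed by the slide:

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y; it counts as neither.
    return (concordant - discordant) / (n * (n - 1) / 2)


def kendall_distance(x, y):
    """Assumed form: 1 - tau, from 0 (same order) to 2 (reversed order)."""
    return 1.0 - kendall_tau(x, y)


# Two concordant pairs and one discordant pair give tau = 1/3.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```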
11
Methodology - Comparison of various methods
[Figure panels: Euclidean, City-block, Cosine, Kendall, Spearman, Pearson]
12
Methodology - heuristic function
Examples:
- ‘fruit’: d=9, f=2, then W=0.9954
- ‘edible fruit’: d=7, f=1, then W=0.8915
- ‘food’: d=5, f=1, then W=0.6534
13
Methodology - Enriching news articles using WordNet hypernyms
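The slide's diagram is not preserved, so the following is only a sketch of the general idea: each keyword of an article contributes its hypernyms to the article's bag of words. The hypernym table is a hand-written, hypothetical stand-in for real WordNet lookups, and the unweighted counting is a simplification; the authors weight hypernyms with their heuristic function W, whose formula is not reproduced here.

```python
from collections import Counter

# Hypothetical hypernym chains standing in for WordNet lookups.
TOY_HYPERNYMS = {
    "apple": ["edible fruit", "fruit", "food"],
    "banana": ["edible fruit", "fruit", "food"],
}


def enrich_with_hypernyms(tokens, hypernyms=TOY_HYPERNYMS):
    """Return a bag of words holding the tokens plus their hypernyms."""
    bag = Counter(tokens)
    for token in tokens:
        for hypernym in hypernyms.get(token, []):
            bag[hypernym] += 1  # unweighted count (simplifying assumption)
    return bag


article = ["apple", "banana", "price"]
enriched = enrich_with_hypernyms(article)
print(enriched["fruit"])  # 2
print(enriched["price"])  # 1
```

After enrichment, two articles that mention different fruits share the terms "fruit" and "food", which is how hypernyms can soften the synonymy problem.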
14
Methodology - Labeling clusters using WordNet hypernyms
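One simple way to derive a cluster label from hypernyms is to pick the hypernyms shared by the most keywords in the cluster. This is a hedged illustration, not the authors' labeling procedure: the hypernym table is hypothetical, and ranking by raw frequency (ties broken alphabetically) is an assumption.

```python
from collections import Counter

# Hypothetical hypernym chains standing in for WordNet lookups.
TOY_HYPERNYMS = {
    "apple": ["edible fruit", "fruit", "food"],
    "banana": ["edible fruit", "fruit", "food"],
    "bread": ["baked goods", "food"],
}


def label_cluster(keywords, hypernyms=TOY_HYPERNYMS, top_n=1):
    """Return the top_n most frequent hypernyms of the cluster's keywords."""
    counts = Counter()
    for keyword in keywords:
        counts.update(hypernyms.get(keyword, []))
    # Sort by descending count, then alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


print(label_cluster(["apple", "banana", "bread"]))  # ['food']
```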
15
Methodology - News article clustering using W-k means
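The W-k means pseudocode from the slide is not available in this transcript. For orientation, here is a sketch of the standard k-means (Lloyd's) loop that the method builds on, assuming Euclidean distance and, purely for determinism, the first k points as initial centroids. It is not the authors' W-k means; in their approach the input vectors would first be enriched with WordNet hypernyms.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_vector(vectors):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def k_means(points, k, max_iter=100):
    """Standard k-means; returns (cluster index per point, centroids)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]  # assumed initialization
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return assignment, centroids


points = [[0, 0], [10, 10], [0, 1], [10, 11]]
labels, centers = k_means(points, k=2)
print(labels)   # [0, 1, 0, 1]
print(centers)  # [[0.0, 0.5], [10.0, 10.5]]
```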
16
Experiments
17
Experiments
18
Experiments
[Figure: results with WordNet use vs. without WordNet use; annotated values 0.0001, 0.0010, 1.000]
19
Experiments
20
Experiments
21
Experiments
22
Conclusions
Of the many similarity measures tested, k-means with Euclidean and cosine distance produced the best results. We have also presented a novel algorithmic approach to enhancing the k-means algorithm using knowledge from an external database, WordNet.
23
Comments
Advantages
- The resulting labels have high precision.
Applications
- News clustering
- Cluster labeling