Published by Griffin Newman. Modified over 9 years ago.
1
Intelligent Database Systems Lab
國立雲林科技大學 National Yunlin University of Science and Technology
Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain
Presenter: Cheng-Hui Chen
Authors: Mahmoud F. Hussin, Mahmoud R. Farra and Yasser El-Sonbaty
IJCNN, 2008
2
Outline
─ Motivation
─ Objectives
─ Methodology
─ Experiments
─ Conclusions
─ Comments
3
Motivation
The variants of SOM are limited by the fact that they use only the vector space model (VSM) for document representation.
─ The VSM does not represent any relation between words.
─ The VSM has a high space complexity.
Sentences are broken down into their individual words without any representation of the sentence structure.
(Figure: the words "river", "rafting" and "mild".)
4
Objectives
Use graphs to represent documents so that the salient features of the data are captured: vertices represent words and edges represent the relations between them.
Decrease the space complexity compared to the VSM.
Extend the GHSOM to work in the graph domain to enhance the quality of the clusters.
(Figure: the words "rafting", "river" and "mild".)
5
Methodology
6
Enhance the DIG to work with G-GHSOM
The Document Index Graph (DIG) model
─ The DIG is used to represent the documents and is exploited in document clustering.
For example (the document table of the word "river" is shown):
─ River rafting. (doc1)
─ Mild river rafting. (doc2)
─ River fishing. (doc3)
(Figure: document table of the word "river"; surviving entries: e0, S0(0); e1, S0(1).)
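The three example sentences above can be indexed with a very small graph structure. The following is a minimal sketch, not the authors' implementation: it assumes each vertex is a unique word holding a per-document frequency table, and each edge links two consecutive words and records where that pair occurs. The names `build_dig`, `nodes` and `edges` are hypothetical, and the real DIG model may store richer tables.

```python
from collections import defaultdict


def build_dig(docs):
    """Build a simplified document index graph (illustrative only).

    nodes[word][doc_id]      -> how often the word occurs in that document
    edges[(w1, w2)][doc_id]  -> list of (sentence_index, position_of_w1)
    """
    nodes = defaultdict(lambda: defaultdict(int))
    edges = defaultdict(lambda: defaultdict(list))
    for doc_id, sentences in docs.items():
        for s_idx, sentence in enumerate(sentences):
            words = sentence.lower().split()
            for pos, word in enumerate(words):
                nodes[word][doc_id] += 1
                # Link each word to its successor in the same sentence.
                if pos + 1 < len(words):
                    edges[(word, words[pos + 1])][doc_id].append((s_idx, pos))
    return nodes, edges


# The three example documents from the slide, one sentence each.
docs = {
    "doc1": ["River rafting"],
    "doc2": ["mild river rafting"],
    "doc3": ["River fishing"],
}
nodes, edges = build_dig(docs)
```

In this sketch the vertex "river" is shared by all three documents, while the edge ("river", "rafting") is shared only by doc1 and doc2, which is the kind of phrase overlap a bag-of-words vector cannot express.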
7
Enhance the DIG to work with G-GHSOM
(Figure with elements labeled 1 to 4.)
8
Enhance the DIG to work with G-GHSOM
─ Single-word similarity measure
─ Similarity measure between two document vectors
─ The total similarity is the integration of these measures.
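The slide names these measures without giving formulas. As a rough sketch only, the code below assumes cosine similarity over word counts for the single-word measure and a weighted sum for the integration step. Both choices, the blend factor `alpha`, and the function names are assumptions made for illustration and are not confirmed by the slides.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as sharing nothing
    return dot / (norm_a * norm_b)


def total_similarity(phrase_sim, word_sim, alpha=0.5):
    """Blend two similarity scores; alpha is a hypothetical weight in [0, 1]."""
    return alpha * phrase_sim + (1 - alpha) * word_sim


word_sim = cosine_similarity("river rafting", "river fishing")
combined = total_similarity(phrase_sim=0.0, word_sim=word_sim, alpha=0.5)
```

With a blend of this kind, two documents that share words but no phrases still receive a non-zero score, while shared phrases raise it further.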
9
...ment of the G-GHSOM to work with graph
Neuron initialization
─ Detect the matching list in order to calculate the phrase-based similarity.
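One way to picture the matching list is as the set of word sequences that two sentences have in common. The sketch below assumes a matching phrase is a maximal run of at least two consecutive words shared by both sentences; the function name `matching_phrases` and the `min_len` parameter are hypothetical, and the authors may detect matches differently (for example by walking shared edges of the graph).

```python
def matching_phrases(words_a, words_b, min_len=2):
    """Return the maximal runs of consecutive words shared by both sequences."""
    rows, cols = len(words_a), len(words_b)
    # run[i][j] = length of the common run ending at words_a[i-1] and words_b[j-1]
    run = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if words_a[i - 1] == words_b[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1

    matches = set()
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            length = run[i][j]
            # A run is maximal when the next pair of words no longer matches.
            extends = i < rows and j < cols and words_a[i] == words_b[j]
            if length >= min_len and not extends:
                matches.add(tuple(words_a[i - length:i]))
    return sorted(matches)


shared = matching_phrases("mild river rafting".split(), "river rafting".split())
none_shared = matching_phrases("river rafting".split(), "river fishing".split())
```

Here `shared` holds the single phrase ("river", "rafting"), while `none_shared` is empty because the two sentences have only one word in common.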
10
...ment of the G-GHSOM to work with graph
11
...ment of the G-GHSOM to work with graph
(Figure; surviving label: G in.)
12
Experiments
Data sets
─ Reuters news articles (RNA)
─ University of Waterloo and Canadian Web sites (UW-CAN)
13
Experiments
14
Conclusions
The authors extend the GHSOM to a new graph-based GHSOM (G-GHSOM) to enhance the quality of document clustering.
─ G-GHSOM works successfully in the graph domain and achieves better clustering quality than TGHSOM in document clustering.
The authors also enhance the DIG model to work with the GHSOM algorithm.
15
Comments
Advantages
─ Enhances the quality of document clustering
Application
─ SOM
─ Clustering