Graph-based cluster labeling using Growing Hierarchal SOM Mahmoud Rafeek Alfarra College Of Science & Technology The second International.


1 Graph-based cluster labeling using Growing Hierarchal SOM
Prepared by:
Mahmoud Rafeek Alfarra, College of Science & Technology, m.farra@cst.ps
Ayman Shehda Ghabayen, College of Science & Technology, a.ghabayen@cst.ps
The second International conference of Applied Science & natural

2 Outline
• Labeling, what and why?
• Graph-based representation
• Growing Hierarchical SOM
• Extraction of cluster labels

3 Labeling, what and why?
• Cluster labeling: the process of selecting descriptive labels (keywords) for the clusters obtained through a clustering algorithm.

4 Labeling, what and why?
• Cluster labeling is an increasingly important task because:
1. Document collections keep growing larger.
2. Labels help when processing news, email threads, blogs, reviews, and search results.

5 Labeling, what and why?
[Figure: overall pipeline. Documents collection → preprocessing step → DIG model of each document → clustering process + labeling (SOM, then Hierarchical Growing SOM, with graph nodes G0, G1, ..., Gs at layers 0, 1 and 2) → labeled clusters.]

6 Graph-based representation
[Figure: a vector representation of a document contrasted with a graph representation, with nodes A, B, X, D, N, C, S and phrases ph1 to ph5.]

7 Graph-based representation
• Captures the salient features of the data.
• DIG model: a directed graph.
• A document is represented as a vector of sentences.
• Phrase indexing information is stored in the graph nodes themselves, in the form of document tables.
[Figure: the node "river" with outgoing edges e0 (to "rafting"), e1 (to "fishing") and e2 (to "adventures"), together with its document table.]
Document table:
Doc   TF        ET
1     {0,0,3}   e0: S1(1), S2(2), S3(1)
2     {0,0,2}   e0: S2(1); e2: S1(2)
3     {0,0,1}   e1: S4(1)
In an entry such as S1(2), S1 is the sentence number and (2) is the position of the term in that sentence.

8 Graph-based representation: example
Document 1:
• River rafting
• Mild river rafting
• River rafting trips
Document 2:
• Wild river adventures
• River rafting vacation plan
Document 3:
• Fishing trips
• Fishing vacation plan
• Booking fishing trips
• River fishing
[Figure: the graph grows as each document is added; the final graph has the nodes mild, wild, river, rafting, trips, adventures, vacation, plan, booking, fishing.]
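The example above can be sketched in code. This is a minimal illustration rather than the authors' implementation: the function name `build_dig`, the dictionary layout, and the grouping of the four fishing sentences into a third document are all assumptions. Each node keeps a document table of (sentence number, position) pairs, and each directed edge records where a two-word phrase occurs.

```python
from collections import defaultdict

def build_dig(documents):
    """Build a directed word graph from documents given as lists of sentences.

    Returns (edges, tables):
      edges  maps (word, next_word) -> set of (doc_id, sentence_no)
      tables maps word -> {doc_id: [(sentence_no, position), ...]}
    Both layouts are illustrative assumptions, not the slides' exact structure.
    """
    edges = defaultdict(set)
    tables = defaultdict(lambda: defaultdict(list))
    for doc_id, sentences in documents.items():
        for s_no, sentence in enumerate(sentences, start=1):
            words = sentence.lower().split()
            # Record every occurrence of a word in that word's document table.
            for pos, word in enumerate(words, start=1):
                tables[word][doc_id].append((s_no, pos))
            # Each pair of consecutive words becomes a directed edge.
            for a, b in zip(words, words[1:]):
                edges[(a, b)].add((doc_id, s_no))
    return edges, tables

documents = {
    1: ["River rafting", "Mild river rafting", "River rafting trips"],
    2: ["Wild river adventures", "River rafting vacation plan"],
    3: ["Fishing trips", "Fishing vacation plan",
        "Booking fishing trips", "River fishing"],
}
edges, tables = build_dig(documents)

# Edges that occur in more than one document are the shared phrases.
shared = sorted(e for e, occ in edges.items() if len({d for d, _ in occ}) > 1)
print(shared)
print(dict(tables["river"]))
```

Storing positions in the node tables and sentence references on the edges keeps phrase matching a matter of following edges, without rescanning the documents.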

9 Growing Hierarchical SOM

10 Growing Hierarchical SOM
• Determining the winning node …
[Figure: an input document graph (Gi) is compared with each of the n nodes in the SOM (Gs), every one of which is itself a graph.]
Phrases significance: computed from Gi, Gs and the length of Gi.
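A rough sketch of winner selection follows. The exact significance formula is not recoverable from the slide, so the score used here (the share of the input graph's phrases that also occur in the SOM node's graph) is an assumption, as are the names `phrase_significance` and `winning_node`. Graphs are simplified to sets of directed edges, one edge per two-word phrase.

```python
def phrase_significance(input_edges, neuron_edges):
    """Assumed score: fraction of the input graph's phrases found in the neuron graph."""
    if not input_edges:
        return 0.0
    return len(input_edges & neuron_edges) / len(input_edges)

def winning_node(input_edges, som_nodes):
    """Return the name of the SOM node whose graph best matches the input graph."""
    return max(som_nodes,
               key=lambda name: phrase_significance(input_edges, som_nodes[name]))

# Input document graph Gi and three hypothetical SOM node graphs Gs.
gi = {("river", "rafting"), ("rafting", "trips"), ("mild", "river")}
som = {
    "n0": {("river", "rafting"), ("rafting", "trips")},
    "n1": {("fishing", "trips")},
    "n2": {("river", "rafting"), ("wild", "river")},
}
winner = winning_node(gi, som)
print(winner, phrase_significance(gi, som[winner]))
```

Dividing by the size of the input graph keeps scores comparable across documents of different lengths, which is one plausible reading of the "length Gi" term on the slide.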

11 Growing Hierarchical SOM
• Neuron updating in the graph domain
[Figure: two graphs, G1 and G2, over the nodes A, B, C, D, E, X, Y, with edges e0 to e5.]
We update a graph by increasing its matching phrases, because this has a stronger effect than increasing individual terms (nodes). Adding a matching phrase can also be viewed as adding an ordered pair of nodes.
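One way to read this update rule is sketched below, under stated assumptions: a neuron graph is a set of directed edges, and an update copies a limited number of missing phrases from the input graph into the neuron. The function name `update_neuron` and the `max_new` parameter are hypothetical; the slides do not say how many phrases are added per step.

```python
def update_neuron(neuron_edges, input_edges, max_new=1):
    """Move a neuron graph toward an input graph by adding missing phrases.

    Each phrase is an ordered pair of nodes, so adding it also adds any
    node the neuron did not have yet. max_new is an assumed step size.
    """
    missing = sorted(input_edges - neuron_edges)  # sorted for a deterministic choice
    return set(neuron_edges) | set(missing[:max_new])

neuron = {("river", "rafting")}
document = {("river", "rafting"), ("rafting", "trips"), ("mild", "river")}

before = len(neuron & document)
updated = update_neuron(neuron, document, max_new=1)
after = len(updated & document)
print(before, after, sorted(updated))
```

Limiting the number of added phrases plays the role of a learning rate: the neuron moves toward the input without becoming a copy of it.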

12 Overall document clustering process

13 Extracting cluster labels
• To extract the keywords, we build a table for each cluster with the following columns:
Term    TF by location {T, L, B, b}    No. of matching phrases (MP)    Weight
Weight = (f1*T + f2*L + f3*B + f4*b) * 0.4 + MP * 0.6

14 Extracting cluster labels
[Figure: a cluster graph over the terms T1 to T11.]
Term   F-weight   # MP                            Net weight
T2     12.4       2: (T2,T3), (T2,T5)             4.96 + 1.2 = 6.16
T3     10.2       2: (T2,T3), (T5,T3)             4.08 + 1.2 = 5.28
T5     16.6       3: (T2,T5), (T8,T5), (T5,T3)    6.64 + 1.8 = 8.44
T8     14.4       1: (T8,T5)                      5.76 + 0.6 = 6.36
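The net weights can be recomputed with a few lines of Python. This is a minimal sketch: the 0.4 and 0.6 factors come from the weight formula on the previous slide, while the function name `net_weight` and the dictionary layout are assumptions made for illustration. The F-weight column is taken as given, since the coefficients f1 to f4 are not stated.

```python
def net_weight(f_weight, matching_phrases, w_freq=0.4, w_mp=0.6):
    """Net weight = F-weight * 0.4 + number of matching phrases * 0.6."""
    return f_weight * w_freq + len(matching_phrases) * w_mp

# Term -> (F-weight, matching phrases), copied from the table.
cluster_table = {
    "T2": (12.4, [("T2", "T3"), ("T2", "T5")]),
    "T3": (10.2, [("T2", "T3"), ("T5", "T3")]),
    "T5": (16.6, [("T2", "T5"), ("T8", "T5"), ("T5", "T3")]),
    "T8": (14.4, [("T8", "T5")]),
}

# Rank terms by net weight; the top-ranked terms are the label candidates.
ranked = sorted(
    ((term, round(net_weight(f, mp), 2)) for term, (f, mp) in cluster_table.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for term, weight in ranked:
    print(term, weight)
```

Because matching phrases carry the larger factor, a term that takes part in many shared phrases can outrank a term that is merely frequent.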

15 Thank You … Questions

