
1 Intelligent Database Systems Lab. Semantic Smoothing for Text Clustering. Presenter: Bei-Yi Jiang. Authors: Jamal A. Nasir, Iraklis Varlamis, Asim Karim, George Tsatsaronis. Knowledge-Based Systems, 2013.

2 Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Motivation: The Vector Space Model (VSM) assumes independence between vocabulary terms and ignores all the conceptual relations that potentially exist between them.
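
The independence assumption can be illustrated with a minimal sketch. It assumes a plain bag-of-words VSM with raw term frequencies and cosine similarity; the three toy documents and the function name `cosine` are invented for illustration and do not come from the slides.

```python
import math
from collections import Counter


def cosine(doc_a, doc_b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    # Only terms that appear in both documents contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy documents (already tokenized).
doc1 = ["car", "engine", "repair"]
doc2 = ["automobile", "motor", "maintenance"]  # same topic, no shared terms
doc3 = ["car", "engine", "price"]              # shares two surface terms

sim_related = cosine(doc1, doc2)
sim_overlap = cosine(doc1, doc3)
print(sim_related, sim_overlap)
```

Because `doc1` and `doc2` share no surface terms, their similarity is zero even though they are about the same subject, which is the limitation the slide points to.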

4 Objectives: To increase the importance of core words by considering the relations between terms, and at the same time reduce the contribution of general terms, leading to better text clustering results.

5 Methodology
1. Document representations
   1.1 The Vector Space Model (VSM)
   1.2 The Generalized Vector Space Model (GVSM)
2. Relatedness measure
   2.1 Omiotis
   2.2 Wikipedia-based relatedness
   2.3 Average of Omiotis and Wikipedia-based relatedness
   2.4 Pointwise mutual information
3. Document clustering
   3.1 Clustering algorithms
   3.2 Algorithms complexity
   3.3 Clustering criterion functions
4. A GVSM-based semantic kernel (S-VSM)
5. Top-k S-VSM

6 Methodology: The Vector Space Model (VSM)

7 Methodology: The Generalized Vector Space Model (GVSM)
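
The slide body was not captured in the transcript, so the following is only a sketch of the general idea of relaxing term independence: a term-to-term relatedness score is inserted into the inner product, so that related but different terms still contribute to similarity. The vocabulary, the scores in `RELATEDNESS`, and the function names are hypothetical, and this is not claimed to be the exact GVSM or S-VSM formulation used by the authors.

```python
import math

VOCAB = ["car", "automobile", "engine", "motor"]

# Hypothetical symmetric relatedness scores in [0, 1]; unlisted pairs are 0.
RELATEDNESS = {
    ("car", "automobile"): 0.9,
    ("engine", "motor"): 0.8,
}


def rel(t1, t2):
    """Relatedness of two terms; a term is fully related to itself."""
    if t1 == t2:
        return 1.0
    return RELATEDNESS.get((t1, t2), RELATEDNESS.get((t2, t1), 0.0))


def smoothed_dot(vec_a, vec_b):
    """Compute sum_i sum_j a_i * rel(t_i, t_j) * b_j over the vocabulary."""
    return sum(
        a * rel(ti, tj) * b
        for ti, a in zip(VOCAB, vec_a)
        for tj, b in zip(VOCAB, vec_b)
    )


def smoothed_cosine(vec_a, vec_b):
    """Cosine-style normalization of the smoothed inner product."""
    denom = math.sqrt(smoothed_dot(vec_a, vec_a) * smoothed_dot(vec_b, vec_b))
    return smoothed_dot(vec_a, vec_b) / denom if denom > 0 else 0.0


doc1 = [1, 0, 1, 0]  # "car", "engine"
doc2 = [0, 1, 0, 1]  # "automobile", "motor"

plain = sum(a * b for a, b in zip(doc1, doc2))
smoothed = smoothed_cosine(doc1, doc2)
print(plain, smoothed)
```

With these toy scores the plain dot product is zero while the smoothed similarity is high. The normalized value is only guaranteed to stay within [-1, 1] when the relatedness matrix is positive semi-definite, which holds for this toy matrix but not for arbitrary scores.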

8 Methodology: Omiotis; Wikipedia-based relatedness; average of Omiotis and Wikipedia-based relatedness

9 Methodology: Pointwise mutual information
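
Pointwise mutual information is commonly defined as PMI(x, y) = log( p(x, y) / (p(x) p(y)) ). The sketch below assumes probabilities are estimated from document-level co-occurrence counts and uses base-2 logarithms; the slides do not state how the counts were obtained, and the corpus and the name `pmi_table` are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """PMI (base 2) for every term pair co-occurring in at least one document."""
    n_docs = len(documents)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms of a pair
    for doc in documents:
        terms = sorted(set(doc))  # sorted so each pair has one canonical key
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))

    table = {}
    for (t1, t2), joint in pair_df.items():
        p_joint = joint / n_docs
        p1 = term_df[t1] / n_docs
        p2 = term_df[t2] / n_docs
        table[(t1, t2)] = math.log2(p_joint / (p1 * p2))
    return table


# Hypothetical toy corpus.
docs = [
    ["car", "engine"],
    ["car", "engine"],
    ["car", "price"],
    ["stock", "price"],
]
pmi = pmi_table(docs)
for pair, score in sorted(pmi.items()):
    print(pair, round(score, 3))
```

A positive score means the two terms co-occur more often than independence would predict, and a negative score means less often. Pairs that never co-occur are left out of the table because their PMI is undefined (log of zero).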

10 Methodology

11 Methodology

12 Methodology

13 Methodology

14 Experiments

15 Experiments: vector similarity, evaluation measures

16 Experiments: Evaluation measures: purity, entropy, error rate
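
Two of these measures can be sketched using the definitions that are standard in the clustering literature: purity is the fraction of documents that belong to the majority class of their cluster, and entropy is the size-weighted class entropy of each cluster, normalized by the logarithm of the number of classes. The slide formulas were not captured, so the exact variants used in the experiments are an assumption, and error rate is left out because its definition is not recoverable from the transcript. The sample assignment is invented.

```python
import math
from collections import Counter, defaultdict


def group_labels(cluster_ids, class_labels):
    """Collect the true class labels of the documents in each cluster."""
    clusters = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, class_labels):
        clusters[cluster_id].append(label)
    return clusters


def purity(cluster_ids, class_labels):
    """Fraction of documents in the majority class of their cluster (higher is better)."""
    clusters = group_labels(cluster_ids, class_labels)
    majority_total = sum(max(Counter(m).values()) for m in clusters.values())
    return majority_total / len(class_labels)


def entropy(cluster_ids, class_labels):
    """Size-weighted, normalized class entropy of the clusters (lower is better)."""
    clusters = group_labels(cluster_ids, class_labels)
    n = len(class_labels)
    n_classes = len(set(class_labels))
    if n_classes < 2:
        return 0.0
    total = 0.0
    for members in clusters.values():
        size = len(members)
        h = -sum((c / size) * math.log(c / size) for c in Counter(members).values())
        total += (size / n) * (h / math.log(n_classes))
    return total


# Hypothetical clustering of six documents with two true classes.
cluster_ids = [0, 0, 0, 1, 1, 1]
class_labels = ["a", "a", "b", "b", "b", "b"]
print(purity(cluster_ids, class_labels), entropy(cluster_ids, class_labels))
```

A perfect clustering gives purity 1 and entropy 0, so the two measures move in opposite directions as quality improves.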

17 Experiments

18 Experiments

19 Experiments

20 Experiments

21 Experiments

22 Experiments

23 Experiments

24 Conclusions: The evaluation results demonstrate that S-VSM outperforms VSM in most of the tested combinations and compares favorably to GVSM. To further reduce the complexity of S-VSM, the authors introduce an extension of it, the top-k S-VSM.
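
One way to read the top-k extension is as a sparsification step: for each term, only its k most related terms are kept, so the relatedness data that must be stored and multiplied shrinks. The sketch below rests on that assumed interpretation; the function name `top_k_relatedness`, the nested-dictionary layout, and the scores are all hypothetical.

```python
import heapq


def top_k_relatedness(relatedness, k):
    """Keep, for each term, only its k highest-scoring related terms.

    `relatedness` maps term -> {other_term: score}.
    """
    pruned = {}
    for term, neighbours in relatedness.items():
        best = heapq.nlargest(k, neighbours.items(), key=lambda item: item[1])
        pruned[term] = dict(best)
    return pruned


# Hypothetical relatedness scores.
full = {
    "car": {"automobile": 0.9, "engine": 0.4, "price": 0.1},
    "engine": {"motor": 0.8, "car": 0.4, "price": 0.05},
}
sparse = top_k_relatedness(full, k=2)
print(sparse)
```

Pruning each term independently can leave the result asymmetric (a term may keep a neighbour that does not keep it back), so a real implementation would need to decide whether to symmetrize afterwards.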

25 Comments
Advantages:
- It offers a very flexible kernel that can be applied within any domain or with any language.
- S-VSM performs much better than VSM in the task of text clustering.
- It is very efficient in terms of time and space complexity.
Applications:
- Text clustering
- Semantic smoothing kernels

