Intelligent Database Systems Lab
Semantic Smoothing for Text Clustering
Presenter: BEI-YI JIANG
Authors: JAMAL A. NASIR, IRAKLIS VARLAMIS, ASIM KARIM, GEORGE TSATSARONIS
2013. KNOWLEDGE-BASED SYSTEMS

Outline
– Motivation
– Objectives
– Methodology
– Experiments
– Conclusions
– Comments

Motivation
The Vector Space Model (VSM) assumes independence between vocabulary terms and ignores the conceptual relations that may exist between them.
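The independence assumption can be illustrated with a minimal sketch. The three-word vocabulary and the two one-word documents are hypothetical examples, not data from the paper. Because "car" and "automobile" occupy different dimensions, plain VSM cosine similarity treats the two documents as unrelated.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical vocabulary: one dimension per term.
vocab = ["car", "automobile", "banana"]
doc_a = [1, 0, 0]  # a document containing only "car"
doc_b = [0, 1, 0]  # a document containing only "automobile"

# No shared term, so the VSM similarity is zero despite the synonymy.
vsm_similarity = cosine(doc_a, doc_b)
print(vsm_similarity)
```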

Objectives
To increase the importance of core words by taking the relations between terms into account and, at the same time, to reduce the contribution of general terms, leading to better text clustering results.

Methodology
1. Document representations
   1.1 The Vector Space Model (VSM)
   1.2 The Generalized Vector Space Model (GVSM)
2. Relatedness measure
   2.1 Omiotis
   2.2 Wikipedia-based relatedness
   2.3 Average of Omiotis and Wikipedia-based relatedness
   2.4 Pointwise mutual information
3. Document clustering
   3.1 Clustering algorithms
   3.2 Algorithms complexity
   3.3 Clustering criterion functions
4. A GVSM-based semantic kernel (S-VSM)
5. Top-k S-VSM

Methodology: The Vector Space Model (VSM)

Methodology: The Generalized Vector Space Model (GVSM)

Methodology: Relatedness measures
– Omiotis
– Wikipedia-based relatedness
– Average of Omiotis and Wikipedia-based relatedness

Methodology: Pointwise mutual information
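The slide's formula did not survive extraction. As a minimal sketch, the standard definition PMI(x, y) = log( p(x, y) / (p(x) p(y)) ) can be estimated from document counts. The function name, the choice of base-2 logarithm, and the use of document-level co-occurrence are assumptions here; the paper may use a different estimator or a normalized variant.

```python
import math

def pmi(n_xy, n_x, n_y, n_docs):
    """Pointwise mutual information of terms x and y from document counts.

    n_xy: documents containing both terms
    n_x, n_y: documents containing each term
    n_docs: total number of documents
    Assumes n_x, n_y and n_docs are positive.
    """
    if n_xy == 0:
        return float("-inf")  # the terms never co-occur
    # p(x,y) / (p(x) * p(y)) simplifies to n_xy * n_docs / (n_x * n_y).
    return math.log2(n_xy * n_docs / (n_x * n_y))

# Hypothetical counts: the terms co-occur four times more often than chance.
score = pmi(n_xy=20, n_x=100, n_y=50, n_docs=1000)
print(score)
```

A positive score means the two terms co-occur more often than independence would predict; a score near zero means they appear to be unrelated.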

Methodology

Experiments

Experiments
– Vector similarity
– Evaluation measures

Experiments: Evaluation measures
– Purity
– Entropy
– Error rate
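The exact formulas on this slide were not preserved. The sketch below uses the commonly cited definitions of purity and entropy for clustering evaluation, with entropy normalized by the logarithm of the number of classes. That normalization, the function names, and the toy labels are assumptions. Error rate is left out because its definition cannot be recovered from the slide.

```python
import math
from collections import Counter

def _class_counts(clusters, labels):
    """Map each cluster id to a Counter of the true classes it contains."""
    counts = {}
    for cluster_id, label in zip(clusters, labels):
        counts.setdefault(cluster_id, Counter())[label] += 1
    return counts

def purity(clusters, labels):
    """Fraction of documents that belong to the majority class of their cluster."""
    counts = _class_counts(clusters, labels)
    return sum(max(c.values()) for c in counts.values()) / len(labels)

def entropy(clusters, labels):
    """Size-weighted average of per-cluster class entropy, scaled to [0, 1]."""
    n = len(labels)
    q = len(set(labels))  # number of true classes
    if q < 2:
        return 0.0
    total = 0.0
    for c in _class_counts(clusters, labels).values():
        n_r = sum(c.values())
        e_r = -sum((m / n_r) * math.log(m / n_r) for m in c.values()) / math.log(q)
        total += (n_r / n) * e_r
    return total

# Hypothetical assignment: cluster 0 mixes two classes, cluster 1 is pure.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["a", "a", "b", "b", "b", "b"]
print(purity(clusters, labels), entropy(clusters, labels))
```

Higher purity and lower entropy both indicate clusters that align more closely with the true classes.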

Experiments

Conclusions
The evaluation results showed that S-VSM outperforms VSM in most of the combinations and compares favorably to GVSM. To further reduce the complexity of S-VSM, the authors introduced an extension of it, the top-k S-VSM.
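The slides do not give the S-VSM formulas, so the following is only a sketch of one plausible reading: each document vector is multiplied by a term-relatedness matrix, and the top-k variant keeps only the k most related terms per term to make that matrix sparse. The matrix values, the vocabulary, and the function names are hypothetical, and the paper's actual kernel may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_k_rows(S, k):
    """Keep the diagonal and the k largest off-diagonal entries of each row."""
    pruned = []
    for i, row in enumerate(S):
        others = [j for j in range(len(row)) if j != i]
        keep = set(sorted(others, key=lambda j: row[j], reverse=True)[:k]) | {i}
        pruned.append([row[j] if j in keep else 0.0 for j in range(len(row))])
    return pruned

def smooth(doc, S):
    """Spread each term's weight onto its related terms (row vector times S)."""
    n = len(S)
    return [sum(doc[i] * S[i][j] for i in range(n)) for j in range(n)]

# Hypothetical relatedness matrix over the terms car, automobile, banana.
S = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.1],
     [0.1, 0.1, 1.0]]
S_k = top_k_rows(S, k=1)

doc_a = [1, 0, 0]  # contains only "car"
doc_b = [0, 1, 0]  # contains only "automobile"
smoothed_similarity = cosine(smooth(doc_a, S_k), smooth(doc_b, S_k))
print(cosine(doc_a, doc_b), smoothed_similarity)
```

In this toy case the unsmoothed similarity is zero, while the smoothed vectors are close because the related terms share weight. Pruning to the top k entries reduces the number of nonzero entries that have to be stored and multiplied.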

Comments
Advantages
– It offers a very flexible kernel that can be applied within any domain or with any language.
– S-VSM performs much better than VSM in the task of text clustering.
– It is very efficient in terms of time and space complexity.
Applications
– Text clustering
– Semantic smoothing kernels
