Intelligent Database Systems Lab Presenter : BEI-YI JIANG Authors : JAMAL A. NASIR, IRAKLIS VARLAMIS, ASIM KARIM, GEORGE TSATSARONIS 2013. KNOWLEDGE-BASED.

Slides:



Advertisements
Similar presentations
Intelligent Database Systems Lab Presenter : WU, MIN-CONG Authors : KADIM TA¸SDEMIR, PAVEL MILENOV, AND BROOKE TAPSALL 2011,IEEE Topology-Based Hierarchical.
Advertisements

Intelligent Database Systems Lab Presenter: WU, JHEN-WEI Authors: Jorge Gorricha, Victor Lobo CG Improvements on the visualization of clusters in.
Intelligent Database Systems Lab Presenter : HONG, CHIA-TSE Authors : Yueh-Min Huang, Yong-Ming Huang, Shu-Hsien Huang, Yen-Ting Lin 2012, CE A ubiquitous.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A novel document similarity measure based on earth mover’s.
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Laurens van der Maaten and Geoffrey Hinton ML Visualizing non-metric similarities.
Text Mining: Finding Nuggets in Mountains of Textual Data Jochen Dijrre, Peter Gerstl, Roland Seiffert Presented by Drew DeHaas.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 An Efficient Concept-Based Mining Model for Enhancing.
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Liang-Chu Chen, Ting-Jung Yu, Chia-Jung Hsieh ACM KeyGraph-based chance discovery.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Text classification based on multi-word with support vector.
Intelligent Database Systems Lab Presenter : BEI-YI JIANG Authors : UNIVERSIT´E CATHOLIQUE DE LOUVAIN, BELGIUM ASSOCIATION FOR COMPUTING MACHINERY.
Intelligent Database Systems Lab Presenter: MIN-CHIEH HSIU Authors: NHAT-QUANG DOAN ∗, HANANE AZZAG, MUSTAPHA LEBBAH 2013 NN Growing self-organizing trees.
Special Topics in Text Mining Manuel Montes y Gómez University of Alabama at Birmingham, Spring 2011.
Intelligent Database Systems Lab Presenter : YAN-SHOU SIE Authors : JEROEN DE KNIJFF, FLAVIUS FRASINCAR, FREDERIK HOGENBOOM DKE Data & Knowledge.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Evaluation of novelty metrics for sentence-level novelty mining Presenter : Lin, Shu-Han Authors : Flora.
Intelligent Database Systems Lab Presenter : WU, MIN-CONG Authors : Jorge Villalon and Rafael A. Calvo 2011, EST Concept Maps as Cognitive Visualizations.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A Comprehensive Comparison Study of Document Clustering.
Intelligent Database Systems Lab Presenter: WU, MIN-CONG Authors: Zhiyuan Liu, Xinxiong Chen, Yabin Zheng, Maosong Sun 2011, FCCNLL Automatic Keyphrase.
Intelligent Database Systems Lab Presenter : JHOU, YU-LIANG Authors :Shady Shehata, Fakhri Karray, Mohamed S. Kamel, Fellow 2012, IEEE An Efficient Concept-Based.
Intelligent Database Systems Lab Presenter : JIAN-REN CHEN Authors : Sheng-Tun Li a,b,*, Fu-Ching Tsai a 2013, KBS A fuzzy conceptualization model for.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Determining the best K for clustering transactional datasets – A coverage density-based approach Presenter.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. A semantic similarity metric combining features and intrinsic information content Presenter: Chun-Ping.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. An IPC-based vector space model for patent retrieval Presenter: Jun-Yi Wu Authors: Yen-Liang Chen, Yu-Ting.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A k-mean clustering algorithm for mixed numeric and categorical.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. How valuable is medical social media data? Content analysis of the medical web Presenter :Tsai Tzung.
Intelligent Database Systems Lab Presenter : BEI-YI JIANG Authors : GUENAEL CABANES, YOUNES BENNANI, DOMINIQUE FRESNEAU ELSEVIER Improving the Quality.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Externally growing self-organizing maps and its application to database visualization and exploration.
Intelligent Database Systems Lab Presenter: Wu, Jhen-Wei Authors: Fabian Bürger, Josef Pauli ICPRAM. Representation Optimization with Feature Selection.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 2007.SIGIR.8 New Event Detection Based on Indexing-tree.
Intelligent Database Systems Lab Presenter : Chang,Chun-Chih Authors : CHRISTOS BOURAS, VASSILIS TSOGKAS 2012, KBS A clustering technique for news articles.
Intelligent Database Systems Lab Presenter : Chang,Chun-Chih Authors : David Milne *, Ian H. Witten 2012, AI An open-source toolkit for mining Wikipedia.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Utilizing Marginal Net Utility for Recommendation in E-commerce.
Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Bui Quang Hung, Masanori Otsubo, Yoshinori Hijikata, Shogo Nishida 2010.WIA. HITS.
Intelligent Database Systems Lab Presenter : Kung, Chien-Hao Authors : Eghbal G. Mansoori 2011,IEEE FRBC: A Fuzzy Rule-Based Clustering Algorithm.
Intelligent Database Systems Lab Presenter : BEI-YI JIANG Authors : HAI V. PHAM, ERIC W. COOPER, THANG CAO, KATSUARI KAMEI INFORMATION SCIENCES Hybrid.
Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Kevin Meijer, Flavius Frasincar, Frederik Hogenboom 2014.DSS. A semantic approach.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Extending the Growing Hierarchal SOM for Clustering Documents.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Unsupervised word sense disambiguation for Korean through the acyclic weighted digraph using corpus and.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Psychiatric document retrieval using a discourse-aware model Presenter : Wu, Jia-Hao Authors : Liang-Chih.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Mining massive document collections by the WEBSOM method Presenter : Yu-hui Huang Authors :Krista Lagus,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 An initialization method to simultaneously find initial.
INFORMATION RETRIEVAL PROJECT Creation of clusters of concepts that represent a domain corpus.
Intelligent Database Systems Lab Presenter: Bei-YI Jiang Authors: Chantal Baril, Viviane Gascon, Stephanie Cartier CIE Design and Analysis of an.
Intelligent Database Systems Lab Presenter : JIAN-REN CHEN Authors : Wen Zhang, Taketoshi Yoshida, Xijin Tang 2011.ESWA A comparative study of TF*IDF,
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Validity index for clusters of different sizes and densities Presenter: Jun-Yi Wu Authors: Krista Rizman.
Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Longzhuang Li, Yi Shang, Wei Zhang 2002.ACM. Improvement of HITS-based Algorithms.
Intelligent Database Systems Lab Presenter : CHANG, SHIH-JIE Authors : Ya-Han Hu, Fan Wu a, Chia-Lun Lo, Chun-Tien Tai b 2012.AIM. Predicting warfarin.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. 1 Identifying Domain Expertise of Developers from Source Code Presenter : Wu, Jia-Hao Authors : Renuka.
Intelligent Database Systems Lab Presenter: NENG-KAI, HONG Authors: HUAN LONG A, ZIJUN ZHANG A, ⇑, YAN SU 2014, APPLIED ENERGY Analysis of daily solar.
Intelligent Database Systems Lab Presenter : WU, MIN-CONG Authors : STEPHEN T. O’ROURKE, RAFAEL A. CALVO and Danielle S. McNamara 2011, EST Visualizing.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Towards comprehensive support for organizational mining Presenter : Yu-hui Huang Authors : Minseok Song,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author: Wei Xu,
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Christopher C. Yang and Tobun Dorbin Ng TSMCA Analyzing and Visualizing Web Opinion.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Text Classification Improved through Multigram Models.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Community self-Organizing Map and its Application to Data Extraction Presenter: Chun-Ping Wu Authors:
Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Tao Liu, Zheng Chen, Benyu Zhang, Wei-ying Ma, Gongyi Wu 2004.ICDM. Improving Text.
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Junping Zhang, Hua Huang and Jue Wang IEEE INTELLIGENT SYSTEMS Manifold Learning.
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Vittorio Carlei, Massimiliano Nuccio PRL Mapping industrial patterns in spatial agglomeration:
Intelligent Database Systems Lab Presenter : Chang,Chun-Chih Authors : Emilio Corchado, Bruno Baruque 2012 NeurCom WeVoS-ViSOM: An ensemble summarization.
Intelligent Database Systems Lab Presenter : YU-TING LU Authors : Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee IPM Multilingual document mining.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Named Entity Disambiguation by Leveraging Wikipedia Semantic Knowledge Presenter : Jiang-Shan Wang Authors.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Adaptive Clustering for Multiple Evolving Streams Graduate.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Enhancing Text Clustering by Leveraging Wikipedia Semantics.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Decision trees for hierarchical multi-label classification.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. A method of extracting malicious expressions in bulletin board systems by using context analysis Presenter:
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Yong-Bin Kang, Pari Delir Haghighi, Frada Burstein ESA CFinder: An intelligent key.
2016/9/301 Exploiting Wikipedia as External Knowledge for Document Clustering Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, and Xiaohua Zhou Proceeding.
True/False questions (3pts*2)
Motivation and Background
Motivation and Background
Presentation transcript:

Intelligent Database Systems Lab Presenter : BEI-YI JIANG Authors : JAMAL A. NASIR, IRAKLIS VARLAMIS, ASIM KARIM, GEORGE TSATSARONIS KNOWLEDGE-BASED SYSTEMS Semantic Smoothing for Text Clustering

Intelligent Database Systems Lab Outlines Motivation Objectives Methodology Experiments Conclusions Comments

Intelligent Database Systems Lab Motivation (VSM) It assumes independency between the vocabulary terms and ignores all the conceptual relations between terms that potentially exist.

Intelligent Database Systems Lab Objectives To increase the importance of core words by considering the terms’ relations, and in parallel downsize the contribution of general terms, leading to better text clustering results.

Intelligent Database Systems Lab Methodology Document representations Relatedness measure Document clustering A GVSM-based semantic kernel S-VSM Top-k S-VSM 2.1 Omiotis 2.2 Wikipedia-based relatedness 2.3 Average of Omiotis and Wikipedia-based relatedness 2.4 Pointwise mutual information 3.1 Clustering algorithms 3.2 Algorithms complexity 3.3 Clustering criterion functions The Vector Space Model(VSM) 1.2 The Generalized Vector Space Model(GVSM)

Intelligent Database Systems Lab Methodology The Vector Space Model(VSM)

Intelligent Database Systems Lab Methodology The Generalized Vector Space Model(GVSM)

Intelligent Database Systems Lab Methodology Omiotis Wikipedia-based relatedness Average of Omiotis and Wikipedia-based relatedness

Intelligent Database Systems Lab Methodology Pointwise mutual information

Intelligent Database Systems Lab Methodology

Intelligent Database Systems Lab Methodology

Intelligent Database Systems Lab Methodology

Intelligent Database Systems Lab Methodology

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments Vector similarity Evaluation measures

Intelligent Database Systems Lab Experiments Evaluation measures – Purity – Entropy – Error rate

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Experiments

Intelligent Database Systems Lab Conclusions The evaluation results demonstrated that S-VSM dominates VSM in performance in most of the combinations and compares favorably to GVSM. In order to further reduce the complexity of S-VSM we introduced an extension of it, namely the top-k S-VSM.

Intelligent Database Systems Lab Comments Advantages – It offers a very flexible kernel that can be applied within any domain or with any language. – The ability of the S-VSM perform much better than the VSM in the task of text clustering. – It very efficiently in terms of time and space complexity Applications -Text clustering -Semantic smoothing kernels