A New Evolving Tree for Text Document Clustering and Visualization


A New Evolving Tree for Text Document Clustering and Visualization
1 Wui Lee Chang, 1* Kai Meng Tay, 2 Chee Peng Lim
1 Faculty of Engineering, Universiti Malaysia Sarawak, Malaysia
2 Centre for Intelligent Systems Research, Deakin University, Australia
WSC 17 (2012)

Presentation Outline
– Introduction
– Problem Statements
– Motivations and Objectives
– Preliminary: Evolving Tree
– A General Application Framework for Evolving Systems
– The Proposed Procedure
– Experimental Results
– Concluding Remarks

Introduction: Clustering
– To group data into clusters based on their similarity levels.
– Examples: Self-Organizing Map (SOM), K-means, Fuzzy C-means.

Introduction: Textual Document Clustering
– To cluster (group) sets of textual documents based on their similarity levels.
– To ease information retrieval.
– Examples:
  – the naive Bayes-based document clustering model [21],
  – WEBSOM [22], and
  – a support vector machine-based approach for imbalanced text document classification [23].
[21] Lewis, D.: Naïve Bayes at forty: The independence assumption in information retrieval. In: ECML (1998)
[22] Azcarraga, A.P., Yap, T.J., Tan, J., Chua, T.S.: Evaluating keyword selection methods for WEBSOM text archives. In: IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 3, pp (2004)
[23] Liu, T., Loh, H.T., Sun, A.: Imbalanced text classification: A term weighting approach. In: Expert Systems with Applications, vol. 36, pp , (2009)

Problem Statement: 1
– Traditional textual document clustering uses off-line learning.
– Weakness: the model must be re-learned whenever a new document is fed in.
– An adaptive or evolving model can be an alternative to the traditional methods.
– Evolving increases the learning flexibility.
– WEBSOM focuses on off-line learning.

Problem Statement: 2
For SOM (or WEBSOM):
– It is difficult to determine the map size before learning [19].
– The map size also affects the learning time [19].
[19] Pakkanen, J., Iivarinen, J., Oja, E.: The Evolving Tree – Analysis and Applications. In: IEEE Transactions on Neural Networks, vol. 17, no. 3, pp (2006)

Motivations and Objectives
– To construct an adaptive textual document clustering tool based on the Evolving Tree (ETree).
– To apply a general application framework for Evolving Systems [24].
– To analyze the adaptive activity of the proposed method with UNIMAS ENCON 2008 articles.
[24] Lughofer, E.: Evolving Fuzzy Systems – Methodologies, Advanced Concepts and Applications. Ed. 1, Springer (2011)

Preliminary: Evolving Tree (ETree)
– Forms a tree structure that contains a root node, trunk nodes and leaf nodes.
– The root node is the first node created in the tree.
– Trunk nodes connect the leaf nodes.
– Leaf nodes are the clusters formed.
– Able to expand hierarchically (forming a tree structure) to scale with the data.
– The hierarchical structure helps to control complexity.

Preliminary: Evolving Tree (ETree)

Preliminary: Evolving Tree (ETree): The Learning Algorithm
– Finding the best matching unit (BMU).
– Updating the leaf nodes.
– Expanding the tree.

Preliminary: Evolving Tree (ETree): Finding the BMU
[Figure: search for the BMU through the tree; tree depth shown as Layer 1 to Layer 4]
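The search pictured on this slide can be sketched roughly as follows. This is a minimal illustration that assumes the top-down search described for the Evolving Tree in [19]: start at the root and repeatedly move to the closest child until a leaf is reached. The `Node` class, the Euclidean `distance` helper and the toy weights are hypothetical and are not taken from the slides.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    # Hypothetical node type: a prototype vector plus child links.
    weights: List[float]
    children: List["Node"] = field(default_factory=list)


def distance(a, b):
    # Euclidean distance is an assumption; the slides do not name the metric.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_bmu(root, x):
    # Descend layer by layer, always following the closest child,
    # and stop at the first node without children (a leaf).
    node = root
    while node.children:
        node = min(node.children, key=lambda c: distance(c.weights, x))
    return node


# Toy tree: a root, two trunk/leaf nodes, and two leaves under the left trunk.
left_a = Node([0.0, 0.9])
left_b = Node([0.2, 1.5])
left = Node([0.0, 1.0], children=[left_a, left_b])
right = Node([1.0, 0.0])
root = Node([0.0, 0.0], children=[left, right])

bmu = find_bmu(root, [0.1, 1.0])
```

Because only the children of one node are compared at each layer, the search cost grows with the tree depth rather than with the total number of leaves.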

Preliminary: Evolving Tree (ETree): Updating Leaf Nodes
[Figure: tree diagram with the BMU and nodes labeled 1, 2, 3]
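A rough sketch of the leaf update follows. It assumes the SOM-style rule used for the Evolving Tree in [19], where every leaf moves toward the input by an amount that decays with its tree distance from the BMU. The tree distance is taken here as the number of edges on the path between two leaves, which may differ from the exact convention in [19]. The `PARENT` map, the node names, and the `alpha` and `sigma` values are hypothetical.

```python
import math

# Hypothetical toy tree stored as child -> parent links (None marks the root).
PARENT = {"root": None, "t1": "root", "t2": "root",
          "a": "t1", "b": "t1", "c": "t2"}


def path_to_root(node):
    # List the node and all of its ancestors up to the root.
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(u, v):
    # Number of edges between u and v through their lowest common ancestor.
    steps_from_u = {n: i for i, n in enumerate(path_to_root(u))}
    for j, n in enumerate(path_to_root(v)):
        if n in steps_from_u:
            return steps_from_u[n] + j
    raise ValueError("nodes are not in the same tree")


def update_leaves(weights, bmu, x, alpha=0.5, sigma=1.0):
    # Kohonen-style update: w <- w + alpha * h * (x - w), with a Gaussian
    # neighborhood h computed from the tree distance to the BMU.
    for leaf, w in weights.items():
        d = tree_distance(bmu, leaf)
        h = math.exp(-(d * d) / (2.0 * sigma * sigma))
        weights[leaf] = [wi + alpha * h * (xi - wi) for wi, xi in zip(w, x)]
    return weights


leaf_weights = {"a": [0.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 0.0]}
update_leaves(leaf_weights, bmu="a", x=[1.0, 1.0])
```

In this toy run the BMU `a` moves furthest toward the input, its sibling `b` moves less, and the leaf `c` in the other branch barely moves.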

A General Application Framework for Evolving Systems [24]
[24] Lughofer, E.: Evolving Fuzzy Systems – Methodologies, Advanced Concepts and Applications. Ed. 1, Springer (2011)

The Proposed Procedure
[Figure: flowchart with the blocks "Updating terms of articles", "ETree", "Fetching on-line article" and "Refining trained model"]

The Proposed Procedure: Preprocessing Text

The Proposed Procedure: Term Weighting

The Proposed Procedure: Similarity Match Histogram

The Proposed Procedure: Expanding Tree
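The slide text does not spell out the expansion rule, so the sketch below shows only the baseline Evolving Tree behaviour described in [19], not necessarily the variant proposed in this talk: each leaf counts how often it is chosen as BMU and is split into child nodes once the count reaches a threshold. The `Node` class, the `record_hit` helper and the values of `split_threshold` and `n_children` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    # Hypothetical node type with a BMU hit counter.
    weights: List[float]
    hits: int = 0
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self):
        return not self.children


def record_hit(leaf, split_threshold=3, n_children=2):
    # Count one BMU hit on a leaf; once the threshold is reached, the leaf
    # becomes a trunk node with children that start from its own weights.
    leaf.hits += 1
    if leaf.hits >= split_threshold:
        leaf.children = [Node(list(leaf.weights)) for _ in range(n_children)]
        return True
    return False


leaf = Node([0.3, 0.7])
split_events = [record_hit(leaf) for _ in range(3)]
```

After the third hit the node is no longer a leaf, and later inputs that reach it continue the BMU search among its new children.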

Experimental Results: Observation
[Figure: resulting tree; legend: root node, trunk node, leaf node]

Experimental Results: Complexity Control
[Figure: plot with axes labeled "Time (s)" and "Label for articles"]
[Figure: number of clusters, tree size and tree depth]

Concluding Remarks
– With the proposed approach, articles from ENCON 2008 could be clustered and visualized as a tree structure.
– In short, the proposed approach constitutes a new decision support tool for conference organizers.
– The proposed procedure could also be applied to a larger number of articles, with an expected increase in computational complexity.

Future Works

Thank you for your attention