Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Chun Kai Chen Author : Qing.

Presentation transcript:

Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Advisor: Dr. Hsu
Graduate: Chun Kai Chen
Authors: Qing Ma, Kyoko Kanzaki, Yujie Zhang, Masaki Murata, Hitoshi Isahara
Self-organizing semantic maps and its application to word alignment in Japanese-Chinese parallel corpora
Neural Networks 17 (2004) 1241–1253

Outline
- Motivation
- Objective
- Introduction
- Self-organizing monolingual semantic maps
- Experimental Results
- Conclusions
- Personal Opinion

Motivation
- A number of corpus-based statistical approaches have been used to compute word similarity
- With these approaches it is difficult to recognize the relationships between groups of words, or between the words within a group

Objective
- We need a technique that can map words from a very large lexicon into a small semantic space
- The result is a visible representation in which words with similar meanings are placed at the same or neighboring points, so that the distance between points represents the semantic similarity of the words
- Semantic maps can be constructed automatically by self-organization

Introduction
- Presents a method of self-organizing monolingual semantic maps for Chinese and Japanese, using a SOM for a specific purpose
- The purpose is to construct semantic maps of nouns from the point of view of their adnominal constituents
- The method is extended to the construction of Japanese–Chinese bilingual semantic maps
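For readers unfamiliar with the SOM mentioned above, here is a minimal sketch of a generic textbook SOM in plain Python. It is not the authors' configuration: the grid size, the linear decay of the learning rate and radius, the Gaussian neighborhood, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(grid, v):
    """Return (row, col) of the unit whose weights are closest to v."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, v))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM; grid[r][c] is a weight vector.

    Assumed schedule: learning rate and neighborhood radius decay
    linearly over the epochs (a common choice, not taken from the slides).
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        # Present the inputs in a fresh random order each epoch.
        for v in rng.sample(vectors, len(vectors)):
            br, bc = best_matching_unit(grid, v)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood centered on the winning unit.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * radius * radius))
                    w = grid[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (v[k] - w[k])
    return grid
```

After training, calling `best_matching_unit(grid, V)` for each word's coding vector gives the word's position on the map, and words whose vectors are similar tend to land on the same or neighboring units.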

Self-organizing monolingual semantic maps
- Data coding
  - Baseline method
  - Frequency term-weighting method
  - TFIDF term-weighting method
- d_ij is the word similarity

Data coding
- Word w_i can be defined by a set of its co-occurring words (formula not preserved in the transcript)
- V(w_i) is the input to the SOM
- Only reflects the relationships between a pair of words

Data coding method
- Baseline method
  - d_ij: word similarity
  - a_i and a_j: the numbers of co-occurring words of w_i and w_j
  - c_ij: the number of co-occurring words that w_i and w_j have in common
- Frequency term-weighting method
- TFIDF term-weighting method
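The baseline formula itself is not preserved in the transcript, so the sketch below is only one plausible reading. It assumes a Jaccard-style distance built from the three named quantities, d_ij = ((a_i - c_ij) + (a_j - c_ij)) / (a_i + a_j - c_ij), with d_ii = 0. Under this assumed form a smaller d_ij means more similar words, even though the slide calls d_ij a "word similarity". The function names and the toy English words are hypothetical.

```python
def baseline_distance(cooc_i, cooc_j):
    """Assumed baseline measure between two words.

    cooc_i, cooc_j: sets of co-occurring words of w_i and w_j.
    Returns 0.0 for identical sets and 1.0 for disjoint sets.
    """
    a_i, a_j = len(cooc_i), len(cooc_j)
    c_ij = len(cooc_i & cooc_j)      # co-occurring words in common
    union = a_i + a_j - c_ij
    if union == 0:                   # both sets empty
        return 0.0
    return ((a_i - c_ij) + (a_j - c_ij)) / union


def baseline_vectors(cooc):
    """Build V(w_i) = [d_i1, ..., d_in] for every word (assumed layout)."""
    words = sorted(cooc)
    return {
        wi: [0.0 if wi == wj else baseline_distance(cooc[wi], cooc[wj])
             for wj in words]
        for wi in words
    }


# Hypothetical toy data: nouns with the modifiers they co-occur with.
cooc = {
    "cat": {"small", "cute", "loyal"},
    "dog": {"big", "small", "loyal"},
    "idea": {"new", "good"},
}
vectors = baseline_vectors(cooc)
print(vectors["dog"])  # order of components: cat, dog, idea
```

Each resulting vector could then be fed to a SOM as the input for its word.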

Table 1. Comparative results for various coding methods and clustering

Evaluation methods
- Numerical evaluation
  - precision
  - recall
  - F-measure
- Intuitive evaluation
  - our 'common sense'
- Comparison with other methods
  - multivariate statistical analyses
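The three numerical measures can be sketched with their standard definitions. How the authors count a word as correctly placed on the map is not stated in the transcript, so the set-based framing and the function name `precision_recall_f` are assumptions.

```python
def precision_recall_f(predicted, gold):
    """Standard precision, recall, and balanced F-measure over two sets.

    predicted: items the method placed in a category (assumed framing).
    gold: items that truly belong to that category.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: 4 words placed in an area, 3 truly belong, 2 overlap.
p, r, f = precision_recall_f({"a", "b", "c", "d"}, {"a", "b", "e"})
print(p, r, f)
```

In this toy case precision is 2/4, recall is 2/3, and the F-measure is their harmonic mean, 4/7.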

Experimental Results

TFIDF comparison with PCA
Fig. 1. Chinese semantic map based on TFIDF term-weighted coding
Fig. 2. Chinese semantic map using principal component analysis

Table 2. Clustering results with TFIDF term-weighted coding
The underlined words are those classified into incorrect areas.

Semantic map comparison with PCA
Fig. 3. Japanese semantic map based on the TFIDF term-weighted coding method
Fig. 4. Japanese semantic map using principal component analysis

Self-organizing bilingual semantic maps (1/2)
- Given a translation pair of sentences like:
  (Japanese) keiei toppu ga tei seichou jidai teichaku wo jikkan shite iru koto wo ukagawa seta.
  (Chinese) youci keyi kanchu, zuigao jingyingzhe shengan jingji ren tingliu zai dishu zengzhang shidai.
  (English) We can see that upper management has realized that the economy is fixed in an era of slow growth.
- Each Japanese word can therefore be automatically aligned to a Chinese word on the bilingual semantic map by measuring the distance between them
- If the Chinese word keyi (can) is closest to the Japanese word seta (can), then the Japanese word seta (can) is regarded as being aligned to the Chinese word keyi (can)
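The alignment rule described above amounts to a nearest-neighbor lookup on the map. The sketch below assumes each word already has a (row, column) position on a trained bilingual map and uses Euclidean distance between positions. The coordinates are invented for illustration and the function name `align_words` is hypothetical.

```python
import math


def align_words(japanese_pos, chinese_pos):
    """Align each Japanese word to the Chinese word closest on the map.

    japanese_pos, chinese_pos: dicts mapping a word to its (row, col)
    map position. Ties are broken by dict iteration order.
    """
    alignment = {}
    for j_word, j_xy in japanese_pos.items():
        alignment[j_word] = min(
            chinese_pos,
            key=lambda c_word: math.dist(j_xy, chinese_pos[c_word]),
        )
    return alignment


# Invented map positions, not results from the paper.
japanese_pos = {"seta": (2, 3), "jidai": (7, 1)}
chinese_pos = {"keyi": (2, 4), "shidai": (7, 2), "jingji": (5, 5)}
print(align_words(japanese_pos, chinese_pos))
```

With these invented positions, seta is aligned to keyi and jidai to shidai, mirroring the example on the slide.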

Self-organizing bilingual semantic maps (2/2)
- A small-scale (10 translation pairs) experimental comparison with the baseline method
- Comparison with hierarchical clustering and multivariate statistical analysis

Data coding (1/2)
(Japanese) keiei toppu ga tei seichou jidai teichaku wo jikkan shite iru koto wo ukagawa seta.
(Chinese) youci keyi kanchu, zuigao jingyingzhe shengan jingji ren tingliu zai dishu zengzhang shidai.
(English) We can see that upper management has realized that the economy is fixed in an era of slow growth.
- J_i (i = 1, ..., m) are the Japanese words forming the Japanese sentence
- C_i (i = 1, ..., n) are the Chinese words forming the translated Chinese sentence

Data coding (2/2)
- (symbol not preserved) is a co-occurring word of J_i
- (symbol not preserved) is the normalized co-occurrence frequency
- (symbol not preserved) is a co-occurring word of one or several of J_j1, ..., J_j,n_i
- (symbol not preserved) is the normalized co-occurrence frequency

Semantic map comparison with PCA

Semantic map comparison with Baseline
Table 3. Word alignment result obtained from semantic map
Table 4. Baseline word alignment results

Conclusions and Future Work
- Proposed a method of self-organizing monolingual semantic maps for Japanese and Chinese
- Experimental results showed that these maps were generally consistent with our intuition
- Comparison demonstrated that the hierarchical clustering technique is inferior to SOM in terms of classifying ability
- Furthermore, multivariate statistical analyses such as principal component analysis and factor analysis gave worse results

Conclusions and Future Work
- An extension to the automatic construction of bilingual semantic maps of Japanese and Chinese
- Develop an automatic method of transforming both Japanese and Chinese words

Personal Opinion
- …..