Intelligent Database Systems Lab N.Y.U.S.T. I. M. Unsupervised word sense disambiguation for Korean through the acyclic weighted digraph using corpus and.

Slides:



Advertisements
Similar presentations
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A 24-h forecast of solar irradiance using artificial neural.
Advertisements

Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Validating Transliteration Hypotheses Using the Web: Web.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A novel document similarity measure based on earth mover’s.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Discovering Leaders from Community Actions Presenter : Wu, Jia-Hao Authors : Amit Goyal, Francesco Bonchi,
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Fast exact k nearest neighbors search using an orthogonal search tree Presenter : Chun-Ping Wu Authors.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Text classification based on multi-word with support vector.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Unsupervised pattern recognition models for mixed feature-type.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology U*F clustering : a new performant “ clustering-mining ”
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Human eye sclera detection and tracking using a modified.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Probabilistic Model for Definitional Question Answering.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Comparison of neural network models with ARIMA and regression models for prediction of Houston's daily.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Mining Positive and Negative Patterns for Relevance Feature.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology On Data Labeling for Clustering Categorical Data Hung-Leng.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Presenter : Chien Shing Chen Author: Wei-Hao.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extraction Presenter : Jiang-Shan.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Topology Preservation in Self-Organizing Feature Maps: Exact.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. A quantitative stock prediction system based on financial news Presenter : Chun-Jung Shih Authors :Robert.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. A semantic similarity metric combining features and intrinsic information content Presenter: Chun-Ping.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Automatic Recommendations for E-Learning Personalization.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. An IPC-based vector space model for patent retrieval Presenter: Jun-Yi Wu Authors: Yen-Liang Chen, Yu-Ting.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Concept similarity in Formal Concept Analysis-An information.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 AC-ViSOM: Hybridising the Modified Adaptive Coordinate.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A Plagiarism Detection Technique for Java Program Using.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A k-mean clustering algorithm for mixed numeric and categorical.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. TurSOM: A Turing Inspired Self-organizing Map Presenter: Tsai Tzung Ruei Authors: Derek Beaton, Iren.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Word sense disambiguation of WordNet glosses Presenter: Chun-Ping Wu Author: Dan Moldovan, Adrian Novischi.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 An Adaptation of the Vector-Space Model for Ontology-Based.
國立雲林科技大學 National Yunlin University of Science and Technology Self-organizing map learning nonlinearly embedded manifoldsmanifolds Author :Timo Simila.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 2007.SIGIR.8 New Event Detection Based on Indexing-tree.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology SEP/COP: An efficient method to find the best partition.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Utilizing Marginal Net Utility for Recommendation in E-commerce.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Fraud detection in online consumer reviews Presenter: Tsai Tzung Ruei Authors: Nan Hu, Ling Liu, Vallabh.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Extreme Visualization: Squeezing a Billion Records into a Million Pixels Presenter : Jiang-Shan Wang.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Extending the Growing Hierarchal SOM for Clustering Documents.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Iterative Translation Disambiguation for Cross-Language.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Psychiatric document retrieval using a discourse-aware model Presenter : Wu, Jia-Hao Authors : Liang-Chih.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Regularization in Matrix Relevance Learning Petra Schneider,
Intelligent Database Systems Lab N.Y.U.S.T. I. M. 1 Visualization of multi-algorithm clustering for better economic decisions - The case of car pricing.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Information Loss of the Mahalanobis Distance in High Dimensions-
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Mining massive document collections by the WEBSOM method Presenter : Yu-hui Huang Authors :Krista Lagus,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Multiclass boosting with repartitioning Graduate : Chen,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 An initialization method to simultaneously find initial.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology O( ㏒ 2 M) Self-Organizing Map Algorithm Without Learning.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Improving the performance of personal name disambiguation.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Region-based image retrieval using integrated color, shape,
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Validity index for clusters of different sizes and densities Presenter: Jun-Yi Wu Authors: Krista Rizman.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A personal route prediction system base on trajectory.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A new data clustering approach- Generalized cellular automata.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. 1 Mining concept maps from news stories for measuring civic scientific literacy in media Presenter :
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Direct mining of discriminative patterns for classifying.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Growing Mechanisms and Cluster Identification with TurSOM.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Self Organizing Maps and Bit Signature: a study applied.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Modeling Semantic Similarities in Multiple Maps Presenter.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Towards comprehensive support for organizational mining Presenter : Yu-hui Huang Authors : Minseok Song,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author: Wei Xu,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A survey of kernel and spectral methods for clustering.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Providing Justifications in Recommender Systems Presenter.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Predicting corporate bankruptcy using a self-organizing map: An empirical study to improve the forecasting.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Text Classification Improved through Multigram Models.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Growing Hierarchical Tree SOM: An unsupervised neural.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Community self-Organizing Map and its Application to Data Extraction Presenter: Chun-Ping Wu Authors:
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Dual clustering : integrating data clustering over optimization.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Key Blog Distillation: Ranking Aggregates Presenter : Yu-hui Huang Authors :Craig Macdonald, Iadh Ounis.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Text Classification, Business Intelligence, and Interactivity:
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Visualizing social network concepts Presenter : Chun-Ping Wu Authors :Bin Zhu, Stephanie Watts, Hsinchun.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Chun Kai Chen Author : Andrew.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Named Entity Disambiguation by Leveraging Wikipedia Semantic Knowledge Presenter : Jiang-Shan Wang Authors.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Enhancing Text Clustering by Leveraging Wikipedia Semantics.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A support system for predicting eBay end prices Presenter.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 f-information measures in medical image registration Presenter.
Presentation transcript:

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Unsupervised word sense disambiguation for Korean through the acyclic weighted digraph using corpus and dictionary Presenter: Chun-Ping Wu Authors: Yeohoon Yoon, Choong-Nyoung Seon, Songwook Lee, Jungynu Seo IPM 2007 國立雲林科技大學 National Yunlin University of Science and Technology

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Outline Motivation Objective Methodology Experiments Conclusion Comments 2

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Motivation The Word Sense Disambiguation is a common problem in natural language processing. Traditional approaches only consider the co-occurrence probability alone. 3 Sample: I deposit some money in the bank. Options: bank = 銀行? bank = 堤 ; 岸? bank = ( 一 ) 排; ( 一 ) 組

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Objective To construct a WSD system, which can be easily implemented by learning all polysemous words at once, while covering all polysemous words which are listed in MRD. To consider relation between each sense of context words and the sense of the target word. 4 Sample: I deposit some money In the bank. Ans: bank = 銀行

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Learning step  Similarity matrix  Word vector  Vector representations of sense definitions in MRD Disambiguation step  The definition of acyclic weighted digraph.  Selecting context words  Constructing the acyclic weighted digraph  Searching the optimal path on the acyclic weighted digraph 5

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Learning step  Similarity matrix  Word vector  Vector representations of sense definitions in MRD 6

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Learning step  Similarity matrix  Word vector  Vector representations of sense definitions in MRD. 7

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Learning step  Similarity matrix  Word vector  Vector representations of sense definitions in MRD 8

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Disambiguation step  The definition of acyclic weighted digraph.  Selecting context words  Constructing the acyclic weighted digraph  Searching the optimal path on the acyclic weighted digraph 9

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Disambiguation step  The definition of acyclic weighted digraph.  Selecting context words  Constructing the acyclic weighted digraph  Searching the optimal path on the acyclic weighted digraph 10

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Disambiguation step  The definition of acyclic weighted digraph.  Selecting context words  Constructing the acyclic weighted digraph  Searching the optimal path on the acyclic weighted digraph 11

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Methodology Disambiguation step  The definition of acyclic weighted digraph.  Selecting context words  Constructing the acyclic weighted digraph  Searching the optimal path on the acyclic weighted digraph 12

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Experiments System results 13

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Experiments Experiment on English  The accuracy of the system is 30.7% on average.  The result is very low; there are some reasons as follows.  Context words are not appropriate although context words are very important in that they decide which sense of the target word might be the best.  Mapping English senses to Korean for using English-Korean dictionary leads to some loss of information.  The errors of the stemming process disturbed us to search the right root of the verb in the MRD. 14

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Conclusion 15 To consider the relationship between each sense of context words and the sense of the target word By using Viterbi algorithm to reduce computational complexity. The system showed bad results on English (30.7), but it resulted in suitable performances, 76.4% by accuracy, over the semantically ambiguous Korean words. To apply this method to other languages by studying language characteristics.

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Comments 16 Advantage  To consider the relationship between each sense of context words and the sense of the target word.  By using Viterbi algorithm to reduce computational complexity. Drawback  The performance of this system is better in Korean. Application  Word Sense Disambiguation