Published by Cory Watts. Modified over 9 years ago.
1
Intelligent Database Systems Lab N.Y.U.S.T. I. M.
Mining knowledge from natural language texts using fuzzy associated concept mapping
Presenter: Wu, Jia-Hao
Authors: W.M. Wang, C.F. Cheung, W.B. Lee, S.K. Kwok
IPM (2008)
2
Outline
Motivation
Objective
Methodology
Experiments
Conclusion
Comments
3
Motivation
The amount of data of all kinds available electronically is increasing dramatically.
In enterprises, about 80-98% of all data consists of unstructured or semi-structured documents.
Knowledge presented in many documents has an informal, unstructured shape.
It has to be converted to a formal shape with precisely defined syntax and semantics (e.g. document annotations).
4
Objective
Extract the propositions in a text so as to construct a concept map automatically.
The technique, Fuzzy Association Concept Mapping (FACM), consists of a linguistic module and a recommendation module.
It provides a method for converting text into a form that can be easily processed by computer.
Users can convert scientific and short texts into a structured format.
It gives knowledge workers extra time to rethink their written text and to view their knowledge from another angle.
5
Objective (Cont.)
6
Methodology: FACM
The relations and concepts are generated from the document itself rather than retrieved from predefined ontologies.
FACM uses the syntactic structure of the sentences to find relations between the words.
Anaphora resolution, based on rule-based reasoning (RBR) and case-based reasoning (CBR), is applied to resolve ambiguities that arise during the syntactic analysis.
This makes anaphora resolution dynamic: it is continually improved.
7
Methodology: Architecture of FACM
Step 1. Input the sentence.
Step 2. Parse it with a POS tagger.
Step 3. Encode the case.
Step 4. Produce the solution.
8
Methodology: FACM's anaphora resolution
The similarity between the new case and old cases is calculated based on nearest neighbor matching.
[Equations (1) and (2)]
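Equations (1) and (2) are not reproduced in this transcript, so the following is only a minimal sketch of nearest neighbor case matching in general, not the authors' formula. It assumes a case is a dictionary of categorical features and that similarity is the weighted share of matching features. The feature names, weights, and stored solutions are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical case encoding: feature name -> categorical value.
Case = Dict[str, str]


def similarity(new_case: Case, old_case: Case, weights: Dict[str, float]) -> float:
    """Weighted share of features on which the two cases agree (0.0 to 1.0)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        weight
        for feature, weight in weights.items()
        if feature in new_case and new_case.get(feature) == old_case.get(feature)
    )
    return matched / total


def retrieve(
    new_case: Case,
    case_base: List[Tuple[Case, str]],
    weights: Dict[str, float],
) -> Tuple[str, float]:
    """Return the solution of the most similar stored case and its score."""
    best_case, best_solution = max(
        case_base, key=lambda entry: similarity(new_case, entry[0], weights)
    )
    return best_solution, similarity(new_case, best_case, weights)


# Illustrative data only; none of these values come from the slides.
weights = {"pronoun": 2.0, "number": 1.0, "prev_pos": 1.0}
case_base = [
    ({"pronoun": "it", "number": "singular", "prev_pos": "NN"},
     "nearest preceding singular noun"),
    ({"pronoun": "they", "number": "plural", "prev_pos": "NNS"},
     "nearest preceding plural noun"),
]
new_case = {"pronoun": "it", "number": "singular", "prev_pos": "VB"}
solution, score = retrieve(new_case, case_base, weights)
```

Adding each resolved case back to `case_base` would be one way to obtain the continually improving behaviour described for CBR.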
9
Methodology: Proposition recommendation
The normalized frequency of concept i and concept j co-existing in the same or adjacent sentence is calculated:
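The formula itself is missing from this transcript. As a rough illustration of the idea only, the sketch below counts how often two concepts appear in the same sentence and in adjacent sentences, then divides each count by the largest count observed. Normalizing by the maximum is an assumption, and the paper may normalize differently. The function names and sample sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Pair = FrozenSet[str]


def cooccurrence(sentences: List[Set[str]]) -> Tuple[Counter, Counter]:
    """Count concept pairs in the same sentence and in adjacent sentences."""
    same: Counter = Counter()
    adjacent: Counter = Counter()
    for concepts in sentences:
        for a, b in combinations(sorted(concepts), 2):
            same[frozenset((a, b))] += 1
    for first, second in zip(sentences, sentences[1:]):
        for a in first:
            for b in second:
                if a != b:
                    adjacent[frozenset((a, b))] += 1
    return same, adjacent


def normalize(counts: Counter) -> Dict[Pair, float]:
    """Scale counts into [0, 1] by the largest count (an assumed scheme)."""
    peak = max(counts.values(), default=0)
    if peak == 0:
        return {}
    return {pair: count / peak for pair, count in counts.items()}


# Each sentence is reduced to the set of concepts it mentions (toy data).
sentences = [{"fuzzy", "concept"}, {"concept", "map"}, {"fuzzy", "concept"}]
same_counts, adjacent_counts = cooccurrence(sentences)
same_norm = normalize(same_counts)
adj_norm = normalize(adjacent_counts)
```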
10
Methodology: The relationship between concepts
[Figure: panels (a), (b), (c)]
IF the normalized frequency of two concepts co-existing in the same sentence is High, THEN the relationship between the two concepts is High (0.7).
IF the normalized frequency of two concepts co-existing in the adjacent sentence is High, THEN the relationship between the two concepts is Medium (0.2).
The COG of fuzzy set A on the interval a1 to a2 with membership function u_A is given:
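The COG expression is not reproduced in this transcript. The standard centre-of-gravity definition is COG = ∫ x·u_A(x) dx / ∫ u_A(x) dx, with both integrals taken from a1 to a2. The sketch below evaluates that definition numerically with a midpoint rule. The triangular membership functions and their corner points are assumptions chosen for illustration, since the slides do not state the shape of the fuzzy sets.

```python
from typing import Callable


def centre_of_gravity(
    mu: Callable[[float], float], a1: float, a2: float, steps: int = 10_000
) -> float:
    """Approximate COG = integral(x * mu(x)) / integral(mu(x)) on [a1, a2]."""
    if a2 <= a1:
        raise ValueError("a2 must be greater than a1")
    width = (a2 - a1) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        x = a1 + (k + 0.5) * width  # midpoint of the k-th slice
        membership = mu(x)
        numerator += x * membership * width
        denominator += membership * width
    if denominator == 0:
        raise ValueError("membership is zero on the whole interval")
    return numerator / denominator


def triangle(left: float, peak: float, right: float) -> Callable[[float], float]:
    """Triangular membership function (an assumed shape, not from the slides)."""
    def mu(x: float) -> float:
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu


# Hypothetical output sets, peaked at the values quoted in the two rules.
high = triangle(0.4, 0.7, 1.0)
medium = triangle(0.0, 0.2, 1.0)
cog_high = centre_of_gravity(high, 0.0, 1.0)
cog_medium = centre_of_gravity(medium, 0.0, 1.0)
```

For a symmetric triangle the COG coincides with the peak. For the skewed `medium` set it shifts toward the longer side, to (0.0 + 0.2 + 1.0) / 3 = 0.4.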
11
Experiments: SCI abstracts and news from CNET
12
Experiments: Results of algorithm evaluation
13
Conclusion
FACM provides an interactive way for concept map builders to rethink their concept maps and to adapt and refine the suggestions for completing them.
A human-like construction of concept maps can be achieved.
The method is highly accurate when extracting concepts from scientific and short texts such as abstract databases, newsgroups, emails, and discussion forums.
Future work
The system should be evaluated on bigger collections with more candidate users.
The evaluation of the interactive process of the framework is also an essential element.
Qualitative methods may be used to evaluate the effectiveness of the recommendation process.
14
Comments
Advantage: a convenient method for mining knowledge.
Drawback: it is unclear how the equations are used to produce the concept map.
Application: analyzing abstracts.