Published by Lesley Chase. Modified over 9 years ago.
1
Intelligent Database Systems Lab
An Efficient Concept-Based Mining Model for Enhancing Text Clustering
Presenter: JHOU, YU-LIANG
Authors: Shady Shehata, Fakhri Karray, Mohamed S. Kamel, Fellow, IEEE (2012)
2
Outline
- Motivation
- Objectives
- Methodology
- Evaluation
- Conclusions
- Comments
3
Motivation
In text mining, term frequency is computed to gauge the importance of a term in a document. However, two terms can have the same frequency in a document while one contributes more to the meaning of its sentences than the other.
4
Objectives
Use the concept-based mining model for text clustering in order to improve clustering quality.
5
Methodology: Concept-Based Mining Model
6
Methodology: Concept-Based Mining Model
Example: a concept c appears in two sentences of document d, the first and the second. It appears five times in the verb argument structures of the first sentence s1 and three times in the verb argument structures of the second sentence s2.
Answer: ctf value = (5 + 3) / 2 = 4
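The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the averaging step shown on the slide; the function name `document_ctf` is hypothetical and does not come from the paper.

```python
def document_ctf(sentence_ctfs):
    """Average the per-sentence ctf values of one concept.

    `sentence_ctfs` holds one count for each sentence of the document
    that contains the concept (assumption: sentences without the
    concept are left out of the list).
    """
    if not sentence_ctfs:
        raise ValueError("the concept must occur in at least one sentence")
    return sum(sentence_ctfs) / len(sentence_ctfs)


# Concept c: five occurrences in s1, three in s2.
print(document_ctf([5, 3]))  # 4.0
```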
7
Methodology: Corpus-Based Concept Analysis Algorithm
8
Methodology: Example of Conceptual Term Frequency
[ARG0 Texas and Australia researchers] have [TARGET created] [ARG1 industry-ready sheets of materials made from nanotubes that could lead to the development of artificial muscles].
[ARG1 materials] [TARGET made] [ARG2 from nanotubes that could lead to the development of artificial muscles].
[ARG1 nanotubes] [R-ARG1 that] [ARGM-MOD could] [TARGET lead] [ARG2 to the development of artificial muscles].
9
Methodology: Example of Conceptual Term Frequency
1. First verb argument structure, for the verb created:
   - [ARG0 Texas and Australia researchers]
   - [TARGET created]
   - [ARG1 industry-ready sheets of materials made from nanotubes that could lead to the development of artificial muscles]
2. Second verb argument structure, for the verb made:
   - [ARG1 materials]
   - [TARGET made]
   - [ARG2 from nanotubes that could lead to the development of artificial muscles]
3. Third verb argument structure, for the verb lead:
   - [ARG1 nanotubes]
   - [R-ARG1 that]
   - [ARGM-MOD could]
   - [TARGET lead]
   - [ARG2 to the development of artificial muscles]
10
Methodology: Example of Conceptual Term Frequency
1. Concepts in the first verb argument structure, for the verb created:
   - Texas Australia researchers
   - created
   - industry-ready sheets materials nanotubes lead development artificial muscles
2. Concepts in the second verb argument structure, for the verb made:
   - materials
   - nanotubes lead development artificial muscles
3. Concepts in the third verb argument structure, for the verb lead:
   - nanotubes
   - lead
   - development artificial muscles
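As a rough sketch, the sentence-level ctf of a concept can be obtained from the three concept lists above by counting the verb argument structures in which the concept occurs. The counting rule used here (at most one count per structure, matching a concept as a contiguous run of words) is an assumption made for illustration, and the names `structures`, `contains`, and `sentence_ctf` are hypothetical; the paper's exact procedure may differ.

```python
# Concepts per verb argument structure, copied from the slide.
structures = {
    "created": [
        "Texas Australia researchers",
        "created",
        "industry-ready sheets materials nanotubes lead development artificial muscles",
    ],
    "made": [
        "materials",
        "nanotubes lead development artificial muscles",
    ],
    "lead": [
        "nanotubes",
        "lead",
        "development artificial muscles",
    ],
}


def contains(haystack, needle):
    """Return True if token list `needle` occurs contiguously in `haystack`."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def sentence_ctf(concept, structures):
    """Count the verb argument structures in which `concept` occurs.

    Assumption: a structure counts once if any of its arguments
    contains the concept's words as a contiguous sequence.
    """
    needle = concept.split()
    return sum(
        any(contains(argument.split(), needle) for argument in arguments)
        for arguments in structures.values()
    )


for concept in ["nanotubes", "materials", "created"]:
    print(concept, sentence_ctf(concept, structures))
```

Under these assumptions, "nanotubes" is counted in all three structures, "materials" in two, and "created" in one, which reflects the slide's point that concepts shared by more verb argument structures contribute more to the sentence's meaning.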
11
Methodology: Example of Conceptual Term Frequency
12
Methodology: Concept-Based Similarity Measure
13
Experimental Result
14
Experimental Result
15
Experimental Result
16
Experimental Result
17
Conclusions
The new approach enhances text clustering quality.
18
Comments
Advantages
- Improves text clustering quality.
Applications
- Concept-based mining model
- Conceptual term frequency