WEB FORUM MINING BASED ON USER SATISFACTION
By: Suresh Pokharel, Information and Communications Technologies

1 WEB FORUM MINING BASED ON USER SATISFACTION
By: Suresh Pokharel
Information and Communications Technologies, Asian Institute of Technology
Committee: Dr. Sumanta Guha (Chairperson), Prof. Phan Minh Dung, Assoc. Prof. Tapio J. Erke
May 2010

2 Agenda
• Introduction
• Problem Statement
• Objectives
• Methodology
• Implementation
• Results and Discussion
• Conclusion and Future Work
• Demonstration

3 Introduction
• Internet forum or message board
• Online discussion site
• Asynchronous
• People participating in an Internet forum may cultivate social bonds, and interest groups for a topic may form from the discussions.

4 Introduction
Figure 1: Organization of Threads

5 Introduction
Table 1: An example of a thread
Title: Software for Ubuntu.

Post No. | Post | User | Category
1 | I am really new to Linux, where can I find software for Ubuntu? | avacomputers | Question
2 | Applications>Add/remove http://www.getdeb.net/ If you are using Ubuntu Firefox, you can also use http://allmyapps.com/ | danielrmt | Answer
3 | http://www.getdeb.net | theozzlives | Answer
4 | something else I was reading about is a port from OzOs (http://www.cafelinux.org/OzOs/) called apt:foo | devildoc5 | Answer
5 | or the easy way >.> CLICK SYSTEM > PREFERENCES > SYNAPTIC PACKAGE MANAGER there you can search all kinds of games, software! anything you like, thousands to choose from! They download and install easily onto your system | Codix121 | Answer
6 | Thanks GUys. | avacomputers | Answer

Slide callouts: Questioner, Repliers, Questioner Post

7 Problem Statement
• Oops, there are lots of forums...
• Which forum may have the solution?
• I don't want to test all forums...

9 Objectives
• To categorize a post as a question post or an answer post.
• To classify a thread as answered or unanswered based on the questioner's satisfaction and forum features.
• To predict a solution post based on the interaction and satisfaction of the questioner.

11 Methodology: Framework of Study
Figure 4: Framework of Study

12 Methodology: Sentence Classification Example
Figure 5: Sentence Classification

13 Methodology: Sentence Classification
Labeled Sequential Patterns (LSPs)
• An LSP p has the form LHS → c.
• LHS is a sequence <a_1, ..., a_m>; each a_i is called an "item".
• c is a class label (question / non-question).
• A sequence A contains a sequence B if B is a subsequence of A.
• An LSP p_1 is contained by p_2 if the sequence p_1.LHS is contained by p_2.LHS and p_1.c = p_2.c.
Example: t_1 = (<…>, Q), t_2 = (<…>, Q), t_3 = (<…>, NQ)
1) LSP p_1 = (<…>, Q) is contained in t_1 and t_2: sup(p_1) = 2/3 = 66.7%, conf(p_1) = (2/3)/(2/3) = 100%
2) LSP p_2 = (<…>, Q): sup(p_2) = 2/3 = 66.7%, conf(p_2) = (2/3)/(3/3) = 66.7%
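
The support and confidence arithmetic on this slide can be reproduced with a short Python sketch. The item sequences below are hypothetical, chosen only so that the counts match the slide's figures (2/3 and 3/3); the function names are likewise illustrative and not taken from the original system.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support_and_confidence(lhs, label, tuples):
    """Compute sup and conf of the LSP `lhs -> label` over labeled sequences."""
    containing_lhs = [(seq, c) for seq, c in tuples if is_subsequence(lhs, seq)]
    matching = [t for t in containing_lhs if t[1] == label]
    support = len(matching) / len(tuples)          # tuples containing the LSP
    lhs_support = len(containing_lhs) / len(tuples)  # tuples containing LHS only
    confidence = support / lhs_support if lhs_support else 0.0
    return support, confidence


# Hypothetical database of labeled sequences (an assumption for illustration).
tuples = [
    (["a", "d", "e", "f"], "Q"),
    (["a", "f", "e", "f"], "Q"),
    (["d", "a", "f"], "NQ"),
]

sup1, conf1 = support_and_confidence(["a", "e", "f"], "Q", tuples)
sup2, conf2 = support_and_confidence(["a", "f"], "Q", tuples)
print(f"p1: sup={sup1:.3f} conf={conf1:.3f}")
print(f"p2: sup={sup2:.3f} conf={conf2:.3f}")
```

Confidence is computed here as sup(p) divided by the support of p.LHS regardless of class, which is how the slide's (2/3)/(3/3) ratio reads.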

14 Methodology: Sentence Classification
Mining LSPs
• Word length of sequence: 4
• Minimum support is set at 0.01% and minimum confidence at 95%
Converting to features
• LSP
• 5W1H word
• Auxiliary verb
• Question mark
The corresponding feature is set to 1 if a sentence includes an LSP, includes a question mark, or starts with a 5W1H word or an auxiliary verb.
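
A minimal sketch of the feature conversion described on this slide, in Python. The word lists, the example LSP, and the word-only tokenization are assumptions made for illustration; the original system may mine patterns over different items (for example part-of-speech tags) and use different word lists.

```python
import re

# Assumed word lists; the slides do not give the exact lists used.
FIVE_W_ONE_H = {"what", "when", "where", "who", "why", "how"}
AUXILIARY_VERBS = {"is", "are", "was", "were", "do", "does", "did",
                   "can", "could", "will", "would", "should", "has", "have", "may"}


def contains_pattern(pattern, tokens):
    """True if `pattern` is an in-order subsequence of `tokens`."""
    it = iter(tokens)
    return all(item in it for item in pattern)


def sentence_features(sentence, lsps):
    """Map a sentence to the four binary features named on the slide."""
    tokens = re.findall(r"[a-z']+|\?", sentence.lower())
    first = tokens[0] if tokens else ""
    return {
        "lsp": int(any(contains_pattern(p, tokens) for p in lsps)),
        "5w1h": int(first in FIVE_W_ONE_H),   # sentence starts with a 5W1H word
        "aux": int(first in AUXILIARY_VERBS), # sentence starts with an auxiliary verb
        "qm": int("?" in tokens),
    }


# Hypothetical mined pattern of length 4 (not from the original results).
example_lsps = [["where", "can", "i", "find"]]
features = sentence_features(
    "I am really new to Linux, where can I find software for Ubuntu?",
    example_lsps,
)
print(features)
```

The resulting dictionary would then be turned into a numeric vector for the classifier.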

15 Methodology: Thread Classification
Figure 6: Classification of Thread

16 Methodology: Questioner Post Classification
Derived features:
• Satisfied phrase
• Unsatisfied phrase

17 Methodology: Questioner Post Classification
Derived features:
• Question present
• Presence of more posts

18 Methodology: Questioner Post Classification
Derived features:
• Satisfied post length
• Presence of emoticons
  • Happy emoticon: mood of satisfaction
  • Unhappy emoticon: mood of dissatisfaction

19 Methodology: Questioner Post Classification
Derived features:
• Original post
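
The derived features for a questioner post can be sketched in Python as follows. The phrase lists, emoticon lists, and the exact definition of each feature (for example, treating "more post" as "further posts follow in the thread") are assumptions for illustration; the slides name the features but do not define them precisely.

```python
# Assumed cue lists; the original phrase and emoticon lists are not given.
SATISFIED_PHRASES = ("thanks", "thank you", "it works", "solved")
UNSATISFIED_PHRASES = ("still not working", "doesn't work", "didn't work", "no luck")
HAPPY_EMOTICONS = (":)", ":-)", ":D")
UNHAPPY_EMOTICONS = (":(", ":-(")


def questioner_post_features(post_text, post_index, total_posts):
    """Derive the slide's features for one post written by the questioner.

    post_index is the 0-based position of the post in its thread.
    """
    text = post_text.lower()
    return {
        "satisfied_phrase": int(any(p in text for p in SATISFIED_PHRASES)),
        "unsatisfied_phrase": int(any(p in text for p in UNSATISFIED_PHRASES)),
        "question_present": int("?" in text),
        "more_post": int(post_index < total_posts - 1),  # assumed meaning
        "word_count": len(text.split()),
        "happy_emoticon": int(any(e in post_text for e in HAPPY_EMOTICONS)),
        "unhappy_emoticon": int(any(e in post_text for e in UNHAPPY_EMOTICONS)),
        "original_post": int(post_index == 0),
    }


# Last post of a six-post thread, written by the questioner.
features = questioner_post_features("Thanks GUys.", post_index=5, total_posts=6)
print(features)
```

A short closing post with a satisfied phrase, no question, and no further posts is the kind of signal these features are meant to capture.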

20 Methodology: Questioner Post Classification
Figure 7: Classification of Questioner Post

21 Methodology: Predict Solution Posts
Find solution post:
• Presence of quote
• Presence of user name

22 Methodology: Predict Solution Posts
• The solution post may lie between the questioner's posts.
Diagram labels: QP, R, R, SP; Solution Posts
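
One possible reading of the two solution-post slides is sketched below in Python. The selection rule (prefer a replier who is quoted or named in the questioner's satisfied post, otherwise fall back to the nearest preceding reply) is an assumption, as are the field names `user`, `text`, and `quoted_user`; the slides list the cues but not the exact procedure.

```python
def predict_solution_post(posts, satisfied_index):
    """Return the index of the predicted solution post, or None.

    posts: list of dicts with 'user', 'text' and optionally 'quoted_user'.
    satisfied_index: index of the questioner post classified as satisfied.
    """
    questioner = posts[0]["user"]
    satisfied = posts[satisfied_index]
    text = satisfied["text"].lower()

    # Candidates: replies by other users that precede the satisfied post.
    candidates = [i for i in range(1, satisfied_index)
                  if posts[i]["user"] != questioner]
    if not candidates:
        return None

    # Prefer the most recent reply whose author is quoted or named.
    for i in reversed(candidates):
        user = posts[i]["user"]
        if satisfied.get("quoted_user") == user or user.lower() in text:
            return i

    # Assumed fallback: the reply closest to the satisfied post.
    return candidates[-1]


# Hypothetical thread; post texts are shortened placeholders.
thread = [
    {"user": "avacomputers", "text": "Where can I find software for Ubuntu?"},
    {"user": "danielrmt", "text": "Applications > Add/remove"},
    {"user": "theozzlives", "text": "Try getdeb"},
    {"user": "devildoc5", "text": "There is also a port from OzOs"},
    {"user": "Codix121", "text": "Use Synaptic Package Manager"},
    {"user": "avacomputers", "text": "Thanks GUys."},
]
print(predict_solution_post(thread, satisfied_index=5))
```

When the closing post names or quotes a replier, that replier's post is returned instead of the nearest reply.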

23 Agenda
• Introduction
• Problem Statement
• Objectives
• Scope and Limitation
• Methodology
• Implementation
• Results and Discussion
• Conclusion and Future Work
• Demonstration

24 Implementation
Dataset
• Forum: Ubuntu Forums (http://ubuntuforums.org/)
• Sentence classification: datasets of 100, 200 and 300 from 3000 sentences
• Questioner post classification: 250 posts from 79 threads
• Manual evaluation: 100 threads, evaluated by two teams of 5 people each
Tools and Language
• POS tagging, tokenization, sentence detection: OpenNLP
• Classifier: Support Vector Machine (LibSVM, SMO)
• Model: the SVM classifier model is trained using LibSVM
• Language: Java

26 Result and Discussion: Sentence Classification Comparison
Table 3: Accuracy of Sentence Classification by using LSP+ by Class

Features | Accuracy | Recall | Precision | F-Measure
WH word (5W1H) | 0.59 | 0.59 | 0.77 | 0.67
Question Mark (QM) | 0.86 | 0.86 | 0.88 | 0.87
Auxiliary Verb (Aux) | 0.66 | 0.66 | 0.78 | 0.72
5W1H + QM + Aux | 0.88 | 0.88 | 0.89 | 0.88
Labeled Sequential Pattern (LSP) | 0.94 | 0.94 | 0.94 | 0.94
QM + Aux + LSP + 5W1H (LSP+) | 0.96 | 0.96 | 0.96 | 0.96

(Values spanning several columns on the slide are repeated in each column.)
• All results were obtained from 10-fold cross validation.
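
The reported F-measures are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The Python sketch below checks this for two rows; it assumes the balanced F1 definition, which the slides do not state explicitly. Because the formula is symmetric, the check does not depend on which of the two reported values is precision and which is recall.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Value pairs reported for two feature sets on the slide.
f_5w1h = round(f_measure(0.77, 0.59), 2)  # WH word (5W1H) row
f_qm = round(f_measure(0.88, 0.86), 2)    # Question Mark (QM) row
print(f_5w1h, f_qm)
```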

27 Result and Discussion: QP Classification Comparison
Table 5: Questioner Post Classification Comparison using Different Features

Features | Precision | Recall | F-Measure
Satisfied Words (SW) | 0.78 | 0.78 | 0.78
Unsatisfied Words (UW) | 0.72 | 0.72 | 0.72
Question (Ques) | 0.84 | 0.85 | 0.84
More Post (MP) | 0.83 | 0.83 | 0.83
Word Count (WC) | 0.70 | 0.58 | 0.63
Happy Emoticon (HE) | 0.62 | 0.55 | 0.58
Unhappy Emoticon (UE) | 0.76 | 0.56 | 0.64
Original Post (OP) | 0.83 | 0.76 | 0.79
Ques + MP | 0.84 | 0.84 | 0.84
Ques + OP | 0.86 | 0.86 | 0.86
MP + UE + OP | 0.88 | 0.88 | 0.88
Ques + MP + OP | 0.85 | 0.85 | 0.85
SW + UW + Ques | 0.83 | 0.82 | 0.83
SW + UW + Ques + MP + WC + HE + UE + OP | 0.91 | 0.91 | 0.91

(Values spanning several columns on the slide are repeated in each column.)

28 Result and Discussion: Comparison with Team's Evaluation
Table 6: Comparison of System Result with Manual Evaluation for thread classification

Performance | Accuracy | Recall | Precision | F-Measure
System with Team A | 0.84 | 0.79 | 0.82 | 0.80
System with Team B | 0.79 | 0.73 | 0.78 | 0.75
Average | 0.81 | 0.76 | 0.80 | 0.78

Table 7: System Accuracy for Prediction of Solution Posts

Accuracy | Recall | Precision | F-Measure
System Accuracy with Team A | 0.45 | 0.65 | 0.54
System Accuracy with Team B | 0.43 | 0.65 | 0.52
Average System Accuracy | 0.44 | 0.65 | 0.53

30 Conclusion and Future Work: Conclusion
• Answered threads in a web forum are found by tracing user satisfaction.
• Threads and sentences are classified by deriving different features.
• System performance increases when different features are combined.

31 Conclusion and Future Work: Future Work
• The approach can be used for query-based ranking of threads.
• It can be used for extracting answer sentences with better accuracy.
• System performance can be increased by incorporating semantics.
