Topic Identification in Forums
Evaluation Strategy
IA Seminar Discussion
Ahmad Ammari
School of Computing, University of Leeds
2 Identify Discussion Forum Topics Service
- Topic Identifier
- Lucene Filtering
- Hadoop Map/Reduce
- Topic Weighting
- Topic Sorting
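The components listed on this slide can be read as a pipeline: filter the text, count terms, weight them, sort them. The following is a minimal sketch of that reading, not the service's actual implementation. It assumes the stages run in the order listed, uses plain Python stand-ins for Lucene and Hadoop, and weights each term by its relative frequency. The stop-word list and all function names are hypothetical.

```python
from collections import Counter
from itertools import chain

# Hypothetical stop-word list; the real service filters with Lucene.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def filter_tokens(post):
    """Stand-in for Lucene filtering: lowercase, strip punctuation, drop stop words."""
    tokens = (t.strip(".,;:!?()\"'").lower() for t in post.split())
    return [t for t in tokens if t.isalpha() and t not in STOP_WORDS]


def map_post(post):
    """Map step: emit a (term, 1) pair for every filtered token in one post."""
    return [(term, 1) for term in filter_tokens(post)]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per term."""
    counts = Counter()
    for term, n in pairs:
        counts[term] += n
    return counts


def identify_topics(posts, top_n=5):
    """Weight terms by relative frequency and return the top_n, heaviest first."""
    counts = reduce_counts(chain.from_iterable(map_post(p) for p in posts))
    total = sum(counts.values())
    if total == 0:
        return []
    weights = {term: n / total for term, n in counts.items()}
    # Sort by descending weight; ties are broken alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    posts = ["Clustering of forum topics", "Topics in the forum", "Forum clustering"]
    for term, weight in identify_topics(posts, top_n=3):
        print(f"{term}\t{weight:.2f}")
```

The sorted (term, weight) pairs are the kind of input a topic cloud view would need, with the weight driving the display size of each term.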
3 Selected Discussion Forums
4 Envisaged Topic Clouds View in Dicode Forums
5 Variations of the original approach
We aim to implement the service in several variations (approaches / algorithms) to improve the identified topics.
Variations include:
- Identified topics based on term frequency (this is the current version of the service)
- A different discussion clustering algorithm (K-Medoids)
- Dimension reduction before clustering (SVD)
- Topic modelling with Latent Dirichlet Allocation (LDA)
- Semantic annotation of discussions
- Semantic augmentation of the identified topics
6 Discussion: Evaluation Strategy
We aim to set up an evaluation strategy to assess the business value of the service. To do so, we want to:
- Broaden the discussions (we may need to change the forum)
- Run all variations of the service on the larger set of discussions and derive the topic clouds for each variation
- Determine the evaluation criteria to test/measure
- Design a set of specific tasks/problems for the users to carry out using the forum discussions and the derived topic clouds
- Split the users randomly into groups
- Give each group the discussion forum, the topic clouds derived by one variation of the service, and the tasks to do
- Compare the groups based on the evaluation criteria
Your feedback on the following is invaluable:
- What evaluation criteria to test (please give examples)
- For each evaluation criterion, what tasks/problems to give users (please give examples)
- How this evaluation strategy may be further improved
- Any other recommended evaluation strategies
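The random split and the group comparison in the strategy above could be sketched as follows. This is an illustration under stated assumptions only: the variation labels, the function names, and the use of a single numeric score per user for one criterion are hypothetical, since the evaluation criteria are still to be decided.

```python
import random
from statistics import mean


def assign_groups(users, variations, seed=None):
    """Shuffle users and deal them round-robin, so group sizes differ by at most one."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    shuffled = list(users)
    rng.shuffle(shuffled)
    groups = {v: [] for v in variations}
    for i, user in enumerate(shuffled):
        groups[variations[i % len(variations)]].append(user)
    return groups


def compare_groups(groups, scores):
    """Mean score per variation for one criterion; users without a score are skipped."""
    result = {}
    for variation, members in groups.items():
        values = [scores[u] for u in members if u in scores]
        result[variation] = mean(values) if values else None
    return result


if __name__ == "__main__":
    users = [f"user{i}" for i in range(10)]
    # Hypothetical labels for three of the service variations.
    variations = ["term_frequency", "k_medoids", "lda"]
    groups = assign_groups(users, variations, seed=42)
    for variation, members in groups.items():
        print(variation, members)
```

Each group would then work through the same tasks with the topic clouds of its assigned variation, and `compare_groups` would be called once per criterion on the collected scores. With small groups, a difference in means alone may not be conclusive, so an appropriate significance test would be a sensible addition.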