Shuang Wu, REU-DIMACS, 2010. Mentor: James Abello.


1 Shuang Wu REU-DIMACS, 2010 Mentor: James Abello

2
• Project description
• Our research project
  Input: time data recorded from the ‘Name That Cluster’ web page.
  Output: statistical results on how participants’ behavior differs across the three interfaces.
• Collected statistics
• Conclusions

3
• For a pre-computed collection of search engine queries, users select one of three interfaces for each query: Textual, Graphical, or Hybrid.
• The evaluation process consists of exploring the clusters associated with each query, naming the corresponding clusters, and selecting cluster ratings (ClusterFitRatings and Name Ratings).
• ClusterFitRatings are on a scale from -1 to 4 and Name Ratings are on a scale from -1 to 2. Note: -1 means that the participant did not give a rating.
• The collected statistics are: Exploration Times, Naming Times, Cluster Rating Times, Name Rating Times, ClusterFitRatings, and Name Ratings.

4 The raw data collected online is:
• Userid
• QueryString: the evaluated query
• ClusterNum: the evaluated cluster in that query
• Name: the name/description/summary given to the cluster
• Timestamp: server date/time at which the evaluation was written to the database

5

6
• 440 clusters were evaluated in the Textual interface, 338 in the Graphical interface, and 378 in the Hybrid interface.
• The following analysis uses the Exploration time and the Evaluation time, where Evaluation time = Naming time + ClusterFitRating time + Name Rating time.
• Notation: Ex(T), Ex(G), Ex(H) denote Exploration time per interface; T, G, H denote Evaluation time per interface; NT(T), NT(G), NT(H) denote Naming time per interface.

7 Handling outliers. Note: we treated data points more than 3.5 standard deviations from the mean as outliers.
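
The 3.5-standard-deviation rule can be sketched as follows. This is only an illustration, not the code used in the project: the function name `drop_outliers`, the sample data, and the choice of the sample (n - 1) standard deviation are assumptions.

```python
from statistics import mean, stdev

def drop_outliers(times, k=3.5):
    """Keep only the values within k sample standard deviations of the mean."""
    if len(times) < 2:
        return list(times)      # a standard deviation needs at least two points
    m = mean(times)
    s = stdev(times)            # assumption: sample (n - 1) standard deviation
    if s == 0:
        return list(times)      # all values identical, nothing to drop
    return [t for t in times if abs(t - m) <= k * s]

# Hypothetical task times in seconds: 29 ordinary values and one extreme value.
sample = [10.0] * 29 + [1000.0]
cleaned = drop_outliers(sample)
print(len(sample), len(cleaned))
```

With a threshold as wide as 3.5 standard deviations, a single extreme value can only be flagged in a reasonably large sample, because the extreme value itself inflates the standard deviation.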

8
TWO-SAMPLE T-TEST
Tests for a difference in the means of two samples. Null hypothesis: there is no difference between the two means. Alternative hypothesis: the mean of the first sample is larger/smaller than the mean of the second sample. Reject the null hypothesis if the p-value is less than .05.
ANOVA F-TEST
Tests for a difference in means among three or more samples. Null hypothesis: all means are equal. Alternative hypothesis: at least one of the means is different. Reject the null hypothesis if the p-value is less than .05.
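
A minimal sketch of the two test statistics, using only the Python standard library. The slides do not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the function names and data are hypothetical. The sketch stops at the statistics; the p-value compared against .05 would come from the corresponding t or F distribution, typically via a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    se2 = variance(a) / len(a) + variance(b) / len(b)   # squared standard error
    return (mean(a) - mean(b)) / sqrt(se2)

def anova_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)                                     # number of groups
    n = sum(len(g) for g in groups)                     # total observations
    grand = sum(sum(g) for g in groups) / n             # grand mean
    means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical evaluation times per interface (not data from the study).
textual = [1.0, 2.0, 3.0]
graphical = [2.0, 3.0, 4.0]
hybrid = [3.0, 4.0, 5.0]

t_stat = welch_t(textual, hybrid)
f_stat = anova_f(textual, graphical, hybrid)
print(t_stat, f_stat)
```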

9 Statistics Results
After a series of T-tests and ANOVA F-tests, we got the following results.
• Exploration time: there is no difference in the average Exploration time across interfaces.
• Naming time: the Textual interface has the largest mean Naming time.
• Evaluation time: the Graphical interface has a larger mean Evaluation time than the Textual and Hybrid interfaces.

10 We wanted to see if there was a relationship between ClusterFitRatings and Evaluation times or Naming times for the cluster collection.

11

12 We also wanted to see if there was a relationship between Name Ratings and Evaluation times or Naming times for the cluster collection.

13

14 Statistics Results
Based on the results from the previous four pages and a regression test (a test for a linear relationship between a response variable and an explanatory variable), we made the following observations.
• When participants gave ClusterFitRating = 4, their mean Evaluation time and mean Naming time were shorter than for the other ClusterFitRatings, in all interfaces.
• In all interfaces, either the mean Evaluation time and mean Naming time were shorter when users gave a Name Rating of -1 or 2 than when they gave other ratings, or there was no significant time difference.
• There are linear correlations between ClusterFitRatings and Name Ratings in all interfaces.
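
The regression test mentioned above fits a straight line between an explanatory and a response variable. A minimal least-squares sketch is shown below; the function name `linear_fit` and the rating values are hypothetical, and the significance test on the slope that a full regression test includes is left out.

```python
from math import sqrt
from statistics import mean

def linear_fit(x, y):
    """Least-squares slope and intercept of y on x, plus Pearson's r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical paired ratings (not data from the study).
fit_ratings = [0, 1, 2, 3, 4]      # explanatory variable: ClusterFitRating
name_ratings = [0, 0, 1, 2, 2]     # response variable: Name Rating

slope, intercept, r = linear_fit(fit_ratings, name_ratings)
print(slope, intercept, r)
```

A value of r close to 1 or -1 indicates a strong linear relationship; whether it is statistically significant still depends on the sample size.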

15 We wanted to see if task time varied per query among the three interfaces. To do this, we grouped the queries that were evaluated with all three interfaces by different users. There were 16 such queries. For each of them we tested for a difference in task time among the interfaces.

16

17

18 Statistics Results
Based on the results from the last two pages, we made the following observations for the user triples of these 16 queries.
• The Textual interface has shorter mean Evaluation and Exploration times than the Graphical and Hybrid interfaces.
• The difference in average Evaluation and Exploration times is larger between the Textual and Graphical interfaces than between the Textual and Hybrid interfaces or between the Graphical and Hybrid interfaces.

19 We wanted to see if there is an interface with the shortest Exploration time and Evaluation time for these 16 qualified queries. We found the minimum number of triples over all such queries, in order to best deal with the leftovers (the data remaining after grouping into triples). After considering the number of triples per query and the outliers in these data, we set this minimum number to five.

20 This is part of the table with five randomly selected triples from each query.
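
One way to draw five random triples per query is sketched below. The slides do not describe how the triples were actually formed, so the pairing rule (shuffle each interface's measurements and zip them, dropping leftovers), the data layout, and the name `sample_triples` are all assumptions.

```python
import random

def sample_triples(evals, per_query=5, seed=0):
    """Pick per_query random (Textual, Graphical, Hybrid) triples for each query.

    evals maps a query string to {"T": [...], "G": [...], "H": [...]},
    where each list holds task times measured with that interface.
    """
    rng = random.Random(seed)          # fixed seed so the draw is repeatable
    chosen = {}
    for query, by_iface in evals.items():
        # Shuffled copies of each interface's measurements.
        t, g, h = (rng.sample(by_iface[k], len(by_iface[k])) for k in ("T", "G", "H"))
        # zip stops at the shortest list, so the leftovers are dropped here.
        triples = list(zip(t, g, h))
        if len(triples) >= per_query:  # skip queries with too few triples
            chosen[query] = rng.sample(triples, per_query)
    return chosen

# Hypothetical task times in seconds (not data from the study).
evals = {
    "q1": {"T": [1, 2, 3, 4, 5, 6, 7],
           "G": [10, 11, 12, 13, 14, 15],
           "H": [20, 21, 22, 23, 24, 25, 26, 27]},
    "q2": {"T": [1, 2], "G": [3, 4], "H": [5, 6]},
}
picked = sample_triples(evals)
print(picked)
```

In this sketch "q1" yields six triples, five of which are kept, while "q2" has only two triples and is skipped.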

21 Statistics Results
After a series of T-tests, ANOVA F-tests and regression tests, we got the following results for this tripled set of 16 queries.
• There is no difference in the means of Exploration times and Evaluation times across interfaces.
• There are linear correlations between Exploration times and Evaluation times in the Graphical and Hybrid interfaces, but not in the Textual interface.

22
• The Textual interface has the largest mean Naming time.
• The Graphical interface has the largest mean Evaluation time.
• Participants who give the highest ClusterFitRating have shorter mean Evaluation and Naming times in all interfaces.
• There is a linear correlation between ClusterFitRatings and Name Ratings in all interfaces, and a linear correlation between Exploration times and Evaluation times in the Graphical and Hybrid interfaces.

23
• Name That Cluster online survey, http://gem1.rutgers.edu/userstudy/login.php
• Abello, J., Schulz, H., Gaudin, B., and Tominski, C. (2007). Name That Cluster - Text vs. Graphics. IEEE InfoVis Conference, Sacramento, November 2007.
• Ramsey, Fred L. The Statistical Sleuth: A Course in Methods of Data Analysis. Duxbury/Thomson Learning, 2002.

24 THE END

