A Model of Information Foraging via Ant Colony Simulation Matthew Kusner.


1 A Model of Information Foraging via Ant Colony Simulation Matthew Kusner

2 Information Foraging Theory. Background: people search for information in roughly the same way that animals search for food in their surroundings. Information scent: e.g., “the text associated with Web links” (Fu & Pirolli, 2007); background knowledge; recommendations.

3 Ant Colony Simulation. Pheromone trails: laid by ants that have found food; followed by other ants with probability p; path evaporation. Path optimization. Simulation specifics.
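The pheromone mechanics on this slide can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the simulation used in the talk: the function name `simulate_colony`, the parameters `follow_prob` (standing in for p), `evaporation`, and `deposit`, and the rule that closer food earns a stronger deposit are all hypothetical.

```python
import random


def simulate_colony(distances, n_ants=100, n_steps=200, follow_prob=0.8,
                    evaporation=0.05, deposit=1.0, seed=0):
    """Toy ant colony; returns the number of ant visits per food source.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_food = len(distances)
    pheromone = [0.0] * n_food
    visits = [0] * n_food
    for _ in range(n_steps):
        for _ in range(n_ants):
            if sum(pheromone) > 0 and rng.random() < follow_prob:
                # Follow a trail, chosen in proportion to its pheromone level.
                target = rng.choices(range(n_food), weights=pheromone)[0]
            else:
                # Explore: pick a food source uniformly at random.
                target = rng.randrange(n_food)
            visits[target] += 1
            # Assumption: closer food earns a stronger deposit per visit.
            pheromone[target] += deposit / distances[target]
        # Path evaporation: every trail decays a little after each step.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
    return visits
```

Calling `simulate_colony([1.0, 5.0, 10.0])` returns one visit count per food source; with these assumed settings the nearest source attracts most of the traffic, which is the path-optimization effect the slide refers to.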

4 AOL Data Set. 21 million queries (March 1 – May 31, 2006); 650k users; 19 million click-through events. Quantities: query, time of query, click URL, user ID, clicked link rank.

5 Information Foraging → Ant Colony: user → ant; clicked link → food; information scent → pheromone path; website importance → food distance, where website importance is defined by: 1. rank; 2. popularity of website; 3. a combination of the above methods.

6 Distancing Methods: ranking; popularity; combination [based on data in Joachims et al., 2005].
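The transcript does not preserve the actual distance formulas, so the following is only a hypothetical sketch of how website importance might be turned into a food distance for each of the three methods. The function names, the linear forms, and the `weight` parameter are all assumptions made for illustration.

```python
def rank_distance(rank):
    """Assumption: a link's distance grows linearly with its rank (1 = top)."""
    return float(rank)


def popularity_distance(visits, max_visits):
    """Assumption: the most visited site sits at distance 1; rarer sites are farther."""
    return 1.0 + (max_visits - visits)


def combined_distance(rank, visits, max_visits, weight=0.5):
    """Assumption: a weighted average of the two distances above."""
    return (weight * rank_distance(rank)
            + (1.0 - weight) * popularity_distance(visits, max_visits))
```

Under these assumed forms, a top-ranked link or the most visited site is placed closest to the nest, so ants reach it soonest.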

7 Results. AOL user-visit per website vector: [numWvisits_1, numWvisits_2, ..., numWvisits_n]. Simulation ant-visit per food vector: [numAvisits_1, numAvisits_2, ..., numAvisits_n]. Pearson Correlation Score (PCS). Permutation Test → 95% Coverage Interval: (AOL_data_i, simulation_data_i) selection with replacement. Bootstrapping → p-value: shuffle AOL vector.
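The three computations on this slide can be sketched in plain Python: the Pearson correlation between the two visit vectors, an interval obtained by reselecting (AOL_i, simulation_i) pairs with replacement, and a p-value obtained by shuffling the AOL vector. The function names, the resample counts, the percentile rule for the interval, and the one-sided p-value are assumptions, not details taken from the talk.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat an undefined correlation as 0
    return cov / (sx * sy)


def resampled_interval(x, y, n_resamples=2000, level=0.95, seed=0):
    """Percentile interval from reselecting (x_i, y_i) pairs with replacement."""
    rng = random.Random(seed)
    pairs = list(zip(x, y))
    scores = []
    for _ in range(n_resamples):
        sample = [rng.choice(pairs) for _ in pairs]
        xs, ys = zip(*sample)
        scores.append(pearson(xs, ys))
    scores.sort()
    start = scores[int((1 - level) / 2 * n_resamples)]
    end = scores[int((1 + level) / 2 * n_resamples) - 1]
    return start, end


def shuffle_p_value(x, y, n_shuffles=2000, seed=0):
    """One-sided p-value: share of shuffles of x that correlate at least as well."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(x)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if pearson(shuffled, y) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)
```

For two visit vectors `aol` and `sim` of equal length, `pearson(aol, sim)` gives the PCS, `resampled_interval(aol, sim)` gives the interval start and end, and `shuffle_p_value(aol, sim)` gives the p-value.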

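8 Results by query and type of distancing. In the transcript the three count columns (# of users, # of clicked links, # of distinct websites visited) are run together into one digit string, shown here as "Counts".
Query | Type of distancing | Counts | Average PCS | Average 95% CI start | Average 95% CI end | Significant p-value?
vacation | ranking | 1255919 | 0.8182 | 0.3203 | 0.9364 | Yes
vacation | popularity | 1255919 | 0.1296 | -0.1768 | 0.6624 |
vacation | combination | 1255919 | 0.1488 | -0.3819 | 0.3920 |
rhino | ranking | 39256 | 0.7631 | -0.4781 | 0.9854 |
rhino | popularity | 39256 | 0.3906 | -0.2484 | 0.9919 |
rhino | combination | 39256 | 0.2013 | -0.7389 | 0.9657 |
zebra | ranking | 536112 | -0.1825 | -0.5426 | 0.4706 |
zebra | popularity | 536112 | -0.0110 | -0.4667 | 0.5079 |
zebra | combination | 536112 | 0.1558 | -0.3655 | 0.6754 |
lion | ranking | 52399 | 0.6118 | -0.1797 | 0.9214 |
lion | popularity | 52399 | 0.0699 | -0.5776 | 0.7296 |
lion | combination | 52399 | 0.0304 | -0.6170 | 0.6609 |
football | ranking | 1945621 | 0.5358 | -0.0952 | 0.9301 |
football | popularity | 1945621 | 0.2693 | -0.1583 | 0.6722 |
football | combination | 1945621 | 0.4149 | -0.0223 | 0.7612 |
basketball | ranking | 2207416 | 0.7137 | -0.4225 | 0.9529 |
basketball | popularity | 2207416 | 0.2228 | -0.1755 | 0.6455 |
basketball | combination | 2207416 | 0.1415 | -0.3470 | 0.6661 |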

9 Results. Queries with significant p-values: “vacation” (ranking), “baseball” (ranking), “reebok” (ranking), “adidas” (ranking), “marbles” (ranking), “helicopter” (ranking), “car” (ranking), “potatoes” (ranking), “coffee” (ranking), “farming” (ranking), “rock” (popularity), “shirts” (ranking), “playstation” (ranking), “sega” (popularity), “tom cruise” (ranking), “mel gibson” (ranking), “burger king” (ranking), “chicago” (ranking), “los angeles” (ranking), and “paris” (ranking). Distancing methods without 95% CI overlap, for ranking: “potatoes” (overlaps with neither popularity nor combination); “shirts” (no overlap with popularity); “playstation” (no overlap with popularity); “burger king” (no overlap with combination).
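Whether two distancing methods have overlapping 95% CIs is a simple interval check. As a minimal sketch, the helper below (the name `intervals_overlap` is an assumption) is applied to the "vacation" intervals reported on slide 8; the intervals for the queries listed above are not in the transcript, so they are not reproduced here.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (start, end) and b = (start, end) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Average 95% CI start and end for "vacation", as reported on slide 8.
vacation = {
    "ranking": (0.3203, 0.9364),
    "popularity": (-0.1768, 0.6624),
    "combination": (-0.3819, 0.3920),
}

for method in ("popularity", "combination"):
    overlaps = intervals_overlap(vacation["ranking"], vacation[method])
    print(f"ranking vs {method}: overlap = {overlaps}")
```

For "vacation" both comparisons report an overlap, so this query does not separate ranking from the other two methods even though its ranking p-value is significant.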

10 Discussion. Disadvantages of popularity and combination methods: “vacation” example. Possible reasons for 95% CI overlap: randomness; disregard of structure. Significance of queries with low p-values: search engine matching. Future directions: different simulation; other similarity metrics; random beginnings.

11 References
Fu, W., & Pirolli, P. (2007). SNIF-ACT: A cognitive model of user navigation on the World Wide Web. Human-Computer Interaction, 22(4), 355-412.
Joachims, T., Granka, L., Pang, B., Hembrooke, H., & Gay, G. (2005). Accurately interpreting clickthrough data as implicit feedback. Proceedings of the ACM Conference on Research and Development in Information Retrieval (SIGIR).

