Predicting Download Directories for Web Resources. George Valkanas, Dimitrios Gunopulos. 4th International Conference on Web Intelligence, Mining and Semantics.

Presentation transcript:

1 Predicting Download Directories for Web Resources. George Valkanas, Dimitrios Gunopulos. 4th International Conference on Web Intelligence, Mining and Semantics, June 3, 2014. Dept. of Informatics & Telecommunications, University of Athens, Greece

2 Online User Activities

Activity             | ABS Survey | StatCan Survey | Infoplease Survey
Emailing             | 91%        | 93%            | 92%
General Web Browsing | 87%        | > 70%          | 83%
Online Purchases     | 45%        | > 50%          | 62%
Download Content     | 37%        | ~30%           | 42%

3 Facilitating Downloads Save Link In Folder

4 Facilitating Downloads: Save Link In Folder
Problems:
- Predefined directories
- Blunt approach / no learning
- UI clutter
- Tedious user management

5 A principled solution

6 Associate navigation through the directory hierarchy with a cost function. One possible cost function is the Hierarchical Navigation Cost (HNC), i.e., the number of clicks needed to move from one directory to another. Example: HNC(imgs/, docs/) = 2
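The HNC example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that each click moves exactly one level up or down the directory tree, so the cost is the tree distance between the two directories. The function name `hnc` and the use of relative POSIX-style paths are choices made for this sketch.

```python
from pathlib import PurePosixPath


def hnc(source: str, target: str) -> int:
    """Hierarchical Navigation Cost under the one-click-per-level assumption."""
    s = PurePosixPath(source).parts
    t = PurePosixPath(target).parts
    # Count how many leading path components the two directories share.
    common = 0
    for a, b in zip(s, t):
        if a != b:
            break
        common += 1
    # Clicks up to the deepest common ancestor, then clicks down to the target.
    return (len(s) - common) + (len(t) - common)


# imgs/ and docs/ are siblings: one click up to the parent, one click down.
clicks = hnc("imgs/", "docs/")
```

Under this assumption, suggesting the target itself costs 0 clicks, and a suggestion in a sibling directory costs 2, which matches the slide's example.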

7 Problem Definition
Given:
- The hierarchical structure
- A target directory T, where the resource will be saved
Goal:
- Suggest a directory S that minimizes the cost function cf(S, T)

8 Problem Definition
Given:
- The hierarchical structure
- A target directory T, where the resource will be saved
Goal:
- Suggest a directory S that minimizes the cost function cf(S, T)
But if I know T, why not suggest T directly? (0 cost)

9 Problem Definition
Given:
- The hierarchical structure
- A target directory T, where the resource will be saved
Goal:
- Suggest a directory S that minimizes the cost function cf(S, T)
But if I know T, why not suggest T directly? (0 cost)
In this setting, we don’t know T until it’s too late!

10 Casting to a classification framework
Directories are potential class values
T is the true target class
S is the output of a classification process
Web resource properties → classification features
Recommend S that best matches T
- Use directories from past saves as candidate classes

11 Features & Distances

Feature                                  | Distance
Timestamp                                | Exponential decay
Domain (current / referrer)              | Equality
Path, filename (current / referrer page) | Tokenize & Jaccard
Title                                    | Tokenize & Jaccard
Filename                                 | Tokenize & Jaccard
Extension                                | Covariance Matrix
Keywords                                 | Jaccard
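Three of the per-feature distances listed above can be sketched as follows. The slides only name the distance families, so the details here are assumptions made for illustration: the tokenizer splits on non-alphanumeric characters, the equality distance is 0/1, and the timestamp distance uses a hypothetical `half_life` parameter (one week by default) to turn exponential decay into a value in [0, 1). The extension distance (covariance matrix) is not sketched because the slides give no detail on it.

```python
import math
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it on runs of non-alphanumeric characters."""
    return {tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok}


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def equality_distance(a: str, b: str) -> float:
    """0 when the values match (e.g. same domain), 1 otherwise."""
    return 0.0 if a == b else 1.0


def time_decay_distance(t_now: float, t_past: float,
                        half_life: float = 7 * 24 * 3600) -> float:
    """Assumed form: distance grows toward 1 as the past save gets older."""
    age = max(0.0, t_now - t_past)
    return 1.0 - math.exp(-math.log(2) * age / half_life)
```

For instance, `jaccard_distance(tokenize("Annual_Report-2014.pdf"), tokenize("annual report"))` compares the token sets `{annual, report, 2014, pdf}` and `{annual, report}`, which share two of four distinct tokens.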

12 Experimental Setup
Implement the classifier as a Firefox (FF) plugin
- DiDoCtor approach
- JavaScript
- 1-NN classifier
6 participants
- 4-month minimum use period
Baseline
- Last-by-domain (LBD), the current browser approach
- Simulated, based on submitted result
Metrics
- Click Distance: HNC, Breadcrumbs
- Classification Accuracy
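The contrast between a 1-NN classifier and the last-by-domain baseline can be sketched as below. This is an illustrative Python sketch, not the plugin's JavaScript code: the `Save` record, the two-feature `distance` (domain equality plus title Jaccard, unweighted), and the baseline's fallback to the most recent directory when the domain is unseen are all assumptions made here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Save:
    """One past download (hypothetical record layout for this sketch)."""
    domain: str
    title_tokens: frozenset
    directory: str  # left empty for a query whose directory is unknown


def distance(a: Save, b: Save) -> float:
    """Assumed combination: domain equality distance + title Jaccard distance."""
    domain_d = 0.0 if a.domain == b.domain else 1.0
    union = a.title_tokens | b.title_tokens
    title_d = 1.0 - len(a.title_tokens & b.title_tokens) / len(union) if union else 0.0
    return domain_d + title_d


def predict_1nn(history: list, query: Save) -> Optional[str]:
    """Suggest the directory of the single most similar past save."""
    if not history:
        return None
    return min(history, key=lambda past: distance(past, query)).directory


def predict_last_by_domain(history: list, domain: str) -> Optional[str]:
    """Baseline: directory of the most recent save from the same domain."""
    for past in reversed(history):
        if past.domain == domain:
            return past.directory
    # Assumed fallback: the most recently used directory overall.
    return history[-1].directory if history else None
```

With a history in which the same domain was last used to save a photo into imgs/, the baseline keeps suggesting imgs/, while 1-NN can still pick papers/ for a query whose title resembles an earlier paper download.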

13 Preliminary Result Analysis

14 Preliminary Result Analysis
Take Home Messages
1. Users have different saving pattern behavior(s)

15 Preliminary Result Analysis
Take Home Messages
1. Users have different saving pattern behavior(s)
2. Users have high variability in their accesses to each directory

16 Click Distance - HNC
Take Home Message: Significant reduction in the number of clicks to reach the target directory!

17 Click Distance - HNC
Take Home Message: Significant reduction in the number of clicks to reach the target directory! Click distance gain is even higher when considering a breadcrumbs UI!

18 Running Accuracy
Take Home Message: DiDoCtor is much more accurate in predicting the download directory

19 Basic Model Extensions
Feature reweighting
- RELIEF_F

20 Basic Model Extensions
Feature reweighting
- RELIEF_F
Suggesting k directories
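Both extensions can be sketched on top of a nearest-neighbor ranking. The Python below is a hedged illustration only: it does not implement RELIEF_F itself, it merely shows how per-feature weights (however they are learned) would enter the distance, and how k distinct directories could be read off the ranked neighbors. The function names and the default weight of 1.0 for unlisted features are assumptions.

```python
def weighted_distance(per_feature: dict, weights: dict) -> float:
    """Weighted sum of per-feature distances; unlisted features get weight 1.0."""
    return sum(weights.get(name, 1.0) * d for name, d in per_feature.items())


def suggest_k(scored_history: list, k: int) -> list:
    """Return up to k distinct directories, nearest past save first.

    scored_history holds (distance, directory) pairs, one per past save.
    """
    if k <= 0:
        return []
    suggestions = []
    for _, directory in sorted(scored_history, key=lambda pair: pair[0]):
        if directory not in suggestions:
            suggestions.append(directory)
        if len(suggestions) == k:
            break
    return suggestions
```

With k = 1 this reduces to the 1-NN suggestion; larger k trades a slightly busier suggestion list for a higher chance that the target directory is among the suggestions.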

21 Alternative classifiers
Take Home Messages
- Classifiers can help!
- DiDoCtor generally performs the best
- Accuracy is affected by user behavior!

22 Conclusions & Future work
Approach for facilitating downloads
- Optimization problem & classification framework
Experimentation with real users
- Basic model extensions
Further exploit the temporal dimension
More informative features (e.g., entities)
Automatic generation of directories

23 Thank you! Questions?
Acknowledgements
- To the evaluators of our plugin
- Heraclitus II fellowship, THALIS-GeoComp, THALIS-DISFER, Aristeia-MMD, EU project INSIGHT