Marina Drosou Department of Computer Science University of Ioannina, Greece Thesis Advisor: Evaggelia Pitoura



Introduction
• Publish/Subscribe is an attractive alternative to typical searching
• Users do not need to repeatedly search for new interesting data
• They specify their interests once and the system automatically notifies them whenever relevant data is made available
DMOD Laboratory, University of Ioannina

Motivation
• Typically, all subscriptions are considered equally important
• Users may receive:
  • overwhelming amounts of notifications
  • too much overlapping information
• In such cases, users would like to receive only a fraction of notifications, the most interesting ones:
  • Say Addison is more interested in horror movies than comedies
  • Addison would like to receive notifications about (various) comedies only if there are no (or just a few) notifications about horror movies
• Current publish/subscribe systems do not allow Addison to express these different degrees of interest

Purpose
This PhD thesis aims at increasing user satisfaction with the received results by proposing methods that control the number of results delivered to users, mainly by ranking them according to their importance.
Importance can be described by:
• Relevance to user interests (or preferences)
• Diversity
• Freshness

Applications
• Social networks
• Lots of notifications
• Friends’ news, friends’ actions, games, quizzes etc.

Outline
• Challenges
• Preferences
• Diversity
• Freshness
• Combining Criteria
• DiveR


Model
• We consider a content-based pub/sub system
• Greater expressiveness
• Most content methods rely on a description of the content

Notification example:
  string title = LOTR: The Return of the King
  string director = Peter Jackson
  time release date = 1 Dec 2003
  string genre = fantasy
  integer oscars = 11

Subscription example:
  string director = Peter Jackson
  time release date > 1 Jan 2003
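A minimal sketch of content-based matching, using the notification and subscription from the slide. The data layout (a dict for the notification, a list of attribute/operator/value triples for the subscription), the `matches` function, and the set of supported operators are illustrative assumptions, not the thesis implementation.

```python
from datetime import date
import operator

# Comparison operators a subscription predicate may use (assumed set).
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
       ">=": operator.ge, "<=": operator.le}

def matches(notification, subscription):
    """Return True if the notification satisfies every predicate.

    notification: dict mapping attribute name -> value
    subscription: list of (attribute, operator symbol, value) triples
    """
    for attr, op, value in subscription:
        if attr not in notification:
            return False  # a missing attribute cannot satisfy a predicate
        if not OPS[op](notification[attr], value):
            return False
    return True

# Notification and subscription taken from the slide.
notification = {
    "title": "LOTR: The Return of the King",
    "director": "Peter Jackson",
    "release date": date(2003, 12, 1),
    "genre": "fantasy",
    "oscars": 11,
}
subscription = [
    ("director", "=", "Peter Jackson"),
    ("release date", ">", date(2003, 1, 1)),
]

print(matches(notification, subscription))  # True
```

The subscription constrains only two of the five attributes, which is what lets one subscription cover many different notifications.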

Timing
• A major issue is the continuous flow of information
• Most important (or top-k) notifications over which?
• Events are associated with a number of time instants:
  • Publication time
  • Matching time
  • Forwarding time
  • Delivery time
• Pub/sub systems are usually best-effort
• Arbitrary delays, different for each event

Timing
• We consider time-partitioning methods:
  • Periodic
  • Window-based
  • Time-based
  • Event-based
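To make the partitioning idea concrete, here is a rough sketch of two of the options: periodic delivery (top-k per fixed period) and an event-based window (top-k among the last w events). The function names, the (timestamp, rank, name) tuple layout, and the choice of timestamp are assumptions made for illustration only.

```python
import heapq
from collections import deque

def periodic_top_k(events, period, k):
    """Split events into consecutive periods of length `period` (by timestamp)
    and return the k highest-ranked event names of each non-empty period."""
    buckets = {}
    for timestamp, rank, name in events:
        buckets.setdefault(int(timestamp // period), []).append((rank, name))
    return {p: [n for _, n in heapq.nlargest(k, items)]
            for p, items in sorted(buckets.items())}

def event_window_top_k(events, window, k):
    """After each arrival, return the top-k names among the last `window` events."""
    recent = deque(maxlen=window)  # old events fall out automatically
    results = []
    for _timestamp, rank, name in events:
        recent.append((rank, name))
        results.append([n for _, n in heapq.nlargest(k, recent)])
    return results

# Hypothetical stream: (timestamp, rank, name)
events = [(0, 0.9, "a"), (3, 0.5, "b"), (4, 0.8, "c"), (11, 0.7, "d"), (12, 0.2, "e")]

print(periodic_top_k(events, period=10, k=2))     # {0: ['a', 'c'], 1: ['d', 'e']}
print(event_window_top_k(events, window=3, k=1))  # [['a'], ['a'], ['a'], ['c'], ['c']]
```

In the periodic variant an event competes only with events of its own period; in the windowed variant the same event can be delivered, or suppressed, depending on what arrived just before it.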

Users
• Another major issue is that users are independent
• Their interests may vary a lot
• There are no globally top-ranked notifications
• A possible solution is clustering


Expressing preferences
• Preferences are associated with subscriptions
• Different degrees of granularity
  • Subscription-based preferences
  • Attribute-based preferences

Subscription-based example:
  (genre = drama, director = T. Burton) ≻ (genre = horror, director = S. Spielberg)

Attribute-based examples:
  director = T. Burton ≻ director = S. Spielberg
  genre = drama ≻ genre = horror

Ordering Subscriptions
• To order user subscriptions we use the winnow operator, applying it on various levels
• per subscription or per attribute

[Figure: user preferences (≻) among the subscriptions genre = drama, genre = comedy, genre = horror, genre = romance and genre = action, and the resulting preference graph with genre = drama and genre = comedy in the top row, genre = horror and genre = romance in the second row, and genre = action in the third row]

Extracting preference ranks
• We perform a topological sort to compute winnow levels. The subscriptions of level i are associated with a preference rank 𝒢(i):
  • 𝒢 is a monotonically decreasing function with 𝒢(i) ∈ [0, 1]
  • e.g. 𝒢(i) = (D + 1 − (i − 1)) / (D + 1)

Preference graph:
  Level 1: genre = drama, genre = comedy (preference rank = 1)
  Level 2: genre = horror, genre = romance (preference rank = 2/3)
  Level 3: genre = action (preference rank = 1/3)
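The level extraction can be sketched as repeated application of winnow: take the subscriptions that no remaining subscription is preferred over, assign them the current level, remove them, and repeat (a level-by-level form of topological sorting). The preference pairs below are hypothetical, chosen only so that the levels agree with the slide's graph; taking D as the number of levels minus one is likewise an assumption, picked because it reproduces the ranks 1, 2/3 and 1/3 shown above.

```python
from fractions import Fraction

def winnow_levels(items, prefers):
    """Assign each item a level by repeatedly applying winnow.

    prefers: set of (better, worse) pairs. Level 1 holds the items that no
    other item is preferred over; they are removed and the step repeats.
    """
    remaining = set(items)
    levels = {}
    level = 1
    while remaining:
        top = {x for x in remaining
               if not any((y, x) in prefers for y in remaining)}
        if not top:
            raise ValueError("preferences contain a cycle")
        for x in top:
            levels[x] = level
        remaining -= top
        level += 1
    return levels

def preference_rank(level, D):
    """Example rank function from the slide: (D + 1 - (level - 1)) / (D + 1)."""
    return Fraction(D + 1 - (level - 1), D + 1)

genres = ["drama", "comedy", "horror", "romance", "action"]
# Hypothetical preference pairs (better, worse); not recovered from the slide.
prefers = {("drama", "horror"), ("comedy", "horror"),
           ("comedy", "romance"), ("romance", "action")}

levels = winnow_levels(genres, prefers)
D = max(levels.values()) - 1  # assumption: D = number of levels - 1
ranks = {g: preference_rank(levels[g], D) for g in genres}
print(levels)
print(ranks)
```

Using `Fraction` keeps the ranks exact (1, 2/3, 1/3) instead of rounding them to floats.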

Event ranks
• Generally, to compute the importance of an event, we consider only the matching subscriptions
• Furthermore, we can consider only the most specific among them

User subscriptions:
  genre = adventure  0.9
  director = Peter Jackson  0.7

Event:
  string title = King Kong
  string director = Peter Jackson
  time release date = 14 Dec 2005
  string genre = adventure

With ℱ = max, the event rank is 0.9
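As a sketch, the event rank can be computed by collecting the preference ranks of the matching subscriptions and aggregating them, here with max as on the slide. The `event_rank` name, the equality-only matching, and returning `None` for an unmatched event are simplifying assumptions.

```python
def event_rank(event, subscriptions, aggregate=max):
    """Aggregate the preference ranks of all subscriptions the event matches.

    event: dict attribute -> value
    subscriptions: list of (predicates, rank), where predicates is a dict of
    attribute -> required value (equality predicates only, for brevity).
    Returns None when no subscription matches.
    """
    matching = [rank for predicates, rank in subscriptions
                if all(event.get(attr) == value
                       for attr, value in predicates.items())]
    return aggregate(matching) if matching else None

# Subscriptions and event taken from the slide.
subscriptions = [
    ({"genre": "adventure"}, 0.9),
    ({"director": "Peter Jackson"}, 0.7),
]
king_kong = {
    "title": "King Kong",
    "director": "Peter Jackson",
    "release date": "14 Dec 2005",
    "genre": "adventure",
}

print(event_rank(king_kong, subscriptions))  # 0.9
```

Passing a different `aggregate` (for example `min`) changes how several matching subscriptions are reconciled without touching the matching step.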


Motivation

Published events:
  title = The Godfather, genre = drama, showing time = 21:10
  title = Ratatouille, genre = comedy, showing time = 21:15
  title = Fight Club, genre = drama, showing time = 23:00
  title = Casablanca, genre = drama, showing time = 23:10
  title = Vertigo, genre = drama, showing time = 23:20

User subscriptions:
  genre = drama  0.9
  genre = comedy  0.8

The most highly ranked events may be very similar to each other…


Diversity
• We wish to retrieve results on a broader variety of user interests
• Two different perspectives on achieving diversity:
  • Avoid overlap: choose notifications that are dissimilar to each other
  • Increase coverage: choose notifications that cover as many user interests as possible
• How to measure diversity?
  • Many alternative ways
  • Common ground: measure similarity/distance among the selected items

Problem Definition & Complexity
• Diversify(k) problem: Select the k out of all results that exhibit the maximum diversity
• Generally, the problem of choosing diverse results is NP-hard ¹
• Intuitively:
  • To find the most diverse subset S* of all results M we have to compute all possible combinations of k items out of |M| and keep the one with the maximum diversity

¹ Erhan Erkut, “The discrete p-dispersion problem”, European Journal of Operational Research vol. 46, no.
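The exhaustive approach described above can be written down directly, which also shows why it does not scale: the number of candidate subsets is the binomial coefficient C(|M|, k). The distance measure (fraction of differing attributes) and the set diversity (average pairwise distance) are assumptions chosen for illustration; the slides leave the measure open.

```python
from itertools import combinations
from math import comb

def distance(a, b):
    """Fraction of attributes (from either item) on which a and b differ."""
    attrs = set(a) | set(b)
    if not attrs:
        return 0.0
    return sum(a.get(x) != b.get(x) for x in attrs) / len(attrs)

def set_diversity(items):
    """Average pairwise distance of the items (0.0 for fewer than two items)."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)

def diversify_exhaustive(results, k):
    """Try every k-subset and keep one with maximum set diversity."""
    return max(combinations(results, k), key=set_diversity)

# Events from the earlier movie example.
movies = [
    {"title": "The Godfather", "genre": "drama", "showing time": "21:10"},
    {"title": "Ratatouille", "genre": "comedy", "showing time": "21:15"},
    {"title": "Fight Club", "genre": "drama", "showing time": "23:00"},
    {"title": "Casablanca", "genre": "drama", "showing time": "23:10"},
    {"title": "Vertigo", "genre": "drama", "showing time": "23:20"},
]

best = diversify_exhaustive(movies, k=2)
print([m["title"] for m in best])
print(comb(len(movies), 2), "subsets examined")  # 10 subsets examined
print(comb(1000, 10))  # candidate count for 1000 results and k = 10
```

With five events and k = 2 only ten subsets are examined; the last line prints how quickly that count grows for a realistic result size.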

Solutions
• To solve the problem, heuristics are employed
• Greedy heuristics:
  • Selecting items one by one until we have k of them
• Interchange heuristics:
  • Start with a random solution and interchange items that improve the objective function
• Neighborhood heuristics:
  • Disqualify items close to the ones already selected
• Simulated Annealing:
  • Apply simulated annealing to avoid local maxima
• General difficulty: Diversify(k-1) ⊄ Diversify(k)
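A minimal sketch of one greedy variant: start from a seed item and repeatedly add the candidate that is farthest from the items already selected (largest minimum distance). Seeding with the first result and using the max-min criterion are assumptions; other greedy variants choose differently.

```python
def greedy_diversify(results, k, distance):
    """Pick up to k items one by one, each time adding the candidate whose
    minimum distance to the already selected items is largest."""
    if not results or k <= 0:
        return []
    selected = [results[0]]          # assumption: seed with the first result
    candidates = list(results[1:])
    while candidates and len(selected) < k:
        best = max(candidates,
                   key=lambda c: min(distance(c, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data: points on a line, distance = absolute difference.
points = [0, 1, 2, 9, 10]
line_distance = lambda a, b: abs(a - b)

print(greedy_diversify(points, 2, line_distance))  # [0, 10]
print(greedy_diversify(points, 3, line_distance))  # [0, 10, 2]
```

Note that the greedy answer for k − 1 is always a prefix of the greedy answer for k. An optimal k-subset need not contain an optimal (k − 1)-subset, which is one reason this remains a heuristic rather than an exact method.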


Freshness
• Continuous flow of information
• Also, system overheads, network latency…
• Need for fresh information
• We consider a model in which a notification’s importance decreases over time

[Formula not recovered. Its surviving annotations: the maximum time that e is considered important, and e’s publication time.]
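One simple model consistent with this description is a linear decay from 1 at publication time to 0 once the maximum time has elapsed. The linear shape, the `freshness` name, and its parameters are assumptions; the thesis formula itself is not recoverable from the transcript.

```python
def freshness(now, publication_time, max_age):
    """Linear decay: 1.0 at publication, 0.0 once `max_age` has elapsed.

    now, publication_time: timestamps in the same unit (e.g. seconds)
    max_age: maximum time for which the event is considered important
    """
    age = now - publication_time
    if age < 0:
        raise ValueError("event cannot be published in the future")
    return max(0.0, 1.0 - age / max_age)

print(freshness(now=10, publication_time=10, max_age=20))  # 1.0
print(freshness(now=20, publication_time=10, max_age=20))  # 0.5
print(freshness(now=40, publication_time=10, max_age=20))  # 0.0
```

Because pub/sub delivery is best-effort, which timestamp plays the role of `now` (matching, forwarding or delivery time) is a design decision this sketch leaves open.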


Combining criteria
• Various alternatives
  • still searching for the best one
• Relevance + Freshness
• Relevance + Freshness + Diversity

[Formula not recovered. Its surviving annotations: a parameter in [0, 1], the event rank for user X, and the set diversity.]
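One common way to combine such criteria is a weighted sum in which a parameter in [0, 1] trades relevance against diversity. The sketch below assumes that form; the name `sigma`, the use of the average event rank as the relevance term, and the function itself are hypothetical and should not be read as the thesis formula.

```python
def combined_score(event_ranks, diversity, sigma):
    """Weighted sum of average event rank and set diversity.

    event_ranks: ranks in [0, 1] of the selected events for one user
                 (they may already be scaled by a freshness factor)
    diversity:   set diversity of the selected events, in [0, 1]
    sigma:       weight in [0, 1]; 1 means relevance only, 0 diversity only
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    if not event_ranks:
        raise ValueError("at least one event is required")
    relevance = sum(event_ranks) / len(event_ranks)
    return sigma * relevance + (1.0 - sigma) * diversity

print(combined_score([0.9, 0.8], diversity=0.5, sigma=1.0))
print(combined_score([0.9, 0.8], diversity=0.5, sigma=0.0))
print(combined_score([0.9, 0.8], diversity=0.5, sigma=0.5))
```

Setting `sigma` to 1 or 0 recovers pure relevance ranking or pure diversification, which makes the parameter easy to tune per user.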


Implementation
• We have initially implemented our ideas in SIENA
• However, this implementation is not optimized for distributed environments, since the focus was on exploring ranking methods
• We are in the process of designing a new implementation (DiveR) that can efficiently support ranked pub/sub delivery


Summary
• Our goal is to increase the quality of notifications delivered in pub/sub systems
• Three main factors:
  • Relevance to user preferences
  • Diversity
  • Freshness
• Up to now, we have mainly focused on our model for information ranking
• Our future plans are to develop ranking methods that can be efficiently applied in a distributed environment

Thank you!