ICDCS Beijing China Routing of XML and XPath Queries in Data Dissemination Networks Guoli Li, Shuang Hou Hans-Arno Jacobsen Middleware Systems Research.

Presentation transcript:

Routing of XML and XPath Queries in Data Dissemination Networks
Guoli Li, Shuang Hou, Hans-Arno Jacobsen
Middleware Systems Research Group, University of Toronto
ICDCS, Beijing, China

Agenda
- Motivation
- Advertisement-based routing
- Covering
- Evaluation
- Conclusions

Motivation
- Data sources: publish XML data
- Data users: register XPath queries
- The data dissemination network: delivers matching results to a large and dynamically changing group of users
[Figure: Content-based Data Dissemination; labels: XML, Queries, Results]

Publish/Subscribe
[Figure: Publisher and Subscribers; labels: Advertisement (DTD), Publication (XML), Subscription (XPath)]
- Matching of XMLs and XPaths [ICDE’06]
- Matching of advertisements and XPaths
- Exploring relations among XPaths

Covering-based Routing

Language Model
- Advertisement: generated from DTDs
- Non-recursive advertisement, e.g., A = /t1/t2/t3…/tn-1/tn
- Recursive advertisement
  - Simple: A = A1(A2)+A3
  - Series: A = A1(A2)+A3(A4)+A5
  - Embedded: A = A1(A2(A3)+A4)+A5
- Example (DTD to advertisements):
  …
  /personnel/person
  /personnel/person/name
  /personnel/person/name/family
  /personnel/person/name/given
  /personnel/person/
  /personnel/person/url
  /personnel/person/link
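The derivation of non-recursive advertisements from a DTD can be sketched as a walk over the element hierarchy. This is a minimal illustration, not the authors' implementation: the `DTD_CHILDREN` table is a hypothetical, hand-written stand-in for a parsed DTD, it assumes one advertisement per reachable element, and the child of `person` that is cut off in the slide is deliberately left out.

```python
from typing import Dict, List

# Hypothetical content model (element -> child elements), written by hand.
# A real system would obtain this by parsing the DTD.
DTD_CHILDREN: Dict[str, List[str]] = {
    "personnel": ["person"],
    "person": ["name", "url", "link"],
    "name": ["family", "given"],
}


def advertisements(root: str, children: Dict[str, List[str]]) -> List[str]:
    """Return one absolute path per element reachable from root.

    Assumes the DTD is non-recursive; a recursive DTD would make this
    walk loop forever and needs the A1(A2)+A3 style patterns instead.
    """
    paths: List[str] = []

    def walk(element: str, prefix: str) -> None:
        path = f"{prefix}/{element}"
        paths.append(path)
        for child in children.get(element, []):
            walk(child, path)

    walk(root, "")
    return paths


if __name__ == "__main__":
    for adv in advertisements("personnel", DTD_CHILDREN):
        print(adv)
```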

Language Model
- Subscription: XPaths
  - Absolute, e.g., /c/d/*/e
  - Relative, e.g., c/d/*/e
  - Descendant operators, e.g., c//e/*/c
[Figure: tree diagrams with nodes a, b, c, d, e, *]
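The three subscription forms above can be told apart with a small parser. The sketch below is only an illustration under stated assumptions: the function name `parse_xpe` and the `(axis, name)` step representation are hypothetical, and an expression that starts with `//` is treated as non-absolute.

```python
from typing import List, Tuple

# A step is (axis, name test); axis is "/" (child) or "//" (descendant).
Step = Tuple[str, str]


def parse_xpe(xpe: str) -> Tuple[bool, List[Step]]:
    """Split a simple XPath expression into (is_absolute, steps)."""
    absolute = xpe.startswith("/") and not xpe.startswith("//")
    tokens = xpe.split("/")
    if xpe.startswith("/"):
        tokens = tokens[1:]  # drop the empty token before the leading "/"
    steps: List[Step] = []
    axis = "/"
    for token in tokens:
        if token == "":
            # An empty token comes from "//": the next step is a descendant step.
            axis = "//"
            continue
        steps.append((axis, token))
        axis = "/"
    return absolute, steps


if __name__ == "__main__":
    for expression in ("/c/d/*/e", "c/d/*/e", "c//e/*/c"):
        print(expression, parse_xpe(expression))
```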

Advertisement-based Routing
[Figure: broker overlay; labels: P(A), P(S), Subscription (S), Broker; advertisements A1: /a/b/*/e, A2: /b/e, A3: /a/b/d, A4: /a/b/e, …]
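In advertisement-based routing, a broker generally forwards a subscription only toward neighbours from which an overlapping advertisement arrived. The sketch below illustrates that idea with the four advertisements from the slide. The neighbour names (`B1` to `B3`), the table layout, and the simplified prefix-based overlap test for absolute XPEs without `//` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def steps(path: str) -> List[str]:
    """Split "/a/b/*" into ["a", "b", "*"]."""
    return [s for s in path.split("/") if s]


def overlaps(adv: str, sub: str) -> bool:
    """Assumed rule: an absolute subscription overlaps an advertisement
    when it is no longer than the advertisement and every aligned pair of
    steps is compatible (equal names, or a wildcard on either side)."""
    a, s = steps(adv), steps(sub)
    return len(s) <= len(a) and all(
        x == "*" or y == "*" or x == y for x, y in zip(a, s)
    )


# Hypothetical advertisement table of one broker:
# advertisement -> neighbour the advertisement was received from.
ADV_TABLE: Dict[str, str] = {
    "/a/b/*/e": "B1",  # A1
    "/b/e": "B2",      # A2
    "/a/b/d": "B3",    # A3
    "/a/b/e": "B3",    # A4
}


def next_hops(sub: str, table: Dict[str, str]) -> Set[str]:
    """Neighbours toward which the subscription should be forwarded."""
    return {hop for adv, hop in table.items() if overlaps(adv, sub)}


if __name__ == "__main__":
    print(sorted(next_hops("/a/b/d", ADV_TABLE)))
    print(sorted(next_hops("/c", ADV_TABLE)))
```

A subscription with an empty next-hop set is not forwarded at all, which is where the routing-table savings of this scheme come from.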

Overlapping Algorithms
Adv   Sub   Overlap
*     *     Y
*     t     Y
t     *     Y
t     t     Y
t1    t2    N
Basic case: S = /a/b/c/*/b/e, A = /a/b/c/*/b/c/*/b/e
[Figure: repeated frames comparing S against A; label: Next Table]
Other cases: e.g., S = /a/b//c/*/b//e
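The element-wise rule table above can be applied step by step. The sketch below is a naive illustration rather than the algorithm from the talk: it assumes that an absolute XPE must overlap a prefix of a non-recursive advertisement and that a relative XPE may overlap any contiguous window of it, and it uses a plain sliding comparison instead of the "Next Table" shown on the slide. The `//` operator is not handled.

```python
from typing import List


def steps(path: str) -> List[str]:
    """Split "/a/b/*" or "a/b/*" into ["a", "b", "*"]."""
    return [s for s in path.split("/") if s]


def step_overlaps(adv: str, sub: str) -> bool:
    """Rule table: only two different element names fail to overlap."""
    return adv == "*" or sub == "*" or adv == sub


def overlaps_absolute(adv: str, sub: str) -> bool:
    """Assumption: the subscription is compared with a prefix of the advertisement."""
    a, s = steps(adv), steps(sub)
    return len(s) <= len(a) and all(step_overlaps(x, y) for x, y in zip(a, s))


def overlaps_relative(adv: str, sub: str) -> bool:
    """Assumption: the subscription may start at any position of the advertisement."""
    a, s = steps(adv), steps(sub)
    return any(
        all(step_overlaps(a[offset + i], s[i]) for i in range(len(s)))
        for offset in range(len(a) - len(s) + 1)
    )


if __name__ == "__main__":
    A = "/a/b/c/*/b/c/*/b/e"
    print(overlaps_absolute(A, "/a/b/c/*/b/e"))
    print(overlaps_relative(A, "a/b/c/*/b/e"))
```

With the slide's example, the prefix comparison fails at the sixth step (`c` against `e`), while the sliding comparison succeeds when the subscription is aligned with the fourth step of the advertisement, where the advertisement's wildcard is compatible with `a`.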

Subscription Tree
- Subscriptions are maintained in a hierarchical tree
- A child may have more than one parent
- Siblings may intersect
- If a publication does not match a node, it does not match any of that node's descendants
[Figure: subscription tree rooted at ROOT with XPE nodes such as /a, /a/b, /a/b/d, /b/d; extra parents are linked by pointers]
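The pruning property in the last bullet can be sketched as a tree walk that descends only below matching nodes. This is an illustrative sketch, not the data structure from the talk: the `Node` class, the example tree, and the simplified matcher (absolute XPEs without `//`, matched against a prefix of the publication path) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    xpe: str
    children: List["Node"] = field(default_factory=list)


def matches(xpe: str, path: str) -> bool:
    """Assumed matcher: the XPE steps must match a prefix of the path."""
    s = [t for t in xpe.split("/") if t]
    p = [t for t in path.split("/") if t]
    return len(s) <= len(p) and all(x == "*" or x == y for x, y in zip(s, p))


def matching_subscriptions(root: Node, path: str) -> Tuple[List[str], int]:
    """Return (matching XPEs, number of nodes evaluated)."""
    found: List[str] = []
    evaluated = 0
    seen = set()  # a node with several parents is evaluated only once
    stack = list(root.children)
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        evaluated += 1
        if matches(node.xpe, path):
            found.append(node.xpe)
            stack.extend(node.children)  # descend only below a match
    return sorted(found), evaluated


# Hypothetical tree: each child is covered by its parent.
abd = Node("/a/b/d")
tree = Node("ROOT", [
    Node("/a", [Node("/a/b", [abd]), Node("/a/c")]),
    Node("/b", [Node("/b/d"), Node("/b/e")]),
])

if __name__ == "__main__":
    print(matching_subscriptions(tree, "/a/b/d/e"))
```

In this example only five of the seven subscriptions are evaluated, because the subtree below `/b` is skipped once `/b` fails to match.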

Tree Maintenance
- Insert
- Delete

Covering Algorithms
- Similar to Adv-Sub overlapping algorithms
- Absolute simple XPath expressions (XPEs)
- Relative simple XPEs
- XPEs with // operator
S1    S2    Cover
*     *     Y
*     t     Y
t     *     N
t     t     Y
t1    t2    N
e.g., S1 = /*/a//e/c, S2 = /a/a/*//c/e/c/d
[Figure: repeated frames comparing S1 against S2]
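For absolute simple XPEs, the covering table above can be applied step by step. The sketch below is a simplified illustration, not the algorithm from the talk: it assumes that S1 covers S2 when S1 is no longer than S2 and every step of S1 covers the aligned step of S2, and it does not handle relative XPEs or the `//` operator.

```python
from typing import List


def steps(xpe: str) -> List[str]:
    """Split "/a/b/*" into ["a", "b", "*"]."""
    return [s for s in xpe.split("/") if s]


def step_covers(s1: str, s2: str) -> bool:
    """Rule table: a wildcard covers anything, a name covers only itself."""
    return s1 == "*" or s1 == s2


def covers(s1: str, s2: str) -> bool:
    """Assumption: a shorter absolute XPE covers a longer one that extends it."""
    a, b = steps(s1), steps(s2)
    return len(a) <= len(b) and all(step_covers(x, y) for x, y in zip(a, b))


if __name__ == "__main__":
    print(covers("/a/*/c/*", "/a/*/c/d"))
    print(covers("/a/*/c/d", "/a/*/c/*"))
```

Unlike the overlap rule, covering is not symmetric: a named element in S1 does not cover a wildcard in S2, which is the `t / * / N` row of the table.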

Merging Rules
- XPEs with one difference (e.g., element, operator): S1 = /a/*/c/d and S2 = /a/*/c/e merge into S = /a/*/c/*
- XPEs with different sub-XPEs, e.g., …
  [Figure labels: XPE1, XPE2, …, S1, S2, …, S, //]
- Merge degree
  [Figure labels: P(S1), P(S2), P(S)]
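The first merging rule can be sketched for the case where two absolute XPEs of equal length differ in exactly one element. The function name `merge_one_difference` is hypothetical, and the sketch covers neither operator differences nor the sub-XPE rule.

```python
from typing import List, Optional


def steps(xpe: str) -> List[str]:
    """Split "/a/b/*" into ["a", "b", "*"]."""
    return [s for s in xpe.split("/") if s]


def merge_one_difference(s1: str, s2: str) -> Optional[str]:
    """Merge two XPEs that differ in exactly one element, else return None."""
    a, b = steps(s1), steps(s2)
    if len(a) != len(b):
        return None
    differing = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(differing) != 1:
        return None
    merged = a.copy()
    merged[differing[0]] = "*"  # generalise the differing element to a wildcard
    return "/" + "/".join(merged)


if __name__ == "__main__":
    print(merge_one_difference("/a/*/c/d", "/a/*/c/e"))
```

The merged XPE also matches paths that neither original matched (for instance `/a/x/c/f` in this example), so merging trades a smaller routing table for possible false positives.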

Evaluation Setup
- Implemented in C++
- Overlay with 127 content-based routers
- Cluster (each node: 1.86 GHz, 4G) vs. PlanetLab
- Workloads are generated from two DTDs: NITF and PSD
- Metrics
  - Number of subscriptions per router
  - Network traffic
  - XPE processing time
  - Notification delay

Routing Table Size

Routing Table Size

Network Traffic
Method                  Network Traffic   Delay (ms)
No-Adv-No-Cov           654,
No-Adv-With-Cov         572,
With-Adv-No-Cov         398,
With-Adv-With-Cov       326,
With-Adv-With-CovPM     254,
With-Adv-With-CovIPM    257,

Process Time

Notification Delay (PSD)

Notification Delay (NITF)

Related Work
- Locating data sources in large distributed systems [Galanis et al. 2003]
  - DHT-based approach
  - Data summary
- Query aggregation for scalable data dissemination [Chan et al. 2002]
  - Equivalence between the original query set and the aggregated set
- ONYX [Diao et al. 2004]
  - Delivers part of the XML documents
  - Shares common prefixes among queries using an NFA
- XTreeNet [Fenner et al. 2005]
  - Unifies the pub/sub model and the query/response model
  - Avoids repeated matching at each hop

Conclusions
- Investigate advertisement-based routing for XML data dissemination networks
- Propose a novel data structure to maintain covering and merging relationships among XPEs
- Perform experimental evaluation on a 127-broker overlay to demonstrate the approach
  - Reduce routing table size by up to 90%
  - Improve routing latency by roughly 85%
- Future work
  - Extend to tree patterns
  - Share common prefixes among XPEs in overlapping and covering algorithms

Q & A
Contact: Middleware Systems Research Group, University of Toronto

Process Time
[Figure: x-axis: Number of Subscriptions; y-axis: Time (ms)]

Notification Delay (NITF)

Notification Delay (PSD)
[Figure: x-axis: Number of Hops; y-axis: Notification Delay (ms)]

False Positives

Conclusions
- Investigate advertisement-based routing for XML data dissemination networks
- Present algorithms to determine the covering relations among arbitrary XPEs
- Propose a novel data structure to maintain covering and merging relationships among XPEs
- Explore rules to merge similar XPEs in order to further reduce the routing table size
- Perform experimental evaluation on a 127-broker overlay to demonstrate the approach
  - Reduce routing table size by up to 90%
  - Improve routing latency by roughly 85%