Network-Aware Query Processing for Stream-based Applications. Yanif Ahmad, Ugur Cetintemel (Brown University). VLDB 2004.


One-line Comment
This paper addresses the operator placement problem in distributed query processing by using network latency information.

Contents
- Motivation
- Problem
- Solution Approach
- Central Version of Algorithm
  - Edge
  - Edge+
  - In-Network
  - Latency-Constrained
- Distributed Version of Algorithm
- Experiment
- Critique

Motivation
- A small-scale query processing system is not scalable
  - There are many data streams and query requests
- Hence: widely distributed query processing

Problem
- Operator placement problem
  - Operators in query processing trees should be dispersed into the network
[Figure: a processing tree (query plan) with operators O00 to O26, and the same operators placed on nodes of an IP network. Legend: operator, node, application node.]

Problem: formalized version
- Operator placement problem
  - Goal: an efficient operator placement
  - Cost: bandwidth
- O: operators; A: their connected inputs and outputs
- V: nodes; E: their links
- c(): link cost (bandwidth); c(a) = 0 if [condition missing] for a = (m, n)
- The locations of the source operators are predetermined
[Figure: an arc a from m to n, labeled with its cost c(a).]
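The cost model above can be illustrated with a minimal Python sketch. The condition under which c(a) = 0 is not legible in the transcript, so the sketch assumes it means "both endpoints of the arc are mapped to the same node". The function name `placement_cost`, the tiny tree, and the cost values are all hypothetical.

```python
from typing import Dict, Tuple

Arc = Tuple[str, str]  # (child operator m, parent operator n)


def placement_cost(arc_cost: Dict[Arc, float], mapping: Dict[str, str]) -> float:
    """Total bandwidth cost of a placement.

    ASSUMPTION: c(a) is paid only when m and n are mapped to different nodes.
    """
    total = 0.0
    for (m, n), cost in arc_cost.items():
        if mapping[m] != mapping[n]:
            total += cost
    return total


# Hypothetical three-operator tree: O10 and O11 feed O00.
arcs = {("O10", "O00"): 4.0, ("O11", "O00"): 2.0}
co_located = {"O00": "S0", "O10": "S0", "O11": "S1"}  # O00 shares S0 with O10
at_proxy = {"O00": "P", "O10": "S0", "O11": "S1"}     # O00 sits at the proxy P

print(placement_cost(arcs, co_located))  # only the O11 -> O00 arc is paid
print(placement_cost(arcs, at_proxy))    # both arcs are paid
```

Under this assumption, moving O00 next to its most expensive child removes that arc from the total, which is the intuition the Edge heuristics build on.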

Solution Approach
Network-aware operator placement algorithms:
- Edge: considers only the sources and the proxy location
- Edge+: Edge with pairwise server communication latencies
- In-Network: sources, proxy, and a subset of all locations
- Latency-bound algorithm


Algorithm Design Principle
- Naïve algorithm for operator placement
  - Evaluate every combination of possible mappings
  - => Too complex
- Greedy algorithm
  - Evaluate only the locations that are likely candidates
  - Place operators in post-order
  - When an operator is placed at a location, its children can be moved as well
[Figure: processing tree with operators O00 to O26 mapped onto an IP network with sources S0 and S1. Legend: operator, node, application node.]
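To make the contrast concrete, here is a small Python sketch of the two ingredients named on this slide: the size of the naive search space and a post-order visit of the processing tree. The tree shape and the helper names (`post_order`, `naive_mapping_count`) are hypothetical, and the count assumes every operator may be placed at any location.

```python
from typing import Dict, List


def post_order(tree: Dict[str, List[str]], root: str) -> List[str]:
    """Return the operators so that every child precedes its parent."""
    order: List[str] = []

    def visit(op: str) -> None:
        for child in tree.get(op, []):
            visit(child)
        order.append(op)

    visit(root)
    return order


def naive_mapping_count(num_operators: int, num_locations: int) -> int:
    """Number of mappings if each operator may go to any location."""
    return num_locations ** num_operators


# Hypothetical seven-operator binary processing tree.
tree = {"O00": ["O10", "O11"], "O10": ["O20", "O21"], "O11": ["O22", "O23"]}
order = post_order(tree, "O00")
print(order)
print(naive_mapping_count(len(order), 10))
```

A greedy pass visits the operators in `order` once and decides each placement locally, instead of enumerating all mappings.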

Mapping Function
[Figure: processing tree with a root operator, second-level operators O10 to O12, and third-level operators O20 to O29.]

Edge
- Location candidates: sources, proxy
- Candidates with high possibility
  - (1) One of the children's locations
  - (2) A common location
  - (3) The proxy's location
- Link cost

Edge (1) One of the children's locations
- Pick the child location that maximizes the total tree cost between the operator and its children at that location, since arcs between co-located operators then cost nothing
[Figure: processing tree with operators O00 to O29; the third-level operators are labeled with source locations S0, S1, S2.]
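One reading of this rule can be sketched in Python: for each location that hosts at least one child, add up the arc costs to the children placed there, and choose the location with the largest sum, because those arcs would become free. This is an interpretation of the slide, not a confirmed statement of the authors' algorithm; `best_child_location` and the example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def best_child_location(child_loc: Dict[str, str], child_cost: Dict[str, float]) -> str:
    """Pick the child location that absorbs the most arc cost.

    ASSUMPTION: an arc to a child is free when parent and child are co-located.
    """
    saved: Dict[str, float] = defaultdict(float)
    for child, loc in child_loc.items():
        saved[loc] += child_cost[child]
    # Iterate over sorted names so that ties resolve deterministically.
    return max(sorted(saved), key=lambda loc: saved[loc])


# Hypothetical children of one operator, their locations, and their arc costs.
child_loc = {"O10": "S0", "O11": "S1", "O12": "S1"}
child_cost = {"O10": 5.0, "O11": 3.0, "O12": 3.0}

print(best_child_location(child_loc, child_cost))
```

In this example S1 hosts two children whose arcs sum to 6.0, which beats the single 5.0 arc at S0.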

Edge (2) A common location
- Idea
  - Place an operator and its children at a common location
  - -> zero overlay cost between the operator and its children
- Common location (cl)
  - A good place for all of the children
  - -> the intersection of each child's dl (the set of descendant leaf locations)
- Example: dl(O11) = {S0, S1, S2}, cl(O00) = {S0, S1}
[Figure: processing tree with operators O00 to O29; the third-level operators are labeled with source locations S0, S1, S2.]
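The dl and cl sets can be computed with a short recursive sketch in Python. The tree below is a hypothetical, smaller tree than the one in the figure, chosen so that it reproduces the two values quoted on the slide; the function names are illustrative only.

```python
from typing import Dict, List, Set


def descendant_leaf_locations(
    tree: Dict[str, List[str]], leaf_loc: Dict[str, str], op: str
) -> Set[str]:
    """dl(op): the set of locations of all leaves below op."""
    children = tree.get(op, [])
    if not children:
        return {leaf_loc[op]}
    result: Set[str] = set()
    for child in children:
        result |= descendant_leaf_locations(tree, leaf_loc, child)
    return result


def common_locations(
    tree: Dict[str, List[str]], leaf_loc: Dict[str, str], op: str
) -> Set[str]:
    """cl(op): the intersection of dl over the children of op."""
    child_sets = [
        descendant_leaf_locations(tree, leaf_loc, c) for c in tree.get(op, [])
    ]
    return set.intersection(*child_sets) if child_sets else set()


# Hypothetical tree and leaf (source) locations.
tree = {"O00": ["O10", "O11"], "O10": ["O20", "O21"], "O11": ["O22", "O23", "O24"]}
leaf_loc = {"O20": "S0", "O21": "S1", "O22": "S0", "O23": "S1", "O24": "S2"}

print(descendant_leaf_locations(tree, leaf_loc, "O11"))
print(common_locations(tree, leaf_loc, "O00"))
```

Any location in cl(O00) already appears under every child of O00, so the operator and its children could all be placed there.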

Edge (3) Proxy's location
- Idea
  - If tree costs are higher near the root
  - -> place the operator at the proxy location, r
[Figure: processing tree with operators O00 to O29; the third-level operators are labeled with source locations S0, S1, S2.]

Edge: Summary

Edge+
- Location candidates: sources, proxy
- Edge with the network latency (d) between two locations
- Link cost
- Mapping function
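The Edge+ link cost formula did not survive in the transcript. As a hedged illustration only, the Python sketch below assumes that the latency d between two locations weights the bandwidth of each arc (cost = bandwidth * d, and zero when co-located). The function `edge_plus_cost`, the latency table, and all numbers are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

Arc = Tuple[str, str]  # (child operator, parent operator)


def edge_plus_cost(
    arc_bw: Dict[Arc, float],
    mapping: Dict[str, str],
    latency: Dict[FrozenSet[str], float],
) -> float:
    """Latency-weighted bandwidth cost of a placement.

    ASSUMPTION: each arc costs bandwidth * d(location of m, location of n).
    """
    total = 0.0
    for (m, n), bw in arc_bw.items():
        loc_m, loc_n = mapping[m], mapping[n]
        if loc_m != loc_n:
            total += bw * latency[frozenset((loc_m, loc_n))]
    return total


# Hypothetical arcs, placement, and symmetric pairwise latencies (ms).
arcs = {("O10", "O00"): 4.0, ("O11", "O00"): 2.0}
mapping = {"O00": "P", "O10": "S0", "O11": "S1"}
latency = {frozenset(("P", "S0")): 10.0, frozenset(("P", "S1")): 30.0}

print(edge_plus_cost(arcs, mapping, latency))
```

Keying the latency table by `frozenset` keeps the lookup symmetric, so d(a, b) and d(b, a) share one entry.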

In-Network Placement
- Location candidates: arbitrary locations (including sources and proxy)
- Overlay cost and mapping function are the same as in Edge+
- Problem: reducing the candidate location set

In-Network Placement
- Approach
  - Remove a location unless its distance to all current child placements is less than all pairwise distances between child placements
[Figure: operators O00, O10, O11, O12 and candidate network nodes N2, N4, N7, N8.]
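The pruning rule stated above translates fairly directly into code. The Python sketch below keeps a candidate only if its distance to every child placement is smaller than the smallest pairwise distance between child placements. The node names come from the figure, but the one-dimensional coordinates and the `prune_candidates` helper are hypothetical.

```python
from itertools import combinations
from typing import Callable, List


def prune_candidates(
    candidates: List[str],
    child_placements: List[str],
    dist: Callable[[str, str], float],
) -> List[str]:
    """Keep only locations closer to every child than the children are to each other."""
    pairwise = [dist(p, q) for p, q in combinations(child_placements, 2)]
    if not pairwise:
        # With fewer than two child placements the rule gives no threshold.
        return list(candidates)
    threshold = min(pairwise)
    return [
        v for v in candidates
        if all(dist(v, p) < threshold for p in child_placements)
    ]


# Hypothetical positions on a line; distance is the absolute difference.
position = {"N2": 0.0, "N4": 10.0, "N7": 5.0, "N8": 30.0}


def line_distance(a: str, b: str) -> float:
    return abs(position[a] - position[b])


kept = prune_candidates(["N7", "N8"], ["N2", "N4"], line_distance)
print(kept)
```

Here N7 lies between the two child placements and survives, while N8 is farther from both children than they are from each other and is dropped.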

Latency-Constrained Placement
- Find the configuration that satisfies the latency constraint
- Latency constraint
  - P: the set of leaf-to-root paths
- Example: l = 75
[Figure: operators O20, O21, O22 at sources S0 and S1, with in-network nodes N4, N5, N7.]
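The constraint formula itself is garbled in the transcript. A plausible reading, sketched below in Python, is that every leaf-to-root path in P must have a total latency of at most l. The path representation, the latency table, and the names `path_latency` and `satisfies_bound` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List


def path_latency(
    path: List[str],
    mapping: Dict[str, str],
    latency: Dict[FrozenSet[str], float],
) -> float:
    """Sum of latencies between consecutive operators on a leaf-to-root path."""
    total = 0.0
    for m, n in zip(path, path[1:]):
        loc_m, loc_n = mapping[m], mapping[n]
        if loc_m != loc_n:  # co-located operators add no network latency
            total += latency[frozenset((loc_m, loc_n))]
    return total


def satisfies_bound(
    paths: List[List[str]],
    mapping: Dict[str, str],
    latency: Dict[FrozenSet[str], float],
    bound: float,
) -> bool:
    """ASSUMPTION: the constraint holds if every path in P stays within the bound."""
    return all(path_latency(p, mapping, latency) <= bound for p in paths)


# Hypothetical placement and latencies (ms).
paths = [["O20", "O10", "O00"], ["O21", "O10", "O00"]]
mapping = {"O20": "S0", "O21": "S1", "O10": "N4", "O00": "P"}
latency = {
    frozenset(("S0", "N4")): 20.0,
    frozenset(("S1", "N4")): 30.0,
    frozenset(("N4", "P")): 40.0,
}

print(satisfies_bound(paths, mapping, latency, bound=75.0))
print(satisfies_bound(paths, mapping, latency, bound=65.0))
```

With l = 75 both paths (60 ms and 70 ms) fit, while a bound of 65 rejects the placement because of the second path.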


Distributed Query Placement
- Reason: the centralized approach is not scalable
  - Substantial network state
  - Algorithm complexity

Distributed Query Placement
- Partition a processing tree into subtrees (zones)
- Assign each zone to a coordinator node
[Figure: processing tree with operators O1 to O4 and the application proxy, divided among coordinators C1 to C4.]
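The slide does not say how zones are chosen, so the following Python sketch only illustrates the bookkeeping: given a hypothetical set of zone roots, each operator joins the zone of its nearest ancestor that is a zone root, and each zone is then paired with a coordinator name. The tree, the zone roots, and `partition_into_zones` are assumptions.

```python
from typing import Dict, List, Set


def partition_into_zones(
    tree: Dict[str, List[str]], root: str, zone_roots: Set[str]
) -> Dict[str, List[str]]:
    """Group operators into subtrees, one per zone root (the tree root always starts a zone)."""
    zones: Dict[str, List[str]] = {}

    def visit(op: str, current_zone: str) -> None:
        if op in zone_roots:
            current_zone = op
        zones.setdefault(current_zone, []).append(op)
        for child in tree.get(op, []):
            visit(child, current_zone)

    visit(root, root)
    return zones


# Hypothetical processing tree and zone roots.
tree = {"O1": ["O2", "O3"], "O2": ["O4", "O5"], "O3": ["O6"]}
zones = partition_into_zones(tree, "O1", {"O2", "O3"})

# Pair each zone with a hypothetical coordinator name C1, C2, ...
coordinators = {zone_root: f"C{i}" for i, zone_root in enumerate(zones, start=1)}

print(zones)
print(coordinators)
```

Each coordinator would then run the placement for its own zone, which limits both the network state it needs and the size of the subproblem it solves.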

Distributed Query Placement
[Figure: coordinators C1 to C4 connected by a tree overlay.]

Experiment
- Experimental setup
  - Processing tree: binary tree, depth 3 to 5
  - Network topology: max pairwise path delay 500 ms
  - Server and proxy location
    - Uniform: APD = ASD
    - Star: APD = 0.5 * ASD
    - Cluster: APD = 2 * ASD
  - APD: Average Proxy Distance; ASD: Average Server Distance
[Figure: server locations and the proxy location in the uniform, cluster, and star configurations.]
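The three configurations differ only in the ratio APD / ASD. The Python sketch below computes that ratio under two assumptions that the slide does not spell out: APD is the mean distance from the proxy to each server, and ASD is the mean distance over all server pairs. The one-dimensional positions and helper names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def average_server_distance(servers: List[str], pos: Dict[str, float]) -> float:
    """ASSUMPTION: ASD is the mean distance over all pairs of servers."""
    pairs = list(combinations(servers, 2))
    return sum(abs(pos[a] - pos[b]) for a, b in pairs) / len(pairs)


def average_proxy_distance(proxy: str, servers: List[str], pos: Dict[str, float]) -> float:
    """ASSUMPTION: APD is the mean distance from the proxy to each server."""
    return sum(abs(pos[proxy] - pos[s]) for s in servers) / len(servers)


# Hypothetical positions on a line, with the proxy in the middle of the servers.
pos = {"S0": 0.0, "S1": 10.0, "S2": 20.0, "P": 10.0}
servers = ["S0", "S1", "S2"]

asd = average_server_distance(servers, pos)
apd = average_proxy_distance("P", servers, pos)
ratio = apd / asd
print(ratio)
```

A proxy in the middle of the servers gives a ratio of about 0.5, which corresponds to the star configuration; a proxy far from a tight group of servers would push the ratio toward the cluster case.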

Experiment
- Latency constraints
  - 120 ms (0.9nd, tight delay) vs. 300 ms (2.2nd, loose delay)
- Direct comparison
  - Baseline case: all operators are located at the proxy
- Results
[Figure: two result plots, bandwidth consumption and latency stretch.]

Critique
- Pros
  - Addresses the operator placement problem
  - Focuses on network-related cost (bandwidth, latency) rather than processing cost
- Cons
  - High-complexity algorithm: is it practical to apply?
    - Heavy processing
    - Completing the placement takes too much time
    - Latency information for many locations is needed
    - Sequential convergence in a bottom-up manner
    - => Impossible to use with a complex query plan and topology
    - => A simpler algorithm would be more appropriate
  - Dynamic? Not resilient to dynamic topology changes
    - For example, a node leaving or a latency change