Client Assignment in Content Dissemination Networks for Dynamic Data Shetal Shah Krithi Ramamritham Indian Institute of Technology Bombay Chinya Ravishankar.

Presentation transcript:

Client Assignment in Content Dissemination Networks for Dynamic Data Shetal Shah Krithi Ramamritham Indian Institute of Technology Bombay Chinya Ravishankar University of California Riverside

Dynamic Data. Examples: traffic data (packets through switches, vehicles on highways), stock prices, sports scores. Characteristics: rapid and unpredictable changes; time critical and value critical; used in on-line monitoring and decision making. More and more of the data gathered from the web/Internet is dynamic.

Coherency of Dynamic Data. Strong coherency: the client and the source are always in sync (U(t) = S(t)). Strong coherency is expensive! Relax strong coherency to Δ-coherency. Time domain: Δt-coherency. Value domain: Δv-coherency, where the difference between the data values at the client and at the source is bounded by Δv at all times. E.g., only temperature changes greater than 1 degree are of interest. [Figure: Source S(t), Repository R(t), Client U(t)]
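A minimal sketch of value-domain coherency, assuming the simplest possible policy: an update is forwarded only when the source value has drifted by more than Δv from the last value the client received. The function names and the sample readings are hypothetical and are not taken from the slides.

```python
def needs_push(source_value, last_pushed_value, delta_v):
    """True when the client's copy differs from the source by more than delta_v."""
    return abs(source_value - last_pushed_value) > delta_v


def filter_updates(source_values, delta_v):
    """Return the source values that must be forwarded to respect the delta_v bound."""
    pushed = []
    last_pushed = None
    for value in source_values:
        # The first value is always sent; later ones only when the bound is exceeded.
        if last_pushed is None or needs_push(value, last_pushed, delta_v):
            pushed.append(value)
            last_pushed = value
    return pushed


# Hypothetical temperature readings with a 1-degree coherency requirement.
readings = [20.0, 20.5, 21.2, 21.9, 23.0]
forwarded = filter_updates(readings, 1.0)
```

With these readings only three of the five values are forwarded, which is the saving that relaxed coherency is meant to buy.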

Broad Focus of Work. Goal: create a scalable content dissemination network (CDN) for streaming/dynamic data. Metric: fidelity, the percentage of time for which the coherency requirement is met.
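As an illustration of the fidelity metric, the sketch below assumes that the source and client values are sampled at the same equally spaced instants, so that "percentage of time" becomes "percentage of samples". The function name and the traces are hypothetical.

```python
def fidelity(source_trace, client_trace, delta_v):
    """Percentage of samples at which |S(t) - U(t)| <= delta_v."""
    if not source_trace or len(source_trace) != len(client_trace):
        raise ValueError("traces must be non-empty and of equal length")
    met = sum(
        1 for s, u in zip(source_trace, client_trace) if abs(s - u) <= delta_v
    )
    return 100.0 * met / len(source_trace)


# Hypothetical traces: the client lags behind the source at the third sample.
source = [10.0, 10.5, 12.0, 12.2]
client = [10.0, 10.0, 10.0, 12.0]
score = fidelity(source, client, 1.0)
```

Loss in fidelity, as used later in the slides, would then be 100 minus this value.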

Basic Framework: Sources, Repositories, Clients. Clients request different data items, specifying a coherence requirement for each item. Repositories derive their requirements from the client requirements. The source pushes the changes of interest to the repositories. Repositories cooperate with each other and with the source to serve clients.

Example Dissemination Network. Data set: p, q, r. Max clients: 2. [Figure: a Source and repositories R1, R2, R3, R4; node labels (p: 0.2, q: 0.2, r: 0.2), (p: 0.4, r: 0.3), (q: 0.3)]

Challenges – I. Given the data and coherency needs of repositories, how should repositories cooperate to satisfy these needs? How should repositories refresh the data so that the coherency requirements of dependents are satisfied? How can the repository network be made resilient to failures? [VLDB02, VLDB03, IEEE TKDE]

Challenges – II: Service to Clients. Given the data and coherency needs of clients, what data at what coherency should reside in each repository? Given the data and the coherency available at repositories, how should clients be assigned to the repositories? Service to clients: assign data to repositories; assign clients to repositories.

Assigning clients to repositories. Goals: the client request is satisfied; overheads (communication delay, computational delay) are low. [Figure: client C1 is to be assigned to a repository in the example network with Source and R1, R2, R3, R4; node labels (p: 0.2, q: 0.2, r: 0.2), (p: 0.4, r: 0.3), (q: 0.3)]

Overview. The client assignment problem is NP-hard. Solve using preferences: clients and repositories order each other by preference; use stable marriages. Assign costs and do many-to-one client-repository pairing.

Cost based Client Assignment. Assign a cost to each potential client-repository pair. [Figure: clients and repositories with edge costs; minimum cost assignment = {1, 3, 7}]

Cost based Client Assignment: load. The minimum cost assignment {1, 3, 7} causes load imbalance! Minimum weight matching gives the assignment {1, 3, 8}. [Figure: clients and repositories with edge costs]

Client Assignment. An assignment may contribute to delay for other assignments at the same node. [Figure: minimum weight matching over the repositories, assignment = {1, 3, 8}]

Matching. Bipartite graph $G = (V_1, V_2, E)$ with $E \subseteq V_1 \times V_2$ and a weight on each edge. Matching: a set of edges in which no vertex appears more than once. Minimum weight matching: of all matchings, the one of minimum weight. With many clients and few repositories, a many-to-one matching is needed.
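The contrast between a per-client minimum cost assignment and a minimum weight matching can be reproduced on a tiny example. The cost matrix below is hypothetical, chosen only so that the two outcomes come out as {1, 3, 7} and {1, 3, 8}; the exhaustive search is a sketch for small inputs, not the method used in the talk.

```python
from itertools import permutations

# Hypothetical costs: COSTS[client][repository].
COSTS = [
    [1, 5, 9],
    [4, 3, 9],
    [7, 9, 8],
]


def cheapest_per_client(costs):
    """Each client independently picks its cheapest repository (may overload one)."""
    return [min(range(len(row)), key=row.__getitem__) for row in costs]


def min_weight_matching(costs):
    """One-to-one matching of minimum total cost, found by trying every permutation."""
    n = len(costs)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(costs[c][perm[c]] for c in range(n)),
    )
    return list(best)


greedy = cheapest_per_client(COSTS)    # repository index chosen by each client
matching = min_weight_matching(COSTS)  # repository index matched to each client
greedy_costs = [COSTS[c][r] for c, r in enumerate(greedy)]
matching_costs = [COSTS[c][r] for c, r in enumerate(matching)]
```

In this example the greedy choice sends two clients to repository 0, while the matching spreads them at a slightly higher total cost.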

Many-to-one Matching: Min Cost Network Flows. Directed graph $G = (V, E)$. Start vertex (source) and end vertex (sink). Each edge has a capacity (the maximum flow the edge can carry) and a cost per unit of flow. At every intermediate vertex, inflow = outflow. [Figure: a network from Start to End]

A Possible Flow in the Network. [Figure: a capacity graph and a corresponding flow graph, each from Start to End]

Maximum Flow. Value of the flow: the flow leaving the source. Maximum flow: a flow whose value is maximum. Cost of a flow $= \sum_{e \in E} \text{flow}(e) \times \text{cost per unit flow}(e)$. Min cost flow: a maximum flow of minimum cost. [Figure: a network from Start to End]

Client Assignment Using Network Flows. X, Y, Z: the number of clients each repository is willing to serve. Capacity of edge = 1. Sum of capacities on edges ≥ number of client requests. [Figure: flow network from Start to End with edge capacities 1 and X, Y, Z]

Network Flows: Costs and Capacities. Edge capacity: 1. Edge cost: a function of communication delays and coherence requirement. Cost of all other edges: 0. [Figure: flow network from Start with capacities X, Y, Z]

Max Flows. Flow out of the start node = number of client requests. Each unit of flow makes one assignment. Cost of a unit of flow = cost of the assignment. The maximum flow of minimum cost is the required solution. But this could overload the repositories! [Figure: flow network from Start with capacities X, Y, Z]
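The flow formulation can be sketched in pure Python. The talk itself uses the RelaxIV solver; the code below swaps in a plain successive-shortest-paths method instead. It assumes the layering Start, clients, repositories, End, with capacity 1 on Start-to-client and client-to-repository edges and the repository limits on the edges into End. `MinCostFlow`, `assign_clients`, and the cost values are hypothetical names and data.

```python
from collections import deque

INF = float("inf")


class MinCostFlow:
    """Successive shortest paths using a queue-based Bellman-Ford search."""

    def __init__(self, n):
        self.n = n
        # Each edge is [target, residual capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        flow = cost = 0
        while True:
            dist = [INF] * self.n
            prev = [None] * self.n
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, w, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + w < dist[v]:
                        dist[v] = dist[u] + w
                        prev[v] = (u, i)
                        if not queued[v]:
                            queue.append(v)
                            queued[v] = True
            if prev[t] is None:  # no augmenting path left: the flow is maximum
                return flow, cost
            push, v = INF, t
            while v != s:  # bottleneck capacity along the path
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:  # update residual capacities
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]


def assign_clients(costs, capacities):
    """Return (repository index per client, total cost) from a min-cost max flow."""
    n, k = len(costs), len(capacities)
    start, end = 0, n + k + 1
    net = MinCostFlow(n + k + 2)
    for c in range(n):
        net.add_edge(start, 1 + c, 1, 0)
        for r in range(k):
            net.add_edge(1 + c, 1 + n + r, 1, costs[c][r])
    for r in range(k):
        net.add_edge(1 + n + r, end, capacities[r], 0)
    _, total = net.solve(start, end)
    assignment = [None] * n
    for c in range(n):
        for v, cap, _, _ in net.graph[1 + c]:
            if v > n and cap == 0:  # saturated client-to-repository edge
                assignment[c] = v - 1 - n
    return assignment, total


# Hypothetical costs[client][repository] and per-repository limits.
example_costs = [[1, 5, 9], [4, 3, 9], [7, 9, 8]]
balanced = assign_clients(example_costs, [1, 1, 1])
skewed = assign_clients(example_costs, [2, 1, 1])
```

Raising the limit of repository 0 from 1 to 2 lowers the total cost but concentrates two clients on it, which is the overload risk noted above.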

Considering Load: Iterative Min Cost Flows. Load depends on the coherence requirements of the assignments, and the assignments depend on this load! Limit the number of requests assigned to a repository using capacity. But this number does not translate directly into load; it does so only if the coherences are close to each other.

Iterative Min Cost Flows. Split the requests into ranges. For each range: calculate the approximate load at each repository due to the previous assignments; calculate the approximate load of the assignments to be made in this range; determine the capacity of each repository; find the min-cost max flow.

For Each Range. Number of updates for coherence $c_i$ is $c_i^{-2}$. Approximate load at repository $R_i$: $A_i$. Average load: $A$. For $n$ client requests, expected load $= n \cdot c_i^{-2}$. Number of repositories: $k$. Let $t_i$ be the number of assignments in the current range to repository $R_i$. Total load at $R_i$ will be $A_i + t_i \cdot c_i^{-2}$. Average load at R after assignment = [equation not recovered]. Capacity for $R_i$ = [equation not recovered].
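The expressions for the average load after assignment and for the capacity are missing from this slide. One reading consistent with the definitions above, assumed for this sketch only, is to pick each capacity so that every repository ends the range at the same load: set $A_i + t_i \cdot c^{-2}$ equal to $A + n \cdot c^{-2} / k$ and solve for $t_i$. The sketch uses a single representative coherence `coherence` for the whole range, clips negative values to zero, and rounds up so that the capacities sum to at least `n_requests`. All names are hypothetical.

```python
import math


def range_capacities(current_loads, n_requests, coherence):
    """Per-repository capacities for one coherence range (assumed balancing rule)."""
    per_request = coherence ** -2          # updates generated by one request
    k = len(current_loads)
    # Load each repository would carry if the range were spread perfectly evenly.
    target = (sum(current_loads) + n_requests * per_request) / k
    return [
        max(0, math.ceil((target - load) / per_request))
        for load in current_loads
    ]


# Hypothetical loads A_i left by earlier ranges, 6 new requests at coherence 0.5.
capacities = range_capacities([0.0, 8.0, 16.0], 6, 0.5)
```

Under this rule a repository that is already at or above the target load receives capacity 0 for the range, so the min-cost flow cannot place further requests on it.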

Client Assignment Problem is NP-Hard! Input: sources $S_1 \ldots S_s$, repositories $R_1 \ldots R_r$, clients $L_1 \ldots L_n$, data items $d_1 \ldots d_k$. For each $S_i$ and $R_j$: the list of data items served and the coherence requirements. Request triplets, each with a positive value (indexed by i, j). Distribution of each $d_j$. Communication delays and computational delays. Question: is there an assignment of every triplet to some $R_k$ such that $c_i^{R_j} \le c_i^{L_j}$ and the minimum fidelity is at least that positive value?

Reduce Partition to the Client Assignment Problem. Partition input: a set S; for each element of S, a positive weight $w_i$; a positive integer k. Question: is there a partition of S into k equally weighted parts?

Best Effort Service. Client C1 requests q at coherence 0.1. The client will be served q at coherence 0.2. [Figure: example network with Source, R1, R2, R3, R4 and client C1; node labels (p: 0.2, q: 0.2, r: 0.2), (p: 0.4, r: 0.3), (q: 0.3)]

Augmentation. Client C1 requests q at coherence 0.1. Coherence of A for q is changed to 0.1. [Figure: example network with Source, R1, R2, R3, R4 and client C1; node labels (p: 0.2, q: 0.1, r: 0.2), (p: 0.4, r: 0.3), (q: 0.3)]

Experimental Methodology. Network: 1 source, repositories, 10,000 – 80,000 client requests. Real stock traces. Time duration of observations: 10,000 s. Ranges for min cost flow: { , , , }. Network flow solver: RelaxIV from p.html

For comparison: prior online Global Heuristic. A selector node for each data item. The selector keeps information on the coherence requirements at repositories, the delays between the nodes in the network, and the number of clients assigned to each repository. A client is assigned to the repository where the sum of the delays is minimized. Two flavours: GHIS, GHES. Reference: S. Agarwal et al. Construction of a Temporal Coherency Preserving Dynamic Data Dissemination Network. RTSS'04
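A rough sketch of the selector's decision for one data item, under stated assumptions: a repository is eligible only if it holds the item at a coherence at least as stringent as the client's and has not reached a client limit, and "sum of the delays" is taken to be communication delay plus computational delay. The record fields, the limit, and the numbers are hypothetical; the slide does not say how GHIS and GHES differ, so neither is modelled here.

```python
def select_repository(client_coherence, repositories, max_clients):
    """Pick the eligible repository with the smallest total delay, or None."""
    eligible = [
        repo for repo in repositories
        if repo["coherence"] <= client_coherence and repo["clients"] < max_clients
    ]
    if not eligible:
        return None
    best = min(eligible, key=lambda repo: repo["comm_delay"] + repo["comp_delay"])
    best["clients"] += 1  # the selector tracks how many clients each one serves
    return best["name"]


# Hypothetical state kept by the selector for one data item (delays in ms).
repos = [
    {"name": "R1", "coherence": 0.2, "comm_delay": 30, "comp_delay": 5, "clients": 0},
    {"name": "R2", "coherence": 0.4, "comm_delay": 10, "comp_delay": 5, "clients": 0},
    {"name": "R3", "coherence": 0.2, "comm_delay": 20, "comp_delay": 5, "clients": 2},
]
choices = [select_repository(0.3, repos, max_clients=2) for _ in range(3)]
```

Here R2 is closest but too coarse, R3 is full, and R1 fills up after two requests, so the third request finds no eligible repository.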

Performance of the algorithms. GHIS does better than MCF and GHES initially, but degrades rapidly: unsatisfied requests lead to source overloading! Augmentation performs very well. GHES and MCF are comparable for a small number of repositories. Workload: 50% of client requests between 0.01 and [upper bound missing]; the remainder from 0.1 to 0.99. [Figure legend: GHIS]

MCF vs GHES (best effort). MCF does better as the number of repositories increases. In fact, for some simple inputs, MCF did better than GHES by a factor of 9! Topology: 1 source, 10 repositories, 50 data items. [Figure legend: GHIS, GHES, MCF]

Augmentation helps, but… as the load increases, augmentation increases the loss in fidelity. As load increases, serving clients at less stringent coherence requirements might actually reduce the loss in fidelity!

Need to adapt to load: fair vs. biased approaches. It is better to be biased than to be fair! [Figure panels: Fair Approach, Biased Approach; curves: MCF_aug, MCF]

Adaptive Algorithm. For each data item, the source maintains a list of unique coherences and the number of clients for each coherence. If the queuing delay at any source/repository crosses a threshold th1: for each data item, the source reduces the coherence of service for some clients. If the queuing delay at any source/repository goes below a threshold th2: resume service at the desired coherency for some of the clients.
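The two thresholds behave like a hysteresis band, which the sketch below illustrates. It assumes th2 < th1, that "crosses th1" means the largest observed queuing delay exceeds th1, and that service is restored only once the largest delay has dropped below th2 (the slide's "any" could also be read differently). Which clients are relaxed or restored is not modelled. The function name and the delay values are hypothetical.

```python
def adaptation_action(queuing_delays, th1, th2):
    """Return 'relax', 'restore', or 'hold' for the observed queuing delays."""
    if th2 >= th1:
        raise ValueError("th2 must be below th1 to leave a hysteresis band")
    worst = max(queuing_delays)
    if worst > th1:
        return "relax"     # serve some clients at a less stringent coherence
    if worst < th2:
        return "restore"   # resume the desired coherence for some clients
    return "hold"


# Hypothetical queuing delays (ms) at the source and two repositories over time.
observations = [[120, 10, 15], [50, 10, 15], [5, 10, 15]]
actions = [adaptation_action(delays, th1=100, th2=20) for delays in observations]
```

Keeping a gap between th1 and th2 avoids flipping between relaxed and desired coherence when the delay hovers near a single threshold.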

Performance of the adaptive algorithm Augmented adaptation performs the best!

Conclusions and Current Work. Conclusions: we prove that the client assignment problem is NP-hard; develop two new heuristics for the client assignment problem; develop an adaptive algorithm for client assignment. Current work: investigation of the algorithms in real network settings (PlanetLab).

Thank You!