CS728 Lecture 17 Web Indexes III
Last Time
- Showed how to build indexes for graph connectivity
- Based on 2-hop covers
Today
- Look at the more general problem of compact encodings for graphs and network problems
Applications
- Fast queries for path information
- Routing and routing table construction
- Topology control
- Spanning trees
- Dominating sets and clustering
- Hierarchical clustering
Main Problem Considered
- Arbitrary topology
- Goal: small routing tables to find a path to the destination
- Related problem: finding the closest item of a certain type
- Routing: how do I get there from here?
- [Figure: source and destination nodes]
Definitions
- Spanner: a subgraph in which the distance between any two nodes is close to their distance in the original graph
- We will see that radio networks need energy-spanners, i.e., subgraphs that contain energy-efficient paths
Spanning Trees
- Minimum connected subgraph
- Useful for routing
- Single point of failure
- Non-minimal routes
- Many variants
K-Dominating Sets
- A set of nodes such that every node is within K hops of some node in the set
- Used to define a partition of the network into zones
- [Figure: a 1-dominating set]
Graph Clustering
- Flat clustering: the K-center problem
  - Find k nodes (centers) that minimize the maximum distance from any node to its nearest center
- Hierarchical clustering
  - Tree clustering with internal and border nodes and edges
Hierarchical Clustering
- The hierarchy imposes a natural addressing scheme
- Each node is labeled with its path in the hierarchy tree
- Problem: give a compact labeling for a tree
  - Clearly need log n bits to identify some nodes
  - Need to add information about the tree structure
  - Complete binary tree
  - Other n-node trees
Interval labeling scheme
- Label the leaves of the tree uniquely: log n bits
- Label each internal node with the range of its descendants: 2 log n bits
- Given two nodes x, y and their labels:
  - Can you test if x is an ancestor of y?
  - Can you describe the path from x to y?
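
The ancestor question can be answered by interval containment. Below is a minimal Python sketch, assuming the tree is given as a hypothetical `children` dictionary and that leaves are numbered left to right; the function names are illustrative only. It treats a node as its own ancestor, and it assumes every internal node has at least two children (otherwise a parent and its only child would get the same interval).

```python
def interval_labels(children, root):
    """Label each leaf with a unique number and each internal node
    with the (lo, hi) range of leaf numbers in its subtree."""
    labels = {}
    next_leaf = 0

    def visit(node):
        nonlocal next_leaf
        kids = children.get(node, [])
        if not kids:
            # Leaf: a single number, stored as a degenerate range.
            labels[node] = (next_leaf, next_leaf)
            next_leaf += 1
        else:
            lo = next_leaf
            for child in kids:
                visit(child)
            labels[node] = (lo, next_leaf - 1)

    visit(root)
    return labels


def is_ancestor(label_x, label_y):
    """x is an ancestor of y iff y's range lies inside x's range."""
    return label_x[0] <= label_y[0] and label_y[1] <= label_x[1]


# Example tree (an assumption for illustration): a -> b, c and b -> d, e
tree = {"a": ["b", "c"], "b": ["d", "e"]}
labels = interval_labels(tree, "a")
print(is_ancestor(labels["a"], labels["d"]))  # True
print(is_ancestor(labels["b"], labels["c"]))  # False
```

The test uses only the two labels, so no access to the tree is needed at query time. Describing the path from x to y needs more than containment, which is one motivation for path-based labels.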
Greedy Dewey Labeling scheme
- Label each edge with a small string that is unique among the edges leaving the same parent
- Each node's label is the concatenation of the edge labels on the path from the root
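
As a rough illustration, here is a plain (non-greedy) Dewey labeling in Python. It is a sketch under assumptions, not the greedy scheme itself: edges are labeled with the child's index in binary, labels are kept as tuples instead of delimited bit strings, and the `children` dictionary and function names are hypothetical.

```python
def dewey_labels(children, root):
    """Give each node the tuple of edge labels on its root path."""
    labels = {root: ()}
    stack = [root]
    while stack:
        node = stack.pop()
        for i, child in enumerate(children.get(node, [])):
            # Edge label: the child's index among its siblings, in binary.
            labels[child] = labels[node] + (format(i, "b"),)
            stack.append(child)
    return labels


def is_ancestor(label_x, label_y):
    """x is an ancestor of y iff x's label is a prefix of y's label."""
    return label_y[:len(label_x)] == label_x


def path_between(label_x, label_y):
    """Return (hops up from x to the lowest common ancestor,
    edge labels to follow down to y)."""
    k = 0
    while k < len(label_x) and k < len(label_y) and label_x[k] == label_y[k]:
        k += 1
    return len(label_x) - k, label_y[k:]


tree = {"a": ["b", "c"], "b": ["d", "e"]}
labels = dewey_labels(tree, "a")
print(labels["e"])                             # ('0', '1')
print(path_between(labels["d"], labels["c"]))  # (2, ('1',))
```

Unlike interval labels, these labels describe the whole path between two nodes, at the price of a length that grows with depth.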
Theorem: Upper bound on GDL label length with unary delimiters is bits, - is the depth of v in T - n is number of nodes in T Alternative use binary (fixed length) for delimiting each edge –Seems to do worse in practice Can remove dependence on depth by converting encodings of long interior paths using count labels
Spanners and Stretch
- Stretch of a subgraph H is the maximum ratio of the distance between two nodes in H to that between them in G
  - Extensively studied in the graph algorithms and graph theory literature [Eppstein 96]
- Distance stretch and topological stretch
- A spanner is a subgraph that has constant stretch
  - The Delaunay triangulation yields a planar Euclidean distance-spanner
  - The Yao graph [Yao 82] is also a simple distance-spanner
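
To make the definition concrete, the following Python sketch computes the hop (topological) stretch of a subgraph by brute force. It assumes unweighted, undirected graphs given as adjacency dictionaries over the same node set; `bfs_dist` and `hop_stretch` are illustrative names.

```python
from collections import deque


def bfs_dist(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hop_stretch(adj_g, adj_h):
    """Maximum over node pairs of d_H(s, t) / d_G(s, t)."""
    worst = 1.0
    for s in adj_g:
        dist_g = bfs_dist(adj_g, s)
        dist_h = bfs_dist(adj_h, s)
        for t, d in dist_g.items():
            if t == s:
                continue
            if t not in dist_h:
                return float("inf")  # H disconnects a pair connected in G
            worst = max(worst, dist_h[t] / d)
    return worst


# G is a 4-cycle; H drops the edge (3, 0), leaving a path.
g = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
h = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(hop_stretch(g, h))  # 3.0
```

Nodes 0 and 3 are adjacent in G but three hops apart in H, so the stretch is 3. Distance stretch would use weighted shortest paths in place of BFS.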
Energy Stretch and Energy Spanners
- Commonly adopted power attenuation model: received power falls off as 1/d^α with distance d
  - α is between 2 and 4
- Assuming a uniform threshold for reception power and interference/noise levels, the energy consumed for transmitting from u to v needs to be proportional to d(u,v)^α
- Power control: radios have the capability to adjust their power levels so as to reach the destination with the desired fidelity
- Energy consumed along a path is simply the sum of the transmission energies along the path links
- Define energy-stretch analogously to distance-stretch
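
A small numeric sketch of path energy, assuming the energy of one hop is exactly the hop length raised to an attenuation exponent `alpha` (constant factors and per-hop receiver costs are ignored). The function name and the sample coordinates are illustrative.

```python
import math


def path_energy(points, alpha=2.0):
    """Sum of (hop length ** alpha) over consecutive points on the path."""
    return sum(math.dist(p, q) ** alpha for p, q in zip(points, points[1:]))


direct = [(0, 0), (4, 0)]                            # one hop of length 4
relayed = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]   # four hops of length 1

print(path_energy(direct))   # 16.0
print(path_energy(relayed))  # 4.0
```

Under this model, relaying over four unit hops costs a quarter of the direct transmission when `alpha` is 2, and the gap widens as `alpha` grows.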
Energy-Aware Routing
- A path with many short hops consumes less energy than a path with a few large hops
  - Which edges to use? (Considered in topology control)
  - Can maintain "energy cost" information to find minimum-energy paths [Rodoplu-Meng 98]
- Routing to maximize network lifetime [Chang-Tassiulas 99]
  - Formulate the selection of paths and power levels as an optimization problem
  - Suggests the use of multiple routes between a given source-destination pair to balance energy consumption
- Energy consumption also depends on transmission rate
  - Schedule transmissions lazily [Prabhakar et al 2001]
  - Can split traffic among multiple routes at reduced rate [Shah-Rabaey 02]
Topology Control
- Given:
  - A collection of nodes in the plane
  - Transmission range of the nodes (assumed equal)
- Goal: to determine a subgraph of the transmission graph G that is
  - Connected
  - Low-degree
  - Small stretch, hop-stretch, and power-stretch
The Yao Graph
- Divide the space around each node into sectors (cones) of angle θ
- Each node has an edge to the nearest node in each sector
- Number of edges is O(n/θ), i.e., O(n) for constant θ
- For any edge (u,v) in the transmission graph
  - There exists an edge (u,w) in the same sector such that w is closer to v than u is
- Theorem: The Yao Graph has stretch 1/(1 - 2 sin(θ/2)) for θ < π/3
- [Figure: nodes u, w, v]
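
The construction can be sketched directly in Python. This is a brute-force O(n^2) illustration under assumptions: points are (x, y) tuples, every node is within transmission range of every other, the plane is cut into `k` equal cones starting at angle 0, and edges are returned as directed index pairs. The name `yao_graph` is illustrative.

```python
import math


def yao_graph(points, k=8):
    """For each node, keep an edge to its nearest neighbor in each of k cones."""
    theta = 2 * math.pi / k
    edges = set()
    for i, (xi, yi) in enumerate(points):
        nearest = {}  # cone index -> (distance, neighbor index)
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            angle = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            cone = min(int(angle // theta), k - 1)  # guard against rounding
            dist = math.hypot(xj - xi, yj - yi)
            if cone not in nearest or dist < nearest[cone][0]:
                nearest[cone] = (dist, j)
        for _, j in nearest.values():
            edges.add((i, j))
    return edges


pts = [(0, 0), (1, 0), (2, 0), (0, 1)]
edges = yao_graph(pts, k=4)
print((0, 1) in edges)  # True: node 1 is node 0's nearest neighbor in its cone
print((0, 2) in edges)  # False: node 1 is closer in the same cone
```

Each node contributes at most `k` outgoing edges, which is where the linear edge count for a fixed cone angle comes from.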
Dominating Set Applications
- Facility location
  - A set of K-dominating centers can be selected to locate servers or copies of a distributed directory
  - Dominating sets can serve as a location database for storing routing information in ad hoc networks [Liang Haas 00]
- Finding a minimum dominating set is NP-hard for general graphs
- Reduces to the minimum set cover problem
- Recall last time: greedy gives a log n approximation
- Admits a PTAS for planar graphs [Baker 94]
An Example Greedy Algorithm
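
A minimal Python sketch of the set-cover-style greedy for a 1-dominating set: repeatedly pick the node whose closed neighborhood covers the most still-uncovered nodes. The adjacency-dictionary input, the function name, and the tie-breaking rule (first node in dictionary order) are assumptions made for illustration.

```python
def greedy_dominating_set(adj):
    """Greedy 1-dominating set on an undirected graph given as {node: [neighbors]}."""
    uncovered = set(adj)
    chosen = []
    while uncovered:
        # Closed neighborhood = the node itself plus its neighbors.
        best = max(adj, key=lambda v: len(({v} | set(adj[v])) & uncovered))
        chosen.append(best)
        uncovered -= {best} | set(adj[best])
    return chosen


graph = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5], 5: [4]}
print(greedy_dominating_set(graph))  # [0, 4]
```

Node 0 covers four nodes in the first round; node 4 then covers the remaining nodes 4 and 5. Each round covers at least one new node, so the loop always terminates.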
Hierarchical Network Decomposition
- Sparse neighborhood covers [Awerbuch-Peleg 89, Linial-Saks 92]
  - Applications in location management, replicated data management, routing
  - Provable guarantees, though difficult to adapt to a dynamic environment
- Routing scheme using hierarchical partitioning [Dolev et al 95]
  - Adaptive to topology changes
  - Weak guarantees in terms of stretch and memory per node
Sparse Neighborhood Covers
- An r-neighborhood cover is a set of overlapping clusters such that the r-zone of any node is in one of the clusters
- Aim: have covers that are low diameter and have small overlap
  - Overlap is measured by the max number of clusters a node is in
- Tradeoff between diameter and overlap
  - Set of all r-zones: diameter 2r but overlap n
  - The entire network as a single cluster: overlap 1 but diameter could be n
- Sparse r-neighborhood covers with O(r log n) diameter clusters and O(log n) overlap [Peleg 89, Awerbuch-Peleg 90]
Sparse Neighborhood Covers
- Set of sparse neighborhood covers
  - { 2^i-neighborhood cover: 0 ≤ i ≤ log n }
- For each node:
  - For any r, the r-zone is contained within a cluster of diameter O(r log n)
  - The node is in O(log^2 n) clusters
- Applications:
  - Tracking mobile users
  - Distributed directories for replicated objects
Online Tracking of Mobile Users
- Given a fixed network with mobile users
- Need to support location query operations
- Home location register (HLR) approach:
  - Whenever a user moves, the corresponding HLR is updated
  - Inefficient if the user is near the seeker, yet the HLR is far
- Performance issues:
  - Cost of query: ratio with "distance" between source and destination
  - Cost of updating the data structure when a user moves
Mobile User Tracking: Initial Setup
- The sparse 2^i-neighborhood cover forms a regional directory at level i
- At level i, each node u selects a home cluster that contains the 2^i-zone of u
- Each cluster has a leader node
- Initially, each user registers its location with the home cluster leader at each of the levels
The Location Update Operation When a user X moves, X leaves a forwarding pointer at the previous host. User X updates its location at only a subset of home cluster leaders –For every sequence of moves that add up to a distance of at least, X updates its location with the leader at level Amortized cost of an update is for a sequence of moves totaling distance
The Location Query Operation To locate user X, go through the levels starting from 0 until the user is located At level, query each of the clusters u belongs to in the -neighborhood cover Follow the forwarding pointers, if necessary Cost of query:, if is the distance between the querying node and the current location of the user
Comments on the Tracking Scheme Distributed construction of sparse covers in time [Awerbuch et al 93] The storage load for leader nodes may be excessive; use hashing to distribute the leadership role (per user) over the cluster nodes Distributed directories for accessing replicated objects [Awerbuch-Bartal-Fiat 96] –Allows reads and writes on replicated objects –An -competitive algorithm assuming each node has times more memory than the optimal Unclear how to maintain sparse neighborhood covers in a dynamic network
Bubbles Routing and Partitioning Scheme
- Adaptive scheme by [Dolev et al 95]
- Hierarchical partitioning of a spanning tree structure
- Provable bounds on efficiency for updates
- [Figure: 2-level partitioning of a spanning tree, with the root marked]
Bubbles (cont.)
- Size of clusters at each level is bounded
- Cluster size grows exponentially
- Number of levels equals number of routing hops
- Tradeoff between number of routing hops and update costs
- Each cluster has a leader who has routing information
- General idea:
  - Route up the tree until in the same cluster as the destination
  - Then route down
  - Maintain by rebuilding/fixing things locally inside subtrees
Bubbles Algorithm
- A partition is an [x,y]-partition if all its clusters are of size between x and y
- A partition P is a refinement of another partition P' if each cluster in P is contained in some cluster of P'
- An (x_1, x_2, ..., x_k)-hierarchical partitioning is a sequence of partitions P_1, P_2, ..., P_k such that
  - P_i is an [x_i, d x_i]-partition (d is the degree)
  - P_i is a refinement of P_(i-1)
- Choose x_(k+1) = 1 and x_i = x_(i+1) n^(1/k)
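
A short Python sketch of the resulting cluster size thresholds, reading the recurrence as x_(k+1) = 1 and x_i = x_(i+1) * n^(1/k), which unrolls to x_i = n^((k+1-i)/k). The function name and the sample values of n and k are illustrative assumptions.

```python
def cluster_size_bounds(n, k):
    """Return [x_1, ..., x_(k+1)] with x_(k+1) = 1 and x_i = x_(i+1) * n**(1/k)."""
    sizes = [1.0]
    for _ in range(k):
        sizes.append(sizes[-1] * n ** (1.0 / k))
    sizes.reverse()
    return sizes


# Values are floats, so round them before comparing or printing.
print([round(x) for x in cluster_size_bounds(4096, 3)])  # [4096, 256, 16, 1]
```

Under this reading the top level has x_1 = n, matching a single cluster that spans the entire tree, and each level below shrinks the lower size bound by a factor of n^(1/k).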
Clustering Construction
- Build a spanning tree, say, using BFS
- Let P_1 be the cluster consisting of the entire tree
- Partition P_1 into clusters, resulting in P_2
- Recursively partition each cluster
- Maintenance rules:
  - When a new node is added, try to include it in an existing cluster, else split the cluster
  - When a node is removed, combine clusters if necessary
Performance Bounds
- Memory requirement
- Adaptability
- k hops during routing
- Matching lower bound for bounded degree graphs
- Note: Bubbles does not provide a non-trivial upper bound on stretch in the non-hop model