Application Layer Anycasting: A Server Selection Architecture and Use in a Replicated Web Service
Presented by Jayanthkumar Kannan on 11/26/03
Outline
- Motivation, Problem Definition
- Architecture
  - Components
  - Mechanisms
  - API
- Experimental Results
- Conclusions
Motivation
- Anycast is a useful network primitive for accessing replicated content
- Service model: reach the “best” node of a dynamic set of nodes
- Better suited to application-layer deployment
  - Metrics might be application-specific, and routing might not be reactive enough
  - Routers need to be modified for anycast
  - IPv4 address space needs to be reserved
  - Stateless nature of IP: want to select a server at flow level, not per packet
Goal
- An architecture for application-level anycast
- Client API
- Allow general metrics
Architecture
- Main components
  - Resolvers: responsible for resolving an anycast domain name to an IP address
  - Modified client: domain name format modified
  - Modified servers: aid in monitoring and measurement
Anycast Resolvers
- Application-aware, DNS-like resolvers
- Authoritative server for each domain
- Translate anycast domain name into IP address
- State required
  - List of IP addresses in each anycast group
  - Metrics associated with each IP address
- Authoritative resolver maintains definitive state for its groups; others cache it
Resolver operation
- Convert anycast domain name to IP address based on a client-specified function of metrics
- Domain name format % points to authoritative resolver
- How do resolvers keep metrics up to date?
- Three kinds of metrics
  - Dependent on server characteristics (e.g., load)
  - Dependent on client-to-server path (e.g., total latency)
  - Dependent on both server and path
Metrics Maintenance
- Resolvers probe server periodically
  - Works well if client is close to resolver
  - Can measure all metrics
- Server push
  - Servers publish load data to a multicast group subscribed to by resolvers
  - Can measure server characteristics
- Resolvers probe for well-known file at server
- User experience
  - No additional traffic required
Client API
- Client uses gethostbyname(domain name)
- Domain name = FILTER %
- Filter: given metrics for each server, which one is preferred?
  - Content-independent: independent of metrics
  - Metric-based: relative/absolute value of metrics
  - Policy-based: any general function
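The three filter classes can be pictured as functions from per-server metrics to a chosen server. The following is only an illustrative sketch: the `Metrics` mapping, the function names, and the `max_ms` bound are assumptions made here, not part of the presented API.

```python
import random
from typing import Callable, Dict

# Hypothetical representation: server IP -> measured metric (e.g. response time in ms).
Metrics = Dict[str, float]
Filter = Callable[[Metrics], str]


def content_independent(metrics: Metrics) -> str:
    """Ignore the metric values and pick any group member at random."""
    return random.choice(sorted(metrics))


def metric_based_min(metrics: Metrics) -> str:
    """Relative metric-based filter: pick the server with the smallest metric."""
    return min(metrics, key=metrics.get)


def make_policy_filter(max_ms: float) -> Filter:
    """Policy-based filter: any general function of the metrics.

    This example policy picks at random among servers under an absolute
    bound and falls back to the minimum if none qualifies.
    """
    def policy(metrics: Metrics) -> str:
        acceptable = [ip for ip, value in metrics.items() if value <= max_ms]
        if acceptable:
            return random.choice(acceptable)
        return metric_based_min(metrics)
    return policy


metrics = {"10.0.0.1": 120.0, "10.0.0.2": 45.0, "10.0.0.3": 80.0}
print(metric_based_min(metrics))
print(make_policy_filter(100.0)(metrics))
```

Treating a filter as a plain function keeps the resolver independent of which selection rule the client asked for.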
Client API (2)
- Client specifies filter and anycast domain name to resolver
- Does the resolver have up-to-date information about the group?
  - Yes: runs the filter and returns an IP address
  - Else: asks the authoritative server, caches the response, and returns the answer to the client
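The resolution flow above can be sketched as follows. The `Resolver` class and its fields are hypothetical, and "up to date" is simplified to "present in the local cache"; a real resolver would also need expiry and metric refresh.

```python
from typing import Callable, Dict, Optional

# Hypothetical representation: server IP -> metric value.
Metrics = Dict[str, float]


class Resolver:
    """Toy anycast resolver: holds group state and falls back to the authoritative resolver."""

    def __init__(self, authoritative: Optional["Resolver"] = None) -> None:
        self.authoritative = authoritative
        self.groups: Dict[str, Metrics] = {}  # group name -> per-server metrics

    def resolve(self, group: str, select: Callable[[Metrics], str]) -> str:
        metrics = self.groups.get(group)
        if metrics is None:
            if self.authoritative is None:
                raise KeyError(f"unknown anycast group: {group}")
            # Cache miss: copy the authoritative state, then answer from the cache.
            metrics = dict(self.authoritative.groups[group])
            self.groups[group] = metrics
        # Run the client-specified filter over the group's metrics.
        return select(metrics)


authoritative = Resolver()
authoritative.groups["web"] = {"10.0.0.1": 120.0, "10.0.0.2": 45.0}
local = Resolver(authoritative)
print(local.resolve("web", lambda m: min(m, key=m.get)))
```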
Client API (3)
Observations
- O(S * G) state at each resolver (G = # of groups, S = # of servers in each group)
- Probe traffic could overwhelm server
  - Scaling depends on number of resolvers/number of clients
- Resolvers replicate authoritative server’s state
- Generalize to multi-hop paths
  - Essentially RON
- Can we use an overlay routing protocol?
  - Servers send periodic updates in, say, DV (distance vector)
  - DV updated along overlay
  - Unlikely to be reactive enough
  - Could potentially scale much better
Case Study
- Web service replication
- Metric: Response Time = Path Latency + Server Processing Delay
- Metric maintenance
  - Server push: server publishes delay per request
  - Agent probe: agents request a well-known file
  - Hybrid technique: agent measurements calibrated with server push data (adj factor = R/S)
  - Note: server push is more frequent than agent probing
- Found that the hybrid technique tracks varying response time well
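One plausible reading of the hybrid technique is sketched below. It assumes R is the response time seen by an agent probe and S is the most recent server-pushed delay, so that the adjustment factor R/S rescales the more frequent push values between probes. The class and method names are hypothetical, and this interpretation is an assumption rather than a confirmed detail of the original system.

```python
from typing import Optional


class HybridEstimator:
    """Estimate response time from frequent server pushes, calibrated by rare agent probes."""

    def __init__(self) -> None:
        self.adj_factor = 1.0                    # R / S from the latest probe
        self.last_push: Optional[float] = None   # latest server-pushed delay S

    def on_server_push(self, server_delay: float) -> None:
        """Frequent update: the server publishes its measured delay per request."""
        self.last_push = server_delay

    def on_agent_probe(self, probe_response_time: float) -> None:
        """Infrequent update: recalibrate using the probe's measured response time R."""
        if self.last_push:  # skip if no push seen yet or if S is zero
            self.adj_factor = probe_response_time / self.last_push

    def estimate(self) -> Optional[float]:
        """Current response-time estimate, or None before the first push."""
        if self.last_push is None:
            return None
        return self.adj_factor * self.last_push


est = HybridEstimator()
est.on_server_push(20.0)   # S = 20 ms
est.on_agent_probe(50.0)   # R = 50 ms, so adj factor = 2.5
est.on_server_push(30.0)   # later push, no new probe
print(est.estimate())      # 2.5 * 30.0
```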
Refinements
- Hack to prevent oscillations
- Define ES = equivalent set
  - ES = set of servers with nearly the same response times
- Form ES = ES + minimum-response-time server
  - Remove servers whose response time exceeds min by the leave threshold
  - Add servers whose response time is within [min, min + join threshold]
  - Hysteresis used in selection criteria
- ES found using selection criteria
- Resolver returns one element of ES at random
- Optimal parameters dependent on several factors
- Oscillation definitely still a possibility
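The equivalent-set update can be sketched as below. The function names and the threshold values are illustrative assumptions; the sketch assumes a join threshold smaller than the leave threshold, which is what gives the hysteresis.

```python
import random
from typing import Dict, Set


def update_equivalent_set(
    es: Set[str], response_time: Dict[str, float], join: float, leave: float
) -> Set[str]:
    """Apply the leave and join rules relative to the current minimum response time."""
    minimum = min(response_time.values())
    # Leave rule: existing members stay unless they exceed min by more than `leave`.
    kept = {s for s in es if s in response_time and response_time[s] <= minimum + leave}
    # Join rule: any server within [min, min + join] enters (this includes the minimum server).
    joined = {s for s, t in response_time.items() if t <= minimum + join}
    return kept | joined


def pick_server(es: Set[str]) -> str:
    """The resolver returns one member of the equivalent set at random."""
    return random.choice(sorted(es))


rt = {"a": 100.0, "b": 108.0, "c": 125.0, "d": 140.0}
es = update_equivalent_set({"c", "d"}, rt, join=10.0, leave=30.0)
print(sorted(es))          # "c" stays (within leave), "d" is removed, "a" and "b" join
print(pick_server(es))
```

A server that is already in the set tolerates a larger gap from the minimum than a server trying to enter, so small fluctuations in measured response time do not flip membership on every update.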
Experiment Configuration
Experimental Setup
- 4 anycast servers (1 at UCLA, 2 at GATECH, 1 at Washington U.)
- 2 resolvers (UMCP and GATECH)
- 20 clients (4 at UMCP, 16 at GATECH)
- Experiments
  - Random vs anycast choice
  - Mixture of random choice and anycast choice
  - Effect of join threshold
Random vs Anycast
Partial Deployment
Parameter Choice
Conclusions
- Architecture not very scalable
- Contributions
  - Probing mechanisms to determine response time accurately with minimum overhead
- Possibly useful in P2P work