Self-Managing Anycast Routing for DNS
NLnet Labs & SIDN Labs
Context
Providing Highly Available & Reliable DNS Service
- DNS service for important zones (globally)
  - reliable (trustworthy, secure, …)
  - highly available
  - reduced (average) latency
- Examples
  - ccTLDs, gTLDs, …
- Common solution
  - distribute DNS name servers
  - anycast addressing and routing (BGP and IGP)
Uni-, Multi-, and Anycast
Local/Global Anycast Nodes
- Local with IGP
  - RIPv2, OSPF, IS-IS, EIGRP
  - redundancy, load distribution, low latency within a network
- Global with BGP
  - just BGP-4
  - redundancy, load distribution, low latency over global Internet
Research Question
- Very generic thesis
  - Find optimal placement of nodes and distribution mechanism for flexible, adaptive deployment of (authoritative) DNS services
- Find optimal placement of nodes
  - availability (also in relation to DDoS)
  - reliability (including security, integrity, trust)
  - low (average) latency
- Alternative distribution mechanisms
  - p2p or some hybrid, e.g., zone files hosted at an ISP
  - anycast enhanced with self-management to support flexibility and adaptability
Project Plan
Plan & Approach
- Solution should integrate/interoperate with current operational practices
- Self-Managing Anycast Routing for DNS (SMARD)
  - BGP anycast: availability, reduce latency
  - self-*
    - configuration: flexibility, adaptability, …
    - optimization: load distribution, reduce latency, …
    - healing: recover from failures
    - protection: security, integrity, trust, …
Plan & Approach cont’d
- Anycast & self-* to achieve mentioned goals, but …
- Support for self-* loop
  - monitor, analyse, plan, execute
- “Playground” to deploy anycast nodes at various/diverse topological locations
  - IaaS, …?
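The monitor, analyse, plan, execute loop named above could look roughly like the following for a single anycast node. This is a minimal sketch, not the SMARD design: the `Knowledge` fields, the three-probe threshold, and the `announce`/`withdraw` action names are assumptions made for illustration, and the execute step only records a decision instead of talking to a routing daemon.

```python
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    """Shared loop state. All fields and thresholds are illustrative assumptions."""
    max_failed_probes: int = 3   # hypothetical threshold
    failed_probes: int = 0       # consecutive failed health probes
    announced: bool = True       # is the anycast prefix currently announced?
    log: list = field(default_factory=list)


def monitor(probe_ok: bool, k: Knowledge) -> None:
    # Monitor: track consecutive failed probes of the local name server.
    k.failed_probes = 0 if probe_ok else k.failed_probes + 1


def analyse(k: Knowledge) -> str:
    # Analyse: classify the node from the collected observations.
    return "unhealthy" if k.failed_probes >= k.max_failed_probes else "healthy"


def plan(state: str, k: Knowledge) -> str:
    # Plan: decide whether the anycast announcement should change.
    if state == "unhealthy" and k.announced:
        return "withdraw"
    if state == "healthy" and not k.announced:
        return "announce"
    return "noop"


def execute(action: str, k: Knowledge) -> None:
    # Execute: a real node would instruct its routing daemon here;
    # this sketch only updates the knowledge base and records the action.
    if action == "withdraw":
        k.announced = False
    elif action == "announce":
        k.announced = True
    k.log.append(action)


def mape_step(probe_ok: bool, k: Knowledge) -> str:
    """Run one pass of the loop and return the chosen action."""
    monitor(probe_ok, k)
    action = plan(analyse(k), k)
    execute(action, k)
    return action


k = Knowledge()
actions = [mape_step(ok, k) for ok in [True, False, False, False, False, True]]
print(actions)
```

With the probe sequence shown, the node keeps announcing until the third consecutive failure, withdraws, and re-announces on the first successful probe. A real deployment would likely add hysteresis before re-announcing to avoid route flapping.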
Architectural Overview
Self-* Autonomic Computing
Autonomic Computing
- “The Vision of Autonomic Computing,” Jeff Kephart and D. Chess, IEEE Computer, January 2003.
  - “...main obstacle to further progress in IT is a looming software complexity crisis.”
  - computer systems are becoming too massive and complex to be managed even by the most skilled IT professionals
  - workload and environment conditions tend to change very rapidly over time
Autonomic Computing cont’d
- Systems that can manage themselves given high-level objectives
  - objectives can be expressed in terms of service-level objectives or utility functions
- Analogy: human autonomic nervous system
  - “responsible for monitoring conditions in the internal environment and bringing about appropriate changes in them”
  - the autonomic nervous system functions in an involuntary, reflexive manner
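To make the difference between a service-level objective and a utility function concrete, here is a small sketch. The weights, the 50 ms latency target, the 99.9% availability floor, and the two candidate deployments are all made-up assumptions, not figures from this project.

```python
def meets_slo(availability: float, avg_latency_ms: float,
              min_availability: float = 0.999,
              max_latency_ms: float = 50.0) -> bool:
    """Service-level objective: a pass/fail check against fixed bounds."""
    return availability >= min_availability and avg_latency_ms <= max_latency_ms


def utility(availability: float, avg_latency_ms: float,
            latency_target_ms: float = 50.0,
            weight_availability: float = 0.7) -> float:
    """Utility function: a graded score in [0, 1], higher is better."""
    # Latency at or below the target scores 1.0; slower responses score less.
    latency_score = min(1.0, latency_target_ms / max(avg_latency_ms, 1e-9))
    return (weight_availability * availability
            + (1 - weight_availability) * latency_score)


# Hypothetical candidate deployments to compare.
candidates = {
    "three_nodes": {"availability": 0.999, "avg_latency_ms": 80.0},
    "five_nodes": {"availability": 0.9995, "avg_latency_ms": 40.0},
}
best = max(candidates, key=lambda name: utility(**candidates[name]))
print(best)
```

The SLO only says whether a configuration is acceptable, while the utility function ranks configurations, which is what a planning step needs when it has to choose among several acceptable options.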
Centralized vs. Distributed Coordination
[Figure: four monitor-analyse-plan-execute loops, each with its own knowledge component]
Example: Hierarchical Coordination (2 Layer)
Example cont’d
- Anycast nodes
  - M-A-P-E their own operation
  - monitor own behavior
  - local actions, global notification
- SMARD global
  - M-A-P-E global operation
  - receive abstract/strategic monitor information
  - plan global actions for anycast nodes
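The planning step of the global layer could be sketched as follows. The `NodeSummary` record, the load thresholds, and the node names are hypothetical; the sketch only shows how abstract per-node notifications might be turned into global actions, and it does not describe an actual SMARD policy.

```python
from dataclasses import dataclass


@dataclass
class NodeSummary:
    """Abstract notification sent by an anycast node to the global layer."""
    node: str
    announced: bool   # is the node currently serving anycast traffic?
    load: float       # fraction of capacity in use, 0.0 to 1.0


def plan_global(summaries, high_load=0.8, low_load=0.2):
    """Turn node summaries into per-node actions (thresholds are assumptions)."""
    active = [s for s in summaries if s.announced]
    standby = [s for s in summaries if not s.announced]
    actions = {}
    if any(s.load > high_load for s in active) and standby:
        # Some node is overloaded: bring one standby node into service.
        actions[standby[0].node] = "announce"
    elif len(active) > 1 and all(s.load < low_load for s in active):
        # Every active node is lightly loaded: retire the least-loaded one.
        idle = min(active, key=lambda s: s.load)
        actions[idle.node] = "withdraw"
    return actions


scale_up = plan_global([
    NodeSummary("ams", True, 0.9),
    NodeSummary("lon", True, 0.5),
    NodeSummary("nyc", False, 0.0),
])
scale_down = plan_global([
    NodeSummary("ams", True, 0.1),
    NodeSummary("lon", True, 0.05),
])
print(scale_up, scale_down)
```

The global layer never sees raw probes or query logs here, only summaries, which keeps the nodes autonomous and limits what has to be sent upward.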
DNS Anycast Considerations
Operation of Anycast Services, RFC 4786
- Load distribution (not load balancing)
  - node placement
  - “catchment”
  - global/local anycast nodes
  - …
- Monitor
  - availability changes according to location of client
  - signaling service availability
  - routing policies and topology changes
  - DNSMON and RIS/Route Views
- Consistent service (trustworthy, available, …)
  - data synchronization (consistent client response)
  - node autonomy & self-sufficiency (no cascading failure, but more complex management)
  - denial-of-service attack mitigation
  - service compromise
  - service hijacking
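Why anycast gives load distribution and not load balancing can be illustrated with a toy catchment computation. The sketch assumes each client simply reaches the node with the shortest path, which is a strong simplification of real BGP route selection (policies and other attributes are ignored); the client names, node names, and path lengths are invented.

```python
from collections import Counter


def catchments(path_len):
    """Map each client to the node with the shortest path.

    Ties are broken by node name. Raises ValueError if a client has no node left.
    """
    return {client: min(sorted(dist), key=dist.get)
            for client, dist in path_len.items()}


def without_node(path_len, node):
    """Return the path table as seen after `node` withdraws its route."""
    return {client: {n: d for n, d in dist.items() if n != node}
            for client, dist in path_len.items()}


# Hypothetical path lengths (for example AS hops) from clients to nodes.
path_len = {
    "client_a": {"node_eu": 2, "node_us": 5},
    "client_b": {"node_eu": 3, "node_us": 4},
    "client_c": {"node_eu": 6, "node_us": 1},
}

assignment = catchments(path_len)
load = Counter(assignment.values())
after = Counter(catchments(without_node(path_len, "node_eu")).values())
print(load, after)
```

Clients are assigned by topology alone, so the per-node load is whatever the catchments happen to produce, and withdrawing one node moves its entire catchment to the remaining nodes regardless of their capacity.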
Perspectives
Results & Impact
- Infrastructure for flexible, adaptive placement and management of DNS authoritative name servers
  - need a “playground” for placement and operational management
  - Infrastructure as a Service (IaaS)?
- Fully distributed vs. centralized coordination
  - bounded by the need to be operational or practically deployable
  - operational costs vs. service and security
- DDoS & spoofed traffic
  - DDoS mitigation
  - trace spoofed traffic to “real” source