Self-Managed Networks: Dream or Reality? Jawad Khaki Corporate Vice President Windows Networking & Device Technologies
Current Situation Management is expensive Devices only understand low-level settings Diagnostics/monitoring is primitive Need a comprehensive network solution ISP, hotspot Enterprise Home
IT Complexity & Cost IT Budgets
Pain Points Complexity due to inconsistency Heterogeneous world Different configuration models Variety of monitoring techniques Version/vendor specific repair procedures Hard to understand dependencies Networking problems are a significant cause of overall service failure (Oppenheimer, USITS’03) Network causes 15% of all problems resulting in downtime (Forrester survey of IT pros)
Not humanly solvable Operator error is largest cause of service failures in some environments (Oppenheimer, USITS’03) 40% of downtime is due to human operators (Candea, ’03) In many environments, operator may not be tech savvy (e.g., home) or even immediately available (e.g., space, sensor nets). Consumer networking support calls are time consuming, e.g., power cycle router/modem = avg 53 min (MS PSS)
End-to-End Approach Essential Apps/users understand behavior desired Network admins understand high-level design goals/constraints The dream is to integrate end-user knowledge and administrative goals
Big Dreams Self-managing networks Self-deploying and self-cleaning Self-configuring and self-adapting Self-optimizing Self-protecting Self-monitoring Self-diagnosing Self-healing Prevention more than cure A self-* system requires knowledge of itself and its environment: it is self-aware
Some Real Examples Today Policy distribution systems allow auto-deployment of configuration across a network Routing protocols auto-adapt to topology changes and failures TCP auto-adapts to congestion
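As an illustration of the last example, TCP's congestion adaptation is commonly described as additive increase, multiplicative decrease (AIMD). The sketch below is a toy model, not TCP itself: the function names, the fixed link capacity, and the rule that loss occurs whenever the window exceeds capacity are all assumptions made for illustration.

```python
from typing import List


def aimd_step(cwnd: float, loss: bool, increase: float = 1.0,
              decrease: float = 0.5, floor: float = 1.0) -> float:
    """One round of additive-increase/multiplicative-decrease.

    cwnd is a hypothetical congestion window measured in segments.
    """
    if loss:
        # Back off multiplicatively, but never below the floor.
        return max(floor, cwnd * decrease)
    # No loss seen: probe for more bandwidth additively.
    return cwnd + increase


def simulate(capacity: float, rounds: int, start: float = 1.0) -> List[float]:
    """Record the window at the start of each round.

    Assumption: loss is modeled as occurring whenever the window
    exceeds a fixed link capacity.
    """
    cwnd = start
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        cwnd = aimd_step(cwnd, loss=cwnd > capacity)
    return history


if __name__ == "__main__":
    print(simulate(capacity=4, rounds=8))
```

With no operator involved, the window climbs until the modeled link is overloaded, halves, and climbs again, which is the self-adapting behavior the slide points to.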
Demos
Product Engineering Challenge Design for experience End user: Focus on the task not technology Network manager: Design, deploy, operate Must get the fundamentals right Essential to think through scenarios Work flow Intelligence Environment Always keeping the customer in mind
Hard issues
Multiple administrative organizations Different relationships Peers Customer-provider Arbitrary Lack of trust motivates privacy constraints Unaligned goals mean configuration is a challenge
Possibility of catastrophic failure Defect in automation can have disastrous results “Rogue equipment can create a monster headache. It can easily waste a million dollars of resources.” -IT admin, large LA corporation Broadcast storms due to protocol or software bugs (Spurgeon, 1989) One router vendor tried offering automated config repair features, but found that customers were afraid to deploy it Possibility of exploitation by malware
Tension between control and automation Flexibility of business models and preferred treatments Compliance requirements Job security for operators Natural aversion to loss of control Change to unfamiliar technology
Need to find the right balance Policy to express high-level constraints Self-management within those constraints Control: static routes, static addresses, etc. Automation: dynamic routing, dynamic addresses, etc. BALANCE
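One way to picture "self-management within those constraints" is a policy object that fixes the bounds while automation chooses freely inside them. This is a minimal sketch under stated assumptions: the Policy fields, the bandwidth figures, and both helper functions are hypothetical and do not come from the talk or from any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical high-level constraints set by an administrator."""
    min_bandwidth_mbps: float
    max_bandwidth_mbps: float
    allow_dynamic_addresses: bool


def choose_bandwidth(policy: Policy, measured_demand_mbps: float) -> float:
    """Automation follows demand but is clamped to the policy's bounds."""
    return min(policy.max_bandwidth_mbps,
               max(policy.min_bandwidth_mbps, measured_demand_mbps))


def choose_addressing(policy: Policy) -> str:
    """Use dynamic addressing only where the policy permits it."""
    return "dynamic" if policy.allow_dynamic_addresses else "static"


if __name__ == "__main__":
    policy = Policy(min_bandwidth_mbps=10, max_bandwidth_mbps=100,
                    allow_dynamic_addresses=False)
    print(choose_bandwidth(policy, 250))  # demand above the ceiling is clamped
    print(choose_addressing(policy))
```

The administrator keeps control by editing the policy; the automated part never needs to be touched to tighten or relax the limits.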
Summary Innovation in fundamentals just as important as new scenarios Make secure, effortless, reliable, efficient operation the forethought Let humans succeed at what they’re good at Let’s solve the hard issues
Dealing with heterogeneity of device types and vendors Hard to visualize existing state and dependencies Expensive to maintain multiple configuration/monitoring systems Need for common solutions Simplicity Heterogeneity
Dealing with poorly written applications “Some applications need to know what machine a person is on...we found that giving the docking stations a static IP address and the laptop a static IP address makes it easier for us.” (IT Admin, Medium Org, New York)