1 NetProfiler: Profiling Networks From the Edge
  Venkat Padmanabhan, Microsoft Research
  June 2005
  With Sharad Agarwal (MSR), Jitu Padhye (MSR), Dilip Joseph (UCB), Sriram Ramabhadran (UCSD)
2 Motivation: End Users
  Users have little info or recourse when they experience network problems
  Why the failure?
    − website, ISP, client site?
    − is it just me?
  How am I faring over the long term?
    − switch ISPs?
3 Motivation: Network Operators
  [Figure: network diagram labeled AT&T, Sprint, UUNet, Microsoft, MS SVC, MS UK, MS India]
  Operators have little visibility into end-user network experience
  Enterprise networks:
    − adequately provisioned?
    − health of wireless LAN?
  Consumer ISPs:
    − how are users in Boston faring?
  Network health?
4 NetProfiler
  Goal: remedy the situation by leveraging passive observation of normal end-to-end network communication at the “edge” to “profile” the network
  Edge = client hosts distributed around the network
  Profile = monitor + deconstruct (+ diagnose)
  Turn the Internet into a sensor network
5 NetProfiler Overview
  Key idea: leverage peer cooperation
    − share network experience info across end hosts
    − draw inferences based on correlation
  Observations
    − automate what expert users do manually
    − unlike traditional P2P applications
  Complements previous work
    − network infrastructure monitoring
    − active probing
    − server-based monitoring
    − network tomography
6 Architecture
  Sensing: glean info from existing communication
    − TCP, web, streaming, etc.
    − quantify the user’s network experience
      − web download failure, e2e delay
  Aggregation: based on attributes (website, proxy, domain pair)
    − tradeoff between privacy and data integrity
  Inference: distributed blame attribution
    − assign credit/blame equally to all entities involved
    − use mass of info from diverse vantage points to make inference
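The blame-attribution step on this slide can be illustrated with a minimal sketch. It assumes each observation is a pair of (entities involved, success flag) and that every entity in a transaction receives one unit of credit or blame; the function name `attribute_blame`, the entity labels, and the use of a plain failure fraction as the score are all hypothetical choices for illustration, not details taken from the talk.

```python
from collections import defaultdict


def attribute_blame(transactions):
    """Return the failure fraction per entity.

    transactions: iterable of (entities, succeeded) pairs, where entities
    is a tuple of names (client, proxy, website, ...) involved in one
    transaction. Every entity gets an equal share: one count per
    transaction, plus one failure count if the transaction failed.
    """
    tally = defaultdict(lambda: [0, 0])  # entity -> [failures, total]
    for entities, succeeded in transactions:
        for entity in entities:
            tally[entity][1] += 1
            if not succeeded:
                tally[entity][0] += 1
    return {entity: failures / total for entity, (failures, total) in tally.items()}


# Hypothetical observations pooled from two clients.
observations = [
    (("clientA", "siteX"), False),
    (("clientA", "siteY"), True),
    (("clientB", "siteX"), False),
    (("clientB", "siteY"), True),
]
scores = attribute_blame(observations)
```

No single client could tell whether siteX or its own connectivity was at fault. Once the observations are pooled, siteX carries the highest failure fraction while both clients sit in the middle, which is the kind of inference that diverse vantage points make possible.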
7 Measurement Study
  Goal:
    − characterize end-to-end web access failures
    − make inferences based on shared observations
  Testbed:
    − 134 clients worldwide
      − academic, corporate, dialup, broadband
    − 80 websites worldwide
  Month-long experiment (Jan ‘05)
    − synthetic workload: each client downloads the top-level “index” file from each website ~4 times an hour
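A single access attempt in such a workload could be instrumented roughly as follows. This is a simplified sketch, not the study's measurement tool: it stops at TCP connection establishment rather than downloading the page, and the function name `probe`, its outcome labels, and the injectable `resolve` and `connect` parameters (added so the logic can be exercised without network access) are assumptions.

```python
import socket


def probe(host, port=80, timeout=10.0,
          resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Attempt one access and report the stage that failed.

    Returns "dns_failure", "tcp_failure", or "ok". Only name resolution
    and TCP connection establishment are covered; the HTTP transfer
    itself is left out of this sketch.
    """
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "dns_failure"
    address = infos[0][4][:2]  # (ip, port) of the first resolved address
    try:
        conn = connect(address, timeout=timeout)
    except OSError:  # refused, unreachable, or timed out
        return "tcp_failure"
    conn.close()
    return "ok"
```

A client would call `probe` for each website on a timer (about every 15 minutes to match the stated rate) and log the outcome with a timestamp, so that results can later be compared across clients.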
8 Basic Findings
  Findings based on local observations
    − Transaction failure rate: %
    − TCP conn failures: 57-64%, DNS failures: 34-42%
      − DNS: dominated by LDNS reachability problems (76-83%)
      − TCP: dominated by conn establishment failures (41-79%)
  Correlation analyses to shed more light on the nature of failures
    − Server-side or client-side
    − Proxy-related
9 Classification of Connection Failures
  Connection failures are dominated by server-side problems
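The slide reports only the outcome of the classification. One way such a correlation-based classification could work is sketched below, under stated assumptions: a failure counts as server-side when the server is failing for a large share of its clients in the same time window, and as client-side when the client is failing toward a large share of its servers. The function name `classify_failures`, the four labels, and the 0.5 threshold are hypothetical and not taken from the study.

```python
from collections import defaultdict


def classify_failures(attempts, threshold=0.5):
    """Label each failed (client, server) pair within one time window.

    attempts: list of (client, server, failed) tuples.
    threshold: failure fraction at or above which a client or server is
    treated as having a problem of its own (arbitrary value for this sketch).
    """
    client_stats = defaultdict(lambda: [0, 0])  # client -> [failures, total]
    server_stats = defaultdict(lambda: [0, 0])  # server -> [failures, total]
    for client, server, failed in attempts:
        for stats, key in ((client_stats, client), (server_stats, server)):
            stats[key][0] += int(failed)
            stats[key][1] += 1

    def is_bad(stats, key):
        failures, total = stats[key]
        return failures / total >= threshold

    labels = {}
    for client, server, failed in attempts:
        if not failed:
            continue
        client_bad = is_bad(client_stats, client)
        server_bad = is_bad(server_stats, server)
        if client_bad and server_bad:
            labels[(client, server)] = "both"
        elif server_bad:
            labels[(client, server)] = "server-side"
        elif client_bad:
            labels[(client, server)] = "client-side"
        else:
            labels[(client, server)] = "other"
    return labels


# Hypothetical window: server s1 fails for everyone, client c3 fails everywhere.
clients = ["c1", "c2", "c3"]
servers = ["s1", "s2", "s3"]
attempts = [(c, s, s == "s1" or c == "c3") for c in clients for s in servers]
labels = classify_failures(attempts)
```

In this toy window, the failures of c1 and c2 toward s1 are labeled server-side, the failures of c3 toward s2 and s3 are labeled client-side, and the c3 to s1 failure is labeled both.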
10 End-to-End Failures vs. BGP Instability
  Severe BGP instability is rare but has E2E impact when it happens
11 Proxy-related Problem
  Clients behind a proxy see a significantly higher failure rate
  Server:
12 Conclusion
  NetProfiler leverages edge perspective to monitor network health & infer cause of problems
  Targeted at both end users and operators
  More info: