Hierarchical Clustering and Network Topology Identification. Rui Castro, Mark Coates, Rob Nowak. Department of Electrical and Computer Engineering. Copyright © 2004 - Rui Castro
Topology Identification: pairwise delay measurements reveal topology. Prior work: Ratnasamy & McCanne (1999); Duffield et al. (2000, 2001, 2002); Bestavros et al. (2001); Coates et al. (2001); Shih & Hero (2002).
Topology Identification Challenges: 12% never respond, 15% have multiple interfaces (Barford et al., 2000); detecting layer-2 topology that is “invisible” to the IP layer (e.g., switches).
Relationship between Topology ID and Hierarchical Clustering
Sandwich Probing: Do not need clock synchronization!!
Sandwich Probing: we can infer that receivers 3 & 4 have a longer shared path than receivers 3 & 5. Topology imposes constraints: the more queues two receivers share, the larger the measured metric.
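One way to read "topology imposes constraints" is sketched below, assuming the metric for a receiver pair reflects the length of the path they share from the source. In a tree, any three receivers then have two pairs that branch at the same node, so the two smallest of the three pairwise values must be equal. The function name, the tolerance, and the numeric values are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def satisfies_tree_constraint(gamma, tol=1e-9):
    """Return True if, for every triple of receivers, the two smallest
    pairwise shared-path values are equal (within tol)."""
    n = len(gamma)
    for i, j, k in combinations(range(n), 3):
        a, b, _ = sorted((gamma[i][j], gamma[i][k], gamma[j][k]))
        if abs(a - b) > tol:
            return False
    return True

# Hypothetical values for receivers 3, 4, 5 (matrix indices 0, 1, 2):
# 3 & 4 share a longer path (2.0) than 3 & 5 or 4 & 5 (1.0 each).
gamma_ok = [[0.0, 2.0, 1.0],
            [2.0, 0.0, 1.0],
            [1.0, 1.0, 0.0]]

# Here 3 & 5 and 4 & 5 disagree, which no single tree can explain.
gamma_bad = [[0.0, 2.0, 1.0],
             [2.0, 0.0, 1.5],
             [1.0, 1.5, 0.0]]

tree_ok = satisfies_tree_constraint(gamma_ok)
tree_bad = satisfies_tree_constraint(gamma_bad)
```

Noisy measurements will rarely satisfy this condition exactly, which is one motivation for treating the metric values as unknowns to be estimated rather than read off directly.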
Delay Covariance: more shared queues, larger covariance.
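The covariance claim can be illustrated with a small simulation. This is a sketch under stated assumptions, not the measurement procedure used in the experiments: the three-receiver tree, the exponential queueing delays with unit mean, and the premise that probes to different receivers see the same delay on shared links are all hypothetical choices. With independent link delays, the covariance between two receivers' delays equals the summed variance of the links they share.

```python
import random

def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

rng = random.Random(0)

def queue_delay():
    # Hypothetical queueing delay: exponential with mean 1 (variance 1).
    return rng.expovariate(1.0)

d1, d2, d3 = [], [], []
for _ in range(20000):
    a = queue_delay()  # link shared by receivers 1, 2 and 3
    b = queue_delay()  # link shared by receivers 1 and 2 only
    d1.append(a + b + queue_delay())  # plus receiver 1's own last link
    d2.append(a + b + queue_delay())  # plus receiver 2's own last link
    d3.append(a + queue_delay())      # plus receiver 3's own last link

cov_12 = sample_covariance(d1, d2)  # two shared queues, theory: 2.0
cov_13 = sample_covariance(d1, d3)  # one shared queue, theory: 1.0
```

Receivers 1 and 2 share two queues and receivers 1 and 3 share one, so `cov_12` should come out larger than `cov_13`.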
Measurement Framework. Key assumptions: stationarity; fixed (but unknown) routes; temporal independence; spatial independence. Multiple measurements: individual measurements are averaged, so the central limit theorem (CLT) applies.
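The role of the CLT can be sketched numerically, assuming that individual measurements for a receiver pair are independent, identically distributed, and averaged into a single metric estimate. The exponential distribution, the sample sizes, and the function names below are hypothetical. The spread of the averaged estimate shrinks roughly as one over the square root of the number of measurements, and its distribution approaches a Gaussian.

```python
import random
import statistics

rng = random.Random(1)

def averaged_metric(n):
    """Average of n hypothetical i.i.d. measurements (exponential, mean 1)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

def spread_of_average(n, repeats=1000):
    """Standard deviation of the averaged metric over repeated experiments."""
    return statistics.stdev([averaged_metric(n) for _ in range(repeats)])

spread_1 = spread_of_average(1)      # theory: about 1.0
spread_100 = spread_of_average(100)  # theory: about 0.1
```

Under these assumptions, the averaged metric for each pair can be modeled as Gaussian around its unknown true value, which is what makes a Gaussian likelihood a reasonable model.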
Maximum Likelihood Tree (MLT)
The maximum likelihood tree (MLT) is defined as
$$\hat{T} = \arg\max_{T \in \mathcal{F}} \; \sup_{\gamma \in \mathcal{G}(T)} p(x \mid \gamma)$$
where $x$ denotes the measurements, $\gamma$ the unknown similarity metric values, $p(x \mid \gamma)$ the measurement likelihood (a product of Gaussian densities), $\mathcal{F}$ the forest of possible trees, and $\mathcal{G}(T)$ the monotonicity constraint set for tree $T$.
Two approaches: (1) binary tree construction based on a bottom-up, recursive selection and pair-merging process; (2) Markov Chain Monte Carlo (MCMC) tree search.
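The bottom-up approach can be sketched as a generic agglomerative procedure: repeatedly select the pair of nodes with the largest estimated similarity, merge them under a new parent, and assign the parent a similarity to every remaining node. This is a minimal illustration, not the authors' algorithm. In particular, averaging the two children's similarities is a simplifying stand-in for a likelihood-based update, and the function name and input values are hypothetical.

```python
from itertools import combinations

def agglomerate(labels, sim):
    """Build a binary tree (nested tuples) by bottom-up pair merging.

    labels: iterable of leaf labels.
    sim: dict mapping (a, b) pairs of leaf labels to estimated similarity.
    """
    nodes = list(labels)
    s = {frozenset(pair): value for pair, value in sim.items()}
    while len(nodes) > 1:
        # Selection: the pair with the largest estimated similarity.
        a, b = max(combinations(nodes, 2), key=lambda p: s[frozenset(p)])
        parent = (a, b)
        rest = [n for n in nodes if n != a and n != b]
        # Merging: the parent's similarity to each remaining node is the
        # average of its children's values (a simplifying assumption).
        for n in rest:
            s[frozenset((parent, n))] = 0.5 * (
                s[frozenset((a, n))] + s[frozenset((b, n))]
            )
        nodes = rest + [parent]
    return nodes[0]

# Hypothetical noisy similarity estimates for receivers 3, 4 and 5.
estimates = {(3, 4): 2.1, (3, 5): 0.9, (4, 5): 1.1}
tree = agglomerate([3, 4, 5], estimates)
# tree == (5, (3, 4)): receivers 3 and 4 are siblings, 5 joins at the root
```

A greedy procedure of this kind is fast but commits to each merge, whereas an MCMC search explores many candidate trees at higher computational cost.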
Internet Experiments – Sandwich Probing. Figure panels: traceroute topology; MCMC topology; UNO ALT topology.
Internet Experiments – RTT Delay Covariance. Figure panels: traceroute topology; estimated topology. Thanks to Yolanda Tsang & Mehmet Yildiz.
Final Remarks and Comments
Clever probing and sampling schemes reveal “hidden” network structure and behavior.
Likelihood-based methods are a natural choice to account for uncertainty in the data.
Sampling methods relying solely on RTT can be devised.
Complex interplay between measurement/probing techniques, statistical modeling, and computational methods for optimization.
R. Castro, M. Coates and R. Nowak, "Likelihood Based Hierarchical Clustering", IEEE Transactions on Signal Processing, August 2004.
R. Castro, M. Coates, G. Liang, R. Nowak and B. Yu, "Network Tomography: Recent Developments", Statistical Science, 2004 (invited paper, to appear).