Inference, monitoring and recovery of large scale networks

Presentation transcript:

Inference, monitoring and recovery of large scale networks
CSE Department, Penn State University
Institute for Networking and Security Research
Faculty: Thomas La Porta
Post-Doc: Simone Silvestri
Ph.D. Students: Srikar Tati, Brett Holbert, Michael Lin

Problems and challenges in large scale networks
Inference, monitoring and recovery of large scale networks, INSR Industry Day 2014
Research problems:
- Inferencing
- Monitoring
- Recovery
Challenges:
- Large scale
- Partial information
- Interdependent networks
- Constraints (time, cost, ...)
This research is sponsored by:
- Defense Threat Reduction Agency (DTRA)
- Army Research Lab and UK Ministry of Defence - ITA Program
[Figure: Internet router-level topology (Merlin tool)]

Inferencing: motivation
The lack of global knowledge of the Internet topology:
- Hinders network diagnostics (losses, failures, bottlenecks)
- Inflates IP path lengths
- Reduces the accuracy of models
- Encourages overlay networks to ignore the underlay
Network operators rarely publish their topologies.
Current inference approaches rely on tools such as Traceroute.
Traceroute provides only partial information.
The network is only partially observable.
Previous approaches fail or perform poorly.
Our problem: infer the routing topology in the presence of partial information.

Inferencing: our approach - iTop
iTop algorithm:
- Fills unobservable parts of the network with virtual links/routers
- Analyzes the traces to determine properties of the real topology
- Iteratively merges links to infer the real network
[Figure: iTop pipeline. Labels: ground truth topology, virtual topology, trace analysis, merging algorithm, inferred topology]
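The first and last steps of the list above can be illustrated with a toy sketch. It is not the iTop algorithm: the slides do not say how traces are analyzed or how merge candidates are chosen, so the merge here is applied by hand and, as a simplification, merges two virtual routers rather than individual links. The trace format (lists of hop names with `None` for an unresponsive hop) and the function names `build_virtual_topology` and `merge_routers` are assumptions made for illustration.

```python
from itertools import count


def build_virtual_topology(traces):
    """Return a set of undirected links from traceroute-style traces.

    Each trace is a list of hop names; None marks a hop that did not respond.
    Every unresponsive hop becomes its own fresh virtual router, so the result
    over-estimates the real topology until virtual routers are merged.
    """
    fresh = count(1)
    links = set()
    for trace in traces:
        hops = [h if h is not None else f"virtual-{next(fresh)}" for h in trace]
        for a, b in zip(hops, hops[1:]):
            links.add(frozenset((a, b)))
    return links


def merge_routers(links, keep, drop):
    """Merge router `drop` into `keep`, discarding self-loops and duplicates."""
    merged = set()
    for link in links:
        renamed = frozenset(keep if node == drop else node for node in link)
        if len(renamed) == 2:
            merged.add(renamed)
    return merged


# Two traces cross the same silent router between R1 and the destinations.
traces = [
    ["A", "R1", None, "B"],
    ["A", "R1", None, "C"],
]
virtual = build_virtual_topology(traces)
# Hypothetical decision: assume trace analysis showed both virtual hops are one router.
inferred = merge_routers(virtual, "virtual-1", "virtual-2")
print(len(virtual), "links before merging,", len(inferred), "after")
```

In this example the virtual topology has five links and the merged one has four, which shows why merging is needed: each unobserved hop is initially counted as a distinct router.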

Inferencing: our approach - Results 5 Inference, monitoring and recovery of large scale networks INSR Industry Day 2014 We compare our approach to state-of-art inferencing approaches:  X. Jin, W.-P. Yiu, S.-H. Chan, and Y. Wang, “Network topology inference based on end-to-end measurements,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 12, pp. 2182–2195, 2006  B. Yao, R. Viswanathan, F. Chang, and D. Waddington, “Topology inference in the presence of anonymous routers,” IEEE Infocom, We consider realistic networks We also show how iTop improves the performance of failure diagnosis algorithms in the presence of partial information

Monitoring: motivation (1)
Accurate knowledge of the internal network state enables:
- Performance diagnosis
- Resource allocation
- Efficient routing
- Congestion control
Monitoring large scale networks may incur high overhead.
Network tomography:
- Infers the internal network state from end-to-end measurements
- Solves a linear system
- Enables efficient monitoring by probing only a basis of the system
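A small worked example may make the linear-system view concrete. The sketch below assumes an additive link metric such as delay, so that each path measurement is the sum of the metrics of the links it crosses (y = R x, with R the path-by-link routing matrix). The routing matrix, the delay values, and the helper names `select_basis` and `solve` are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction


def select_basis(rows):
    """Greedily pick indices of linearly independent rows (a row basis)."""
    reduced = []  # (pivot column, reduced row) for each row kept so far
    chosen = []
    for idx, row in enumerate(rows):
        vec = [Fraction(v) for v in row]
        for pivot, basis_row in reduced:
            if vec[pivot] != 0:
                factor = vec[pivot] / basis_row[pivot]
                vec = [a - factor * b for a, b in zip(vec, basis_row)]
        pivot = next((j for j, v in enumerate(vec) if v != 0), None)
        if pivot is not None:  # row adds new information, so probe this path
            reduced.append((pivot, vec))
            chosen.append(idx)
    return chosen


def solve(matrix, rhs):
    """Solve a square, non-singular system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Rows are candidate probing paths, columns are links (1 = path uses link).
routing = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 1],  # linearly dependent on the first three paths
]
basis = select_basis(routing)

# Only the basis paths are probed; the end-to-end delays below are made up.
measured = {0: 5, 1: 8, 2: 7}
link_delays = solve([routing[p] for p in basis], [measured[p] for p in basis])
print("probed paths:", basis, "link delays:", [int(d) for d in link_delays])
```

Here three of the four candidate paths form a basis, so the fourth path never needs to be probed: its measurement is determined by the other three.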

Monitoring: motivation (2)
Failures are common events in modern networks.
Failures can significantly affect the performance of network tomography.
Probing incurs a cost, and often only a maximum budget is available.
Our problem: select a set of probing paths to maximize the performance of network tomography under failures with a limited budget.

Monitoring: our approach
We translate the problem into the maximization of a submodular function under a budget constraint.
We propose the algorithm RoMe:
- Makes use of recent advances in submodular maximization theory
- Has an approximation factor of (1-1/e)/2
- Is optimal under the additional constraint of linear independence
- Assumes knowledge of the failure distribution
We also consider the case of an unknown failure distribution.
We propose the algorithm LSR (Learning with Submodular Rewards):
- Reinforcement learning approach
- Learns path availabilities
- Performance guarantees
[Figure: LSR loop. Labels: init, update path availabilities, select paths, collect measurements]
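The budgeted path selection problem can be sketched with a generic cost-benefit greedy heuristic. This is not the RoMe algorithm, whose details are not given in the slides, and no claim is made that it achieves the stated approximation factor. As an assumption, the objective is the rank of the surviving probed paths averaged over a hand-written list of failure scenarios (each scenario is the set of paths that stay up); the routing matrix, costs, and function names are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of rows, by exact Gaussian elimination."""
    basis = []  # (pivot column, reduced row)
    for row in rows:
        vec = [Fraction(v) for v in row]
        for pivot, b in basis:
            if vec[pivot] != 0:
                factor = vec[pivot] / b[pivot]
                vec = [x - factor * y for x, y in zip(vec, b)]
        pivot = next((j for j, v in enumerate(vec) if v != 0), None)
        if pivot is not None:
            basis.append((pivot, vec))
    return len(basis)


def expected_rank(selected, routing, scenarios):
    """Average rank of the selected paths that survive in each scenario."""
    total = 0
    for alive in scenarios:
        total += rank([routing[p] for p in selected if p in alive])
    return Fraction(total, len(scenarios))


def budgeted_greedy(routing, costs, scenarios, budget):
    """Repeatedly add the affordable path with the best gain per unit cost."""
    selected, spent = [], 0
    current = Fraction(0)
    while True:
        best, best_ratio, best_value = None, Fraction(0), current
        for p in range(len(routing)):
            if p in selected or spent + costs[p] > budget:
                continue
            value = expected_rank(selected + [p], routing, scenarios)
            ratio = (value - current) / costs[p]
            if ratio > best_ratio:
                best, best_ratio, best_value = p, ratio, value
        if best is None:  # nothing affordable improves the objective
            return selected, current
        selected.append(best)
        spent += costs[best]
        current = best_value


routing = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
costs = [1, 1, 1, 1]
# Assumed failure model: path 0 is down in one of two equally likely scenarios.
scenarios = [{0, 1, 2, 3}, {1, 2, 3}]
paths, value = budgeted_greedy(routing, costs, scenarios, budget=2)
print("selected paths:", paths, "expected rank:", value)
```

With a budget of two probes, the sketch selects paths 1 and 2 and skips path 0, because path 0 contributes nothing in the scenario where it fails.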

Monitoring: results
We compare our approach to state-of-the-art path selection algorithms:
- Y. Chen, D. Bindel, H. Song, and R. H. Katz, “An algebraic approach to practical and scalable overlay network monitoring,” ACM SIGCOMM Comp. Com. Rev.,
We consider realistic topologies and failure models.

Recovery: motivation
Modern networks are highly interdependent:
- The Internet and the smart grid
- Water supply, transportation, fuel and power stations are coupled together
Interdependent networks are extremely sensitive to failures.
Failures may create performance degradation.
Degradation can also propagate in the surviving network.
Example: the electrical blackout that occurred in Italy in September 2003.

Recovery: research problems (1)
Recovery algorithms for overlay networks
Two networks sharing the same infrastructure
Failures occur in the underlay network and affect the overlay.
Models an emergency urban communication network after a weapon of mass destruction attack
We aim to restore the functionality of the overlay network by repairing the underlay.
Objectives & constraints:
- Bandwidth
- Time
- Cost
- Utility

Recovery: research problems (2)
Models for temporal propagation of failures
Two general interdependent networks
Failures propagate over time:
- Backup batteries/generators
- Local solar plant supply
Given the initial failure, our model will:
- Estimate the probability that one element fails at a given time
- Estimate the expected time at which one element fails
- Estimate the expected number of failed elements at a given time
This information will be used to design recovery strategies.
These models will be mapped to and validated with real interdependent networks.
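The three estimates listed above can be illustrated with a toy Monte Carlo simulation. The slides do not specify the propagation model, so everything here is an assumption: an element fails once any element it depends on has failed and its own backup supply (drawn uniformly from a given range, in hours) has run out. The estimates are cumulative, meaning the probability that an element has failed by a given time. The element names, backup ranges, and function names are hypothetical.

```python
import random

INF = float("inf")


def simulate_once(depends_on, backup_range, initial_failures, rng):
    """Return {element: failure time} for one random run (INF = never fails)."""
    fail_time = {n: INF for n in depends_on}
    for n in initial_failures:
        fail_time[n] = 0.0
    backup = {n: rng.uniform(*backup_range[n]) for n in depends_on}
    changed = True
    while changed:  # relax until no failure time can be lowered
        changed = False
        for node, suppliers in depends_on.items():
            if not suppliers:
                continue
            t = min(fail_time[s] for s in suppliers) + backup[node]
            if t < fail_time[node]:
                fail_time[node] = t
                changed = True
    return fail_time


def estimate(depends_on, backup_range, initial_failures, horizon, runs=2000, seed=0):
    """Estimate failure probability by `horizon`, mean failure time, expected count."""
    rng = random.Random(seed)
    failed_by = {n: 0 for n in depends_on}
    time_sum = {n: 0.0 for n in depends_on}
    time_count = {n: 0 for n in depends_on}
    for _ in range(runs):
        times = simulate_once(depends_on, backup_range, initial_failures, rng)
        for n, t in times.items():
            if t <= horizon:
                failed_by[n] += 1
            if t != INF:
                time_sum[n] += t
                time_count[n] += 1
    prob = {n: failed_by[n] / runs for n in depends_on}
    # Mean failure time is conditional on the element failing at all.
    mean_time = {n: time_sum[n] / time_count[n] if time_count[n] else INF
                 for n in depends_on}
    return prob, mean_time, sum(prob.values())


# Hypothetical chain across a power grid and a communication network.
depends_on = {"power_A": [], "router": ["power_A"],
              "control": ["router"], "power_B": ["control"]}
backup_range = {"power_A": (0, 0), "router": (2, 6),
                "control": (0, 1), "power_B": (1, 3)}
prob, mean_time, expected_failed = estimate(
    depends_on, backup_range, initial_failures=["power_A"], horizon=4)
print(prob, mean_time, expected_failed)
```

A recovery strategy could use such estimates to prioritize repairs, for example by restoring the element whose dependents are expected to fail soonest.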

Recovery: research problems (3)
Improve network robustness:
- Re-design existing networks
- Design new networks less prone to cascading effects
Models and recovery strategies for performance degradation over time
Partial knowledge
Partial control
Multiple interdependent networks

Thank you! Any questions?