Traffic Monitoring, Estimation, and Engineering Nick Feamster CS 7260 February 19, 2007.

Slides:



Advertisements
Similar presentations
Ch. 12 Routing in Switched Networks
Advertisements

Multihoming and Multi-path Routing
Multihoming and Multi-path Routing
Network Monitoring and Security Nick Feamster CS 4251 Spring 2008.
Cisco S3 C5 Routing Protocols. Network Design Characteristics Reliable – provides mechanisms for error detection and correction Connectivity – incorporate.
1 Traffic Engineering (TE). 2 Network Congestion Causes of congestion –Lack of network resources –Uneven distribution of traffic caused by current dynamic.
1 EL736 Communications Networks II: Design and Algorithms Class3: Network Design Modeling Yong Liu 09/19/2007.
Fundamentals of Computer Networks ECE 478/578 Lecture #18: Policy-Based Routing Instructor: Loukas Lazos Dept of Electrical and Computer Engineering University.
1 Interdomain Routing Protocols. 2 Autonomous Systems An autonomous system (AS) is a region of the Internet that is administered by a single entity and.
1 EL736 Communications Networks II: Design and Algorithms Class8: Networks with Shortest-Path Routing Yong Liu 10/31/2007.
Trajectory Sampling for Direct Traffic Observation Matthias Grossglauser joint work with Nick Duffield AT&T Labs – Research.
Traffic Engineering With Traditional IP Routing Protocols
Towards More Adaptive Internet Routing Mukund Seshadri Prof. Randy Katz.
1 Adapting Routing to the Traffic COS 461: Computer Networks Spring 2006 (MW 1:30-2:50 in Friend 109) Jennifer Rexford Teaching Assistant: Mike Wawrzoniak.
Traffic Engineering Jennifer Rexford Advanced Computer Networks Tuesdays/Thursdays 1:30pm-2:50pm.
1 Traffic Engineering for ISP Networks Jennifer Rexford IP Network Management and Performance AT&T Labs - Research; Florham Park, NJ
Traffic Engineering in IP Networks Jennifer Rexford Computer Science Department Princeton University; Princeton, NJ
Traffic Engineering for ISP Networks Jennifer Rexford Computer Science Department Princeton University
Traffic Engineering for ISP Networks
Traffic Engineering for ISP Networks Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
CSEE W4140 Networking Laboratory Lecture 4: IP Routing (RIP) Jong Yul Kim
1 Deriving Traffic Demands for Operational IP Networks: Methodology and Experience Anja Feldmann*, Albert Greenberg, Carsten Lund, Nick Reingold, Jennifer.
Traffic Measurement for IP Operations Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
Traffic Engineering for ISP Networks Jennifer Rexford Computer Science Department Princeton University
Traffic Measurement for IP Operations Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
Internet Routing (COS 598A) Today: Intradomain Traffic Engineering Jennifer Rexford Tuesdays/Thursdays.
Traffic Engineering for ISP Networks Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
Internet Routing (COS 598A) Today: Interdomain Traffic Engineering Jennifer Rexford Tuesdays/Thursdays.
Traffic Engineering for ISP Networks Jennifer Rexford Computer Science Department Princeton University
Impact of BGP Dynamics on Intra-Domain Traffic Patterns in the Sprint IP Backbone Sharad Agarwal, Chen-Nee Chuah, Supratik Bhattacharyya, Christophe Diot.
Measurement and Monitoring Nick Feamster Georgia Tech.
Graphs and Topology Yao Zhao. Background of Graph A graph is a pair G =(V,E) –Undirected graph and directed graph –Weighted graph and unweighted graph.
Network Monitoring for Internet Traffic Engineering Jennifer Rexford AT&T Labs – Research Florham Park, NJ 07932
Routing.
1 Traffic Engineering for ISP Networks Jennifer Rexford IP Network Management and Performance AT&T Labs - Research; Florham Park, NJ
Jennifer Rexford Princeton University MW 11:00am-12:20pm Wide-Area Traffic Management COS 597E: Software Defined Networking.
Network-Wide Traffic Models for Managing IP Networks Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
MATE: MPLS Adaptive Traffic Engineering Anwar Elwalid, et. al. IEEE INFOCOM 2001.
Traffic: Monitoring, Estimation, and Engineering Nick Feamster CS 7260 February 14, 2007.
Tomo-gravity Yin ZhangMatthew Roughan Nick DuffieldAlbert Greenberg “A Northern NJ Research Lab” ACM.
Transit Traffic Engineering Nick Feamster CS 6250: Computer Networks Fall 2011.
Traffic Measurement for IP Operations
1 Pertemuan 20 Teknik Routing Matakuliah: H0174/Jaringan Komputer Tahun: 2006 Versi: 1/0.
Network Flow-Based Anomaly Detection of DDoS Attacks Vassilis Chatzigiannakis National Technical University of Athens, Greece TNC.
IP is a Network Layer Protocol Physical 1 Network DataLink 1 Transport Application Session Presentation Network Physical 1 DataLink 1 Physical 2 DataLink.
Traffic Engineering for ISP Networks Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
Using Measurement Data to Construct a Network-Wide View Jennifer Rexford AT&T Labs—Research Florham Park, NJ
Lecture 4: BGP Presentations Lab information H/W update.
Chapter 9. Implementing Scalability Features in Your Internetwork.
Traffic Engineering for ISP Networks Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ
1 Internet Routing. 2 Terminology Forwarding –Refers to datagram transfer –Performed by host or router –Uses routing table Routing –Refers to propagation.
Border Gateway Protocol (BGP) W.lilakiatsakun. BGP Basics (1) BGP is the protocol which is used to make core routing decisions on the Internet It involves.
Controlling the Impact of BGP Policy Changes on IP Traffic Jennifer Rexford IP Network Management and Performance AT&T Labs – Research; Florham Park, NJ.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Measurement COS 597E: Software Defined Networking.
April 4th, 2002George Wai Wong1 Deriving IP Traffic Demands for an ISP Backbone Network Prepared for EECE565 – Data Communications.
Trajectory Sampling for Direct Traffic Oberservation N.G. Duffield and Matthias Grossglauser IEEE/ACM Transactions on Networking, Vol. 9, No. 3 June 2001.
Open-Eye Georgios Androulidakis National Technical University of Athens.
Intradomain Traffic Engineering By Behzad Akbari These slides are based in part upon slides of J. Rexford (Princeton university)
Jennifer Rexford Fall 2010 (TTh 1:30-2:50 in COS 302) COS 561: Advanced Computer Networks Backbone.
Mike Freedman Fall 2012 COS 561: Advanced Computer Networks Traffic Engineering.
1 Slides by Yong Liu 1, Deep Medhi 2, and Michał Pióro 3 1 Polytechnic University, New York, USA 2 University of Missouri-Kansas City, USA 3 Warsaw University.
1 Chapter 14-16a Internet Routing Review. Chapter 14-16: Internet Routing Review 2 Introduction Motivation: Router performance is critical to overall.
Internet Traffic Engineering Motivation: –The Fish problem, congested links. –Two properties of IP routing Destination based Local optimization TE: optimizing.
ROUTING ON THE INTERNET COSC Jun-16. Routing Protocols  routers receive and forward packets  make decisions based on knowledge of topology.
Controlling the Impact of BGP Policy Changes on IP Traffic
Routing.
COS 561: Advanced Computer Networks
COS 561: Advanced Computer Networks
Backbone Traffic Engineering
Routing.
Presentation transcript:

Traffic Monitoring, Estimation, and Engineering Nick Feamster CS 7260 February 19, 2007

2 Administrivia Turn in project proposals by midnight tonight If you are still at a loss for project ideas, come see me at office hours today

3 Today’s Topics Traffic monitoring –Flow monitoring Mechanics Limitations –Estimation / Problem formulation Traffic engineering –Intradomain (today’s paper) –Interdomain

4 Traffic Flow Statistics Flow monitoring (e.g., Cisco Netflow) –Statistics about groups of related packets (e.g., same IP/TCP headers and close in time) –Recording header information, counts, and time More detail than SNMP, less overhead than packet capture –Typically implemented directly on line card

5 What is a flow? Source IP address Destination IP address Source port Destination port Layer 3 protocol type TOS byte (DSCP) Input logical interface (ifIndex)

6 Cisco Netflow Basic output: “Flow record” –Most common version is v5 –Latest version is v10 (RFC 3917) Current version (10) is being standardized in the IETF (template-based) –More flexible record format –Much easier to add new flow record types Core Network Collection and Aggregation Collector (PC) Approximately 1500 bytes flow records Sent more frequently if traffic increases

7 Flow Record Contents Source and Destination, IP address and port Packet and byte counts Start and end times ToS, TCP flags Basic information about the flow… …plus, information related to routing Next-hop IP address Source and destination AS Source and destination prefix

8 flow 1flow 2flow 3 flow 4 Aggregating Packets into Flows Criteria 1: Set of packets that “belong together” –Source/destination IP addresses and port numbers –Same protocol, ToS bits, … –Same input/output interfaces at a router (if known) Criteria 2: Packets that are “close” together in time –Maximum inter-packet spacing (e.g., 15 sec, 30 sec) –Example: flows 2 and 4 are different flows due to time

9 Netflow Processing 1.Create and update flows in NetFlow Cache Inactive timer expired (15 sec is default) Active timer expired (30 min (1800 sec) is default) NetFlow cache is full (oldest flows are expired) RST or FIN TCP Flag Heade r Export Packet Payload (flows) 2.Expiration 3.Aggregation? e.g. Protocol-Port Aggregation Scheme becomes 4.Export Version Yes No Aggregated Flows – export Version 8 or 9 Non-Aggregated Flows – export Version 5 or 9 5.Transport Protocol

10 Reducing Measurement Overhead Filtering: on interface –destination prefix for a customer –port number for an application (e.g., 80 for Web) Sampling: before insertion into flow cache –Random, deterministic, or hash-based sampling –1-out-of-n or stratified based on packet/flow size –Two types: packet-level and flow-level Aggregation: after cache eviction –packets/flows with same next-hop AS –packets/flows destined to a particular service

11 Packet Sampling Packet sampling before flow creation (Sampled Netflow) –1-out-of-m sampling of individual packets (e.g., m=100) –Create of flow records over the sampled packets Reducing overhead –Avoid per-packet overhead on (m-1)/m packets –Avoid creating records for a large number of small flows Increasing overhead (in some cases) –May split some long transfers into multiple flow records –… due to larger time gaps between successive packets time not sampled two flows timeout

12 Problems with Packet Sampling Determining size of original flows is tricky –For a flow originally of size n, the size of the sampled flow follows a binomial distribution –Extrapoliation can result in big errors –Much research in reducing such errors (upcoming lectures) Flow records can be lost Small flows may be eradicated entirely

13 Flow-Level Sampling Sampling of flow records evicted from flow cache –When evicting flows from table or when analyzing flows Stratified sampling to put weight on “heavy” flows –Select all long flows and sample the short flows Reduces the number of flow records –Still measures the vast majority of the traffic Flow 1, 40 bytes Flow 2, bytes Flow 3, 8196 bytes Flow 4, bytes Flow 5, 532 bytes Flow 6, 7432 bytes sample with 100% probability sample with 0.1% probability sample with 10% probability

14 Accuracy Depends on Phenomenon Even naïve random sampling probably decent for capturing the existence of large flows Accurately measuring other features may require different approaches –Sizes of large flows –Distribution of flow sizes –Existence of small flows (coupon collection) –Size of small flows –Traffic “matrix”

15 Recent Work: Targeted Sampling Set sampling rates based on estimate of flow sizes or classes Flow sampling: Size-dependent flow sampling (Duffield) Packet sampling: Sketch-Guided Sampling (Kumar), Flexible Sampling (Ramachandran)

16 Motivation: Recover Structure Structure of “who’s ralking to whom” can reveal helpful information about traffic patterns Can be difficult to get this from sampled flow records if the flows themselves are small “BLINC: Multi-level Traffic Classification in the Dark”, SIGCOMM 2004

17 Idea: Augment Sampling with Counting Estimate, online, the type/size of a flow to which a packet belongs Adjust sampling rate according to corresponding class

18 Stealing Packets from Large Flows Some packet samples can be “stolen” from large flows without incurring much estimation error Reallocating to small flows means more unique “conversations” captured

Traffic Engineering

20 Traffic Engineering Motivation Efficient use of resources (capacity, etc.) Response to dynamic network conditions –Changes in offered traffic load –Changes in capacity –Link failures, etc. Routing protocols do not (typically) automatically adapt to these changing conditions

21 Traffic Engineering Formulation Topology Routing configuration eBGP routes Offered traffic Routing model Traffic flow through the network

22 Two Types of Traffic Engineering Intradomain: Within a single AS –More “traditional” optimization problem –Challenges are in updating link weights Interdomain: Between multiple ASes –Ill-formed problem –Many more unknowns

23 Challenge: Dynamic Conditions Link state updates –High update rate leads to high overhead –Low update rate leads to oscillation Many connections are short –Average Web transfer is just 10 packets –Requires high update rates to ensure stability Information goes stale as network conditions change. Approaches for dealing with this?

24 Challenge: Cost of Reconfiguration Minimize number of changes to the network –Changing just 1 or 2 link weights is often enough Limit frequency of changes to the weights –Joint optimization for day & night traffic matrices Tolerate failure of network equipment –Weights settings usually remain good after failure –… or can be fixed by changing one or two weights Limit dependence on accuracy and dynamics

25 “Traditional” Intradomain TE Routers flood information to learn topology –Determine “next hop” to reach other routers… –Compute shortest paths based on link weights Link weights configured by network operator

26 Approaches: Local vs. Network-Wide Local –Proportional to physical distance Cross-country links have higher weights Minimizes end-to-end propagation delay –Inversely proportional to link capacity Smaller weights for higher-bandwidth links Attracts more traffic to links with more capacity Network-Wide –Network-wide optimization of the link weights –Directly minimizes metrics like max link utilization –Assumes accesss to network-wide information

27 Approaches: Offline vs. Dynamic Offline –Monitor trends and macro patterns in utilization –Perform offline optimization –Advantage: No protocol modification. No stability issues. Flexibility –Disadvantage: Perhaps less reactive. Dynamic –Routers themselves monitor link utilization –Distributed/local reoptimization of link weights and route selection –Advantage: More reactive. –Disadvantage: Protocol modifications, stability issues, etc.

28 Traffic Engineering Approaches OfflineDynamic CentralizedIntradomain: Fortz/Rexford02 Interdomain: Feamster03 Routing control platform (Open questions in this area) DistributedIntradomain: Katabi (TeXCP) Interdomain: Mahajan (NP, nexit)

29 Formalization Input: Graph G with unidirectional links l and capacities for each link Input: Objective function Input: traffic matrix –M i,j : traffic load from source i to destination j –For intradomain TE, source/destination are routers Output: setting of the link weights –W l : weight on unidirectional link l –P i,j,l : fraction of traffic from i to j traversing link l

30 Objective Functions Take 1: Minimize maximum utilization –Individual bottleneck links may impose unrealistic/overly strong constraints Take 2: Fractional relation between load and total capacity

31 4Mbps 3Mbps5Mbps Traffic Matrix Estimation Sources Destinations From link counts to the traffic matrix

32 Formalization Source-destination pairs –p is a source-destination pair (of nodes) –x p is the (unknown) traffic volume for this pair Links in the network –l is a unidirectional edge –y l is the observed traffic volume on this link Routing –R lp = 1 if link l is on the path for src-dest pair p –Or, R lp is the proportion of p’s traffic that traverses l y = Rx (now work backwards to get x)

33 Estimation is Under-Constrained Linear system is underdetermined –Number of nodes n –Number of links e is around O(n) –Number of src-dest pairs c is O(n 2 ) –Dimension of solution sub-space at least c - e Multiple observations can help –k independent observations (over time) –Stochastic model with src-dest counts Poisson & i.i.d –Maximum likelihood estimation to infer traffic matrix –Use NetFlow to augment byte counts can help constrain the problem Lots of recent work on traffic matrix estimation

34 Questions for Intra-Domain TE Is the objective function reasonable? What about: –Resilience to failure? –Use of available capacity? How stable/fragile is the resulting optimal solution when traffic demands change? –Fluctuating traffic demands Setting link weights in this fashion is unintuitive and often not used in practice? Is there a better way to set link weights?

Interdomain Traffic Engineering

36 Interdomain Traffic Engineering Why? –Alleviating congestion on edge links –Adapting to provisioning changes –Achieving good end-to-end performance How? –Directing traffic to a different neighbor AS –Directing traffic to different links to the same neighbor

37 Overview of Method (Outbound) Change outbound traffic using BGP import policy Requirements –Minimal overhead (management and messages) –Predictable changes in traffic –No effect on neighboring ASes’ routing decisions

38 Why Interdomain TE is Hard Scale: Can’t set independent policy for 200k+ prefixes –Configuration overhead –Traffic instability Predictability: Policy-based adjustments are indirect Control: Neighbors’ behavior can affect traffic volumes in unpredictable and uncontrollable ways

39 Why Interdomain TE with BGP is Hard Protocol problems –No performance metrics in advertisement attributes Configuration problems –Not possible to express conjunction between attributes –Indirect influence Route selection problems –One best route per prefix per router –Can’t split traffic to a prefix over paths of different lengths –Interaction with IGP Commercial relationship problems

40 Managing Scale Problem: Large number of prefixes Solution: Change policies for small fraction of prefixes that are responsible for most traffic 10% of prefixes responsible for 70% of traffic More concentrated for origin ASes.

41 Achieving Predictability Problem: Traffic volumes change over time Solution: Change policies for prefxes that have more stable traffic volumes. Origin ASes responsible for 10% of inbound traffic don’t fluctuate by more than a factor or two between weeks.

42 Acheving Predictability Problem: Internal changes that are externally visible can change inbound traffic volumes Solution: Shift traffic among paths –To the same AS –To a different AS, but one with the same path length