CS 4700 / CS 5700 Network Fundamentals Lecture 17: Network Modeling (Not Everyone has a Datacenter)


1 CS 4700 / CS 5700 Network Fundamentals Lecture 17: Network Modeling (Not Everyone has a Datacenter)

2 Wide-Area Network Research
- Most research is now focused on large-scale systems
- Challenge: testing and evaluation
  - How to perform wide-area tests in a repeatable, reliable manner
  - ModelNet, Emulab
- Challenge: understanding/capturing Internet topologies
  - Graph characterization: the dK-series

3 Outline
- ModelNet
- dK

4 A Case for Network Emulation
- Need a way to test large-scale Internet services
  - Peer-to-peer, overlay networks, novel protocols
- Testing in the real world (PlanetLab…)
  - Results not reproducible or predictable
  - Difficult to deploy and administer research software
- Simulation tools
  - Allow control over the test environment
  - May miss important system interactions
- Emulation
  - Emulators subject application traffic to the end-to-end bandwidth constraints, latency, and loss rate of a user-specified topology
  - Previous implementations were not scalable

5 ModelNet
- A scalable, cluster-based, comprehensive network emulation environment

6 Design
- Users run a configurable number of application instances on edge nodes within the cluster
  - Each instance is a Virtual Edge Node (VN)
  - Each VN has a unique IP address
- Edge nodes route traffic through a cluster of core routers
  - Core routers are equipped with large memories and modified FreeBSD kernels
- Core routers route traffic through emulated links, or “pipes”
  - Each pipe has its own packet queue and queuing discipline

7 ModelNet Phases
- Create
  - Generates a network topology as a graph
  - From Internet traces, BGP dumps, synthetic topology generators, etc.
  - Annotates the graph with loss rates, failure distributions, …
- Distillation
  - Transforms the GML graph into a pipe topology
- Assignment
  - Maps the pipe topology to core nodes, distributing emulation load across them
  - Finding the ideal mapping is NP-complete
  - ModelNet uses a greedy k-clusters assignment: for k core nodes, randomly select k nodes in the distilled topology, then greedily select links from each connected component in round-robin order
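The greedy k-clusters heuristic above can be sketched as follows. This is an illustrative reading of the slide, not ModelNet's actual code: seed k clusters at random nodes, then grow them round-robin, each claiming one unassigned link adjacent to its component (falling back to any remaining link when a cluster's frontier is empty).

```python
import random
from collections import defaultdict

def greedy_k_clusters(edges, k, seed=0):
    """Partition the links (pipes) of a distilled topology among k core
    nodes: seed k clusters at random nodes, then grow them round-robin by
    claiming unassigned links adjacent to each cluster's node set."""
    rng = random.Random(seed)
    adj = defaultdict(set)                        # node -> incident links
    for e in edges:
        adj[e[0]].add(e)
        adj[e[1]].add(e)
    unassigned = set(edges)
    clusters = [{"nodes": {n}, "links": set()}
                for n in rng.sample(sorted(adj), k)]
    while unassigned:
        for c in clusters:                        # round robin over cores
            if not unassigned:
                break
            frontier = {e for n in c["nodes"]
                        for e in adj[n] if e in unassigned}
            e = min(frontier) if frontier else min(unassigned)
            unassigned.discard(e)
            c["links"].add(e)
            c["nodes"].update(e)
    return [c["links"] for c in clusters]
```

Each core node then emulates only the pipes in its own link set, which is why a balanced assignment matters for throughput.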

8 ModelNet Phases
- Binding
  - Multiplex multiple VNs onto each physical edge node
  - Bind each physical edge node to a core router
  - Generate shortest-path routes between all VNs and install them in the core routing tables
- Run
  - Execute the target application code on the edge nodes

9 Inside the Core
- Route traffic through emulated “pipes”
  - Each route is an ordered list of pipes
  - Packets move through pipes by reference
  - The routing table requires O(n²) space
- Packet scheduling
  - When a packet arrives, put it at the tail of the first pipe in its route
  - The scheduler keeps a heap of pipes sorted by earliest deadline, i.e. the exit time of the first packet in each pipe’s queue
  - Once every clock tick:
    - Traverse the pipes in the heap for packets that are ready to exit
    - Move those packets to the tail of the next pipe, or schedule them for delivery
    - Calculate new deadlines
- Multi-core configuration
  - The next pipe in a route may be on a different machine
  - If so, the core node tunnels the packet descriptor to the next node
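A toy version of the deadline-driven scheduling idea, assuming a latency-only pipe model (the real core also enforces bandwidth and queue limits). Each packet carries a route of pipe names; the heap is popped in deadline order, mimicking "move packets to the tail of the next pipe or schedule for delivery":

```python
import heapq

def emulate(pipes, packets):
    """pipes: pipe name -> one-way latency.
    packets: list of (route, start_time), route = ordered list of pipe names.
    Returns (packet id, end-to-end delay) pairs in delivery order."""
    heap, delivered = [], []
    for pid, (route, start) in enumerate(packets):
        # enter the first pipe: deadline = time the packet can exit it
        heapq.heappush(heap, (start + pipes[route[0]], pid, route, start, 0))
    while heap:
        deadline, pid, route, start, hop = heapq.heappop(heap)
        if hop + 1 == len(route):
            delivered.append((pid, deadline - start))   # last pipe: deliver
        else:
            # move to the tail of the next pipe, compute its new deadline
            heapq.heappush(
                heap, (deadline + pipes[route[hop + 1]], pid, route, start, hop + 1))
    return delivered
```

The pid in the heap tuple doubles as a tie-breaker so equal deadlines never compare the route lists.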

10 Scalability Issues
- Traffic traversing the core is limited by the cluster’s internal physical bandwidth
- ModelNet must buffer up to the full bandwidth-delay product of the target network: 250 MB of packet buffer space to carry flows at an aggregate bandwidth of 10 Gb/s with 200 ms round-trip latency
- Assumes a perfect routing protocol
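The 250 MB figure is just the bandwidth-delay product (note the bandwidth must be in bits per second for the arithmetic to work out):

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product: bits in flight on the path, converted to
    bytes; an emulator must be able to buffer this much to be transparent."""
    return bandwidth_bps * rtt_seconds / 8

# The slide's numbers: 10 Gb/s aggregate bandwidth, 200 ms round trip
buffer_mb = bdp_bytes(10e9, 0.200) / 1e6   # ~250 MB of packet buffers
```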

11 Baseline Accuracy
- Want to ensure that, under load, packets are subject to the correct end-to-end delays
- Used kernel logging to track ModelNet performance and accuracy
- Results show that, by running the ModelNet scheduler at the highest kernel priority:
  - Packets are delivered within 1 ms of the target end-to-end value
  - Accuracy is maintained up to 100% CPU usage

12 Scalability
- Additional cores
  - Adding core routers allows ModelNet to deliver higher throughput
  - Communication between core routers introduces overhead: more cross-core communication means less throughput benefit
- VN multiplexing
  - Higher degrees of multiplexing enable emulation of larger networks
  - Inaccuracies are introduced by context switching, scheduling, resource contention, etc.

13 Accuracy vs. Scalability
- Reduce overhead by deviating from the target network’s requirements
- Changes should minimally impact application behavior
- Ideally, the system reports the degree and nature of emulation inaccuracy

14 Scalability via Distillation
- Pure hop-by-hop emulation
  - The distilled topology is isomorphic to the target network
  - High per-packet overhead
- End-to-end distillation
  - Remove all interior network nodes
  - Collapse each path into a single pipe
    - Latency = sum of the latencies along the path
    - Reliability = product of the link reliabilities along the path
  - Low per-packet overhead
  - Does not emulate link contention along the path
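The end-to-end collapse rules translate directly into code (an illustrative helper, not ModelNet's implementation):

```python
def collapse_path(links):
    """End-to-end distillation of one path: replace the hop-by-hop pipes
    with a single pipe whose latency is the sum of the link latencies and
    whose reliability is the product of the link reliabilities.
    `links` is a list of (latency_ms, reliability) pairs."""
    latency = sum(lat for lat, _ in links)
    reliability = 1.0
    for _, rel in links:
        reliability *= rel
    return latency, reliability
```

This is exactly what the slide warns about: the single pipe preserves delay and loss but has no notion of per-link queues, so contention on interior links disappears.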

15 Time Dilation on ModelNet
- The challenge
  - Need to emulate networks with more resources than we have
  - E.g. fast CPUs (20 GHz), high-bandwidth networks (TB/s)
  - But only commodity machines are available
- The solution
  - ModelNet + time dilation via virtual machines
  - Run each application instance inside a VM
  - Slow down time inside the VM
  - Result: everything looks faster/bigger/fatter
    - More CPU cycles, packets, and disk I/O per unit of (virtual) time

16 How It’s Done
- Must isolate the VM from outside measures of time
  - Time is based on a shared data structure provided by the VMM
  - Scale that data structure by a Time Dilation Factor (TDF)
  - Also scale the hardware timer by the TDF
- How do we scale only some resources?
  - Slow the others back down!
  - Example: speed up the network by TDF = 10
    - Bandwidth increases by 10x, but delay decreases by 10x, so increase the emulated delay by 10x to compensate
- [Figure: a Virtual Machine Monitor (VMM) hosting VM 1 … VM n, plus a Node Manager and Local Admin]
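The scaling arithmetic in the example can be checked directly (helper name and units are made up for illustration):

```python
def perceived(real_rate_per_s, real_delay_s, tdf):
    """With time inside the VM slowed by TDF, anything measured per second
    of *virtual* time appears TDF times larger, while a fixed real delay
    appears TDF times smaller.  So rates (CPU, bandwidth) scale up for
    free, but delay must first be stretched by TDF in the emulator to
    appear unchanged to the VM."""
    return real_rate_per_s * tdf, real_delay_s / tdf

# 1 Gb/s link with 50 ms delay, TDF = 10:
rate, delay = perceived(1e9, 0.050, 10)
# The VM perceives a 10 Gb/s link but only 5 ms of delay, so ModelNet
# emulates a 500 ms delay to make the perceived delay 50 ms again.
compensated_delay = perceived(1e9, 0.050 * 10, 10)[1]
```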

17 ModelNet Summary
- ModelNet: the antithesis of PlanetLab
  - Testing of unmodified applications
  - Reproducible results
  - Experimentation using a broad range of network topologies and characteristics
  - Large-scale experiments (thousands of nodes and gigabits of cross traffic)
  - Can scale to emulate non-existent resource levels
- But what if you want real deployment on demand?
  - Emulab / NetBed

18 Emulab / NetBed
- A shared, configurable, on-demand testbed
  - What if you don’t have your own cluster?
  - What if you need to test specific environments/hardware?
  - What if you need this in 5 minutes?
- Emulab / NetBed
  - Hardware: 328 PCs, high-speed gigabit Cisco switches
  - Software: OS loader and manager driven via a web interface
    - Wipes all disks, loads OS images, and configures routers in under 2 minutes
    - Reboots and gives ssh access

19 Emulab Web Interface

20 Outline
- ModelNet
- dK

21 Importance of Network Topology
- Access to real-world network topologies is vital for research
  - Design: developing and testing new routing and other protocols
  - Analysis: the performance of a routing algorithm strongly depends on topology
  - Generation: empirical estimation of scalability
  - Network robustness, resilience under attack, worm spreading, etc.

22 Network Topology Research
- [Figure: the research space divided into static topologies and dynamic topologies]

23 Trade Secrets
- Unfortunately, large-scale network topologies are often proprietary
  - Think about BGP: ISPs want to hide their internal topology
- Real datasets are rare
  - Small scale
  - Out of date
  - Static (i.e. not dynamic)

24 Towards Synthetic Topologies
- Question: can we use graph models to capture real network topologies?
  - Fit a model to a real topology
  - Use a generator to produce synthetic topologies that are similar, but not identical, to the real one
- Benefits
  - Privacy: synthetic graphs are not proprietary
  - Randomization: produce an infinite number of stochastic snapshots
  - Scalability: the generator can produce similar topologies of any size

25 Important Topology Metrics
- Degree distribution
- Clustering
- Assortativity
- Distance distribution
- Betweenness distribution
- Problems
  - No way to reproduce most of the important metrics
  - No guarantee that some other, newly discovered metric won’t turn out to be important

26 The Approach
- Look at inter-dependencies among topology characteristics
- See if, by reproducing the most basic, simple characteristics, we can also reproduce all the others, including the practically important ones
- Try to find the characteristic(s) that define all others

27 Definition of dK-distributions
- dK-distributions are degree correlations within simple connected subgraphs of size d
- For example:
  - The 1K distribution is the node degree distribution
  - The 2K distribution is the joint node degree distribution
  - The 3K distribution captures the clustering coefficient

28 An Example of dK
- The dK distribution is the distribution of d-sized subgraphs with particular node degrees
  - dK-1 describes the node degree distribution
  - dK-2 describes the joint node degree distribution
  - dK-3 captures the clustering coefficient
- Example graph (4 nodes, 4 edges):
  - dK-0: average degree = 2
  - dK-1: P(1)=1, P(2)=2, P(3)=1
  - dK-2: P(1,3)=1, P(2,2)=1, P(2,3)=2
  - dK-3: P(1,3,2)=2, P(2,2,3)=1
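The 1K and 2K values in the example can be computed mechanically; the sketch below (illustrative code, not from the dK paper) reproduces the slide's numbers on a 4-node graph with degrees 1, 2, 2, 3:

```python
from collections import Counter

def dk1_dk2(edges):
    """Compute the 1K distribution (degree -> number of nodes) and the 2K
    distribution ((degree_i, degree_j) -> number of edges joining a
    degree-i node to a degree-j node) of a simple undirected graph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    dk1 = Counter(deg.values())
    dk2 = Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)
    return dk1, dk2
```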

29 Nice Properties of dK-series
- Constructability: we can construct graphs having property P_d (dK-graphs)
- Inclusion: if a graph has property P_d, then it also has all properties P_i with i < d (dK-graphs are also iK-graphs)
- Convergence: the set of graphs having property P_n consists of only one element, G itself (dK-graphs converge to G)
- This guarantees that all (even not-yet-defined!) graph metrics can be captured by a sufficiently high d

30 Inclusion and dK-randomness
- [Figure: nested sets of 0K-, 1K-, 2K-, …, nK-random graphs, shrinking toward the given graph G as d increases]

31 How Do We Generate Graphs?
- A number of different approaches:
  - Stochastic
  - Pseudograph
  - Matching
  - Rewiring
- Some are extensible to d = 3, others are not
- Newer research proposed d = 2.5 to make generation tractable

32 Stochastic Approach
- Classical (Erdos-Renyi) random graphs are the 0K-random graphs of the stochastic approach
- Easily generalizable to any d:
  - Reproduce the expected value of the dK-distributions by connecting random d-plets of nodes with (conditional) probabilities extracted from G
- Best for theory, worst in practice
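For d = 0 the stochastic construction is just the classical G(n, p) model: every node pair is connected independently with a probability chosen to reproduce the expected average degree. A minimal sketch:

```python
import random

def gnp_random_graph(n, p, seed=0):
    """Erdos-Renyi G(n, p): the stochastic 0K-random construction.
    Connect each of the n*(n-1)/2 node pairs independently with
    probability p, reproducing only the expected average degree."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]
```

To match a target average degree k-bar on n nodes, one would set p = k-bar / (n - 1); everything beyond the average degree (clustering, correlations) is left to chance, which is why this approach is "worst in practice".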

33 Pseudograph Approach
- Reproduces the dK-distributions exactly
- Constructs pseudographs that are not necessarily connected
- Extended for d = 2
- Fails to generalize for d > 2: d-sized subgraphs start to overlap over edges at d = 3

34 Pseudograph Details
- 1K: dissolve the graph into a random soup of nodes (each node keeping its degree-many stubs), then crystallize it back
- 2K: dissolve the graph into a random soup of edges (each edge end labeled by its endpoint’s degree), then crystallize it back
- [Figure: nodes labeled with degrees k1, k2, k3, k4 and edge ends grouped by degree label]

35 dK-Randomizing Rewiring
- Can generate random graphs from the original
- Generalizes to any d
- But cannot generate the desired graph from the dK-distributions alone
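A minimal sketch of the d = 2 case: a double-edge swap restricted to degree-matched endpoints, so the joint degree distribution is preserved. This is one standard way to realize 2K-preserving rewiring; the paper's exact algorithm may differ in details:

```python
import random
from collections import Counter

def dk2_preserving_rewire(edges, swaps, seed=0):
    """Repeatedly pick two edges (u1,v1), (u2,v2) and rewire them to
    (u1,v2), (u2,v1).  The 2K distribution is preserved when the swapped
    endpoints v1 and v2 have equal degree; swaps that would create a
    self-loop or a duplicate edge are rejected, keeping the graph simple."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    rng = random.Random(seed)
    es = [tuple(e) for e in edges]
    present = {frozenset(e) for e in es}
    for _ in range(swaps):
        i, j = rng.sample(range(len(es)), 2)
        (u1, v1), (u2, v2) = es[i], es[j]
        if deg[v1] != deg[v2]:           # would change the 2K distribution
            continue
        a, b = (u1, v2), (u2, v1)
        if u1 == v2 or u2 == v1:         # self-loop
            continue
        if frozenset(a) in present or frozenset(b) in present:
            continue                     # duplicate edge
        present -= {frozenset(es[i]), frozenset(es[j])}
        es[i], es[j] = a, b
        present |= {frozenset(a), frozenset(b)}
    return es
```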

36 Algorithms
- All algorithms deliver consistent results for d = 0
- All algorithms, except the stochastic one(!), deliver consistent results for d = 1 and d = 2
- Both rewiring algorithms deliver consistent results for d = 3
- Eventual choice:
  - Use the pseudograph approach to construct 1K graphs
  - Use targeted rewiring to build higher-d graphs

37 Skitter Scalar Metrics
- [Table: scalar graph metrics (assortativity r, average distance d, and others) for 0K-, 1K-, 2K-, and 3K-generated graphs versus the real skitter topology; the values converge toward skitter’s as d grows, e.g. average distance 5.17 (0K), 3.11 (1K), 3.08 (2K), 3.09 (3K) vs. 3.12 (skitter)]

38 HOT Scalar Metrics
- [Table: the same scalar metrics for 0K–3K reconstructions of the HOT topology versus the real HOT graph, e.g. average distance 8.48 (0K), 4.41 (1K), 6.32 (2K), 6.55 (3K) vs. 6.81 (HOT)]

39 HOT 0K
- [Figure: the true HOT graph vs. its 0K reconstruction]

40 HOT 1K
- [Figure: the true HOT graph vs. its 1K reconstruction]

41 HOT 2K
- [Figure: the true HOT graph vs. its 2K reconstruction]

42 HOT 3K
- [Figure: the true HOT graph vs. its 3K reconstruction]

