B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar. Presented by Sunjun Kim, Jonathan di Costanzo, 2009/04/13
Motivation Netbed structure Validation and testing Netbed contribution Conclusion 2
Motivation 3
Researchers need a platform on which they can develop, debug, and evaluate their systems; one lab is not enough: it lacks resources (more computers are needed), it cannot reach scale in distance or in number of nodes, and large-scale experiments take a huge amount of time to develop 4
Simulation (ns): controlled, repeatable environment, but loses accuracy due to abstraction; Live networks (PlanetLab): achieve realism, but experiments are not easy to repeat; Emulation (Dummynet, nse): controlled packet loss and delay, but manual configuration is tedious 5
Derives from “Emulab Classic”, a universally available time- and space-shared network emulator; automatic configuration from ns scripts; adds virtual topologies for network experimentation; integrates simulation, emulation, and live-network experimentation with wide-area nodes in a single framework 6
Accuracy: provide an artifact-free environment; Universality: anyone can use any resource in whatever way they want; Conservative resource-allocation policy: no multiplexing (no virtual machines), so the resources of each node can be fully utilized 7
Local-area resources; distributed resources; simulated resources; emulated resources; WAN emulator (integrated yet); PlanetLab; ModelNet (still in progress) 8
Netbed structure 9
Resources 10
Local-area resources: 3 clusters (168 PCs in Utah, 48 in Kentucky, and 40 in Georgia); each node can be used as an edge node, router, traffic-shaping node, or traffic generator; an experiment has exclusive use of its machines; an OS is provided but is entirely replaceable 11
Distributed resources: also called wide-area resources; nodes at approximately 30 sites provide characteristic live-network conditions; very few nodes, shared between many users through the FreeBSD jail mechanism (a kind of virtual machine); non-root access 13
Simulated resources: based on nse (ns emulation); enables interaction with real traffic; provides scalability beyond physical resources; many simulated nodes can be multiplexed 15
Emulated resources: VLANs emulate wide-area links within a local-area network; Dummynet emulates queues and bandwidth limits and introduces delay and packet loss between physical nodes; the Dummynet nodes act as Ethernet bridges, transparent to experimental traffic 16
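As a rough illustration of what a traffic-shaping node does to each packet, the sketch below computes the one-way delay of a packet on an idle shaped link and decides whether to drop it. It is not Dummynet's implementation: the function names and the no-queueing simplification are assumptions made for this example.

```python
import random


def shaped_delay_ms(packet_bytes, bandwidth_bps, propagation_ms):
    """One-way delay on an idle shaped link (queueing is ignored here)."""
    # Time to push the packet onto the link at the configured bandwidth.
    transmission_ms = packet_bytes * 8 / bandwidth_bps * 1000
    # The configured propagation delay is added on top.
    return transmission_ms + propagation_ms


def is_dropped(loss_rate, rng):
    """Drop a packet with probability loss_rate (0.0 to 1.0)."""
    return rng.random() < loss_rate


# A 1500-byte packet on a 1.5 Mbps link with 20 ms of delay:
# 8 ms of transmission time plus 20 ms of propagation, about 28 ms.
delay = shaped_delay_ms(1500, 1_500_000, 20)
```

A real shaper also models a bounded queue, so delay grows under load and packets are dropped when the queue overflows.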
Life cycle 17
Experiment life cycle (figure), starting from a specification such as: $ns duplex-link $A $B 1.5Mbps 20ms; stages: Specification, Parsing, Global Resource Allocation, Node Self-Configuration, Experiment Control, Swap Out, Swap In 19
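To make the parsing stage concrete, here is a toy parser that turns one duplex-link line into a link record. It is only a sketch: Netbed's front end interprets real Tcl, whereas this example splits on whitespace, and the record layout and accepted unit suffixes are assumptions.

```python
import re

# Unit suffixes accepted by this sketch (an assumption, not the full ns syntax).
_BANDWIDTH_UNITS = {"bps": 1.0, "Kbps": 1e3, "Mbps": 1e6, "Gbps": 1e9}


def parse_duplex_link(line):
    """Parse '$ns duplex-link $A $B <bandwidth> <delay> [queue]' into a dict."""
    tokens = line.split()
    if len(tokens) < 6 or tokens[0] != "$ns" or tokens[1] != "duplex-link":
        raise ValueError(f"not a duplex-link line: {line!r}")
    node_a, node_b, bandwidth, delay = tokens[2:6]
    bw = re.fullmatch(r"(\d+(?:\.\d+)?)(bps|Kbps|Mbps|Gbps)", bandwidth)
    dl = re.fullmatch(r"(\d+(?:\.\d+)?)ms", delay)
    if bw is None or dl is None:
        raise ValueError(f"unsupported bandwidth or delay in: {line!r}")
    return {
        "a": node_a.lstrip("$"),
        "b": node_b.lstrip("$"),
        "bandwidth_bps": float(bw.group(1)) * _BANDWIDTH_UNITS[bw.group(2)],
        "delay_ms": float(dl.group(1)),
        # The queue type is optional in this sketch.
        "queue": tokens[6] if len(tokens) > 6 else None,
    }


link = parse_duplex_link("$ns duplex-link $A $B 1.5Mbps 20ms")
```

Records like this are the kind of abstraction that could be stored in the database and later bound to physical or simulated resources.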
Experiment creation: a project leader proposes a project on the web; Netbed staff accept or reject the project; all experiments are then accessible from the web. Experiment management: log in to the allocated nodes or to the usershost (file server); the file server sends the OS images and the home and project directories to the other nodes 20
Specification: experimenters write ns scripts in Tcl and can use as many functions and loops as they want; Netbed defines a small set of ns extensions; specific hardware can be chosen, as can simulation, emulation, or a real implementation; program objects can be defined using a Netbed-specific ns extension; a graphical UI can also be used 22
Parsing: the front-end Tcl/ns parser recognizes the subset of ns relevant to topology and traffic generation; the database stores an abstraction of everything about the experiment ▪ Fixed generated events ▪ Information about hardware, users, and experiments ▪ Procedures 23
Global resource allocation: binds abstractions from the database to physical or simulated entities; best effort to match the specification; on-demand allocation (no reservations); 2 different algorithms, because local and distributed nodes have different constraints: simulated annealing for local nodes, a genetic algorithm for distributed nodes 25
Over-reservation of the bottleneck: the inter-switch bandwidth is too small (2 Gbps), which goes against their conservative policy; dynamic changes to the topology are allowed (adding and removing nodes); consistent naming across instantiations through virtualization of IP addresses and host names 26
Node self-configuration: dynamic linking and loading from the DB gives each node its proper context (hostname, disk image, script to start the experiment); no persistent configuration state, only volatile memory on the node; if required, the current soft state can be stored in the DB as hard state (swap out / swap in) 27
Local nodes: all nodes are rebooted in parallel; each contacts the masterhost, which loads the kernel specified by the database; a second-level boot may be required. Distributed nodes: boot from a CD-ROM, then contact the masterhost; a new FreeBSD jail is instantiated; Testbed Master Control Client 28
Experiment control: Netbed supports dynamic experiment control; start, stop, and resume processes, traffic generators, and network monitors; signals between nodes; uses a publish/subscribe event routing system; static events are retrieved from the DB; dynamic events are also possible 29
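The publish/subscribe idea can be sketched in a few lines. This is a generic in-process event bus, not Netbed's event system: the class name, the topic strings, and the (time, topic, payload) event layout are assumptions chosen for illustration.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe router keyed by topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to receive every payload published on topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver payload to all subscribers of topic; return how many got it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


def replay(bus, static_events):
    """Publish pre-recorded (time, topic, payload) events in time order."""
    for _, topic, payload in sorted(static_events, key=lambda event: event[0]):
        bus.publish(topic, payload)


bus = EventBus()
received = []
bus.subscribe("traffic", received.append)
replay(bus, [(5.0, "traffic", "stop"), (1.0, "traffic", "start")])
```

Static events would come from a stored schedule, while dynamic events are simply extra publish calls made while the experiment runs.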
ns configuration files provide only high-level control; experimenters can also exercise low-level control. On local nodes: root privileges ▪ Kernel modification and access to raw sockets. On distributed nodes: jail-restricted root privileges ▪ Access to raw sockets with a specific IP address. Each local node has a separate network isolated from the experimental one, which makes it possible to control a node through a tunnel, as if we were on it, without interfering 30
Netbed tries to prevent idling, using 3 metrics: traffic, use of pseudo-terminal devices, and CPU load average; to be sure, a message is sent to the user, who can object manually; this is a challenge for distributed nodes with several jails. Netbed offers automated batch experiments for when no interaction is required; a batch experiment can wait for resources to become available 31
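A minimal sketch of an idle check built on the three metrics named above. The thresholds, field names, and the rule that every metric must look idle are assumptions for illustration, not Netbed's actual policy.

```python
from dataclasses import dataclass


@dataclass
class NodeActivity:
    packets_per_min: float  # experimental network traffic
    tty_idle_min: float     # minutes since the last pseudo-terminal activity
    load_avg: float         # CPU load average


def node_is_idle(activity, max_packets=10.0, min_tty_idle=30.0, max_load=0.1):
    """A node counts as idle only if all three metrics look idle."""
    return (
        activity.packets_per_min <= max_packets
        and activity.tty_idle_min >= min_tty_idle
        and activity.load_avg <= max_load
    )


def experiment_is_idle(nodes):
    """An experiment is a swap-out candidate when every node is idle."""
    return all(node_is_idle(node) for node in nodes)


quiet = NodeActivity(packets_per_min=0.0, tty_idle_min=120.0, load_avg=0.01)
busy = NodeActivity(packets_per_min=5000.0, tty_idle_min=120.0, load_avg=0.9)
```

Because a heuristic like this can be wrong, asking the user before swapping out, as the slide describes, is a sensible safeguard.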
Validation and testing 32
1st row: emulation overhead; Dummynet gives better results than nse 33
The authors expect better results from future improvements to nse 34
5 nodes communicate over 10 links; evaluation of a derivative of DOOM; the goal is to send 30 tics/sec 35
Challenges: Netbed depends on physical artifacts (it cannot be cloned); it should evaluate arbitrary programs; it must run continuously. Minibed: 8 separate Netbed nodes; test mode: prevents hardware modifications; full-test mode: provides isolated hardware 36
Netbed contribution 37
All-in-one set of tools; automated and efficient realization of virtual topologies; efficient use of resources through time-sharing and space-sharing; increased fault tolerance (resource virtualization) 38
Examples: the “dumbbell” network ▪ 3 h 15 min reduced to 3 min. Improved utilization of a scarce and expensive infrastructure, measured over 12 months on 168 PCs in Utah ▪ Without time-sharing (swapping): 1064 nodes would have been needed ▪ Without space-sharing (isolation): 19.1 years would have been needed. Virtualization of names and IP addresses ▪ No problems with swapping 39
Experiment creation and swapping, time broken down by step: mapping, reservation, reboot issuing, reboot, miscellaneous; booting a custom disk image doubles the boot time 40
Mapping local resources: assign; matches the user’s requirements; based on simulated annealing; tries to minimize the number of switches and the inter-switch bandwidth used; takes less than 13 seconds 41
Mapping local resources: assign 42
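The simulated-annealing idea behind this kind of mapping can be sketched on a toy problem: place virtual nodes on physical PCs so that as few virtual links as possible cross between switches. This is not the assign program itself; the cost function, the swap move, and the linear cooling schedule are simplifying assumptions.

```python
import math
import random


def inter_switch_links(mapping, links, switch_of):
    """Count virtual links whose endpoints sit on different switches."""
    return sum(1 for a, b in links if switch_of[mapping[a]] != switch_of[mapping[b]])


def anneal_mapping(virtual_nodes, links, switch_of, steps=2000, start_temp=2.0, seed=0):
    """Map virtual nodes to distinct physical nodes, minimizing crossing links."""
    rng = random.Random(seed)
    slots = list(switch_of)  # physical nodes; the first len(virtual_nodes) are used
    if len(slots) < len(virtual_nodes):
        raise ValueError("not enough physical nodes")
    rng.shuffle(slots)

    def as_mapping(order):
        return dict(zip(virtual_nodes, order))

    cost = inter_switch_links(as_mapping(slots), links, switch_of)
    best, best_cost = list(slots), cost
    for step in range(steps):
        temp = start_temp * (1 - step / steps) + 1e-9  # linear cooling
        i = rng.randrange(len(virtual_nodes))  # a used slot
        j = rng.randrange(len(slots))          # any slot, used or free
        slots[i], slots[j] = slots[j], slots[i]
        new_cost = inter_switch_links(as_mapping(slots), links, switch_of)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(slots), cost
        else:
            slots[i], slots[j] = slots[j], slots[i]  # undo the rejected move
    return as_mapping(best), best_cost


switch_of = {"pc1": "sw1", "pc2": "sw1", "pc3": "sw2", "pc4": "sw2"}
mapping, cost = anneal_mapping(["a", "b", "c", "d"], [("a", "b"), ("c", "d")], switch_of)
```

A production mapper would also weigh link bandwidth and node types, which this sketch leaves out.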
Mapping distributed resources: wanassign Different constraints ▪ Fully connected via the internet ▪ “Last mile”: type instead of topology ▪ Specific topologies may be guaranteed by requesting particular network characteristics (bandwidth, latency & loss) ▪ Based on a genetic algorithm 43
Mapping distributed resources: wanassign; 16 nodes and 100 edges: about 1 s; 256 nodes and 40 edges per node: 10 min to 2 h 44
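A genetic algorithm for this kind of matching can be sketched as follows: each individual is an ordering of candidate sites, and its fitness is how far the measured latencies are from the requested ones. This is not wanassign; the encoding, the crossover and mutation operators, the site names, and the latency figures are all assumptions for illustration, and the sketch needs at least two sites.

```python
import random


def mismatch(order, virtual_links, latency_ms):
    """Total latency error; order[i] is the site chosen for virtual node i."""
    return sum(abs(want - latency_ms[order[a]][order[b]]) for a, b, want in virtual_links)


def crossover(mother, father, rng):
    """Keep a prefix of one parent, fill the rest in the other parent's order."""
    cut = rng.randrange(1, len(mother))
    head = mother[:cut]
    return head + [site for site in father if site not in head]


def mutate(order, rng):
    """Swap two positions in place."""
    i, j = rng.randrange(len(order)), rng.randrange(len(order))
    order[i], order[j] = order[j], order[i]


def evolve(sites, virtual_links, latency_ms, generations=50, size=20, seed=0):
    rng = random.Random(seed)
    population = [rng.sample(sites, len(sites)) for _ in range(size)]

    def score(order):
        return mismatch(order, virtual_links, latency_ms)

    for _ in range(generations):
        population.sort(key=score)
        parents = population[: size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            mutate(child, rng)
            children.append(child)
        population = parents + children
    best = min(population, key=score)
    return best, score(best)


# Hypothetical measured latencies (ms) between three hypothetical sites.
latency = {
    "utah": {"utah": 0, "mit": 40, "ucb": 15},
    "mit": {"utah": 40, "mit": 0, "ucb": 70},
    "ucb": {"utah": 15, "mit": 70, "ucb": 0},
}
# One virtual link between virtual nodes 0 and 1, requested at 40 ms.
best, error = evolve(["utah", "mit", "ucb"], [(0, 1, 40)], latency)
```

Requesting characteristics rather than a topology fits this encoding, since fitness depends only on measured link properties.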
Disk reloading: 2 possibilities ▪ Complete disk image loading ▪ Incremental synchronization (hash tables on files or blocks). Good ▪ Faster (in their specific case) ▪ No corruption. Bad ▪ Wastes time when similar images are needed repeatedly ▪ Pace reloading of freed nodes (reserved for 1 user) 45
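The incremental-synchronization alternative can be illustrated with per-block hashes: hash both the current disk and the target image, and rewrite only the blocks that differ. This is a generic sketch; the choice of SHA-256 and the function names are assumptions.

```python
import hashlib


def block_hashes(data, block_size=4096):
    """Hash each fixed-size block of a byte string."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).digest()
        for offset in range(0, len(data), block_size)
    ]


def blocks_to_rewrite(disk, image, block_size=4096):
    """Indices of image blocks that are missing from or different on the disk."""
    old = block_hashes(disk, block_size)
    new = block_hashes(image, block_size)
    return [i for i, digest in enumerate(new) if i >= len(old) or old[i] != digest]


# Two 8-byte blocks; only the second one differs.
changed = blocks_to_rewrite(b"A" * 8 + b"B" * 8, b"A" * 8 + b"C" * 8, block_size=8)
```

The trade-off is that the whole disk has to be read and hashed first, which is one reason a full reload can be faster in some setups.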
Disk reloading with Frisbee; performance techniques: ▪ Uses a domain-specific algorithm to skip unused blocks ▪ Delivers images via a custom reliable multicast protocol. Result: 117 s for 80 nodes, writing 550 MB instead of 3 GB 46
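Skipping unused blocks can be sketched with an allocation map: only blocks marked as in use are emitted for transfer. This is not Frisbee's algorithm, which understands filesystem layouts; the allocation list and function name are hypothetical.

```python
def used_chunks(image, allocated, block_size):
    """Yield (offset, data) for allocated blocks only, skipping free ones."""
    for index, in_use in enumerate(allocated):
        if in_use:
            start = index * block_size
            yield start, image[start:start + block_size]


# Three 4-byte blocks; the middle one is free and is never sent or written.
chunks = list(used_chunks(b"aaaabbbbcccc", [True, False, True], block_size=4))
```

Writing only allocated blocks is what lets a 3 GB disk be restored by writing roughly 550 MB in the figures quoted on the slide.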
Scaling of simulated resources: simulated nodes are multiplexed on 1 physical node ▪ Must keep up with real time given the event rate implied by the user’s specification. Test of a live TCP flow at 2Mb CBR ▪ 850 MHz PC with 2Mb CBR UDP background traffic / 50 ms ▪ Able to handle 150 links for 300 nodes ▪ Routing is a problem in very complex topologies 47
Batch experiments can be programmed to vary one parameter at a time; example: the Armada file system from Oldfield & Kotz; 7 bandwidths x 5 latencies x 3 application settings x 4 configs of 20 nodes = 420 tests in 30 hrs (about 4.3 min per experiment) 48
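The size of this sweep and the per-experiment time can be checked with a short script. The parameter values below are placeholders; only the counts (7, 5, 3, 4) and the 30-hour total come from the slide.

```python
from itertools import product

# Placeholder values: only the number of options per dimension matters here.
bandwidths = [f"bandwidth-{i}" for i in range(7)]
latencies = [f"latency-{i}" for i in range(5)]
app_settings = [f"setting-{i}" for i in range(3)]
configs = [f"config-{i}" for i in range(4)]

# Every combination is one batch experiment.
runs = list(product(bandwidths, latencies, app_settings, configs))

total_hours = 30
minutes_per_run = total_hours * 60 / len(runs)

print(len(runs))                  # 420
print(round(minutes_per_run, 1))  # 4.3
```

A batch system would submit each tuple in runs as one experiment and wait for free nodes between runs.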
Conclusion 49
Netbed covers 3 test environments (simulation, emulation, and live networks); reuses ns scripts; sets up the test environment quickly; virtualization techniques provide an artifact-free environment; enables qualitatively new experimental techniques 50
Reliability/fault tolerance; distributed debugging: checkpoint/rollback; security “Petri dish” 51