THE OVERLAY NETWORKS WITH VIRTUAL RESOURCES NEED NEW FORMS OF MONITORING
Jiri Navratil, Tomas Kosnar, Jan Furman, Tomas Mrazek, Vojtech Krmicek (CESNET)
Main goal: Create a playground for network engineers and other scientists who need experimental networks for testing new designs and proving new concepts and ideas needed for the Future Internet.
Main feature: Each group of users should play in FEDERICA independently, fully separated from other players.
Motto: You have to know well the system that you want to monitor, measure, and manage.
From Federica.eu as ibm.com or hp.com
Virtualization helps us make identical copies
How can we achieve this?
Typical site infrastructure From Core POP Non Core POP Via virtual networks Via virtual nodes
a/ Multiply nodes
b/ Allocate new nodes to slices (users' playgrounds)
c/ Create new networks
Slice-centric (users), node-centric (node managers), physical-centric (FEDERICA NOC)
Physical level - Basic Orientation Map (based on SNMP)
Tools available from SNMP: last 24 hours, week, months
FEDERICA host reachability (RTT)
Practical slicing: making a new networking substrate in selected Vnodes
Example for two slices with different network structure:
Star network: A-D, A-C, A-B (VLAN 101, 102, 103)
Circle network: A-D-B-C-A (VLAN 111, 112, 113, 114)
[Figure: FEDERICA core connecting Site A, Site B, Site C and Site D over 802.1q uplinks, with the VLANs of both slices]
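The star and circle examples can be written down as a small link-to-VLAN table. The following is only a sketch: the function names are hypothetical, and it assumes that the VLAN IDs are paired with the links in the order in which the slide lists them, which the slide itself does not confirm.

```python
from itertools import count


def assign_vlans(links, first_vlan):
    """Give each link of a slice its own VLAN ID, numbered from first_vlan.

    Assumption: VLAN IDs follow the order in which the links are listed.
    """
    vlan_ids = count(first_vlan)
    return {link: next(vlan_ids) for link in links}


def path_to_links(path):
    """Turn a node path such as "A-D-B-C-A" into its consecutive links."""
    nodes = path.split("-")
    return list(zip(nodes, nodes[1:]))


def vlans_at_site(site, *slices):
    """Sorted VLAN IDs that touch one site across the given slices."""
    return sorted(
        vlan
        for slice_links in slices
        for link, vlan in slice_links.items()
        if site in link
    )


# Star network: A-D, A-C, A-B (VLAN 101, 102, 103)
star = assign_vlans([("A", "D"), ("A", "C"), ("A", "B")], first_vlan=101)
# Circle network: A-D-B-C-A (VLAN 111, 112, 113, 114)
circle = assign_vlans(path_to_links("A-D-B-C-A"), first_vlan=111)

print(vlans_at_site("A", star, circle))
```

Under that ordering assumption, the table shows which VLANs the trunk at each site would have to carry for the two slices.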
Variability in FEDERICA sites: location of MPs (measuring points / edges)
Variant 1, tagging ends in ext. switches: the 802.1Q trunk with the VLANs ends at the ext. switch (Juniper MX/EX); the Vnode (ESXi server) has vmnic physical Eth. adapters, VSwitch-1, VSwitch-2, VSwitch-3 and VNICs (max 4 per VM). This solution is limited by the number of VMNICs.
Variant 2, tagging ends in vSwitches: the 802.1Q trunk with the VLANs reaches the Vnode (ESXi server) and VSwitch-1 inside the VMware virtual infrastructure. This solution has no practical limit (due to VMware variability).
Measuring points: VMware MPs (in the VMware virtual infrastructure) and SNMP MP.
[Figure: networking virtual infrastructure of both variants, with Slice-1, Slice-3, Slice-8, the uplink and the mgmt. network]
VMware networking infrastructure opens up high variability
Configuration example when tagging ends on external switches (slices appear and disappear: mapping problem)
Star network: slice 3. Circle network: slice 8.
Probably the simplest solution, but it allows only (N-1) slices on a Vnode if we don't allow port sharing (in our case, 7 slices on a Vnode).
[Figure: ESXi servers with vmnic ports and ext. switches at Site A, Site B, Site C and Site D, connected through the FEDERICA core; measuring points: SNMP, VMware]
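The (N-1) limit and the mapping problem can be illustrated with a minimal sketch. It assumes that N is the number of physical Ethernet ports of the Vnode and that one of them is reserved (for example for management); the slide gives only the result, 7 slices. The names `max_slices` and `PortMap` are hypothetical.

```python
def max_slices(physical_ports, reserved=1):
    """Slices a Vnode can host when every slice needs a dedicated port."""
    return max(physical_ports - reserved, 0)


class PortMap:
    """Tracks which physical port is dedicated to which slice."""

    def __init__(self, ports):
        self.free = list(ports)   # ports not used by any slice
        self.by_slice = {}        # slice name -> dedicated port

    def attach(self, slice_name):
        """Dedicate the next free port to a slice (no port sharing)."""
        if slice_name in self.by_slice:
            return self.by_slice[slice_name]
        if not self.free:
            raise RuntimeError("no free port: slice limit reached")
        port = self.free.pop(0)
        self.by_slice[slice_name] = port
        return port

    def detach(self, slice_name):
        """Release the port of a slice that has disappeared."""
        self.free.append(self.by_slice.pop(slice_name))


print(max_slices(8))  # 7, matching the 7 slices per Vnode on the slide

# Hypothetical port names; 7 usable ports on the Vnode.
ports = PortMap([f"vmnic{i}" for i in range(1, 8)])
ports.attach("slice-3")
ports.attach("slice-8")
```

Because slices appear and disappear, the port-to-slice table changes over time, so a monitoring tool has to read the current mapping rather than assume a fixed one.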
The simplest slice example (CESNET SLICE)
Internet and SLICE: two independent worlds lightly coupled via AS!
[Figure: AS (Poz) VN3 (Mil) VN1 (Prg) VN2 (Erl); Internet; private IP addr. space; management network]
Tools available from VMware
Tools available from VMware (good for managers, not users)
Tools available from VMware: resxtop
Our new tools (node-centric mode)
More details (node-centric mode)
Monitoring tools (slice-centric mode)
We still need more work on it. We need:
- configuration data from the central database
- more detailed coordination with the NOC
- automatic links between admin tools
The experiments on the SLICE: SNMP - VMware match
SNMP, G3 Monitor
[Figure: (CESNET SLICE) AS (Poz) VN3 (Mil) VN1 (Prg) VN2 (Erl); Internet; private IP network; management network]
The experiments on the SLICE: Correlation of CPU load versus network
App1: scp of a 4G file
App2: data collection with processing
Port - slice mapping? (from DB?)
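One simple way to quantify how CPU load and network traffic move together is the Pearson correlation coefficient over two series sampled at the same times. This is a sketch of that general technique, not necessarily the method used in the experiment; the sample values are synthetic and are not measurements from the slice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (dev_x * dev_y)


# Synthetic illustration only: CPU load (%) and traffic (Mbit/s)
# sampled at the same instants on one host.
cpu_load = [10, 20, 30, 40]
traffic = [100, 210, 290, 400]
print(round(pearson(cpu_load, traffic), 3))
```

A value close to 1 would suggest that the application is network-bound in step with its CPU use (as one might expect for an scp transfer), while a value near 0 would suggest processing that is largely independent of the traffic.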
The experiments on the SLICE: Resource sharing (CPU)
CPU utilization on host. Capacity: 8x20 K, units: ms. Multi-line versus STACK.
CPU distribution on host:
- SMP random
- rules for dedication to VM
The experiments on the SLICE: Resource sharing (CPUs)
5 VMs, 1 dominates, 4 CPUs active: total CPU activity ~ 0.13%
9 VMs, 1 dominates, 4 CPUs active: total CPU activity ~ 0.14%
7 VMs, 1 dominates, 4 CPUs active: total CPU activity ~ 0.05%
The experiments on the SLICE: Resource capacity test (CPU)
Legend: + utilization, - idle
Integrated value of all CPUs on host
Individual CPUs on host
Resource sharing: comparison with PlanetLab
Total # slices running on this node: 73
Actively using CPU: 9 (only one CPU)
Actively using network: 18
Conclusion: Most slices run randomly. It allows stronger overallocation!
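The PlanetLab node figures can be turned into shares of active slices with a few lines of arithmetic. This is only a worked illustration of the numbers on the slide; the function name is hypothetical.

```python
def active_share(active, total):
    """Fraction of the slices on a node that are actively using a resource."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


total_slices = 73   # slices running on the node
cpu_active = 9      # slices actively using CPU
net_active = 18     # slices actively using the network

print(f"CPU:     {active_share(cpu_active, total_slices):.1%}")   # 12.3%
print(f"Network: {active_share(net_active, total_slices):.1%}")   # 24.7%
```

Roughly one slice in eight uses the CPU and one in four uses the network at a given time, which is the basis for the conclusion that stronger overallocation is possible.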
FEDERICA - CESNET POP External PROBE Monitoring (more details in 4D section - Jiri Novotny and in the poster)
More details in the POSTER: …..
Thank you. Any questions?