Published by Jack Black. Modified over 8 years ago.
Control Update: Focus on PlanetLab Integration and Booting
Fred Kuhns, fredk@arl.wustl.edu
Applied Research Laboratory, Washington University in St. Louis

Washington University in St. Louis, Fred Kuhns - 6/4/2016

Documents
Control documentation: http://www.arl.wustl.edu/projects/techX/ppt/
  – This presentation: http://www.arl.wustl.edu/projects/techX/ppt/ControlUpdate.ppt
  – SRM interface: http://www.arl.wustl.edu/projects/techX/ppt/srm.ppt
  – RMP interface: http://www.arl.wustl.edu/projects/techX/ppt/rmp.ppt
  – SCD interface (ingress, egress and npe): http://www.arl.wustl.edu/projects/techX/ppt/scd.ppt
Datapath documentation: http://www.arl.wustl.edu/projects/techX/design/SPP/
  – NAT overview (Interface??): http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
  – FlowStats (Interface??): http://www.arl.wustl.edu/projects/techX/design/SPP/FlowStats_Control.ppt

Traditional View of a PlanetLab Node
[Figure: a general purpose PC hardware platform (NIC, CPU, DRAM, Disk) running a Virtual Machine Monitor (VMM) that hosts the Node Manager (“root” VM), System Services (VMs) and VM 1 ... VM N; the node host.domain (A.B.C.D) connects to the Internet.]
PlanetLab node: site, owner, model, ssh_host_key, groups
  – Host = XXX, Domain = YYY
  – IPAddress = A.B.C.D
  – Linux OS, vserver
System services
  – pl_netflow
  – sirius: brokerage service
  – stork: environmental service
  – CoMon: monitoring and discovery
Resource model
  – focused on PCs with single device instances (CPU, NIC)
  – standard Linux/UNIX tools to measure utilization
  – homogeneous environment with a single vmm to manage all vm instances on a platform
  – local node manager interface through the loopback interface
User requests a slice on a set of distributed nodes
  – assigned a VM instance on each node
  – Fedora Linux environment
  – per-slice flowstats

An SPP Node
SPP/PlanetLab node: site, owner, model, ssh_host_key, groups; Host = XXX, Domain = YYY, IPAddress = A.B.C.D
[Figure: block diagram of an SPP node. Recoverable labels:]
  – GPE1 and GPE2: each a general purpose PC hardware platform (CPU, DRAM, Disk, NIC with data and control ports) running a Virtual Machine Monitor (VMM) with *Node Manager, *System Services and user VMs (VM 1 ... VM X-1 on GPE1, VM X ... VM N on GPE2)
  – NPEs: per-VM fast paths (vm 1: fast path 1, vm 1: fast path 2, vm X: fast path 1, ...; vm X-1: fast path 1, vm Y: fast path 2, vm N: fast path 1, ...)
  – Line Card: External Interface, FwdDB/Filters, datapath, NAT
  – HUB: 1GbE Control (Base); 10GbE Data (fabric)
  – CP: *Node Manager, *System Services
  – spp_host.domain, A.B.C.D, Internet

Challenges
Provide the standard PlanetLab slice environment
  – configure and boot individual GPEs with standard PlanetLab software, supporting the standard operational environment
Support standard interfaces
  – boot manager
  – node manager's internal and external interfaces
  – resource monitoring
Create an interface for allocating and managing fast paths
  – allocate/free NPE resources
  – manage meta-interface mappings to externally visible IP address and UDP port
  – slice control of allocated fast-path resources

[Figure: SPP node component and interface diagram. Recoverable labels, in the order they appear:]
  – CP: xmlrpc PLCAPI proxy, FlowStats, PXE, dhcpd, tftp, httpd, user info/home dirs, /var/www/ boot files, node DB, Resource DB, Slice DB, nodeconf.xml, SLM, sliceDB, flowDB, System Resource Manager (SRM) and node manager (GNM), sshd*, ntpd
  – Boot Files: dhcpd.conf, ethers
  – tftpboot: bootcd.img, overlay_gpeX.img, pxelinux.0, pxelinux.cfg (C0A82031, C0A82041)
  – overlay.img: plnode.txt, plc_config, ethers, spp_conf.txt, spp_netinit.py, server*, certs
  – GPE: interfaces, vnet, NMP, RMP, user slivers, pl_netflow
  – NPE: NPU-A, NPU-B, SPI, TCAM, SCD, xscale, PCI, ntp
  – LC: NPU-A, NPU-B, SPI, TCAM, ingress SCD, NATD, xscale, egress SCD, xscale, 10x1G/1x10G RTM, IP 1, IP 2, ... IP N, External Interfaces
  – Hub: Base Ethernet Switch (1Gbps, control), Fabric Ethernet Switch (10Gbps, data path)
  – Shelf manager, I2C (IPMI), ntp

Software Components
Control Processor (CP):
  – Boot and Configuration Control (BCC): node configuration, management and local state management (DB)
      httpd, dhcpd, tftp and PXE server for GPE and NPE boards; maintains config files
      Boot CD and distribution file management (overlay images, RPM and tar files) for GPEs and CP
      PLCAPI proxy (plc_api) and system-level BootManager (part of gnm)
  – System Resource Manager (SRM): centralized resource management
      responsible for all resource allocation decisions and for maintaining dynamic system state
      delegates local operations to individual board-level managers
  – System Node Manager (SNM, aka GNM): “top-half” of the PlanetLab node manager
  – Slice login manager (SLM) and ssh forwarding (modified sshd) -- Ritun
  – Flow Statistics (FS): aggregates pl_netflow data and translates NAT records
  – Set default (static) routes in line card
  – What about dynamic route management (BGP/OSPF/RIP)? For now assume a single next-hop router for all routes.
General purpose Processing Element (GPE):
  – Local Boot Manager (LBM): modified PlanetLab BootManager running on the GPEs
  – Resource Manager Proxy (RMP)
  – Node Manager Proxy (NMP): lower half of PlanetLab’s node manager
Network Processor Element (NPE):
  – Substrate Control Daemon (SCD): manages all NPE resources and provides mappings from slice to global name spaces
  – Kernel module to read/write memory locations (wumod)
  – Command interpreter for configuring NPU memory (wucmd)
Line Card, Ingress:
  – Substrate Control Daemon (scd_ingress)
      implements the interface to srm
      manages tcam access for ingress and egress
      reads/writes scratch rings for NATD
  – Network Address Translation daemon (NATD), port only
Line Card, Egress:
  – Substrate Control Daemon (scd_egress)
      implements the interface to srm
      reads/writes scratch rings and communicates with the FS and NATD

Boot and Configuration Control
Read node configuration DB: currently this is an xml file
  – Allocate IP subnets and addresses for all boards
  – Assign external IP addresses to GPE fabric interfaces with default VLAN id
  – Create per-GPE configuration DB: currently this is written to files
Create dhcp configuration file and start dhcpd, httpd and system sshd
  – assigns control IP subnets and addresses; assigns internal substrate IP subnet on fabric Ethernet
Start PLCAPI proxy (plc_api) server and system node manager
  – read node DB for initialization data: currently uses static configuration data and/or re-reads the xml file
  – create GPE overlay images: currently this is done manually
  – currently the SNM is split between the plc_api server and srm, because there is no DB and we did not want to implement a transaction-like interface for the snm
  – begin periodic slice updates and gpe assignments, maintain DB
Start SRM and bring up boards as they “report in”
  – Initialize Line Card to forward “default” traffic (i.e. ssh and icmp) to CP
  – Initialize Hub: base and fabric switches; initialize any switches not within the chassis
Start SLM and the ssh daemon
  – Remove the SLM configuration file for slices, since it may contain old mappings

Booting SPP1: Example Configuration
[Figure: network diagram of SPP1 (drn05.arl.wustl.edu). Recoverable labels:]
CP: srm, dhcpd, httpd, plc_api, gnm*, fs; f1/0, b1; eth0, eth0.2 (drn05.arl.wustl.edu, vlan 2, noarp), eth2
  – cp_ctrl 192.168.32.1/20, cp_data = 171.16.1.1/26
  – /etc/: dhcpd.conf, ethers
  – /tftpboot/: ramdisk.gz, zImage.ppm10, bootcd.img, overlay_gpe1.img, overlay_gpe2.img, pxelinux.0, pxelinux.cfg/ (C0A82031, C0A82041)
  – /var/www/html/boot/: index.html, bootmanager.sh, bootstrapfs-planetlab-i386.tar.bz2
GPE1 (Slot 4): rmp, nm; f1/0, f1/1, b1; eth0, eth0.2 (drn05.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe1_ctrl = 192.168.32.65/20, gbe1_data = 171.16.1.3/26, gpe1_int = 172.16.1.65/26
GPE2 (Slot 3): rmp, nm; f1/0, f1/1, b1; eth0, eth0.2 (drn05.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe2_ctrl = 192.168.32.49/20, gbe2_data = 171.16.1.4/26, gpe2_int = 172.16.1.66/26
Line Card (Slot 6): scd, natd; f1/0; Ingress XScale (b1a, eth0), Egress XScale (b1b, eth0)
  – lc_b1a = 192.168.32.97/20, lc_b1b = 192.168.32.98/20, lc1_data = 171.16.1.6/26
NPE (Slot 5): scd; f1/0; XScale A (b1a, eth0), XScale B (b1b, eth0)
  – lc_b1a = 192.168.32.81/20, lc_b1b = 192.168.32.82/20, lc1_data = 171.16.1.5/26
Hub: 192.168.32.17; b2, f2/0, f2/1
Ebony: 128.252.153.31; eth0, eth0:0, eth1:0, 192.168.32.2, eth2.2; IP routing; proxy arp for drn05; 128.252.153.78
The ARL network: drn05.arl.wustl.edu 128.252.153.209; myPLC drn06.arl.wustl.edu (vlan 2)

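The two file names listed under pxelinux.cfg/ (C0A82031 and C0A82041) follow the usual PXELINUX convention of naming a per-client config file after the client's IPv4 address written as eight upper-case hex digits. A minimal Python sketch of that mapping follows; the function names are illustrative and not part of the SPP software.

```python
import ipaddress


def pxelinux_cfg_name(ip: str) -> str:
    """Return the 8-digit upper-case hex form of an IPv4 address."""
    return format(int(ipaddress.IPv4Address(ip)), "08X")


def ip_from_cfg_name(name: str) -> str:
    """Inverse mapping: hex config file name back to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(name, 16)))


if __name__ == "__main__":
    # Control addresses of the two GPEs in the SPP1 example.
    for ip in ("192.168.32.49", "192.168.32.65"):
        print(ip, "->", pxelinux_cfg_name(ip))
```

With the SPP1 control addresses, C0A82031 corresponds to gpe2_ctrl (192.168.32.49) and C0A82041 to gpe1_ctrl (192.168.32.65).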
Example Configuration, SPP3
[Figure: network diagram of SPP3 (spp3.arl.wustl.edu). Recoverable labels:]
CP: srm, dhcpd, httpd, plc_api, gnm*, fs; f1/0, b1; eth0, eth0.2 (spp3.arl.wustl.edu, vlan 2, noarp), eth2
  – cp_ctrl 192.168.0.1/20, cp_data = 171.16.1.1/26
  – /etc/: dhcpd.conf, ethers
  – /tftpboot/: ramdisk.gz, zImage.ppm10, bootcd.img, overlay_gpe1.img, overlay_gpe2.img, pxelinux.0, pxelinux.cfg/ (C0A82031, C0A82041)
  – /var/www/html/boot/: index.html, bootmanager.sh, bootstrapfs-planetlab-i386.tar.bz2
GPE1 (Slot 3): rmp, nm; f1/0, f1/1, b1; eth0, eth0.2 (spp3.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe1_ctrl = 192.168.0.49/20, gbe1_data = 171.16.1.3/26, gpe1_int = 172.16.1.65/26
GPE2 (Slot 4): rmp, nm; f1/0, f1/1, b1; eth0, eth0.2 (spp3.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe2_ctrl = 192.168.0.65/20, gbe2_data = 171.16.1.4/26, gpe2_int = 172.16.1.66/26
Line Card (Slot 6): scd, natd; f1/0; Ingress XScale (b1a, eth0), Egress XScale (b1b, eth0)
  – lc_b1a = 192.168.0.97/20, lc_b1b = 192.168.0.98/20, lc1_data = 171.16.1.6/26
NPE (Slot 5): scd; f1/0; XScale A (b1a, eth0), XScale B (b1b, eth0)
  – lc_b1a = 192.168.0.81/20, lc_b1b = 192.168.0.82/20, lc1_data = 171.16.1.5/26
Hub: 192.168.0.17; b2, f2/0, f2/1
cp5.arl.wustl.edu: 128.252.153.39; eth0, eth0:0, eth1:0, 192.168.0.2, eth2.2; IP routing; proxy arp for drn05; 128.252.153.34
The ARL network: spp3.arl.wustl.edu 128.252.153.3; myPLC drn06.arl.wustl.edu (vlan 2)

bootcd file system
/
  bin/ dev/ home/ lib/ ...
  etc/init.d/: pl_boot, pl_netinit, pl_validateconf, pl_sysinit, pl_hwinit, ...
  root/ selinux/ sys/ usr/
pl_boot: modified to not use ssl or pgp to retrieve the BootManager script from the cp
pl_netinit: sets boot_server to reference the cp
pl_validateconf: added SPP-specific variables

overlay image
[Directory listing; nesting is as best recoverable from the flattened figure.]
/
  etc/{issue, passwd}
  kargs.txt
  pl_version
  usr/
    isolinux
    boot/: spp_netinit.py, ethers, spp_conf.txt, boot_server, boot_server_port, boot_server_path, plnode.txt, cacert.pem, plc_config, pubring.gpg
    backup/: boot_server, boot_server_path, boot_server_port, cacert.pem, pubring.gpg
    bootme/: BOOTPORT, BOOTSERVER, BOOTSERVER_IP, ID, cacert/drn06.arl.wustl.edu/cacert.pem
Changed to list the cp as boot server and the port as 81
Added SPP initialization script and config files
Changed plnode.txt to list this GPE's mac address for the control interface

GPE Configuration file: spp_conf.txt
  # Config name: spp1.txt
  [ nserv ]
  ctrl_ipaddr=192.168.32.1
  ctrl_hwaddr=00:1E:C9:FE:76:22
  data_ipaddr=172.16.1.1
  data_hwaddr=00:1E:C9:FE:76:23
  [ domain ]
  hostname=drn05
  domain=arl.wustl.edu
  dns1=128.252.133.45
  dns2=128.252.120.1
  gateway=128.252.153.31
  [ hosts ]
  nserv_f1.0=172.16.1.1
  nserv=192.168.32.1
  nserv_gbl=192.168.48.1
  shmgr=192.168.48.2
  hub=192.168.32.17
  hub1_f1.0=172.16.1.2
  hub1_m.0=192.168.48.17
  gpe1_f1.0=172.16.1.3
  gpe1_f1.1=172.16.1.65
  gpe1_b1.0=192.168.32.65
  gpe2_f1.0=172.16.1.4
  gpe2_f1.1=172.16.1.66
  gpe2_b1.0=192.168.32.49
  npe1_f1.0=172.16.1.5
  npe1_b1.0=192.168.32.81
  npe1_m.0=192.168.48.81
  npe1_b1.1=192.168.32.82
  lc_f1.0=172.16.1.6
  lc_b1.0=192.168.32.97
  lc_m.0=192.168.48.97
  lc_b1.1=192.168.32.98
  drn05.arl.wustl.edu=128.252.153.209
  [ iface ]
  __name__=eth0
  dev=eth0
  name=gpe1_f1.0
  hwaddr=00:0e:0c:85:e4:40
  type=data
  lanid=fabric1
  port=0
  vlan=0
  ipaddr=172.16.1.3
  ipnet=172.16.1.0
  ipbcast=172.16.1.63
  ipmask=255.255.255.192
  arp=no
  enable=yes
  [ iface ]
  __name__=eth0.2
  dev=eth0.2
  name=gpe1_f1.0
  hwaddr=00:0e:0c:85:e4:40
  vlan=2
  type=data
  lanid=fabric1
  port=0
  ipaddr=128.252.153.209
  ipnet=128.252.0.0
  ipbcast=128.252.255.255
  ipmask=255.255.0.0
  arp=no
  enable=yes
  [ iface ]
  __name__=eth1
  dev=eth1
  name=gpe1_f1.1
  hwaddr=00:0e:0c:85:e4:42
  type=data
  lanid=fabric1
  port=1
  vlan=0
  ipaddr=172.16.1.65
  ipnet=172.16.1.64
  ipbcast=172.16.1.127
  ipmask=255.255.255.192
  arp=no
  enable=yes
  [ iface ]
  __name__=eth2
  dev=eth2
  name=gpe1_b1.0
  hwaddr=00:0e:0c:85:e4:3e
  type=control
  lanid=base1
  port=0
  vlan=0
  ipaddr=192.168.32.65
  ipnet=192.168.32.0
  ipbcast=192.168.39.255
  ipmask=255.255.248.0
  arp=yes
  enable=yes

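Each [ iface ] section of spp_conf.txt carries ipnet, ipbcast and ipmask alongside ipaddr, and the first two follow from ipaddr and ipmask. The Python sketch below, using only the standard ipaddress module, shows how such values could be derived or cross-checked. The helper name derive_net is hypothetical; this is not how spp_netinit.py is known to work.

```python
import ipaddress


def derive_net(ipaddr: str, ipmask: str):
    """Return (network address, broadcast address, prefix length)."""
    iface = ipaddress.IPv4Interface(f"{ipaddr}/{ipmask}")
    net = iface.network
    return str(net.network_address), str(net.broadcast_address), net.prefixlen


if __name__ == "__main__":
    # (ipaddr, ipmask) pairs copied from the eth0, eth1 and eth2 sections.
    for ipaddr, ipmask in [
        ("172.16.1.3", "255.255.255.192"),
        ("172.16.1.65", "255.255.255.192"),
        ("192.168.32.65", "255.255.248.0"),
    ]:
        print(ipaddr, ipmask, derive_net(ipaddr, ipmask))
```

For the three sections checked here, the derived network and broadcast addresses agree with the ipnet and ipbcast entries in the file.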
ethers
  # ----------------------------------------------------------------------
  # Board Type cp, Name cp1, Slot 0
  # nserv_f1.0 fabric1/0
  00:1E:C9:FE:76:23 172.16.1.1
  # nserv base1/0
  00:1E:C9:FE:76:22 192.168.32.1
  # nserv_gbl maint/0
  00:10:18:32:00:76 192.168.48.1
  # ----------------------------------------------------------------------
  # Board Type shmgr, Name shmgr1, Slot 0
  # shmgr maint/0
  00:50:C2:3F:D2:74 192.168.48.2
  # ----------------------------------------------------------------------
  # Board Type hub, Name hub1, Slot 1
  # hub base1/0
  00:00:50:3D:10:6B 192.168.32.17
  # hub1_f1.0 fabric1/0
  00:00:50:3D:10:B0 172.16.1.2
  # hub1_m.0 maint/0
  00:00:50:3D:10:6C 192.168.48.17
  # ----------------------------------------------------------------------
  # Board Type gpe, Name gpe1, Slot 4
  # gpe1_f1.0 fabric1/0
  00:0e:0c:85:e4:40 172.16.1.3
  # gpe1_f1.1 fabric1/1
  00:0e:0c:85:e4:42 172.16.1.65
  # gpe1_b1.0 base1/0
  00:0e:0c:85:e4:3e 192.168.32.65
  # ----------------------------------------------------------------------
  # Board Type gpe, Name gpe2, Slot 3
  # gpe2_f1.0 fabric1/0
  00:0E:0C:85:E6:08 172.16.1.4
  # gpe2_f1.1 fabric1/1
  00:0E:0C:85:E6:0A 172.16.1.66
  # gpe2_b1.0 base1/0
  00:0E:0C:85:E6:06 192.168.32.49
  # ----------------------------------------------------------------------
  # Board Type npe, Name npe1, Slot 5
  # npe1_f1.0 fabric1/0
  00:00:00:00:00:00 172.16.1.5
  # npe1_b1.0 base1/0
  00:00:50:3d:07:3e 192.168.32.81
  # npe1_m.0 maint/0
  00:00:50:3D:07:3C 192.168.48.81
  # npe1_b1.1 base1/1
  00:00:50:3D:07:3D 192.168.32.82
  # ----------------------------------------------------------------------
  # Board Type lc, Name lc1, Slot 6
  # lc_f1.0 fabric1/0
  00:00:50:3d:0b:d4 172.16.1.6
  # lc_b1.0 base1/0
  00:00:50:3D:08:26 192.168.32.97
  # lc_m.0 maint/0
  00:00:50:3D:08:24 192.168.48.97
  # lc_b1.1 base1/1
  00:00:50:3D:08:25 192.168.32.98
  # ----------------------------------------------------------------------
  # Gateway for drn05 (128.252.153.209), VLAN 2
  00:00:50:3d:0b:d4 128.252.153.31

BootAPI calls made by the BootManager
PLCAPI/BootAPI calls:
  1. GetSession(node_id, auth, node_ip): returns a new session key for the node
  2. BootCheckAuthentication(Session): returns true if the Session id is valid
  3. GetNodes(Session, node_id, [‘nodegroup_ids’, ‘nodenetwork_ids’, ‘model’, ‘site_id’]): returns the indicated parameters for this node (i.e. node_id)
  4. GetNodeNetworks(Session, node_id, nodenetwork_ids): returns a list of interfaces [broadcast, network, ip, dns1, dns2, hostname, netmask, gateway, nodenetwork_id, method, mac, node_id, is_primary, type, bwlimit, nodenetwork_settings_ids]
  5. GetNodes(Session, node_id, ‘nodegroup_ids’): returns the list of group ids associated with this node
  6. GetNodeGroups(Session, nodegroup_id, ‘name’): returns the name string for each node group (in our case ‘SPP’)
  7. GetNodeNetworkSettings()
  8. BootUpdateNode(Session, boot_state): sets the node’s boot state at PLC
  9. BootNotifyOwners(Session, “event”, params): causes email to be sent to the list of node owners
  10. BootUpdateNode(Session, ssh_host_key): records the latest ssh public key for the node

Other PLC/Server interactions
HTTP/HTTPS
  – Upload alpina boot logs: BOOT_SERVER_URL += /alpina-logs/upload.php
  – Compatibility step (we don’t use):
      BOOT_SERVER_URL += /alpina-BootLVM.tar.gz
      BOOT_SERVER_URL += /alpina-PartDisk.tar.gz
  – Download the file system tar file containing the basic plab node environment: BOOT_SERVER_URL += /boot/bootstrapfs-”group”-”arch”.tar.bz2
  – If the node id is not in the config file, get it from: BOOT_SERVER_URL += /boot/getnodeid.php
  – Get the yum update configuration file: BOOT_SERVER_URL += /PlanetLabConf/yum.conf.php

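The URL patterns above are simple suffixes on BOOT_SERVER_URL. As a rough Python illustration, the helper below (a hypothetical name, not part of the BootManager) expands them for a given boot server, node group and architecture.

```python
def boot_server_urls(boot_server_url: str, group: str, arch: str) -> dict:
    """Expand the URL suffixes listed on the slide for one boot server."""
    base = boot_server_url.rstrip("/")
    return {
        "upload_logs": f"{base}/alpina-logs/upload.php",
        "bootstrapfs": f"{base}/boot/bootstrapfs-{group}-{arch}.tar.bz2",
        "getnodeid": f"{base}/boot/getnodeid.php",
        "yum_conf": f"{base}/PlanetLabConf/yum.conf.php",
    }


if __name__ == "__main__":
    # "http://CP" is a placeholder host name, as used elsewhere in the slides.
    for name, url in boot_server_urls("http://CP", "planetlab", "i386").items():
        print(name, url)
```

With group "planetlab" and arch "i386" the bootstrapfs entry matches the file name bootstrapfs-planetlab-i386.tar.bz2 shown in the example configurations.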
System Initialization: Stage 1
Use PXE boot and download pxelinux and its config file:
  – boot using a basic initial ramdisk, overlay and kernel
  – use the dhcp, tftp and pxe server on the cp; files are stored in the tftpboot directory:
      pxelinux.0, pxelinux.cfg/
      bootcd.img, overlay_gpeX.img, kernel
  – The overlay image is modified for each GPE to include its configuration file, modified planetlab config files and an spp node python script.
      Currently this is a manual step, but the ultimate (long-term) plan is for the gnm daemon to create the individual images.
The overlay image contains several files that identify the node and provide the name and address for the PLC and Boot servers. I have modified these to point to the cp. Just before booting the final kernel I change these values to refer to the “real” plc/api servers.

System Initialization: Stage 2
Boot into a basic, intermediate environment
Initial configuration information is obtained from the overlay image
  – Includes spp_conf.txt, which defines the gpe interfaces
  – Includes the ethers file, which contains mac addresses for static arp entries
  – Updated plnode.txt with the GPE’s control interface mac address
  – Modified bootserver files listing the cp as the bootserver
  – Includes spp_netinit.py, a python script to configure the interfaces and update system configuration files
Enables the “primary” interface and key network configuration files such as resolv.conf
Downloads BootManager source from the “boot_server”
  – In our case we download from the CP
  – I explicitly disable the use of ssl and certs (the certificates on the overlay image are for the PLC server and not the CP)
  – Our assumption is that the control (base) network is “secure”; within an SPP node we don’t have to worry about authentication issues.

BootManager
Opens connection to PLCAPI on the bootserver
  – Opens connection to our proxy plcapi/bootapi server running on the CP
Get node session key: GetSession(node_id, auth, node_ip)
  – Since each call to create a session invalidates any existing keys, we intercept this call on the cp and use a common session key for all gpes.
Determines the node’s configuration
  – reads plnode.txt for node_id, node_key and the primary interface settings
      we use DHCP to configure the control interface but I do not define a dns server
  – if node_id is not found then reads URL=BootServer/boot/getnodeid.php
Calls BootCheckAuthentication(Session) to verify the session key
Calls GetNodes to get the boot_state, node_groups, model, site_id
Calls GetNodeNetworks to get configuration information for all interfaces
  – in our case the call would return the externally visible network parameters, which differ from how each GPE is configured
  – long term, we can intercept this call and return GPE-specific interface config info
  – short term, we use a configuration file in the overlay image with similarly formatted information. I have replaced the BootManager code that reads the config info and configures the interfaces.
  – I had to add support for VLANs and our internal interfaces.

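The GetSession interception described above amounts to caching one upstream session key and handing it to every GPE. The Python sketch below illustrates that idea only; the class name SessionProxy and the upstream callable are assumptions, not the actual plc_api proxy code.

```python
class SessionProxy:
    """Hypothetical CP-side wrapper that shares one session key among GPEs."""

    def __init__(self, upstream_get_session):
        # upstream_get_session stands in for the real PLC GetSession call.
        self._upstream = upstream_get_session
        self._session = None

    def GetSession(self, node_id, auth, node_ip):
        # Only the first request reaches the upstream server; later requests
        # reuse the cached key so earlier keys are never invalidated.
        if self._session is None:
            self._session = self._upstream(node_id, auth, node_ip)
        return self._session


if __name__ == "__main__":
    issued = []

    def fake_plc_get_session(node_id, auth, node_ip):
        issued.append(node_ip)
        return f"session-{len(issued)}"

    proxy = SessionProxy(fake_plc_get_session)
    print(proxy.GetSession(1, {}, "192.168.32.65"))
    print(proxy.GetSession(1, {}, "192.168.32.49"))
```

A real proxy would also need to handle key expiry and re-authentication, which this sketch leaves out.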
BootManager Continued
Download the node’s final filesystem image from the boot_server
  – in our case this is the CP: http://CP/boot/bootstrapfs-planetlab-i386.tar.bz2
Download yum config file
  – I am not currently downloading http://CP/PlanetLabConf/yum.conf
Call BootUpdateNode with the new boot_state
  – we will need to intercept this call and both report and set node state based on all GPEs
Call BootNotifyOwners with the new state
  – forward to PLC
Update network configuration in the new “sysimg”
  – downloads the //BootServer/PlanetLabConf/plc_config file
      In our case I have copied it onto the overlay image in the /usr/boot directory.
  – calls GetNodeNetworkSettings for a list of any additional interface attributes, then creates various configuration files: hosts, resolv.conf, network, ifcfg-eth*
      I have replaced this step with our own script spp_netinit.py and configuration file spp_conf.txt, which I use to create the same config files in both the current environment and the new sysimg.
  – updates devices and creates the initrd image used for the next stage
  – finally boots a new kernel using the bootstrap file system

Boot States
The list of boot states is changing as I write this.
In our version of the plc the states are as follows:
  State   | Next state                                  | Description
  new     | install verified -> rins; error -> dbg      | new install: verify install with user
  inst    | install verified -> rins; error -> dbg      | install: same as new
  rins    | success -> boot; error -> dbg               | reinstall: reformat disk and reinstall all software and files
  boot    | error -> dbg                                | boot: boot using existing partitions
  dbg     | Success: same as boot; Fail: bootcd image   | debug: boot node
  diag    | user controlled                             | diagnostics: bootcd image
  disable | user controlled                             | disable: bootcd image

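The automatic transitions in the boot-state table can be written down as a small lookup table. The Python sketch below is illustrative only: the event strings are taken from the slide, while treating any unlisted (state, event) pair as "stay in the current state" is an assumption made here, not documented PLC behavior.

```python
# (current state, event) -> next state, as listed in the boot-state table.
TRANSITIONS = {
    ("new", "install verified"): "rins",
    ("new", "error"): "dbg",
    ("inst", "install verified"): "rins",
    ("inst", "error"): "dbg",
    ("rins", "success"): "boot",
    ("rins", "error"): "dbg",
    ("boot", "error"): "dbg",
}

# States the slide marks as user controlled.
USER_CONTROLLED = {"diag", "disable"}


def next_state(state: str, event: str) -> str:
    """Return the next boot state; unlisted pairs keep the current state (assumption)."""
    return TRANSITIONS.get((state, event), state)


if __name__ == "__main__":
    state = "new"
    for event in ("install verified", "success", "error"):
        state = next_state(state, event)
        print(event, "->", state)
```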
PLC Database
PlanetLab Central (PLC) keeps a database describing all nodes, slices and users/people.
The slice database keeps track of all slices and their node bindings.
The node database includes externally visible properties and the ability to associate general attributes with these properties
  – the current (or next) node state (boot_state)
  – node identifier (node_id)
  – list of interface configuration parameters
      ip address information, mac address, generic list of attributes
  – node’s owner
  – node’s site identifier (site_id)
  – model, which can be used to specify a set of attributes for the node. For example: minhw, smp
  – current ssh host key (ssh_host_key)
  – node groups: I believe this is being deprecated in favor of associating a generic set of attributes with a node or its interfaces.

SPP Specific Information
On an SPP node the resource manager needs to know what kind of board is inserted in each slot and its I/O characteristics.
It needs to associate interface MAC addresses with boards and interfaces, or with a standalone system connected to an RTM or front panel (for example the CP).
It also needs to know which interfaces are connected to the base switch and which to the fabric switch when bringing up general purpose systems. There is no convenient mechanism for determining this at run time, so I have a configuration file.
It also needs to know what resources are available on each board, and the allocation policies.
It must also have a list of external links, their addresses and the address of any peers (Ethernet).
It needs to keep track of the current node state (as kept by PLC) as well as the state of each individual board.
State needs to be shared between different daemons.

Node Configuration File
[XML listing; the markup was lost in extraction. Surviving values: 1024, 10000000000, Radisys_7010 NPEv1, Radisys_7010 LCv1.]

CP Record
[XML listing; the markup was lost in extraction. Surviving values per interface: IP address, network, netmask, broadcast, device, MAC address, description.]
  – 172.16.1.1, 172.16.1.0, 255.255.255.192, 172.16.1.63, eth0, 00:1E:C9:FE:76:23: Interface connected to HUB's fabric port
  – 192.168.32.1, 192.168.32.0, 255.255.248.0, 192.168.39.255, eth1, 00:1E:C9:FE:76:22: System control processor's Base Ethernet connection
  – 192.168.48.1, 192.168.48.0, 255.255.248.0, 192.168.55.255, eth2, 00:10:18:32:00:76: Connection to the maintenance ports

GPE Record
[XML listing; the markup was lost in extraction. “-- IP Address Info --” marks a sub-record elided on the slide.]
  – -- IP Address Info --, eth0, 00:0e:0c:85:e4:40, (Device Data) 1000000000 2 (Resource Policy): MAC=N+2, Fabric 1/0 or AMC Port 0
  – -- IP Address Info --, eth1, 00:0e:0c:85:e4:42: MAC=N+4, Fabric 1/1 or Maintenance Port 1
  – -- IP Address Info --, eth2, 00:0e:0c:85:e4:3e: MAC=N, Base connection to Primary HUB
  – -- IP Address Info --, eth3, 00:0e:0c:85:e4:3f: MAC=N+1, Base connection to alternate HUB
  – -- IP Address Info --, eth4, 00:0e:0c:85:e4:41: MAC=N+3, Fabric 2/0 or AMC Port 1
  – -- IP Address Info --, eth5, 00:0e:0c:85:e4:43: MAC=N+5, Fabric 2/1 or Maintenance Port 2

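The MAC=N+k annotations in the GPE record mean that each interface's MAC address is the board's base address N plus a fixed offset. The Python sketch below reproduces the six addresses from the base 00:0e:0c:85:e4:3e; the helper and table names are illustrative only.

```python
def mac_add(base: str, offset: int) -> str:
    """Add an integer offset to a colon-separated MAC address."""
    value = (int(base.replace(":", ""), 16) + offset) & 0xFFFFFFFFFFFF
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


# Offsets from the GPE record: device name -> k in MAC = N + k.
GPE_MAC_OFFSETS = {"eth2": 0, "eth3": 1, "eth0": 2, "eth4": 3, "eth1": 4, "eth5": 5}


if __name__ == "__main__":
    base = "00:0e:0c:85:e4:3e"  # MAC of eth2 (MAC=N) on GPE1
    for dev in sorted(GPE_MAC_OFFSETS):
        print(dev, mac_add(base, GPE_MAC_OFFSETS[dev]))
```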
NPE Record
[XML listing; the markup was lost in extraction. “-- ... --” marks sub-records elided on the slide.]
Radisys_7010 NPEv1
  – -- IP Address Info --, -- Device Data --, -- Resource Policy --: Fabric interface used for both NPUs
  – -- IP Address Info --, -- Device Data --: Primary control interface associated with NPUA
  – -- IP Address Info --, -- Device Data --: NPUA Front Maintenance Port
  – -- IP Address Info --, -- Device Data --: NPUB Front Maintenance Port -- But it's been patched to the Base switch

LC Record
[XML listing; the markup was lost in extraction. “-- ... --” marks sub-records elided on the slide.]
Radisys_7010 LCv1 (Model Data)
  – -- IP Address Info --, -- Device Data --, -- Resource Policy --
  – -- IP Address Info --, -- Device Data --
  – -- IP Address Info --, -- Device Data --
  – -- IP Address Info --, -- Device Data --
  – 00:00:50:29:b1:46, -- Link IP Address Info --, -- Device Data --, -- Resource Policy --
  – arl.wustl.edu, drn05, 128.252.133.45, 128.252.120.1, 128.252.153.31, 00:0F:B5:FB:D8:67, 2: p2p link from drn05 to drn06, the plc

SRM Interface
NATD to SRM:
  [egress_map, ingress_map] get_sched_map(LinkIP, BoardMAC)
  Deprecated (original natd interface!):
    {fid, port} alloc_epmap(map)
    status free_epmap(fid)
FS to SRM: ?? (map vlan to slice id)
RMP to SRM:
  Interfaces (Line Card Links):
    if_list get_interfaces(plabID)
    ifn get_ifn(plabID, ipaddr)
    if_entry get_ifattrs(plabID, ifn)
    ipaddr get_ifpeer(plabID, ifn)
    retcode resrv_fpath_ifbw(bw, ifn)
    retcode reles_fpath_ifbw(bw, ifn)
    To be implemented:
      retcode resrv_slice_ifbw(plabID, bw, ifn)
      retcode reles_slice_ifbw(plabID, bw, ifn)
  EndPoints (local IP and Port number); NATD changes may have broken these:
    ep alloc_endpoint(PlabID, ep)
    status free_endpoint(PlabID, ipaddr, port, proto)
  Fast Path:
    fp_params alloc_fastpath(PlabID, copt, bwspec, rcnts, mem)
    status free_fastpath()
  Fast-Path Meta-Interfaces:
    [mi, ep] alloc_udp_tunnel(bw, ipaddr, port)
    ep get_endpoint(mi)
    status free_udp_tunnel(ipaddr, port)

RMP Interface
Prototype completed:
  1. result noop()
  2. version get_version()
  3. result add_slice(plabID, len, name)
  4. result rem_slice(plabID)
  5. ret_t alloc_fastpath(copt, bw, rcnts, mem)
  6. void free_fastpath()
  7. if_list get_interfaces()
  8. ifn get_ifn(ipaddr)
  9. if_entry get_ifattrs(ifn)
  10. ipaddr get_ifpeer(ifn)
  11. retcode alloc_pl_ifbw(ifn, bw)
  12. retcode reles_pl_ifbw(ifn, bw)
  13. retcode alloc_fpath_ifbw(fpid, ifn, bw)
  14. retcode reles_fpath_ifbw(fpid, ifn, bw)
  15. retcode bind_queue(fpid, miid, list_type, qids)
  16. actual_bw set_queue_params(fpid, qid, threshold, bw)
  17. [threshold, bw] get_queue_params(fpid, qid)
  18. [u32 Pkts, u32 Bytes] get_queue_len(fpid, qid)
To do:
  19. ep alloc_endpoint(ep)
  20. status free_endpoint(ipaddr, port, proto)
  21. -- alloc_tunnel --
  22. -- free_tunnel --
  23. [mi, ep] alloc_udp_tunnel(fpid, bw, ip, port)
  24. status free_udp_tunnel(ipaddr, port)
  25. ep get_endpoint(fpid, mi)
  26. retcode write_fltr(fpid, fid, fltr)
  27. retcode update_result(fpid, fid, result)
  28. fltr_t get_fltr_bykey(fpid, key)
  29. fltr_t get_fltr_byfid(fpid, fid)
  30. result lookup_fltr(fpid, key)
  31. retcode rem_fltr_bykey(fpid, key)
  32. retcode rem_fltr_byfid(fpid, fid)
  33. stats_t read_stats(fpid, sindx, flags)
  34. result clear_stats(sindx)
  35. handle create_periodic(fp, indx, P, cnt, flags)
  36. retcode delete_periodic(fpid, handle)
  37. retcode set_callback(fpid, handle, xport)
  38. stats_t get_periodic(fpid, handle)
  39. retcode mem_write(fpid, offset[, len], data)
  40. data mem_read(fpid, offset, len)

NPE SCD Interface
SRM to SCD:
  status set_fastpath(fpid, copt, VLAN, params, mem)
  status enable_fastpath(fpid)
  status disable_fastpath(fpid)
  status rem_fastpath(fpid)
  status set_sched_params(sid, ifn, BW_max, BW_min)
  status set_encap_cb(sid, srcIP, dMAC)
  status set_fpmi_bw(fpid, sid, miid, bw)
  status start_mes()
  status stop_mes()
  status set_encap_gpe(fpid, gpeIP, npeIP)
  result write_mem(kpa, len, data)
  data read_mem(kpa, len)
SRM & RMP to SCD:
  ret_t write_fltr(dbid, fid, key, mask, result)
  ret_t update_result(dbid, fid, result)
  fltr get_fltr_bykey(dbid, key)
  fltr get_fltr_byfid(dbid, fid)
  result lookup_fltr(dbid, key)
  retcode rem_fltr_bykey(dbid, key)
  retcode rem_fltr_byfid(dbid, fid)
RMP to SCD:
  status set_gpe_info(exPort, ldPort, exQID, ldQID)
  u32 result bind_queue(u16 miid, u8 list_type, u16[] qid_list)
  u32 bw set_queue_params(u16 qid, u32 threshold, u32 bw)
  {u32 threshold, u32 bw} get_queue_params(u16 qid)
  {u32 pktCnt, u32 byteCnt} get_queue_len(u16 qid)
  result write_sram(offset, len, data)
  data read_sram(offset, len)
  stats = read_stats(sindx, flags)
  result = clear_stats(sindx)
  handle create_periodic(sindx, P, cnt, flags)
  retcode del_periodic(handle)
  retcode set_callback(handle, udp_port)
  stats = get_periodic(handle)

LC SCD Interface
SRM to SCD:
  status set_sched_params(sid, ifn, BW_max, BW_min)
  status set_sched_mac(sid, MAC_dst, MAC_src)
  u32 result set_queue_sched(u16 qid, u16 sid)
  result write_mem(kpa, len, data)
  data read_mem(kpa, len)
SRM and RMP to SCD:
  ret_t write_fltr(dbid, fid, key, mask, result)
  ret_t update_result(dbid, fid, result)
  fltr get_fltr_bykey(dbid, key)
  fltr get_fltr_byfid(dbid, fid)
  result lookup_fltr(dbid, key)
  retcode rem_fltr_bykey(dbid, key)
  retcode rem_fltr_byfid(dbid, fid)
RMP to SCD:
  u32 actual_bw set_queue_params(u16 qid, u32 threshold, u32 bw)
  {u32 threshold, u32 bw} get_queue_params(u16 qid)
  {u32 pktCnt, u32 byteCnt} get_queue_len(u16 qid)
  stats = read_stats(sindx, flags)
  result = clear_stats(sindx)
  handle create_periodic(sindx, P, cnt, flags)
  retcode del_periodic(handle)
  retcode set_callback(handle, udp_port)
  stats = get_periodic(handle)

Slice Example
Get the list of interfaces, their IP addresses and available bandwidth:
  if_list = {if_entry, ...}
  if_entry = {u16 ifn,      // logical interface number
              u16 type,     // peering or multi-access
              u32 ipaddr,   // interface’s IP address
              u32 linkBW,   // link’s native BW
              u32 availBW}  // BW available for allocation
  struct epoint_t {u32 ipaddr;  // interface’s IP address
                   u16 port;    // UDP port number for meta-interface
                   u32 bw;}     // total BW required for meta-interface
  iflist = get_interfaces(iflist); // return list of all available interfaces
Estimate the computational complexity and memory bandwidth requirements on the NPE:
  bwSpec = {BW_max = totalBW, BW_min = 0}; // fast path total BW requirement
Max general NPE resource counts: for this example I just assume a max number, but in general a user may scale it by the number of meta-interfaces they will use.
  fpCnts = {FLTR_CNT, QID_CNT, BUFF_CNT, STATS_CNT};
Request the substrate to allocate a fastpath instance for the IPv4 code option, assuming we will use the default sram buffer sizes. We will also need to listen to the returned sockets.
  [fpid, sockets] = alloc_fastpath(ipv4_copt, bwSpec, fpCnts, {IPV4_SRAM_SZ, 0});

Slice Example - Continued
Allocate one meta-interface for each external interface and assign our default UDP port number and BW requirement:
  struct mi_t {uint_t mi; epoint_t rp;};
  mi_t milist[iflist.len()];
  for (indx = 0, mi = 0; indx < len(iflist); ++indx) {
    if (miBW > iflist[indx].availBW) throw Error;
    // allocate total BW required on this interface
    if (alloc_fpath_ifbw(fpid, iflist[indx].ifn, miBW) == -1) throw Error;
    // allocate one meta-interface on this interface
    milist[indx] = alloc_udp_tunnel(fpid, miBW, iflist[indx].ipaddr, myPort);
    my_bind_queues(milist+indx);
    my_add_routes(milist+indx);
  }

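The allocation loop from the slice example can be exercised without an SPP by substituting a stand-in for the RMP. The Python sketch below does that. FakeRMP, IfEntry and allocate_meta_interfaces are hypothetical names, the stub return values are invented for illustration, and the queue-binding and route steps (my_bind_queues, my_add_routes) are omitted.

```python
from dataclasses import dataclass


@dataclass
class IfEntry:
    ifn: int        # logical interface number
    ipaddr: str     # interface's IP address
    avail_bw: int   # bandwidth still available for allocation


class FakeRMP:
    """Stand-in for the RMP calls used by the loop; not the real interface."""

    def __init__(self):
        self._next_mi = 0

    def alloc_fpath_ifbw(self, fpid, ifn, bw):
        return 0  # the slide treats -1 as failure

    def alloc_udp_tunnel(self, fpid, bw, ip, port):
        mi = self._next_mi
        self._next_mi += 1
        return mi, (ip, port)  # (meta-interface id, endpoint)


def allocate_meta_interfaces(rmp, fpid, iflist, mi_bw, port):
    """Allocate one meta-interface per external interface, as in the slide."""
    milist = []
    for entry in iflist:
        if mi_bw > entry.avail_bw:
            raise RuntimeError(f"insufficient bandwidth on interface {entry.ifn}")
        if rmp.alloc_fpath_ifbw(fpid, entry.ifn, mi_bw) == -1:
            raise RuntimeError(f"bandwidth reservation failed on {entry.ifn}")
        milist.append(rmp.alloc_udp_tunnel(fpid, mi_bw, entry.ipaddr, port))
    return milist


if __name__ == "__main__":
    # Addresses from the documentation range 192.0.2.0/24, not from the slides.
    interfaces = [IfEntry(0, "192.0.2.1", 1000), IfEntry(1, "192.0.2.2", 1000)]
    print(allocate_meta_interfaces(FakeRMP(), 1, interfaces, 100, 5000))
```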
Test SPP Node
[Figure: network diagram of the test SPP node (keystone.arl.wustl.edu). Recoverable labels:]
CP: srm, dhcpd; eth0, eth0.2 (vlan 2, noarp), eth2; f1/0, b1
  – cp_ctrl = 192.168.64.1/20, cp_data = 171.16.1.1/26
  – /etc/: dhcpd.conf, ethers, hosts
  – /tftpboot/: ramdisk.gz, zImage.ppm10
GPE1 (Slot 2): f1/1*, f2/1*, b2; eth0, eth0.2 (vlan 2, noarp), eth1, eth2
  – gpe1_ctrl = 192.168.64.33/20, gbe1_data = 171.16.1.2/26, gpe1_int = 172.16.1.66/26
GPE2 (Slot 3): f1/0, f1/1, b1; eth0, eth0.2 (keystone.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe2_ctrl = 192.168.64.49/20, gbe2_data = 171.16.1.3/26, gpe2_int = 172.16.1.67/26
GPE3 (Slot 4): f1/0, f1/1, b1; eth0, eth0.2 (keystone.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe2_ctrl = 192.168.64.65/20, gbe2_data = 171.16.1.4/26, gpe2_int = 172.16.1.68/26
GPE4 (Slot 5): f1/0, f1/1, b1; eth0, eth0.2 (keystone.arl.wustl.edu, vlan 2, noarp), eth1, eth2
  – gpe2_ctrl = 192.168.64.81/20, gbe2_data = 171.16.1.5/26, gpe2_int = 172.16.1.69/26
Each GPE: /etc/{ethers,hosts}, /etc/sysconfig/network-scripts/ifcfg-eth*
Line Card (Slot 6): scd, natd; f1/0; Ingress XScale (b1a, eth0), Egress XScale (b1b, eth0)
  – lc_b1a = 192.168.64.97/20, lc_b1b = 192.168.64.98/20, lc1_data = 171.16.1.6/26
Hub: 192.168.64.17; ports 0/3, 0/4, 0/5, 0/6, RTM 3/2, RTM 3/1, FP 1/6, FP 1/9, FP 1/7, 2/1
“Router”: 128.252.153.XXX; eth0, eth0:0, eth1, 192.168.64.2, eth2.2; IP routing; proxy arp for keystone; 128.252.153.YYY
The ARL network (128.252.153.*): keystone.arl.wustl.edu 128.252.153.81
Issue: mounting /opt/crossbuild/* from ebony. We could export the dirs from the “Router” host, or we could use ebony rather than “Router”. In that case we will need an external switch connecting the line cards of spp? to ebony’s eth2.2.

Test Bed Use
Core platform issues:
  – Can we use the second fabric port on the GPE boards?
  – The hub does not display stats or mac fwd entries for the slots with GPEs. It used to work.
  – The Radisys shelf manager does not reliably reset boards
      Base1 interface disabled on slot 2
NAT/Line Card testing
  – Overall reliability
  – Add support for aging
  – Specific issues (jdd)
      restarting the line card (without reboot) occasionally results in the data path thinking the scratch ring to the xscale is full
      looping iperf test from the cp occasionally stalls with no packets getting through
      LC Lookup needs a fix to not use the DONE bit to indicate that a tcam lookup is done
GPE/Intel board testing