Lessons learned and proposals
Sylvain Desbureaux, Morgan Richomme
November 21, 2017
Lessons learned (1)
- Heat deployment is not reliable when cloud-init is heavily used
  - There is no simple way to know what happened (or even whether the deployment is ongoing, OK, or in error)
  - A lot of the cloud-init logic exists only because a VM has no simple way to know about the other VMs
- K8S deployment is a lot less error prone
  - But it may be complicated to use K8S in "Big Telco" today
- Because Heat is not reliable, it is very hard for the people who install to know whether the installation went well
- Except for the DNS VM, all the other VMs are "simply" container ships, but their installation is always a bit different:
  - Some run Ubuntu 14, others Ubuntu 16
  - Some (re)build the Docker images, others pull them from the repo
Lessons learned (2)
- Some containers are huge, which significantly slows down the installation process
  - Example: dgbuilder is 980M, whereas the "official" node-red Docker image is 269M (latest), 92M (slim) or 23M (alpine)!
- Robot tests are good for end-to-end validation, but not for knowing the "live" state of each component
  - OOM uses Consul for this, and it is very efficient
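To put the size gap in numbers, the short sketch below divides the dgbuilder size by each node-red variant quoted on the slide. Only the four sizes come from the slide; the function and variable names are illustrative.

```python
# Image sizes in MB, as quoted on the slide.
DGBUILDER_MB = 980
NODE_RED_MB = {"latest": 269, "slim": 92, "alpine": 23}


def size_ratios(big_mb, variants):
    """Return how many times larger big_mb is than each variant (1 decimal)."""
    return {tag: round(big_mb / size, 1) for tag, size in variants.items()}


ratios = size_ratios(DGBUILDER_MB, NODE_RED_MB)
for tag, ratio in ratios.items():
    print(f"dgbuilder is {ratio}x the size of node-red ({tag})")
```

With these figures, dgbuilder is roughly 3.6 times the size of the latest image, 10.7 times the slim one and 42.6 times the alpine one.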
Lessons learned (3)
- Automation in 1 reference lab is not enough to be trustworthy
  - Manual workarounds are not reported
  - Human feedback is human…
- Real CI/CD & DevOps principles must be implemented (source: OPNFV XCI)
  - Fail fast, fix fast
  - Always have working software
  - Small and frequent commits
  - Work against the trunk, shortening development time
  - Fast and tailored feedback
  - Everything is visible to everyone all the time
Orange proposal for Beijing (1)
- Reliable installation
  - Use Ansible to deploy ONAP
    - In VM mode (on top of OpenStack first, deployment on other cloud types later)
    - In K8S mode (roughly wrapping all the work of OOM)
    - Orange can give a demo on some components during the Santa Clara event
  - Same installation for all VMs, except the flavor, again thanks to Ansible
    - Proposal to use Ubuntu 16.04 at first (others, like CentOS, later)
- "Slimification" of all containers to speed up the installation process
  - Use Alpine-based Docker images at first (+ Ubuntu-based later on, and CentOS if someone wants)
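The "same installation for all VMs, except the flavor" idea can be illustrated with a generated Ansible INI inventory, where the image is a group variable and the flavor is the only per-host variable. This is a minimal sketch, not the actual Orange playbooks: the VM names, flavor names, image name, group name and the `flavor` / `image` variable names are all assumptions made for the example.

```python
# Hypothetical image name shared by every VM.
COMMON_IMAGE = "ubuntu-16.04"

# Hypothetical VM names mapped to hypothetical OpenStack flavors.
VM_FLAVORS = {"vm-a": "m1.large", "vm-b": "m1.xlarge", "vm-c": "m1.medium"}


def build_inventory(vm_flavors, image, group="onap"):
    """Render an Ansible INI inventory: one host line per VM, shared group vars."""
    lines = [f"[{group}]"]
    for name in sorted(vm_flavors):
        # The flavor is the only thing that differs from one VM to the next.
        lines.append(f"{name} flavor={vm_flavors[name]}")
    # Everything else is defined once for the whole group.
    lines += ["", f"[{group}:vars]", f"image={image}"]
    return "\n".join(lines) + "\n"


inventory = build_inventory(VM_FLAVORS, COMMON_IMAGE)
print(inventory)
```

Keeping the shared settings in group variables means that moving to another base image later is, in this sketch, a one-line change.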
Orange proposal for Beijing (2)
- In parallel with a reliable installation process, we need to strengthen gating
  - Automated tests (with sufficient footprint), plus build and deployment of the Docker components (hence the interest of small ones)
  - Automated installation every night in the integration lab, followed by robot tests if the installation is OK
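The nightly flow described above (install, then robot tests only if the installation is OK) could be sketched as follows. The `install` and `robot_tests` callables are placeholders for whatever the integration lab actually runs; the status strings are likewise assumptions.

```python
from typing import Callable, List, Tuple


def nightly_run(
    install: Callable[[], bool],
    robot_tests: Callable[[], bool],
) -> List[Tuple[str, str]]:
    """Run the install step, then the robot tests only if the install succeeded."""
    report = []
    installed = install()
    report.append(("install", "OK" if installed else "ERROR"))
    if not installed:
        # Fail fast: there is no point testing a broken installation.
        report.append(("robot", "SKIPPED"))
        return report
    report.append(("robot", "OK" if robot_tests() else "ERROR"))
    return report


# Example with stubbed steps: the installation fails, so the tests are skipped.
print(nightly_run(lambda: False, lambda: True))
```

Recording an explicit status for every step, including skipped ones, keeps the result of each nightly run visible to everyone.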
Orange proposal for Beijing (3)
- Use Consul to:
  - Distribute component addresses (Consul is also a DNS server). We may have to keep the DNS VM so that components can find the Consul address
  - Supervise components with one Consul agent per component (so one per VM)
  - Hold the configuration of each component, in order to simplify the boot process and horizontal scaling
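As an illustration of the address distribution and supervision points, the sketch below builds a Consul service definition with an HTTP health check and derives the DNS name under which Consul would expose the service (using Consul's default `consul` domain). The component name, address, port and `/health` path are hypothetical values chosen for the example.

```python
import json


def service_definition(name, address, port, health_path="/health", interval="10s"):
    """Build a Consul agent service definition with a periodic HTTP health check."""
    return {
        "service": {
            "name": name,
            "address": address,
            "port": port,
            "check": {
                # The agent polls this URL at the given interval.
                "http": f"http://{address}:{port}{health_path}",
                "interval": interval,
            },
        }
    }


def dns_name(name, domain="consul"):
    """Name under which Consul's DNS interface resolves a registered service."""
    return f"{name}.service.{domain}"


# Hypothetical component, address and port.
definition = service_definition("example-component", "10.0.0.5", 8080)
print(json.dumps(definition, indent=2))
print(dns_name("example-component"))
```

A file like this, dropped into the local agent's configuration directory on each VM, would give both a live health status and a resolvable name for the component.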