Bending Ironic for Big Iron
Doug Szumski, Mark Goddard, Forest Godfrey

Overview
  OpenStack: the foundation of Cray’s next generation system management software
  We need to support:
    Booting large numbers of diskless compute nodes: Cinder integration for Ironic
    Flexible provisioning of diskful nodes: Bareon agent for Ironic

Cinder integration for Ironic
  Based on upstream spec by Satoru Moriya: https://review.openstack.org/#/c/200496
  Configured by instance_info fields
  No additional database tables or changes to any APIs
  Supports:
    Booting disklessly from Cinder via iSCSI (no FC)
    In-band connection to the iSCSI target
    Attachment of additional volumes at boot time (but not dynamically)
    Extended support for Dracut-based ramdisks through the generation of the PXE config file
  We’ve shared our implementation here: https://review.openstack.org/#/c/265856
  Figure: XC series compute blade (4 nodes); up to 48 blades per cabinet; 100s of cabinets

Diskless boot
  1. Nova boot from CLI
  2. Request IP for instance
  3. Get storage port MAC address (or IP) and IQN from Ironic node driver info
  4. Look up IP address of the node on the storage network from the storage port MAC address
  5. Prepare Cinder volumes and retrieve iSCSI target info
  6. Patch Ironic with block device info
  7. Call Ironic to begin deployment
  8. Cache kernel and ramdisk, build the kernel cmdline using Jinja2
  9. Configure TFTP server
  10. Set up DHCP for PXE boot
  11. Set boot device to PXE
  12. Reboot target node
  13. Target node broadcasts, DHCP server responds with an IP and the location of the bootloader
  14. PXE boot the kernel and ramdisk
  15. Mount iSCSI targets and pivot into the rootfs

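Step 8 builds a kernel command line that tells the ramdisk where its iSCSI root lives. The slides do not show the template, so the following is only a minimal sketch under stated assumptions: it uses plain string formatting instead of Jinja2 to stay dependency-free, the function name and example addresses are hypothetical, and the arguments follow the root=iscsi: and rd.iscsi.initiator= syntax described in dracut.cmdline(7) rather than whatever the authors actually generated.

```python
def build_dracut_cmdline(target_ip, target_iqn, initiator_iqn, port=3260, lun=0):
    """Return a kernel command line for a Dracut ramdisk booting from iSCSI.

    Hypothetical helper: the real implementation renders a Jinja2 template.
    """
    # Dracut syntax: root=iscsi:<server>:<protocol>:<port>:<LUN>:<targetname>
    # The protocol field is left empty so Dracut uses its default.
    root = "root=iscsi:{ip}::{port}:{lun}:{iqn}".format(
        ip=target_ip, port=port, lun=lun, iqn=target_iqn
    )
    args = [
        root,
        # Initiator name the node presents to the iSCSI target.
        "rd.iscsi.initiator={}".format(initiator_iqn),
        # Bring up networking in the ramdisk before attaching the target.
        "ip=dhcp",
    ]
    return " ".join(args)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(build_dracut_cmdline(
        "10.0.0.5",
        "iqn.2016-04.com.example:vol1",
        "iqn.2016-04.com.example:node1",
    ))
```

In a template-driven version the same three values (target address, target IQN, initiator IQN) would come from the block device info patched into Ironic in step 6.
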
Bareon (Fuel) Agent
  What is Bareon?
    “flexible and data driven interface to perform actions which are related to operating system installation” - wiki.openstack.org/wiki/Bareon
    In particular, Cray uses the Bareon agent with Ironic
    Similar in concept to the Ironic Python Agent (IPA)
  Why does Cray use Bareon?
    Deploy bare metal nodes in a flexible, perhaps non-cloud-like way
    Deploying multiple images / multi-boot
    Support complex partitioning schemes, e.g. creation of shared partitions, LVM groups, consistent identification of block devices
    Rsync deploy: useful for upgrades / updates
    Run arbitrary actions during or post deploy
  https://github.com/openstack/bareon

Bareon agent
  1. Nova boot from CLI
  2. Request IP
  3. Nova calls Ironic
  4. Configure TFTP server
  5. Cache images (deploy kernel & ramdisk, filesystem, cloud_default_deploy_config, deploy_config and driver_actions) and write provision script for Bareon agent
  6. Update MAC and PXE config
  7. Set boot device to PXE
  8. Reboot target node
  9. Target node gets IP
  10. PXE boot the Bareon agent
  11. Bareon agent calls back
  12. SFTP across provision script and forward rsync server port by SSH
  13. Trigger provisioning by SSH: provision --data_driver ironic --deploy_driver rsync
  14. Partition and clean local storage, mount partitions, rsync filesystem across, write fstab, configure bootloader and unmount partitions
  15. Run driver actions over SSH, e.g. update BIOS, SFTP file across from Swift
  16. Set boot device to local disk
  17. Reboot node

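Steps 12 and 13 drive the agent over SSH. As a rough sketch only, the helper below assembles an ssh argument list that runs the provision command quoted in step 13 and, optionally, adds a remote port forward for the rsync server. The function name, the default "root" user, the example address and the direction of the forward (-R, from the node back to the host running rsync) are all assumptions; the slides do not say how the driver actually invokes SSH.

```python
import shlex


def provision_command(host, user="root", rsync_port=None):
    """Build an ssh argv that triggers Bareon provisioning on a node.

    Hypothetical helper for illustration; it only builds the command
    and does not execute anything.
    """
    # Remote command taken verbatim from step 13 of the slide.
    remote = ["provision", "--data_driver", "ironic", "--deploy_driver", "rsync"]

    argv = ["ssh"]
    if rsync_port is not None:
        # Assumed direction: expose the local rsync server on the node
        # at the same port via a remote forward.
        argv += ["-R", "{0}:localhost:{0}".format(rsync_port)]
    argv.append("{}@{}".format(user, host))
    # ssh passes the remote command as a single string, so quote each part.
    argv.append(" ".join(shlex.quote(part) for part in remote))
    return argv


if __name__ == "__main__":
    # Example address and port are made up for illustration.
    print(provision_command("10.0.0.7", rsync_port=873))
```

The returned list can be handed to subprocess.run without shell=True, which avoids a second layer of shell quoting on the local side.
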
Scaling Ironic
  Where are we at?
    Diskless boot tested on a 128 node system
    Read-only Cinder volume with multi-attach and overlay filesystem
    Ironic multi-conductor
  Immediate focus point
    Deploying OpenStack with Kolla
    Support scaling of OpenStack services
  Where do we want to go?
    100,000 (?) nodes by 2018 for Shasta
    http://www.cray.com/blog/the-cray-shasta-system/

Thank you for listening