OpenStack:Ironic
Recommended kolla-ansible overrides
NOTE: Stein or newer is recommended. Rocky also works if multitenancy will not be enabled.
Overrides in /etc/kolla/globals.yml enabling Ironic and its Neutron agent (OpenStack Stein):
enable_ironic: "yes"
enable_ironic_neutron_agent: "yes"
Overrides in /etc/kolla/config/ironic/ironic-conductor.conf:
[DEFAULT]
enabled_network_interfaces = noop,flat
default_network_interface = flat
[pxe]
tftp_server = <ip-of-a-dedicated-interface>
pxe_append_params = nofb nomodeset vga=normal console=ttyS1,115200 console=tty0 sshkey="ssh-rsa AAAA..." ipa-debug=1 coreos.autologin
[deploy]
default_boot_option = local
[agent]
deploy_logs_collect = on_failure
[conductor]
clean_callback_timeout = 300
The IP for "tftp_server" can be the address of the interface that serves the internal OpenStack APIs on the host running Ironic (only a setup with a single controller hosting all Ironic components has been tested). Also set "sshkey" to the public key of your deploy node or headnode to enable SSH access to the node that is being deployed.
=== Multitenancy
Deployments with multitenancy require a few more overrides.
First, enable the "neutron" network interface by replacing
[DEFAULT]
enabled_network_interfaces = noop,flat
with
[DEFAULT]
enabled_network_interfaces = noop,flat,neutron
in /etc/kolla/config/ironic/ironic-conductor.conf.
You can also change default_network_interface to neutron if you want the multitenant driver to be the default for all nodes. Because the network interface can be set on each node individually, you can equally leave the default as flat and enable neutron only on selected (or all) nodes.
The "neutron" Ironic driver requires specifying provisioning and cleaning networks upfront in the config file, so add these lines to /etc/kolla/config/ironic/ironic-conductor.conf:
[neutron]
cleaning_network = <provisioning-network-UUID>
provisioning_network = <provisioning-network-UUID>
and fill in the UUID of your provisioning network. The provisioning network should be a network (preferably flat, but it can also be VLAN tagged) that is routed externally, that is, with the gateway for this network configured on the physical switch.
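The UUID lookup and the resulting override can be sketched in shell. This is only a minimal sketch: render_neutron_section is a hypothetical helper (not part of kolla-ansible or Ironic), and the network name in the usage comment is a placeholder for whatever your provisioning network is called.

```shell
#!/bin/sh
# Hypothetical helper: prints the [neutron] override for one network UUID.
# The same UUID is used for both cleaning and provisioning, as in the text above.
render_neutron_section() {
    printf '[neutron]\ncleaning_network = %s\nprovisioning_network = %s\n' "$1" "$1"
}

# Possible usage (assumes admin credentials are loaded and the network exists):
#   uuid=$(openstack network show <provisioning-network-name> -f value -c id)
#   render_neutron_section "$uuid"
```

The output is meant to be reviewed and pasted into /etc/kolla/config/ironic/ironic-conductor.conf by hand, so that an existing [neutron] section is not duplicated.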
There are also extra overrides for the ML2 driver which should be added to /etc/kolla/config/neutron/ml2_conf.ini. Here is an example for a Supermicro switch (confirmed working on SSE-G24-TG4 and MBM-GEM-004):
[ml2]
mechanism_drivers = openvswitch,baremetal,l2population,genericswitch
[genericswitch:supermicro_1]
#device_type = netmiko_supermicro
device_type = netmiko_cisco_s300
ip = <IP-for-ssh-access-to-baremetal-switch>
username = <admin-user-on-switch>
password = <admin-password-on-switch>
[ml2_type_vlan]
network_vlan_ranges = physnet1:100:200
Here mechanism_drivers adds genericswitch to the list of enabled drivers, and the driver is then configured in the [genericswitch:supermicro_1] section. The existing netmiko driver for Cisco IOS is used, and the driver expects SSH access to be configured on the switch. Note that the password in this config is stored in plain text (not encrypted). Lastly, network_vlan_ranges specifies a range of VLANs on the provider interface that connects to the baremetal nodes, so only this subset of VLAN IDs is used for tenant networks.
Recommended post-deployment configuration
NOTE: Steps described here are for localboot nodes and Ironic without multitenancy only.
First, build CoreOS-based Ironic Python Agent (IPA) deploy images. Here are the commands to set up an IPA image building environment on Ubuntu Xenial:
$ sudo apt-get update
$ sudo apt-get install docker.io gzip uuid-runtime cpio findutils grep gnupg cgroup-lite git build-essential python-pip python-dev -y
$ sudo service docker start
$ git clone https://git.openstack.org/openstack/ironic-python-agent
$ cd ironic-python-agent/imagebuild/coreos
$ git checkout cf30024f96e798f5607c9664b0b6236db3232119
$ sudo pip install -r ~/ironic-python-agent/requirements.txt
$ sudo make
Transfer/mount images from the UPLOAD subdirectory to your controller/headnode and add them to Glance:
$ openstack image create --public --container-format aki --disk-format aki --file ~/coreos_production_pxe.vmlinuz ironic-deploy_kernel
$ openstack image create --public --container-format ari --disk-format ari --file ~/coreos_production_pxe_image-oem.cpio.gz ironic-deploy_ramdisk
Also add regular operating system images - no special options are required here and regular cloud images can be used. Here is an example for a CentOS image:
$ openstack image create --public --container-format bare --disk-format qcow2 --file ~/CentOS-7-x86_64-GenericCloud-1907.qcow2 centos7-1907
Create a flavour for your baremetal nodes:
$ openstack flavor create --ram 1024 --vcpus 2 --disk 100 baremetal.small
$ openstack flavor set --property resources:CUSTOM_BAREMETAL=1 baremetal.small
$ openstack flavor set --property resources:VCPU=0 baremetal.small
$ openstack flavor set --property resources:MEMORY_MB=0 baremetal.small
$ openstack flavor set --property resources:DISK_GB=0 baremetal.small
Create a provisioning network and a subnet. These can be regular flat provider networks, or they can use VLANs. Here is an example for an interface on a VLAN:
$ openstack network create public304 --provider-physical-network physnet1 --provider-network-type vlan --provider-segment 304
$ openstack subnet create --dhcp --allocation-pool start=10.6.44.101,end=10.6.47.254 --network public304 --subnet-range 10.6.44.0/22 --gateway 10.6.44.1 public304-subnet
Finally, add your nodes to Ironic's database:
$ openstack baremetal node create --name <node-name> \
  --driver ipmi --driver-info ipmi_username=<ipmi-username> --driver-info ipmi_password=<ipmi-password> --driver-info ipmi_address=<ipmi-address> \
  --driver-info cleaning_network=<uuid-of-the-provisioning-network> --driver-info provisioning_network=<uuid-of-the-provisioning-network> \
  --driver-info deploy_kernel=<uuid-of-the-ironic-deploy_kernel-image> --driver-info deploy_ramdisk=<uuid-of-the-ironic-deploy_ramdisk-image> \
  --resource-class baremetal --network-interface flat
$ openstack baremetal port create <mac-address-of-the-node's-provisioning-interface> --node <node-uuid>
$ openstack baremetal node manage <node-name>
$ openstack baremetal node provide <node-name>
If everything went well, test a deployment by launching an instance, like so:
$ openstack server create --image <image-to-provision-node-with> --flavor baremetal.small --security-group ping-and-ssh --key-name mykey --network <name-of-the-provisioning-network> <instance-name>
=== Multitenancy
When adding a node to Ironic, use --network-interface neutron instead of --network-interface flat.
Next up, when adding a port, you'll need to specify details about the switch port this baremetal node is connected to:
openstack baremetal port set --local-link-connection switch_info="supermicro_1" --local-link-connection switch_id="<MAC-address-of-the-switchport>" --local-link-connection port_id="gi 0/<port-number>" <port-UUID>
Here switch_info is the name given to the genericswitch config section in ml2_conf.ini (supermicro_1 in the example above), and port_id is the name of the port. On Supermicro switches this can be gi 0/1, gi 0/2, etc. (it is the name that you pass to the interface <intname> command on the switch). The switch_id is not really important, as it is not part of the switch port configuration, but it has to be a MAC address, so it makes sense to use the MAC of the switch port the node is connected to.
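Since switch_id has to be a MAC address, it can help to check the value before calling the CLI. The following is a minimal sketch under that assumption: is_mac and set_local_link are hypothetical helper names, and only the openstack baremetal port set options shown above are used.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for six colon-separated hex octets.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Hypothetical wrapper around "openstack baremetal port set".
# Arguments: <port-UUID> <switch_info> <switch_id> <port_id>
set_local_link() {
    if ! is_mac "$3"; then
        echo "switch_id must be a MAC address, got: $3" >&2
        return 1
    fi
    openstack baremetal port set \
        --local-link-connection switch_info="$2" \
        --local-link-connection switch_id="$3" \
        --local-link-connection port_id="$4" \
        "$1"
}

# Possible usage:
#   set_local_link <port-UUID> supermicro_1 <MAC-address-of-the-switchport> "gi 0/<port-number>"
```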
With these in place, you can now create a tenant network on the same physical provider network as the provisioning network, with VLAN as the provider type. You don't need to specify a VLAN ID, as Neutron will select one (from the range of VLANs in the ML2 config) and create it on the baremetal switch. Also add a subnet to this network, with any IP range and DHCP enabled.
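The tenant network creation described above might look like the sketch below. create_tenant_network is a hypothetical helper, and the network name and subnet range passed to it are placeholders. The options are the same ones used for the provisioning network earlier, minus --provider-segment, so that Neutron picks the VLAN ID itself.

```shell
#!/bin/sh
# Hypothetical helper. Arguments: <physical-network> <network-name> <subnet-CIDR>
create_tenant_network() {
    # No --provider-segment here: the VLAN ID is left for Neutron to allocate
    # from network_vlan_ranges in the ML2 config.
    openstack network create "$2" \
        --provider-physical-network "$1" \
        --provider-network-type vlan &&
    openstack subnet create "$2-subnet" \
        --network "$2" \
        --subnet-range "$3" \
        --dhcp
}

# Possible usage (physnet1 matches the ML2 example, the rest is made up):
#   create_tenant_network physnet1 tenant1 192.168.50.0/24
```

Setting provider attributes normally requires admin credentials, so this sketch assumes an admin openrc has been sourced.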
Then, when launching a baremetal instance, specify the tenant network (instead of the provisioning network) in the --network parameter. Ironic will first provision the node in the provisioning network, and when this is done, Neutron will add the node's switch port to the VLAN of the tenant network, so after a reboot the instance will be available in the tenant network.
NOTE: In Stein there is a bug where the genericswitch driver (for Cisco IOS) skips one of the switch commands. As a workaround, make sure that all baremetal switch ports are set to "access" mode.
References
- https://docs.openstack.org/kolla-ansible/rocky/reference/ironic-guide.html#post-deployment-configuration
- https://docs.openstack.org/ironic/rocky/install/configure-glance-images.html
- https://github.com/openstack/ironic-python-agent/tree/cf30024f96e798f5607c9664b0b6236db3232119/imagebuild/coreos
- https://docs.openstack.org/ironic/stein/admin/multitenancy.html
Troubleshooting
First, make sure all Ironic containers are up and running. Here is a list from a working Stein-based environment:
# docker ps | grep ironic
20505ef5312b   registry.vscaler.com:5000/kolla/centos-source-ironic-pxe:stein             "dumb-init --single-…"   6 weeks ago   Up 4 weeks   ironic_pxe
8f4280b17715   registry.vscaler.com:5000/kolla/centos-source-ironic-api:stein             "dumb-init --single-…"   6 weeks ago   Up 6 weeks   ironic_api
e9e080891270   registry.vscaler.com:5000/kolla/centos-source-ironic-conductor:stein       "dumb-init --single-…"   6 weeks ago   Up 8 days    ironic_conductor
1ccbd8b0e0e5   registry.vscaler.com:5000/kolla/centos-binary-ironic-neutron-agent:stein   "dumb-init --single-…"   6 weeks ago   Up 6 weeks   ironic_neutron_agent
5313cc87fff0   registry.vscaler.com:5000/kolla/centos-binary-nova-compute-ironic:stein    "dumb-init --single-…"   6 weeks ago   Up 8 days    nova_compute_ironic
Re-run kolla-ansible deploy with the tag "ironic" if ironic_pxe, ironic_api or ironic_conductor is missing, "neutron" if ironic_neutron_agent is missing, or "nova" if nova_compute_ironic is missing.
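The container-to-tag mapping above can be expressed as a small shell sketch. tag_for_container and missing_tags are hypothetical helper names, and the inventory path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: prints the kolla-ansible tag to redeploy for a container.
tag_for_container() {
    case "$1" in
        ironic_pxe|ironic_api|ironic_conductor) echo ironic ;;
        ironic_neutron_agent)                   echo neutron ;;
        nova_compute_ironic)                    echo nova ;;
        *) echo "unknown container: $1" >&2; return 1 ;;
    esac
}

# Reads running container names (one per line) on stdin and prints each tag
# that needs a redeploy, without duplicates.
missing_tags() {
    running=$(cat)
    for c in ironic_pxe ironic_api ironic_conductor ironic_neutron_agent nova_compute_ironic; do
        printf '%s\n' "$running" | grep -Fxq "$c" || tag_for_container "$c"
    done | sort -u
}

# Possible usage:
#   docker ps --format '{{.Names}}' | missing_tags
#   kolla-ansible -i <inventory> deploy --tags <tag>
```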
Compare the list of baremetal nodes returned by openstack baremetal node list with the list of hypervisors from openstack hypervisor list and make sure each node's UUID appears in the list of hypervisors. If some do not, check the ironic-conductor logs, fix the problem, then remove the affected nodes and add them back in.
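This comparison can be scripted. The sketch below is only an illustration: nodes_without_hypervisor is a hypothetical helper, the file names are placeholders, and the column names in the usage comment assume the default openstack CLI output.

```shell
#!/bin/sh
# Hypothetical helper: prints every line of file $1 (node UUIDs) that has no
# identical line in file $2 (hypervisor hostnames). One value per line.
nodes_without_hypervisor() {
    # -F fixed strings, -x whole-line match, -v keep only unmatched lines.
    # grep exits non-zero when nothing is printed, which is not an error here.
    grep -Fxv -f "$2" "$1" || true
}

# Possible usage:
#   openstack baremetal node list -f value -c UUID > nodes.txt
#   openstack hypervisor list -f value -c "Hypervisor Hostname" > hypervisors.txt
#   nodes_without_hypervisor nodes.txt hypervisors.txt
```

An empty result means every node is registered as a hypervisor.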
Keep an eye on the iKVM virtual console when building an instance and deal with any DHCP, TFTP or PXE errors that show up. {TODO: Add examples of such problems and solutions to them} If the machine PXE boots successfully, it loads the deploy image, which gives you access to the coreos user. Make sure that the machine can reach the internal OpenStack API.
If your instance shows up as ACTIVE, but you can't ping it, you should delete the instance and create it again using an image that has root password enabled so you can log into the machine through its OOB console. To add a root password to regular cloud images, you can use the virt-customize tool. Here is a command adding a root password to a CentOS 7 image:
virt-customize -a CentOS-7-x86_64-GenericCloud-1907.qcow2 --root-password password:vScaler2019?!
Cloud-init only brings up the first connected interface, so in most cases this means an onboard interface. If this is not the interface you PXE boot from, you may want to bake a custom config for cloud-init, like this one:
network:
  version: 2
  ethernets:
    enp59s0f0:
      dhcp4: true
or this more advanced one, with a bond:
network:
  version: 2
  ethernets:
    eno1:
      dhcp4: true
  bonds:
    bond0:
      interfaces:
        - enp175s0f0
        - enp175s0f1
      dhcp4: true
      parameters:
        mode: 802.3ad
        mii-monitor-interval: 100
Put your config in /etc/cloud/cloud.cfg.d/custom-networking.cfg
Baremetal instances have their ports marked as DOWN in Neutron, like this:
# openstack server list
+--------------------------------------+------------+--------+-----------------------------+--------------+-----------------+
| ID                                   | Name       | Status | Networks                    | Image        | Flavor          |
+--------------------------------------+------------+--------+-----------------------------+--------------+-----------------+
| 8314b478-310b-4d2a-a316-180c380ab5cf | compute001 | ACTIVE | ironic-prov=192.168.202.104 | centos7-1907 | baremetal.small |
+--------------------------------------+------------+--------+-----------------------------+--------------+-----------------+
# openstack port list | grep 192.168.202.104
| 2d43f9c2-e2a1-420c-8bb3-63a2d01e73c0 | | b8:59:9f:e2:37:0e | ip_address='192.168.202.104', subnet_id='c54051ef-6344-4026-80da-529f57df6213' | DOWN |
# openstack baremetal port list | grep b8:59:9f:e2:37:0e
| 957ed1c1-416e-468e-a87b-f14a7eaf5255 | b8:59:9f:e2:37:0e |
# openstack baremetal port show 957ed1c1-416e-468e-a87b-f14a7eaf5255 | grep node
| node_uuid | 1907746d-4663-48d5-9b93-bd63165db382 |
# openstack baremetal node list | grep 1907746d-4663-48d5-9b93-bd63165db382
| 1907746d-4663-48d5-9b93-bd63165db382 | node011 | 8314b478-310b-4d2a-a316-180c380ab5cf | power on | active | False |
This is normal and nothing to worry about. {TODO: Why is this port marked as DOWN?}
Ironic POC basic test environment
Revision of kolla-ansible used (branch stable/rocky):
commit 668da3c332fcd58fa2b023e8bb74ca8225e222bc
Author: Jeffrey Zhang <zhang.lei.fly@gmail.com>
Date: Tue Dec 11 16:01:03 2018 +0800
Add cache configuration for ceilometer project
when using ceilometer+gnocchi, for every notification sample, ceilometer
will update the resource even if is not updated.
We should add [cache] section to make ceilometer cache the resource, and
stop send the useless update request.
Closes-Bug: #1807841
Change-Id: Ic33b4cd5ba8165c20878cab068f38a3948c9d31d
(cherry picked from commit 55bf29ec6c459dc46cefdee69acb8e427763e409)
Standard all-in-one inventory has been used.
kolla-ansible config (/etc/kolla/globals.yml):
---
config_strategy: "COPY_ALWAYS"
kolla_base_distro: "centos"
kolla_install_type: "binary"
openstack_release: "7.0.2"
kolla_internal_vip_address: "192.168.10.254"
kolla_external_vip_address: "172.28.128.254"
docker_registry: "registry.vscaler.com:5000"
network_interface: "enp131s0f1.10"
kolla_external_vip_interface: "eno1"
neutron_external_interface: "enp131s0f0"
neutron_bridge_name: "br-ironic"
neutron_plugin_agent: "openvswitch"
enable_cinder_backup: "no"
enable_haproxy: "yes"
enable_heat: "yes"
enable_horizon: "yes"
enable_horizon_ironic: "{{ enable_ironic | bool }}"
enable_ironic: "yes"
enable_ironic_neutron_agent: "yes"
enable_swift: "no"
tempest_image_id:
tempest_flavor_ref_id:
tempest_public_network_id:
tempest_floating_network_name:
neutron_tenant_network_types: "vlan,flat"
enable_neutron_provider_networks: yes
Config overrides for Ironic (/etc/kolla/config/ironic/ironic-conductor.conf):
[DEFAULT]
my_ip=192.168.10.10
enabled_network_interfaces=noop,flat,neutron
default_network_interface=flat
[deploy]
default_boot_option = netboot
Here, eno1 is the interface providing access to the Ironic host from inside the Labs:
[root@alanis ~]# cat /etc/sysconfig/network-scripts/ifcfg-eno1
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="none"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="eno1"
UUID="7e04ebd9-2e2a-4297-b417-b9cb5e498259"
DEVICE="eno1"
ONBOOT="yes"
IPADDR="172.28.128.1"
PREFIX="16"
GATEWAY="172.28.0.2"
DNS1="172.28.0.2"
DNS2="8.8.8.8"
IPV6_PRIVACY="no"
enp131s0f0 is an interface that is up but has no IP address set (Neutron will attach its external bridge to it), and enp131s0f1.10 is a tagged secondary interface used for hosting the internal OpenStack APIs and Ironic's TFTP server.
[root@alanis ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp131s0f1.10
DEVICE=enp131s0f1.10
NAME=enp131s0f1.10
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.10.10
PREFIX=24
NETWORK=192.168.10.0
VLAN=yes
IPMI to the baremetal host is available through another tagged interface, enp131s0f1.201:
[root@alanis ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp131s0f1.201
DEVICE=enp131s0f1.201
NAME=enp131s0f1.201
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.201.254
PREFIX=24
NETWORK=192.168.201.0
VLAN=yes
Note that because OpenStack here is deployed on a single machine, HAProxy is not strictly required.
If your playbook creates the ironic_dnsmasq container, stop and remove it, so that you don't run into problems with two DHCP servers on the same network.
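One way to do this is sketched below. remove_ironic_dnsmasq is a hypothetical helper name, and the sketch assumes the container is named exactly ironic_dnsmasq and that the docker CLI is available on the Ironic host.

```shell
#!/bin/sh
# Hypothetical helper: stops and removes ironic_dnsmasq if it exists.
# Does nothing when no container with that exact name is present.
remove_ironic_dnsmasq() {
    if docker ps -a --format '{{.Names}}' | grep -Fxq ironic_dnsmasq; then
        docker stop ironic_dnsmasq && docker rm ironic_dnsmasq
    fi
}

# Possible usage, as root on the Ironic host:
#   remove_ironic_dnsmasq
```

A later kolla-ansible deploy run may recreate the container, in which case the step has to be repeated.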
Post deployment setup
To build baremetal images, follow https://docs.openstack.org/ironic/rocky/install/configure-glance-images.html
Here is an example of an Ubuntu image with a Heat agent that allows running script-based software deployments:
$ disk-image-create baremetal ubuntu dhcp-all-interfaces os-collect-config os-refresh-config os-apply-config heat-config heat-config-cfn-init heat-config-script -o ubuntu-software-config-ironic.qcow2
Add images to Glance:
$ openstack image create --container-format aki --disk-format aki --file ~/ubuntu-software-config-ironic.vmlinuz ubuntu-software-config-ironic_kernel
$ openstack image create --container-format ari --disk-format ari --file ~/ubuntu-software-config-ironic.initrd ubuntu-software-config-ironic_initramfs
$ openstack image create --public --container-format bare --disk-format qcow2 --file ~/ubuntu-software-config-dgx.qcow2 \
  --property kernel_id=a8743614-38dc-43ed-8d3b-f4e1b4240eb0 --property ramdisk_id=1d370924-762b-49a0-8619-7f13aa8dafe1 ubuntu-software-config-dgx
In the last command, kernel_id and ramdisk_id are the UUIDs that Glance assigned to the kernel and ramdisk images.
Note on localboot:
The above setup is susceptible to this bug: https://storyboard.openstack.org/#!/story/2002929. To avoid the problem you can set default_boot_option = local in Ironic overrides, so that your baremetal servers will be able to boot from their local disk after they are done provisioning. More importantly, with local boot you can use regular cloud images - without having to extract kernel and ramdisk out of them first (you'll still need the kernel and ramdisk for initial deploy).
This approach will be used in the next section.
Ironic POC with multi-tenancy
TODO: Add diagram
kolla-ansible config (/etc/kolla/globals.yml):
---
config_strategy: "COPY_ALWAYS"
kolla_base_distro: "centos"
kolla_install_type: "binary"
openstack_release: "7.0.2"
kolla_internal_vip_address: "192.168.10.254"
kolla_external_vip_address: "172.28.128.254"
docker_registry: "registry.vscaler.com:5000"
network_interface: "enp131s0f1.10"
kolla_external_vip_interface: "eno1"
neutron_external_interface: "enp131s0f0"
neutron_bridge_name: "br-ironic"
neutron_plugin_agent: "openvswitch"
enable_cinder_backup: "no"
enable_haproxy: "yes"
enable_heat: "yes"
enable_horizon: "yes"
enable_horizon_ironic: "{{ enable_ironic | bool }}"
enable_ironic: "yes"
enable_ironic_neutron_agent: "yes"
enable_swift: "no"
neutron_tenant_network_types: "vlan,flat"
neutron_server_image: "registry.vscaler.com:5000/kolla/centos-source-neutron-server-with-genericswitch"
neutron_server_tag: "7.1.0"
tempest_image_id:
tempest_flavor_ref_id:
tempest_public_network_id:
tempest_floating_network_name:
enable_neutron_provider_networks: yes
Ironic-specific overrides (/etc/kolla/config/ironic/ironic-conductor.conf):
[DEFAULT]
my_ip=192.168.10.10
enabled_network_interfaces=noop,flat,neutron
default_network_interface=neutron
[deploy]
default_boot_option = local
Network interface config is exactly the same as in the previous iteration of the deployment (without multi-tenancy).