OpenStack:Neutron
Jumbo frames MTU in Neutron
First of all, enable jumbo frames (MTU 9000) on your switch and tunnel interfaces on hypervisors.
Then enable jumbo frames on Neutron SDN networks using kolla-ansible with these overrides:
# cat /etc/kolla/config/neutron.conf
[DEFAULT]
global_physnet_mtu = 9000

# cat /etc/kolla/config/neutron/ml2_conf.ini
[ml2]
path_mtu = 9000
To apply these changes, run a kolla-ansible reconfigure with the "neutron,openvswitch" tags.
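The reconfigure step might look like this (the inventory path is an assumption; adjust it to your deployment):

```shell
# Assumption: the multinode inventory lives at /etc/kolla/multinode.
kolla-ansible -i /etc/kolla/multinode reconfigure --tags neutron,openvswitch
```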
When this is done, you can either create a new network:
openstack network create <name-for-the-new-network> --mtu 8950
or change MTU on an existing network:
openstack network set --mtu 8950 <name-of-an-existing-network>
NOTE: MTU on Neutron networks is set to 8950 (and not 9000) to account for an overhead from the VXLAN encapsulation header.
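The 50-byte figure comes from the VXLAN-over-IPv4 encapsulation stack: outer Ethernet (14) + outer IPv4 (20) + UDP (8) + VXLAN header (8). A quick sanity check of the arithmetic, plus a path-MTU test you can run from inside an instance (the peer IP is hypothetical):

```shell
# VXLAN overhead: outer Ethernet (14) + IPv4 (20) + UDP (8) + VXLAN (8) = 50 bytes
overhead=$((14 + 20 + 8 + 8))
echo $((9000 - overhead))   # maximum safe tenant-network MTU: 8950

# From inside an instance, verify the path MTU with the DF bit set.
# ICMP echo adds 8 bytes of ICMP header + 20 bytes of IP, so the
# largest payload that fits is 8950 - 28 = 8922.
# ping -M do -s 8922 192.168.8.101   # hypothetical peer IP
```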
NOTE: The MTU on the interfaces of existing instances won't change until the next DHCP lease renewal. Either log into the instance and force a lease renewal, or simply restart the instance. You can also set the MTU manually with the ip link set <interface-name> mtu 8950 command.
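Forcing the renewal from inside the guest might look like this (assumes a CentOS 7 guest using dhclient and an interface named eth0):

```shell
# Release and re-request the lease; the new MTU arrives via DHCP option 26.
sudo dhclient -r eth0 && sudo dhclient eth0
# Confirm the new MTU took effect:
ip link show eth0 | grep -o 'mtu [0-9]*'
```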
L2 Gateway (L2GW)
L2GW POC diagram:
Traffic coming out of the instance running on the hypervisor is VXLAN-encapsulated and arrives at the switch over the 10.1.10.0/24 network. There the OVSDB server translates the VXLAN VNI to a VLAN tag, and the traffic is sent on to the baremetal node over a VLAN-tagged interface.
Switch config:
switch-2e63f4 [standalone: master] (config) # show version concise
X86_64 3.9.0300 2020-02-26 19:25:24 x86_64
switch-2e63f4 [standalone: master] (config) # vlan 11
switch-2e63f4 [standalone: master] (config vlan 11) # exit
switch-2e63f4 [standalone: master] (config) # interface vlan 11 ip address 10.1.10.1/24
switch-2e63f4 [standalone: master] (config) # interface vlan 11 no shutdown
switch-2e63f4 [standalone: master] (config) # interface ethernet 1/2 switchport mode access
switch-2e63f4 [standalone: master] (config) # interface ethernet 1/2 switchport access vlan 11
switch-2e63f4 [standalone: master] (config) # interface loopback 1 ip address 1.1.1.1/32
switch-2e63f4 [standalone: master] (config) # ip routing vrf default
switch-2e63f4 [standalone: master] (config) # protocol nve
switch-2e63f4 [standalone: master] (config) # interface nve 1
switch-2e63f4 [standalone: master] (config interface nve 1) # vxlan source interface loopback 1
switch-2e63f4 [standalone: master] (config interface nve 1) # exit
switch-2e63f4 [standalone: master] (config) # interface ethernet 1/4 no switchport access vlan
switch-2e63f4 [standalone: master] (config) # interface ethernet 1/4 nve mode only force
switch-2e63f4 [standalone: master] (config) # ovs ovsdb server
switch-2e63f4 [standalone: master] (config) # ovs ovsdb server listen tcp port 6640
switch-2e63f4 [standalone: master] (config) #
On the baremetal node:
[centos compute-intel03 ~]$ sudo ip link set ens1f0 up
[centos compute-intel03 ~]$ sudo ip address flush ens1f0
[centos compute-intel03 ~]$ ip address show ens1f0
4: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
link/ether 98:03:9b:91:d7:ae brd ff:ff:ff:ff:ff:ff
[centos compute-intel03 ~]$ sudo ip link add link ens1f0 name ens1f0.8 type vlan id 8
[centos compute-intel03 ~]$ sudo ip link set ens1f0.8 up
[centos compute-intel03 ~]$ sudo ip address add 192.168.8.13/24 dev ens1f0.8
[centos compute-intel03 ~]$ sudo ip link set ens1f0.8 mtu 9000
[centos compute-intel03 ~]$ ip address show ens1f0.8
20: ens1f0.8@ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether 98:03:9b:91:d7:ae brd ff:ff:ff:ff:ff:ff
inet 192.168.8.13/24 scope global ens1f0.8
valid_lft forever preferred_lft forever
inet6 fe80::9a03:9bff:fe91:d7ae/64 scope link
valid_lft forever preferred_lft forever
[centos compute-intel03 ~]$
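The ip commands above are lost on reboot. On CentOS 7 the same VLAN interface can be made persistent with an ifcfg file (a sketch; the values are taken from the addressing used above):

```shell
# Persist the VLAN 8 sub-interface across reboots (sketch).
sudo tee /etc/sysconfig/network-scripts/ifcfg-ens1f0.8 <<'EOF'
DEVICE=ens1f0.8
VLAN=yes
BOOTPROTO=none
IPADDR=192.168.8.13
PREFIX=24
MTU=9000
ONBOOT=yes
EOF
# Bring it up without a full network restart:
# sudo ifup ens1f0.8
```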
On the hypervisor:
[centos compute-intel02 ~]$ ip address show ens1f0
4: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
link/ether 98:03:9b:91:c4:ae brd ff:ff:ff:ff:ff:ff
inet 10.1.10.12/24 brd 10.1.10.255 scope global ens1f0
valid_lft forever preferred_lft forever
inet6 fe80::9a03:9bff:fe91:c4ae/64 scope link
valid_lft forever preferred_lft forever
[centos compute-intel02 ~]$ sudo ip r add 1.1.1.1/32 via 10.1.10.1
[centos compute-intel02 ~]$ ping -c 3 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.144 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=64 time=0.131 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=64 time=0.133 ms
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.131/0.136/0.144/0.005 ms
[centos compute-intel02 ~]$
On the controller (no L2GW agent Kolla images are currently available, so the agent is run from a virtualenv on the host):
(venv) [centos baremetal-network0 ~]$ sudo docker ps -a | grep neutron_server
2e12640b0c9b registry.vscaler.com:5000/kolla/centos-binary-neutron-server:train "dumb-init --single-…" 3 months ago Up 8 seconds neutron_server
(venv) [centos baremetal-network0 ~]$ sudo cp -a /etc/kolla/neutron-server/neutron.conf{,.orig}
(venv) [centos baremetal-network0 ~]$ sudo vim /etc/kolla/neutron-server/neutron.conf
(venv) [centos baremetal-network0 ~]$ sudo diff -U0 /etc/kolla/neutron-server/neutron.conf{.orig,}
--- /etc/kolla/neutron-server/neutron.conf.orig
+++ /etc/kolla/neutron-server/neutron.conf
@@ -15 +15 @@
-service_plugins = router
+service_plugins = router,networking_l2gw.services.l2gateway.plugin.L2GatewayPlugin
(venv) [centos baremetal-network0 ~]$ sudo cp -a /etc/kolla/neutron-server/config.json{,.orig}
(venv) [centos baremetal-network0 ~]$ sudo diff -U0 /etc/kolla/neutron-server/config.json{.orig,}
--- /etc/kolla/neutron-server/config.json.orig
+++ /etc/kolla/neutron-server/config.json
@@ -2 +2 @@
- "command": "neutron-server --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --config-file /etc/neutron/neutron_lbaas.conf --config-file /etc/neutron/neutron_vpnaas.conf",
+ "command": "neutron-server --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --config-file /etc/neutron/neutron_lbaas.conf --config-file /etc/neutron/neutron_vpnaas.conf --config-file=/etc/neutron/l2gw_plugin.ini",
@@ -24,0 +25,12 @@
+ "owner": "neutron",
+ "perm": "0600"
+ },
+ {
+ "source": "/var/lib/kolla/config_files/l2gw_plugin.ini",
+ "dest": "/etc/neutron/l2gw_plugin.ini",
+ "owner": "neutron",
+ "perm": "0600"
+ },
+ {
+ "source": "/var/lib/kolla/config_files/l2gateway_agent.ini",
+ "dest": "/etc/neutron/l2gateway_agent.ini",
(venv) [centos baremetal-network0 ~]$ grep -E -v '^$|^#' /etc/kolla/neutron-server/l2gw_plugin.ini
[DEFAULT]
[service_providers]
service_provider=L2GW:l2gw:networking_l2gw.services.l2gateway.service_drivers.rpc_l2gw.L2gwRpcDriver:default
(venv) [centos baremetal-network0 ~]$ sudo grep -E -v '^#|^$' /etc/kolla/neutron-server/l2gateway_agent.ini
[DEFAULT]
debug = True
[ovsdb]
ovsdb_hosts = 'ovsdb1:10.1.5.1:6640'
(venv) [centos baremetal-network0 ~]$ sudo docker restart neutron_server
neutron_server
(venv) [centos baremetal-network0 ~]$ sudo docker exec -it -u root neutron_server neutron-db-manage upgrade heads
...
(venv) [centos baremetal-network0 ~]$ sudo mkdir -p /opt/neutron /opt/neutron/log
(venv) [centos baremetal-network0 ~]$ sudo cp -a /etc/kolla/neutron-server/neutron.conf /opt/neutron/neutron.conf
(venv) [centos baremetal-network0 ~]$ sudo vim /opt/neutron/neutron.conf
(venv) [centos baremetal-network0 ~]$ sudo diff -U0 /etc/kolla/neutron-server/neutron.conf /opt/neutron/neutron.conf
--- /etc/kolla/neutron-server/neutron.conf
+++ /opt/neutron/neutron.conf
@@ -3 +3 @@
-log_dir = /var/log/kolla/neutron
+log_dir = /opt/neutron/log
@@ -7 +7 @@
-api_paste_config = /usr/share/neutron/api-paste.ini
+api_paste_config = /opt/neutron/api-paste.ini
@@ -13 +13 @@
-metadata_proxy_socket = /var/lib/neutron/kolla/metadata_proxy
+metadata_proxy_socket = /opt/neutron/metadata_proxy
@@ -36 +36 @@
-lock_path = /var/lib/neutron/tmp
+lock_path = /opt/neutron/tmp
@@ -39 +39 @@
-root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
+root_helper = sudo sh -c ". /home/centos/venv/bin/activate; which neutron-rootwrap /opt/neutron/rootwrap.conf"
@@ -74,2 +74 @@
-helper_command = sudo neutron-rootwrap /etc/neutron/rootwrap.conf privsep-helper
-
+helper_command = sudo sh -c ". /home/centos/venv/bin/activate; which neutron-rootwrap /opt/neutron/rootwrap.conf privsep-helper"
(venv) [centos baremetal-network0 ~]$ sudo cp -a /etc/kolla/neutron-server/l2gw_plugin.ini /opt/neutron/l2gw_plugin.ini
(venv) [centos baremetal-network0 ~]$ sudo cp -a /etc/kolla/neutron-server/l2gateway_agent.ini /opt/neutron/l2gateway_agent.ini
(venv) [centos baremetal-network0 ~]$ sudo chown -R centos /opt/neutron
(venv) [centos baremetal-network0 ~]$ pip install networking-l2gw==15.0.0
...
(venv) [centos baremetal-network0 ~]$ screen -dmS l2gw-agent neutron-l2gateway-agent --debug --config-file /opt/neutron/neutron.conf --config-file /opt/neutron/l2gateway_agent.ini
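With the agent running, it should register itself with Neutron; a quick check (output varies by deployment):

```shell
# The L2 Gateway agent should appear alongside the other Neutron agents:
openstack network agent list | grep -i 'l2'
# To reattach to the agent's screen session and inspect its debug output:
# screen -r l2gw-agent
```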
(venv) [centos baremetal-network0 ~]$ openstack network create l2gwnet --provider-network-type vxlan --internal --mtu 1450
...
(venv) [centos baremetal-network0 ~]$ openstack subnet create --dhcp --allocation-pool start=192.168.8.100,end=192.168.8.200 --network l2gwnet --subnet-range 192.168.8.0/24 --dns-nameserver 8.8.8.8 l2gwsubnet
...
(venv) [centos baremetal-network0 networking-l2gw]$ openstack server create --image centos7-1907-scsi --flavor m1.small --security-group ping-and-ssh --key-name mykey --network l2gwnet test_l2gw --availability-zone nova:compute-intel02.novalocal
...
(venv) [centos baremetal-network0 ~]$ openstack l2gw create --device name="vtep0",interface_names="eth4" MLNX-GW-ETH4
...
(venv) [centos baremetal-network0 ~]$ openstack l2gw connection create --default-segmentation-id 8 MLNX-GW-ETH4 l2gwnet
...
(venv) [centos baremetal-network0 ~]$ openstack l2gw connection list
+--------------------------------------+----------------------------------+--------------------------------------+--------------------------------------+-----------------+
| ID | Tenant | L2 GateWay ID | Network ID | Segmentation ID |
+--------------------------------------+----------------------------------+--------------------------------------+--------------------------------------+-----------------+
| 8f89f1b8-50fe-4aa5-95a4-3b6d0c45b828 | 5c2c12b1d2ce4ae899aa671652428732 | 87bf7475-4edf-4524-9bc6-89849e2e6ed8 | f3c12c25-4fab-46a3-bf82-03350cb91d9e | 8 |
+--------------------------------------+----------------------------------+--------------------------------------+--------------------------------------+-----------------+
(venv) [centos baremetal-network0 ~]$
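To verify things from the switch side, you can dump the hardware_vtep database from the OVSDB server the agent talks to (address taken from ovsdb_hosts above; requires the ovsdb-client tool from the openvswitch package):

```shell
# Dumps the VTEP tables, including the Logical_Switch entry created for
# the network's VXLAN VNI and the learned/remote MAC tables.
ovsdb-client dump tcp:10.1.5.1:6640 hardware_vtep
```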
To test the setup, launch an instance on the l2gwnet network and ping the baremetal node at its VLAN-tagged IP in the 192.168.8.0/24 network.
(venv) [centos baremetal-network0 ~]$ openstack server create --image centos7-1907-scsi --flavor m1.small --security-group ping-and-ssh --key-name mykey --network l2gwnet test_intel02 --availability-zone nova:compute-intel02.novalocal
...
(venv) [centos baremetal-network0 ~]$ openstack server add floating ip test_intel02 172.28.139.44
(venv) [centos baremetal-network0 ~]$ ssh centos@172.28.139.44
...
[centos test-intel02 ~]$ ping -c 3 192.168.8.13
PING 192.168.8.13 (192.168.8.13) 56(84) bytes of data.
64 bytes from 192.168.8.13: icmp_seq=1 ttl=64 time=0.491 ms
64 bytes from 192.168.8.13: icmp_seq=2 ttl=64 time=0.244 ms
64 bytes from 192.168.8.13: icmp_seq=3 ttl=64 time=0.224 ms
--- 192.168.8.13 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.224/0.319/0.491/0.123 ms
[centos test-intel02 ~]$
References
- https://community.mellanox.com/s/article/howto-configure-openstack-l2-gateway-with-mellanox-spectrum-switch--vtep-x
- https://opendev.org/x/networking-l2gw
ASAP^2
ASAP^2 POC diagram:
[centos compute-intel02 ~]$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[centos compute-intel02 ~]$ uname -a
Linux compute-intel02.novalocal 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[centos compute-intel02 ~]$ lspci -nn | grep Mellanox
81:00.0 Ethernet controller [0200]: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [15b3:1019]
81:00.1 Ethernet controller [0200]: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [15b3:1019]
[centos compute-intel02 ~]$
First, log into your hypervisors and add intel_iommu=on iommu=pt to GRUB_CMDLINE_LINUX in /etc/default/grub to enable IOMMU. Run grub2-mkconfig -o /boot/grub2/grub.cfg to generate a new grub config.
Next up, reboot your hypervisors and make sure SR-IOV is enabled in BIOS ("Advanced" -> "PCIe/PCI/PnP Configuration" -> "SR-IOV Support").
When the node is back up, you can enable SR-IOV on your ConnectX-5 Mellanox cards (you may need to update NIC firmware to at least version 16.21.0338 first):
[centos compute-intel02 ~]$ sudo yum install mstflint -y
...
[centos compute-intel02 ~]$ rpm -qa | grep mstflint
mstflint-4.13.3-2.el7.x86_64
[centos compute-intel02 ~]$ lspci -nn | grep Mellanox
81:00.0 Ethernet controller [0200]: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [15b3:1019]
81:00.1 Ethernet controller [0200]: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [15b3:1019]
[centos compute-intel02 ~]$ sudo mstflint -d 81:00.0 q
Image type: FS4
FW Version: 16.23.1020
FW Release Date: 10.7.2018
Product Version: 16.23.1020
Rom Info: type=UEFI version=14.16.17 cpu=AMD64
type=PXE version=3.5.504 cpu=AMD64
Description: UID GuidsNumber
Base GUID: 98039b030091c4ae 8
Base MAC: 98039b91c4ae 8
Image VSD: N/A
Device VSD: N/A
PSID: MT_0000000013
Security Attributes: N/A
[centos compute-intel02 ~]$ sudo mstconfig -d 81:00.0 set SRIOV_EN=1 NUM_OF_VFS=8
...
The last command will ask you to reboot the node. Do so; once the node is back up, run these commands to create the VFs and enable OVS offload:
[centos compute-intel02 ~]$ echo 2 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs
2
[centos compute-intel02 ~]$ ip l sh ens1f0
4: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether 98:03:9b:91:c4:ae brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off
vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off
[centos compute-intel02 ~]$ sudo ip l set ens1f0 vf 0 mac e4:11:22:33:44:60
[centos compute-intel02 ~]$ sudo ip l set ens1f0 vf 1 mac e4:11:22:33:44:61
[centos compute-intel02 ~]$ ip l sh ens1f0
4: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether 98:03:9b:91:c4:ae brd ff:ff:ff:ff:ff:ff
vf 0 MAC e4:11:22:33:44:60, spoof checking off, link-state auto, trust off, query_rss off
vf 1 MAC e4:11:22:33:44:61, spoof checking off, link-state auto, trust off, query_rss off
[centos compute-intel02 ~]$ ls -l /sys/class/net/ | grep 81:00
lrwxrwxrwx. 1 root root 0 Nov 13 13:27 ens1f0 -> ../../devices/pci0000:80/0000:80:02.0/0000:81:00.0/net/ens1f0
lrwxrwxrwx. 1 root root 0 Nov 13 13:27 ens1f1 -> ../../devices/pci0000:80/0000:80:02.0/0000:81:00.1/net/ens1f1
lrwxrwxrwx. 1 root root 0 Nov 13 13:29 eth0 -> ../../devices/pci0000:80/0000:80:02.0/0000:81:00.2/net/eth0
lrwxrwxrwx. 1 root root 0 Nov 13 13:29 eth1 -> ../../devices/pci0000:80/0000:80:02.0/0000:81:00.3/net/eth1
[centos compute-intel02 ~]$ echo '0000:81:00.2' | sudo tee /sys/bus/pci/drivers/mlx5_core/unbind
0000:81:00.2
[centos compute-intel02 ~]$ echo '0000:81:00.3' | sudo tee /sys/bus/pci/drivers/mlx5_core/unbind
0000:81:00.3
[centos compute-intel02 ~]$ sudo devlink dev eswitch set pci/0000:81:00.0 mode switchdev
[centos compute-intel02 ~]$ sudo ethtool -K ens1f0 hw-tc-offload on
[centos compute-intel02 ~]$ echo '0000:81:00.2' | sudo tee /sys/bus/pci/drivers/mlx5_core/bind
0000:81:00.2
[centos compute-intel02 ~]$ echo '0000:81:00.3' | sudo tee /sys/bus/pci/drivers/mlx5_core/bind
0000:81:00.3
[centos compute-intel02 ~]$ sudo docker exec -it -u root neutron_openvswitch_agent ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
[centos compute-intel02 ~]$ sudo docker exec -it -u root neutron_openvswitch_agent ovs-vsctl set Open_vSwitch . other_config:max-idle=30000
[centos compute-intel02 ~]$ sudo docker restart neutron_openvswitch_agent openvswitch_vswitchd openvswitch_db
neutron_openvswitch_agent
openvswitch_vswitchd
openvswitch_db
[centos compute-intel02 ~]$
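After the restart it is worth sanity-checking that the offload knobs actually took effect (a quick verification, not part of the original transcript):

```shell
# Should print "true" once the vswitchd container has picked up the setting:
sudo docker exec -it openvswitch_vswitchd ovs-vsctl get Open_vSwitch . other_config:hw-offload
# The PF should report "hw-tc-offload: on":
sudo ethtool -k ens1f0 | grep hw-tc-offload
# And the eswitch should be in switchdev mode:
sudo devlink dev eswitch show pci/0000:81:00.0
```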
NOTE: The above config will not survive a reboot.
TODO: Find a way to make SR-IOV config persistent.
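One possible approach (an untested sketch) is a systemd oneshot unit that replays the sequence above at boot, before the Kolla Docker containers come up; the unit name, script path, and ordering on docker.service are all assumptions:

```shell
# Sketch: run an SR-IOV/switchdev setup script once per boot, before Docker.
sudo tee /etc/systemd/system/asap2-sriov.service <<'EOF'
[Unit]
Description=Re-create SR-IOV VFs and enable switchdev for ASAP2 (sketch)
After=network.target
Before=docker.service

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/asap2-sriov.sh

[Install]
WantedBy=multi-user.target
EOF
# /usr/local/sbin/asap2-sriov.sh would contain the sriov_numvfs, VF MAC,
# unbind, "devlink ... mode switchdev", ethtool and bind steps shown above.
sudo systemctl enable asap2-sriov.service
```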
Finally, enable ASAP^2 in Nova and create instances making use of it:
On hypervisors:
[centos compute-intel02 ~]$ sudo vim /etc/kolla/nova-compute/nova.conf
[centos compute-intel02 ~]$ sudo grep -B1 15b3 /etc/kolla/nova-compute/nova.conf
[pci]
passthrough_whitelist = { "vendor_id": "15b3", "product_id": "*" }
[centos compute-intel02 ~]$ sudo docker restart nova_compute
nova_compute
[centos compute-intel02 ~]$
On the controller:
(venv) [centos baremetal-network0 ~]$ openstack network create asap2net --provider-network-type vxlan --internal --mtu 8950
...
(venv) [centos baremetal-network0 ~]$ openstack subnet create --dhcp --allocation-pool start=192.168.200.100,end=192.168.200.200 --network asap2net --subnet-range 192.168.200.0/24 asap2subnet
...
(venv) [centos baremetal-network0 ~]$ openstack port create --network asap2net --vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' --no-security-group --disable-port-security asap2port1
...
(venv) [centos baremetal-network0 ~]$ openstack port create --network asap2net --vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' --no-security-group --disable-port-security asap2port2
...
(venv) [centos baremetal-network0 ~]$ openstack server create --image centos7-rootpass-dhcp-on-all-nics-raid-v2-ofed --flavor bs.4x8Gx40GiB --security-group ping-and-ssh --security-group IPERF --key-name mykey --network internal --port asap2port1 test_asap_intel02 --availability-zone nova:compute-intel02.novalocal
...
(venv) [centos baremetal-network0 ~]$ openstack server create --image centos7-rootpass-dhcp-on-all-nics-raid-v2-ofed --flavor bs.4x8Gx40GiB --security-group ping-and-ssh --security-group IPERF --key-name mykey --network internal --port asap2port2 test_asap_intel03 --availability-zone nova:compute-intel03.novalocal
...
(venv) [centos baremetal-network0 ~]$ openstack server add floating ip test_asap_intel02 172.28.139.44
(venv) [centos baremetal-network0 ~]$ openstack server list | grep asap
| 636644f3-45de-4894-bb8b-815c0bc46aeb | test_asap_intel03 | ACTIVE | asap2net=192.168.200.154; internal=192.168.100.163 | centos7-rootpass-dhcp-on-all-nics-raid-v2-ofed | bs.4x8Gx40GiB |
| 6cf9c75f-47f4-4c5d-8965-31345652d804 | test_asap_intel02 | ACTIVE | asap2net=192.168.200.159; internal=192.168.100.70, 172.28.139.44 | centos7-rootpass-dhcp-on-all-nics-raid-v2-ofed | bs.4x8Gx40GiB |
(venv) [centos baremetal-network0 ~]$
NOTE: For OVS offloading to work, both security groups and port security must be disabled on the ASAP^2 port.
Log into the instance with the floating IP and run tests (ping, iperf, sockperf, etc) against the IP of the other instance that's in the asap2net.
To confirm that OVS flows are being offloaded, either run tcpdump on the Mellanox NIC PF hosting the VFs (you should then see only the first ICMP packet of each flow in the capture) or run this command to list all the offloaded flows:
[centos compute-intel02 ~]$ sudo docker exec -it -u root openvswitch_vswitchd ovs-appctl dpctl/dump-flows type=offloaded
tunnel(tun_id=0x68,src=10.1.10.12,dst=10.1.10.13,tp_dst=4789,flags(+key)),in_port(4),eth(src=fa:16:3e:2a:c9:67,dst=fa:16:3e:d5:5f:3e),eth_type(0x0800),ipv4(frag=no), packets:7722773, bytes:68951928142, used:8.650s, actions:9
[centos compute-intel03 ~]$
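The tcpdump variant of the check might look like this on the hypervisor, run while pinging between the two instances:

```shell
# With offload active, only the first ICMP packet of each flow is handled
# by the kernel datapath and therefore visible on the PF; the rest are
# forwarded in NIC hardware and never reach tcpdump.
sudo tcpdump -i ens1f0 -n -c 10 icmp
```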