VRRP tests in mixed environment

Latest revision as of 23:48, 8 January 2019

Motivation

The goal is to find a way to eliminate the single point of failure that the headnodes create in the trial platform. Keepalived already uses VRRP to fail over HAProxy virtual IPs (VIPs), so can we make use of it?

All you need to know about VRRP (for this task)

Hosts first join the 224.0.0.18 multicast group. VRRP advertisements are sent to this address by the Master and received by the Backups. If multiple hosts are sending advertisements, the one with the highest priority is selected as the Master. If all senders have the same priority, the host with the highest IP address is selected as the Master.
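
The selection rule above can be sketched in a few lines of Python. This is only an illustration of the ordering (priority first, then IP address), not how keepalived or the switch implement it: in real VRRP each host decides for itself from the advertisements it hears. The `Router` class, the labels `vm-a`, `vm-b` and `switch`, and the priority assigned to `vm-b` are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str      # hypothetical label, not part of the protocol
    address: str   # primary IPv4 address, used as the tie-breaker
    priority: int  # VRRP priority; a higher value wins


def elect_master(routers):
    """Pick the winner: highest priority, then highest IP address."""
    return max(
        routers,
        key=lambda r: (r.priority, ipaddress.IPv4Address(r.address)),
    )


candidates = [
    Router("vm-a", "172.28.0.136", 150),
    Router("vm-b", "172.28.1.27", 150),   # priority assumed equal to vm-a
    Router("switch", "172.28.0.198", 1),
]

print(elect_master(candidates).name)
```

Comparing `IPv4Address` objects rather than strings matters here: as strings, "172.28.0.136" and "172.28.1.27" would be compared character by character rather than numerically.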

There can be multiple VRRP groupings of Masters and Backups (called "instances"), and each of them is identified by its vrID.

Each advertisement also includes a list of virtual IPs and authentication data (for example, a secret shared between the hosts that are part of the instance).

A sample VRRP packet:

11:04:02.958005 52:54:00:a7:12:5c > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 58: (tos 0xc0, ttl 255, id 98, offset 0, flags [none], proto VRRP (112), length 44)
    172.28.0.136 > 224.0.0.18: vrrp 172.28.0.136 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 24, addrs(1): 100.100.100.100 auth "mvDvaixF"
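
The destination MAC in the capture (01:00:5e:00:00:12) follows from the standard mapping of IPv4 multicast groups onto Ethernet addresses: the fixed prefix 01:00:5e followed by the low 23 bits of the group address. A minimal sketch of that mapping, with `multicast_mac` being a name chosen for this example:

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # only the low 23 bits are carried over
    mac = (0x01005E << 24) | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(multicast_mac("224.0.0.18"))  # 01:00:5e:00:00:12
```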

VRRP across servers and switches POC

Because VRRP is an open standard, servers with keepalived installed can send valid VRRP packets that are then interpreted by physical switches.

The test setup consisted of a physical L3 switch and 2 virtual machines running on the same host and connected to an L2 bridge.

HAProxy and keepalived were installed and configured on the VMs using kolla-ansible.

VRRP was configured on the switch to match the keepalived configuration, using the following commands:

SMIS(config)# router vrrp
SMIS(config-vrrp)# interface vlan 1
SMIS(config-vrrp-if)# vrrp 51 ipv4 100.100.100.100
SMIS(config-vrrp-if)# vrrp 51 priority 1
SMIS(config-vrrp-if)# vrrp 51 text-authentication mvDvaixF
SMIS(config-vrrp-if)# vrrp 51 timer 1

(100.100.100.100 is the HAProxy VIP)
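
For the switch and keepalived to take part in the same instance, the vrID, virtual IP, authentication string and advertisement interval have to agree, while the priority is expected to differ. The sketch below compares the values from the sample packet with those entered on the switch. The dictionary layout and the name `mismatched_fields` are assumptions made for this illustration and do not correspond to any keepalived or switch API.

```python
# Fields that both sides of a VRRP instance are expected to share.
SHARED_FIELDS = ("vrid", "virtual_ips", "auth", "advert_interval")


def mismatched_fields(a: dict, b: dict) -> list:
    """Return the shared fields whose values differ between two configs."""
    return [field for field in SHARED_FIELDS if a[field] != b[field]]


# Values taken from the sample VRRP packet sent by keepalived.
keepalived = {
    "vrid": 51,
    "virtual_ips": {"100.100.100.100"},
    "auth": "mvDvaixF",
    "advert_interval": 1,
    "priority": 150,
}

# Values taken from the switch commands.
switch = {
    "vrid": 51,
    "virtual_ips": {"100.100.100.100"},
    "auth": "mvDvaixF",
    "advert_interval": 1,
    "priority": 1,
}

print(mismatched_fields(keepalived, switch))  # []
```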

The already configured management VLAN (ID 1) was used here for simplicity:

SMIS(config)# show ip interface vlan 1

vlan1 is up, line protocol is up
Internet Address is 172.28.0.198/16
Broadcast Address  172.28.255.255
IP address allocation method is dynamic
IP address allocation protocol is dhcp

With all the above done, one of the VMs becomes the Master and gets the VIP:

root@ubuntu:~# ip a s ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:a7:12:5c brd ff:ff:ff:ff:ff:ff
    inet 172.28.0.136/16 brd 172.28.255.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet 100.100.100.100/32 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fea7:125c/64 scope link
       valid_lft forever preferred_lft forever

The other server has only its internal IP on the corresponding interface, and the switch becomes a Backup:

SMIS(config-vrrp-if)# show vrrp detail

vlan1  - vrID 51
 ---------------
  State is Backup
  Virtual IP address is 100.100.100.100
  Virtual MAC address is 00:00:5e:00:01:33
  Master router is 100.100.100.100
 Associated IpAddresses :
 ----------------------
100.100.100.100
  Advertise time is 1 secs
  Current priority is 1
  Configured priority is 1
  Configured Authentication
  Authentication key is mvDvaixF
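
The virtual MAC address in this listing is derived from the vrID: for IPv4, VRRP uses the prefix 00:00:5e:00:01 followed by the vrID as the last byte (51 is 0x33). A small sketch, with `vrrp_virtual_mac` being a name chosen for this example:

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC address for a given vrID."""
    if not 1 <= vrid <= 255:
        raise ValueError("vrID must be between 1 and 255")
    return f"00:00:5e:00:01:{vrid:02x}"


print(vrrp_virtual_mac(51))  # 00:00:5e:00:01:33
```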

Stopping keepalived on the Master moves the VIP to the other server. When no server holds the VIP, the switch becomes the Master. As soon as keepalived comes back up on either server, the switch transitions to Backup again and that server becomes the new Master.

Unfortunately, with this setup, the switch doesn't seem to keep track of the internal IP of the current Master (as is evident from the previous listing, where the "Master router is" field shows the VIP), and it doesn't add a route describing how to reach the VIP:

SMIS(config-vrrp-if)# show ip route

S 0.0.0.0/0  [1] via 172.28.0.2
C 172.28.0.0/16 is directly connected, vlan1

VIP advertisement with Quagga

The basics

This description assumes that the reader has an understanding of the following:

A routing table is a list of routes a router knows about.
Any network a router is connected to directly (meaning the network is configured on one of its interfaces), gets an automatic entry in the routing table specifying which interface the network is configured on. Example: C 172.28.0.0/16 is directly connected, vlan1 means the network 172.28.0.0/16 is configured on interface vlan1.
The goal of routing is to exchange routing tables between routers, so that each router knows which of its neighbours is the "next hop" for a packet, based on the network that the packet's destination IP address belongs to. This includes updating routing tables if a route fails or changes.
RIP is one of the so-called "routing protocols" that provide an algorithm for exchanging routing tables. RIP (like many other routing protocols) is an open standard, i.e. its implementation is not tied to a specific router vendor.
Quagga is simply a set of software daemons that implement routing protocols, so that any Linux machine can participate in the exchange of routes. Quagga's daemon implementing RIP is called ripd.
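
To make the "next hop" idea concrete, here is a minimal sketch of how a router picks a route for a destination: among all entries whose network contains the destination address, the most specific one (longest prefix) wins. The entries are copied from the switch routing table shown later on this page; the `ROUTES` layout and the `lookup` function are assumptions made for this illustration.

```python
import ipaddress

# (network, what the router does with matching packets)
ROUTES = [
    ("0.0.0.0/0", "via 172.28.0.2"),
    ("100.100.100.100/32", "via 172.28.1.27"),
    ("172.28.0.0/16", "directly connected, vlan1"),
]


def lookup(destination: str, routes=ROUTES):
    """Return the action of the longest-prefix route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, action in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, action))
    if not matches:
        return None
    # The most specific (longest) prefix takes precedence.
    return max(matches, key=lambda match: match[0])[1]


print(lookup("100.100.100.100"))  # via 172.28.1.27
print(lookup("172.28.1.49"))      # directly connected, vlan1
print(lookup("8.8.8.8"))          # via 172.28.0.2
```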

The solution

The setup is the same as for the VRRP part (2 VMs on a bridge and a physical L3 switch), but this time Quagga is installed on both VMs:

root@ubuntu:~# apt-get install quagga

The ripd daemon is used because it's the easiest to configure:

root@ubuntu:~# grep -E -v '^#|^$' /etc/quagga/daemons
zebra=yes
bgpd=no
ospfd=no
ospf6d=no
ripd=yes
ripngd=no
isisd=no
babeld=no

Quagga's configuration (the same on both servers):

root@ubuntu:~# cp /usr/share/doc/quagga/examples/zebra.conf.sample /etc/quagga/zebra.conf
root@ubuntu:~# cat /etc/quagga/ripd.conf
router rip
 network 100.100.100.100/32
root@ubuntu:~# chown quagga.quaggavty /etc/quagga/*.conf
root@ubuntu:~# chmod 640 /etc/quagga/*.conf
root@ubuntu:~# systemctl restart quagga

This server keeps hold of the 100.100.100.100 VIP:

root@ubuntu:~# ip address show ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:8b:d5:79 brd ff:ff:ff:ff:ff:ff
    inet 172.28.1.27/16 brd 172.28.255.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet 100.100.100.100/32 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe8b:d579/64 scope link
       valid_lft forever preferred_lft forever
root@ubuntu:~# ip route
default via 172.28.0.2 dev ens3
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 linkdown
172.28.0.0/16 dev ens3  proto kernel  scope link  src 172.28.1.27
root@ubuntu:~# vtysh -c "show ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, A - Babel,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 via 172.28.0.2, ens3
C>* 100.100.100.100/32 is directly connected, ens3
C>* 127.0.0.0/8 is directly connected, lo
C>* 172.17.0.0/16 is directly connected, docker0
C>* 172.28.0.0/16 is directly connected, ens3

Notice that there is no kernel route to 100.100.100.100/32, but Quagga's table shows the IP as directly connected.
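
One plausible reading of this difference (an assumption, not verified against the Quagga sources) is that zebra lists the prefix of every address configured on an interface as a connected route, whereas the kernel's main table only gains a prefix route when the prefix covers more than the address itself. The sketch below derives the connected prefixes from the two addresses on ens3; `connected_prefixes` is a name chosen for this example.

```python
import ipaddress


def connected_prefixes(interface_addresses):
    """Return the distinct networks implied by a list of address/prefixlen strings."""
    networks = {ipaddress.ip_interface(a).network for a in interface_addresses}
    return [str(n) for n in sorted(networks)]


# Addresses configured on ens3 in the listing above.
print(connected_prefixes(["172.28.1.27/16", "100.100.100.100/32"]))
# ['100.100.100.100/32', '172.28.0.0/16']
```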

RIP needs to be enabled on the switch:

SMIS# configure terminal
SMIS(config)# interface vlan 1
SMIS(config-if)# ip rip enable
SMIS(config-if)# exit
SMIS(config)# show ip interface vlan 1

vlan1 is up, line protocol is up
Internet Address is 172.28.0.198/16
Broadcast Address  172.28.255.255
IP address allocation method is dynamic
IP address allocation protocol is dhcp

SMIS(config)# router rip
SMIS(config-router)# network 172.28.0.198
SMIS(config-router)# exit
SMIS(config)# show ip route

S 0.0.0.0/0  [1] via 172.28.0.2
R 100.100.100.100/32 [2] via 172.28.1.27
C 172.28.0.0/16 is directly connected, vlan1

The VIP is now reachable from the switch via the RIP-learned route.
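
Assuming the bracketed number in the RIP entry is the route metric, the value 2 is what the standard RIP update rule would give: the server advertises its connected /32 with metric 1 and the receiver adds the cost of the link (1). The following is a simplified sketch of that rule without any timers; `process_response` and the table layout are names invented for this illustration and say nothing about how the switch implements RIP.

```python
INFINITY = 16  # RIP metric meaning "unreachable"


def process_response(table, sender, entries, link_cost=1):
    """Apply one RIP response from `sender` to `table` (prefix -> (metric, next_hop))."""
    for prefix, advertised_metric in entries:
        metric = min(advertised_metric + link_cost, INFINITY)
        current = table.get(prefix)
        if current is None:
            # New destination: install it unless it is unreachable.
            if metric < INFINITY:
                table[prefix] = (metric, sender)
        elif current[1] == sender or metric < current[0]:
            # Always believe the current next hop; otherwise require a better metric.
            table[prefix] = (metric, sender)
    return table


table = {}
process_response(table, "172.28.1.27", [("100.100.100.100/32", 1)])
print(table)  # {'100.100.100.100/32': (2, '172.28.1.27')}
```

Under this rule, an advertisement with the same metric from a different neighbour does not displace the existing entry, which is one reason a stale next hop can linger until the route expires.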

Currently, clients need to know the gateway to the VIP (this should be replaced by dynamic routing):

[root@vscaler-vgpu ~]# ip address show br0
19: br0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 0c:c4:7a:14:07:28 brd ff:ff:ff:ff:ff:ff
    inet 172.28.1.49/16 brd 172.28.255.255 scope global dynamic br0
       valid_lft 18522sec preferred_lft 18522sec
    inet6 fe80::ec4:7aff:fe14:728/64 scope link
       valid_lft forever preferred_lft forever
[root@vscaler-vgpu ~]# ip route add 100.100.100.100/32 dev br0 via 172.28.0.198

To simulate a malfunction, the container with keepalived is stopped on the server holding the VIP while ping runs from the client:

[root@vscaler-vgpu ~]# ping 100.100.100.100
...
64 bytes from 100.100.100.100: icmp_seq=8 ttl=64 time=0.461 ms
64 bytes from 100.100.100.100: icmp_seq=9 ttl=64 time=0.532 ms
64 bytes from 100.100.100.100: icmp_seq=10 ttl=64 time=0.453 ms
64 bytes from 100.100.100.100: icmp_seq=11 ttl=64 time=0.449 ms
64 bytes from 100.100.100.100: icmp_seq=12 ttl=64 time=0.478 ms
64 bytes from 100.100.100.100: icmp_seq=13 ttl=64 time=0.456 ms
From 172.28.1.27 icmp_seq=14 Redirect Host(New nexthop: 172.28.0.2)
From 172.28.1.27: icmp_seq=14 Redirect Host(New nexthop: 172.28.0.2)
From 172.28.1.27 icmp_seq=15 Redirect Host(New nexthop: 172.28.0.2)
From 172.28.1.27: icmp_seq=15 Redirect Host(New nexthop: 172.28.0.2)
From 172.28.0.198 icmp_seq=16 Redirect Host(New nexthop: 172.28.1.100)
From 172.28.0.198: icmp_seq=16 Redirect Host(New nexthop: 172.28.1.100)
From 172.28.0.198 icmp_seq=17 Redirect Host(New nexthop: 172.28.1.100)
From 172.28.0.198: icmp_seq=17 Redirect Host(New nexthop: 172.28.1.100)
From 172.28.0.198 icmp_seq=18 Redirect Host(New nexthop: 172.28.1.100)
From 172.28.0.198: icmp_seq=18 Redirect Host(New nexthop: 172.28.1.100)
64 bytes from 100.100.100.100: icmp_seq=19 ttl=64 time=0.710 ms
From 172.28.0.198 icmp_seq=16 Destination Host Unreachable
From 172.28.0.198 icmp_seq=17 Destination Host Unreachable
From 172.28.0.198 icmp_seq=18 Destination Host Unreachable
64 bytes from 100.100.100.100: icmp_seq=20 ttl=64 time=0.442 ms
64 bytes from 100.100.100.100: icmp_seq=21 ttl=64 time=0.429 ms
64 bytes from 100.100.100.100: icmp_seq=22 ttl=64 time=0.441 ms
64 bytes from 100.100.100.100: icmp_seq=23 ttl=64 time=0.436 ms
64 bytes from 100.100.100.100: icmp_seq=24 ttl=64 time=0.448 ms
...

There is a brief service interruption during which the VIP is migrated to the Backup host and RIP updates the route:

SMIS(config)# show ip route

S 0.0.0.0/0  [1] via 172.28.0.2
R 100.100.100.100/32 [2] via 172.28.0.136
C 172.28.0.0/16 is directly connected, vlan1

When the host holding the VIP goes down entirely, about 160 s pass before the route is migrated (note the gap between icmp_seq=13 and icmp_seq=173):

[1546854365.147454] 64 bytes from 100.100.100.100: icmp_seq=11 ttl=64 time=0.494 ms
[1546854366.147363] 64 bytes from 100.100.100.100: icmp_seq=12 ttl=64 time=0.393 ms
[1546854367.147340] 64 bytes from 100.100.100.100: icmp_seq=13 ttl=64 time=0.388 ms

[1546854527.147422] 64 bytes from 100.100.100.100: icmp_seq=173 ttl=64 time=0.474 ms
[1546854528.147426] 64 bytes from 100.100.100.100: icmp_seq=174 ttl=64 time=0.487 ms
[1546854529.147307] 64 bytes from 100.100.100.100: icmp_seq=175 ttl=64 time=0.404 ms

This is, most likely, because of RIP timers.
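
As a rough check of that explanation: the RIP specification's default timers are a 30 s update interval and a 180 s route timeout, so a route learned from a host that dies silently would expire somewhere between 150 s and 180 s after the failure, depending on when the last update arrived. Whether the switch and Quagga were actually running with these defaults was not verified here, so the sketch below only compares the measured gap with that window. The timestamps are copied from the ping output above; the function name is an assumption made for this example.

```python
UPDATE_INTERVAL = 30  # seconds between periodic RIP updates (specification default)
ROUTE_TIMEOUT = 180   # seconds without an update before a route expires (specification default)


def expiry_window(update=UPDATE_INTERVAL, timeout=ROUTE_TIMEOUT):
    """Earliest and latest time after a silent failure at which the route expires."""
    return timeout - update, timeout


# Timestamps of the last reply before and the first reply after the outage.
last_reply_before = 1546854367.147340
first_reply_after = 1546854527.147422

outage = first_reply_after - last_reply_before
earliest, latest = expiry_window()

print(round(outage))                  # 160
print(earliest <= outage <= latest)   # True
```

If the timers are the cause, shortening them on the router that holds the stale route should shrink the gap; in Quagga's ripd the relevant setting is `timers basic`, while the equivalent on the switch was not investigated.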

Resources

VRRP Protocol whitepaper http://www2.elo.utfsm.cl/~tel242/exp/04/VRRP_protocol.pdf
First Hop Redundancy Protocols Configuration Guide, Cisco IOS Release 15M&T https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipapp_fhrp/configuration/15-mt/fhp-15-mt-book/fhp-vrrp.html
QUAGGA - The Easy Tutorial - How to use Quagga https://openmaniak.com/quagga_tutorial.php
Quagga docs - 5 RIP https://www.nongnu.org/quagga/docs/docs-multi/RIP.html