VScaler: Troubleshooting Kolla issues
Revision as of 09:47, 24 August 2017
Log location
The logs are on the nodes under: /var/lib/docker/volumes/kolla_logs/_data/
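A minimal sketch for pulling recent errors out of that directory, assuming each service writes `*.log` files into its own subdirectory (as the nova path used on this page suggests). The names `kolla_errors` and `KOLLA_LOG_DIR` are hypothetical, introduced only for this example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last N ERROR lines from each log file of one
# service. KOLLA_LOG_DIR defaults to the log location given above.
KOLLA_LOG_DIR="${KOLLA_LOG_DIR:-/var/lib/docker/volumes/kolla_logs/_data}"

kolla_errors() {
    local service="$1" count="${2:-5}" logfile
    for logfile in "$KOLLA_LOG_DIR/$service"/*.log; do
        [ -f "$logfile" ] || continue      # skip when the glob matches nothing
        echo "== $logfile"
        grep -- 'ERROR' "$logfile" | tail -n "$count"
    done
}

# Usage on a node: kolla_errors nova 10
```

Run it on the node that hosts the failing service; it only reads the log files and changes nothing.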
When a service fails, you will find useful information in the kolla logs of that service's container. For example, to check the logs of the nova-conductor service:
[root@head01 ~]# ssh controller01
[root@controller01-enp2s0 ~]# tail /var/lib/docker/volumes/kolla_logs/_data/nova/nova-conductor.log
Interface ansible_<if> does not exist
If you see a message of this sort in the kolla-ansible output, it most likely refers to a node whose interface has a different name from the one set in the "network_interface" variable in the /etc/kolla/globals.yml file.
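To confirm which node is affected, one option is to check whether the expected interface name appears in the output of `ip -o link show` on that node. The sketch below assumes the usual `ip -o` layout, where the second field is the interface name followed by a colon; `has_interface` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if interface name $1 appears in
# `ip -o link show` output supplied on stdin.
has_interface() {
    local wanted="$1"
    # Field 2 looks like "ens5:" or "ens5@if3:"; strip from ':' or '@' onwards.
    awk -v want="$wanted" '
        { name = $2; sub(/[@:].*$/, "", name) }
        name == want { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Usage on a node (eth0 stands in for your network_interface value):
#   ip -o link show | has_interface eth0 || echo "eth0 missing on $(hostname)"
```

The helper reads from stdin so that the same check can be fed from a remote node, for example through `ssh <node> ip -o link show`.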
Solution
To get past this issue, add api_interface=ens5 (using that node's actual interface name) next to the node's name in the inventory file. On one occasion I had to add tunnel_interface=ens5 as well. The error message tells you whether the tunnel or the API interface is the problem. The inventory entry should look like this:
...
gpu01-ens5 tunnel_interface=ens5 api_interface=ens5
...
Connection refused errors in nova-conductor logs
When I was doing a deploy the nova service wouldn't come up properly. After checking the nova-conductor logs on the controller node that reported the error, I saw a lot of errors like this:
ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 192.168.0.106:5672 is unreachable: [Errno 111] ECONNREFUSED
Solution
Disable SELinux on the controller nodes and reboot them!
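Before and after the reboot it can help to verify that the AMQP port is actually reachable. The sketch below is one way to do that, assuming bash and the coreutils `timeout` command are available; `port_open` is a hypothetical helper name, and the address is the one from the error above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened
# within 3 seconds. Relies on bash's /dev/tcp redirection, so bash is required.
port_open() {
    local host="$1" port="$2"
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage on a controller:
#   port_open 192.168.0.106 5672 || echo "RabbitMQ port unreachable"
#   getenforce    # prints Enforcing, Permissive or Disabled
```

`getenforce` reports the current SELinux mode, which makes it easy to confirm that the change survived the reboot.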
Debugging containers that don't start
When a container fails to start, you can reproduce the error by looking up the container's name with docker ps -a and then passing -a to docker start, which attaches to the container's output:
[root@controller01 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4de70ccf5a4d 10.10.10.1:4000/kolla/centos-binary-glance-api:4.0.3 "kolla_start" 10 hours ago Exited (1) 10 hours ago bootstrap_glance
10f4038a7d77 10.10.10.1:4000/kolla/centos-binary-keystone:4.0.3 "kolla_start" 10 hours ago Up 10 hours keystone
e96bf1cb3258 10.10.10.1:4000/kolla/centos-binary-rabbitmq:4.0.3 "kolla_start" 10 hours ago Up 10 hours rabbitmq
b0094b42cb75 10.10.10.1:4000/kolla/centos-binary-mariadb:4.0.3 "kolla_start" 10 hours ago Up 10 hours mariadb
d49e0b00bf84 10.10.10.1:4000/kolla/centos-binary-memcached:4.0.3 "kolla_start" 10 hours ago Up 10 hours memcached
1a1599296c59 10.10.10.1:4000/kolla/centos-binary-keepalived:4.0.3 "kolla_start" 10 hours ago Up 10 hours keepalived
accc84f93171 10.10.10.1:4000/kolla/centos-binary-haproxy:4.0.3 "kolla_start" 10 hours ago Up 10 hours haproxy
f25d30f403d2 10.10.10.1:4000/kolla/centos-binary-cron:4.0.3 "kolla_start" 10 hours ago Up 10 hours cron
0be143a36b6d 10.10.10.1:4000/kolla/centos-binary-kolla-toolbox:4.0.3 "kolla_start" 10 hours ago Up 10 hours kolla_toolbox
2f667f97a160 10.10.10.1:4000/kolla/centos-binary-fluentd:4.0.3 "kolla_start" 10 hours ago Up 10 hours fluentd
[root@controller01 ~]# docker start -a bootstrap_glance
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting file /etc/glance/glance-api.conf
INFO:__main__:Coping file from /var/lib/kolla/config_files/glance-api.conf to /etc/glance/glance-api.conf
INFO:__main__:Setting file /etc/glance/glance-api.conf owner to glance:glance
INFO:__main__:Setting file /etc/glance/glance-api.conf permission to 0600
ERROR:__main__:MissingRequiredSource: /var/lib/kolla/config_files/ceph.* file is not found
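When several containers have exited, a less invasive variation is to read their saved output with `docker logs` instead of starting them again. This is a swapped-in alternative to the `docker start -a` approach shown above, not a replacement for it; the helper names `exited_containers` and `show_exited_logs` are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of all containers in the "exited" state.
exited_containers() {
    docker ps -a --filter status=exited --format '{{.Names}}'
}

# Hypothetical helper: show the last N lines of output (default 20) of every
# exited container without starting it again.
show_exited_logs() {
    local lines="${1:-20}" name
    for name in $(exited_containers); do
        echo "=== $name"
        docker logs --tail "$lines" "$name" 2>&1
    done
}

# Usage: show_exited_logs 10
```

If the saved output is not enough, `docker start -a <name>` remains the way to re-run a single container and watch it fail.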