OpenStack:CLI Cheet Sheet

From Define Wiki

Latest revision as of 12:55, 3 May 2022

CLI Commands

From: https://docs.openstack.org/nova/latest/admin/services.html

# list hypervisor details
openstack hypervisor list --long
 
# list VMs with availability zone
openstack server list --long -c ID -c Name -c Status -c Networks -c "Image Name" -c "Flavor Name" -c "Availability Zone"
 
# list VMs on all hypervisors
openstack server list --all-projects --long -c ID -c Name -c Host
 
# list VMs on specific hypervisor
openstack server list --all-projects --host ${COMPUTE_NODE}
 
# get VM count by hypervisor
openstack server list --all-projects --long -c Host -f value | sort | uniq -c
 
# list compute nodes
openstack compute service list --service nova-compute
 
# list compute services on a specific host
openstack compute service list --host ${OS_NODE}
 
# add / enable compute service
openstack compute service set --enable com1-dev nova-compute
 
# disable compute service
for OS_SERVICE in $(openstack compute service list --host ${OS_NODE} -c Binary -f value); do
    openstack compute service set --disable --disable-reason "Maintenance" ${OS_NODE} ${OS_SERVICE}
done
 
# Search for servers with status error
openstack server list --all-projects --status ERROR
 
# Search for servers with status resizing
openstack server list --all-projects --status=VERIFY_RESIZE
 
# List instances / VMs
openstack server list
openstack server list -c ID -c Name -c Status -c Networks -c Host --long
 
# Show VM diagnostics / statistics
nova diagnostics ${SERVER_ID}
openstack server show --diagnostics ${SERVER_ID}
 
# show resource usage per project
openstack usage list
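
The "VM count by hypervisor" one-liner above can be wrapped in a small function so the busiest host comes first. This is only a sketch: `count_vms_by_host` is a made-up helper name, and it assumes the input is one host name per line, as produced by `-c Host -f value`.

```shell
# Sketch: tally VMs per hypervisor from a list of host names (one per line).
# count_vms_by_host is a hypothetical helper, not an OpenStack command.
count_vms_by_host() {
    # Drop blank lines so empty rows are not counted, then count
    # duplicates and print "host count" with the busiest host first.
    grep -v '^[[:space:]]*$' \
        | sort \
        | uniq -c \
        | sort -k1,1nr -k2,2 \
        | awk '{ print $2, $1 }'
}

# Usage (needs admin credentials loaded):
#   openstack server list --all-projects --long -c Host -f value | count_vms_by_host
```

Reading from stdin keeps the function independent of the CLI, so it can be tried on a saved host list first.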

Disable Compute Node

openstack compute service set --disable os-com2-dev nova-compute
openstack hypervisor list 
openstack compute service list --service nova-compute
openstack aggregate show <NAME>

Debug

# Search for server processes on wrong compute node
for COMPUTE_NODE in $(openstack compute service list --service nova-compute -c Host -f value); do
    for UUID in $(ssh ${COMPUTE_NODE} pgrep qemu -a | grep -o -P '(?<=-uuid ).*(?= -smbios)'); do
        VM_HOST=$(openstack server show -c "OS-EXT-SRV-ATTR:host" -f value ${UUID})
        if [ -z "${VM_HOST}" ]; then
            echo "Server process ${UUID} on ${COMPUTE_NODE} not available in OpenStack"
        else
            if [ "${VM_HOST}" != "${COMPUTE_NODE}" ]; then
                echo "VM ${UUID} on wrong compute node ${COMPUTE_NODE}"
            fi
        fi
    done
done
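
The lookbehind/lookahead pattern in the loop above captures everything between `-uuid ` and ` -smbios`, so it depends on QEMU placing `-smbios` directly after the UUID. As a hedged alternative, the sketch below matches the 36-character UUID itself with plain `sed`. `extract_qemu_uuid` is a hypothetical helper name, and the command line in the usage comment is invented for illustration.

```shell
# Sketch: extract the instance UUID from a single QEMU command line.
# extract_qemu_uuid is a hypothetical helper name.
extract_qemu_uuid() {
    # Print the 36 hex-and-dash characters that follow " -uuid ";
    # print nothing when the command line has no -uuid argument.
    printf '%s\n' "$1" | sed -n 's/.* -uuid \([0-9a-fA-F-]\{36\}\).*/\1/p'
}

# Usage (illustrative command line):
#   extract_qemu_uuid "qemu-kvm -name guest=x -uuid 8041442a-9775-47c8-91be-e27286e731bd -smbios type=1"
```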

Remove compute service / server

openstack server list --all-projects --host ${NODE_ID}
openstack compute service list --host ${NODE_ID}
# delete takes the service ID shown in the list above, not the host name
openstack compute service delete ${SERVICE_ID}

Manually rebalance VMs

# show hypervisor usage
openstack hypervisor list --long
 
# get processes that use swap
grep VmSwap /proc/*/status | grep -v " 0 kB"
 
# get VMs with high CPU usage
ssh compute-node-2
 
# VMs by CPU usage
ssh ${COMPUTE_NODE} ps -eo pid,%cpu,cmd --sort="-%cpu" --no-headers | head -5 | grep -o -P '^[0-9]?.*(?<=-uuid ).*(?= -smbios)\b' | awk '{ print $1,$2,$NF }'

# VMs by RAM usage
ssh ${COMPUTE_NODE} ps -eo pid,size,cmd --sort="-size" --no-headers | head -5 | grep -o -P '^[0-9]?.*(?<=-uuid ).*(?= -smbios)\b' | awk '{ print $1,$2,$NF }'
 
openstack server show ${SERVER_ID}
 
# live migrate VM to specific hypervisor
openstack server list --all-projects --status ACTIVE --host comX-stage | grep large
openstack server migrate --os-compute-api-version 2.30 --live-migration --wait --host comX-stage ${SERVER_ID}
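
When several VMs need to move, it can help to print the migrate commands for review before running anything. A minimal sketch, assuming server IDs arrive one per line on stdin. `plan_live_migrations` is a hypothetical helper and `comY-stage` in the usage comment is a placeholder target host.

```shell
# Sketch: print (not run) one live-migration command per server ID on stdin.
# plan_live_migrations is a hypothetical helper name.
plan_live_migrations() {
    target_host=$1
    while IFS= read -r server_id; do
        # Skip blank lines in the input.
        [ -n "$server_id" ] || continue
        printf 'openstack server migrate --os-compute-api-version 2.30 --live-migration --wait --host %s %s\n' \
            "$target_host" "$server_id"
    done
}

# Usage (review the output, then run the lines you are happy with):
#   openstack server list --all-projects --status ACTIVE --host comX-stage -c ID -f value \
#       | plan_live_migrations comY-stage
```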

Evacuate

from: https://docs.openstack.org/nova/latest/admin/evacuate.html

openstack server list --all-projects --host com3-dev
openstack server set --state error 8041442a-9775-47c8-91be-e27286e731bd
nova evacuate 8041442a-9775-47c8-91be-e27286e731bd

Aggregate

openstack aggregate list
openstack aggregate show 9
openstack aggregate add host 9 com10-stage

Add compute node

openstack compute service list
 
vi /etc/kolla/inventory
...
[external-compute]
new_compute_node_2
...
 
cd /etc/kolla/config/foo
kolla-ansible -i inventory deploy --limit comX-dev -e 'ansible_python_interpreter=/usr/bin/python3'

Remove compute node

COMPUTE_HOST=com1-dev
 
# ensure all VMs are migrated out from the compute node
openstack server list --all-projects --host ${COMPUTE_HOST}
 
# remove compute service
COMPUTE_SERVICE_ID=$(openstack compute service list --service nova-compute --host ${COMPUTE_HOST} -c ID -f value)
echo ${COMPUTE_SERVICE_ID}
openstack compute service delete ${COMPUTE_SERVICE_ID}
 
# remove network service
NETWORK_AGENT_ID=$(openstack network agent list --host ${COMPUTE_HOST} -c ID -f value)
echo ${NETWORK_AGENT_ID}
openstack network agent delete ${NETWORK_AGENT_ID}
 
# OPTIONAL: check no remaining resource_providers_allocations
# see http://www.panticz.de/openstack/resource-provider
 
# OPTIONAL: delete resource provider
openstack catalog list | grep placement
PLACEMENT_ENDPOINT=http://nova-placement.service.dev.i.example.com:8780
 
TOKEN=$(openstack token issue -f value -c id)
curl ${PLACEMENT_ENDPOINT}/resource_providers -H "x-auth-token: ${TOKEN}" | python -m json.tool
 
# delete resource provider
UUID=bf003af0-3541-4220-a5d5-c7c2e57abf22
curl ${PLACEMENT_ENDPOINT}/resource_providers/${UUID} -H "x-auth-token: $TOKEN" -X DELETE
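
The "ensure all VMs are migrated out" step above is a manual check. One way to make it a hard guard before the delete commands is sketched below. `host_is_empty` is a hypothetical helper, and it assumes that an empty result from `openstack server list` means the host has no servers.

```shell
# Sketch: succeed only when no servers are left on the given compute host.
# host_is_empty is a hypothetical helper name.
host_is_empty() {
    # Return 2 if the CLI call itself fails, so an auth or API error
    # is never mistaken for "no servers".
    remaining=$(openstack server list --all-projects --host "$1" -c ID -f value) || return 2
    [ -z "$remaining" ]
}

# Usage:
#   host_is_empty "${COMPUTE_HOST}" || { echo "VMs still on ${COMPUTE_HOST}" >&2; exit 1; }
```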

Get CPU flags

cat /proc/cpuinfo | grep flags | head -1  | cut -d ":" -f2 | tr " " "\n" | sort
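
To check for one flag, for example `vmx` or `svm` for hardware virtualisation, the same pipeline can end in an exact-match grep. This is a sketch: `has_cpu_flag` is a hypothetical helper, and the optional second argument (a cpuinfo-style file) is an addition so the function can be tried against a saved copy.

```shell
# Sketch: test whether the first "flags" line of a cpuinfo file lists a flag.
# has_cpu_flag FLAG [CPUINFO_FILE]   (hypothetical helper name)
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # -x matches whole lines only, so "vm" does not match "vmx".
    grep '^flags' "$cpuinfo" | head -1 | cut -d ':' -f2 | tr ' ' '\n' | grep -qxF -- "$flag"
}

# Usage:
#   has_cpu_flag vmx && echo "Intel VT-x available"
```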

Get qemu version

docker exec -t nova_libvirt qemu-x86_64 -version

Create instance on specific host

openstack server create --flavor m1.tiny --network demo-net --image cirros test --availability-zone nova:sby-prr-hci1-n2-mlnx

Create an instance on a lot of hosts

# for loop to create a load of instances on every host
for i in {02..06}; do openstack server create --flavor m1.small --network demo-net --image centos-7.8-2003 --key-name mykey --availability-zone nova:ausyd-mha1-lc${i}-api.dt.internal centos-${i}; done
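
The `{02..06}` range in the loop above is a bash brace expansion. A plain POSIX `sh` does not expand it. A portable sketch of the same zero-padded sequence is shown below. `padded_range` is a hypothetical helper and expects plain integers without leading zeros.

```shell
# Sketch: print zero-padded two-digit numbers from START to END inclusive.
# padded_range START END   (hypothetical helper name)
padded_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '%02d\n' "$i"
        i=$((i + 1))
    done
}

# Usage (dry run: echo the command instead of creating anything):
#   for i in $(padded_range 2 6); do
#       echo openstack server create --availability-zone "nova:ausyd-mha1-lc${i}-api.dt.internal" "centos-${i}"
#   done
```

Echoing the command first lets you check the host names before any instance is created.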

Reset VM stuck in a power on / off state

When an instance is stuck, use this to fix it:

nova reset-state <uuid>
nova reset-state --active <uuid>

The instance is now marked as active and started. That is usually not accurate (if you are doing this, the instance was probably stuck in a reboot or power-off state), so stop it or hard reboot it to bring the real state back in line:

openstack server stop <uuid>
# OR do this if you want it started
openstack server reboot --hard <uuid>

Create a VM on a volume

Create the volume from an image first (the type is optional), then create the instance using the volume name or ID.

openstack volume create --image <image uuid>  --size <volume size> <volume name> --type [type name]
openstack server create --volume <volume name> --flavor <flavour name> --key-name <keypair name> --network <network name> <server name>
# example
openstack volume create --image 60fad4a3-f2cc-4ce2-a148-c1a61bb4d2a1  --size 40 centos-82-bootable --type lightos
openstack server create --volume centos-82-bootable --flavor m1.tiny --key-name mykey --network demo-net demo1
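
Creating a volume from an image takes a while, and the server create can fail if it runs before the volume is ready. A hedged sketch of a polling step between the two commands follows. `wait_for_volume` is a hypothetical helper, and the defaults of 30 tries and a 5 second delay are arbitrary.

```shell
# Sketch: poll a volume until its status is "available".
# wait_for_volume VOLUME [TRIES] [DELAY_SECONDS]   (hypothetical helper name)
wait_for_volume() {
    volume=$1
    tries=${2:-30}
    delay=${3:-5}
    while [ "$tries" -gt 0 ]; do
        # Return 2 if the CLI call itself fails.
        status=$(openstack volume show "$volume" -f value -c status) || return 2
        case $status in
            available) return 0 ;;
            error*)    return 1 ;;
        esac
        tries=$((tries - 1))
        sleep "$delay"
    done
    # Ran out of tries without the volume becoming available.
    return 1
}

# Usage:
#   wait_for_volume centos-82-bootable && openstack server create --volume centos-82-bootable ...
```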


OpenStack time saving SQL Queries

Change Ceph Monitor IP/Ports for existing instance

First jump into the mariadb container and select the nova database, like so (remember you can grep the nova database password from the kolla passwords.yml file):

[root@nl-ams-wc1-stl-fl0-pd0-c001 ~]# docker exec -it mariadb mysql -u nova -p
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 10037592
Server version: 10.5.13-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use nova;

Fix Ceph Monitor IP change for existing instances

READ AND UNDERSTAND FIRST. Then replace the IPs with the IPs/DNS names of your MONs and update the where clause as suggested.

update block_device_mapping as b set connection_info = json_replace(connection_info, '$.data.hosts', JSON_ARRAY("10.10.3.1", "10.10.3.4", "10.10.3.10")) where instance_uuid in (select i.uuid from instances as i where i.deleted_at is null) AND JSON_EXISTS(b.connection_info, '$.data.hosts') = 1 and b.deleted_at is NULL and remove_this_deliberate_bit_of_invalid_syntax_and_replace_with_a_specific_volume_id_that_you can_easily_validate_and_roll_back_if_not_like_this b.connection_info LIKE '%15526f5f-5fd6-4199-948a-158d59eec0ba%';

A hard reboot should recreate the instance with the right IPs. If it does not, use the reset-state commands above; if that still fails, you have probably still got the connection info wrong.

When you have fixed one instance, remove the last where clause and you should see something like:

Query OK, 18 rows affected (0.007 sec)
Rows matched: 19  Changed: 18  Warnings: 0

See, I ate my own dogfood and tested it first (19 matched and 18 changed). Feel free not to, but if you break it all, don't say you weren't warned.
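
Before running an update like the one above, it may help to look at the current value for a single volume so there is something to compare against and roll back to. The sketch below only prints a read-only SELECT that reuses the table, columns and JSON path from the update. `preview_hosts_sql` is a hypothetical helper, and pasting its output into the MariaDB prompt is the assumed workflow.

```shell
# Sketch: print a read-only query showing the current Ceph MON hosts
# stored for one volume. preview_hosts_sql is a hypothetical helper name.
preview_hosts_sql() {
    volume_id=$1
    # Accept only hex digits and dashes so the value is safe inside LIKE '...'.
    case $volume_id in
        ''|*[!0-9a-fA-F-]*) echo "invalid volume id" >&2; return 1 ;;
    esac
    cat <<EOF
select b.id, b.instance_uuid, JSON_EXTRACT(b.connection_info, '\$.data.hosts') as hosts
from block_device_mapping as b
where b.deleted_at is NULL
  and JSON_EXISTS(b.connection_info, '\$.data.hosts') = 1
  and b.connection_info LIKE '%${volume_id}%';
EOF
}

# Usage: paste the printed statement into the MariaDB prompt opened earlier.
#   preview_hosts_sql 15526f5f-5fd6-4199-948a-158d59eec0ba
```

Saving the output of that SELECT before the update gives you the old host list if a rollback is needed.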

Fix Ceph Monitor Port Change

update block_device_mapping as b set connection_info = json_replace(connection_info, '$.data.hosts', JSON_ARRAY("10.1.58.156", "10.1.58.157", "10.1.58.158")) where instance_uuid in (select i.uuid from instances as i where i.deleted_at is null) AND JSON_EXISTS(b.connection_info, '$.data.hosts') = 1 and b.deleted_at is NULL and remove_this_deliberate_bit_of_invalid_syntax_and_replace_with_a_specific_volume_id_that_you can_easily_validate_and_roll_back_if_not;

IOMMU debugging for PCIe passthrough

for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done;
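
The loop above relies on shell parameter expansion to pull the group number and the PCI address out of each sysfs path. The two steps are shown separately in the sketch below. The helper names `iommu_group_of` and `pci_address_of` are hypothetical, and the path in the usage comment is an example, not taken from a real host.

```shell
# Sketch: split an /sys/kernel/iommu_groups/<N>/devices/<ADDR> path.
# Both helper names are hypothetical.
iommu_group_of() {
    # Strip everything up to and including "/iommu_groups/",
    # then strip from the next "/" to the end, leaving the group number.
    n=${1#*/iommu_groups/}
    printf '%s\n' "${n%%/*}"
}

pci_address_of() {
    # Keep only the last path component, the PCI address.
    printf '%s\n' "${1##*/}"
}

# Usage:
#   d=/sys/kernel/iommu_groups/12/devices/0000:00:1f.0
#   printf 'IOMMU Group %s ' "$(iommu_group_of "$d")"; lspci -nns "$(pci_address_of "$d")"
```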