Difference between revisions of "Bright:Add GPUs to OpenStack"
Revision as of 15:08, 24 March 2016
These steps were performed on the CentOS 7 node with the 2x K80 GPUs (provisioned as a nova hypervisor under Bright / openstack-image, the default).
Ensure Intel IOMMU is enabled
The kernel must be booted with intel_iommu=on. Add it to the kernel parameters of the software image, keeping the existing rdblacklist=nouveau entry:
[shadow-head]% softwareimage
[shadow-head->softwareimage]% use openstack-image
[shadow-head->softwareimage[openstack-image]]% list
Name (key) Path Kernel version
-------------------- ---------------------------------------- ----------------------
default-image /cm/images/default-image 3.10.0-229.el7.x86_64
openstack-image /cm/images/openstack-image 3.10.0-229.el7.x86_64
test-image /cm/images/test-image 3.10.0-229.el7.x86_64
[shadow-head->softwareimage[openstack-image]]% show
Parameter Value
-------------------------------- ------------------------------------------------
Boot FSPart 98784247814
Creation time Tue, 29 Sep 2015 10:29:56 GMT
Enable SOL yes
FSPart 98784247814
Kernel modules <38 in submode>
Kernel parameters rdblacklist=nouveau
Kernel version 3.10.0-229.el7.x86_64
Locked no
Name openstack-image
Notes <0 bytes>
Path /cm/images/openstack-image
Revision
SOL Flow Control no
SOL Port ttyS1
SOL Speed 115200
[shadow-head->softwareimage[openstack-image]]% set kernelparameters "rdblacklist=nouveau intel_iommu=on"
[shadow-head->softwareimage*[openstack-image*]]% show
Parameter Value
-------------------------------- ------------------------------------------------
Boot FSPart 98784247814
Creation time Tue, 29 Sep 2015 10:29:56 GMT
Enable SOL yes
FSPart 98784247814
Kernel modules <38 in submode>
Kernel parameters rdblacklist=nouveau intel_iommu=on
Kernel version 3.10.0-229.el7.x86_64
Locked no
Name openstack-image
Notes <0 bytes>
Path /cm/images/openstack-image
Revision
SOL Flow Control no
SOL Port ttyS1
SOL Speed 115200
[shadow-head->softwareimage*[openstack-image*]]% commit
=============================== openstack-image ================================
Field Message
------------------------ --------------------------------------------------------------
module Warning: Module xhci-hcd does not exist for specified kernel.
[shadow-head->softwareimage[openstack-image]]%
Once the node has booted, check the dmesg output to confirm that the IOMMU is enabled:
[root@gpu ~]# dmesg | grep -iE "dmar|iommu" | grep -i enabled
[ 0.000000] Intel-IOMMU: enabled
Get the PCI IDs of the GPUs
In this instance we are using K80s
# check for the nvidia GPUs
[root@gpu ~]# lspci | grep -i nvidia
04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
05:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
# get their IDs with the -nn flag (shows both names and numeric IDs)
[root@gpu ~]# lspci -nn | grep -i nvidia
04:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
05:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
- So the vendor id is: 10de
- And the product id is: 102d
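When a node has several device types, reading the IDs by eye gets error-prone. The following is a minimal sketch, not part of the original procedure, of how the vendor:product pairs could be pulled out of `lspci -nn` output with Python. The function name `parse_pci_ids` and the `SAMPLE` text are illustrative; the sample lines are copied from the output above.

```python
import re

# Matches the "[vendor:product]" pair printed by `lspci -nn`, e.g. [10de:102d].
# The class code (e.g. [0302]) contains no colon, so it is not matched.
PCI_ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", re.IGNORECASE)


def parse_pci_ids(lspci_output):
    """Return sorted, unique (vendor_id, product_id) pairs for NVIDIA devices."""
    ids = set()
    for line in lspci_output.splitlines():
        if "nvidia" not in line.lower():
            continue  # skip devices from other vendors
        match = PCI_ID_RE.search(line)
        if match:
            ids.add((match.group(1).lower(), match.group(2).lower()))
    return sorted(ids)


# Sample input copied from the lspci -nn output shown above.
SAMPLE = """\
04:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
05:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
"""

print(parse_pci_ids(SAMPLE))
```

Both GPUs report the same pair, so the result collapses to a single entry, which is what the alias and whitelist settings need.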
Update the /etc/nova/nova.conf file
Apply these changes on the head node; they are then synced to the other nodes, so all nodes run the same version of nova.conf.
# PCI alias and whitelist
[root@shadow-head nova]# grep ^pci_ /etc/nova/nova.conf
pci_alias = { "name": "K80_Tesla", "vendor_id": "10de", "product_id": "102d" }
pci_passthrough_whitelist = [{ "vendor_id": "10de", "product_id": "102d" }]
# Scheduler params
[root@shadow-head nova]# grep ^scheduler /etc/nova/nova.conf
scheduler_max_attempts=5
scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,PciPassthroughFilter
scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
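The two PCI settings are easy to mistype by hand. As a hedged sketch (the helper `nova_pci_lines` is hypothetical and not part of nova or Bright), the lines could be generated from the IDs found earlier. It emits both values as JSON with double-quoted keys, which, as far as I know, is the form nova's documentation uses for these options.

```python
import json


def nova_pci_lines(alias_name, vendor_id, product_id):
    """Build the pci_alias and pci_passthrough_whitelist lines for nova.conf."""
    device = {"vendor_id": vendor_id, "product_id": product_id}
    # The alias is the device spec plus a name that flavors can refer to.
    alias = {"name": alias_name, **device}
    return [
        "pci_alias = " + json.dumps(alias),
        "pci_passthrough_whitelist = " + json.dumps([device]),
    ]


# Values taken from this page: alias K80_Tesla, vendor 10de, product 102d.
for line in nova_pci_lines("K80_Tesla", "10de", "102d"):
    print(line)
```

The output can be pasted into /etc/nova/nova.conf on the head node; the scheduler settings listed above still have to be added separately.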