VScaler: Adding GPUs - Fixing Issues with GPUs

From Define Wiki
Revision as of 17:10, 11 May 2021 by David

Adding a GPU with a sound card on it

Some GPUs, such as the GT210, carry an integrated audio function alongside the VGA function. Both functions sit in the same IOMMU group, so you cannot pass only the VGA function through to a VM while the audio function stays attached to a host driver: every device in the group has to be unbound from its host driver and bound to vfio-pci. The vfio-pci-bind script does this for the whole group. Clone the git repo https://github.com/andre-richter/vfio-pci-bind and run:

[root@node02]~ vfio-pci-bind.sh 0000:81:00.1

Note: Replace the PCI address with the corresponding one on your system.
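
Before binding, it can help to confirm which devices share an IOMMU group. The following is a minimal sketch, assuming the kernel exposes groups under /sys/kernel/iommu_groups (this requires the IOMMU to be enabled). The function name list_iommu_groups and its optional root argument are illustrative and not part of vfio-pci-bind:

```shell
#!/usr/bin/env bash
# List every IOMMU group and the PCI addresses it contains.
# The optional first argument overrides the sysfs root; it exists only so the
# function can be exercised against a test directory tree.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local group dev
    for group in "$root"/*/; do
        # Skip the unexpanded glob when no groups exist.
        [ -d "$group" ] || continue
        for dev in "$group"devices/*; do
            [ -e "$dev" ] || continue
            printf 'group %s: %s\n' "$(basename "$group")" "$(basename "$dev")"
        done
    done
}

list_iommu_groups
```

Devices printed under the same group number have to be handed to vfio-pci together.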

The following covers the NVIDIA GeForce RTX 2080 Ti, which exposes four PCI functions (VGA, audio, USB and USB Type-C UCSI), as listed by lspci:

02:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] [10de:1e04] (rev a1)
02:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
02:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
02:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)
[root@gpu0001 vfio-pci-bind]# ./vfio-pci-bind.sh 10de:1e04 10de:10f7
Vendor:Device 10de:10f7 found at 0000:02:00.1

IOMMU group members (sans bridges):
/sys/bus/pci/devices/0000:02:00.1/iommu_group/devices/0000:02:00.0
/sys/bus/pci/devices/0000:02:00.1/iommu_group/devices/0000:02:00.1
/sys/bus/pci/devices/0000:02:00.1/iommu_group/devices/0000:02:00.2
/sys/bus/pci/devices/0000:02:00.1/iommu_group/devices/0000:02:00.3

Binding...
Unbound 0000:02:00.0 from nouveau
Unbound 0000:02:00.1 from snd_hda_intel
Unbound 0000:02:00.2 from xhci_hcd

success...

Device 10de:10f7 at 0000:02:00.1 bound to vfio-pci
Devices listed in /sys/bus/pci/drivers/vfio-pci:
lrwxrwxrwx. 1 root root    0 May 11 11:49 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.0/0000:02:00.0
lrwxrwxrwx. 1 root root    0 May 11 11:49 0000:02:00.1 -> ../../../../devices/pci0000:00/0000:00:1c.0/0000:02:00.1
lrwxrwxrwx. 1 root root    0 May 11 11:49 0000:02:00.2 -> ../../../../devices/pci0000:00/0000:00:1c.0/0000:02:00.2
lrwxrwxrwx. 1 root root    0 May 11 11:49 0000:02:00.3 -> ../../../../devices/pci0000:00/0000:00:1c.0/0000:02:00.3

ls -l /dev/vfio/
total 0
crw-------. 1 root root 241,   0 May 11 11:49 6
crw-rw-rw-. 1 root root  10, 196 May 10 16:56 vfio

After binding, lspci confirms that vfio-pci is the kernel driver in use for all four functions:

[root@gpu0001 vfio-pci-bind]# lspci -s 02:00  -nnk
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] [10de:1e04] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd Device [1458:3fd8]
	Kernel driver in use: vfio-pci
	Kernel modules: nouveau
02:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd Device [1458:3fd8]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
02:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd Device [1458:3fd8]
	Kernel driver in use: vfio-pci
02:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd Device [1458:3fd8]
	Kernel driver in use: vfio-pci
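
The same check can be scripted by reading the driver symlink that sysfs keeps for each PCI function. This is a rough sketch rather than part of the original procedure: the function name show_bound_drivers is hypothetical, and the slot 0000:02:00 is the one from the output above, so adjust it for your system:

```shell
#!/usr/bin/env bash
# Print the driver bound to each PCI function of a slot, e.g. 0000:02:00.
# The second argument (sysfs device directory) only exists to allow testing
# against a fake tree; on a real host leave it at the default.
show_bound_drivers() {
    local slot="$1"
    local root="${2:-/sys/bus/pci/devices}"
    local dev driver
    for dev in "$root/$slot".*; do
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # The symlink target's last path component is the driver name.
            driver=$(basename "$(readlink "$dev/driver")")
        else
            driver="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$driver"
    done
}

show_bound_drivers "${1:-0000:02:00}"
```

Each function should report vfio-pci once the bind has succeeded; a function still showing nouveau, snd_hda_intel or xhci_hcd is still attached to its host driver.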

To make the binding persist across reboots, add the device IDs to the vfio-pci module options and load the module at boot:

[root@gpu0001 vfio-pci-bind]# cat /etc/modprobe.d/vfio.conf 
options vfio-pci ids=10de:1e04,10de:10f7,10de:1ad6,10de:1ad7
[root@gpu0001 modules-load.d]# cat /etc/modules-load.d/modules.conf 
vfio-pci
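
The ids= list above can be derived from lspci rather than typed by hand. A small sketch, assuming the numeric `lspci -n` format in which the third field is the vendor:device pair; the function name vfio_options_line is made up for illustration:

```shell
#!/usr/bin/env bash
# Read `lspci -n` output on stdin and print a vfio-pci options line listing
# every vendor:device ID found, comma-separated and without duplicates.
vfio_options_line() {
    awk '
        # Field 3 of `lspci -n` is the vendor:device pair, e.g. 10de:1e04.
        length($3) == 9 && $3 ~ /^[0-9a-f]+:[0-9a-f]+$/ && !seen[$3]++ {
            ids = ids (ids == "" ? "" : ",") $3
        }
        END { if (ids != "") print "options vfio-pci ids=" ids }
    '
}

# Example (slot 02:00 is the GPU from this page; adjust for your system):
#   lspci -n -s 02:00 | vfio_options_line
```

Redirecting the output to /etc/modprobe.d/vfio.conf would reproduce the file shown above for this card.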
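Finally, set the PCI passthrough alias on the OpenStack flavor so that each instance requests one GPU (the alias name rtx_2080 must match the PCI alias defined in the Nova configuration):

openstack flavor set m1.large.gpu --property pci_passthrough:alias='rtx_2080:1'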
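
The flavor property only works if Nova already knows an alias named rtx_2080. The page does not show that configuration, so the following is an assumed sketch of what the [pci] section of nova.conf might look like for this card. The helper name nova_pci_section is hypothetical, the device_type value type-PCI is an assumption for a GPU without SR-IOV, and newer Nova releases name the whitelist option device_spec instead of passthrough_whitelist:

```shell
#!/usr/bin/env bash
# Print a [pci] section for nova.conf that whitelists one PCI device and
# defines the alias referenced by the flavor property.
# Arguments: vendor ID, product ID, alias name.
nova_pci_section() {
    local vendor="$1" product="$2" name="$3"
    cat <<EOF
[pci]
passthrough_whitelist = { "vendor_id": "$vendor", "product_id": "$product" }
alias = { "vendor_id": "$vendor", "product_id": "$product", "device_type": "type-PCI", "name": "$name" }
EOF
}

# Values taken from the lspci output and flavor command on this page.
nova_pci_section 10de 1e04 rtx_2080
```

Check the option names against the documentation for the Nova release in use before merging the output into nova.conf.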