Ubuntu box with multiple NVIDIA graphics cards

I recently bought a box from System76 that has multiple GPUs: one Quadro M6000 and two Tesla K40s.

When I run lspci | grep -i nvidia, it prints:

05:00.0 VGA compatible controller: NVIDIA Corporation Device 17f0 (rev a1)
05:00.1 Audio device: NVIDIA Corporation Device 0fb0 (rev a1)
06:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40c] (rev a1)
09:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40c] (rev a1)

So they're there. But when I run nvidia-smi -L, it only shows:

GPU 0: Quadro M6000 (UUID: GPU-09446504-6a9e-866a-a65d-0f1d55b7657b)

and ls -l /dev/nvidia* shows:

crw-rw-rw- 1 root root 195,   0 Aug  9 03:29 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Aug  9 03:29 /dev/nvidiactl
crw-rw-rw- 1 root root 248,   0 Aug 12 16:19 /dev/nvidia-uvm

I can't be sure, but I'm guessing /dev/nvidia0 is the Quadro M6000, and perhaps the absence of a /dev/nvidia1 or /dev/nvidia2 is another symptom (or perhaps the cause) of the box not seeing the Tesla K40s. Also, my test programs that call cudaGetDeviceCount report only one GPU.
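The mismatch described above (three NVIDIA GPUs on the PCI bus, one visible to the driver) can be checked mechanically. Below is a minimal sketch, assuming the lspci and nvidia-smi -L output formats shown in this question; the function names are made up for illustration, and the class-name matching is an assumption that may not cover every card.

```shell
#!/bin/sh
# Count NVIDIA GPUs in lspci-style text read from stdin.
# Only "VGA compatible controller" and "3D controller" entries are counted;
# the HDMI audio function of a card (e.g. 05:00.1 above) is skipped.
count_pci_gpus() {
    grep -i 'nvidia' | grep -c -E 'VGA compatible controller|3D controller'
}

# Count GPUs in `nvidia-smi -L`-style text read from stdin
# (one "GPU <n>: ..." line per device the driver has claimed).
count_driver_gpus() {
    grep -c '^GPU [0-9][0-9]*:'
}

# Hypothetical helper: compare both counts on a live machine.
# It is only defined here, not called.
report_missing_gpus() {
    pci=$(lspci | count_pci_gpus)
    drv=$(nvidia-smi -L | count_driver_gpus)
    if [ "$pci" -eq "$drv" ]; then
        echo "driver sees all $pci NVIDIA GPUs"
    else
        echo "driver sees $drv of $pci NVIDIA GPUs"
    fi
}
```

After sourcing the snippet, calling report_missing_gpus on the affected machine would be expected to report that the driver sees 1 of 3 GPUs, given the output quoted in this question.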

I'm running Ubuntu 14.04.3, and I've installed cuda_7.0.28_linux.run (and installed the NVIDIA drivers via that run file.)

Why are the other cards inaccessible? How do I make them accessible?

bnsh

Posted 2015-08-12T22:52:35.880

Reputation: 161

I had so many issues trying to set up multiple Nvidia cards on Ubuntu that I gave up. Better to consult Nvidia support directly: if you're into GPU computing they are actually good at helping you, but Linux is not their forte – None – 2015-08-13T02:13:52.817

Answers

Alright! txbob over at the NVIDIA devtalk forums gave me enough pointers to get to the solution.

So, basically, nouveau was interfering with the nvidia drivers, and even though I saw a disable-nouveau file in /etc/modprobe.d, it hadn't yet taken effect, because I hadn't rebuilt the initramfs.
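The state described above (a blacklist file exists, yet nouveau is still active) can be confirmed with two small checks. This is a rough sketch, not something from the original thread: the function names are hypothetical, and it assumes the blacklist uses the usual "blacklist nouveau" modprobe.d syntax.

```shell
#!/bin/sh
# Succeeds if lsmod-style text on stdin lists nouveau as a loaded module.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Succeeds if modprobe.d-style text on stdin contains an active
# "blacklist nouveau" line (commented-out lines do not count).
nouveau_blacklisted() {
    grep -q -E '^[[:space:]]*blacklist[[:space:]]+nouveau([[:space:]]|$)'
}

# On a live system (not run here):
#   lsmod | nouveau_loaded && echo "nouveau is still loaded"
#   cat /etc/modprobe.d/*.conf | nouveau_blacklisted && echo "blacklist present"
```

If both checks succeed at once, the blacklist is on disk but is not being honored at boot, which is consistent with a stale initramfs.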

To rebuild the initramfs, I ran (as root):

rm -f /boot/initrd*
update-initramfs -c -k all
update-grub2

Afterwards, running nvidia-smi -L yields:

GPU 0: Quadro M6000 (UUID: GPU-09446504-6a9e-866a-a65d-0f1d55b7657b)
GPU 1: Tesla K40c (UUID: GPU-e992022a-724f-8f47-e08f-a954053020e6)
GPU 2: Tesla K40c (UUID: GPU-4d14695e-3e43-bf43-a3e3-91190f696d39)

So, all good now! Hopefully this might help someone else!
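For anyone wary of deleting files under /boot, here is a hedged alternative sketch. It is not what was run above: it swaps the delete-and-create step for update-initramfs -u, which regenerates existing images in place, and it has not been tested on this machine. The run and rebuild_initramfs helpers and the APPLY variable are hypothetical names; by default the commands are only printed.

```shell
#!/bin/sh
# Print each command; execute it only when APPLY=1 is set (needs root).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

# Regenerate the initramfs for all installed kernels without deleting
# the old images first, then refresh the GRUB configuration.
rebuild_initramfs() {
    run update-initramfs -u -k all
    run update-grub2
}
```

Calling rebuild_initramfs with no environment set just lists the two commands; setting APPLY=1 as root would actually run them, after which a reboot is needed for the new initramfs to be used.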

bnsh

Posted 2015-08-12T22:52:35.880

Reputation: 161

Though the question is quite old, this may help someone.

I think the key step in your success was that you blacklisted the nouveau driver and reconfigured grub, so the initramfs stuff is unnecessary.

Source: Bumblebee on a Lenovo T440p [NVidia GT 730M] with XUbuntu/Ubuntu 16.04 LTS

Buderka Albert

Posted 2015-08-12T22:52:35.880

Reputation: 11