NVIDIA is an American global technology company based in Santa Clara, California, best known for its graphics processors (GPUs).
Questions tagged [nvidia]
62 questions
14
votes
1 answer
What are actual Tesla M60 models used by AWS?
Wikipedia says that the Tesla M60 has 2x8 GB RAM (whatever that means) and a TDP of 225–300 W.
I use an EC2 instance (g3s.xlarge) that is supposed to have a Tesla M60, but the nvidia-smi command reports 8 GB of RAM and a maximum power limit of 150 W:
> sudo…
hans
- 242
- 2
- 8
7
votes
1 answer
Google Kubernetes Engine node pool does not autoscale from 0 nodes
I am trying to run a machine learning job on GKE, and need to use a GPU.
I created a node pool with a Tesla K80, as described in this walkthrough.
I set the minimum node size to 0, and hoped that the autoscaler would automatically determine how many…
anna_hope
- 173
- 1
- 5
5
votes
1 answer
Why is my CUDA GPU-Util ~70% when there are "No running processes found"?
After configuring a system with 2 Tesla K80 cards, I noticed when running nvidia-smi that one of the 4 GPUs was under heavy load despite there being "No running processes found". Why is this happening and how do I correct this?
Here is the output…
Steven C. Howell
- 651
- 6
- 9
4
votes
0 answers
Erase GPU memory
We have Nvidia GPU cards that can be used by different users in an OpenStack environment. A first user creates a VM with access to a GPU card, then deletes the VM when done. Another user then creates a VM which is given access to the same card.…
J. Chorin
- 41
- 3
4
votes
2 answers
8 GPU machine freezes
We have a SuperMicro GPU server with:
2x Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
512GB memory
more than enough disk space
X10DRG-O+-CPU (BIOS Version : 2.0a [current])
X9DRG-O-PCIE PCI-E expander card
8x GTX 1080
It is set up with Ubuntu 16.04.1…
pks
- 41
- 3
3
votes
2 answers
NVIDIA-SMI can't communicate with NVIDIA driver
Problem description
I am trying to set up a CentOS 7 GPU (Nvidia Tesla K80) instance on Google Cloud to run CUDA workloads.
Unfortunately, I can't seem to properly install/configure drivers.
Indeed, here is what happens when trying to interact with…
Elouan Keryell-Even
- 453
- 2
- 8
- 20
2
votes
0 answers
"Getting devices ready" on Windows 10 while booting VM/iSCSI on another machine than initially set up
TL;DR version:
A virtual Windows instance reinstalls its GPU drivers when switching to another host, even though it gets the same hardware every time. I'm trying to avoid this or shorten the time it takes.
Full version:
I've got an iSCSI server (Windows…
Domel
- 21
- 4
2
votes
1 answer
Access Denied on NVIDIA GRID 7.2 Driver
I am trying to set up an NVIDIA Tesla T4 GPU and use its RTX functionality in a raytracing application (Bakery for Unity3D). But every time I launch the app, Bakery tells me it could not find the OptiX library.
I believe I have tracked it down to…
omacha
- 63
- 3
2
votes
1 answer
Failed to initialize NVML: Unknown Error - Not able to complete NVIDIA Tesla P100 GRID setup on the vSphere host server with VMware ESXi 6.7
I am unable to complete the NVIDIA Tesla P100 GRID setup on a vSphere host server running VMware ESXi 6.7 on a Dell EMC PowerEdge R740.
When I run the nvidia-smi command, I get the following error:
Failed to initialize NVML: Unknown…
Sarath Zacharia
- 31
- 1
- 5
2
votes
0 answers
Specify a GPU to use at launch
I am currently working with an Azure GPU VM (NV6, using an Nvidia M60 graphics card).
So far, my benchmark runs on this VM without any issues.
Now I'm running the same benchmark on an NV12, which has 2 GPUs (or at least Windows Server sees it as 2…
Turgal
- 121
- 1
2
votes
4 answers
Nvidia driver breaks vncserver on CentOS 7.4, is there a workaround?
CentOS Linux release 7.4.1708 (Core)
uname -r output: 3.10.0-693.2.2.el7.x86_64
NVIDIA driver: NVIDIA-Linux-x86_64-375.66.run
When using the Nvidia graphics card driver with the Nvidia GeForce GT 720 graphics card on CentOS 7.4, it works fine for…
Edward_178118
- 895
- 4
- 14
- 30
2
votes
1 answer
Installing NVIDIA Drivers for Diskless Environment
I'm trying to set up a cluster of 8 computers plus a main file server. Ideally, I'd like to set this up in a pxe-boot, quasi-diskless/quasi-stateless environment (i.e. the only local storage is /var, where things like torque configuration will go).…
Travis DePrato
- 70
- 1
- 5
2
votes
2 answers
Install Display Card In ProLiant DL580 Gen8 Server
We have a ProLiant DL580 Gen8 server and want to install a Gigabyte GeForce GTX 980 Ti display card in a PCIe slot. When we connect the 8-pin power connector, the server does not turn on; when the power connector is not connected, the server starts but the graphics card…
MTSS
- 123
- 5
2
votes
1 answer
Executing CUDA script in LXC container results in "cuda error: no CUDA-capable device is detected"
I followed these instructions to set up CUDA inside an LXC container.
When I try to execute the sample ./deviceQuery script inside the container, the following error is returned:
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device…
Greg
- 1,557
- 5
- 24
- 35
2
votes
0 answers
libGL error: dlopen /usr/lib64/dri/nouveau_dri.so failed on CentOS 6.6
I'm having problems using the nouveau driver for my Nvidia GeForce 9100.
Xorg starts up and works fine, I am able to use everything, although in /var/log/Xorg.0.log I have:
$ cat /var/log/Xorg.0.log | grep EE
[ 36.166] (EE) AIGLX error: dlopen…
Leo
- 121
- 2