Questions tagged [cuda]

CUDA (Compute Unified Device Architecture) is a parallel computing platform and API created by Nvidia to perform GPGPU (General-Purpose computing on GPU).

28 questions
8 votes, 2 answers

How to run GPGPU Memory Testing

We use a lot of GPGPU computing (mostly with CUDA, but some OpenCL). Often when users are running code, the code errors out with a memory error on only one of our hosts. I suspect one of the cards is faulty. Sometimes it brings down the whole…
Andrew Case
6 votes, 2 answers

Force a headless server to load video drivers for the GPU?

I am running a headless server on Ubuntu, with the objective of using GPUs for non-graphics computation. However, I have found that without the monitor plugged in, the kernel fails to load the graphics drivers. Is there any reason that I can't use…
MrSynAckSter
5 votes, 1 answer

Why is my CUDA GPU-Util ~70% when there are "No running processes found"?

After configuring a system with 2 Tesla K80 cards, I noticed when running nvidia-smi that one of the 4 GPUs was under heavy load despite there being "No running processes found". Why is this happening and how do I correct this? Here is the output…
5 votes, 4 answers

How to set up SGE for CUDA devices?

I'm currently facing the problem of integrating GPU servers into an existing SGE environment. Using Google, I found some examples of clusters where this has been set up, but no information on how it was done. Is there some form of howto or…
luxifer
5 votes, 2 answers

Can ESXi pass video card to VM to do CUDA?

I have ESXi 4.1 running on hardware that can run four 16-lane PCIe cards. I would like to have access to the underlying hardware from a Linux VM, to run some CUDA programs. So far all I can see from inside the Linux VM is the generic VMware video…
Marcin
4 votes, 2 answers

8 GPU machine freezes

We have a SuperMicro GPU server with: 2x Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz; 512GB memory; more than enough disk space; X10DRG-O+-CPU (BIOS Version: 2.0a [current]); X9DRG-O-PCIE PCI-E expander card; 8x GTX 1080. It is set up with Ubuntu 16.04.1…
pks
2 votes, 0 answers

Multi-Tenancy (Multi-user) GPU Container Infrastructure Solution

What we need: Several teams from different companies want to share our GPUs for deep learning tasks (three computers with several GPUs each), so we need to manage multiple GPUs for multiple users. Different teams should not have access to the data of other…
2 votes, 0 answers

ESXi PCIe GPU Passthrough does not allow for CUDA

I am trying to do CUDA development in an ESXi environment, so I installed a Quadro 5800 in my ESXi machine (Dell T7500). I set up passthrough to the Windows 7 VM that I will be doing development in, but when I run GPU-Z or a CUDA program, CUDA is not…
2 votes, 1 answer

Executing CUDA script in LXC container results in "cuda error: no CUDA-capable device is detected"

I followed the following instructions in order to set up CUDA inside an LXC container. When I try to execute the sample ./deviceQuery script inside the container, the following error is returned: $ ./deviceQuery ./deviceQuery Starting... CUDA Device…
Greg
2 votes, 1 answer

How important is the CPU when building a CUDA system?

I'm just a clueless sysadmin and we need to put together a couple of machines specifically for users to use CUDA. We're looking at the Dell PowerEdge T620 and jamming four CUDA cards into the sucker. Researching the CUDA components is another…
2 votes, 1 answer

How does AWS do GPU virtualization?

What kind of technology is Amazon using for GPU virtualization? Can multiple VMs on an AWS GPU instance concurrently share a GPU and have acceleration for their CUDA/OpenCL programs? I know the following are the methods possible for GPU…
2 votes, 0 answers

CUDA 5.0 does not see the Tesla C2050

I have upgraded our development machine, which is equipped with two Tesla C1060 cards and one Tesla C2050 card, to CUDA 5.0. The machine runs Windows Server 2008 R2 (x64). All three cards are visible in the Windows device manager with NVIDIA driver…
sakra
2 votes, 1 answer

Hosting with CUDA support

One web application that I'm planning relies on CUDA (http://www.nvidia.com/object/cuda_home_new.html) for doing heavy math processing. I developed the software at home, but now I'm looking for deployment options. I know that Amazon EC2 provides…
dsign
1 vote, 1 answer

Where to get a CUDA/GPU-enabled version of the HPL benchmark?

After setting up a new compute server for my research group, I need to evaluate the overall performance of this machine, including both Tesla cards. I found some information about a CUDA-enabled version of Linpack and how it is used, but no download…
M.K. aka Grisu
1 vote, 2 answers

Using CUDA_VISIBLE_DEVICES with SGE

I am using SGE with a resource complex called 'gpu.q' that allows resource management of GPU devices (these are all NVIDIA devices). However, on the systems there are multiple GPU devices (in exclusive mode), and if two jobs are allocated on the same node…
Marm0t