
I have the following environment:

Pods: Pod0, Pod1 (launched as a k8s Job)
GPUs: GPU0, GPU1

GPU0 is dedicated to Pod0, and GPU1 is dedicated to Pod1.

There can be multiple Pod0s and Pod1s at the same time. If there are two Pod0s, only one of them can use GPU0 at a time; the other Pod0 should stay in the Pending state until the first Pod0 finishes.

Is this workload possible?

Currently I use the nvidia.com/gpu resource setting, but the only thing I can control with it is the number of GPUs allocated to each Pod. I also tried setting NVIDIA_VISIBLE_DEVICES but could not get the behavior I wanted.
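
For reference, this is roughly what I have now (a minimal sketch; the Job name and image are placeholders). Each Pod of the Job requests one whole GPU through the device plugin, which is the only knob I can set:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pod0-job                     # placeholder name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: my-gpu-image:latest   # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1        # only the number of GPUs can be set here
```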

Daigo
  • If the answer was useful, please mark the answer as accepted for greater visibility for the community or upvote if the answer has some useful information. – Hemanth Kumar Aug 24 '22 at 10:47

1 Answer


Instead of keeping the second Pod in the Pending state, you can let multiple Pods share one GPU, e.g. several Pod0s sharing a single GPU, with the help of the NVIDIA tooling. Just don't specify nvidia.com/gpu in the resource limits/requests; that way the containers of all the Pods have full access to the GPU as if they were ordinary processes. Follow this guide for GPU sharing on top of Kubernetes and refer to this Stack Overflow answer for more information.
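
A minimal sketch of that idea, assuming the NVIDIA container runtime is the default runtime on the node (the Pod name and image are placeholders): the Pod does not request nvidia.com/gpu, so the scheduler does not count it against the GPU, and it exposes the GPU to the container via NVIDIA_VISIBLE_DEVICES instead. Any number of such Pods can then run on the same GPU at once:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod0-shared                   # placeholder name
spec:
  restartPolicy: Never
  containers:
  - name: worker
    image: my-gpu-image:latest        # placeholder image
    env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: "all"                    # expose the node's GPUs without a resource request
    # note: no nvidia.com/gpu under resources, so the device plugin does not fence this Pod off
```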

  • Thank you for the answer. I don't want multiple Pods to share a GPU, I want one GPU to be dedicated to one Pod. Any ideas? – Daigo Aug 29 '22 at 06:41
  • @Daigo : You can create a node pool with your desired GPU type. That node pool will have as many nodes as you have Pods in your deployment; each node will host only one Pod and will have one GPU of your choice (see the sketch below). You can read more about how to do this on the GKE [docs here](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus) – Hemanth Kumar Aug 29 '22 at 10:34
  • @Daigo : Were any of the above comments helpful? Is your issue resolved? – Hemanth Kumar Sep 01 '22 at 11:54
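
A minimal sketch of the node-pool approach from the comments, assuming a GKE node pool whose nodes each carry exactly one GPU and are labelled with the standard cloud.google.com/gke-accelerator label (the accelerator type, Job name, and image are placeholders). Because each Pod0 requests one whole nvidia.com/gpu and only the matching nodes have one, a second Pod0 stays Pending until the first one finishes, which is the queueing behaviour asked about:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pod0-job                                            # placeholder name
spec:
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-t4   # placeholder accelerator type
      containers:
      - name: worker
        image: my-gpu-image:latest                          # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1   # exclusive use of the node's single GPU
```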