Nvidia DGX

Nvidia DGX is a line of Nvidia-produced servers and workstations that specialize in using general-purpose GPU computing (GPGPU) to accelerate deep learning applications.

DGX-1

DGX-1 servers feature eight GPUs based on Pascal or Volta daughter cards[1] with HBM2 memory, connected by an NVLink mesh network.[2]

The product line is intended to bridge the gap between GPUs and AI accelerators: the device has specific features that specialize it for deep learning workloads.[3] The initial Pascal-based DGX-1 delivered 170 teraflops of half-precision processing,[4] while the Volta-based upgrade increased this to 960 teraflops.[5]
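The 170-teraflop figure is consistent with the per-GPU peaks: a back-of-the-envelope sketch, assuming the P100's roughly 10.6 TFLOPS single-precision peak and that its half-precision peak is twice that (the figures are assumptions taken from the comparison table below, not an official derivation):

```python
# Rough check of the Pascal DGX-1's quoted 170 TFLOPS half-precision figure.
# Assumption: P100 FP16 peak ~= 2x its FP32 peak of 10.6 TFLOPS.
P100_FP32_TFLOPS = 10.6
GPUS_PER_DGX1 = 8

fp16_per_gpu = 2 * P100_FP32_TFLOPS        # ~21.2 TFLOPS per GPU
dgx1_fp16 = GPUS_PER_DGX1 * fp16_per_gpu   # ~169.6 TFLOPS, quoted as 170
print(round(dgx1_fp16, 1))
```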

DGX-2

The successor of the Nvidia DGX-1 is the Nvidia DGX-2, which uses sixteen 32 GB V100 (second-generation) cards in a single unit. This increases performance to up to 2 petaflops, with 512 GB of shared memory for tackling larger problems, and uses NVSwitch to speed up internal communication.
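The DGX-2's aggregate figures are straightforward products of the per-card specs; a sketch, assuming sixteen V100s at 125 tensor TFLOPS and 32 GB each (per-card numbers taken from the comparison table below):

```python
# DGX-2 aggregates from per-card V100 specs.
V100_FP16_TENSOR_TFLOPS = 125  # peak FP16 tensor throughput per V100
V100_MEM_GB = 32               # HBM2 per card (32 GB variant)
CARDS = 16

total_tflops = CARDS * V100_FP16_TENSOR_TFLOPS  # 2000 TFLOPS = 2 petaflops
total_mem_gb = CARDS * V100_MEM_GB              # 512 GB pooled via NVSwitch
print(total_tflops, total_mem_gb)
```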

Additionally, there is a higher-performance version of the DGX-2, the DGX-2H; the notable difference is the replacement of the dual Intel Xeon Platinum 8168 CPUs at 2.7 GHz with dual Intel Xeon Platinum 8174 CPUs at 3.1 GHz.[6]

DGX A100

The third generation of DGX server, the DGX A100, was announced and released on May 14, 2020; it includes eight Ampere-based A100 accelerators.[7] Also included are 15 TB of PCIe gen 4 NVMe storage,[8] two 64-core AMD Rome 7742 CPUs, 1 TB of RAM, and a Mellanox-powered HDR InfiniBand interconnect. The initial price for the DGX A100 was $199,000.[7]
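As with the earlier systems, the DGX A100's aggregate GPU resources follow from the per-accelerator specs; a sketch, assuming eight A100s at 312 dense FP16 tensor TFLOPS and 40 GB each (per-card numbers taken from the comparison table below; sparsity features, which Nvidia quotes separately, are ignored):

```python
# DGX A100 aggregates from per-accelerator A100 specs (dense, no sparsity).
A100_FP16_TENSOR_TFLOPS = 312  # peak dense FP16 tensor throughput per A100
A100_MEM_GB = 40               # HBM2 per accelerator
ACCELERATORS = 8

total_tensor_tflops = ACCELERATORS * A100_FP16_TENSOR_TFLOPS  # 2496 TFLOPS
total_gpu_mem_gb = ACCELERATORS * A100_MEM_GB                 # 320 GB
print(total_tensor_tflops, total_gpu_mem_gb)
```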

Accelerators

Comparison of accelerators used in DGX:[7]

Accelerator              A100               V100               P100
Architecture             Ampere             Volta              Pascal
FP32 CUDA Cores          6912               5120               3584
Boost Clock              ~1410 MHz          1530 MHz           1480 MHz
Memory Clock             2.4 Gbps HBM2      1.75 Gbps HBM2     1.4 Gbps HBM2
Memory Bus Width         5120-bit           4096-bit           4096-bit
Memory Bandwidth         1.6 TB/sec         900 GB/sec         720 GB/sec
VRAM                     40 GB              16 GB/32 GB        16 GB
Single Precision         19.5 TFLOPS        15.7 TFLOPS        10.6 TFLOPS
Double Precision         9.7 TFLOPS         7.8 TFLOPS         5.3 TFLOPS
INT8 Tensor              624 TFLOPS         N/A                N/A
FP16 Tensor              312 TFLOPS         125 TFLOPS         N/A
TF32 Tensor              156 TFLOPS         N/A                N/A
Interconnect             600 GB/sec         300 GB/sec         160 GB/sec
GPU                      A100               GV100              GP100
GPU Die Size             826 mm²            815 mm²            610 mm²
Transistor Count         54.2B              21.1B              15.3B
TDP                      400 W              300 W/350 W        300 W
Manufacturing Process    TSMC 7N            TSMC 12nm FFN      TSMC 16nm FinFET
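The memory-bandwidth row can be cross-checked from the memory clock and bus width, since peak HBM2 bandwidth is approximately the per-pin data rate times the bus width divided by 8 bits per byte; a sketch using the table's figures:

```python
# Cross-check: peak bandwidth ~= per-pin data rate (Gbps) * bus width (bits) / 8.
def hbm2_bandwidth_gb_per_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Theoretical memory bandwidth in GB/sec."""
    return pin_rate_gbps * bus_width_bits / 8

a100 = hbm2_bandwidth_gb_per_s(2.4, 5120)   # 1536 GB/sec, quoted as 1.6 TB/sec
v100 = hbm2_bandwidth_gb_per_s(1.75, 4096)  # 896 GB/sec, quoted as 900 GB/sec
p100 = hbm2_bandwidth_gb_per_s(1.4, 4096)   # ~717 GB/sec, quoted as 720 GB/sec
print(a100, v100, p100)
```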



See also

  • Deep Learning Super Sampling

References

  1. "NVIDIA DGX-1" (PDF).
  2. "Inside Pascal". Eight GPU hybrid cube mesh architecture with NVLink.
  3. "Deep learning supercomputer".
  4. "DGX-1 deep learning system" (PDF). NVIDIA DGX-1 Delivers 75X Faster Training...Note: Caffe benchmark with AlexNet, training 1.28M images with 90 epochs.
  5. "DGX Server". Nvidia. Retrieved 7 September 2017.
  6. https://docs.nvidia.com/dgx/pdf/dgx2-user-guide.pdf
  7. Ryan Smith (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.
  8. Tom Warren; James Vincent (May 14, 2020). "Nvidia's first Ampere GPU is designed for data centers and AI, not your PC". The Verge.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.