Ampere (microarchitecture)

Nvidia Ampere
Fabrication process	TSMC 7 nm (FinFET)
History
Predecessor	Turing?[1] (consumer); Volta (professional)
Successor	Hopper

Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to the Volta architecture. It was officially announced on May 14, 2020, and is named after the French mathematician and physicist André-Marie Ampère.[2][3] It is unknown whether Ampere will be featured in Nvidia's expected GeForce RTX 30 family of cards, which may be released in Q4 2020.[1]

Details

Architectural improvements of the Ampere architecture include the following:

CUDA Compute Capability 8.0
TSMC's 7 nm FinFET process
Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 (TF32) and FP64 support and sparsity acceleration[4]
High Bandwidth Memory 2 (HBM2)
NVLink 3.0 (50Gbps per pair)[4]
PCI Express 4.0 with SR-IOV support
Multi-Instance GPU (MIG) virtualization and GPU partitioning feature
PureVideo Feature Set K hardware video decoding
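
The reduced-precision Tensor Core formats in the list above can be illustrated in software. The sketch below assumes the commonly documented bit layouts (TF32: 8 exponent bits and 10 mantissa bits; bfloat16: 8 exponent bits and 7 mantissa bits), which this article does not itself state. It emulates each format by zeroing the low mantissa bits of an IEEE 754 single-precision value. Hardware conversion may round rather than truncate, so this is an aid to intuition rather than a bit-exact model, and the function names are hypothetical.

```python
import struct

# Mantissa widths in bits. These layouts are assumptions taken from common
# documentation of the formats, not from the article text.
FP32_MANTISSA_BITS = 23
TF32_MANTISSA_BITS = 10
BFLOAT16_MANTISSA_BITS = 7


def truncate_mantissa(value: float, keep_bits: int) -> float:
    """Keep only the top `keep_bits` mantissa bits of value's float32 encoding."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = FP32_MANTISSA_BITS - keep_bits
    mask = (0xFFFFFFFF >> drop) << drop  # clears the lowest `drop` bits
    (result,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return result


def as_tf32(value: float) -> float:
    """Approximate TF32 precision by truncation (hypothetical helper)."""
    return truncate_mantissa(value, TF32_MANTISSA_BITS)


def as_bfloat16(value: float) -> float:
    """Approximate bfloat16 precision by truncation (hypothetical helper)."""
    return truncate_mantissa(value, BFLOAT16_MANTISSA_BITS)


if __name__ == "__main__":
    x = 1.0 + 2.0 ** -9
    print(as_tf32(x))      # 2**-9 fits in 10 mantissa bits, so x is unchanged
    print(as_bfloat16(x))  # 2**-9 does not fit in 7 mantissa bits, so this is 1.0
```

Both formats keep the 8-bit exponent of single precision, so under these assumptions they cover the same numeric range as FP32 and differ only in how finely they resolve values within it.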

A100 accelerator and DGX A100

The Ampere-based A100 accelerator was announced and released on May 14, 2020.[4] The A100 features 19.5 teraflops of FP32 performance, 6912 CUDA cores, 40 GB of graphics memory, and 1.6 TB/s of graphics memory bandwidth.[1] The A100 was initially available only in the third generation of the DGX server, which includes eight A100s.[4] The DGX A100 also includes 15 TB of PCIe gen 4 NVMe storage,[1] two 64-core AMD Rome 7742 CPUs, 1 TB of RAM, and a Mellanox-powered HDR InfiniBand interconnect. The initial price of the DGX A100 was $199,000.[4]
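
The 19.5 teraflops figure can be reproduced from the core count. The sketch below assumes the usual convention that each CUDA core completes one fused multiply-add per clock, counted as two floating-point operations, and it uses the 1410 MHz boost clock listed in the comparison table. The helper name `peak_fp32_tflops` is hypothetical, and the eight-GPU totals for the DGX A100 are simple multiplication, not measured or quoted figures.

```python
FLOPS_PER_FMA = 2  # assumption: one fused multiply-add counts as two operations


def peak_fp32_tflops(cuda_cores: int, boost_clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS (hypothetical helper)."""
    flops_per_second = cuda_cores * FLOPS_PER_FMA * boost_clock_mhz * 1e6
    return flops_per_second / 1e12


# Figures taken from the article: 6912 CUDA cores, 1410 MHz boost, 40 GB, 8 GPUs.
a100_tflops = peak_fp32_tflops(6912, 1410)
dgx_a100_tflops = 8 * a100_tflops
dgx_a100_gpu_memory_gb = 8 * 40

if __name__ == "__main__":
    print(f"A100 peak FP32: {a100_tflops:.1f} TFLOPS")
    print(f"DGX A100 aggregate peak FP32: {dgx_a100_tflops:.1f} TFLOPS")
    print(f"DGX A100 aggregate GPU memory: {dgx_a100_gpu_memory_gb} GB")
```

The same formula applied to the V100 (5120 cores at 1530 MHz) and the P100 (3584 cores at 1480 MHz) gives values that round to the 15.7 and 10.6 TFLOPS listed in the table.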

Comparison of accelerators used in DGX:[4]

Accelerator	Architecture	FP32 CUDA Cores	Boost Clock	Memory Clock	Memory Bus Width	Memory Bandwidth	VRAM	Single Precision	Double Precision	INT8 Tensor	FP16 Tensor	bfloat16 Tensor	TensorFloat-32 (TF32) Tensor	FP64 Tensor	Interconnect	GPU	GPU Die Size	Transistor Count	TDP	Manufacturing Process
A100	Ampere	6912	1410 MHz	2.4 Gbps HBM2	5120-bit	1555 GB/s	40 GB	19.5 TFLOPS	9.7 TFLOPS	624 TOPS	312 TFLOPS	312 TFLOPS	156 TFLOPS	19.5 TFLOPS	600 GB/s	GA100	826 mm²	54.2B	400 W	TSMC 7 nm N7
V100	Volta	5120	1530 MHz	1.75 Gbps HBM2	4096-bit	900 GB/s	16 GB/32 GB	15.7 TFLOPS	7.8 TFLOPS	N/A	125 TFLOPS	N/A	N/A	N/A	300 GB/s	GV100	815 mm²	21.1B	300 W/350 W	TSMC 12 nm FFN
P100	Pascal	3584	1480 MHz	1.4 Gbps HBM2	4096-bit	720 GB/s	16 GB	10.6 TFLOPS	5.3 TFLOPS	N/A	N/A	N/A	N/A	N/A	160 GB/s	GP100	610 mm²	15.3B	300 W	TSMC 16 nm FinFET
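
The memory bandwidth column follows approximately from the bus width and the per-pin data rate: bandwidth in GB/s is the bus width in bits times the data rate in Gbps, divided by 8 bits per byte. The sketch below applies that formula to the table's values. The data rates in the table are rounded, so the estimates land within roughly 1.5 percent of the listed bandwidth rather than matching it exactly. The function and variable names are hypothetical.

```python
def hbm2_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Estimate memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


# (bus width in bits, per-pin data rate in Gbps, listed bandwidth in GB/s),
# all copied from the comparison table.
TABLE = {
    "A100": (5120, 2.4, 1555),
    "V100": (4096, 1.75, 900),
    "P100": (4096, 1.4, 720),
}


def compare(table=TABLE):
    """Return {name: (estimate, listed, relative_error)} for each accelerator."""
    rows = {}
    for name, (width, rate, listed) in table.items():
        estimate = hbm2_bandwidth_gb_s(width, rate)
        rows[name] = (estimate, listed, abs(estimate - listed) / listed)
    return rows


if __name__ == "__main__":
    for name, (estimate, listed, error) in compare().items():
        print(f"{name}: estimated {estimate:.0f} GB/s, listed {listed} GB/s "
              f"({error:.1%} difference)")
```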

References

[1] Tom Warren; James Vincent (May 14, 2020). "Nvidia's first Ampere GPU is designed for data centers and AI, not your PC". The Verge.
[2] https://nvidianews.nvidia.com/news/nvidias-new-ampere-data-center-gpu-in-full-production
[3] https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/
[4] Ryan Smith (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
