The NVIDIA H200 Tensor Core GPU is built to power the next generation of generative AI and high-performance computing (HPC). With advanced HBM3e memory, it delivers faster speeds and larger capacity, enabling smooth performance for large language models (LLMs), AI workloads, and scientific computing.
Specification | H200 SXM¹ | H200 NVL¹ |
---|---|---|
FP64 | 34 TFLOPS | 30 TFLOPS |
FP64 Tensor Core | 67 TFLOPS | 60 TFLOPS |
FP32 | 67 TFLOPS | 60 TFLOPS |
TF32 Tensor Core² | 989 TFLOPS | 835 TFLOPS |
BFLOAT16 Tensor Core² | 1,979 TFLOPS | 1,671 TFLOPS |
FP16 Tensor Core² | 1,979 TFLOPS | 1,671 TFLOPS |
FP8 Tensor Core² | 3,958 TFLOPS | 3,341 TFLOPS |
INT8 Tensor Core² | 3,958 TOPS | 3,341 TOPS |
GPU Memory | 141 GB | 141 GB |
GPU Memory Bandwidth | 4.8 TB/s | 4.8 TB/s |
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG |
Confidential Computing | Supported | Supported |
Max Thermal Design Power (TDP) | Up to 700 W (configurable) | Up to 600 W (configurable) |
Multi-Instance GPUs (MIG) | Up to 7 MIGs @ 18 GB each | Up to 7 MIGs @ 16.5 GB each |
Form Factor | SXM | PCIe, dual-slot, air-cooled |
Interconnect | NVIDIA NVLink™: 900 GB/s; PCIe Gen5: 128 GB/s | 2- or 4-way NVIDIA NVLink bridge: 900 GB/s per GPU; PCIe Gen5: 128 GB/s |
Server Options | NVIDIA HGX™ H200 partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs | NVIDIA MGX™ H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs |
NVIDIA AI Enterprise | Add-on | Included |

¹ Preliminary specifications. May be subject to change.
² With sparsity.
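On a provisioned system, the memory figures in the table above can be confirmed directly from the CUDA runtime. The snippet below is a minimal sketch that queries device 0 and prints its name and total memory; the device index and the `nvcc device_query.cu -o device_query` build command are assumptions, and the reported total is typically slightly below the nominal figure because part of the HBM is reserved by the driver and ECC.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Query the first visible CUDA device (index 0 is an assumption;
    // adjust for multi-GPU nodes or set CUDA_VISIBLE_DEVICES).
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // On an H200 the total should come out close to the 141 GB listed above.
    std::printf("Device name          : %s\n", prop.name);
    std::printf("Total global memory  : %.1f GB\n", prop.totalGlobalMem / 1e9);
    std::printf("Multiprocessor count : %d\n", prop.multiProcessorCount);
    std::printf("ECC enabled          : %s\n", prop.ECCEnabled ? "yes" : "no");
    return 0;
}
```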
The NVIDIA H200, built on the NVIDIA Hopper™ architecture, is the first GPU with 141 GB of HBM3e memory and 4.8 TB/s of memory bandwidth, nearly twice the capacity and 1.4× the bandwidth of the H100 Tensor Core GPU. Its larger, faster memory speeds up generative AI and large language models (LLMs) while improving HPC workloads with greater efficiency and lower total cost of ownership.
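For a sense of scale, the "nearly twice" and "1.4×" figures can be worked out against the commonly published H100 SXM numbers of 80 GB HBM3 and 3.35 TB/s; those H100 values are taken from its own datasheet and are an assumption here, not part of this page's specifications.

```latex
\frac{141\ \text{GB}}{80\ \text{GB}} \approx 1.76 \ \text{(capacity)}
\qquad
\frac{4.8\ \text{TB/s}}{3.35\ \text{TB/s}} \approx 1.43 \ \text{(bandwidth)}
```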
LLMs are key for today’s AI applications, and running them at scale demands speed and efficiency. The NVIDIA H200 delivers up to 2× faster inference than H100 GPUs on models like Llama 2, helping businesses serve massive user bases with higher throughput and lower total cost of ownership.
Memory bandwidth plays a vital role in HPC, allowing faster data transfer and reducing processing bottlenecks. For workloads like simulations, scientific research, and AI, the NVIDIA H200’s higher bandwidth ensures data is accessed and processed efficiently—delivering up to 110× faster results compared to CPUs.
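As a concrete illustration of what memory bandwidth means in practice, the sketch below times a bandwidth-bound copy kernel and reports effective GB/s as bytes moved divided by elapsed time. It is an illustrative sketch rather than a tuned benchmark: the buffer size and launch configuration are arbitrary choices, most error checking is omitted, and a real measurement would average many runs (NVIDIA's bandwidthTest sample or a STREAM-style benchmark is the usual tool).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Copy kernel: each element is read once and written once, so the kernel is
// limited by memory bandwidth rather than arithmetic throughput.
__global__ void copyKernel(const float* __restrict__ in,
                           float* __restrict__ out, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

int main() {
    const size_t n = size_t(1) << 28;       // ~268M floats, about 1 GB per buffer
    const size_t bytes = n * sizeof(float);

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_in, 0, bytes);

    dim3 block(256);
    dim3 grid((unsigned)((n + block.x - 1) / block.x));

    // Warm-up launch so the timed run excludes one-time initialization costs.
    copyKernel<<<grid, block>>>(d_in, d_out, n);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    copyKernel<<<grid, block>>>(d_in, d_out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // One read plus one write per element.
    double gbps = (2.0 * bytes) / (ms / 1e3) / 1e9;
    std::printf("Effective bandwidth: %.1f GB/s\n", gbps);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```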
The NVIDIA H200 sets a new standard in energy efficiency and total cost of ownership. Delivering unmatched performance within the same power profile as the H100, it enables AI and supercomputing systems that are faster, more eco-friendly, and more cost-effective—giving businesses and researchers a competitive edge.
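Because the TDP listed in the table is configurable, the enforced power limit and live draw of a given board can be read with NVIDIA's NVML library, the same interface nvidia-smi is built on. The sketch below assumes device index 0 and linking against `-lnvidia-ml`; it only illustrates the relevant NVML calls.

```cuda
// Build (assumed command): nvcc power_check.cu -lnvidia-ml -o power_check
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit_v2() != NVML_SUCCESS) {
        std::fprintf(stderr, "NVML init failed\n");
        return 1;
    }

    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex_v2(0, &dev) == NVML_SUCCESS) {
        char name[NVML_DEVICE_NAME_BUFFER_SIZE];
        unsigned int powerMw = 0, limitMw = 0;

        nvmlDeviceGetName(dev, name, sizeof(name));
        nvmlDeviceGetPowerUsage(dev, &powerMw);          // current draw, milliwatts
        nvmlDeviceGetEnforcedPowerLimit(dev, &limitMw);  // configured cap, milliwatts

        std::printf("%s: drawing %.1f W of a %.1f W limit\n",
                    name, powerMw / 1000.0, limitMw / 1000.0);
    }

    nvmlShutdown();
    return 0;
}
```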
**What is the NVIDIA H200?**
The NVIDIA H200 is a high-performance GPU built on the Hopper™ architecture, designed for generative AI, large language models (LLMs), and high-performance computing (HPC) workloads.

**How much memory does the H200 have?**
The H200 features 141 GB of HBM3e memory with a bandwidth of 4.8 TB/s, nearly double the capacity of the H100 GPU.

**What workloads is the H200 best suited for?**
It is ideal for AI inference, training LLMs such as Llama 2, scientific computing, simulations, and other memory-intensive HPC applications.

**Can the H200 be used in data centers and supercomputers?**
Yes. The H200 is designed for AI factories, HPC clusters, and supercomputing environments, delivering faster computation with better energy efficiency.

**How does the H200 compare to the H100?**
The H200 offers nearly double the memory, 1.4× the memory bandwidth, up to 2× faster AI inference, and better energy efficiency, making it a next-generation upgrade.