Mediasys – Turnkey Solution Provider and Distributor in the UAE.

Supercharging AI and HPC workloads.

NVIDIA H200 Tensor Core GPU

141GB HBM3e Tensor Core GPU

The NVIDIA H200 Tensor Core GPU is built to power the next generation of generative AI and high-performance computing (HPC). With advanced HBM3e memory, it offers greater capacity and higher bandwidth than its predecessor, accelerating large language models (LLMs), generative AI workloads, and scientific computing.

Take Performance Further

Llama 2 70B Inference

1.9X Faster than H100

GPT-3 175B Inference

1.6X Faster than H100

High-Performance Computing

110X Faster than CPUs

NVIDIA H200 Specifications

Specification                   H200 SXM¹                           H200 NVL¹
FP64                            34 TFLOPS                           30 TFLOPS
FP64 Tensor Core                67 TFLOPS                           60 TFLOPS
FP32                            67 TFLOPS                           60 TFLOPS
TF32 Tensor Core²               989 TFLOPS                          835 TFLOPS
BFLOAT16 Tensor Core²           1,979 TFLOPS                        1,671 TFLOPS
FP16 Tensor Core²               1,979 TFLOPS                        1,671 TFLOPS
FP8 Tensor Core²                3,958 TFLOPS                        3,341 TFLOPS
INT8 Tensor Core²               3,958 TOPS                          3,341 TOPS
GPU Memory                      141 GB                              141 GB
GPU Memory Bandwidth            4.8 TB/s                            4.8 TB/s
Decoders                        7 NVDEC, 7 JPEG                     7 NVDEC, 7 JPEG
Confidential Computing          Supported                           Supported
Max Thermal Design Power (TDP)  Up to 700 W (configurable)          Up to 600 W (configurable)
Multi-Instance GPUs (MIG)       Up to 7 MIGs @ 18 GB each           Up to 7 MIGs @ 16.5 GB each
Form Factor                     SXM                                 PCIe, dual-slot air-cooled
Interconnect                    NVIDIA NVLink™: 900 GB/s            2- or 4-way NVLink™ bridge:
                                PCIe Gen5: 128 GB/s                 900 GB/s per GPU;
                                                                    PCIe Gen5: 128 GB/s
Server Options                  NVIDIA HGX™ H200 partner and        NVIDIA MGX™ H200 NVL partner
                                NVIDIA-Certified Systems™ with      and NVIDIA-Certified Systems
                                4 or 8 GPUs                         with up to 8 GPUs
NVIDIA AI Enterprise            Add-on                              Included

¹ Preliminary specifications; subject to change.
² With sparsity.
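
As a quick sanity check against the table above, the memory figure can be read back at runtime. A minimal sketch, assuming a CUDA build of PyTorch and one visible GPU:

    import torch

    # Query the first CUDA device and report its name and total memory.
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB total memory")
    # An H200 should report roughly 140 GiB for its 141 GB of HBM3e.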

More Power with Larger, Faster Memory

The NVIDIA H200, built on the NVIDIA Hopper™ architecture, is the first GPU with 141 GB of HBM3e memory and 4.8 TB/s of bandwidth, nearly twice the capacity and 1.4× the bandwidth of the H100 Tensor Core GPU. Its larger, faster memory speeds up generative AI and large language models (LLMs) while improving HPC workloads with greater efficiency and lower total cost of ownership.
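
To put the capacity figure in context, here is a back-of-the-envelope sketch; the bytes-per-parameter values are standard, but the model choice is illustrative:

    # Rough check: how much of the 141 GB does a 70B-parameter model occupy?
    def weights_gb(params_billion: float, bytes_per_param: float) -> float:
        return params_billion * 1e9 * bytes_per_param / 1e9

    for precision, nbytes in [("FP16", 2.0), ("FP8", 1.0)]:
        print(f"70B weights at {precision}: {weights_gb(70, nbytes):.0f} GB")

    # FP16 -> 140 GB: too large for one 80 GB H100, a tight fit in 141 GB.
    # FP8  ->  70 GB: leaves roughly half the card for KV cache and activations.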

[Chart: LLM inference performance, H200 vs. H100]

Unlock Insights with High-Performance LLM Inference

LLMs are key for today’s AI applications, and running them at scale demands speed and efficiency. The NVIDIA H200 delivers up to 2× faster inference than H100 GPUs on models like Llama 2, helping businesses serve massive user bases with higher throughput and lower total cost of ownership.
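
A rough way to see why bandwidth governs inference speed: in the decode phase, generating each token streams essentially all of the model weights from memory, so bandwidth caps single-stream tokens per second. A hedged sketch using the spec-table numbers (FP16 weights are an assumption):

    # Roofline-style estimate for memory-bound LLM decode on one H200.
    BANDWIDTH_TBS = 4.8      # H200 memory bandwidth, from the spec table
    PARAMS = 70e9            # Llama 2 70B
    BYTES_PER_PARAM = 2      # FP16 weights (assumption)

    weight_bytes = PARAMS * BYTES_PER_PARAM
    tokens_per_s = BANDWIDTH_TBS * 1e12 / weight_bytes
    print(f"~{tokens_per_s:.0f} tokens/s per sequence at batch size 1")  # ~34

    # Batching reuses each weight read across many sequences, which is how
    # serving stacks turn bandwidth headroom into aggregate throughput.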

Accelerate High-Performance Computing

Memory bandwidth plays a vital role in HPC, allowing faster data transfer and reducing processing bottlenecks. For workloads like simulations, scientific research, and AI, the NVIDIA H200’s higher bandwidth ensures data is accessed and processed efficiently—delivering up to 110× faster results compared to CPUs.
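
Effective bandwidth is straightforward to probe with a STREAM-style kernel. A minimal sketch, assuming a CUDA build of PyTorch; the tensor size and iteration count are arbitrary choices:

    import torch

    n = 1 << 28                                # ~268M float32 elements, 1 GiB per tensor
    a = torch.rand(n, device="cuda")
    b = torch.rand(n, device="cuda")
    c = torch.empty(n, device="cuda")
    torch.add(a, b, out=c)                     # warm-up

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    iters = 20
    start.record()
    for _ in range(iters):
        torch.add(a, b, out=c)                 # two reads + one write per element
    end.record()
    torch.cuda.synchronize()

    ms = start.elapsed_time(end) / iters
    bytes_moved = 3 * n * 4
    print(f"effective bandwidth: {bytes_moved / (ms * 1e-3) / 1e12:.2f} TB/s")
    # An H200 should approach its 4.8 TB/s peak on streaming kernels like this.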

Optimize Power and Reduce TCO

The NVIDIA H200 sets a new standard in energy efficiency and total cost of ownership. Delivering unmatched performance within the same power profile as the H100, it enables AI and supercomputing systems that are faster, more eco-friendly, and more cost-effective—giving businesses and researchers a competitive edge.
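
The arithmetic behind that claim is simple, using only the figures quoted on this page (700 W TDP and a 1.9X Llama 2 inference speedup over H100 at the same power envelope); actual savings depend on workload and utilization:

    # Same power, higher throughput -> fewer joules per token (illustrative).
    TDP_W = 700.0            # both H100 SXM and H200 SXM, per this page
    SPEEDUP = 1.9            # Llama 2 70B inference, H200 vs. H100

    h100_j_per_unit = TDP_W / 1.0
    h200_j_per_unit = TDP_W / SPEEDUP
    saving = 1 - h200_j_per_unit / h100_j_per_unit
    print(f"energy per inference: ~{saving:.0%} lower on H200")  # ~47% lower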

FAQs

Popular Questions