NVIDIA H200 GPU server immersion oil cooling solution


In the era of explosive growth in AI computing power, GPUs are the core computing units, and their heat dissipation efficiency directly limits hardware stability and computing power output. The power consumption of NVIDIA’s H200 GPU varies by model and usage scenario. The standard model has a thermal design power (TDP) of 700W. The H200 NVL model has a TDP of 600W, making it better suited to air-cooled enterprise racks with lower power budgets. In actual operation, power consumption also varies with load. For example, during AI training, the H200’s dynamic power consumption can reach around 750W, and with liquid cooling and overclocking enabled it can exceed 800W. When eight GPUs are deployed on a single motherboard, total GPU power consumption reaches 6000W or more, posing a significant challenge to NVIDIA’s original cooling solution. This article reviews the cooling optimization logic for this high-power scenario and draws on our experience customizing a solution for the Australian AI factory FIRMUS (official website: www.firmus.com) and delivering 3,000 samples, providing a practical reference for immersion cooling design in high-density GPU clusters.
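The per-board figures above follow from simple multiplication. The sketch below reproduces them for the three per-GPU power levels quoted in this article; it counts GPU power only, and the function and dictionary names are illustrative rather than part of any vendor tool.

```python
# Per-GPU power figures quoted in the article, in watts.
GPU_POWER_W = {
    "standard TDP": 700,
    "AI training (dynamic)": 750,
    "liquid-cooled, overclocked": 800,
}
GPUS_PER_BOARD = 8


def board_power_w(per_gpu_w: int, gpus: int = GPUS_PER_BOARD) -> int:
    """GPU-only heat load for one board (CPUs, memory and NICs are ignored)."""
    return per_gpu_w * gpus


for scenario, watts in GPU_POWER_W.items():
    print(f"{scenario}: {board_power_w(watts)} W per 8-GPU board")
```

At 750W per GPU the board reaches the 6000W figure used throughout the rest of the article.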

1, Solution Background: The Heat Dissipation Dilemma of 8-GPU H200 Servers and Firmus’s Pain Points
NVIDIA H200 GPUs, core hardware for current AI training and high-performance computing, typically consume 750W of power per card, a 30% increase over previous generations. An 8-GPU motherboard design must therefore handle a concentrated heat source of 6,000W. In addition, the GPU chip area is small, so the heat flux density is extremely high. If heat dissipation is inadequate, the hardware will throttle its frequency (a computing power loss of over 20%), and prolonged high temperatures can shorten the GPU’s lifespan. This pain point is particularly acute for Firmus, an Australian AI factory. As a company specializing in industrial AI algorithm development and large-scale computing power leasing, Firmus operates data centers in technology parks in Singapore and Sydney. They must simultaneously meet three core requirements: high-density deployment (20+ 8-GPU H200 servers per cabinet), low noise (nighttime noise limit ≤40dB), and stable operation (no more than one hour of computing power interruption per year). NVIDIA’s original cooling solution clearly could not meet these requirements.
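To put the density requirement in perspective, the following sketch scales the 6,000W board figure to a full cabinet. It uses the lower bound of the "20+" servers per cabinet stated above and again counts GPU heat only; the names are illustrative.

```python
BOARD_HEAT_W = 6_000       # 8 GPUs x 750 W, as stated in the text
SERVERS_PER_CABINET = 20   # lower bound of the "20+" requirement


def cabinet_heat_kw(servers: int = SERVERS_PER_CABINET,
                    board_w: int = BOARD_HEAT_W) -> float:
    """GPU heat load of one cabinet in kilowatts."""
    return servers * board_w / 1000


print(f"GPU heat per cabinet: at least {cabinet_heat_kw():.0f} kW")
```

A cabinet at this density therefore has to reject on the order of 120 kW from the GPUs alone.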

Initially, Firmus focused on the design of the entire computer room and the stability of the piping system. However, upon receiving NVIDIA’s original H200 GPU cooler, they found that it would not work in an oil-based immersion cooling system. Its core design and the resulting issues are as follows:
• Cooling structure: A bottom vapor chamber combined with welded heat pipes transfers heat from the GPU core to the fins through evaporation and condensation of the working fluid inside the vapor chamber. However, because the entire bottom was submerged in coolant, the working fluid in the vapor chamber could not vaporize, which blocked heat transfer. Under full load, the H200 core temperature frequently exceeded 95°C (close to the 100°C safety threshold), and the resulting frequent frequency reductions slowed AI training.

Therefore, upon receiving the project requirements, our engineers at Kenfa Tech designed a higher-performing immersion cooling solution within one day and ultimately confirmed its feasibility with the client’s team of doctoral researchers.

Optimization Solution: Implementation of Immersion Oil Cooling Technology and Firmus Customized Design

To address the bottlenecks of air cooling and meet Firmus’s high-density, low-noise requirements, we changed the heat dissipation medium from air to electrically insulating thermal oil. This customized immersion oil cooling solution takes advantage of the liquid’s higher thermal conductivity and quiet operation. Specific design details are as follows:
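One complementary way to see why a liquid medium helps is to compare how much coolant must flow past a 6,000W board to carry the heat away. The sketch below uses the energy balance Q = ρ · V̇ · c_p · ΔT. The density and specific heat values are typical textbook figures assumed for illustration, not measurements from this project, and the 10 K coolant temperature rise is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coolant:
    name: str
    density_kg_m3: float
    specific_heat_j_kg_k: float


# Assumed typical property values (not taken from the article).
AIR = Coolant("air", 1.2, 1005.0)
MINERAL_OIL = Coolant("mineral oil", 850.0, 1900.0)


def required_flow_l_per_s(heat_w: float, coolant: Coolant, delta_t_k: float) -> float:
    """Volumetric flow needed so the coolant warms by delta_t_k while absorbing heat_w."""
    m3_per_s = heat_w / (coolant.density_kg_m3 * coolant.specific_heat_j_kg_k * delta_t_k)
    return m3_per_s * 1000.0  # convert m^3/s to L/s


BOARD_HEAT_W = 6000.0  # 8 GPUs x 750 W, from the article
DELTA_T_K = 10.0       # assumed allowable coolant temperature rise

for fluid in (AIR, MINERAL_OIL):
    flow = required_flow_l_per_s(BOARD_HEAT_W, fluid, DELTA_T_K)
    print(f"{fluid.name}: {flow:.2f} L/s")
```

Under these assumptions the oil loop needs roughly a thousandth of the volumetric flow that air would, which is also why fans and their noise can be removed from the server itself.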

2, Core Cooling Structure Upgrade: Copper Baseplate + Heat Pipe Combination Compatible with 8-GPU H200 Servers

The traditional air-cooling “vapor chamber + fan” design has been replaced with a copper baseplate and independently welded heat pipes, a structure better suited to oil-cooling environments. Compatibility has also been optimized for the dimensions of Firmus’s server motherboards:
• Copper baseplate: Made of T2 copper, it sits in direct contact with the H200 GPU core (with a fit tolerance of ≤0.05mm). Copper’s thermal conductivity (401W/m·K) is over 10,000 times that of air (roughly 0.026W/m·K at room temperature), so the baseplate absorbs core heat rapidly and prevents localized hot spots.

• Independently welded heat pipes: Each GPU is equipped with 24 φ6mm oxygen-free copper heat pipes, joined to the copper baseplate by vacuum brazing (thermal resistance at the weld is ≤0.02°C/W). The phase change rate of the working fluid in the heat pipes is 50% faster than in air-cooled scenarios, ensuring efficient heat transfer to the oil. The key is that a large amount of heat is moved from a small area to the fins, where the coolant carries it away.
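The brazed-joint resistance limit can be turned into a temperature drop with ΔT = P · R. The text does not say whether the ≤0.02°C/W limit applies to each of the 24 joints or to the joint interface as a whole, so the sketch below works through both readings. The even split of heat across the pipes is an assumption made for illustration.

```python
GPU_HEAT_W = 750.0                # per-GPU load quoted in the article
HEAT_PIPES_PER_GPU = 24
JOINT_RESISTANCE_C_PER_W = 0.02   # upper bound quoted for the brazed joint


def temperature_drop_c(heat_w: float, resistance_c_per_w: float) -> float:
    """Steady-state temperature drop across a thermal resistance."""
    return heat_w * resistance_c_per_w


# Reading 1 (assumption): the limit applies per joint and heat splits evenly.
heat_per_pipe_w = GPU_HEAT_W / HEAT_PIPES_PER_GPU
drop_per_joint_c = temperature_drop_c(heat_per_pipe_w, JOINT_RESISTANCE_C_PER_W)

# Reading 2 (assumption): the limit describes all joints of one GPU combined.
drop_combined_c = temperature_drop_c(GPU_HEAT_W, JOINT_RESISTANCE_C_PER_W)

print(f"Heat per pipe: {heat_per_pipe_w:.2f} W")
print(f"Drop if limit is per joint: {drop_per_joint_c:.3f} °C")
print(f"Drop if limit is combined:  {drop_combined_c:.1f} °C")
```

Under the per-joint reading the braze contributes well under 1°C at worst; under the combined reading it could account for up to 15°C of the core-to-oil temperature difference.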

3, Oil Selection and Temperature Control: Meeting Firmus’s Year-Round Stable Operation Requirements
To meet Firmus’s “year-round, uninterrupted” requirement, we built redundancy into the oil selection and the circulation system design:

• Oil Selection: Industrial-grade mineral insulating oil with a thermal conductivity of 1.14 W/m·K, a breakdown voltage ≥ 50 kV (to prevent electrical damage to hardware), and a low-temperature viscosity ≤ 20 mm²/s (so the oil keeps flowing well at the low winter temperatures in the Sydney data center).

• Temperature Control: A dual-circulation external heat exchanger holds the oil temperature at around 40°C (within ±2°C). This prevents heat dissipation efficiency from dropping as the oil warms, even when Firmus runs at full load in summer (all eight GPUs training large models simultaneously).
• Redundancy: The heat exchanger has a backup pump set. If the primary pump fails, the system switches to the backup pump within 0.5 seconds, meeting Firmus’s reliability requirement of “no interruption of more than one hour per year.”
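The reliability requirement can be expressed as an availability figure. The sketch below converts the one-hour annual budget into a percentage and, as a deliberately pessimistic assumption, treats every 0.5-second pump switchover as a full interruption to see how many would fit in the budget. Function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def availability(max_downtime_h: float) -> float:
    """Fraction of the year the system must be up for a given downtime budget."""
    return 1.0 - max_downtime_h / HOURS_PER_YEAR


def switchovers_within_budget(max_downtime_h: float, switch_time_s: float) -> int:
    """Pump switchovers that fit in the budget if each counted as downtime."""
    return int(max_downtime_h * 3600 // switch_time_s)


print(f"Required availability: {availability(1.0) * 100:.4f}%")
print(f"0.5 s switchovers per year within budget: {switchovers_within_budget(1.0, 0.5)}")
```

A one-hour budget corresponds to roughly 99.99% availability, and a 0.5-second switchover consumes a negligible share of it.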

4, FIRMUS Sample Delivery: 3,000-Piece Scale Verification

To ensure the stability of the solution, we delivered 3,000 immersion oil cooling modules to FIRMUS in three batches, covering the testing requirements of their five computing clusters:

• First batch of 1,000 units (October 2024): Used for full-load testing of a single server to verify core temperatures. With the redesigned heatsink, GPU temperature was held at around 68°C. This batch was successfully delivered to the end user, and the data center achieved an overall PUE of approximately 1.145.
• Second batch of 900 units (April 2025): Deployed in a 100-server cluster to test cooling coordination in a high-density environment. Performance was again stable, with GPU temperature held below 68°C.
• Third batch of 1,200 units (September 2025): Used in a Singapore cabinet load test to verify stability in a high-temperature environment of 45°C. GPU temperature again stayed below 68°C, demonstrating that the design resolves the thermal management issues of the H200 GPU.
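The measured temperatures imply an overall thermal resistance between GPU core and oil. The sketch below treats the whole path (baseplate, brazed joints, heat pipes, fins, oil film) as a single lumped resistance, which is a simplifying assumption, and uses the 68°C GPU temperature reported here together with the roughly 40°C oil setpoint and 750W per GPU quoted earlier. It then extrapolates to the 800W overclocked case as an illustration, not a measured result.

```python
def implied_resistance_c_per_w(gpu_temp_c: float, oil_temp_c: float, heat_w: float) -> float:
    """Lumped core-to-oil thermal resistance from one steady-state operating point."""
    return (gpu_temp_c - oil_temp_c) / heat_w


def predicted_gpu_temp_c(oil_temp_c: float, heat_w: float, resistance_c_per_w: float) -> float:
    """GPU temperature predicted by the same single-resistance model."""
    return oil_temp_c + heat_w * resistance_c_per_w


r_total = implied_resistance_c_per_w(gpu_temp_c=68.0, oil_temp_c=40.0, heat_w=750.0)
t_800w = predicted_gpu_temp_c(oil_temp_c=40.0, heat_w=800.0, resistance_c_per_w=r_total)

print(f"Implied core-to-oil resistance: {r_total:.4f} °C/W")
print(f"Extrapolated GPU temperature at 800 W: {t_800w:.1f} °C")
```

A real module will not behave as a perfectly constant resistance, so the extrapolated value should be read as a rough estimate only.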

5, Solution Results: From Lab Data to Firmus Real-World Feedback
Both our test data and Firmus’s real-world feedback demonstrate the advantages of the immersion oil cooling solution. Key performance indicators are as follows:
• GPU core temperature: Stable at around 68°C under full load, a 22°C decrease compared with Firmus’s previous air cooling solution (90°C+). The H200 never throttled, increasing computing power output by approximately 18% (and reducing Firmus’s large model training time by approximately 15%).
• Noise Level: Data center noise levels were reduced from 65dB to below 10dB (fully complying with the noise limits set by the Sydney Technology Park and the Singapore High-Tech Zone).
• Energy Consumption Comparison: The immersion oil cooling system’s external heat exchangers are located on rooftops across the campus. Through free air cooling and intelligent fans, the coolant input temperature is consistently held at 45°C before it reaches the data center computer room. Compared with the 400W of fan power consumed by the air cooling solution, this auxiliary energy consumption is reduced by 62.5%. According to Firmus… At a scale of 1,000 servers, this yields substantial annual electricity savings, making their server rentals more profitable.
• Sample reliability: During Firmus’s 12-month testing period, none of the 3,000 samples experienced a cooling failure, a zero failure rate that meets their “high reliability” requirement.
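The energy comparison can be worked through numerically. The sketch below assumes that the 400W fan figure is per server and that servers run continuously all year; neither assumption is stated explicitly in the text. Electricity prices are not given, so the result is left in kilowatt-hours rather than converted to cost.

```python
AIR_FAN_POWER_W = 400.0    # fan power of the air cooling solution (assumed per server)
REDUCTION = 0.625          # 62.5% reduction quoted in the article
SERVERS = 1000
HOURS_PER_YEAR = 365 * 24  # assumes continuous operation

saved_per_server_w = AIR_FAN_POWER_W * REDUCTION
remaining_per_server_w = AIR_FAN_POWER_W - saved_per_server_w
fleet_saving_kw = saved_per_server_w * SERVERS / 1000
annual_saving_kwh = fleet_saving_kw * HOURS_PER_YEAR

print(f"Saved per server: {saved_per_server_w:.0f} W "
      f"(remaining: {remaining_per_server_w:.0f} W)")
print(f"Fleet saving: {fleet_saving_kw:.0f} kW")
print(f"Annual saving: {annual_saving_kwh:,.0f} kWh")
```

Under these assumptions, a 1,000-server deployment avoids about 250 kW of continuous cooling overhead.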

Summary and Outlook: Immersion Oil Cooling Provides a New Path for High-Density AI Data Centers

Based on our experience customizing a solution for FIRMUS and delivering 3,000 samples, we’ve learned that for high-power-density GPU servers like the 8-card NVIDIA H200, traditional air cooling struggles to overcome the bottlenecks of high heat flux, high noise, and low reliability, and struggles to reach a PUE of approximately 1.1. Immersion oil cooling, through a combination of an upgraded medium and customized design, not only addresses core temperature issues but also adapts to the specific needs of different scenarios (such as FIRMUS’s high density, low noise, and high reliability).

In the future, we plan to work with FIRMUS to develop a more efficient liquid cold plate + liquid alloy solution for NVIDIA GB200 and GB300 GPUs, optimizing the cooling structure for next-generation GPU servers. Our goal is to keep oil temperatures below 45°C while keeping GPU temperatures below 70°C. We will also standardize the practical experience gained with FIRMUS and continue to provide technical support for their projects next year, offering a reusable immersion oil cooling solution to more overseas AI factories.
