GPU choice A100, H100 or L40S

#1 Post by **jinyangguo** » Mon Apr 15, 2024 4:46 am

Dear developer and user,

Our group is going to purchase a server with GPUs. Are there any test examples run on different GPUs, such as the A100, H100, and L40S, for comparison and recommendation? We would like to know about the computing ability of these GPUs. We mainly calculate medium-sized models (100-300 atoms).

Thanks,
Nick

#2 Post by **michael_wolloch** » Mon Apr 15, 2024 9:11 am

Dear Nick,

the VASP company cannot recommend specific hardware solutions; I can only provide some general points to consider:

1) While the OpenACC port of VASP can in principle run on any NVIDIA GPU, FP64 performance is key. This makes "consumer hardware" cards (with RTX and Titan branding) a poor choice, because their FP64 throughput is typically only 1/64 of their FP32 throughput, whereas the ratio is 1:2 for "data center" cards like the A30 or A100.

2) GPU memory size is important. The wavefunctions have to fit on the device entirely. Double your typical WAVECAR size (it is written in single precision, I think) to estimate the minimum memory requirement. The number of atoms alone is not enough to judge the memory impact.

3) Not all capabilities of VASP are available in the GPU port as of now. For example, nothing that uses the RPA is currently enabled, and this includes GW calculations. On the other hand, hybrid functionals, for instance, scale especially well on GPUs.
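
The rule of thumb in point 2 can be turned into a quick check. The following is only a minimal sketch: the function names and the default factor of 2 are illustrative assumptions taken from the rule of thumb above, not part of VASP, and the result is a lower bound rather than an exact requirement.

```python
import os

GIB = 1024 ** 3  # bytes per GiB


def wavecar_size_bytes(path: str = "WAVECAR") -> int:
    """Return the on-disk size of a WAVECAR file in bytes."""
    return os.path.getsize(path)


def estimate_min_gpu_memory_bytes(wavecar_bytes: int, factor: float = 2.0) -> float:
    """Rough lower bound on device memory: WAVECAR size times `factor`.

    The factor of 2 is an assumption from the rule of thumb above
    (WAVECAR believed to be single precision, compute in double precision).
    """
    if wavecar_bytes < 0:
        raise ValueError("wavecar_bytes must be non-negative")
    return wavecar_bytes * factor


def fits_on_gpu(wavecar_bytes: int, gpu_memory_gib: float, factor: float = 2.0) -> bool:
    """Check whether the rough estimate fits into the given device memory."""
    return estimate_min_gpu_memory_bytes(wavecar_bytes, factor) <= gpu_memory_gib * GIB


# Hypothetical example: a 30 GiB WAVECAR from a typical production run.
example_wavecar = 30 * GIB
needed_gib = estimate_min_gpu_memory_bytes(example_wavecar) / GIB  # 60.0
```

With these hypothetical numbers, `fits_on_gpu(example_wavecar, 48)` is false and `fits_on_gpu(example_wavecar, 80)` is true. For a real decision, pass `wavecar_size_bytes()` of your own largest calculations instead of the made-up 30 GiB.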

There are some benchmarks from NVIDIA on multi-node scaling and energy efficiency, although parts of them are a bit outdated:

https://developer.nvidia.com/blog/scali ... magnum-io/
https://developer.nvidia.com/blog/optim ... magnum-io/

#3 Post by **jinyangguo** » Tue Apr 16, 2024 12:16 am

Hi,

Thanks a lot, Michael. The information you provide is quite helpful.
Yes, I saw some examples with A100, but few with H100 and L40S.

Best

#4 Post by **hszhao.cn@gmail.com** » Tue May 14, 2024 2:06 am

Dear Michael,

> 1) While the openACC port of VASP can run in principle on any NVIDIA GPU, FP64 performance is key. This makes "consumer hardware" cards (with RTX and Titan branding) unreasonable due to the typically 1:64 performance drop from FP32 to FP64, which is only 1:2 for "data center" cards like the A30 or A100.

Based on the comparison tests between the A100 and the RTX 4090 conducted here, FP64 performance has minimal impact on efficiency. Therefore, I am not sure why you specifically emphasized the impact of FP64 on performance. Do you have any systematic or specific test cases as evidence?

Regards,
Zhao

#5 Post by **michael_wolloch** » Tue May 14, 2024 7:29 am

Dear Zhao,

VASP uses double-precision floats (FP64) nearly exclusively in its computations. Thus, FP64 performance is critical. I don't know what was tested in the "benchmarks" you linked to, but since the FP64 performance of the A100 and the RTX 4090 is not close, I must assume that it was not carefully done. Benchmarking is difficult, and maybe what was measured there was not compute, but GPU-to-GPU communication, memory bandwidth, or another bottleneck. I remind you that publishing benchmarks without prior authorization from the VASP company infringes on the license agreement. I would urge you not to link to such websites in the future.
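
To make the ratio argument concrete, here is a small sketch of the arithmetic. The FP32 peak values are hypothetical round numbers chosen for illustration, not vendor specifications, and the helper name is made up; only the 1:64 and 1:2 ratios come from this thread. Theoretical peak throughput is also not the same as application performance, since other bottlenecks can dominate.

```python
def fp64_peak_tflops(fp32_peak_tflops: float, fp64_to_fp32_ratio: float) -> float:
    """Theoretical FP64 peak derived from an FP32 peak and an FP64:FP32 ratio."""
    if fp32_peak_tflops < 0 or not 0 < fp64_to_fp32_ratio <= 1:
        raise ValueError("expected a non-negative peak and a ratio in (0, 1]")
    return fp32_peak_tflops * fp64_to_fp32_ratio


# Hypothetical consumer card: high FP32 peak, FP64 at 1:64.
consumer_fp64 = fp64_peak_tflops(64.0, 1 / 64)    # 1.0 TFLOPS

# Hypothetical data center card: lower FP32 peak, FP64 at 1:2.
datacenter_fp64 = fp64_peak_tflops(20.0, 1 / 2)   # 10.0 TFLOPS

# Ratio of theoretical FP64 peaks between the two hypothetical cards.
advantage = datacenter_fp64 / consumer_fp64       # 10.0
```

Even though the hypothetical consumer card has more than three times the FP32 peak, its theoretical FP64 peak is ten times lower, which is why FP32 figures alone say little about a code that computes almost entirely in FP64.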

Since the original question was answered, and there are other threads regarding FP64 performance and benchmarking, I will lock this topic now.
All the best, Michael
