Low WAVECAR writing performance




a.schleife
Newbie
Posts: 3
Joined: Wed Oct 28, 2015 5:08 am
License Nr.: 5-1851

Low WAVECAR writing performance

#1 Post by a.schleife » Wed Oct 28, 2015 3:30 pm

Not sure if this is a "bug" or if I am not setting a parameter correctly, but we encountered an issue when running VASP on many cores (58 nodes, 16 cores/node) on a big system. The final wave function file is about 102 GB, but a LOT of the time seems to be spent in writing this file. This is from the OUTCAR:

General timing and accounting informations for this job:
========================================================
Total CPU time used (sec): 8211.169
User time (sec): 6686.418
System time (sec): 1524.751
Elapsed time (sec): 49008.315

In fileio.F it looks like multiple MPI processes write into the same record of the WAVECAR file. Could this be the cause of the significant performance degradation? Did I make a mistake here, or are you aware of this behavior? Is there a way to improve the I/O performance? Any help or hint would be greatly appreciated!
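
To put the timing figures above in perspective, here is a small back-of-the-envelope calculation in Python. It takes the OUTCAR numbers at face value and makes one loud assumption: that all wall-clock time not covered by CPU time went into writing WAVECAR. That is probably an overestimate, so the resulting write rate is only a rough indication. The variable names are illustrative and not part of VASP.

```python
# Figures copied from the OUTCAR excerpt quoted in the post.
total_cpu_s = 8211.169
elapsed_s = 49008.315
wavecar_gb = 102.0  # approximate final WAVECAR size quoted in the post

# Wall-clock time that is not accounted for by CPU time.
non_cpu_s = elapsed_s - total_cpu_s
non_cpu_fraction = non_cpu_s / elapsed_s

# ASSUMPTION: all non-CPU time was spent writing WAVECAR, and 1 GB = 1000 MB.
implied_mb_per_s = wavecar_gb * 1000.0 / non_cpu_s

print(f"non-CPU wall time: {non_cpu_s:.0f} s ({non_cpu_fraction:.0%} of elapsed)")
print(f"implied write rate: {implied_mb_per_s:.1f} MB/s")
```

Under that assumption, roughly 83% of the elapsed time is spent outside the CPU, and the implied write rate is only a few MB/s, far below what a parallel file system would normally be expected to deliver.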

admin
Administrator
Posts: 2921
Joined: Tue Aug 03, 2004 8:18 am
License Nr.: 458

Re: Low WAVECAR writing performance

#2 Post by admin » Thu Oct 29, 2015 12:48 pm

If large files are not necessary, one can simply switch writing off.

Re: Low WAVECAR writing performance

#3 Post by a.schleife » Thu Oct 29, 2015 5:01 pm

Thanks for this suggestion! It would work around the issue for cases where the wave functions are "easily" computed again from scratch (or from the charge density). Nevertheless, it is not desirable to spend such huge amounts of time writing a single file that is not THAT large.

So I guess my question is: Is this a known issue/known behavior, or an effect that only we see on our machine? If the former is the case, is there a fix planned?

Re: Low WAVECAR writing performance

#4 Post by a.schleife » Wed Dec 02, 2015 7:21 pm

With the help of Victor Anisimov at NCSA, we found a way to fix this issue: if no WAVECAR exists and a new file is written at the end of the job, rather than an existing WAVECAR being overwritten, the slowdown is not observed. Hence, changing fileio.F so that the file written by VASP is not called WAVECAR (the name of the file read at the beginning of the job) fixed the problem for us.

Direct-access writing into an existing WAVECAR file on a Lustre high-performance file system causes each write command to request metadata from the MDS (metadata server), which significantly slows down the write operation. No such bottleneck exists when writing into a new file. When a large-scale VASP calculation is performed on a Lustre file system, WAVECAR should therefore be written into a new file to avoid the metadata bottleneck.
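
The general pattern (write fixed-length records into a file that does not exist yet, instead of overwriting an existing one) can be sketched in Python as follows. This is only an illustration of the idea, not the actual change made to fileio.F. The function name `write_records_to_new_file`, the file name `WAVECAR.new`, and the record layout are all hypothetical.

```python
def write_records_to_new_file(path, records, record_len):
    """Write fixed-length records into a file that must not exist yet.

    Hypothetical helper for illustration; this is not VASP code.
    """
    # Mode "xb" creates the file and raises FileExistsError if it is
    # already present, so an existing file is never overwritten in place.
    with open(path, "xb") as out:
        for index, payload in enumerate(records):
            if len(payload) > record_len:
                raise ValueError("payload longer than record length")
            # Direct-access style: position at the start of the record,
            # then write the payload padded to the full record length.
            out.seek(index * record_len)
            out.write(payload.ljust(record_len, b"\0"))
```

A call such as `write_records_to_new_file("WAVECAR.new", [b"abc", b"de"], 8)` produces a 16-byte file. Whether the new file is afterwards renamed to WAVECAR for a restart is a separate decision that the sketch leaves to the caller.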
