G0W0 scaling with the number of processors
Posted: Fri May 13, 2011 1:47 pm
I performed three different DFT/G0W0 calculations for Si, varying the number of nodes (1, 2, and 4), corresponding to 8, 16, and 32 processors. The results show that the most efficient way to run this G0W0 Si calculation is on a single node. (NPAR can only be used to improve the scaling of the DFT Si calculation with NBANDS=64; in G0W0 calculations NPAR is fixed to the total number of processors and cannot be changed.) I do not get any MAXMEM-related error message in the log file during execution in any of the three calculations. Based on these results, I always run my other G0W0 calculations on a single node, provided that they fit in the memory of the machine, setting MAXMEM = 3*1024 = 3072 MB. When a calculation does not fit in memory (i.e. I get the MAXMEM-related error message), I move to 2 nodes (16 processors).
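The MAXMEM value quoted above comes from dividing the memory of one node among its MPI ranks. A minimal sketch of that arithmetic, assuming 24 GB of RAM and 8 ranks per node (one per core, as on the cluster described in this post) and assuming MAXMEM is given in MB per rank; the function name is hypothetical:

```python
def maxmem_per_rank_mb(node_ram_gb, ranks_per_node):
    """Memory share of each MPI rank in MB, taking 1 GB = 1024 MB."""
    return node_ram_gb * 1024 // ranks_per_node


# Assumed layout: 24 GB per node, 8 MPI ranks per node (one per core).
print(maxmem_per_rank_mb(24, 8))  # 3072
```

This is the upper bound with no headroom; a somewhat smaller value would leave room for the operating system and other processes.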
The total CPU times (in seconds) of my benchmarks are:
# procs   Hg-bench   Si-DFT-64BANDS   Si-G0W0-64BANDS
      8       12.6             10.9             791.0
     16        7.8             24.1            1084.3
     32        6.3            101.7            1859.8
Note 1: The ratio of system time to total CPU time is <1% in all Hg-bench, Si-DFT, and Si-G0W0 calculations.
Note 2: A MAXMEM larger than the default (1024 MB) is not needed, so it is not specified in the INCAR file (below).
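To make the scaling behaviour explicit, the timings above can be converted into speedup and parallel efficiency relative to the 8-processor run. This is only a sketch: it assumes the reported times can be treated as a proxy for elapsed time per run, and the helper name `scaling` is made up for illustration.

```python
# Timings from the benchmark table (seconds), keyed by processor count.
timings = {
    "Hg-bench":        {8: 12.6, 16: 7.8, 32: 6.3},
    "Si-DFT-64BANDS":  {8: 10.9, 16: 24.1, 32: 101.7},
    "Si-G0W0-64BANDS": {8: 791.0, 16: 1084.3, 32: 1859.8},
}


def scaling(times, base=8):
    """Return {procs: (speedup, efficiency)} relative to the `base` run."""
    t0 = times[base]
    result = {}
    for procs, t in sorted(times.items()):
        speedup = t0 / t                       # > 1 means faster than the base run
        efficiency = speedup / (procs / base)  # 1.0 would be ideal scaling
        result[procs] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for name, times in timings.items():
        for procs, (s, e) in scaling(times).items():
            print(f"{name:16s} {procs:3d} procs  speedup {s:5.2f}  efficiency {e:6.1%}")
```

A speedup below 1 means the run became slower as processors were added, which is the case for both Si calculations at 16 and 32 processors.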
Cluster: 2 Xeon E5520 quad-core 2.27 GHz; 24 GB RAM; Infiniband adapter.
INCAR:
Sistem=Si
LOPTICS = .TRUE.
ISMEAR = 0 !SEMICONDUCTOR
SIGMA = 0.05
NBANDS = 64
ALGO = GW0 ; NOMEGA = 192
My question: is there any keyword, or any flag in the compilation of VASP, that I might test in order to improve the scaling of G0W0 calculations with the number of processors?
Thanks in advance,
Annapaola Migani