
Total energy unreasonably different with parallel NPAR

Posted: Mon Sep 14, 2015 9:23 am
by GangBi
Dear all,

I am running a repeated-slab calculation on a single node with 12 cores. First I used the default value for NPAR. The self-consistent calculation converges, but the total energy is positive:
N E dE d eps ncg rms rms(c)
DAV: 1 0.697842165012E+03 0.69784E+03 -0.60118E+04 2016 0.837E+02
DAV: 2 0.231956316158E+03 -0.46589E+03 -0.43967E+03 2376 0.140E+02
DAV: 3 0.200190143989E+03 -0.31766E+02 -0.29509E+02 2436 0.310E+01
DAV: 4 0.199405108860E+03 -0.78504E+00 -0.73560E+00 2376 0.371E+00
DAV: 5 0.199388295250E+03 -0.16814E-01 -0.15968E-01 2100 0.372E-01 0.171E+02
DAV: 6 0.193222117805E+04 0.17328E+04 -0.10575E+04 2208 0.184E+02 0.114E+02
DAV: 7 0.306973187688E+04 0.11375E+04 -0.37124E+03 2076 0.116E+02 0.872E+01
DAV: 8 0.396531468939E+04 0.89558E+03 -0.20813E+03 3024 0.943E+01 0.244E+01
DAV: 9 0.399119740779E+04 0.25883E+02 -0.10042E+03 2796 0.736E+01 0.229E+01
DAV: 10 0.399343140005E+04 0.22340E+01 -0.19082E+02 2580 0.322E+01 0.202E+01
DAV: 11 0.397970519245E+04 -0.13726E+02 -0.17259E+02 2688 0.421E+01 0.180E+01
DAV: 12 0.401856874830E+04 0.38864E+02 -0.44255E+01 3036 0.225E+01 0.511E+00
DAV: 13 0.401846134440E+04 -0.10740E+00 -0.14256E+01 2400 0.909E+00 0.347E+00
DAV: 14 0.401922519746E+04 0.76385E+00 -0.21566E+00 2808 0.409E+00 0.109E+00
DAV: 15 0.401930512549E+04 0.79928E-01 -0.53074E-01 2400 0.207E+00 0.422E-01
DAV: 16 0.401935305226E+04 0.47927E-01 -0.10427E-01 2388 0.955E-01 0.281E-01
DAV: 17 0.401937332021E+04 0.20268E-01 -0.80278E-02 2808 0.948E-01 0.404E-01
DAV: 18 0.401938892756E+04 0.15607E-01 -0.39520E-02 2472 0.637E-01 0.177E-01
DAV: 19 0.401939131595E+04 0.23884E-02 -0.20370E-02 2508 0.406E-01 0.113E-01
DAV: 20 0.401939123928E+04 -0.76662E-04 -0.41487E-03 2868 0.192E-01 0.118E-01
DAV: 21 0.401939139125E+04 0.15197E-03 -0.87410E-04 2364 0.961E-02 0.109E-01
DAV: 22 0.401939161995E+04 0.22870E-03 -0.72381E-04 2256 0.669E-02 0.100E-01
DAV: 23 0.401939170716E+04 0.87205E-04 -0.14303E-04 1260 0.368E-02
1 F= 0.40193917E+04 E0= 0.40193917E+04 d E =0.000000E+00

The number of bands is NBANDS = 144:
k-points NKPTS = 8 k-points in BZ NKDIM = 8 number of bands NBANDS= 144

However, with all input identical except NPAR=4, the self-consistent calculation converges with a negative total energy:
N E dE d eps ncg rms rms(c)
DAV: 1 0.105163862033E+04 0.10516E+04 -0.67221E+04 2176 0.886E+02
DAV: 2 -0.107529395773E+02 -0.10624E+04 -0.10284E+04 2520 0.235E+02
DAV: 3 -0.875016376226E+02 -0.76749E+02 -0.75871E+02 2496 0.618E+01
DAV: 4 -0.893402286864E+02 -0.18386E+01 -0.18289E+01 2600 0.913E+00
DAV: 5 -0.893782348763E+02 -0.38006E-01 -0.37943E-01 2552 0.130E+00 0.724E+00
DAV: 6 -0.861988495185E+02 0.31794E+01 -0.64072E+00 2424 0.103E+01 0.353E+00
DAV: 7 -0.859051346963E+02 0.29371E+00 -0.30370E+00 2448 0.440E+00 0.162E+00
DAV: 8 -0.858878435512E+02 0.17291E-01 -0.48658E-01 2456 0.290E+00 0.630E-01
DAV: 9 -0.858835325671E+02 0.43110E-02 -0.14669E-01 2464 0.157E+00 0.233E-01
DAV: 10 -0.858827057887E+02 0.82678E-03 -0.42983E-02 2456 0.562E-01 0.148E-01
DAV: 11 -0.858795801960E+02 0.31256E-02 -0.40763E-03 2568 0.327E-01 0.595E-02
DAV: 12 -0.858783060030E+02 0.12742E-02 -0.27255E-03 2704 0.265E-01 0.559E-02
DAV: 13 -0.858774514252E+02 0.85458E-03 -0.22847E-03 3176 0.292E-01 0.286E-02
DAV: 14 -0.858775450251E+02 -0.93600E-04 -0.27668E-03 2384 0.122E-01 0.260E-02
DAV: 15 -0.858774491900E+02 0.95835E-04 -0.22038E-04 1984 0.558E-02
1 F= -.85877449E+02 E0= -.85877334E+02 d E =-.230281E-03

The number of bands is 136, which is 8 fewer than with the default NPAR:
k-points NKPTS = 8 k-points in BZ NKDIM = 8 number of bands NBANDS= 136
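
To put a number on the gap between the two runs, the final summary lines quoted above can be parsed directly. This is only a minimal sketch: the function name `parse_summary` and the regular expression are illustrative assumptions, not part of VASP or any VASP tool, and the two strings are copied from this post rather than read from files.

```python
import re

# Final summary lines copied from the two runs quoted in this post.
SUMMARY_DEFAULT = "1 F= 0.40193917E+04 E0= 0.40193917E+04 d E =0.000000E+00"
SUMMARY_NPAR4 = "1 F= -.85877449E+02 E0= -.85877334E+02 d E =-.230281E-03"

# Matches Fortran-style numbers, including ones without a leading zero
# such as "-.85877449E+02".
_FLOAT = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[Ee][-+]?\d+)?"
_PATTERN = re.compile(rf"F=\s*({_FLOAT})\s+E0=\s*({_FLOAT})")


def parse_summary(line):
    """Return (F, E0) as floats from one summary line (hypothetical helper)."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError(f"no F=/E0= fields found in: {line!r}")
    return float(match.group(1)), float(match.group(2))


f_default, _ = parse_summary(SUMMARY_DEFAULT)
f_npar4, e0_npar4 = parse_summary(SUMMARY_NPAR4)

# Difference between the two reported F values, in the units of the output.
print(f"default NPAR: F = {f_default:.4f}")
print(f"NPAR=4:       F = {f_npar4:.4f}")
print(f"difference:   {f_default - f_npar4:.4f}")
```

A difference of this size is far beyond anything a change in parallelization alone should produce, which is why the positive value looks suspicious.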

These lines are from the two OUTCAR files (the NPAR=4 run first, then the default-NPAR run):

executed on LinuxIFC date 2015.09.14 15:33:28
running on 12 total cores
distrk: each k-point on 12 cores, 1 groups
distr: one band on NCORES_PER_BAND= 3 cores, 4 groups
--------------------------------------------------------------------------------------------------------
executed on LinuxIFC date 2015.09.12 15:59:35
running on 12 total cores
distrk: each k-point on 12 cores, 1 groups
distr: one band on NCORES_PER_BAND= 1 cores, 12 groups
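
The two `distr:` lines are consistent with the cores being split evenly into NPAR band groups. A minimal sketch of that arithmetic, assuming a single k-point group as reported in the `distrk:` lines (the function name `cores_per_band` is a hypothetical helper, not a VASP routine):

```python
def cores_per_band(total_cores, npar):
    """Cores working on one band when total_cores are split into npar groups.

    Assumes one k-point group, as in the 'distrk' lines quoted above.
    """
    if npar <= 0 or total_cores % npar != 0:
        raise ValueError("npar must be a positive divisor of total_cores")
    return total_cores // npar


# NPAR=4 on 12 cores: 4 groups of 3 cores each.
print(cores_per_band(12, 4))
# 12 groups on 12 cores (the default-NPAR run): 1 core per band.
print(cores_per_band(12, 12))
```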


I think there is something wrong with the total energy from the default-NPAR run; the result is extremely strange.




Best

Gang

Re: Total energy unreasonably different with parallel NPAR

Posted: Wed Sep 16, 2015 11:29 pm
by Neutrino
Hi Gang,

I have never come across this behavior. Maybe posting the INCAR would help figure out what went wrong.


Mostafa

Re: Total energy unreasonably different with parallel NPAR

Posted: Thu Sep 17, 2015 9:47 am
by GangBi
Neutrino wrote:
Hi Gang,

I have never come across this behavior. Maybe posting the INCAR would help figure out what went wrong.

Mostafa

Here is my INCAR:

SYSTEM = DFT SCF
PREC = Normal
ENCUT = 300
EDIFFG = -0.01
ISMEAR = 1
SIGMA=0.1
LREAL = A
AMIN=0.01

ISTART = 0
ICHARG = 2
LWAVE = .F.
LCHARG = .F.
IBRION = -1
ISIF = 2
NSW = 0

NPAR=4

Re: Total energy unreasonably different with parallel NPAR

Posted: Thu Sep 17, 2015 10:14 am
by alex
Hello GangBi,

The default NPAR is set to the number of cores, if I'm not mistaken. Hence you get NBANDS = 144 (it's a multiple of 12). Depending on the symmetry of the system, you might end up in some weird electronic ground state, as your first (and completely useless) total energy suggests. Remember that the initial guess starts with high spin.
Please also look for the term "magnetic substructure" in the output, close to the symmetry analysis of the (atomic) structure, and compare it with the proper result that has the negative total energy. I'd guess the (magnetic) symmetry of the weird case is much higher than that of the working example. If that is the case, please break the symmetry and try again.
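
The band counts in the two runs fit a simple round-up-to-a-multiple-of-NPAR rule. The sketch below assumes that rule and uses a hypothetical range of requested band counts, since the count VASP started from is not given in this thread; `round_up_nbands` is an illustrative name, not a VASP routine.

```python
def round_up_nbands(requested, npar):
    """Smallest multiple of npar that is >= requested (assumed rounding rule)."""
    if requested <= 0 or npar <= 0:
        raise ValueError("requested and npar must be positive")
    return -(-requested // npar) * npar  # ceiling division, then scale back


# Hypothetical requested band counts; the real value is not in the thread.
for requested in range(133, 137):
    with_npar4 = round_up_nbands(requested, 4)
    with_npar12 = round_up_nbands(requested, 12)
    print(requested, with_npar4, with_npar12)
```

Under this assumption, any requested count from 133 to 136 gives 136 bands with NPAR=4 and 144 bands with 12 band groups, matching both OUTCAR excerpts. This accounts only for the difference in NBANDS.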

Hth,

alex