ZGHEV errors vs number of cores

t.ossowski
Newbie
Posts: 6
Joined: Mon Nov 09, 2009 4:08 pm

#1 Post by t.ossowski » Fri Jun 20, 2014 9:57 pm

We compiled a parallel build of VASP 5 with the Intel Fortran compiler and MKL. Our machine has Intel processors, 4x10 cores. For some systems the calculation crashes with a "matrix not hermitian" message. Surprisingly, a clean surface runs without problems, but as soon as we add adsorbate atoms the calculation crashes with "matrix not hermitian".
Another example: a clean surface with one-sided relaxation and dipole correction. Here the calculation crashed even for the pure surface.

However, when we ran the calculations on 8 cores the problem disappeared. We tested different core counts: the calculations crashed on 5, 7, 9, 10 and 11 cores, but ran without problems on 1, 2, 3, 4, 6, 8, 12 and 16 cores.
Does anyone know why?

t.ossowski
Newbie
Posts: 6
Joined: Mon Nov 09, 2009 4:08 pm

#2 Post by t.ossowski » Mon Aug 18, 2014 7:07 am

Does anyone know a solution to our problem? We are still having trouble with these calculations. The latest case is an Fe(110) surface with a (1x2) primitive cell. On 16 cores we get the message "Matrix not hermitian ..." and the calculation crashes at the beginning of the first step. The same job on 8 cores runs without problems.

admin
Administrator
Posts: 2921
Joined: Tue Aug 03, 2004 8:18 am
License Nr.: 458

#3 Post by admin » Tue Aug 19, 2014 7:12 am

When the number of cores changes, VASP changes the number of bands (NBANDS must be divisible by NPAR). This can lead to differences in the course of the calculation. Try fixing the number of bands by setting NBANDS explicitly.

cchang
Newbie
Posts: 12
Joined: Mon Jan 28, 2013 11:22 pm

#4 Post by cchang » Fri Sep 05, 2014 6:32 pm

admin: When you say number of cores, do you mean the number of MPI ranks in general, or the divisor of the band count after accounting for NPAR/NCORE? Suppose I have 10 MPI ranks with NPAR=2 and NCORE=5. If I understand http://cms.mpi.univie.ac.at/vasp/vasp/P ... R_tag.html correctly, that would divide each band 5 ways and the set of bands 2 ways. Would it then be enough for NBANDS to be divisible by 2, or should it be divisible by 10?

Thanks
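
To make the arithmetic in the question concrete, here is a minimal Python sketch. It assumes the relation implied by the example above (total MPI ranks = NPAR x NCORE) and uses the rule stated earlier in the thread that NBANDS must be divisible by NPAR. The helper names `split_ranks` and `nbands_compatible` are hypothetical and are not part of VASP; the NBANDS value of 28 is only an illustrative number.

```python
def split_ranks(total_ranks, npar):
    """Return (npar, ncore), ASSUMING total_ranks = npar * ncore."""
    if total_ranks < 1 or npar < 1:
        raise ValueError("total_ranks and npar must be positive")
    if total_ranks % npar != 0:
        raise ValueError("total_ranks must be a multiple of npar")
    return npar, total_ranks // npar


def nbands_compatible(nbands, npar):
    """Check the rule quoted in the thread: NBANDS divisible by NPAR."""
    return nbands % npar == 0


# The case from the question: 10 MPI ranks with NPAR=2.
npar, ncore = split_ranks(10, 2)
print("NPAR =", npar, "NCORE =", ncore)

# Illustrative band count (an assumption, not taken from a real run).
print("NBANDS=28 compatible with NPAR=2:", nbands_compatible(28, npar))
```

Under these assumptions the check depends only on NPAR, not on the total rank count.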

admin
Administrator
Posts: 2921
Joined: Tue Aug 03, 2004 8:18 am
License Nr.: 458

#5 Post by admin » Tue Sep 09, 2014 11:48 am

NBANDS depends exclusively on your system (atoms, electrons). Before choosing parameters for
a parallel calculation, one should know NBANDS for the system. NBANDS must be divisible by NPAR.
With NBANDS = 28 and NPAR = 4, VASP changes nothing.
With NPAR = 6, VASP adjusts NBANDS to 30.
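
The two examples above are consistent with rounding NBANDS up to the next multiple of NPAR. The following Python sketch reproduces them under that assumption; it is not taken from the VASP source, and the function name `adjusted_nbands` is hypothetical.

```python
def adjusted_nbands(nbands, npar):
    """Round nbands up to the next multiple of npar.

    ASSUMPTION: this rounding rule is inferred from the two examples
    in the post (28 -> 28 for NPAR=4, 28 -> 30 for NPAR=6).
    """
    if nbands < 1 or npar < 1:
        raise ValueError("nbands and npar must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-nbands // npar) * npar


for npar in (4, 6):
    print("NPAR =", npar, "-> NBANDS =", adjusted_nbands(28, npar))
```

A quick check like this before submitting a job shows whether a given NPAR would leave the requested NBANDS unchanged.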
