Convergence problems on HCP and BCC primitive unit cells when number of cores > 1

rbgetman
Newbie
Posts: 1
Joined: Tue Dec 20, 2011 9:02 pm
License Nr.: 5-1019

#1 Post by rbgetman » Tue Dec 20, 2011 9:31 pm

Hello,

I have recently started using VASP again after a hiatus of about 2.5 years. As a first task, I am converging some bulk metal structures, e.g., Ru, which is HCP. I am using VASP 5.2.

I set up a POSCAR according to experimental crystal data (Ru-Ru distances of ~2.7 Angstroms). The primitive unit cell has lattice vectors (a/2, -sqrt(3)a/2, 0); (a/2, sqrt(3)a/2, 0); (0, 0, c), where a ~ 2.7 and c ~ 1.6a. It contains 2 Ru atoms at fractional coordinates (1/3, 2/3, 1/4) and (2/3, 1/3, 3/4). I used a Gamma-centered k-point mesh with > 160 irreducible k-points and the following flags in the INCAR (among others, which I can elaborate on if necessary):

IALGO = 48 -or- default value
ALGO = Normal -or- fast
LREAL = .FALSE.
IBRION = 1
NFREE = 10
POTIM = 0.1
NPAR = 1
LPLANE = .TRUE.
NSIM = 10 -or- 8
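
For reference, the cell described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a = 2.7 Angstroms and c = 1.6a are the approximate values quoted in the post, the first component of the in-plane vectors is taken as a/2 so that both vectors have length a, and the helper names (hcp_lattice, frac_to_cart, poscar_text) are hypothetical, not part of VASP.

```python
import math

def hcp_lattice(a, c):
    """Return the three primitive HCP lattice vectors (in Angstroms)."""
    s = math.sqrt(3.0) * a / 2.0
    return [
        (a / 2.0, -s, 0.0),
        (a / 2.0, s, 0.0),
        (0.0, 0.0, c),
    ]

# Fractional (direct) coordinates of the two atoms, as given in the post.
HCP_BASIS = [(1 / 3, 2 / 3, 1 / 4), (2 / 3, 1 / 3, 3 / 4)]

def frac_to_cart(frac, lattice):
    """Convert fractional coordinates to Cartesian coordinates."""
    return tuple(
        sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)
    )

def poscar_text(symbol, a, c):
    """Build a VASP 5 style POSCAR string for the two-atom HCP cell."""
    lattice = hcp_lattice(a, c)
    lines = [f"{symbol} HCP primitive cell", "1.0"]
    lines += ["  ".join(f"{x:.10f}" for x in v) for v in lattice]
    lines += [symbol, str(len(HCP_BASIS)), "Direct"]
    lines += ["  ".join(f"{x:.10f}" for x in f) for f in HCP_BASIS]
    return "\n".join(lines) + "\n"

# Assumed values: a ~ 2.7 Angstroms and c ~ 1.6a, taken from the post.
a = 2.7
c = 1.6 * a
lattice = hcp_lattice(a, c)

# Separation between the two basis atoms inside the cell.
r1, r2 = (frac_to_cart(f, lattice) for f in HCP_BASIS)
d = math.dist(r1, r2)

print(poscar_text("Ru", a, c))
print(f"Separation of the two basis atoms: {d:.3f} Angstroms")
```

Checking that both in-plane vectors have length a and enclose a 120 degree angle is a quick way to rule out a geometry typo before looking at parallelization settings.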

Oddly, if I run this simulation on a single core, it finishes without errors. If I run it across all 24 cores of the same node, I get errors such as the following:

WARNING: Sub-Space-Matrix is not hermitian in DAV

I am running this simulation on Linux nodes with four hexa-core Intel Xeon processors each. My VASP executable uses the MIT FFTW and MKL BLAS and LAPACK libraries, but I have also tried executables that use the default LAPACK libraries and got the same errors. (I can share more details about my compilation if that would be useful.)

Interestingly, I do not get the same errors when I run simulations on FCC primitive unit cells.

I found several hits on this forum containing the same error message that I'm getting, but none of them seems to be solved by running on 1 core vs. multiple. I did find one other result (http://cst-www.nrl.navy.mil/users/erwin ... s/#notherm) that may be related.

Any advice on why I am getting this error?

Thank you very much.

dEnvar

#2 Post by dEnvar » Wed Dec 12, 2012 1:05 am

Have you found the solution?

juhL

#3 Post by juhL » Thu May 02, 2013 4:07 pm

I'm also seeing this with any IBRION, IALGO, and ISYM settings. Apart from the fact that it isn't necessary to calculate small systems on more than one node, does anyone have an educated explanation for this?
