exit status 253


#1 Post by tholme » Tue May 20, 2008 10:50 pm

In a relatively large calculation (126 atoms), I hit an error that I've never seen before; it may be related to memory allocation. After the first 5 steps of electronic convergence, VASP quit with this error:

[0] Abort: VAPI_register_mr (Resources temporary unavailable) at line 209 in file vbuf.c
mpiexec: Warning: accept_abort_conn: MPI_Abort from IP 10.1.61.84, rank 0, killing all.
mpiexec: Warning: task 0 exited with status 253.
mpiexec: Warning: tasks 1-59 died with signal 15 (Terminated).
Has anyone seen this error? Can you confirm that it is caused by a lack of memory? What can be done to fix it? Can I change INCAR to request more memory?

Thanks!

Tim

#2 Post by job » Tue Jun 03, 2008 5:48 pm

VAPI_register_mr is an internal function in the InfiniBand driver for registering MPI buffer memory (i.e. pinning it so that it's not swapped out). With 126 atoms on 60 processors you should not really run into any memory limits; depending on the number of k-points etc., it's only a few hundred MB per processor. However, the amount of registered memory that you can use might be significantly less than the total memory on a node. I think in some implementations you need registered memory proportional to the number of MPI tasks on each node, so you could try running on fewer nodes and see if that helps.
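One thing that may be worth checking, as an assumption rather than a confirmed diagnosis: on Linux, pinned memory is normally capped by the per-process locked-memory limit (RLIMIT_MEMLOCK, the value shown by "ulimit -l"). Whether that limit is what the VAPI stack ran into here is not established by the log above. The sketch below only reports the limit for the current process; the helper names format_limit and memlock_limits are made up for illustration.

```python
import resource


def format_limit(value):
    """Render an rlimit value (in bytes) in human-readable form."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**20:.1f} MiB"


def memlock_limits():
    """Return the (soft, hard) locked-memory limits as readable strings."""
    # RLIMIT_MEMLOCK caps how much memory this process may lock (pin) in RAM.
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"max locked memory: soft={soft}, hard={hard}")
```

To be meaningful, this should be run the same way the VASP job is launched (inside the batch job, through mpiexec), since limits seen in an interactive login shell can differ from those the MPI tasks actually get.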