Problems while running VASP 5.2.11 and 5.2.12 on multiple nodes

nkwem
Newbie
Posts: 12
Joined: Tue Feb 23, 2010 7:31 am
License Nr.: 5-44
Location: South Africa

#1 Post by nkwem » Tue Jun 12, 2012 11:23 am

Hi

We are experiencing problems when running VASP 5.2.11 and 5.2.12 on multiple nodes. Jobs run on a single node with 12 cores, but they fail when using 2 or more nodes. We have also noticed that some input files run on multiple nodes while others do not. Both executables were compiled using the Intel compilers and MKL libraries. We are using Intel MPI to run the jobs. What could be the problem?

Below is the INCAR file that fails to run on multiple nodes:
ISPIN=2
ISMEAR=0; SIGMA=0.05
NSW = 90
IBRION = 2
ISIF = 4
LREAL= Auto

The POSCAR is as follows:
C :fcc
3.63666666666667
2.00000000000000 0.00000000000000 0.00000000000000
0.00000000000000 2.00000000000000 0.00000000000000
0.00000000000000 0.00000000000000 2.00000000000000
1 63
Selective dynamics
Direct
0.0 0.0 0.0 T T T
0.25000000000000 0.25000000000000 0.00000000000000 T T T
0.50000000000000 0.50000000000000 0.00000000000000 T T T
0.75000000000000 0.75000000000000 0.00000000000000 T T T
0.00000000000000 0.25000000000000 0.25000000000000 T T T
0.25000000000000 0.50000000000000 0.25000000000000 T T T
0.50000000000000 0.75000000000000 0.25000000000000 T T T
0.75000000000000 0.00000000000000 0.25000000000000 T T T
0.00000000000000 0.50000000000000 0.50000000000000 T T T
0.25000000000000 0.75000000000000 0.50000000000000 T T T
0.50000000000000 0.00000000000000 0.50000000000000 T T T
0.75000000000000 0.25000000000000 0.50000000000000 T T T
0.00000000000000 0.75000000000000 0.75000000000000 T T T
0.25000000000000 0.00000000000000 0.75000000000000 T T T
0.50000000000000 0.25000000000000 0.75000000000000 T T T
0.75000000000000 0.50000000000000 0.75000000000000 T T T
0.25000000000000 0.00000000000000 0.25000000000000 T T T
0.50000000000000 0.25000000000000 0.25000000000000 T T T
0.75000000000000 0.50000000000000 0.25000000000000 T T T
0.00000000000000 0.75000000000000 0.25000000000000 T T T
0.25000000000000 0.25000000000000 0.50000000000000 T T T
0.50000000000000 0.50000000000000 0.50000000000000 T T T
0.75000000000000 0.75000000000000 0.50000000000000 T T T
0.00000000000000 0.00000000000000 0.50000000000000 T T T
0.25000000000000 0.50000000000000 0.75000000000000 T T T
0.50000000000000 0.75000000000000 0.75000000000000 T T T
0.75000000000000 0.00000000000000 0.75000000000000 T T T
0.00000000000000 0.25000000000000 0.75000000000000 T T T
0.25000000000000 0.75000000000000 0.00000000000000 T T T
0.50000000000000 0.00000000000000 0.00000000000000 T T T
0.75000000000000 0.25000000000000 0.00000000000000 T T T
0.00000000000000 0.50000000000000 0.00000000000000 T T T
0.12500000000000 0.12500000000000 0.12500000000000 T T T
0.37500000000000 0.37500000000000 0.12500000000000 T T T
0.62500000000000 0.62500000000000 0.12500000000000 T T T
0.87500000000000 0.87500000000000 0.12500000000000 T T T
0.12500000000000 0.37500000000000 0.37500000000000 T T T
0.37500000000000 0.62500000000000 0.37500000000000 T T T
0.62500000000000 0.87500000000000 0.37500000000000 T T T
0.87500000000000 0.12500000000000 0.37500000000000 T T T
0.12500000000000 0.62500000000000 0.62500000000000 T T T
0.37500000000000 0.87500000000000 0.62500000000000 T T T
0.62500000000000 0.12500000000000 0.62500000000000 T T T
0.87500000000000 0.37500000000000 0.62500000000000 T T T
0.12500000000000 0.87500000000000 0.87500000000000 T T T
0.37500000000000 0.12500000000000 0.87500000000000 T T T
0.62500000000000 0.37500000000000 0.87500000000000 T T T
0.87500000000000 0.62500000000000 0.87500000000000 T T T
0.37500000000000 0.12500000000000 0.37500000000000 T T T
0.62500000000000 0.37500000000000 0.37500000000000 T T T
0.87500000000000 0.62500000000000 0.37500000000000 T T T
0.12500000000000 0.87500000000000 0.37500000000000 T T T
0.37500000000000 0.37500000000000 0.62500000000000 T T T
0.62500000000000 0.62500000000000 0.62500000000000 T T T
0.87500000000000 0.87500000000000 0.62500000000000 T T T
0.12500000000000 0.12500000000000 0.62500000000000 T T T
0.37500000000000 0.62500000000000 0.87500000000000 T T T
0.62500000000000 0.87500000000000 0.87500000000000 T T T
0.87500000000000 0.12500000000000 0.87500000000000 T T T
0.12500000000000 0.37500000000000 0.87500000000000 T T T
0.37500000000000 0.87500000000000 0.12500000000000 T T T
0.62500000000000 0.12500000000000 0.12500000000000 T T T
0.87500000000000 0.37500000000000 0.12500000000000 T T T
0.12500000000000 0.62500000000000 0.12500000000000 T T T


Regards,
Nkwe

alex
Hero Member
Posts: 577
Joined: Tue Nov 16, 2004 2:21 pm
License Nr.: 5-67
Location: Germany

#2 Post by alex » Tue Jun 12, 2012 1:51 pm

Hi Nkwe,

If this situation occurs, your VASP inputs are basically fine. You have to check many things:

a) Log in to node01, then try to ssh to, e.g., node02 (or use whichever remote login shell you have chosen). Can you log in without a password? Was the login successful?

Try this one first, otherwise the list to check is very long ...

Cheers,

alex
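
The check in (a) can be scripted so it is easy to repeat for every node in a job. The following is only a minimal sketch, assuming an OpenSSH client is installed; the function names and the host names in the usage note are hypothetical and do not come from this thread. It relies on the standard OpenSSH options BatchMode=yes (fail instead of prompting for a password) and ConnectTimeout.

```python
import subprocess


def build_ssh_check(host, timeout_s=5):
    """Return the ssh command used to test password-less login to `host`."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                 # never prompt; fail if a password is needed
        "-o", f"ConnectTimeout={timeout_s}",   # give up quickly on unreachable nodes
        host,
        "hostname",                            # harmless remote command
    ]


def can_login(host, timeout_s=5):
    """True if the remote command ran and exited with status 0."""
    try:
        result = subprocess.run(
            build_ssh_check(host, timeout_s),
            capture_output=True,
            text=True,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh binary missing, or the connection hung
        return False
    return result.returncode == 0


def check_nodes(hosts):
    """Map each host name to True/False for password-less login."""
    return {host: can_login(host) for host in hosts}
```

Run from node01, a call such as check_nodes(["node02", "node03"]) would return a dictionary showing which nodes accept a password-less login. A False entry only says the login failed; it does not say why.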

#3 Post by nkwem » Wed Jun 13, 2012 4:09 pm

Hi Alex

Thank you for responding.

Yes, I can ssh to different nodes without a password and I can also ssh from a compute node to other compute nodes.

Regards,
Nkwe

jlbettis
Jr. Member
Posts: 53
Joined: Thu Mar 11, 2010 1:13 am
Location: Raleigh, NC

#4 Post by jlbettis » Wed Jun 13, 2012 9:10 pm

Try setting NPAR.
VASP 5.2.11
Cray XE6
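
The thread does not say which NPAR value was used in the end. As a rough illustration only: a common rule of thumb in the VASP documentation is to set NPAR to roughly the square root of the total number of cores. The sketch below picks the divisor of the core count closest to that square root and formats the INCAR line. The function names are hypothetical, and restricting the choice to a divisor is an assumption made here so that every band group gets the same number of cores.

```python
import math


def suggest_npar(total_cores):
    """Return the divisor of total_cores closest to sqrt(total_cores).

    Rule of thumb only; the best value depends on the machine and the system.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be positive")
    target = math.sqrt(total_cores)
    divisors = [d for d in range(1, total_cores + 1) if total_cores % d == 0]
    # Ties are broken in favour of the smaller divisor.
    return min(divisors, key=lambda d: (abs(d - target), d))


def incar_npar_line(nodes, cores_per_node):
    """Format the NPAR line to append to an INCAR file."""
    return f"NPAR = {suggest_npar(nodes * cores_per_node)}"
```

For two of the 12-core nodes described in the first post, incar_npar_line(2, 12) gives "NPAR = 4". Treat this as a starting point and benchmark a few neighbouring values, since the best setting varies between clusters.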

#5 Post by nkwem » Tue Jul 31, 2012 10:44 am

Hi jlbettis,

Thank you. Your suggestion works perfectly.

Regards,
Nkwe
