collective abort
Posted: Fri Feb 29, 2008 7:02 am
Dear all:
I had VASP running smoothly on a home-made PC cluster with AMD CPUs, for system sizes of 256 and 512 atoms. Recently we bought a new cluster with Intel CPUs from a company. VASP runs smoothly with 256 atoms, but once the system has 512 atoms, VASP stops with the following message:
rank 3 in job 1 ... caused collective abort of all ranks
exit status of rank 3: killed by signal 11.
The error occurs at the beginning, during the preparation of WAVECAR.
It occurs even when we use only one node.
We have changed dyna.f to 512 atoms.
It is a little strange that a system of 256 atoms runs but a system of 512 atoms does not.
Please kindly help us.