Signal code: Address not mapped (1)

Posted: Thu Apr 24, 2008 11:40 pm
by midair77
Dear all, I encountered this error and was able to reproduce it. It looks like my VASP program hit a segmentation fault/memory violation, but I do not know how to interpret this part of the MPI output.

Our system is Rocks 4.3 x86_64 with openmpi-1.2.5 and scalapack-1.8.0,
on Barcelona CPUs with Gigabit interconnects.

# cat 2156.jupiter.mynetwork.com.out | wc -l
614
# cat 2089.jupiter.mynetwork.com.out | wc -l
157

The interesting part is that the same job ran on different nodes and hit the same error, but at different iterations: job 2156 ran much longer before failing (614 lines of output), while job 2089 failed earlier (157 lines).


[test@Jupiter ]$ cat Co0001.e2089
[compute-1-1:14557] *** Process received signal ***
[compute-1-1:14557] Signal: Segmentation fault (11)
[compute-1-1:14557] Signal code: Address not mapped (1)
[compute-1-1:14557] Failing at address: (nil)
[compute-1-1:14557] [ 0] /lib64/tls/libpthread.so.0 [0x3db530c4f0]
[compute-1-1:14557] [ 1] /usr/local/bin/vaspopenmpi_scala(__dfast__cnorma+0x1e4) [0x4dd884]
[compute-1-1:14557] [ 2] /usr/local/bin/vaspopenmpi_scala(__rmm_diis__eddrmm+0x6dbd) [0x5b25fd]
[compute-1-1:14557] [ 3] /usr/local/bin/vaspopenmpi_scala(elmin_+0x32fa) [0x608a9a]
[compute-1-1:14557] [ 4] /usr/local/bin/vaspopenmpi_scala(MAIN__+0x15492) [0x425f4a]
[compute-1-1:14557] [ 5] /usr/local/bin/vaspopenmpi_scala(main+0xe) [0x6ed9ee]
[compute-1-1:14557] [ 6] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x3db441c3fb]
[compute-1-1:14557] [ 7] /usr/local/bin/vaspopenmpi_scala [0x410a2a]
[compute-1-1:14557] *** End of error message ***
mpiexec noticed that job rank 0 with PID 14557 on node compute-1-1.local exited on signal 11 (Segmentation fault).


[test@Jupiter ]$ cat Co0001.e2156
[compute-1-2:03847] *** Process received signal ***
[compute-1-2:03847] Signal: Segmentation fault (11)
[compute-1-2:03847] Signal code: Address not mapped (1)
[compute-1-2:03847] Failing at address: (nil)
[compute-1-2:03847] [ 0] /lib64/tls/libpthread.so.0 [0x3984e0c4f0]
[compute-1-2:03847] [ 1] /usr/local/bin/vaspopenmpi_scala(__dfast__cnorma+0x1e4) [0x4dd884]
[compute-1-2:03847] [ 2] /usr/local/bin/vaspopenmpi_scala(__rmm_diis__eddrmm+0x6dbd) [0x5b25fd]
[compute-1-2:03847] [ 3] /usr/local/bin/vaspopenmpi_scala(elmin_+0x32fa) [0x608a9a]
[compute-1-2:03847] [ 4] /usr/local/bin/vaspopenmpi_scala(MAIN__+0x15492) [0x425f4a]
[compute-1-2:03847] [ 5] /usr/local/bin/vaspopenmpi_scala(main+0xe) [0x6ed9ee]
[compute-1-2:03847] [ 6] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x3983f1c3fb]
[compute-1-2:03847] [ 7] /usr/local/bin/vaspopenmpi_scala [0x410a2a]
[compute-1-2:03847] *** End of error message ***
mpiexec noticed that job rank 0 with PID 3847 on node compute-1-2.local exited on signal 11 (Segmentation fault).
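
The two traces can also be compared mechanically rather than by eye. Below is a minimal Python sketch, assuming the backtrace lines look exactly like the ones in the .e files above; the names FRAME_RE, parse_frames and crash_signature are made up for this illustration and are not part of Open MPI or VASP. It drops the host name, PID and absolute addresses and keeps only the symbol and offset of each frame, so two crashes on different nodes can be checked for the same call path.

```python
import re

# One frame of the backtrace format shown in the .e files, for example:
# [compute-1-1:14557] [ 1] /path/to/binary(symbol+0x1e4) [0x4dd884]
# The "(symbol+offset)" part is optional, since some frames have no symbol.
FRAME_RE = re.compile(
    r"\[(?P<host>[^\]:]+):(?P<pid>\d+)\]\s+"
    r"\[\s*(?P<index>\d+)\]\s+"
    r"(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^)+]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\s*\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frames(text):
    """Return (index, module, symbol, offset) for every frame found in text.

    finditer is used instead of line-by-line matching so that two frames
    printed on the same line are still picked up separately.
    """
    return [
        (int(m.group("index")), m.group("module"),
         m.group("symbol"), m.group("offset"))
        for m in FRAME_RE.finditer(text)
    ]


def crash_signature(text):
    """Keep only (symbol, offset) pairs, ignoring frames without a symbol."""
    return [(sym, off) for _, _, sym, off in parse_frames(text) if sym]


# Excerpts copied from the two error files (frames 0 to 2 only).
trace_2089 = """
[compute-1-1:14557] [ 0] /lib64/tls/libpthread.so.0 [0x3db530c4f0]
[compute-1-1:14557] [ 1] /usr/local/bin/vaspopenmpi_scala(__dfast__cnorma+0x1e4) [0x4dd884]
[compute-1-1:14557] [ 2] /usr/local/bin/vaspopenmpi_scala(__rmm_diis__eddrmm+0x6dbd) [0x5b25fd]
"""
trace_2156 = """
[compute-1-2:03847] [ 0] /lib64/tls/libpthread.so.0 [0x3984e0c4f0]
[compute-1-2:03847] [ 1] /usr/local/bin/vaspopenmpi_scala(__dfast__cnorma+0x1e4) [0x4dd884]
[compute-1-2:03847] [ 2] /usr/local/bin/vaspopenmpi_scala(__rmm_diis__eddrmm+0x6dbd) [0x5b25fd]
"""

same_path = crash_signature(trace_2089) == crash_signature(trace_2156)
print("same call path:", same_path)
```

For the excerpts used here, both jobs fail at the same offset inside the same routine of the VASP binary, and only the node, PID and library load addresses differ. Together with "Failing at address: (nil)", that would be consistent with a reproducible null pointer access in the code path rather than a fault tied to one particular node, although the trace alone does not prove it.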

Could somebody tell me what caused this type of error?

Thank you very much for your help.

Posted: Tue Apr 29, 2008 10:07 am
by admin
From the .e files you show, the failing address seems to be primarily in one of the routines of the pthread library. Please check whether the errors are due to parallelization (i.e., whether a single-processor job crashes as well).
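
One way to run that check from a script is sketched below in Python. It is only an illustration under stated assumptions: describe_exit and run_and_describe are hypothetical helper names, and the command lines in the comments are guesses based on the binary path in the trace, not a confirmed way to launch VASP on this cluster. The sketch relies on the documented behaviour of Python's subprocess module on POSIX, where a child killed by signal N is reported with return code -N.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, death by signal N is reported as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(command, workdir=None):
    """Run command (a list of arguments) in workdir and describe how it ended."""
    result = subprocess.run(command, cwd=workdir)
    return describe_exit(result.returncode)


# Hypothetical usage, to be adapted to the local setup:
#   run_and_describe(["/usr/local/bin/vaspopenmpi_scala"], workdir="serial_run")
#   run_and_describe(["mpiexec", "-np", "1", "/usr/local/bin/vaspopenmpi_scala"],
#                    workdir="np1_run")
```

If the job is started through mpiexec, the value seen here is the launcher's own exit status, so the signal will probably not appear as a negative return code; in that case the message printed by mpiexec remains the primary evidence. Running the same input once on a single process and once in parallel, and comparing the two results, is the comparison suggested above.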

Posted: Fri Sep 18, 2009 1:40 am
by dinhloc1984
Hi,

I get the same error. It happens when I use ultrasoft pseudopotentials (USPP).
When I use PAW instead of USPP, it runs fine.

Can anyone explain this strange behavior? I want to use USPP, but I cannot get it to run.

Thanks for all your help.

Sincerely,
Loc.

Posted: Sat Jul 31, 2010 6:56 am
by Sonny
Hello;

I get this error with IBRION=0 calculations.
It occurs in both serial and parallel runs on a quad-core Xeon, compiled with gfortran.


regards;

Sonny

