Error "EDDDAV: Call to ZHEGV failed." with AM05
Posted: Tue May 23, 2023 8:04 am
Dear developers,
I am trying to run some slab calculations using the AM05 exchange and correlation functional on an HPC machine; however, despite all my efforts I always encounter the "EDDDAV: Call to ZHEGV failed. Returncode = XXX" error.
I have already tried the solutions suggested in the following thread on the forum (forum/viewtopic.php?t=10409), but none of them worked.
I have made additional tests, and this is what I could observe:
-- this error is strongly dependent on the choice of xc potential: the same input files do not give any issue when selecting a different functional, e.g. PBEsol; however, the same setup (AM05) works fine for bulk calculations;
-- changing the parallelization details (i.e. both on the slurm side [MPI/OPENMPI, number of processes, with/without GPU] and on the INCAR side [NPAR, KPAR]) doesn't solve the issue, but changes the XXX code reported in the error message;
-- different INCAR setups, e.g. diagonalization schemes, plane-wave cutoff, etc., end up with the same error;
-- the use of different pseudopotentials or different VASP versions (6.3 or 6.4) doesn't solve the issue;
-- the starting geometry seems to be free of any systematic errors, like too-short distances; increasing the lattice scaling factor does not avoid the issue;
-- starting from converged wavefunctions/charge densities (obtained through another xc functional) still produces the error;
-- I got this error using VASP on the Italian Cineca HPC machines: I get the same error on any of their clusters (marconi, marconi100, galileo100). I have been in touch with their support and they ruled out that this error is related to any library issue (e.g. scalapack, as suggested in the forum);
-- the error is not always generated at the same step: sometimes at the beginning of the first diagonalization and sometimes after several electronic (or even ionic) steps. This is strongly machine-dependent. When the issue doesn't appear immediately, however, the total energy grows almost exponentially during the first electronic steps; correspondingly, some energies in the OUTCAR are reported as a sequence of *****.
I am attaching a set of input/output files to check on the issue. Any help in solving it would be greatly appreciated.
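For anyone who wants to check their own output for the same symptom, here is a minimal sketch of how the overflowed fields can be located. It relies only on the fact that Fortran fixed-width output prints asterisks when a number does not fit its field. The function name, the threshold of five asterisks, and the sample lines are my own assumptions for illustration; the sample is made up and is not copied from an actual OUTCAR.

```python
import re

# A run of 5 or more asterisks is treated as an overflowed numeric field.
# The threshold is an assumption chosen to avoid matching short decorations.
OVERFLOW = re.compile(r"\*{5,}")


def find_overflow_lines(lines):
    """Return (line_number, text) pairs for lines containing an overflowed field."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if OVERFLOW.search(text):
            hits.append((number, text.rstrip("\n")))
    return hits


# Made-up lines for illustration only, not real VASP output.
sample = [
    "energy term A = -123.456\n",
    "energy term B = **********\n",
    "energy term C = 0.001\n",
]
print(find_overflow_lines(sample))
```

To scan a real file, the same function can be given an open file handle, e.g. `with open("OUTCAR") as f: hits = find_overflow_lines(f)`.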
Thank you in advance,
Aldo Ugolotti