Memory issue with TDDFT calculation

shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

#1 Post by shweta_choudhary » Fri May 02, 2025 10:20 am

Dear support team,

I am trying to run TDDFT calculations for a system of 264 atoms and I am facing the following error:

Code: Select all

 min. memory requirement per mpi rank    579.8 MB, per node  27828.9 MB

 -----------------------------------------------------------------------------
|                                                                             |
|     EEEEEEE  RRRRRR   RRRRRR   OOOOOOO  RRRRRR      ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     EEEEE    RRRRRR   RRRRRR   O     O  RRRRRR       #       #       #      |
|     E        R   R    R   R    O     O  R   R                               |
|     E        R    R   R    R   O     O  R    R      ###     ###     ###     |
|     EEEEEEE  R     R  R     R  OOOOOOO  R     R     ###     ###     ###     |
|                                                                             |
|     Could not allocate body of response function on mpi rank 0 of size:     |
|     0 MB. Reducing NOMEGAPAR or using more computing nodes might solve      |
|     this problem.                                                           |
|                                                                             |
|       ---->  I REFUSE TO CONTINUE WITH THIS SICK JOB ... BYE!!! <----       |
|                                                                             |
 -----------------------------------------------------------------------------

However, I am requesting enough memory in my job script:

Code: Select all

#!/bin/bash
#SBATCH -N 3
#SBATCH --ntasks-per-node=48
#SBATCH --job-name=k
#SBATCH --error=error.%J.err
#SBATCH --output=output.%J.out
#SBATCH --time=00-01:00:00
#SBATCH --partition=debug
#SBATCH --mem=672GB

source /home/VASP/vasp_var.sh
mpirun -np $SLURM_NTASKS /home/vasp.6.3.2/bin/vasp_std

The INCAR is as follows:

Code: Select all

ISTART =  1; ICHARG =  0; LWAVE=T; LCHARG=F
LREAL=A; ENCUT  =  600; GGA= PS
ISMEAR =  0; SIGMA  =  0.01
EDIFF  = 1E-8; ISIF=2; NSW = 0; IBRION = -1
ALGO=TDHF; NBANDS = 1776; ANTIRES=0
IBSE=0; NBANDSO   = 20 ; NBANDSV   = 20; LORBITALREAL=T
LFXC  =T; LHARTREE  =T; LADDER=F; NOMEGAPAR=1

Could anyone please help with this?

Thanks,
Shweta


fabien_tran1
Global Moderator
Posts: 450
Joined: Mon Sep 13, 2021 11:02 am

Re: Memory issue with TDDFT calculation

#2 Post by fabien_tran1 » Fri May 02, 2025 12:27 pm

Hi.

How much memory is available on each node of the cluster? The option --mem specifies the memory required per node (https://slurm.schedmd.com/archive/slurm ... batch.html), and the maximum possible value depends of course on what is available on the node. According to the error message, each node should provide at least 28 GB. Help regarding memory requirements can be found at wiki/index.php/Category:Memory.
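
As a rough cross-check of the 28 GB figure, the per-node requirement can be reproduced from the per-rank value in the error message and the number of MPI ranks per node in the job script quoted above. This is only a sketch of the arithmetic; the function name is made up for illustration, and the small difference from the reported 27828.9 MB presumably comes from rounding of the per-rank value.

```python
# Values taken from the OUTCAR line and the job script quoted in this thread.
MB_PER_RANK = 579.8     # "min. memory requirement per mpi rank"
RANKS_PER_NODE = 48     # "#SBATCH --ntasks-per-node=48"


def min_memory_per_node_mb(mb_per_rank, ranks_per_node):
    """Lower bound on the memory one node must provide, in MB."""
    return mb_per_rank * ranks_per_node


per_node_mb = min_memory_per_node_mb(MB_PER_RANK, RANKS_PER_NODE)
print(f"{per_node_mb:.1f} MB per node, about {per_node_mb / 1000:.0f} GB")
```

Note that this is only the minimum reported by VASP at that point of the run; it says nothing about arrays that are allocated later.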


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#3 Post by shweta_choudhary » Fri May 02, 2025 8:12 pm

Hi,

Each node has 750 GB available. I have already tried reducing NTAUPAR and NOMEGAPAR to 1 and NOMEGA to 30, but nothing seems to work. I do not understand why this memory issue is happening. Is VASP not able to read the --mem option?


fabien_tran1
Global Moderator
Posts: 450
Joined: Mon Sep 13, 2021 11:02 am

Re: Memory issue with TDDFT calculation

#4 Post by fabien_tran1 » Fri May 02, 2025 8:56 pm

Can you please upload the files slurm_xxx.log and OUTCAR? Have you tried more nodes (it seems that you are using 3 nodes)?


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#5 Post by shweta_choudhary » Sat May 03, 2025 8:00 am

Hi,

I tried more nodes as well, but the issue remains the same. Somehow, this forum is not letting me upload the OUTCAR and .out files. I am writing the initial lines of the OUTCAR and log files here.

Code: Select all

 vasp.6.3.2 27Jun22 (build Mar 17 2025 15:04:40) complex

 executed on             LinuxIFC date 2025.05.02  15:40:50
 running on  144 total cores
 distrk:  each k-point on  144 cores,    1 groups
 distr:  one band on NCORE=   1 cores,  144 groups


--------------------------------------------------------------------------------------------------------


 INCAR:
   ISTART = 1
   ICHARG = 0
   LWAVE = T
   LCHARG = F
   LREAL = A
   ENCUT = 600
   GGA = PS
   ISMEAR = 0
   SIGMA = 0.01
   EDIFF = 1E-8
   ISIF = 2
   NSW = 0
   IBRION = -1
   ALGO = TDHF
   NBANDS = 1776
   ANTIRES = 0
   IBSE = 0
   NBANDSO = 20
   NBANDSV = 20
   LORBITALREAL = T
   LFXC = T
   LHARTREE = T
   LADDER = F
   NOMEGAPAR = 1

 POTCAR:    PAW_PBE Bi_d_GW 14Apr2014
 POTCAR:    PAW_PBE Br_GW 20Mar2012
 POTCAR:    PAW_PBE O_GW 28Sep2005

Code: Select all

 values below the HOMO (VB) or above the LUMO (CB) will cause erroneous energies
 E-fermi :  -0.4650

 -----------------------------------------------------------------------------
 WAVEDER not read: bands not compatible    1780    1872
 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     The derivative of the wavefunctions with respect to k (WAVEDER) can     |
|     not be found. You should redo the groundstate calculation using         |
|     LOPTICS=.TRUE. in order to write the WAVEDER file. However for          |
|     metals, the present setting is ok.                                      |
|                                                                             |
 -----------------------------------------------------------------------------

 the WAVEDER file was not read
energies w=

 responsefunction array rank=  229968
 LDA part: xc-table for Pade appr. of Perdew

 min. memory requirement per mpi rank    579.8 MB, per node  27828.9 MB

 allocating   0 responsefunctions rank=229968
 -----------------------------------------------------------------------------
|                                                                             |
|     EEEEEEE  RRRRRR   RRRRRR   OOOOOOO  RRRRRR      ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     EEEEE    RRRRRR   RRRRRR   O     O  RRRRRR       #       #       #      |
|     E        R   R    R   R    O     O  R   R                               |
|     E        R    R   R    R   O     O  R    R      ###     ###     ###     |
|     EEEEEEE  R     R  R     R  OOOOOOO  R     R     ###     ###     ###     |
|                                                                             |
|     Could not allocate body of response function on mpi rank 0 of size:     |
|     0 MB. Reducing NOMEGAPAR or using more computing nodes might solve      |
|     this problem.                                                           |
|                                                                             |
|       ---->  I REFUSE TO CONTINUE WITH THIS SICK JOB ... BYE!!! <----       |
|                                                                             |
 -----------------------------------------------------------------------------

fabien_tran1
Global Moderator
Posts: 450
Joined: Mon Sep 13, 2021 11:02 am

Re: Memory issue with TDDFT calculation

#6 Post by fabien_tran1 » Thu May 08, 2025 8:42 am

Sorry for the late answer. If possible, could you please provide all input files (INCAR, KPOINTS and POSCAR), and give more info about the computer cluster that you are using?


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#7 Post by shweta_choudhary » Sun May 11, 2025 3:34 pm

Dear sir,

INCAR:

Code: Select all

ISTART =  1; ICHARG =  0; LWAVE=T; LCHARG=F
ENCUT  =  600; GGA= PS
ISMEAR =  0; SIGMA  =  0.01
EDIFF  = 1E-8; ISIF=2; NSW = 0; IBRION = -1
ALGO=TDHF; NBANDS = 1872; ANTIRES=0
IBSE=0; NBANDSO   = 20 ; NBANDSV   = 20; LORBITALREAL=T
LFXC  =T; LHARTREE  =T; LADDER=F

KPOINTS:

Code: Select all

kmesh
0
Gamma
   1   2   1
0.0  0.0  0.0

POSCAR: attached as POSCAR.tar

fabien_tran1
Global Moderator
Posts: 450
Joined: Mon Sep 13, 2021 11:02 am

Re: Memory issue with TDDFT calculation

#8 Post by fabien_tran1 » Mon May 12, 2025 8:27 am

Is an upper limit for the virtual memory set on your nodes? Can you show what "ulimit -a" produces on the command line?
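
The limits reported by "ulimit -a" belong to the shell in which the command is run, so a login shell and a batch job on a compute node can in principle see different values. Whether that is the case on this cluster is an assumption, not something established in this thread. A minimal sketch for querying the virtual memory limit from inside a job, using Python's standard resource module (the function name is made up for illustration):

```python
import resource


def address_space_limit():
    """Return the (soft, hard) virtual memory limits as readable strings."""
    def fmt(value):
        # RLIM_INFINITY is what "ulimit -v" displays as "unlimited".
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} kbytes"

    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return fmt(soft), fmt(hard)


soft_limit, hard_limit = address_space_limit()
print(f"virtual memory: soft={soft_limit}, hard={hard_limit}")
```

Running this script as a step of the job script, before the mpirun line, would show the limits that actually apply to the job.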


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#9 Post by shweta_choudhary » Mon May 12, 2025 8:35 am

Hi,

Code: Select all

[shweta_cy.iitr@login02 ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1541472
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1541472
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

fabien_tran1
Global Moderator
Posts: 450
Joined: Mon Sep 13, 2021 11:02 am

Re: Memory issue with TDDFT calculation

#10 Post by fabien_tran1 » Mon May 12, 2025 9:53 am

The virtual memory was set to unlimited, which is fine. Now, I would like to repeat your calculation, but I need all details:
- The steps of the calculation (DFT followed by TDDFT, etc.).
- INCAR files for all steps.
- OUTCAR and .out files for all steps (if they are too large, compress them with zip).


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#11 Post by shweta_choudhary » Thu May 15, 2025 5:52 am

Dear Sir,

I have attached the ZIP file.

Many thanks,
Shweta


fabien_tran1
Global Moderator
Posts: 450
Joined: Mon Sep 13, 2021 11:02 am

Re: Memory issue with TDDFT calculation

#12 Post by fabien_tran1 » Fri May 16, 2025 3:08 pm

Hi,

Your calculation requires much more memory than what is indicated in the output files. Your system is quite large, and my colleague Alexey Tal (a specialist in BSE/TDDFT) will give you recommendations for reducing the memory requirement so that the calculation is hopefully feasible on your computer cluster.

However, before that, we would like you to provide us with the correct input files, since there are probably some mistakes:
- The POTCAR file in the folder step3 is different from the one in the other folders.
- The POSCAR files differ slightly.
- The value of NBANDS in step3 is different from step2, leading to the message "WAVEDER not read: bands not compatible 1780 1872".
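
One possible source of such an NBANDS mismatch is the number of MPI ranks. To my understanding, VASP increases NBANDS until it is divisible by the number of band groups, so the same INCAR value can end up as different band counts when two steps run on different core counts. The sketch below only illustrates that rounding. The 144 band groups come from the OUTCAR header posted above, whereas the 20 band groups for step2 are a purely hypothetical value chosen because it reproduces 1780.

```python
def rounded_nbands(nbands, band_groups):
    """Round nbands up to the next multiple of band_groups."""
    return -(-nbands // band_groups) * band_groups


# NBANDS = 1776 from the first INCAR, 144 band groups from the posted OUTCAR.
print(rounded_nbands(1776, 144))

# Hypothetical: 20 band groups in the earlier step (an assumption).
print(rounded_nbands(1776, 20))
```

If this is what happened, running all steps with the same parallel setup, or choosing an NBANDS value that is a multiple of the band group count in every step, would keep the WAVEDER and WAVECAR files compatible.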


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#13 Post by shweta_choudhary » Fri May 16, 2025 4:05 pm

Dear sir,

I used the GW POTCAR in step 3 as recommended in the VASP tutorial. Also, I have copied WAVECAR, CHGCAR and CONTCAR from the previous steps. Please confirm the correct procedure for this. Please let me know how to resolve this memory issue for a large system. One node has around 700 GB of memory in our cluster. I could use up to 5 or 6 nodes.

Many thanks,
Shweta


alexey.tal
Global Moderator
Posts: 394
Joined: Mon Sep 13, 2021 12:45 pm

Re: Memory issue with TDDFT calculation

#14 Post by alexey.tal » Fri May 16, 2025 4:51 pm

Dear Shweta,

"I used the GW POTCAR in step 3 as recommended in the VASP tutorial. Also, I have copied WAVECAR, CHGCAR and CONTCAR from the previous steps. Please confirm the correct procedure for this."

It is necessary that all input files (WAVECAR, WAVEDER) in your TDDFT (ALGO=TDHF) calculation are produced with the same POTCAR file.

"Please let me know how to resolve this memory issue for a large system. One node has around 700 GB of memory in our cluster. I could use up to 5 or 6 nodes."

Indeed, you are trying to perform a large calculation and the memory is likely to be an issue. However, there are a few things you could try to fit this calculation on your system.

The main bottleneck so far is the memory required to store the exchange-correlation kernel \(f_{xc}(G,G')\). As reported in your OUTCAR, the basis set for the response function is:
maximum number of plane-waves: 229856. The kernel storage then amounts to \(229856^2 \times 16 \times 10^{-9} \approx 845\) GB of memory. This array is not distributed, so with 700 GB per node, increasing the number of nodes will not help: you need to provide more memory to the first MPI rank. A way to reduce the size of this array is to reduce the basis set size of the response function, i.e., reduce ENCUTGW. But keep in mind that the calculation needs to be thoroughly converged with respect to ENCUTGW.
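
The estimate above can be reproduced, and inverted, with a few lines of arithmetic. This sketch assumes a dense square array with 16 bytes per element (one double-precision complex number) and GB meaning 10^9 bytes, as in the estimate. The function names are made up for illustration.

```python
import math

BYTES_PER_ELEMENT = 16  # one double-precision complex number


def kernel_memory_gb(n_plane_waves):
    """Memory in GB (10**9 bytes) for a dense n x n complex kernel."""
    return n_plane_waves ** 2 * BYTES_PER_ELEMENT / 1e9


def max_plane_waves(memory_gb):
    """Largest n whose dense n x n complex kernel fits into memory_gb."""
    return math.isqrt(int(memory_gb * 1e9) // BYTES_PER_ELEMENT)


print(f"kernel for 229856 plane waves: {kernel_memory_gb(229856):.0f} GB")
print(f"plane waves that fit into 700 GB: {max_plane_waves(700)}")
```

The second number is an upper bound only: it assumes the whole 700 GB of a node is available to this one array on the first MPI rank, which will not be the case in practice since other arrays also need memory.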

The number of bands included in the TDHF kernel calculation is very small in your OUTCAR, i.e., NBANDSO=NBANDSV=20. Since you have 2688 electrons or 1344 occupied bands, that means that you only account for around 1.5% of the occupied bands in your calculation, which is too little. For such a system you likely need hundreds or thousands of occupied and unoccupied bands. The rank of the Casida equation in the TDHF algorithm is NBANDSO x NBANDSV x NKPTS, and you are going to need to solve a really large matrix to get a reasonably converged spectrum.
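
To get a feeling for how quickly the matrix grows, the rank formula can be evaluated for the current settings and for a larger band window. In this sketch NKPTS = 2 is an assumption based on the 1x2x1 mesh in the posted KPOINTS file, the choice of 500 bands is purely hypothetical, and the memory estimate assumes a dense complex matrix stored in one piece with 16 bytes per element, which need not match how VASP actually stores or distributes it.

```python
def casida_rank(nbandso, nbandsv, nkpts):
    """Rank of the Casida equation: NBANDSO x NBANDSV x NKPTS."""
    return nbandso * nbandsv * nkpts


def dense_matrix_gb(rank, bytes_per_element=16):
    """Memory in GB (10**9 bytes) for a dense rank x rank complex matrix."""
    return rank ** 2 * bytes_per_element / 1e9


OCCUPIED_BANDS = 1344  # 2688 electrons / 2
NKPTS = 2              # assumption: 1x2x1 mesh without symmetry reduction

print(f"fraction of occupied bands used: {20 / OCCUPIED_BANDS:.1%}")

rank_now = casida_rank(20, 20, NKPTS)
print(f"NBANDSO=NBANDSV=20: rank {rank_now}, {dense_matrix_gb(rank_now):.3f} GB")

# Hypothetical larger window, only to show the scaling.
rank_large = casida_rank(500, 500, NKPTS)
print(f"NBANDSO=NBANDSV=500: rank {rank_large}, {dense_matrix_gb(rank_large):.0f} GB")
```

Because the rank grows with the product of the two band windows and the memory grows with the square of the rank, increasing both windows by a factor of 25 increases the memory estimate by a factor of 25^4.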

In VASP we have an alternative approach for the TDDFT calculation (ALGO=TIMEEV) which is discussed in detail on our wiki. This approach is much faster for calculations when the electron-hole interaction is not taken into account as in your case (LADDER=.FALSE.) and it requires much less memory.


shweta_choudhary
Newbie
Posts: 12
Joined: Tue May 16, 2023 12:56 pm

Re: Memory issue with TDDFT calculation

#15 Post by shweta_choudhary » Fri May 16, 2025 5:06 pm

Dear Alexey,

Thank you very much for the detailed response.

1. I will use the GW POTCAR throughout the whole procedure.

2. Could you please comment on whether BSE@GW, or TDHF with parameters from GW, to account for the electron-hole interaction is feasible at all for such large systems with our HPC configuration? I faced a similar memory issue during GW band structure calculations.

So, is optimizing ENCUTGW the only way? How can I utilize the full memory of the nodes? I could decrease the number of tasks per node down to 24.

3. Could you please comment on how to select NBANDSO and NBANDSV?

I really appreciate your help.

Last edited by shweta_choudhary on Fri May 16, 2025 5:08 pm, edited 2 times in total.
