Training becomes slower when copying ML_ABN to ML_AB to continue training

Queries about input and output files, running specific calculations, etc.



suojiang_zhang1
Jr. Member
Posts: 71
Joined: Tue Nov 19, 2019 4:15 am

Training becomes slower when copying ML_ABN to ML_AB to continue training

#1 Post by suojiang_zhang1 » Sat Mar 29, 2025 9:53 am

Dear support,
Running MLFF training on the same computer, I found that training becomes slower after I copy the ML_ABN file from the first run to ML_AB and continue training.


marie-therese.huebsch
Full Member
Posts: 245
Joined: Tue Jan 19, 2021 12:01 am

Re: Training becomes slower when copying ML_ABN to ML_AB to continue training

#2 Post by marie-therese.huebsch » Mon Mar 31, 2025 10:08 am

Hi,

It is great that you are doing some testing. Could you clarify what exactly you are observing?

For reference, the ab-initio calculation should remain at the same computational cost in every MD step unless you changed some settings. During training, more and more local reference configurations are collected, and it indeed costs more computational effort to add, e.g., the 15th local reference configuration and update the design matrix than the 4th. However, entirely avoiding the addition of local reference configurations is not an option, since this is exactly what improves the force field. Restarting a training calculation, compared to running a single training calculation for longer, should not change the computational cost significantly (apart from the overhead of writing and reading files).
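As a toy illustration of why each added local reference configuration makes subsequent training steps pricier: this is a back-of-the-envelope cost model, not VASP's actual implementation, and the flop counts below are rough assumptions.

```python
# Toy cost model (not VASP source code): the on-the-fly MLFF refit
# solves a least-squares problem whose design matrix has one column per
# local reference configuration (basis function). With R training
# equations and B basis functions, forming the normal equations costs
# roughly R * B^2 flops and solving them roughly B^3, so every added
# basis function makes later refits more expensive.

def fit_cost_flops(rows: int, basis: int) -> int:
    """Rough flop estimate for one least-squares refit."""
    return rows * basis ** 2 + basis ** 3

# Doubling the basis roughly quadruples (or more) the refit cost:
ratio = fit_cost_flops(10000, 3000) / fit_cost_flops(10000, 1500)
print(round(ratio, 1))  # 4.5
```

The exact numbers are illustrative; the point is the superlinear growth in B.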

Do you have a question in connection with your observation?

Marie-Therese


suojiang_zhang1
Jr. Member
Posts: 71
Joined: Tue Nov 19, 2019 4:15 am

Re: Training becomes slower when copying ML_ABN to ML_AB to continue training

#3 Post by suojiang_zhang1 » Mon Apr 21, 2025 1:32 am

Hi,
Continued training becomes really slow when I copy ML_ABN to ML_AB.

My INCAR looks like:
ISMEAR = 0
SIGMA = 0.5
ISPIN = 1
ISYM = 0
LREAL = Auto
### MD part
IBRION = 0
MDALGO = 3
LANGEVIN_GAMMA = 10.0 10.0 10.0 10.0 10.0 10.0
LANGEVIN_GAMMA_L = 10.0
NSW = 10000
POTIM = 1.5
ISIF = 3
TEBEG = 200
TEEND = 500
PSTRESS = 0.001
PMASS = 100
POMASS = 12 8 14 32 16 19
RANDOM_SEED = 486686595 0 0
### Output
LWAVE = .FALSE.
LCHARG = .FALSE.
#NBLOCK = 10
#KBLOCK = 10
##############################
### MACHINE-LEARNING ###
################################
ML_LMLFF = .T.
ML_MODE = train
ML_DESC_TYPE = 1
ML_MCONF_NEW = 12
ML_CDOUB = 4
ML_CTIFOR = 0.02

I checked the ML_ABN file and found that the number of basis sets per atom type increased from 1500 (training from scratch) to 3000 after continuing training.
I suspect this increase in basis-set size is what makes the training so slow.
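If the growing basis is indeed the bottleneck, one knob worth checking (an assumption to verify against the VASP wiki, not something confirmed in this thread) is ML_MB, which caps the maximum number of local reference configurations (basis functions) per atom type; its default is 1500. A hedged INCAR sketch:

```
### Hedged sketch, not verified for this system: ML_MB limits the
### basis-set size per atom type, trading force-field accuracy
### against training cost and memory. Default is 1500.
ML_LMLFF = .T.
ML_MODE = train
ML_MB = 2000
```

Lowering or keeping ML_MB small keeps each refit cheaper at the price of a less flexible force field.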


suojiang_zhang1
Jr. Member
Posts: 71
Joined: Tue Nov 19, 2019 4:15 am

Re: Training becomes slower when copying ML_ABN to ML_AB to continue training

#4 Post by suojiang_zhang1 » Mon Apr 21, 2025 4:26 am

In addition, I find that ML_FFN is rewritten frequently, which takes quite some time. How can I set the rewriting frequency for ML_FFN?

