Training becomes slower when copying ML_ABN to ML_AB to continue training

Dear all,

On the same computer running MLFF training, I found that training becomes much slower after I copy ML_ABN from the first run to ML_AB and continue training.
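For context, this is the continuation workflow I follow, sketched in Python (a minimal illustration only; the file names are the standard VASP ones, but the helper function and the demo directory are my own invention):

```python
import os
import shutil

def prepare_continuation(run_dir):
    """Copy the outputs of a finished MLFF training run to the input
    file names read by the next run. ML_FF is only needed when the next
    run uses the force field directly (e.g. ML_MODE = run); copying it
    is harmless otherwise."""
    pairs = [("ML_ABN", "ML_AB"),    # collected training structures
             ("ML_FFN", "ML_FF"),    # fitted force field
             ("CONTCAR", "POSCAR")]  # last ionic positions
    for src, dst in pairs:
        s = os.path.join(run_dir, src)
        if os.path.exists(s):
            shutil.copy(s, os.path.join(run_dir, dst))

# Demo with dummy files standing in for real VASP output:
os.makedirs("demo_run", exist_ok=True)
for name in ("ML_ABN", "ML_FFN", "CONTCAR"):
    with open(os.path.join("demo_run", name), "w") as f:
        f.write(name)
prepare_continuation("demo_run")
print(sorted(os.listdir("demo_run")))
# -> ['CONTCAR', 'ML_AB', 'ML_ABN', 'ML_FF', 'ML_FFN', 'POSCAR']
```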
Jr. Member | Posts: 71 | Joined: Tue Nov 19, 2019 4:15 am
Full Member | Posts: 245 | Joined: Tue Jan 19, 2021 12:01 am
Re: Training becomes slower when copying ML_ABN to ML_AB to continue training
Hi,
Great that you are doing some testing. Could you clarify what exactly you are observing?

For reference, the ab-initio calculation should remain at the same computational cost in every MD step unless you changed some settings. During training, more and more local reference configurations are collected, and it does indeed cost more computational effort to add, e.g., the 15th local reference configuration than the 4th and to update the design matrix. However, avoiding the addition of local reference configurations entirely is not an option, since this is what improves the force field.

A comparison of restarting a training calculation versus running a single training calculation for longer should not show a significant difference in computational cost (apart from the overhead of writing and reading files, etc.).
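To illustrate the design-matrix point with a toy example (this is an illustration only, not VASP code; the matrix sizes are arbitrary stand-ins for the number of training data and of local reference configurations):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_weights(n_data, n_basis):
    """Toy least-squares fit: one design-matrix column per local
    reference configuration (basis function)."""
    A = rng.standard_normal((n_data, n_basis))  # design matrix
    y = rng.standard_normal(n_data)             # reference energies/forces
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

# The solve cost grows roughly as O(n_data * n_basis^2), so doubling the
# number of local reference configurations makes each refit about 4x
# more expensive, on top of the larger descriptor evaluation per MD step.
w_early = fit_weights(400, 150)   # early in training: few configurations
w_late = fit_weights(800, 300)    # later: the design matrix has doubled
print(len(w_early), len(w_late))  # -> 150 300
```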
Do you have a question in connection with your observation?
Marie-Therese
Re: Training becomes slower when copying ML_ABN to ML_AB to continue training
Hi,
Continued training becomes much slower when I copy ML_ABN to ML_AB.
My INCAR looks like:
ISMEAR = 0
SIGMA = 0.5
ISPIN = 1
ISYM = 0
LREAL = Auto
### MD part
IBRION = 0
MDALGO = 3
LANGEVIN_GAMMA = 10.0 10.0 10.0 10.0 10.0 10.0
LANGEVIN_GAMMA_L = 10.0
NSW = 10000
POTIM = 1.5
ISIF = 3
TEBEG = 200
TEEND = 500
PSTRESS = 0.001
PMASS = 100
POMASS = 12 8 14 32 16 19
RANDOM_SEED = 486686595 0 0
### Output
LWAVE = .FALSE.
LCHARG = .FALSE.
#NBLOCK = 10
#KBLOCK = 10
################################
### MACHINE-LEARNING ###
################################
ML_LMLFF = .T.
ML_MODE = train
ML_DESC_TYPE = 1
ML_MCONF_NEW = 12
ML_CDOUB = 4
ML_CTIFOR = 0.02
I checked the ML_ABN file and found that the number of basis sets per atom type increased to 3000 after continuing training, compared to 1500 when training from scratch.
I guess this increase in the basis-set size is what makes the training so slow.
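This check can be scripted. Here is the snippet I use (hedged: it assumes the header line contains the phrase "basis sets per atom type" with the counts two lines below it, separated by a dashed rule, as in my file; other VASP versions may format the header differently):

```python
def basis_sets_per_type(path):
    """Return the per-atom-type basis-set counts from an ML_AB/ML_ABN
    header. Assumes the counts appear two lines below the header line
    (after a dashed separator); adjust if your file differs."""
    with open(path) as f:
        lines = f.readlines()
    for i, line in enumerate(lines):
        if "basis sets per atom type" in line:
            return [int(x) for x in lines[i + 2].split()]
    return []

# Demo on a synthetic header fragment mimicking the file layout:
sample = """\
     The numbers of basis sets per atom type
--------------------------------------------------
      3000  3000  3000
"""
with open("ML_AB_sample", "w") as f:
    f.write(sample)
print(basis_sets_per_type("ML_AB_sample"))  # -> [3000, 3000, 3000]
```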
Re: Training becomes slower when copying ML_ABN to ML_AB to continue training
In addition, I find that the ML_FFN file is frequently rewritten, which takes quite a lot of time. How can I set the rewriting frequency for ML_FFN?