A framework for force field parameter optimization.
The FF Optimizer is a base module that can be extended with other modules that implement specific optimization methods. The base FF Optimizer module offers methods to compute the cost function (also known as the error or the fitness function) and its first and second derivatives with respect to the force-field parameter values. Besides that, there are two extensions available that implement global optimization methods: Monte-Carlo (MCFF) and Covariance Matrix Adaptation Evolution Strategy (CMA-ES).
To optimize a ReaxFF force field, the files listed below must be present in the directory where the reaxff program is executed.
iopt
file containing a line of text with a single number on it. To use the basic FFOptimizer features, this file must contain 6. See the ffoact description below for more details.
ffield
the initial force-field file.
control
in addition to the general ReaxFF control parameters, it may also contain the FFOptimizer-related ones explained in the corresponding sections.
trainset.in
file with test values from the training set, the same as in the original reaxff force-field optimization; see page 27 of the ReaxFF User Manual.
params
file that describes the variable force-field parameters, one line per parameter (see example below).
This format is a generalization of the original params file format.
The hash symbol # starts a comment. Each non-comment line must begin with three integer numbers, the section-block-item coordinates of the corresponding parameter. The first coordinate specifies a section: 1 - general parameters, 2 - atomic parameters, 3 - bonds, 4 - off-diagonal terms, 5 - valence angles, 6 - torsion angles, 7 - hydrogen bonds. The second coordinate specifies a block within the given section, or an item index for the general parameters section (section=1). The third coordinate specifies an item index within the block, except for section 1, for which the third coordinate is ignored. The coordinates can optionally be followed by one or more real numbers. The number and the meaning of the reals depend on the selected task. For some force-field optimization methods the first three real numbers have a special meaning; they are referred to as delta (or sigma), ffmin, and ffmax, respectively, in the following sections.
An example of the params file:
# i j k x1 x2 x3 ...
1 1 0 0.1 -1.0 1.0 # The 1st general parameter
2 3 4 0.1 -1.0 1.0 # The 4th parameter of the 3rd atoms block
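The format described above can be read with a few lines of Python. This is only a minimal sketch for external tooling, not part of the reaxff program: the names `ParamLine` and `parse_params` are hypothetical, and it assumes that a hash symbol anywhere on a line starts a comment and that fields are separated by whitespace.

```python
from typing import List, NamedTuple


class ParamLine(NamedTuple):
    """One non-comment line of a params file (hypothetical helper type)."""
    section: int        # 1 = general, 2 = atomic, 3 = bonds, ...
    block: int          # block index, or item index when section == 1
    item: int           # item index within the block (ignored for section 1)
    reals: List[float]  # optional reals, e.g. delta, ffmin, ffmax


def parse_params(text: str) -> List[ParamLine]:
    """Parse params-file text into a list of ParamLine entries."""
    entries = []
    for raw in text.splitlines():
        # Assumption: everything after '#' is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"expected three integer coordinates: {raw!r}")
        section, block, item = (int(f) for f in fields[:3])
        reals = [float(f) for f in fields[3:]]
        entries.append(ParamLine(section, block, item, reals))
    return entries


example = """\
# i j k x1 x2 x3 ...
1 1 0 0.1 -1.0 1.0 # The 1st general parameter
2 3 4 0.1 -1.0 1.0 # The 4th parameter of the 3rd atoms block
"""
parsed = parse_params(example)
for entry in parsed:
    print(entry)
```

For methods that use them, `entry.reals[0]`, `entry.reals[1]`, and `entry.reals[2]` would then correspond to delta (or sigma), ffmin, and ffmax.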
an alternative way to specify variable parameters if the params file is not present. This file has the same format as ffield, but instead of parameter values it contains 1.0 or 0.0 as a flag indicating whether the corresponding value is variable or fixed, respectively.
an alternative way to specify minimum values (ffmin) for the variable parameters if the params file is not present. This file has the same format as ffield.
an alternative way to specify maximum values (ffmax) for the variable parameters if the params file is not present. This file has the same format as ffield.
geo
file with the training set geometries in the BGF format, the same as in the original reaxff force-field optimization. Geometries of different molecules in this file must be concatenated.
models.in
as an alternative to a single geo file, one can save each geometry in its own file and list the files that are part of the training set here. The molecule name specified in the models.in file must match that in the trainset.in file, and it takes precedence over any molecule name specified inside the geometry file (filename1, etc.). The format of the file is as follows:
The following control parameters are related to the FFOptimizer. The default value for each parameter is given in parentheses.
ffoact
the ID of the task to be performed by the base FFOptimizer module (i.e. when the iopt file contains 6):
- calculate gradient of the error function with respect to the variable FF parameters by finite differences. This option requires $2N + 1$ error function evaluations, where $N$ is the number of variable force-field parameters.
- calculate the second derivatives (Hessian) matrix of the error function with respect to the variable FF parameters by finite differences. The eigenvalues and eigenvectors of the obtained Hessian matrix will also be computed. In this case, $N^2 + N + 1$ function evaluations will be done. Note: calculating derivatives can be very slow, so make sure you run it on as many processors as possible.
- calculate an error function value for each column-vector of parameters specified in the params file. If the params file does not contain any column-vectors (i.e. there are no real numbers at all in the file), then the error function for the current ffield file is calculated. This feature can be used, for example, by external force-field optimizers. The result per trainset.in entry is written to the fort.99 file.
- find replic random vectors of parameter values that result in a valid (non-NaN) error function value. The random values are distributed uniformly in the allowed parameter space.
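The random-vector task in the last item can be pictured as simple rejection sampling. The sketch below is an illustration under stated assumptions, not the FFOptimizer implementation: it assumes the allowed parameter space is the box bounded by ffmin and ffmax for each parameter, and `sample_valid_vectors`, `toy_error`, and `max_tries` are hypothetical names, with `toy_error` standing in for a real error function evaluation.

```python
import math
import random
from typing import Callable, List, Sequence


def sample_valid_vectors(
    error_fn: Callable[[Sequence[float]], float],
    ffmin: Sequence[float],
    ffmax: Sequence[float],
    replic: int,
    seed: int = 0,
    max_tries: int = 10000,
) -> List[List[float]]:
    """Draw uniform random vectors until `replic` of them give a non-NaN error."""
    rng = random.Random(seed)
    accepted: List[List[float]] = []
    tries = 0
    while len(accepted) < replic:
        if tries >= max_tries:
            raise RuntimeError("too many rejected candidates")
        tries += 1
        # Uniform draw inside the assumed [ffmin, ffmax] box.
        candidate = [rng.uniform(lo, hi) for lo, hi in zip(ffmin, ffmax)]
        if not math.isnan(error_fn(candidate)):
            accepted.append(candidate)
    return accepted


def toy_error(x: Sequence[float]) -> float:
    """Stand-in error function: invalid (NaN) when the first parameter is negative."""
    if x[0] < 0.0:
        return float("nan")
    return sum(v * v for v in x)


vectors = sample_valid_vectors(toy_error, [-1.0, -1.0], [1.0, 1.0], replic=5)
for v in vectors:
    print(v)
```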
the delta used for calculating first and second derivatives by finite differences (0.01).
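The evaluation counts quoted for the derivative tasks are consistent with central finite differences. The sketch below shows one stencil that reproduces $2N + 1$ evaluations for the gradient and $N^2 + N + 1$ for the Hessian. It is an assumption that FFOptimizer uses these exact formulas or applies delta as an absolute step; `CountingFunction`, `gradient`, `hessian`, and the quadratic test function are hypothetical names used for illustration only.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class CountingFunction:
    """Wrap a function and count how often it is evaluated."""

    def __init__(self, fn: Callable[[Sequence[float]], float]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, x: Sequence[float]) -> float:
        self.calls += 1
        return self.fn(x)


def shifted(x: Sequence[float], steps: Dict[int, float]) -> List[float]:
    """Return a copy of x with the given offsets added to selected components."""
    y = list(x)
    for i, d in steps.items():
        y[i] += d
    return y


def gradient(f, x: Sequence[float], delta: float = 0.01) -> Tuple[float, List[float]]:
    """Central-difference gradient: 1 reference + 2N displaced evaluations."""
    f0 = f(x)  # reference value; reported, not used in the central formula
    grad = []
    for i in range(len(x)):
        fp = f(shifted(x, {i: delta}))
        fm = f(shifted(x, {i: -delta}))
        grad.append((fp - fm) / (2.0 * delta))
    return f0, grad


def hessian(f, x: Sequence[float], delta: float = 0.01) -> List[List[float]]:
    """Finite-difference Hessian using 1 + 2N + N(N-1) = N^2 + N + 1 evaluations."""
    n = len(x)
    f0 = f(x)
    fp = [f(shifted(x, {i: delta})) for i in range(n)]
    fm = [f(shifted(x, {i: -delta})) for i in range(n)]
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = (fp[i] - 2.0 * f0 + fm[i]) / delta**2
        for j in range(i + 1, n):
            fpp = f(shifted(x, {i: delta, j: delta}))
            fmm = f(shifted(x, {i: -delta, j: -delta}))
            mixed = fpp - fp[i] - fp[j] + 2.0 * f0 - fm[i] - fm[j] + fmm
            h[i][j] = h[j][i] = mixed / (2.0 * delta**2)
    return h


def quadratic(x: Sequence[float]) -> float:
    """Toy error function with a known gradient and Hessian."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2


grad_counter = CountingFunction(quadratic)
_, grad = gradient(grad_counter, [1.0, 2.0])
hess_counter = CountingFunction(quadratic)
hess = hessian(hess_counter, [1.0, 2.0])
print(grad, grad_counter.calls)
print(hess, hess_counter.calls)
```

With two variable parameters the counters report 5 and 7 evaluations, matching $2N + 1$ and $N^2 + N + 1$ for $N = 2$.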
The following options may apply to any FF optimization type.
If running in parallel and ffdedi is set to 1, then the master process will run as a dedicated dispatcher and will not perform any computations. The default value depends on the number of shared-memory nodes used in the calculation (0 for a single-node calculation, 1 otherwise).
number of parameter sets to calculate at once for ffoact = 4 (1 by default). This control parameter may have a different meaning and different defaults for other force-field optimizers.
if this parameter is set to 0, then the fort.99 files will not be written. By default, every fitness function evaluation produces a fort.99.xxx.yyy file, where xxx is the iteration number and yyy is the replica index within that iteration. This can lead to a large number of fort.99.* files, so setting fort99 to 0 can be useful for saving space in long production runs.
MCFFOptimizer will create a geo file with optimized geometries corresponding to the current best parameter set.