nnp-scaling

Warning: Documentation under construction…

This tool calculates all symmetry functions for a given dataset (input.data), stores scaling information, and computes neighbor and symmetry function histograms. It is a prerequisite for training with nnp-train. nnp-scaling requires one additional command-line argument (the number of histogram bins) and can be run with MPI parallelization, e.g.

mpirun -np 4 nnp-scaling 500

This will randomly distribute the given structures among the 4 MPI processes (for load balancing). The scaling parameters of each symmetry function (minimum, maximum, mean, and sigma) are written to scaling.data. Histograms of the symmetry function value distributions are provided in separate files, e.g. sf.008.0003.histo contains the histogram of symmetry function 3 for oxygen atoms (element 8). In the above example, the command-line argument nbin is set to 500, which determines the number of bins for the symmetry function histograms. In addition, a neighbor histogram is written to neighbors.histo. The screen output contains a useful section about memory requirements during training, e.g.
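To illustrate what the files described above contain, here is a minimal sketch (not the actual nnp-scaling implementation, and the function names are invented for this example) of how the per-symmetry-function scaling statistics and a fixed-width histogram with nbin bins could be computed from the symmetry function values of all atoms in the dataset:

```python
# Illustrative sketch only: how scaling statistics (as stored in scaling.data)
# and an nbin-bin histogram (as stored in sf.*.histo) could be computed for
# one symmetry function. Not taken from the n2p2 source code.
import math

def scaling_stats(values):
    """Return (min, max, mean, sigma) for one symmetry function."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return min(values), max(values), mean, math.sqrt(var)

def histogram(values, nbin):
    """Count values into nbin equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbin or 1.0  # guard against a constant function
    counts = [0] * nbin
    for v in values:
        i = min(int((v - lo) / width), nbin - 1)  # clamp v == hi into last bin
        counts[i] += 1
    return counts

# Toy values for one symmetry function across five atoms:
values = [0.1, 0.4, 0.4, 0.7, 0.9]
print(scaling_stats(values))
print(histogram(values, nbin=5))
```

The statistics stored in scaling.data are later used by nnp-train to rescale symmetry function values into a well-conditioned input range for the neural network.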

*** MEMORY USAGE ESTIMATION ***************************************************

Estimated memory usage for training (keyword "memorize_symfunc_results"):
Valid for training of energies and forces.
Memory for local structures  :     12459839800 bytes (11882.63 MiB = 11.60 GiB).
Memory for all structures    :     49526892170 bytes (47232.53 MiB = 46.13 GiB).
Average memory per structure :         6839786 bytes (6.52 MiB).
*******************************************************************************

While nnp-scaling itself does not use much RAM, training with nnp-train can be significantly faster if intermediate symmetry function results are stored in memory and reused. This usually requires a large amount of memory, and the lines above give a rough estimate of the minimum usage. In practice, at least 10% more memory should be expected.
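The unit conversion in the log excerpt and the recommended safety margin can be reproduced by hand. A small sketch (the helper names are illustrative, not part of n2p2):

```python
# Sketch: convert the byte counts reported by nnp-scaling to MiB/GiB and
# apply the ~10% safety margin suggested above. Helper names are invented
# for this example.
MIB = 1024 ** 2
GIB = 1024 ** 3

def human_readable(num_bytes):
    """Format a byte count in the same style as the log excerpt above."""
    return f"{num_bytes} bytes ({num_bytes / MIB:.2f} MiB = {num_bytes / GIB:.2f} GiB)"

def with_margin(num_bytes, margin=0.10):
    """Minimum estimate plus a safety margin (at least 10% recommended)."""
    return int(num_bytes * (1.0 + margin))

local = 12459839800  # "Memory for local structures" from the log above
print(human_readable(local))               # matches the log line
print(human_readable(with_margin(local)))  # estimate with 10% headroom
```

Comparing the padded estimate against the available RAM per node helps decide whether memorize_symfunc_results is feasible for a given training setup.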