CLI

The recommended way for users to interact with NeuralXC is through the command line interface (CLI). While the self-consistent training command neuralxc sc introduced above should cover 95% of all use cases, sometimes more fine-grained control over the workflow is warranted. This can be achieved with the following commands, which fall into three categories: Data, Model and Other.

Data

Commands in this category deal with managing data, i.e. input features along with target energies. All data-related commands are prefaced with:

neuralxc data

so that, e.g. in order to add data to an .hdf5 file, the command:

neuralxc data add <args>

needs to be executed.

We provide a complete list of these commands, with required arguments shown within the command prompt and optional arguments listed underneath.

add

add <hdf5> <system> <method> <add>

Store data to file <hdf5> under the group <system>/<method>/. The quantity to add is specified as <add> and can be either energy, forces or density. If adding energies or forces, --traj <str> needs to be set to point to an .xyz or .traj file containing the required quantity. If adding densities, --density <str> needs to be set to the path where density projections are stored.

--zero <float> Shift energies by this value. If not set, energies are shifted so that the minimum of the dataset is zero.

Example: neuralxc data add data.hdf5 water PBE energy --traj water_pbe.traj
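The effect of --zero can be pictured with a short Python sketch. This is not NeuralXC's implementation: the function name shift_energies is hypothetical, and the sketch assumes that "shift by this value" means subtracting the value from every energy. It only restates the behaviour described above.

```python
from typing import List, Optional


def shift_energies(energies: List[float], zero: Optional[float] = None) -> List[float]:
    """Return energies measured relative to a reference value.

    Assumption: a given `zero` is subtracted from every energy. If `zero`
    is None, the dataset minimum is used, so the lowest energy becomes 0.
    """
    if not energies:
        return []
    reference = min(energies) if zero is None else zero
    return [e - reference for e in energies]


energies = [-76.5, -76.25, -76.0]
print(shift_energies(energies))         # default: the dataset minimum is the reference
print(shift_energies(energies, -76.0))  # explicit reference value
```

Shifting by the dataset minimum keeps the stored numbers small, but it also means that two datasets shifted independently do not share a common reference unless the same --zero value is passed for both.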

delete

delete <hdf5> <group>

Delete data from file <hdf5> within group <group>. Cannot be reversed.

Example: neuralxc data delete data.hdf5 water/PBE

split

split <hdf5> <group> <label>

Split a slice off the data in file <hdf5> within group <group> and store it under the name <label>. The slice can be given in numpy notation by setting --slice <str>. --comp <str> stores the complementary slice as its own dataset under the given name.

Example: neuralxc data split data.hdf5 water/PBE training --slice :15 --comp testing

This splits off the first 15 data points from water/PBE stored in data.hdf5, stores them as training and stores the remaining data points as testing.
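The relationship between a slice and its complement can be illustrated with plain Python. The helper names parse_slice and split_indices are hypothetical and are not part of NeuralXC. The sketch handles only colon-separated forms such as :15, 10: or ::2, and which forms --slice actually accepts is not specified here.

```python
from typing import List, Tuple


def parse_slice(text: str) -> slice:
    """Turn a string such as ':15', '10:' or '::2' into a slice object."""
    parts = text.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"not a slice: {text!r}")
    # Empty fields mean "use the default", exactly as in data[:15].
    fields = [int(p) if p.strip() else None for p in parts]
    start, stop, step = (fields + [None])[:3]
    return slice(start, stop, step)


def split_indices(n: int, text: str) -> Tuple[List[int], List[int]]:
    """Return (selected, complement) indices for a dataset of length n."""
    selected = list(range(n))[parse_slice(text)]
    chosen = set(selected)
    complement = [i for i in range(n) if i not in chosen]
    return selected, complement


# 20 data points, first 15 selected, the remaining 5 form the complement.
training, testing = split_indices(20, ":15")
print(len(training), len(testing))
```

Every index ends up in exactly one of the two lists, which is the property that makes the complementary dataset usable as a held-out test set.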

sample

sample <config> <size> --hdf5 <hdf5> --dest <dest>

Sample <size> data points from <hdf5> using k-means clustering in feature space, with features computed for the basis set defined in <config>, and save the result to <dest>.

Example: neuralxc data sample config.json 50 --hdf5 data.hdf5/water/PBE --dest sample_50.npy
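The idea behind sampling by k-means clustering can be sketched in pure Python. This is an illustration under stated assumptions, not NeuralXC's code: the function kmeans_sample is hypothetical, it uses plain Lloyd iterations with random initialization, and it assumes that each cluster is represented by the data point closest to its centroid. The actual selection rule used by neuralxc data sample may differ.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sample(features: List[Point], size: int,
                  n_iter: int = 20, seed: int = 0) -> List[int]:
    """Pick `size` representative row indices via k-means (Lloyd's algorithm)."""
    if not 0 < size <= len(features):
        raise ValueError("size must be between 1 and the number of data points")
    rng = random.Random(seed)
    # Initialize centroids on randomly chosen data points.
    centroids = [list(features[i]) for i in rng.sample(range(len(features)), size)]
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters: List[List[Point]] = [[] for _ in range(size)]
        for p in features:
            nearest = min(range(size), key=lambda k: sq_dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for k, members in enumerate(clusters):
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    # Represent each cluster by the not-yet-chosen point closest to its centroid.
    chosen: List[int] = []
    for c in centroids:
        candidates = [i for i in range(len(features)) if i not in chosen]
        chosen.append(min(candidates, key=lambda i: sq_dist(features[i], c)))
    return sorted(chosen)


features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0], [10.0, 0.2]]
indices = kmeans_sample(features, 3)
print(indices)  # three distinct row indices into `features`
```

Compared with uniform random sampling, selecting one point per cluster tends to spread the sample over the regions of feature space that the data covers.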

Model

Commands in this category deal with the machine learning model. They are prefaced with:

neuralxc

fit

fit <config> <hyper> --hdf5 <path> <baseline> <reference>

Use features generated with basis defined in <config> and hyperparameters defined in <hyper> to fit a neuralxc model that corrects <baseline> data in hdf5 file found at <path> using targets given by <reference>.

--model <str> Continue training the model found at this location.

--hyperopt If set, conduct hyperparameter optimization.

Example: neuralxc fit config.json hyper.json --hdf5 data.hdf5 water/PBE water/CCSD

eval

eval --hdf5 <path> <baseline> <reference>

Evaluate accuracy of <baseline> with respect to <reference>

--model <str> If set, correct baseline with this model before evaluation.

--plot Create error histogram and correlation plot.

--sample <str> Only evaluate on this sample (.npy file containing integer indices).

--keep_mean If set, don’t subtract parallelity errors.

Example: neuralxc eval --hdf5 data.hdf5 water/PBE water/CCSD --model best_model
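The kind of statistics such an evaluation produces can be sketched as follows. This is not the code behind neuralxc eval: the function error_metrics is hypothetical, and the sketch assumes that the error subtracted by default is the mean (constant) offset between the two sets of energies, which --keep_mean would retain.

```python
import math
from typing import Dict, List


def error_metrics(predicted: List[float], reference: List[float],
                  keep_mean: bool = False) -> Dict[str, float]:
    """Compare two equally long lists of energies.

    Assumption: unless keep_mean is True, the mean error (a constant
    offset between the two methods) is removed before computing statistics.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - r for p, r in zip(predicted, reference)]
    mean = sum(errors) / len(errors)
    if not keep_mean:
        errors = [e - mean for e in errors]
    return {
        "mean_error": mean,
        "mae": sum(abs(e) for e in errors) / len(errors),
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "max_abs": max(abs(e) for e in errors),
    }


baseline = [1.0, 2.0, 3.0]
reference = [0.5, 1.5, 2.5]
print(error_metrics(baseline, reference))                  # offset removed
print(error_metrics(baseline, reference, keep_mean=True))  # offset kept
```

In this toy example the baseline differs from the reference only by a constant, so the errors vanish once the offset is removed and remain at the size of the offset when it is kept.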

predict

predict --model <model> --hdf5 <hdf5>

Predict energy corrections to data in <hdf5> using <model>.

--dest <str> Store to this location (default: prediction.npy)

Example: neuralxc predict --model best_model --hdf5 data.hdf5/water/PBE

serialize

serialize <in_path> <jit_path>

Serialize model found at <in_path> and store to <jit_path> to be used with libnxc.

--as_radial serializes model to be used with radial grids.

Other

Commands in this category deal with running and processing SCF calculations. They are prefaced with:

neuralxc

engine

engine <config> <xyz>

Run the engine (electronic structure code) specified in <config> for all molecules contained in <xyz>. Stores the results (energies) of the calculations in results.traj.

--workdir <str> Specify the working directory. Default is to use .tmp/ and delete it after the calculation has finished.

default

default <kind>

Generates a default input file containing either basis set information (<kind> = pre) or hyperparameters (<kind> = hyper).

preprocess

pre <config> --xyz <xyz> --dest <dest> --srcdir <srcdir>

Preprocesses (projects) electron densities found at <srcdir> for systems found in the <xyz> .xyz or .traj file and stores features in <dest> (an hdf5 file path with group name).

Example: neuralxc pre config.json --xyz water_pbe.traj --dest data.hdf5/water/PBE --srcdir workdir/
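When several of the commands in this section are run one after another, it can be convenient to drive them from a script. The following Python sketch uses only the standard library. The command lines are copied from the examples for add, fit and eval, all file names are placeholders, and the helper run_all is hypothetical. Whether these particular steps form a complete workflow depends on the data at hand, so the ordering should be read as an illustration only.

```python
import shlex
import subprocess
from typing import List

# Command lines copied from the examples in this section. The file names
# (data.hdf5, config.json, hyper.json, best_model, ...) are placeholders.
COMMANDS = [
    "neuralxc data add data.hdf5 water PBE energy --traj water_pbe.traj",
    "neuralxc fit config.json hyper.json --hdf5 data.hdf5 water/PBE water/CCSD",
    "neuralxc eval --hdf5 data.hdf5 water/PBE water/CCSD --model best_model",
]


def run_all(commands: List[str], dry_run: bool = True) -> List[List[str]]:
    """Split each command line into argv form and optionally execute it."""
    argvs = [shlex.split(c) for c in commands]
    for argv in argvs:
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)
    return argvs


argvs = run_all(COMMANDS)  # dry run: nothing is executed
```

Calling run_all(COMMANDS, dry_run=False) executes the commands in order, provided that neuralxc is installed and the referenced files exist.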