n2p2 - A neural network potential package
nnp::GradientDescent Class Reference

Weight updates based on simple gradient descent methods. More...

#include <GradientDescent.h>


Public Types

enum  DescentType { DT_FIXED, DT_ADAM }
 Enumerate different gradient descent variants. More...
 

Public Member Functions

 GradientDescent (std::size_t const sizeState, DescentType const type)
 GradientDescent class constructor. More...
 
virtual ~GradientDescent ()
 Destructor. More...
 
void setState (double *state)
 Set pointer to current state. More...
 
void setError (double const *const error, std::size_t const size=1)
 Set pointer to current error vector. More...
 
void setJacobian (double const *const jacobian, std::size_t const columns=1)
 Set pointer to current Jacobi matrix. More...
 
void update ()
 Perform connection update. More...
 
void setParametersFixed (double const eta)
 Set parameters for fixed step gradient descent algorithm. More...
 
void setParametersAdam (double const eta, double const beta1, double const beta2, double const epsilon)
 Set parameters for Adam algorithm. More...
 
std::string status (std::size_t epoch) const
 Status report. More...
 
std::vector< std::string > statusHeader () const
 Header for status report file. More...
 
std::vector< std::string > info () const
 Information about gradient descent settings. More...
 
- Public Member Functions inherited from nnp::Updater
virtual void setState (double *state)=0
 Set pointer to current state. More...
 
virtual void setError (double const *const error, std::size_t const size=1)=0
 Set pointer to current error vector. More...
 
virtual void setJacobian (double const *const jacobian, std::size_t const columns=1)=0
 Set pointer to current Jacobi matrix. More...
 
virtual void update ()=0
 Perform single update of state vector. More...
 
virtual std::string status (std::size_t epoch) const =0
 Status report. More...
 
virtual std::vector< std::string > statusHeader () const =0
 Header for status report file. More...
 
virtual std::vector< std::string > info () const =0
 Information about this updater. More...
 
virtual void setupTiming (std::string const &prefix="upd")
 Activate detailed timing. More...
 
virtual void resetTimingLoop ()
 Start a new timing loop (e.g. More...
 
virtual std::map< std::string, Stopwatch > getTiming () const
 Return timings gathered in stopwatch map. More...
 

Private Attributes

DescentType type
 
double eta
 Learning rate \(\eta\). More...
 
double beta1
 Decay rate 1 (Adam). More...
 
double beta2
 Decay rate 2 (Adam). More...
 
double epsilon
 Small scalar. More...
 
double eta0
 Initial learning rate. More...
 
double beta1t
 Decay rate 1 to the power of t (Adam). More...
 
double beta2t
 Decay rate 2 to the power of t (Adam). More...
 
double * state
 State vector pointer. More...
 
double const * error
 Error pointer (single double value). More...
 
double const * gradient
 Gradient vector pointer. More...
 
std::vector< double > m
 First moment estimate (Adam). More...
 
std::vector< double > v
 Second moment estimate (Adam). More...
 

Additional Inherited Members

- Protected Member Functions inherited from nnp::Updater
 Updater (std::size_t const sizeState)
 Constructor. More...
 
- Protected Attributes inherited from nnp::Updater
bool timing
 Whether detailed timing is enabled. More...
 
bool timingReset
 Internal loop timer reset switch. More...
 
std::size_t sizeState
 Number of neural network connections (weights + biases). More...
 
std::string prefix
 Prefix for timing stopwatches. More...
 
std::map< std::string, Stopwatch > sw
 Stopwatch map for timing. More...
 

Detailed Description

Weight updates based on simple gradient descent methods.

Definition at line 29 of file GradientDescent.h.

Member Enumeration Documentation

◆ DescentType

Enumerate different gradient descent variants.

Enumerator
DT_FIXED 

Fixed step size.

DT_ADAM 

Adaptive moment estimation (Adam).

Definition at line 33 of file GradientDescent.h.

enum DescentType
{
    /// Fixed step size.
    DT_FIXED,
    /// Adaptive moment estimation (Adam).
    DT_ADAM
};

Constructor & Destructor Documentation

◆ GradientDescent()

GradientDescent::GradientDescent (std::size_t const sizeState,
                                  DescentType const type)

GradientDescent class constructor.

Parameters
[in] sizeState Number of neural network connections (weights and biases).
[in] type      Descent type used for step size.

Definition at line 25 of file GradientDescent.cpp.

GradientDescent::GradientDescent(std::size_t const sizeState,
                                 DescentType const type) :
    Updater (sizeState),
    eta     (0.0),
    beta1   (0.0),
    beta2   (0.0),
    epsilon (0.0),
    beta1t  (0.0),
    beta2t  (0.0),
    state   (NULL),
    error   (NULL),
    gradient(NULL)
{
    if (!(type == DT_FIXED ||
          type == DT_ADAM))
    {
        throw runtime_error("ERROR: Unknown GradientDescent type.\n");
    }

    if (sizeState < 1)
    {
        throw runtime_error("ERROR: Wrong GradientDescent dimensions.\n");
    }

    this->type = type;

    if (type == DT_ADAM)
    {
        m.resize(sizeState, 0.0);
        v.resize(sizeState, 0.0);
    }
}

References DT_ADAM, DT_FIXED, m, nnp::Updater::sizeState, type, and v.
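
A minimal usage sketch (not taken from n2p2 itself; the driver function, weight count, and step size below are illustrative) showing how a GradientDescent instance is constructed and wired to external state, error, and gradient buffers through the Updater interface:

#include <cstddef>
#include <vector>
#include "GradientDescent.h"

void trainingStepSketch()
{
    std::size_t const n = 100;            // illustrative number of weights
    std::vector<double> weights(n, 0.0);  // state vector, updated in place
    double error = 0.0;                   // single error value (size = 1)
    std::vector<double> gradient(n, 0.0); // derivatives of error w.r.t. weights

    nnp::GradientDescent gd(n, nnp::GradientDescent::DT_FIXED);
    gd.setParametersFixed(1.0E-4);        // illustrative step size

    gd.setState(weights.data());          // pointers are stored, not copied
    gd.setError(&error);
    gd.setJacobian(gradient.data());

    // ... compute error and gradient for the current sample, then:
    gd.update();                          // one fixed-step descent update
}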

◆ ~GradientDescent()

virtual nnp::GradientDescent::~GradientDescent ( )
inline, virtual

Destructor.

Definition at line 50 of file GradientDescent.h.

{};

Member Function Documentation

◆ setState()

void GradientDescent::setState ( double *  state)
virtual

Set pointer to current state.

Parameters
[in,out] state Pointer to state vector (weights vector); it will be changed in place upon calling update().

Implements nnp::Updater.

Definition at line 58 of file GradientDescent.cpp.

{
    this->state = state;

    return;
}

References state.

◆ setError()

void GradientDescent::setError (double const *const error,
                                std::size_t const size = 1)
virtual

Set pointer to current error vector.

Parameters
[in] error Pointer to error (difference between reference and neural network potential output).
[in] size  Number of error vector entries.

Implements nnp::Updater.

Definition at line 65 of file GradientDescent.cpp.

{
    this->error = error;

    return;
}

References error.

◆ setJacobian()

void GradientDescent::setJacobian (double const *const jacobian,
                                   std::size_t const columns = 1)
virtual

Set pointer to current Jacobi matrix.

Parameters
[in] jacobian Derivatives of error with respect to weights.
[in] columns  Number of gradients provided.
Note
If there are \(m\) errors and \(n\) weights, the Jacobi matrix is an \(n \times m\) matrix stored in column-major order.

Implements nnp::Updater.

Definition at line 73 of file GradientDescent.cpp.

{
    this->gradient = jacobian;

    return;
}

References gradient.
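
As a sketch of the column-major convention described in the note above (the helper below is not part of n2p2; the name and signature are made up for illustration):

#include <cstddef>

// With n weights and m errors, column j of the Jacobi matrix holds the
// full gradient of error j, so the derivative of error j with respect
// to weight i sits at offset i + j * n.
double jacobiEntry(double const* jacobian,
                   std::size_t i,  // weight index, 0 <= i < n
                   std::size_t j,  // error index,  0 <= j < m
                   std::size_t n)  // number of weights
{
    return jacobian[i + j * n];
}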

◆ update()

void GradientDescent::update ( )
virtual

Perform connection update.

Update the connections via the selected gradient descent variant (fixed step or Adam).

Implements nnp::Updater.

Definition at line 81 of file GradientDescent.cpp.

{
    if (type == DT_FIXED)
    {
        for (std::size_t i = 0; i < sizeState; ++i)
        {
            state[i] -= eta * (*error) * -gradient[i];
        }
    }
    else if (type == DT_ADAM)
    {
        for (std::size_t i = 0; i < sizeState; ++i)
        {
            double const g = (*error) * -gradient[i];
            m[i] = beta1 * m[i] + (1.0 - beta1) * g;
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g;

            // Standard implementation
            // (Algorithm 1 in publication).
            //double const mhat = m[i] / (1.0 - beta1t);
            //double const vhat = v[i] / (1.0 - beta2t);
            //state[i] -= eta * mhat / (sqrt(vhat) + epsilon);

            // Faster (?) alternative
            // (see last paragraph in Section 2 of publication).
            // This is actually only marginally faster
            // (fewer statements, but two sqrt() calls)!
            eta = eta0 * sqrt(1 - beta2t) / (1 - beta1t);
            state[i] -= eta * m[i] / (sqrt(v[i]) + epsilon);
        }

        // Update betas.
        beta1t *= beta1;
        beta2t *= beta2;
    }

    return;
}

References beta1, beta1t, beta2, beta2t, DT_ADAM, DT_FIXED, epsilon, eta, eta0, gradient, m, nnp::Updater::sizeState, state, type, and v.
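
In formulas (a direct transcription of the code above, using the member names of this class): the fixed-step branch performs \(w_i \leftarrow w_i - \eta \, e \, (-g_i)\), where \(e\) is the single error value and \(g_i\) the i-th gradient component. The Adam branch accumulates \(m_i \leftarrow \beta_1 m_i + (1 - \beta_1) g\) and \(v_i \leftarrow \beta_2 v_i + (1 - \beta_2) g^2\) with \(g = e \, (-g_i)\), then applies \(w_i \leftarrow w_i - \eta_t \, m_i / (\sqrt{v_i} + \epsilon)\) with the rescaled step size \(\eta_t = \eta_0 \sqrt{1 - \beta_2^t} / (1 - \beta_1^t)\). Folding the bias corrections \(\hat{m}_i = m_i / (1 - \beta_1^t)\) and \(\hat{v}_i = v_i / (1 - \beta_2^t)\) into \(\eta_t\) this way reproduces the standard update \(w_i \leftarrow w_i - \eta_0 \, \hat{m}_i / (\sqrt{\hat{v}_i} + \epsilon)\) up to a rescaling of \(\epsilon\), as noted in the last paragraph of Section 2 of the Adam publication.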

◆ setParametersFixed()

void GradientDescent::setParametersFixed ( double const  eta)

Set parameters for fixed step gradient descent algorithm.

Parameters
[in] eta Step size, i.e. the fraction of the gradient subtracted from the current weights.

Definition at line 120 of file GradientDescent.cpp.

{
    this->eta = eta;

    return;
}

References eta.

Referenced by nnp::Training::setupTraining().

◆ setParametersAdam()

void GradientDescent::setParametersAdam (double const eta,
                                         double const beta1,
                                         double const beta2,
                                         double const epsilon)

Set parameters for Adam algorithm.

Parameters
[in] eta     Step size (corresponds to \(\alpha\) in the Adam publication).
[in] beta1   Decay rate 1 (first moment).
[in] beta2   Decay rate 2 (second moment).
[in] epsilon Small scalar.

Definition at line 127 of file GradientDescent.cpp.

{
    this->eta = eta;
    this->beta1 = beta1;
    this->beta2 = beta2;
    this->epsilon = epsilon;

    eta0 = eta;
    beta1t = beta1;
    beta2t = beta2;

    return;
}

References beta1, beta1t, beta2, beta2t, epsilon, eta, and eta0.

Referenced by nnp::Training::setupTraining().

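As an illustrative call (the values below are the defaults suggested in the Adam publication, not values prescribed by n2p2; gd is assumed to be a GradientDescent instance constructed with DT_ADAM):

// Suggested defaults from the Adam publication (illustrative only).
gd.setParametersAdam(1.0E-3,   // eta:     step size (alpha)
                     0.9,      // beta1:   decay rate, first moment
                     0.999,    // beta2:   decay rate, second moment
                     1.0E-8);  // epsilon: small scalar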

◆ status()

string GradientDescent::status ( std::size_t  epoch) const
virtual

Status report.

Parameters
[in] epoch Current epoch.
Returns
Line with current status information.

Implements nnp::Updater.

Definition at line 144 of file GradientDescent.cpp.

{
    string s = strpr("%10zu %16.8E", epoch, eta);

    if (type == DT_ADAM)
    {
        double meanm = 0.0;
        double meanv = 0.0;
        for (std::size_t i = 0; i < sizeState; ++i)
        {
            meanm += abs(m[i]);
            meanv += abs(v[i]);
        }
        meanm /= sizeState;
        meanv /= sizeState;
        s += strpr(" %16.8E %16.8E %16.8E %16.8E",
                   beta1t, beta2t, meanm, meanv);
    }
    s += '\n';

    return s;
}

References beta1t, beta2t, DT_ADAM, eta, m, nnp::Updater::sizeState, nnp::strpr(), type, and v.
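
Given the format strings above, a DT_FIXED status line looks like this (illustrative values for epoch 5 and \(\eta = 10^{-3}\)):

         5   1.00000000E-03

A DT_ADAM line appends beta1t, beta2t, and the mean absolute first and second moment estimates, each in the same %16.8E format.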

◆ statusHeader()

vector< string > GradientDescent::statusHeader ( ) const
virtual

Header for status report file.

Returns
Vector with header lines.

Implements nnp::Updater.

Definition at line 167 of file GradientDescent.cpp.

{
    vector<string> header;

    vector<string> title;
    vector<string> colName;
    vector<string> colInfo;
    vector<size_t> colSize;
    title.push_back("Gradient descent status report.");
    colSize.push_back(10);
    colName.push_back("epoch");
    colInfo.push_back("Training epoch.");
    colSize.push_back(16);
    colName.push_back("eta");
    colInfo.push_back("Step size.");
    if (type == DT_ADAM)
    {
        colSize.push_back(16);
        colName.push_back("beta1t");
        colInfo.push_back("Decay rate 1 to the power of t.");
        colSize.push_back(16);
        colName.push_back("beta2t");
        colInfo.push_back("Decay rate 2 to the power of t.");
        colSize.push_back(16);
        colName.push_back("mag_m");
        colInfo.push_back("Mean of absolute first momentum (m).");
        colSize.push_back(16);
        colName.push_back("mag_v");
        colInfo.push_back("Mean of absolute second momentum (v).");
    }
    header = createFileHeader(title, colSize, colName, colInfo);

    return header;
}

References nnp::createFileHeader(), DT_ADAM, and type.

◆ info()

vector< string > GradientDescent::info ( ) const
virtual

Information about gradient descent settings.

Returns
Vector with info lines.

Implements nnp::Updater.

Definition at line 202 of file GradientDescent.cpp.

{
    vector<string> v;

    if (type == DT_FIXED)
    {
        v.push_back(strpr("GradientDescentType::DT_FIXED (%d)\n", type));
        v.push_back(strpr("sizeState = %zu\n", sizeState));
        v.push_back(strpr("eta = %12.4E\n", eta));
    }
    else if (type == DT_ADAM)
    {
        v.push_back(strpr("GradientDescentType::DT_ADAM (%d)\n", type));
        v.push_back(strpr("sizeState = %zu\n", sizeState));
        v.push_back(strpr("eta = %12.4E\n", eta));
        v.push_back(strpr("beta1 = %12.4E\n", beta1));
        v.push_back(strpr("beta2 = %12.4E\n", beta2));
        v.push_back(strpr("epsilon = %12.4E\n", epsilon));
    }

    return v;
}

References beta1, beta2, DT_ADAM, DT_FIXED, epsilon, eta, nnp::Updater::sizeState, nnp::strpr(), type, and v.

Member Data Documentation

◆ type

DescentType nnp::GradientDescent::type
private

Definition at line 118 of file GradientDescent.h.

Referenced by GradientDescent(), info(), status(), statusHeader(), and update().

◆ eta

double nnp::GradientDescent::eta
private

Learning rate \(\eta\).

Definition at line 120 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), setParametersFixed(), status(), and update().

◆ beta1

double nnp::GradientDescent::beta1
private

Decay rate 1 (Adam).

Definition at line 122 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), and update().

◆ beta2

double nnp::GradientDescent::beta2
private

Decay rate 2 (Adam).

Definition at line 124 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), and update().

◆ epsilon

double nnp::GradientDescent::epsilon
private

Small scalar.

Definition at line 126 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), and update().

◆ eta0

double nnp::GradientDescent::eta0
private

Initial learning rate.

Definition at line 128 of file GradientDescent.h.

Referenced by setParametersAdam(), and update().

◆ beta1t

double nnp::GradientDescent::beta1t
private

Decay rate 1 to the power of t (Adam).

Definition at line 130 of file GradientDescent.h.

Referenced by setParametersAdam(), status(), and update().

◆ beta2t

double nnp::GradientDescent::beta2t
private

Decay rate 2 to the power of t (Adam).

Definition at line 132 of file GradientDescent.h.

Referenced by setParametersAdam(), status(), and update().

◆ state

double* nnp::GradientDescent::state
private

State vector pointer.

Definition at line 134 of file GradientDescent.h.

Referenced by setState(), and update().

◆ error

double const* nnp::GradientDescent::error
private

Error pointer (single double value).

Definition at line 136 of file GradientDescent.h.

Referenced by setError().

◆ gradient

double const* nnp::GradientDescent::gradient
private

Gradient vector pointer.

Definition at line 138 of file GradientDescent.h.

Referenced by setJacobian(), and update().

◆ m

std::vector<double> nnp::GradientDescent::m
private

First moment estimate (Adam).

Definition at line 140 of file GradientDescent.h.

Referenced by GradientDescent(), status(), and update().

◆ v

std::vector<double> nnp::GradientDescent::v
private

Second moment estimate (Adam).

Definition at line 142 of file GradientDescent.h.

Referenced by GradientDescent(), info(), status(), and update().


The documentation for this class was generated from the following files:

GradientDescent.h
GradientDescent.cpp