n2p2 - A neural network potential package
nnp::GradientDescent Class Reference

Weight updates based on simple gradient descent methods. More...

#include <GradientDescent.h>


Public Types

enum  DescentType { DT_FIXED, DT_ADAM }
 Enumerate different gradient descent variants. More...
 

Public Member Functions

 GradientDescent (std::size_t const sizeState, DescentType const type)
 GradientDescent class constructor. More...
 
virtual ~GradientDescent ()
 Destructor. More...
 
void setState (double *state)
 Set pointer to current state. More...
 
void setError (double const *const error, std::size_t const size=1)
 Set pointer to current error vector. More...
 
void setJacobian (double const *const jacobian, std::size_t const columns=1)
 Set pointer to current Jacobi matrix. More...
 
void update ()
 Perform connection update. More...
 
void setParametersFixed (double const eta)
 Set parameters for fixed step gradient descent algorithm. More...
 
void setParametersAdam (double const eta, double const beta1, double const beta2, double const epsilon)
 Set parameters for Adam algorithm. More...
 
std::string status (std::size_t epoch) const
 Status report. More...
 
std::vector< std::string > statusHeader () const
 Header for status report file. More...
 
std::vector< std::string > info () const
 Information about gradient descent settings. More...
 
- Public Member Functions inherited from nnp::Updater
virtual void setState (double *state)=0
 Set pointer to current state. More...
 
virtual void setError (double const *const error, std::size_t const size=1)=0
 Set pointer to current error vector. More...
 
virtual void setJacobian (double const *const jacobian, std::size_t const columns=1)=0
 Set pointer to current Jacobi matrix. More...
 
virtual void update ()=0
 Perform single update of state vector. More...
 
virtual std::string status (std::size_t epoch) const =0
 Status report. More...
 
virtual std::vector< std::string > statusHeader () const =0
 Header for status report file. More...
 
virtual std::vector< std::string > info () const =0
 Information about this updater. More...
 
virtual void setupTiming (std::string const &prefix="upd")
 Activate detailed timing. More...
 
virtual void resetTimingLoop ()
 Start a new timing loop (e.g. More...
 
virtual std::map< std::string, Stopwatch > getTiming () const
 Return timings gathered in stopwatch map. More...
 

Private Attributes

DescentType type
 
double eta
 Learning rate \(\eta\). More...
 
double beta1
 Decay rate 1 (Adam). More...
 
double beta2
 Decay rate 2 (Adam). More...
 
double epsilon
 Small scalar. More...
 
double eta0
 Initial learning rate. More...
 
double beta1t
 Decay rate 1 to the power of t (Adam). More...
 
double beta2t
 Decay rate 2 to the power of t (Adam). More...
 
double * state
 State vector pointer. More...
 
double const * error
 Error pointer (single double value). More...
 
double const * gradient
 Gradient vector pointer. More...
 
std::vector< double > m
 First moment estimate (Adam). More...
 
std::vector< double > v
 Second moment estimate (Adam). More...
 

Additional Inherited Members

- Protected Member Functions inherited from nnp::Updater
 Updater (std::size_t const sizeState)
 Constructor. More...
 
- Protected Attributes inherited from nnp::Updater
bool timing
 Whether detailed timing is enabled. More...
 
bool timingReset
 Internal loop timer reset switch. More...
 
std::size_t sizeState
 Number of neural network connections (weights + biases). More...
 
std::string prefix
 Prefix for timing stopwatches. More...
 
std::map< std::string, Stopwatch > sw
 Stopwatch map for timing. More...
 

Detailed Description

Weight updates based on simple gradient descent methods.

Definition at line 29 of file GradientDescent.h.

Member Enumeration Documentation

◆ DescentType

Enumerate different gradient descent variants.

Enumerator
DT_FIXED 

Fixed step size.

DT_ADAM 

Adaptive moment estimation (Adam).

Definition at line 33 of file GradientDescent.h.

enum DescentType
{
    /// Fixed step size.
    DT_FIXED,
    /// Adaptive moment estimation (Adam).
    DT_ADAM
};

Constructor & Destructor Documentation

◆ GradientDescent()

GradientDescent::GradientDescent (std::size_t const sizeState,
                                  DescentType const type)

GradientDescent class constructor.

Parameters
[in] sizeState Number of neural network connections (weights and biases).
[in] type      Descent type used for step size.

Definition at line 25 of file GradientDescent.cpp.

GradientDescent::GradientDescent(std::size_t const sizeState,
                                 DescentType const type) :
    Updater (sizeState),
    eta     (0.0),
    beta1   (0.0),
    beta2   (0.0),
    epsilon (0.0),
    beta1t  (0.0),
    beta2t  (0.0),
    state   (NULL),
    error   (NULL),
    gradient(NULL)
{
    if (!(type == DT_FIXED ||
          type == DT_ADAM))
    {
        throw runtime_error("ERROR: Unknown GradientDescent type.\n");
    }

    if (sizeState < 1)
    {
        throw runtime_error("ERROR: Wrong GradientDescent dimensions.\n");
    }

    this->type = type;

    if (type == DT_ADAM)
    {
        m.resize(sizeState, 0.0);
        v.resize(sizeState, 0.0);
    }
}

References DT_ADAM, DT_FIXED, m, nnp::Updater::sizeState, type, and v.
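
A minimal usage sketch (not taken from n2p2 itself; the driver function, weight count, and step size below are illustrative) showing how a GradientDescent instance is constructed and wired to external state, error, and gradient buffers through the Updater interface:

#include <cstddef>
#include <vector>
#include "GradientDescent.h"

void trainingStepSketch()
{
    std::size_t const n = 100;            // illustrative number of weights
    std::vector<double> weights(n, 0.0);  // state vector, updated in place
    double error = 0.0;                   // single error value (size = 1)
    std::vector<double> gradient(n, 0.0); // derivatives of error w.r.t. weights

    nnp::GradientDescent gd(n, nnp::GradientDescent::DT_FIXED);
    gd.setParametersFixed(1.0E-4);        // illustrative step size

    gd.setState(weights.data());          // pointers are stored, not copied
    gd.setError(&error);
    gd.setJacobian(gradient.data());

    // ... compute error and gradient for the current sample, then:
    gd.update();                          // one fixed-step descent update
}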

◆ ~GradientDescent()

virtual nnp::GradientDescent::~GradientDescent ( )
inline, virtual

Destructor.

Definition at line 50 of file GradientDescent.h.

{};

Member Function Documentation

◆ setState()

void GradientDescent::setState ( double *  state)
virtual

Set pointer to current state.

Parameters
[in,out] state Pointer to state vector (weights vector); it will be changed in place upon calling update().

Implements nnp::Updater.

Definition at line 58 of file GradientDescent.cpp.

{
    this->state = state;

    return;
}

References state.

◆ setError()

void GradientDescent::setError (double const *const error,
                                std::size_t const size = 1)
virtual

Set pointer to current error vector.

Parameters
[in] error Pointer to error (difference between reference and neural network potential output).
[in] size  Number of error vector entries.

Implements nnp::Updater.

Definition at line 65 of file GradientDescent.cpp.

{
    this->error = error;

    return;
}

References error.

◆ setJacobian()

void GradientDescent::setJacobian (double const *const jacobian,
                                   std::size_t const columns = 1)
virtual

Set pointer to current Jacobi matrix.

Parameters
[in] jacobian Derivatives of error with respect to weights.
[in] columns  Number of gradients provided.
Note
If there are \(m\) errors and \(n\) weights, the Jacobi matrix is an \(n \times m\) matrix stored in column-major order.

Implements nnp::Updater.

Definition at line 73 of file GradientDescent.cpp.

{
    this->gradient = jacobian;

    return;
}

References gradient.
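
As a sketch of the column-major convention described in the note above (the helper below is not part of n2p2; the name and signature are made up for illustration):

#include <cstddef>

// With n weights and m errors, column j of the Jacobi matrix holds the
// full gradient of error j, so the derivative of error j with respect
// to weight i sits at offset i + j * n.
double jacobiEntry(double const* jacobian,
                   std::size_t i,  // weight index, 0 <= i < n
                   std::size_t j,  // error index,  0 <= j < m
                   std::size_t n)  // number of weights
{
    return jacobian[i + j * n];
}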

◆ update()

void GradientDescent::update ( )
virtual

Perform connection update.

Update the connections via the selected gradient descent variant (fixed step or Adam).

Implements nnp::Updater.

Definition at line 81 of file GradientDescent.cpp.

{
    if (type == DT_FIXED)
    {
        for (std::size_t i = 0; i < sizeState; ++i)
        {
            state[i] -= eta * (*error) * -gradient[i];
        }
    }
    else if (type == DT_ADAM)
    {
        for (std::size_t i = 0; i < sizeState; ++i)
        {
            double const g = (*error) * -gradient[i];
            m[i] = beta1 * m[i] + (1.0 - beta1) * g;
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g;

            // Standard implementation
            // (Algorithm 1 in publication).
            //double const mhat = m[i] / (1.0 - beta1t);
            //double const vhat = v[i] / (1.0 - beta2t);
            //state[i] -= eta * mhat / (sqrt(vhat) + epsilon);

            // Faster (?) alternative
            // (see last paragraph in Section 2 of publication).
            // This is actually only marginally faster
            // (fewer statements, but two sqrt() calls)!
            eta = eta0 * sqrt(1 - beta2t) / (1 - beta1t);
            state[i] -= eta * m[i] / (sqrt(v[i]) + epsilon);
        }

        // Update betas.
        beta1t *= beta1;
        beta2t *= beta2;
    }

    return;
}

References beta1, beta1t, beta2, beta2t, DT_ADAM, DT_FIXED, epsilon, eta, eta0, gradient, m, nnp::Updater::sizeState, state, type, and v.
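
In formulas (a direct transcription of the code above, using the member names of this class): the fixed-step branch performs \(w_i \leftarrow w_i - \eta \, e \, (-g_i)\), where \(e\) is the single error value and \(g_i\) the i-th gradient component. The Adam branch accumulates \(m_i \leftarrow \beta_1 m_i + (1 - \beta_1) g\) and \(v_i \leftarrow \beta_2 v_i + (1 - \beta_2) g^2\) with \(g = e \, (-g_i)\), then applies \(w_i \leftarrow w_i - \eta_t \, m_i / (\sqrt{v_i} + \epsilon)\) with the rescaled step size \(\eta_t = \eta_0 \sqrt{1 - \beta_2^t} / (1 - \beta_1^t)\). Folding the bias corrections \(\hat{m}_i = m_i / (1 - \beta_1^t)\) and \(\hat{v}_i = v_i / (1 - \beta_2^t)\) into \(\eta_t\) this way reproduces the standard update \(w_i \leftarrow w_i - \eta_0 \, \hat{m}_i / (\sqrt{\hat{v}_i} + \epsilon)\) up to a rescaling of \(\epsilon\), as noted in the last paragraph of Section 2 of the Adam publication.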

◆ setParametersFixed()

void GradientDescent::setParametersFixed ( double const  eta)

Set parameters for fixed step gradient descent algorithm.

Parameters
[in] eta Step size, i.e. the fraction of the gradient subtracted from the current weights.

Definition at line 120 of file GradientDescent.cpp.

{
    this->eta = eta;

    return;
}

References eta.

Referenced by nnp::Training::setupTraining().

◆ setParametersAdam()

void GradientDescent::setParametersAdam (double const eta,
                                         double const beta1,
                                         double const beta2,
                                         double const epsilon)

Set parameters for Adam algorithm.

Parameters
[in] eta     Step size (corresponds to \(\alpha\) in the Adam publication).
[in] beta1   Decay rate 1 (first moment).
[in] beta2   Decay rate 2 (second moment).
[in] epsilon Small scalar.

Definition at line 127 of file GradientDescent.cpp.

{
    this->eta = eta;
    this->beta1 = beta1;
    this->beta2 = beta2;
    this->epsilon = epsilon;

    eta0 = eta;
    beta1t = beta1;
    beta2t = beta2;

    return;
}

References beta1, beta1t, beta2, beta2t, epsilon, eta, and eta0.

Referenced by nnp::Training::setupTraining().

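As an illustrative call (the values below are the defaults suggested in the Adam publication, not values prescribed by n2p2; gd is assumed to be a GradientDescent instance constructed with DT_ADAM):

// Suggested defaults from the Adam publication (illustrative only).
gd.setParametersAdam(1.0E-3,   // eta:     step size (alpha)
                     0.9,      // beta1:   decay rate, first moment
                     0.999,    // beta2:   decay rate, second moment
                     1.0E-8);  // epsilon: small scalar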

◆ status()

string GradientDescent::status ( std::size_t  epoch) const
virtual

Status report.

Parameters
[in] epoch Current epoch.
Returns
Line with current status information.

Implements nnp::Updater.

Definition at line 144 of file GradientDescent.cpp.

{
    string s = strpr("%10zu %16.8E", epoch, eta);

    if (type == DT_ADAM)
    {
        double meanm = 0.0;
        double meanv = 0.0;
        for (std::size_t i = 0; i < sizeState; ++i)
        {
            meanm += abs(m[i]);
            meanv += abs(v[i]);
        }
        meanm /= sizeState;
        meanv /= sizeState;
        s += strpr(" %16.8E %16.8E %16.8E %16.8E",
                   beta1t, beta2t, meanm, meanv);
    }
    s += '\n';

    return s;
}

References beta1t, beta2t, DT_ADAM, eta, m, nnp::Updater::sizeState, nnp::strpr(), type, and v.
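
Given the format strings above, a DT_FIXED status line looks like this (illustrative values for epoch 5 and \(\eta = 10^{-3}\)):

         5   1.00000000E-03

A DT_ADAM line appends beta1t, beta2t, and the mean absolute first and second moment estimates, each in the same %16.8E format.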

◆ statusHeader()

vector< string > GradientDescent::statusHeader ( ) const
virtual

Header for status report file.

Returns
Vector with header lines.

Implements nnp::Updater.

Definition at line 167 of file GradientDescent.cpp.

{
    vector<string> header;

    vector<string> title;
    vector<string> colName;
    vector<string> colInfo;
    vector<size_t> colSize;
    title.push_back("Gradient descent status report.");
    colSize.push_back(10);
    colName.push_back("epoch");
    colInfo.push_back("Training epoch.");
    colSize.push_back(16);
    colName.push_back("eta");
    colInfo.push_back("Step size.");
    if (type == DT_ADAM)
    {
        colSize.push_back(16);
        colName.push_back("beta1t");
        colInfo.push_back("Decay rate 1 to the power of t.");
        colSize.push_back(16);
        colName.push_back("beta2t");
        colInfo.push_back("Decay rate 2 to the power of t.");
        colSize.push_back(16);
        colName.push_back("mag_m");
        colInfo.push_back("Mean of absolute first momentum (m).");
        colSize.push_back(16);
        colName.push_back("mag_v");
        colInfo.push_back("Mean of absolute second momentum (v).");
    }
    header = createFileHeader(title, colSize, colName, colInfo);

    return header;
}

References nnp::createFileHeader(), DT_ADAM, and type.

◆ info()

vector< string > GradientDescent::info ( ) const
virtual

Information about gradient descent settings.

Returns
Vector with info lines.

Implements nnp::Updater.

Definition at line 202 of file GradientDescent.cpp.

{
    vector<string> v;

    if (type == DT_FIXED)
    {
        v.push_back(strpr("GradientDescentType::DT_FIXED (%d)\n", type));
        v.push_back(strpr("sizeState = %zu\n", sizeState));
        v.push_back(strpr("eta = %12.4E\n", eta));
    }
    else if (type == DT_ADAM)
    {
        v.push_back(strpr("GradientDescentType::DT_ADAM (%d)\n", type));
        v.push_back(strpr("sizeState = %zu\n", sizeState));
        v.push_back(strpr("eta = %12.4E\n", eta));
        v.push_back(strpr("beta1 = %12.4E\n", beta1));
        v.push_back(strpr("beta2 = %12.4E\n", beta2));
        v.push_back(strpr("epsilon = %12.4E\n", epsilon));
    }

    return v;
}

References beta1, beta2, DT_ADAM, DT_FIXED, epsilon, eta, nnp::Updater::sizeState, nnp::strpr(), type, and v.

Member Data Documentation

◆ type

DescentType nnp::GradientDescent::type
private

Definition at line 118 of file GradientDescent.h.

Referenced by GradientDescent(), info(), status(), statusHeader(), and update().

◆ eta

double nnp::GradientDescent::eta
private

Learning rate \(\eta\).

Definition at line 120 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), setParametersFixed(), status(), and update().

◆ beta1

double nnp::GradientDescent::beta1
private

Decay rate 1 (Adam).

Definition at line 122 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), and update().

◆ beta2

double nnp::GradientDescent::beta2
private

Decay rate 2 (Adam).

Definition at line 124 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), and update().

◆ epsilon

double nnp::GradientDescent::epsilon
private

Small scalar.

Definition at line 126 of file GradientDescent.h.

Referenced by info(), setParametersAdam(), and update().

◆ eta0

double nnp::GradientDescent::eta0
private

Initial learning rate.

Definition at line 128 of file GradientDescent.h.

Referenced by setParametersAdam(), and update().

◆ beta1t

double nnp::GradientDescent::beta1t
private

Decay rate 1 to the power of t (Adam).

Definition at line 130 of file GradientDescent.h.

Referenced by setParametersAdam(), status(), and update().

◆ beta2t

double nnp::GradientDescent::beta2t
private

Decay rate 2 to the power of t (Adam).

Definition at line 132 of file GradientDescent.h.

Referenced by setParametersAdam(), status(), and update().

◆ state

double* nnp::GradientDescent::state
private

State vector pointer.

Definition at line 134 of file GradientDescent.h.

Referenced by setState(), and update().

◆ error

double const* nnp::GradientDescent::error
private

Error pointer (single double value).

Definition at line 136 of file GradientDescent.h.

Referenced by setError().

◆ gradient

double const* nnp::GradientDescent::gradient
private

Gradient vector pointer.

Definition at line 138 of file GradientDescent.h.

Referenced by setJacobian(), and update().

◆ m

std::vector<double> nnp::GradientDescent::m
private

First moment estimate (Adam).

Definition at line 140 of file GradientDescent.h.

Referenced by GradientDescent(), status(), and update().

◆ v

std::vector<double> nnp::GradientDescent::v
private

Second moment estimate (Adam).

Definition at line 142 of file GradientDescent.h.

Referenced by GradientDescent(), info(), status(), and update().


The documentation for this class was generated from the following files:

GradientDescent.h
GradientDescent.cpp