Exponential smoothing parameter for the first raw moment estimator
Exponential smoothing parameter for the second raw moment estimator
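These are the beta-style smoothing constants of Adam-type optimizers. A minimal sketch of how such exponentially smoothed raw moment estimates are typically maintained (class and field names here are illustrative, not the library's):

{{{
// Sketch: exponentially smoothed raw moment estimates, Adam-style.
// beta1 and beta2 correspond to the two smoothing parameters above.
class MomentEstimates(beta1: Double, beta2: Double) {
  private var m = 0.0 // first raw moment: smoothed gradient
  private var v = 0.0 // second raw moment: smoothed squared gradient

  def update(grad: Double): Unit = {
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
  }
}
}}}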
Compute the ascent direction.
Use the fromSimplex method from WeightsTransformation.
mixture weights
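Because the weights must stay on the probability simplex, the ascent direction is computed in an unconstrained reparametrized space. A hedged sketch of what a softmax-style WeightsTransformation could look like (the library's actual transformation may differ):

{{{
import breeze.linalg.{DenseVector, sum}

// Sketch of a softmax-style weights transformation: fromSimplex maps
// a valid weight vector to unconstrained space, toSimplex maps back.
object SoftmaxWeightsTransformation {
  def fromSimplex(weights: DenseVector[Double]): DenseVector[Double] =
    weights.map(math.log) // inverse of softmax up to an additive constant

  def toSimplex(unconstrained: DenseVector[Double]): DenseVector[Double] = {
    val expd = unconstrained.map(math.exp)
    expd / sum(expd) // normalize so the entries sum to one
  }
}
}}}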
Compute full updates for the model's parameters. Usually this has the form X_t + alpha * direction(X_t), but it differs for some algorithms, e.g. Nesterov's gradient ascent.
Current parameter values
Current batch gradient
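A hedged sketch of the two update shapes just described; alpha (step size) and gamma (momentum) are illustrative names, not the library's:

{{{
// Sketch of the two update shapes: plain ascent and Nesterov-style.
class Updates(alpha: Double, gamma: Double) {
  // Plain ascent step: X_t + alpha * direction(X_t)
  def plainStep(x: Double, direction: Double): Double =
    x + alpha * direction

  // Nesterov's gradient ascent keeps a velocity term and evaluates
  // the gradient ahead of the current point.
  private var velocity = 0.0
  def nesterovStep(x: Double, gradAt: Double => Double): Double = {
    velocity = gamma * velocity + alpha * gradAt(x + gamma * velocity)
    x + velocity
  }
}
}}}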
Wrapper for accelerated gradient ascent utilities
Definitions of algebraic operations for the appropriate data structure, e.g. vector or matrix.
updated parameter values
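The algebraic-operations definitions mentioned above can be read as a small typeclass that lets the same update code run on vectors and matrices alike. A hedged sketch (trait and method names are assumptions):

{{{
import breeze.linalg.{DenseMatrix, DenseVector}

// Sketch: a typeclass abstracting the algebra the optimizer needs.
trait ParamAlgebra[A] {
  def plus(x: A, y: A): A
  def scale(x: A, a: Double): A
}

object ParamAlgebra {
  implicit val vectorAlgebra: ParamAlgebra[DenseVector[Double]] =
    new ParamAlgebra[DenseVector[Double]] {
      def plus(x: DenseVector[Double], y: DenseVector[Double]) = x + y
      def scale(x: DenseVector[Double], a: Double) = x * a
    }

  implicit val matrixAlgebra: ParamAlgebra[DenseMatrix[Double]] =
    new ParamAlgebra[DenseMatrix[Double]] {
      def plus(x: DenseMatrix[Double], y: DenseMatrix[Double]) = x + y
      def scale(x: DenseMatrix[Double], a: Double) = x * a
    }
}
}}}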
Alternative method to set the step size's shrinkage rate. It will be calculated automatically to shrink the step size by half every m iterations.
positive integer
this
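Requiring the step size to halve every m iterations pins the shrinkage rate down: shrinkageRate^m = 1/2, hence shrinkageRate = 0.5^(1/m). A sketch under those assumptions:

{{{
// Sketch: derive shrinkageRate from "halve every m iterations".
class StepSizeSchedule {
  var shrinkageRate: Double = 1.0

  def halveStepSizeEvery(m: Int): this.type = {
    require(m > 0, "m must be a positive integer")
    shrinkageRate = math.pow(0.5, 1.0 / m) // shrinkageRate^m == 0.5
    this
  }
}
}}}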
Step size
Minimum allowed learning rate. Once this lower bound is reached, the learning rate will not shrink any further.
Reset the iteration counter.
Rate at which the learning rate is decreased as the number of iterations grows. After t iterations the learning rate will be shrinkageRate^t * learningRate.
iteration counter
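Putting these pieces together: after t iterations the effective rate is shrinkageRate^t * learningRate, floored at minLearningRate. A hedged closed-form sketch:

{{{
// Sketch: geometric decay of the learning rate, clipped at the floor.
def effectiveLearningRate(learningRate: Double, shrinkageRate: Double,
                          minLearningRate: Double, t: Int): Double =
  math.max(minLearningRate, math.pow(shrinkageRate, t) * learningRate)
}}}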
Use the toSimplex method from WeightsTransformation.
valid mixture weight vector
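Continuing the hypothetical softmax sketch from above, mapping updated unconstrained parameters back to a valid weight vector would look like:

{{{
import breeze.linalg.{DenseVector, sum}

// Usage sketch: project updated parameters back onto the simplex
// (SoftmaxWeightsTransformation is the hypothetical object above).
val updated = DenseVector(0.3, -1.2, 0.5)
val weights = SoftmaxWeightsTransformation.toSimplex(updated)
assert(math.abs(sum(weights) - 1.0) < 1e-12) // entries sum to one
}}}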
Shrink learningRate by a factor of shrinkageRate
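This is the per-iteration counterpart of the closed-form schedule sketched above (field names are illustrative):

{{{
// Sketch: multiplicative shrink, never dropping below the floor.
class LearningRate(var learningRate: Double,
                   val shrinkageRate: Double,
                   val minLearningRate: Double) {
  def shrink(): Unit = {
    learningRate = math.max(minLearningRate, learningRate * shrinkageRate)
  }
}
}}}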
Implementation of ADAMAX. See "Adam: A Method for Stochastic Optimization", Kingma, Diederik P.; Ba, Jimmy, 2014.
Using it is NOT recommended; you should use SGD or its accelerated versions instead.
(Since version gradientgmm >= 1.4) ADAMAX can be unstable for GMM problems and should not be used.
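For reference, the AdaMax update in Kingma & Ba (2014) replaces Adam's smoothed second raw moment with an exponentially weighted infinity norm. A hedged, ascent-flavored scalar sketch; the epsilon guard is an implementation convenience, not part of the paper:

{{{
// Sketch of the AdaMax update (Kingma & Ba, 2014), written as ascent.
class AdaMax(alpha: Double, beta1: Double, beta2: Double) {
  private var m = 0.0 // smoothed first raw moment
  private var u = 0.0 // exponentially weighted infinity norm
  private var t = 0   // iteration counter

  def step(x: Double, grad: Double): Double = {
    t += 1
    m = beta1 * m + (1 - beta1) * grad
    u = math.max(beta2 * u, math.abs(grad))
    // bias-correct the first moment; u needs no bias correction
    x + (alpha / (1 - math.pow(beta1, t))) * m / math.max(u, 1e-12)
  }
}
}}}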