Machine learning is a field that is inextricably intertwined with the field
of optimization. Countless machine learning techniques depend on the
optimization of a given objective function; for instance, classifiers such
as logistic regression, metric learning methods like NCA, manifold learning
algorithms like MVU, and the models of the extremely popular field of deep
learning. Given the attention focused on these problems, fast and practical
optimizers are increasingly important to the field.
There is therefore a real need for a robust, flexible framework in which new
optimizers can be easily developed, and, similarly, for a framework that
allows new objective functions to be easily implemented and optimized with a
variety of possible optimizers. However,
the current landscape of optimization frameworks for machine learning is not
particularly comprehensive. A variety of tools such as Caffe, TensorFlow,
and Keras have optimization frameworks, but they are limited to SGD-type
optimizers and are only able to optimize deep neural networks or related
structures. Thus expressing arbitrary machine learning objective functions
can be difficult or, in some cases, not possible. Other libraries, like
scikit-learn, do have optimizers, but generally not within a coherent
framework, and the implementations are often specific to an individual
machine learning algorithm. At a higher level, general scientific computing
environments such as SciPy and MATLAB provide generic optimizers, but these
are typically not suitable for large-scale machine learning tasks where,
e.g., calculating the full gradient of all of the data may not be feasible.
For more information, see: S. Bhardwaj, R. Curtin, M. Edel, Y. Mentekidis,
and C. Sanderson, "ensmallen: a flexible C++ library for efficient function
optimization"; or see ensmallen.org.
Given this situation, we have developed ensmallen, a flexible optimization
framework that makes it easy to combine nearly any type of optimizer with
nearly any type of objective function. It has allowed us to minimize the
effort necessary both to implement new optimizers and to implement new
machine learning algorithms that depend on optimization.
This visualization allows us to see how a number of popular optimizers perform on different optimization problems. Select a problem to optimize, then select an optimizer and tune its parameters, and watch the steps that the optimizer takes, plotted in red. Note that you can compare how different optimizers perform on a given problem in the second graph: as you try a given problem with more optimizers, the objective function value vs. the number of iterations is plotted for each optimizer.
A plot of the loss reveals distinct properties of each optimizer, each with its own style of convergence.
As intuition suggests, the system has a higher probability of staying in its current state when the stepsize is smaller. As the stepsize goes up, the imbalance becomes stronger. When the stepsize is close to zero, the system stays in the state(s) with the highest cost.
In order to facilitate consistent implementations, we have defined a FunctionType API that describes all the methods that an objective function may implement. ensmallen offers a few variations of this API to cover different function characteristics, leading to several different APIs for different function types, each describing how to implement one of the types of functions f(x) that ensmallen can handle.
Each of these types of objective function requires slightly different methods to be implemented. In some cases, methods can be automatically deduced by the optimizers using template metaprogramming, so the user does not need to implement every method for a given type of objective function.
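As a concrete illustration, here is a minimal sketch of a differentiable objective function written against this API and minimized with ensmallen's L-BFGS optimizer. The class name and the target values are illustrative, not part of ensmallen.

```cpp
#include <ensmallen.hpp>

// A minimal sketch of a differentiable objective: f(x) = || x - target ||^2.
// The class name and target values are illustrative, not part of ensmallen.
class SquaredDistanceFunction
{
 public:
  SquaredDistanceFunction(const arma::mat& target) : target(target) { }

  // Return the objective value at the given coordinates.
  double Evaluate(const arma::mat& coordinates)
  {
    return arma::accu(arma::square(coordinates - target));
  }

  // Store the gradient at the given coordinates in `gradient`.
  void Gradient(const arma::mat& coordinates, arma::mat& gradient)
  {
    gradient = 2.0 * (coordinates - target);
  }

 private:
  arma::mat target;
};

int main()
{
  SquaredDistanceFunction f(arma::mat("1.0; 2.0; 3.0"));

  arma::mat coordinates(3, 1, arma::fill::zeros);
  ens::L_BFGS lbfgs;
  const double bestObjective = lbfgs.Optimize(f, coordinates);
  // `coordinates` now holds the best parameters found, and `bestObjective`
  // holds the corresponding objective value (here, approximately zero).
}
```

Because the function only needs to expose Evaluate() and Gradient(), another full-batch optimizer for differentiable functions, such as ens::GradientDescent, could be swapped in without changing the function class.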
The defaults here are not necessarily good for the given problem, so it is suggested that the values used be tailored to the task at hand. (Use the mouse to drag the plot and to choose the initial parameter.) The global minimum and the optimizer's minimum can be found on the left.
In addition to implementing new objective functions, users can also easily
add new optimizers, provided they implement a simple API. Fortunately, the
requirements for implementing optimizers are much simpler than for objective
functions. An optimizer must implement only the Optimize() method, which
should check that the given FunctionType satisfies the assumptions the
optimizer makes, and then optimize the given function, storing the best set
of parameters in the matrix parameters and returning the best objective value.
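The following is a hedged sketch of what such an optimizer might look like: fixed-step gradient descent written against the differentiable-function API shown earlier. The class name and its parameters are illustrative, and a production optimizer would also statically verify that the given FunctionType provides the methods it needs.

```cpp
#include <ensmallen.hpp>

// A minimal sketch of a new optimizer: plain gradient descent with a fixed
// step size.  The class name and its parameters are illustrative only.
class SimpleGradientDescent
{
 public:
  SimpleGradientDescent(const double stepSize = 0.01,
                        const size_t maxIterations = 1000) :
      stepSize(stepSize), maxIterations(maxIterations) { }

  // Optimize the given differentiable function, storing the best parameters
  // found in `parameters` and returning the final objective value.  A full
  // implementation would first check (e.g. with a static assertion) that
  // FunctionType provides Evaluate() and Gradient().
  template<typename FunctionType>
  double Optimize(FunctionType& function, arma::mat& parameters)
  {
    arma::mat gradient(parameters.n_rows, parameters.n_cols);
    for (size_t i = 0; i < maxIterations; ++i)
    {
      function.Gradient(parameters, gradient);
      parameters -= stepSize * gradient;
    }

    return function.Evaluate(parameters);
  }

 private:
  double stepSize;
  size_t maxIterations;
};
```

Such an optimizer could then be used in place of L-BFGS in the earlier example, e.g. `SimpleGradientDescent opt(0.1, 500); opt.Optimize(f, coordinates);`.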
Thanks to this simple abstraction, we have been able to provide support for a
large set of diverse optimizers and objective functions. See
https://www.ensmallen.org
for available optimization methods.