Alexander Lobanov
Publications:
Bychkov G. K., Dvinskikh D. M., Antsiferova A. V., Gasnikov A. V., Lobanov A. V.
Accelerated Zero-Order SGD under High-Order Smoothness and Overparameterized Regime
2024, Vol. 20, no. 5, pp. 759-788
Abstract
We present a novel gradient-free algorithm to solve a convex stochastic optimization problem,
such as those encountered in medicine, physics, and machine learning (e.g., the adversarial
multi-armed bandit problem), where the objective function can only be computed through numerical
simulation, either as the result of a real experiment or as feedback from an adversary's function
evaluations. Thus, we suppose that only black-box access to the function
values of the objective is available, possibly corrupted by adversarial noise: deterministic or
stochastic. The noisy setup can arise naturally from modeling randomness within a simulation
or by computer discretization, or when exact values of the function are forbidden due to privacy
issues, or when solving nonconvex problems as convex ones with an inexact function oracle.
By exploiting higher-order smoothness, satisfied, e.g., by logistic regression, we improve the
performance of zero-order methods developed under the assumption of classical smoothness (i.e., a
Lipschitz gradient). The proposed algorithm enjoys optimal oracle complexity and is designed for
the overparameterized setup, i.e., when the number of model parameters is much larger than the
size of the training dataset. Overparameterized models fit the training data perfectly while still
generalizing well and outperforming underparameterized models on unseen data. We provide
convergence guarantees for the proposed algorithm under
both types of noise. Moreover, we estimate the maximum permissible adversarial noise level
that maintains the desired accuracy in the Euclidean setup, and then we extend our results to
a non-Euclidean setup. Our theoretical results are verified using the logistic regression problem.
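As a rough illustration of the kind of estimator such gradient-free algorithms build on, the sketch below implements a kernel-smoothed two-point zero-order gradient estimate and plugs it into plain SGD. It is a sketch under assumptions: the kernel, smoothing radius, and step size are illustrative choices, not the schedules analyzed in the paper.

```python
import numpy as np

def zo_gradient_estimate(f, x, gamma, rng):
    """Kernel-smoothed two-point zero-order gradient estimate (sketch).

    f     : black-box objective returning (possibly noisy) function values
    x     : current point, shape (d,)
    gamma : smoothing radius (assumption: chosen against the noise level)
    """
    d = x.shape[0]
    e = rng.normal(size=d)
    e /= np.linalg.norm(e)                      # random direction on the unit sphere
    r = rng.uniform(-1.0, 1.0)                  # scalar argument of the smoothing kernel
    # Illustrative kernel with the zero-moment properties used to exploit
    # higher-order smoothness (not necessarily the paper's exact kernel).
    K = (15.0 * r / 8.0) * (5.0 - 7.0 * r ** 2)
    diff = f(x + gamma * r * e) - f(x - gamma * r * e)
    return (d / (2.0 * gamma)) * diff * K * e

def zo_sgd(f, x0, steps=1000, lr=0.1, gamma=1e-3, seed=0):
    """Plain zero-order SGD driven by the estimator above (illustration only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x -= lr * zo_gradient_estimate(f, x, gamma, rng)
    return x
```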
Smirnov V. N., Kazistova K. M., Sudakov I. A., Leplat V., Gasnikov A. V., Lobanov A. V.
Asymptotic Analysis of the Ruppert–Polyak Averaging for Stochastic Order Oracle
2024, Vol. 20, no. 5, pp. 961-978
Abstract
Black-box optimization, a rapidly growing field, faces challenges due to limited knowledge of
the objective function’s internal mechanisms. One promising approach to addressing this is the
Stochastic Order Oracle Concept. This concept, similar to other Order Oracle Concepts, relies
solely on relative comparisons of function values without requiring access to the exact values.
This paper presents a novel, improved estimate of the covariance matrix describing the asymptotic
convergence of the Stochastic Order Oracle Concept. Our work surpasses existing research in
this domain by offering a more accurate estimate of the asymptotic convergence rate. Finally,
numerical experiments validate our theoretical findings, providing strong empirical support for
our proposed approach.
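To make the averaging concrete, the sketch below runs a generic stochastic recursion driven only by comparisons of function values (a stand-in for a Stochastic Order Oracle; the probe radius and step-size schedule are assumptions) and maintains the Ruppert–Polyak average of the iterates, the quantity whose asymptotic covariance the paper studies.

```python
import numpy as np

def order_oracle_step(f, h=1e-3):
    """Illustrative update using only comparisons of function values:
    compare f at x +/- h*e along a random direction e and step toward
    the smaller value (h and the step-size schedule are assumptions)."""
    def update(x, k, rng):
        e = rng.normal(size=x.shape[0])
        e /= np.linalg.norm(e)
        lr = 1.0 / k ** 0.75                    # assumed decaying step size
        sign = 1.0 if f(x - h * e) < f(x + h * e) else -1.0
        return x - lr * sign * e                # move toward the smaller function value
    return update

def ruppert_polyak(update, x0, steps=10_000, seed=0):
    """Run the recursion x_{k+1} = update(x_k, k) and keep the running
    average of the iterates (Ruppert-Polyak averaging)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    x_bar = x.copy()
    for k in range(1, steps + 1):
        x = update(x, k, rng)
        x_bar += (x - x_bar) / (k + 1)          # online mean over x_0, ..., x_k
    return x, x_bar
```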
Gasnikov A. V., Alkousa M. S., Lobanov A. V., Dorn Y. V., Stonyakin F. S., Kuruzov I. A., Singh S. R.
On Quasi-Convex Smooth Optimization Problems by a Comparison Oracle
2024, Vol. 20, no. 5, pp. 813-825
Abstract
For many machine learning models, the resulting optimization problems are challenging because the
structure and properties of the objective functions are poorly understood. This creates major
complications for first-order algorithms, since gradient computations are difficult or even
impossible in various scenarios. For this reason, we resort to derivative-free (zeroth-order)
methods. This paper is devoted to an approach to minimizing quasi-convex functions using only a
recently proposed (in [56]) comparison oracle. This oracle compares function values at two points
and reports which is larger; thus, such comparisons are all the proposed approach needs to solve
the optimization problem under consideration. The proposed algorithm is based on comparison-based
gradient direction estimation combined with a comparison-based approximation of normalized
gradient descent. Normalized gradient descent is an adaptation of gradient descent that updates
according to the direction of the gradient rather than the gradient itself. We prove a convergence
rate for the proposed algorithm: when the objective function is smooth and strictly quasi-convex
in $\mathbb{R}^n$, the algorithm needs $\mathcal{O}\left( \left(n D^2/\varepsilon^2 \right) \log\left(n D / \varepsilon\right)\right)$ comparison queries to find an $\varepsilon$-approximation of the
optimal solution, where $D$ is an upper bound on the distance between all generated iteration
points and an optimal solution.
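A minimal sketch of these two ingredients, under assumed parameter choices (number of probes, probe radius, step size): the direction estimate aggregates signed random directions obtained from comparison queries, and the outer loop takes fixed-length steps along the resulting normalized direction. This illustrates the general technique, not the exact algorithm and parameters from the paper.

```python
import numpy as np

def comparison_direction(f, x, h, m, rng):
    """Estimate a descent direction from pairwise comparisons only:
    probe m random directions, keep the sign the comparison oracle reports,
    and normalize the signed average (probe radius h is an assumption)."""
    g = np.zeros_like(x)
    for _ in range(m):
        e = rng.normal(size=x.shape[0])
        e /= np.linalg.norm(e)
        # comparison oracle: is f larger at x + h*e or at x - h*e?
        g += e if f(x + h * e) > f(x - h * e) else -e
    norm = np.linalg.norm(g)
    return g / norm if norm > 0 else g

def normalized_gd_by_comparisons(f, x0, steps=500, lr=0.05, h=1e-4, m=32, seed=0):
    """Normalized gradient descent where only the direction comes from the
    comparison-based estimate; the step length is fixed by lr alone."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x -= lr * comparison_direction(f, x, h, m, rng)
    return x
```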