Alexander Gasnikov

    Alexander Gasnikov
    Innopolis University
    ul. Universitetskaya 1, 420500 Innopolis, Russia

    Doctor of Computer Sciences, Professor


    Current position:

    Head of the Laboratory of Mathematical Methods of Optimization, Head of the Department of Mathematical Foundations of Control, Moscow Institute of Physics and Technology, Dolgoprudny, Russia;

    Rector, Innopolis University, Innopolis, Russia


    Main research areas:

    Mathematical Modeling of Traffic Flows; Optimization (Huge-Scale, Distributed and Parallel, Stochastic, Online); Learning from an optimization point of view


    Education:

    2006 — Moscow Institute of Physics and Technology, Department of Control and Applied Mathematics, Moscow, Russia;

    2007 — Candidate of Computer Sciences (PhD) in Partial Differential Equations, Thesis: “Asymptotic-in-time behavior of the solution of the Cauchy problem for a conservation law with nonlinear divergent viscosity”, supervisor: Prof. A.A. Shananin, Moscow Institute of Physics and Technology, Moscow, Russia;

    2011 — Associate Professor at the Department of Mathematical Foundations of Control;

    2016 — Doctor of Computer Sciences (Habilitation) in Mathematical Modelling and Numerical Methods of Convex Optimization, Doctoral Thesis: “Searching for equilibria in large transport networks”, supervisors: Prof. A.A. Shananin and Prof. Yu.E. Nesterov.


    Scientific prizes and awards:

    Winner of the Yahoo Award for 2019;

    Winner of the Ilya Segalovich Award (Yandex) for 2020;

    Winner of the Moscow Government Prize for 2020;

    Winner of the Talent Funding Award by the Institute of Strategic Research (China) for 2023.

    Publications:

    Nguyen N. T., Rogozin A. V., Gasnikov A. V.
    Abstract
    The consensus problem in distributed computing involves a network of agents aiming to compute the average of their initial vectors through local communication, represented by an undirected graph. This paper focuses on studying this problem using an average-case analysis approach, particularly over regular graphs. Traditional algorithms for solving the consensus problem often rely on worst-case performance evaluation scenarios, which may not reflect typical performance in real-world applications. Instead, we apply average-case analysis, focusing on the expected spectral distribution of eigenvalues to obtain a more realistic view of performance. Key contributions include deriving the optimal method for consensus on regular graphs, showing its relation to the Heavy Ball method, analyzing its asymptotic convergence rate, and comparing it to various first-order methods through numerical experiments.
    Keywords: consensus problem, average-case analysis, regular graph
    Citation: Nguyen N. T., Rogozin A. V., Gasnikov A. V.,  Average-Case Optimization Analysis for Distributed Consensus Algorithms on Regular Graphs, Rus. J. Nonlin. Dyn., 2024, Vol. 20, no. 5, pp.  907-931
    DOI:10.20537/nd241217
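    The consensus iteration discussed in the abstract above can be made concrete with a short sketch. The graph (a ring, which is 2-regular), the momentum coefficient, and the iteration count below are illustrative assumptions, and the momentum variant is the classical Heavy Ball acceleration of gossip averaging rather than the paper's optimal average-case method.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix of a ring (2-regular) graph:
    each agent averages its own value with those of its two neighbours."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3
        W[i, (i - 1) % n] = 1.0 / 3
        W[i, (i + 1) % n] = 1.0 / 3
    return W

def heavy_ball_consensus(x0, W, beta=0.5, iters=200):
    """Gossip averaging with a Heavy Ball (momentum) term:
    x_{k+1} = W x_k + beta * (x_k - x_{k-1})."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x_prev, x = x, W @ x + beta * (x - x_prev)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(size=20)                # initial scalar values of 20 agents
    x = heavy_ball_consensus(x0, ring_mixing_matrix(20))
    print(np.abs(x - x0.mean()).max())      # deviation from the exact average
```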
    Bychkov G. K., Dvinskikh D. M., Antsiferova A. V., Gasnikov A. V., Lobanov A. V.
    Abstract
    We present a novel gradient-free algorithm to solve a convex stochastic optimization problem, such as those encountered in medicine, physics, and machine learning (e.g., the adversarial multi-armed bandit problem), where the objective function can only be computed through numerical simulation, either as the result of a real experiment or as feedback given by the function evaluations from an adversary. Thus, we suppose that only black-box access to the function values of the objective is available, possibly corrupted by adversarial noise: deterministic or stochastic. The noisy setup can arise naturally from modeling randomness within a simulation or by computer discretization, or when exact values of the function are forbidden due to privacy issues, or when solving nonconvex problems as convex ones with an inexact function oracle. By exploiting higher-order smoothness, fulfilled, e.g., in logistic regression, we improve the performance of zero-order methods developed under the assumption of classical smoothness (or having a Lipschitz gradient). The proposed algorithm enjoys optimal oracle complexity and is designed under an overparameterization setup, i.e., when the number of model parameters is much larger than the size of the training dataset. Overparameterized models fit the training data perfectly while still generalizing well and outperforming underparameterized models on unseen data. We provide convergence guarantees for the proposed algorithm under both types of noise. Moreover, we estimate the maximum permissible adversarial noise level that maintains the desired accuracy in the Euclidean setup, and then we extend our results to a non-Euclidean setup. Our theoretical results are verified using the logistic regression problem.
    Keywords: zero-order optimization, gradient-free algorithms, high-order smoothness, kernel approximation, overparametrization
    Citation: Bychkov G. K., Dvinskikh D. M., Antsiferova A. V., Gasnikov A. V., Lobanov A. V.,  Accelerated Zero-Order SGD under High-Order Smoothness and Overparameterized Regime, Rus. J. Nonlin. Dyn., 2024, Vol. 20, no. 5, pp.  759-788
    DOI:10.20537/nd241209
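    To make the zero-order setting above concrete, here is a minimal sketch of a randomized two-point gradient estimator and the resulting gradient-free SGD loop. The step size, smoothing radius, and toy objective are assumptions for illustration; the paper's estimator additionally uses a kernel weighting to exploit higher-order smoothness, which this sketch omits.

```python
import numpy as np

def two_point_grad_estimate(f, x, h=1e-3, rng=None):
    """Randomized two-point gradient estimate from function values only:
    g = d * (f(x + h e) - f(x - h e)) / (2 h) * e,
    with e uniform on the unit sphere. This is the plain (classical) variant;
    the kernel-weighted estimator from the paper is not reproduced here."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    e = rng.normal(size=d)
    e /= np.linalg.norm(e)
    return d * (f(x + h * e) - f(x - h * e)) / (2 * h) * e

def zero_order_sgd(f, x0, step=0.1, iters=1000, seed=0):
    """Gradient-free SGD: descend along randomized finite-difference estimates."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(iters):
        x -= step * two_point_grad_estimate(f, x, rng=rng)
    return x

if __name__ == "__main__":
    f = lambda x: np.log(1 + np.exp(-x @ np.ones(5)))  # toy logistic-type loss
    print(f(zero_order_sgd(f, np.zeros(5))))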
    Smirnov V. N., Kazistova K. M., Sudakov I. A., Leplat V., Gasnikov A. V., Lobanov A. V.
    Abstract
    Black-box optimization, a rapidly growing field, faces challenges due to limited knowledge of the objective function’s internal mechanisms. One promising approach to addressing this is the Stochastic Order Oracle Concept. This concept, similar to other Order Oracle Concepts, relies solely on relative comparisons of function values without requiring access to the exact values. This paper presents a novel, improved estimation of the covariance matrix for the asymptotic convergence of the Stochastic Order Oracle Concept. Our work surpasses existing research in this domain by offering a more accurate estimation of the asymptotic convergence rate. Finally, numerical experiments validate our theoretical findings, providing strong empirical support for our proposed approach.
    Citation: Smirnov V. N., Kazistova K. M., Sudakov I. A., Leplat V., Gasnikov A. V., Lobanov A. V.,  Asymptotic Analysis of the Ruppert – Polyak Averaging for Stochastic Order Oracle, Rus. J. Nonlin. Dyn., 2024, Vol. 20, no. 5, pp.  961-978
    DOI:10.20537/nd241219
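    A minimal sketch of the Ruppert – Polyak averaging scheme referenced in the title is given below: the method returns the running average of the stochastic iterates rather than the last iterate. The gradient oracle, step size, and iteration count are illustrative assumptions and do not reproduce the paper's order-oracle setting.

```python
import numpy as np

def sgd_with_ruppert_polyak(grad_oracle, x0, step=0.05, iters=2000, seed=0):
    """Stochastic gradient iterations with Ruppert - Polyak averaging:
    return the running average bar_x_K = (1/K) * sum_k x_k of the iterates,
    which is the quantity whose asymptotic covariance is analyzed."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    x_bar = np.zeros_like(x0)
    for k in range(1, iters + 1):
        x -= step * grad_oracle(x, rng)
        x_bar += (x - x_bar) / k          # incremental running average
    return x_bar

if __name__ == "__main__":
    # noisy gradient of f(x) = 0.5 * ||x||^2 (purely illustrative oracle)
    noisy_grad = lambda x, rng: x + 0.1 * rng.normal(size=x.size)
    print(np.linalg.norm(sgd_with_ruppert_polyak(noisy_grad, np.ones(10))))
```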
    Krivchenko V. O., Gasnikov A. V., Kovalev D. A.
    Abstract
    In this paper we present interpolation conditions for several important convex-concave function classes: nonsmooth convex-concave functions, differences of strongly convex functions (with conditions stated in a form that contains oracle information exclusively), and smooth convex-concave functions with a bilinear coupling term. We then demonstrate how the performance estimation problem approach can be adapted to analyze the exact worst-case convergence behavior of first-order methods applied to composite bilinear-coupled min-max problems. Using the performance estimation problem approach, we estimate iteration complexities for several first-order fixed-step methods, Sim-GDA and Alt-GDA, applied to smooth convex-concave functions with a bilinear coupling term.
    Keywords: saddle point, convex-concave functions, bilinear coupling, performance estimation problem, interpolation conditions
    Citation: Krivchenko V. O., Gasnikov A. V., Kovalev D. A.,  Convex-Concave Interpolation and Application of PEP to the Bilinear-Coupled Saddle Point Problem, Rus. J. Nonlin. Dyn., 2024, Vol. 20, no. 5, pp.  875-893
    DOI:10.20537/nd241215
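    The two fixed-step methods analyzed in the paper, Sim-GDA and Alt-GDA, differ only in whether the max-variable is updated using the old or the already updated min-variable. The following sketch shows both updates on an illustrative strongly-convex-strongly-concave problem with a bilinear coupling term; the step size, problem data, and iteration count are assumptions, not the settings studied in the paper.

```python
import numpy as np

def sim_gda(grad_x, grad_y, x, y, step=0.05, iters=500):
    """Simultaneous GDA: both variables are updated with gradients
    evaluated at the same (old) point."""
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - step * gx, y + step * gy
    return x, y

def alt_gda(grad_x, grad_y, x, y, step=0.05, iters=500):
    """Alternating GDA: the max-variable is updated with the gradient
    evaluated at the already updated min-variable."""
    for _ in range(iters):
        x = x - step * grad_x(x, y)
        y = y + step * grad_y(x, y)
    return x, y

if __name__ == "__main__":
    # illustrative problem: min_x max_y 0.5||x||^2 + x^T B y - 0.5||y||^2
    rng = np.random.default_rng(0)
    B = rng.normal(size=(4, 4))
    grad_x = lambda x, y: x + B @ y          # gradient in the min-variable
    grad_y = lambda x, y: B.T @ x - y        # gradient in the max-variable
    x0, y0 = np.ones(4), np.ones(4)
    for method in (sim_gda, alt_gda):
        x, y = method(grad_x, grad_y, x0.copy(), y0.copy())
        print(method.__name__, np.linalg.norm(x), np.linalg.norm(y))
```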
    Gasnikov A. V., Alkousa M. S., Lobanov A. V., Dorn Y. V., Stonyakin F. S., Kuruzov I. A., Singh S. R.
    Abstract
    Frequently, when dealing with machine learning models, the resulting optimization problems are challenging because of a limited understanding of the structure and properties of the objective functions. Major complications therefore arise for first-order algorithms, in which gradient computations may be difficult or even impossible. For this reason, we resort to derivative-free (zeroth-order) methods. This paper is devoted to an approach to minimizing quasi-convex functions using only a recently proposed (in [56]) comparison oracle. This oracle compares function values at two points and reports which is larger; thus, such comparisons are all that is needed to solve the optimization problem under consideration. The proposed algorithm is based on comparison-based gradient direction estimation and comparison-based approximate normalized gradient descent. Normalized gradient descent is an adaptation of gradient descent that updates along the direction of the gradient rather than the gradient itself. We prove a convergence rate for the proposed algorithm when the objective function is smooth and strictly quasi-convex in $\mathbb{R}^n$: the algorithm needs $\mathcal{O}\left( \left(n D^2/\varepsilon^2 \right) \log\left(n D / \varepsilon\right)\right)$ comparison queries to find an $\varepsilon$-approximation of the optimal solution, where $D$ is an upper bound on the distance between all generated iterates and an optimal solution.
    Keywords: quasi-convex function, gradient-free algorithm, smooth function, comparison oracle, normalized gradient descent
    Citation: Gasnikov A. V., Alkousa M. S., Lobanov A. V., Dorn Y. V., Stonyakin F. S., Kuruzov I. A., Singh S. R.,  On Quasi-Convex Smooth Optimization Problems by a Comparison Oracle, Rus. J. Nonlin. Dyn., 2024, Vol. 20, no. 5, pp.  813-825
    DOI:10.20537/nd241211
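    A simplified sketch of the two building blocks named in the abstract, comparison-based gradient direction estimation and normalized gradient descent, is given below. The finite-difference radius, sample count, step size, and toy quasi-convex objective are illustrative assumptions; the direction estimator here is a plain sign-averaging scheme, not the exact estimator analyzed in the paper.

```python
import numpy as np

def comparison_direction_estimate(compare, x, n_samples=100, delta=1e-3, rng=None):
    """Estimate the (normalized) gradient direction from comparisons only.
    compare(a, b) returns True iff f(a) > f(b). For a random unit vector e,
    the sign of f(x + delta*e) - f(x - delta*e) approximates
    sign(<grad f(x), e>), so averaging sign * e points along the gradient."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    g = np.zeros(d)
    for _ in range(n_samples):
        e = rng.normal(size=d)
        e /= np.linalg.norm(e)
        sign = 1.0 if compare(x + delta * e, x - delta * e) else -1.0
        g += sign * e
    return g / n_samples

def normalized_gradient_descent(compare, x0, step=0.1, iters=200, seed=0):
    """Normalized gradient descent: step along the estimated direction,
    ignoring the magnitude of the gradient."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(iters):
        g = comparison_direction_estimate(compare, x, rng=rng)
        x -= step * g / (np.linalg.norm(g) + 1e-12)
    return x

if __name__ == "__main__":
    f = lambda z: np.linalg.norm(z - 3.0)        # toy quasi-convex objective
    compare = lambda a, b: f(a) > f(b)           # comparison oracle
    print(normalized_gradient_descent(compare, np.zeros(5)))
```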
