• Pegasos

  • Referenced in 106 articles [sw08752]
  • simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast ... contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ϵ²) iterations...
• CG_DESCENT

  • Referenced in 142 articles [sw04813]
  • Algorithm 851: CG_DESCENT. A conjugate gradient method with guaranteed descent. Recently, a new nonlinear ... conjugate gradient scheme was developed which satisfies the descent condition g_k^T d_k...
• Wirtinger Flow

  • Referenced in 110 articles [sw34175]
  • computational complexity, much like in a gradient descent scheme. The main contribution is that this...
• HOGWILD

  • Referenced in 65 articles [sw28396]
  • Lock-Free Approach to Parallelizing Stochastic Gradient Descent. Stochastic Gradient Descent (SGD) is a popular ... associated optimization problem is sparse, meaning most gradient updates only modify small parts...
• ADADELTA

  • Referenced in 69 articles [sw39429]
  • dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time ... minimal computational overhead beyond vanilla stochastic gradient descent. The method requires no manual tuning ... learning rate and appears robust to noisy gradient information, different model architecture choices, various data...
• mboost

  • Referenced in 69 articles [sw07331]
  • package mboost: Model-Based Boosting. Functional gradient descent algorithm (boosting) for optimizing general risk functions...
• SGD-QN

  • Referenced in 28 articles [sw19411]
  • careful quasi-Newton stochastic gradient descent. The SGD-QN algorithm is a stochastic gradient descent ... fast as a first-order stochastic gradient descent but requires fewer iterations to achieve...
• BADMM

  • Referenced in 36 articles [sw20288]
  • mirror descent algorithm (MDA) generalizes gradient descent by using a Bregman divergence to replace squared...
• PINNsNTK

  • Referenced in 25 articles [sw42058]
  • networks behave during their training via gradient descent. More importantly, even less is known about ... infinite width limit during training via gradient descent. Specifically, we derive the NTK of PINNs ... fundamental pathology, we propose a novel gradient descent algorithm that utilizes the eigenvalues...
• SNLSDP

  • Referenced in 39 articles [sw05127]
  • starting point for a gradient descent method with backtracking line search to solve the smooth...
• LASSO

  • Referenced in 33 articles [sw02850]
  • gradient descent algorithm for LASSO. LASSO is a useful method to achieve the shrinkage...
• SGDR

  • Referenced in 19 articles [sw30752]
  • SGDR: Stochastic Gradient Descent with Warm Restarts. Restart techniques are common in gradient-free optimization ... simple warm restart technique for stochastic gradient descent to improve its anytime performance when training...
• Entropy-SGD

  • Referenced in 21 articles [sw41231]
  • Entropy-SGD: Biasing Gradient Descent Into Wide Valleys. This paper proposes a new optimization algorithm ... inner loop to compute the gradient of the local entropy before each update...
• DARTS

  • Referenced in 14 articles [sw36213]
  • efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn...
• Vowpal Wabbit

  • Referenced in 12 articles [sw28398]
  • available with the baseline being sparse gradient descent (GD) on a loss function (several...
• MMD GAN

  • Referenced in 12 articles [sw42580]
  • topology and can be optimized via gradient descent with relatively small batch sizes...
• TIGRA

  • Referenced in 40 articles [sw02333]
  • presented. The TIGRA (Tikhonov-gradient method) algorithm proposed uses steepest descent iterations in an inner...
• CNTK

  • Referenced in 9 articles [sw21056]
  • recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation...
• Optspace

  • Referenced in 8 articles [sw12630]
  • Optspace: A Gradient Descent Algorithm on the Grassmann Manifold for Matrix Completion. We consider...
• neural-tangents

  • Referenced in 5 articles [sw39529]
  • using exact Bayesian inference or using gradient descent via the Neural Tangent Kernel. Additionally, Neural ... Tangents provides tools to study gradient descent training dynamics of wide but finite networks...
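
For orientation, the SNLSDP entry above mentions a gradient descent method with backtracking line search. A minimal sketch of that general technique in plain Python might look like the following. It is not taken from SNLSDP or any other listed package: the function name `backtracking_gradient_descent`, the constants `alpha0`, `beta`, `c`, and the quadratic test objective are all illustrative assumptions.

```python
def backtracking_gradient_descent(f, grad, x0, alpha0=1.0, beta=0.5, c=1e-4,
                                  tol=1e-8, max_iter=1000):
    """Gradient descent with an Armijo backtracking line search.

    All parameter names and default values are illustrative assumptions.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq ** 0.5 < tol:
            break  # gradient is (numerically) zero: stop
        fx = f(x)
        step = alpha0
        candidate = x
        # Shrink the step until the sufficient-decrease (Armijo) condition
        # f(x - t*g) <= f(x) - c * t * ||g||^2 holds, with a bounded number of tries.
        for _ in range(100):
            candidate = [xi - step * gi for xi, gi in zip(x, g)]
            if f(candidate) <= fx - c * step * g_norm_sq:
                break
            step *= beta
        x = candidate
    return x


# Hypothetical test objective: a convex quadratic with minimizer (1, -2).
def quadratic(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def quadratic_grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


minimizer = backtracking_gradient_descent(quadratic, quadratic_grad, [0.0, 0.0])
```

The backtracking loop avoids hand-tuning a fixed step size: each iteration starts from `alpha0` and halves the step until the objective decreases enough, which is one common way to make plain gradient descent robust on poorly scaled problems.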