
Pegasos
 Referenced in 106 articles
[sw08752]
 simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast ... contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ϵ²) iterations...

CG_DESCENT
 Referenced in 142 articles
[sw04813]
 Algorithm 851: CG_DESCENT. A conjugate gradient method with guaranteed descent. Recently, a new nonlinear ... conjugate gradient scheme was developed which satisfies the descent condition g_k^T d_k...

Wirtinger Flow
 Referenced in 110 articles
[sw34175]
 computational complexity, much like in a gradient descent scheme. The main contribution is that this...

HOGWILD
 Referenced in 65 articles
[sw28396]
 Lock-Free Approach to Parallelizing Stochastic Gradient Descent. Stochastic Gradient Descent (SGD) is a popular ... associated optimization problem is sparse, meaning most gradient updates only modify small parts...

ADADELTA
 Referenced in 69 articles
[sw39429]
 dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time ... minimal computational overhead beyond vanilla stochastic gradient descent. The method requires no manual tuning ... learning rate and appears robust to noisy gradient information, different model architecture choices, various data...

mboost
 Referenced in 69 articles
[sw07331]
 package mboost: Model-Based Boosting. Functional gradient descent algorithm (boosting) for optimizing general risk functions...

SGD-QN
 Referenced in 28 articles
[sw19411]
 careful quasi-Newton stochastic gradient descent. The SGD-QN algorithm is a stochastic gradient descent ... fast as a first-order stochastic gradient descent but requires less iterations to achieve...

BADMM
 Referenced in 36 articles
[sw20288]
 mirror descent algorithm (MDA) generalizes gradient descent by using a Bregman divergence to replace squared...

PINNsNTK
 Referenced in 25 articles
[sw42058]
 networks behave during their training via gradient descent. More importantly, even less is known about ... infinite width limit during training via gradient descent. Specifically, we derive the NTK of PINNs ... fundamental pathology, we propose a novel gradient descent algorithm that utilizes the eigenvalues...

SNLSDP
 Referenced in 39 articles
[sw05127]
 starting point for a gradient descent method with backtracking line search to solve the smooth...

LASSO
 Referenced in 33 articles
[sw02850]
 gradient descent algorithm for LASSO. LASSO is a useful method to achieve the shrinkage...

SGDR
 Referenced in 19 articles
[sw30752]
 SGDR: Stochastic Gradient Descent with Warm Restarts. Restart techniques are common in gradient-free optimization ... simple warm restart technique for stochastic gradient descent to improve its anytime performance when training...

Entropy-SGD
 Referenced in 21 articles
[sw41231]
 Entropy-SGD: Biasing Gradient Descent Into Wide Valleys. This paper proposes a new optimization algorithm ... inner loop to compute the gradient of the local entropy before each update...

DARTS
 Referenced in 14 articles
[sw36213]
 efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn...

Vowpal Wabbit
 Referenced in 12 articles
[sw28398]
 available with the baseline being sparse gradient descent (GD) on a loss function (several...

MMD GAN
 Referenced in 12 articles
[sw42580]
 topology and can be optimized via gradient descent with relatively small batch sizes...

TIGRA
 Referenced in 40 articles
[sw02333]
 presented. The TIGRA (Tikhonov-gradient method) algorithm proposed uses steepest descent iterations in an inner...

CNTK
 Referenced in 9 articles
[sw21056]
 recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation...

Optspace
 Referenced in 8 articles
[sw12630]
 Optspace: A Gradient Descent Algorithm on the Grassmann Manifold for Matrix Completion. We consider...

neural-tangents
 Referenced in 5 articles
[sw39529]
 using exact Bayesian inference or using gradient descent via the Neural Tangent Kernel. Additionally, Neural ... Tangents provides tools to study gradient descent training dynamics of wide but finite networks...