AdaGrad
ADAGRAD: adaptive gradient algorithm; Adaptive subgradient methods for online learning and stochastic optimization. We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints. We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.
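The abstract above does not state the update rule itself. As a rough illustration only, here is a minimal sketch of the widely used diagonal variant, in which each coordinate's step is divided by the square root of its accumulated squared gradients. The function name `adagrad`, its parameters (`grad_fn`, `lr`, `eps`, `steps`), and the test problem are assumptions made for this example and do not come from the entry.

```python
import numpy as np

def adagrad(grad_fn, x0, lr=0.5, eps=1e-8, steps=500):
    """Diagonal AdaGrad sketch (hypothetical helper, not taken from the entry)."""
    x = np.asarray(x0, dtype=float).copy()
    g_sq_sum = np.zeros_like(x)  # per-coordinate running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        g_sq_sum += g * g
        # Coordinates with small accumulated gradients keep larger effective steps.
        x -= lr * g / (np.sqrt(g_sq_sum) + eps)
    return x

# Assumed toy problem: f(x) = 0.5 * (100 * x[0]**2 + 0.01 * x[1]**2),
# whose gradient is scales * x.
scales = np.array([100.0, 0.01])
x_final = adagrad(lambda x: scales * x, x0=[1.0, 1.0])
```

On this toy quadratic the per-coordinate scaling cancels the curvature of each axis almost exactly, so both coordinates approach zero at nearly the same rate even though their curvatures differ by a factor of 10^4. A single global step size would have to be tuned to the steeper axis.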
References in zbMATH (referenced in 85 articles, 1 standard article)
Showing results 1 to 20 of 85, sorted by year.
- Erway, Jennifer B.; Griffin, Joshua; Marcia, Roummel F.; Omheni, Riadh: Trust-region algorithms for training responses: machine learning methods using indefinite Hessian approximations (2020)
- Joulani, Pooria; György, András; Szepesvári, Csaba: A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, variance reduction, and variational bounds (2020)
- Kissas, Georgios; Yang, Yibo; Hwuang, Eileen; Witschey, Walter R.; Detre, John A.; Perdikaris, Paris: Machine learning in cardiovascular flows modeling: predicting arterial blood pressure from non-invasive 4D flow MRI data using physics-informed neural networks (2020)
- Kotłowski, Wojciech: Scale-invariant unconstrained online learning (2020)
- Lei, Lihua; Jordan, Michael I.: On the adaptivity of stochastic gradient-based optimization (2020)
- Ning, Hanwen; Zhang, Jiaming; Feng, Ting-Ting; Chu, Eric King-wah; Tian, Tianhai: Control-based algorithms for high dimensional online learning (2020)
- Palagi, Laura; Seccia, Ruggiero: Block layer decomposition schemes for training deep neural networks (2020)
- Paquette, Courtney; Scheinberg, Katya: A stochastic line search method with expected complexity analysis (2020)
- Rius-Sorolla, G.; Maheut, J.; Coronado-Hernandez, Jairo R.; Garcia-Sabater, J. P.: Lagrangian relaxation of the generic materials and operations planning model (2020)
- Ruehle, Fabian: Data science applications to string theory (2020)
- Cichosz, Paweł: A case study in text mining of discussion forum posts: classification with bag of words and global vectors (2019)
- Heinlein, Alexander; Klawonn, Axel; Lanser, Martin; Weber, Janine: Machine learning in adaptive domain decomposition methods -- predicting the geometric location of constraints (2019)
- Holland, Matthew J.; Ikeda, Kazushi: Efficient learning with robust gradient descent (2019)
- Hu, Yaohua; Yu, Carisa Kwok Wai; Yang, Xiaoqi: Incremental quasi-subgradient methods for minimizing the sum of quasi-convex functions (2019)
- Kawashima, Takayuki; Fujisawa, Hironori: Robust and sparse regression in generalized linear model by stochastic optimization (2019)
- Kovachki, Nikola B.; Stuart, Andrew M.: Ensemble Kalman inversion: a derivative-free technique for machine learning tasks (2019)
- Krishnamurthy, Akshay; Agarwal, Alekh; Huang, Tzu-Kuo; Daumé III, Hal; Langford, John: Active learning for cost-sensitive classification (2019)
- Luo, Zhijian; Qian, Yuntao: Stochastic sub-sampled Newton method with variance reduction (2019)
- Michelioudakis, Evangelos; Artikis, Alexander; Paliouras, Georgios: Semi-supervised online structure learning for composite event recognition (2019)
- Milzarek, Andre; Xiao, Xiantao; Cen, Shicong; Wen, Zaiwen; Ulbrich, Michael: A stochastic semismooth Newton method for nonsmooth nonconvex optimization (2019)