This paper describes the Automatically Tuned Linear Algebra Software (ATLAS) project and the fundamental principles that underlie it. ATLAS is an instantiation of a new paradigm in high-performance library production and maintenance, which we term automated empirical optimization of software (AEOS); this style of library management was created to allow software to keep pace with the rapid rate of hardware advancement described by Moore’s Law. ATLAS applies this paradigm to linear algebra software, with the present emphasis on the Basic Linear Algebra Subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.
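As a rough illustration of the empirical-optimization idea named in the abstract, the following sketch times a few candidate tile sizes for a blocked matrix multiply on the current machine and keeps the fastest. It is not ATLAS code and does not use any ATLAS interface: the function names `blocked_matmul` and `tune_block_size`, the candidate tile sizes, and the matrix size are all assumptions chosen for illustration.

```python
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square tiles of size `block`."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Multiply one tile of `a` by one tile of `b`, accumulating into `c`.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def tune_block_size(n=48, candidates=(4, 8, 16, 48), repeats=2):
    """Time each candidate tile size (hypothetical values) and return the fastest.

    Returns (best_block, timings), where timings maps tile size to the best
    wall-clock time in seconds observed over `repeats` runs.
    """
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_block, timings = tune_block_size()
    print("selected tile size:", best_block)
```

The selected tile size depends on the machine the search runs on, which is the point of the approach: the choice is measured rather than predicted from a hardware model.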

This software is also referenced in ORMS.

References in zbMATH (referenced in 195 articles, 1 standard article)



  1. Sanchez, Eduardo J.; Paolini, Christopher P.; Castillo, Jose E.: The mimetic methods toolkit: an object-oriented API for mimetic finite differences (2014)
  2. Schatz, Martin D.; Low, Tze Meng; van de Geijn, Robert A.; Kolda, Tamara G.: Exploiting symmetry in tensors for high performance: multiplication with symmetric tensors (2014)
  3. Velichka, M. D.; Jacobson, M. J. jun.; Stein, A.: Computing discrete logarithms in the Jacobian of high-genus hyperelliptic curves over even characteristic finite fields (2014)
  4. Abed, Khalid H.; Morris, Gerald R.: Improving performance of codes with large/irregular stride memory access patterns via high performance reconfigurable computers (2013)
  5. Bouchard-Côté, Alexandre: A note on probabilistic models over strings: the linear algebra approach (2013)
  6. Castaldo, Anthony M.; Whaley, R. Clint; Samuel, Siju: Scaling LAPACK panel operations using parallel cache assignment (2013)
  7. Kouya, Tomonori: Performance evaluation of multiple and mixed precision iterative refinement method and its application to high-order implicit Runge-Kutta method (2013)
  8. Poulson, Jack; Marker, Bryan; van de Geijn, Robert A.; Hammond, Jeff R.; Romero, Nichols A.: Elemental, a new framework for distributed memory dense matrix computations (2013)
  9. Vannieuwenhoven, Nick; Meerbergen, Karl: IMF: an incomplete multifrontal LU-factorization for element-structured sparse linear systems (2013)
  10. Du, Peng; Weber, Rick; Luszczek, Piotr; Tomov, Stanimire; Peterson, Gregory; Dongarra, Jack: From CUDA to OpenCL: towards a performance-portable solution for multi-platform GPU programming (2012)
  11. Ghysels, P.; Kłosiewicz, P.; Vanroose, W.: Improving the arithmetic intensity of multigrid with the help of polynomial smoothers (2012)
  12. Hartley, Timothy D. R.; Saule, Erik; Çatalyürek, Ümit V.: Improving performance of adaptive component-based dataflow middleware (2012)
  13. Ho, Kenneth L.; Greengard, Leslie: A fast direct solver for structured linear systems by recursive skeletonization (2012)
  14. Klöckner, Andreas; Pinto, Nicolas; Lee, Yunsup; Catanzaro, Bryan; Ivanov, Paul; Fasih, Ahmed: PyCUDA and PyOpenCL: a scripting-based approach to GPU run-time code generation (2012)
  15. López-Espín, Jose J.; Vidal, Antonio M.; Giménez, Domingo: Two-stage least squares and indirect least squares algorithms for simultaneous equations models (2012)
  16. Lu, Qingda; Gao, Xiaoyang; Krishnamoorthy, Sriram; Baumgartner, Gerald; Ramanujam, J.; Sadayappan, P.: Empirical performance model-driven data layout optimization and library call selection for tensor contraction expressions (2012)
  17. Ozaki, Katsuhisa; Ogita, Takeshi; Oishi, Shin’ichi; Rump, Siegfried M.: Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications (2012)
  18. Pauderis, Colton; Storjohann, Arne: Deterministic unimodularity certification (2012)
  19. Wernsing, John R.; Stitt, Greg: Elastic computing: A portable optimization framework for hybrid computers (2012)
  20. Yzelman, Albert-Jan N.; Bisseling, Rob H.: A cache-oblivious sparse matrix-vector multiplication scheme based on the Hilbert curve (2012)
