ATLAS

This paper describes the Automatically Tuned Linear Algebra Software (ATLAS) project and the fundamental principles that underlie it. ATLAS is an instantiation of a new paradigm in high-performance library production and maintenance, which we term automated empirical optimization of software (AEOS); this style of library management was created to allow software to keep pace with the rapid rate of hardware advancement inherent in Moore’s Law. ATLAS applies this paradigm to linear algebra software, with the present emphasis on the Basic Linear Algebra Subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.
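
The idea of empirical optimization described above can be illustrated with a small sketch: instead of predicting the best tuning parameter from a hardware model, time several candidate values on the target machine and keep the fastest. The example below is an assumption-laden illustration in pure Python, not ATLAS code. The function names (matmul_blocked, tune_block_size), the candidate tile sizes, and the matrix size are all hypothetical; ATLAS itself generates and times compiled kernels and searches a much larger parameter space.

```python
import random
import time


def matmul_blocked(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square tiles of size `block`."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Work on one tile; min() handles tiles that overhang the edge.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def tune_block_size(n=48, candidates=(4, 8, 16, 48), repeats=3, seed=0):
    """Time each candidate tile size on this machine and return the fastest.

    Returns (best_block, timings), where timings maps each candidate to its
    best observed wall-clock time in seconds. The candidate list is an
    illustrative assumption, not a set of values used by ATLAS.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            matmul_blocked(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best  # keep the minimum to reduce timing noise
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_block, timings = tune_block_size()
    print("fastest tile size on this machine:", best_block)
```

The selected tile size depends on the machine and on timing noise, so no particular output is expected; the point is only that the choice is made by measurement rather than by a fixed rule.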

This software is also referenced in ORMS.


References in zbMATH (referenced in 197 articles, 1 standard article)

Showing results 41 to 60 of 197.
Sorted by year.


  1. Wernsing, John R.; Stitt, Greg: Elastic computing: A portable optimization framework for hybrid computers (2012)
  2. Yzelman, Albert-Jan N.; Bisseling, Rob H.: A cache-oblivious sparse matrix-vector multiplication scheme based on the Hilbert curve (2012)
  3. Bender, Michael A.; Kuszmaul, Bradley C.; Teng, Shang-Hua; Wang, Kebin: Optimal cache-oblivious mesh layouts (2011)
  4. Borrell, R.; Lehmkuhl, O.; Trias, F. X.; Oliva, A.: Parallel direct Poisson solver for discretisations with one Fourier diagonalisable direction (2011)
  5. Carette, Jacques; Kiselyov, Oleg: Multi-stage programming with functors and monads: eliminating abstraction overhead from generic code (2011)
  6. Drevet, Charles-Éric; Islam, Md. Nazrul; Schost, Éric: Optimization techniques for small matrix multiplication (2011)
  7. Dubois, Jérôme; Calvin, Christophe; Petiton, Serge: Performance and numerical accuracy evaluation of heterogeneous multicore systems for Krylov orthogonal basis computation (2011)
  8. Fabregat-Traver, Diego; Bientinesi, Paolo: Knowledge-based automatic generation of partitioned matrix expressions (2011)
  9. Fursin, Grigori; Kashnikov, Yuriy; Memon, Abdul Wahid; Chamski, Zbigniew; Temam, Olivier; Namolaru, Mircea; Yom-Tov, Elad; Mendelson, Bilha; Zaks, Ayal; Courtois, Eric; Bodin, Francois; Barnard, Phil; Ashton, Elton; Bonilla, Edwin; Thomson, John; Williams, Christopher K. I.; O’Boyle, Michael: Milepost GCC: Machine learning enabled self-tuning compiler (2011)
  10. Gao, Da; Schwartzentruber, Thomas E.: Optimizations and OpenMP implementation for the direct simulation Monte Carlo method (2011)
  11. Heinzl, René; Schwaha, Philipp: A generic topology library (2011)
  12. Kalinnik, Natalia; Korch, Matthias; Rauber, Thomas: An efficient time-step-based self-adaptive algorithm for predictor-corrector methods of Runge-Kutta type (2011)
  13. Michailidis, Panagiotis D.; Margaritis, Konstantinos G.: Parallel direct methods for solving the system of linear equations with pipelining on a multicore using OpenMP (2011)
  14. Russell, Francis P.; Mellor, Michael R.; Kelly, Paul H. J.; Beckmann, Olav: DESOLA: an active linear algebra library using delayed evaluation and runtime code generation (2011)
  15. Yzelman, A. N.; Bisseling, Rob H.: Two-dimensional cache-oblivious sparse matrix-vector multiplication (2011)
  16. Bredenstein, A.; Denner, A.; Dittmaier, S.; Pozzorini, S.: NLO QCD corrections to $\mathrm{t}\overline{\mathrm{t}}\mathrm{b}\overline{\mathrm{b}}$ production at the LHC: 2. Full hadronic results (2010)
  17. Chowdhury, Rezaul Alam; Ramachandran, Vijaya: The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation (2010)
  18. Dimakopoulos, Yannis: An efficient parallel and fully implicit algorithm for the simulation of transient free-surface flows of multimode viscoelastic liquids (2010)
  19. Granat, Robert; Kågström, Bo; Kressner, Daniel: A novel parallel QR algorithm for hybrid distributed memory HPC systems (2010)
  20. Weggler, S.; Rutka, V.; Hildebrandt, A.: A new numerical method for nonlocal electrostatics in biomolecular simulations (2010)
