GotoBLAS2 was released by the Texas Advanced Computing Center (TACC) as open source software under the BSD license. It is no longer under active development by TACC, but it remains available to the community to use, study, and extend. GotoBLAS2 uses new algorithms and memory techniques to optimize the performance of the BLAS routines. The changes in this version target new architectural features in microprocessors and interprocessor communication techniques. In addition, NUMA controls enhance multi-threaded execution of BLAS routines within a node.

References in zbMATH (referenced in 13 articles)

  1. Williams, R. Ryan: Faster all-pairs shortest paths via circuit complexity (2018)
  2. Pan, Victor Ya.: Fast matrix multiplication and its algebraic neighbourhood (2017)
  3. Low, Tze Meng; Igual, Francisco D.; Smith, Tyler M.; Quintana-Ortí, Enrique S.: Analytical modeling is enough for high-performance BLIS (2016)
  4. Kelefouras, Vasilios; Kritikakou, Angeliki; Goutis, Costas: A methodology for speeding up loop kernels by exploiting the software information and the memory architecture (2015)
  5. Williams, Ryan: Faster all-pairs shortest paths via circuit complexity (2014)
  6. Castaldo, Anthony M.; Whaley, R. Clint; Samuel, Siju: Scaling LAPACK panel operations using parallel cache assignment (2013)
  7. Hedtke, Ivo; Murthy, Sandeep: Search and test algorithms for triple product property triples (2012)
  8. Bodrato, Marco: A Strassen-like matrix multiplication suited for squaring and higher power computation (2010)
  9. Chowdhury, Rezaul Alam; Ramachandran, Vijaya: The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation (2010)
  10. D’Alberto, Paolo; Nicolau, Alexandru: Adaptive Winograd’s matrix multiplications (2009)
  11. Van Dyk, Danny; Geveler, Markus; Mallach, Sven; Ribbrock, Dirk; Göddeke, Dominik; Gutwenger, Carsten: HONEI: A collection of libraries for numerical computations targeting multiple processor architectures (2009)
  12. González, Manuel; González, Francisco; Dopico, Daniel; Luaces, Alberto: On the effect of linear algebra implementations in real-time multibody system dynamics (2008)
  13. Goto, Kazushige; van de Geijn, Robert A.: Anatomy of high-performance matrix multiplication (2008)