• CUBLAS

  • Referenced in 84 articles [sw06880]
  • computational resources of NVIDIA Graphics Processing Unit (GPU), but does not auto-parallelize across multiple...
  • Nektar++

  • Referenced in 101 articles [sw11964]
  • local matrix generation stage coupled with the parallelization techniques developed for the linear system solvers ... elliptic finite element method to the GPU and perform a case study for a particular ... This study provides comparison between CPU and GPU implementations of the method as well ... parallelism of GPUs. We demonstrate that the HDG method is well-suited for GPU implementation...
  • DeCo

  • Referenced in 5 articles [sw24120]
  • function DeCo which applies banks of parallel sequential Monte Carlo algorithms to filter the time ... computing and for graphical process unit (GPU) parallel computing. For the GPU implementation ... MATLAB parallel computing toolbox and show how to use general purpose GPU computing almost effortlessly...
  • StarPU

  • Referenced in 41 articles [sw14216]
  • parallel computer mixing IBM Cell Broadband Engines and AMD Opteron processors. Other architectures, featuring GPU ... across the entire machine, that is, where parallel tasks would be dynamically scheduled over...
  • VTune

  • Referenced in 19 articles [sw08852]
  • parallel performance analysis. Collect a rich set of data to tune CPU & GPU compute performance...
  • GPUTeraSort

  • Referenced in 14 articles [sw12706]
  • GPU along with task parallelism by scheduling some of the memory-intensive and compute-intensive...
  • GPflowOpt

  • Referenced in 4 articles [sw33396]
  • TensorFlow including automatic differentiation, parallelization and GPU computations for Bayesian optimization. Design goals focus...
  • MPSGPU-SJTU

  • Referenced in 2 articles [sw29217]
  • been widely used in the field of computational fluid dynamics in recent years. However ... novel acceleration technique, graphics processing unit (GPU) parallel computing, is applied in MPS. Based...
  • CULA

  • Referenced in 11 articles [sw12745]
  • processing unit (GPU) found in many standard personal computers is a highly parallel math processor ... ratio. High-level linear algebra operations are computationally intense, often requiring O(N^3) operations ... power of the GPU. Our work is on CULA, a GPU accelerated implementation of linear ... GPU execution model featured by NVIDIA GPUs based on CUDA demands very strong parallelism, requiring...
  • cuFFT

  • Referenced in 24 articles [sw11258]
  • simple interface for computing FFTs on an NVIDIA GPU, which allows users to quickly leverage ... floating-point power and parallelism of the GPU in a highly optimized and tested...
  • laGP

  • Referenced in 28 articles [sw14043]
  • Regression. Performs approximate GP regression for large computer experiments and spatial datasets. The approximation ... parallelization are supported for prediction over a vast out-of-sample testing set; GPU acceleration ... supported for an important subroutine. OpenMP and GPU features may require special compilation. An interface ... augmented Lagrangian scheme, and large scale computer model calibration...
  • IDTxl

  • Referenced in 5 articles [sw25603]
  • continuous data with parallel computing engines for both GPU and CPU platforms. Written for Python 3.4.3...
  • EAGL

  • Referenced in 2 articles [sw08231]
  • EAGL), a self-contained GPU library, to support parallel computing of bilinear pairings based ... takes full advantage of the parallel processing power of GPU, with no shared memory bank ... GPU pipeline vs. memory access latency are highly complex for parallelization of pairing computations. Overall ... main performance bottleneck for pairing computations on the tested GPU device, and the lazy reduction...
  • AmgX

  • Referenced in 14 articles [sw13440]
  • AmgX: a library for GPU accelerated algebraic multigrid and preconditioned iterative methods. The solution ... systems arises in many applications, such as computational fluid dynamics and oil reservoir simulation ... that they require large scale distributed parallel computing to obtain the solution of interest ... AmgX library, which provides drop-in GPU acceleration of distributed algebraic multigrid (AMG) and preconditioned...
  • BioEM

  • Referenced in 1 article [sw16813]
  • BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images. In cryo-electron microscopy ... demanding. Here we present highly parallelized, GPU-accelerated computer software that performs this task efficiently ... OpenMP, and MPI parallelization combined with both CPU and GPU computing. The resulting BioEM software...
  • corr2

  • Referenced in 1 article [sw26426]
  • correlation coefficient using a GPU (requires Parallel Computing Toolbox™). For more information, see Image Processing...
  • Zippy

  • Referenced in 5 articles [sw34497]
  • model for high performance general-purpose computation on GPU clusters remains a complex problem ... abstracts the GPU cluster programming with a two-level parallelism hierarchy and a non-uniform ... integration of parallel visualization, graphics, and computation modules on a GPU cluster...
  • gputools

  • Referenced in 7 articles [sw14139]
  • gputools package enables GPU computing in R. Motivation: By default, the R statistical environment does ... make use of parallelism. Researchers may resort to expensive solutions such as cluster hardware ... processing units (GPUs) provide an inexpensive and computationally powerful alternative. Using R and the CUDA ... used in microarray gene expression analysis for GPU-equipped computers. Results: R users can take...
  • CindyGL

  • Referenced in 3 articles [sw17504]
  • CindyGL: authoring GPU-based interactive mathematical content. CindyJS is a framework for creating interactive (mathematical ... leverages WebGL for parallelized computations. CindyGL provides access to the GPU fragment shader...
  • CCHE2D

  • Referenced in 1 article [sw09202]
  • using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solver outperforms ... parallel alternatives. Computing accuracy and efficiency of both CPU and GPU versions of models were ... case. It has been demonstrated that the parallelized CCHE2D flow model with CUDA Fortran ... flow with a much higher computing efficiency on the GPU...