Exploiting batch processing on streaming architectures to solve 2D elliptic finite element problems: a hybridized discontinuous Galerkin (HDG) case study.

Numerical methods for elliptic partial differential equations (PDEs) within both continuous and hybridized discontinuous Galerkin (HDG) frameworks share the same general structure: local (elemental) matrix generation followed by global linear system assembly and solve. The local matrix generation stage requires no inter-element communication and is easily parallelized, and parallelization techniques already exist for linear system solvers; together these make a numerical scheme for elliptic PDEs a good candidate for implementation on streaming architectures such as modern graphics processing units (GPUs). We propose an algorithmic pipeline for mapping an elliptic finite element method to the GPU and perform a case study for a particular method within the HDG framework. This study provides a comparison between CPU and GPU implementations of the method and highlights certain performance-crucial implementation details. The HDG method was chosen for the case study because of its computationally heavy local matrix generation stage and its reduced trace-based communication pattern, which together make the method amenable to the fine-grained parallelism of GPUs. We demonstrate that the HDG method is well suited to GPU implementation, obtaining total speedups on the order of 30-35 times over a serial CPU implementation for moderately sized problems.

References in zbMATH (referenced in 64 articles, 1 standard article)

Showing results 1 to 20 of 64.
Sorted by year (citations)

  1. Cheng, Liang; Ju, Xiaoying; Tong, Feifei; An, Hongwei: Transition to chaos through period doublings of a forced oscillating cylinder in steady current (2020)
  2. Ju, Xiaoying; An, Hongwei; Cheng, Liang; Tong, Feifei: Modes of synchronisation around a near-wall oscillating cylinder in streamwise directions (2020)
  3. Kumar, Abhishek; Pothérat, Alban: Mixed baroclinic convection in a cavity (2020)
  4. Moratilla-Vega, M. A.; Lackhove, K.; Janicka, J.; Xia, H.; Page, G. J.: Jet noise analysis using an efficient LES/high-order acoustic coupling method (2020)
  5. Moxey, David; Amici, Roman; Kirby, Mike: Efficient matrix-free high-order finite element evaluation for simplicial elements (2020)
  6. Önder, Asim; Liu, Philip L.-F.: Stability of the solitary wave boundary layer subject to finite-amplitude disturbances (2020)
  7. Puligilla, Shivakanth Chary; Jayaraman, Balaji: Assessment of end-to-end and sequential data-driven learning for non-intrusive modeling of fluid flows (2020)
  8. Xiong, Chengwang; Qi, Xiang; Gao, Ankang; Xu, Hui; Ren, Chengjiao; Cheng, Liang: The bypass transition mechanism of the Stokes boundary layer in the intermittently turbulent regime (2020)
  9. Zhang, Kai; Hayostek, Shelby; Amitay, Michael; He, Wei; Theofilis, Vassilios; Taira, Kunihiko: On the formation of three-dimensional separated flows over wings under tip effects (2020)
  10. Cantwell, Chris D.; Nielsen, Allan S.: A minimally intrusive low-memory approach to resilience for existing transient solvers (2019)
  11. Cervi, Jessica; Spiteri, Raymond J.: A comparison of fourth-order operator splitting methods for cardiac simulations (2019)
  12. Jallepalli, Ashok; Haimes, Robert; Kirby, Robert M.: Adaptive characteristic length for L-SIAC filtering of FEM data (2019)
  13. Jallepalli, Ashok; Kirby, Robert M.: Efficient algorithms for the line-SIAC filter (2019)
  14. Jayaraman, Balaji; Lu, Chen; Whitman, Joshua; Chowdhary, Girish: Sparse feature map-based Markov models for nonlinear fluid flows (2019)
  15. Moxey, David; Sastry, Shankar P.; Kirby, Robert M.: Interpolation error bounds for curvilinear finite elements and their implications on adaptive mesh refinement (2019)
  16. Perry, Daniel J.; Kirby, Robert M.; Narayan, Akil; Whitaker, Ross T.: Allocation strategies for high fidelity models in the multifidelity regime (2019)
  17. Ren, Chengjiao; Cheng, Liang; Tong, Feifei; Xiong, Chengwang; Chen, Tingguo: Oscillatory flow regimes around four cylinders in a diamond arrangement (2019)
  18. Wang, Rui; Bao, Yan; Zhou, Dai; Zhu, Hongbo; Ping, Huan; Han, Zhaolong; Serson, Douglas; Xu, Hui: Flow instabilities in the wake of a circular cylinder with parallel dual splitter plates attached (2019)
  19. Abide, Stéphane; Viazzo, Stéphane; Raspo, Isabelle; Randriamampianina, Anthony: Higher-order compact scheme for high-performance computing of stratified rotating flows (2018)
  20. Badia, Santiago; Martín, Alberto F.; Principe, Javier: FEMPAR: an object-oriented parallel finite element framework (2018)
