Incompressible fluid simulation algorithm optimization of OpenFOAM on Tianhe supercomputing.

Bibliographic Details
Title: Incompressible fluid simulation algorithm optimization of OpenFOAM on Tianhe supercomputing.
Authors: LIU, Zhongmin [1] (1539409961@qq.com); ZHANG, Xiang [2] (zhangxiang08@nudt.edu.cn); MA, Di [2] (madi@nudt.edu.cn); SUN, Yang [3] (victor_sun@nudt.edu); ZHOU, Lei [1]; QIU, Qi [1]; GONG, Chunye [2,4]
Source: Computer Engineering & Science / Jisuanji Gongcheng yu Kexue. Dec 2025, Vol. 47, Issue 12, pp. 2119-2128. 10 p.
Subjects: Incompressible flow, Computational fluid dynamics, Parallel processing, Supercomputers, Gauss-Seidel method, Optimization algorithms, ARM microprocessors, Iterative methods (Mathematics)
Abstract: The incompressible fluid simulation solvers in the open-source fluid dynamics software OpenFOAM are portable across platforms. However, their performance optimizations mainly target supercomputing systems built on established architectures such as Intel, so they cannot fully exploit the vectorized parallelism of the ARM architecture on the Tianhe supercomputing system. To address this, the paper applies ARM vectorization techniques to two methods used by these solvers, the symmetric Gauss-Seidel (SGS) method and the diagonal incomplete Cholesky preconditioned conjugate gradient (DIC-PCG) method, to improve solver efficiency. To make vectorization possible, the paper analyzes the relationships between neighboring grid cells within a single iteration of both algorithms and finds that the maximum number of neighboring cells is two and that there are no dependencies between them. Using this prior knowledge, the original code is modified at minimal cost (specifically, by adding just four lines of if-else conditional statements) to vectorize the neighboring-cell computation and accelerate the algorithms. Experimental results under various configurations show that the improved algorithm achieves a maximum single-core speedup of 1.75 and a maximum multi-core speedup of 149.16, with parallel efficiency still reaching 29.13%. [ABSTRACT FROM AUTHOR]
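The branching idea described in the abstract can be illustrated with a minimal C++ sketch. This is not the paper's code and not OpenFOAM's matrix layout: the `SparseSystem` struct, the function names, and the 1D chain test problem are all hypothetical, chosen only so that each cell has at most two neighbors. When a cell has exactly two neighbors, the if-else takes a fixed-width path whose two products do not depend on each other, which is the kind of pattern a compiler or hand-written NEON code could map onto a two-lane vector operation. Otherwise it falls back to the generic loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CSR-like storage of off-diagonal coefficients.
// This is an illustrative layout, NOT OpenFOAM's lduMatrix.
struct SparseSystem {
    std::vector<double> diag;           // diagonal coefficient per cell
    std::vector<std::size_t> rowStart;  // size nCells + 1
    std::vector<std::size_t> nbr;       // neighbour cell index per entry
    std::vector<double> coeff;          // off-diagonal coefficient per entry
};

// Gauss-Seidel update of a single cell i.
inline void relaxCell(const SparseSystem& A, const std::vector<double>& b,
                      std::vector<double>& x, std::size_t i, bool fixedWidth)
{
    const std::size_t first = A.rowStart[i];
    const std::size_t count = A.rowStart[i + 1] - first;
    double sum = 0.0;
    if (fixedWidth && count == 2) {
        // Exactly two neighbours: two independent products, no loop.
        const double p0 = A.coeff[first] * x[A.nbr[first]];
        const double p1 = A.coeff[first + 1] * x[A.nbr[first + 1]];
        sum = p0 + p1;
    } else {
        // Generic fallback for any other neighbour count.
        for (std::size_t k = first; k < first + count; ++k) {
            sum += A.coeff[k] * x[A.nbr[k]];
        }
    }
    x[i] = (b[i] - sum) / A.diag[i];
}

// One symmetric Gauss-Seidel sweep: forward pass, then backward pass.
void sgsSweep(const SparseSystem& A, const std::vector<double>& b,
              std::vector<double>& x, bool fixedWidth)
{
    const std::size_t n = A.diag.size();
    assert(A.rowStart.size() == n + 1 && b.size() == n && x.size() == n);
    for (std::size_t i = 0; i < n; ++i) relaxCell(A, b, x, i, fixedWidth);
    for (std::size_t i = n; i-- > 0;) relaxCell(A, b, x, i, fixedWidth);
}

// Assumed test problem: 1D chain with diagonal 2 and off-diagonals -1.
SparseSystem makeChain(std::size_t n)
{
    SparseSystem A;
    A.diag.assign(n, 2.0);
    A.rowStart.push_back(0);
    for (std::size_t i = 0; i < n; ++i) {
        if (i > 0)     { A.nbr.push_back(i - 1); A.coeff.push_back(-1.0); }
        if (i + 1 < n) { A.nbr.push_back(i + 1); A.coeff.push_back(-1.0); }
        A.rowStart.push_back(A.nbr.size());
    }
    return A;
}

// Runs `sweeps` SGS sweeps on the chain (n >= 2) and returns the maximum
// absolute error against the exact solution x = (1, ..., 1).
double solveChainError(std::size_t n, int sweeps, bool fixedWidth)
{
    const SparseSystem A = makeChain(n);
    std::vector<double> b(n, 0.0), x(n, 0.0);
    b.front() = 1.0;  // right-hand side chosen so that the exact
    b.back() = 1.0;   // solution is all ones
    for (int s = 0; s < sweeps; ++s) sgsSweep(A, b, x, fixedWidth);
    double err = 0.0;
    for (double xi : x) err = std::fmax(err, std::fabs(xi - 1.0));
    return err;
}
```

Calling `solveChainError(8, 500, true)` and `solveChainError(8, 500, false)` exercises the fixed-width and generic paths on the same system. Both should converge to the same solution, since the branch changes only how the neighbor sum is evaluated, not what is computed. Whether the fixed-width path actually compiles to SIMD instructions depends on the compiler and target, which this sketch does not control.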
Copyright of Computer Engineering & Science / Jisuanji Gongcheng yu Kexue is the property of Computer Engineering & Science and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source