Performance Optimizations for Parallel Modeling of Solidification with Dynamic Intensity of Computation
Kamil Halbiniak, Łukasz Szustak, Adam Kulawik, Paweł Gepner
Abstract: In our previous work, a parallel application dedicated to the numerical modeling of alloy solidification was developed and tested using various programming environments on hybrid shared-memory platforms with multicore CPUs and manycore Intel Xeon Phi accelerators. While this solution achieves reasonably good performance in the case of static intensity of computations, the results obtained for dynamic intensity of computations indicate considerable room for further optimization. In this work, we focus on improving the overall performance of the application with dynamic computational intensity. To this end, we propose to modify the application code significantly using the loop fusion technique. The proposed method permits us to execute all kernels in a single nested loop, as well as to reduce the number of conditional operators performed within a single time step. As a result, the proposed optimizations increase the application performance for all tested configurations of computing resources. The highest performance gain is achieved for a single Intel Xeon SP CPU, where the new code yields a speedup of up to 1.78 times over the original version. The developed method is vital for further optimization of the application performance: it allows introducing an algorithm for dynamic workload prediction and load balancing in successive time steps of the simulation. In this work, we propose a workload prediction algorithm with a 1D computational map.
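The loop fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the kernels `kernel_a` and `kernel_b`, the `active` mask that stands in for the dynamic computational intensity, and the flat row-major grid layout are all hypothetical, and the kernels are assumed to be pointwise (no stencil dependency between cells), which is what makes fusing them legal here. The unfused step runs one loop nest per kernel and tests the mask in each; the fused step runs a single loop nest and tests the mask once per cell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointwise kernels standing in for the application's kernels. */
static double kernel_a(double u) { return 0.5 * u + 1.0; }
static double kernel_b(double a) { return a * a; }

/* Unfused step: one loop nest per kernel, each re-testing the activity mask. */
void step_unfused(size_t n, const int *active, const double *u,
                  double *a, double *b)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            if (active[i * n + j])
                a[i * n + j] = kernel_a(u[i * n + j]);

    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            if (active[i * n + j])
                b[i * n + j] = kernel_b(a[i * n + j]);
}

/* Fused step: both kernels in a single loop nest, one mask test per cell. */
void step_fused(size_t n, const int *active, const double *u,
                double *a, double *b)
{
    assert(n > 0);
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            size_t k = i * n + j;
            if (active[k]) {
                double t = kernel_a(u[k]); /* reused without reloading a[k] */
                a[k] = t;
                b[k] = kernel_b(t);
            }
        }
    }
}

/* Returns 1 if both variants produce identical results on a small test grid. */
int fused_matches_unfused(void)
{
    enum { N = 8 };
    int active[N * N];
    double u[N * N];
    double a1[N * N] = {0}, b1[N * N] = {0};
    double a2[N * N] = {0}, b2[N * N] = {0};

    for (size_t k = 0; k < N * N; ++k) {
        active[k] = (k % 3 != 0); /* arbitrary pattern of inactive cells */
        u[k] = (double)k;
    }

    step_unfused(N, active, u, a1, b1);
    step_fused(N, active, u, a2, b2);

    for (size_t k = 0; k < N * N; ++k)
        if (a1[k] != a2[k] || b1[k] != b2[k])
            return 0;
    return 1;
}
```

In this sketch the fused variant halves the number of mask tests per time step and traverses the grid once instead of twice. If a kernel read neighboring cells written by an earlier kernel, fusion would need additional care that this example does not cover.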
|Publication size in sheets||0.55|
|Book||Wyrzykowski Roman, Deelman Ewa, Dongarra Jack, Karczewski Konrad (eds.): Parallel Processing and Applied Mathematics, 2020, Springer, ISBN 978-3-030-43228-7, [978-3-030-43229-4], DOI:10.1007/978-3-030-43229-4|
|Keywords in English||Numerical modeling of solidification; Phase-field method; Parallel programming; OpenMP; Workload prediction; Load balancing; Intel Xeon Phi; Intel Xeon Scalable processors|
|Score||= 20.0, 18-09-2020, ChapterFromConference|