TY - GEN
T1 - Acceleration of Monte-Carlo simulation on high performance computing platforms
AU - Wang, Pei Jen
AU - Liu, Cheng Yueh
AU - Tu, Chia Heng
AU - Lee, Chen Pang
AU - Hung, Shih Hao
N1 - Publisher Copyright:
© 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Copyright:
Copyright 2019 Elsevier B.V., All rights reserved.
PY - 2018/10/9
Y1 - 2018/10/9
AB - Monte Carlo methods are often used to solve computational problems with randomness. Random sampling helps avoid deterministic results, but it requires intensive computation to obtain the results. Several attempts have been made to boost the performance of Monte Carlo-based algorithms by taking advantage of parallel computers. In this paper, we use the photonic simulation application, MCML, as a case study to 1) parallelize the Monte Carlo method with OpenMP and vectorization, 2) compare the parallelization techniques, and 3) evaluate the parallelized programs on platforms with the Xeon Phi processor. In particular, the OpenMP version incorporates a vectorization technique that utilizes the AVX-512 vector instructions on the Xeon Phi processor. Our experimental results show that the OpenMP code achieves up to 345x speedup on the Xeon Phi processor, compared with the original code running on the Xeon E5 processor.
UR - http://www.scopus.com/inward/record.url?scp=85056852582&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056852582&partnerID=8YFLogxK
U2 - 10.1145/3264746.3264765
DO - 10.1145/3264746.3264765
M3 - Conference contribution
AN - SCOPUS:85056852582
T3 - Proceedings of the 2018 Research in Adaptive and Convergent Systems, RACS 2018
SP - 225
EP - 230
BT - Proceedings of the 2018 Research in Adaptive and Convergent Systems, RACS 2018
PB - Association for Computing Machinery, Inc
T2 - 2018 Conference Research in Adaptive and Convergent Systems, RACS 2018
Y2 - 9 October 2018 through 12 October 2018
ER -