TY - GEN
T1 - On the portability and performance of message-passing programs on embedded multicore platforms
AU - Hung, Shih Hao
AU - Chiu, Po Hsun
AU - Tu, Chia Heng
AU - Chou, Wei Ting
AU - Yang, Wen Long
PY - 2012
Y1 - 2012
N2 - Embedded multicore platforms have recently become popular, but software development for such platforms remains very challenging. While message-passing is a popular programming model for parallel applications, it is not adequately supported on current embedded multicore platforms. The situation resembles that of the 1980s and 1990s, when applications were hardly portable across parallel computers before the advent of MPI. Unfortunately, MPI is too large for most of today's embedded platforms. Moreover, message-passing functions need to exploit architectural features to deliver optimized performance, but such platform-specific optimizations often hurt portability. This paper addresses the portability and performance issues by designing a new message-passing library with a three-layer modular design. The top two layers are mostly platform-independent, and the bottom layer enables platform-specific optimizations. We discuss the performance issues in the paper and evaluate them with experimental results.
AB - Embedded multicore platforms have recently become popular, but software development for such platforms remains very challenging. While message-passing is a popular programming model for parallel applications, it is not adequately supported on current embedded multicore platforms. The situation resembles that of the 1980s and 1990s, when applications were hardly portable across parallel computers before the advent of MPI. Unfortunately, MPI is too large for most of today's embedded platforms. Moreover, message-passing functions need to exploit architectural features to deliver optimized performance, but such platform-specific optimizations often hurt portability. This paper addresses the portability and performance issues by designing a new message-passing library with a three-layer modular design. The top two layers are mostly platform-independent, and the bottom layer enables platform-specific optimizations. We discuss the performance issues in the paper and evaluate them with experimental results.
UR - http://www.scopus.com/inward/record.url?scp=84867411423&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867411423&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2012.110
DO - 10.1109/IPDPSW.2012.110
M3 - Conference contribution
AN - SCOPUS:84867411423
SN - 9780769546766
T3 - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
SP - 896
EP - 903
BT - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
T2 - 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Y2 - 21 May 2012 through 25 May 2012
ER -