Molecular dynamics (MD) simulation is an important and powerful tool for studying the physical and chemical properties of materials, particularly at the nanoscale. Conventional MD, as opposed to ab initio molecular dynamics, uses Newton's second law to predict particle positions at the next time step, with an empirical interatomic potential supplying the forces between neighboring particles. Large-scale MD systems, consisting of millions of atoms, require sophisticated parallelization of the computer codes to achieve acceptable computational efficiency. In this work, an IBM Blue Gene/P, a Linux PC cluster, and GPU/CUDA are used to test the computational performance of the MD code LAMMPS and other codes. Strong- and weak-scaling tests were performed to determine the parallel efficiency of the codes and to study the effects of system size and the number of computing cores. It is found that both strong and weak scaling are achievable for the tested problem sizes. Because of the reduced CPU clock frequency of the Blue Gene/P machine, its performance is inferior to that of the Linux cluster when the number of computing cores is small; the computational algorithms also strongly influence the performance of the codes on each machine. Blue Gene/P is extremely well suited to large problem sizes running on many processors. When the problem fits the GPU architecture, the performance of GPU/CUDA can be comparable to that of Blue Gene/P. In addition to the parallel performance, the accuracy of the physical results is verified to ensure that the MD simulations produce correct answers.
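The Newton's-second-law time stepping described above can be sketched with the velocity Verlet integrator, the scheme used by most classical MD codes (including LAMMPS's `fix nve`). This is a minimal illustration only: the harmonic force below is a stand-in for the empirical interatomic potentials (e.g. Lennard-Jones or EAM) that a production MD code would evaluate over neighbor lists.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v by `steps` time steps of size dt,
    using Newton's second law a = F/m (velocity Verlet scheme)."""
    f = force(x)
    for _ in range(steps):
        # Position update from current velocity and acceleration
        x = x + v * dt + 0.5 * (f / mass) * dt * dt
        # Recompute force at the new position, then update velocity
        # with the average of old and new accelerations
        f_new = force(x)
        v = v + 0.5 * (f + f_new) / mass * dt
        f = f_new
    return x, v

if __name__ == "__main__":
    # Illustrative harmonic oscillator: F(x) = -k x (not a real MD potential)
    k, m, dt = 1.0, 1.0, 0.01
    x, v = velocity_verlet(1.0, 0.0, lambda x: -k * x, m, dt, 1000)
    # Velocity Verlet is symplectic, so total energy should stay near 0.5
    energy = 0.5 * m * v * v + 0.5 * k * x * x
    print(abs(energy - 0.5) < 1e-3)  # prints True
```

The near-constant total energy is the usual sanity check that the integrator is implemented correctly; a real MD run adds neighbor searching and force evaluation over all atom pairs, which is where the parallelization effort discussed in this work goes.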
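The strong- and weak-scaling tests mentioned above reduce to two simple efficiency formulas: for strong scaling the total problem size is fixed and the ideal speedup on n cores is n, while for weak scaling the problem size grows with the core count and the ideal runtime is constant. A minimal sketch, with hypothetical wall-clock times chosen purely for illustration:

```python
def strong_scaling_efficiency(t1, tn, n):
    """Fixed total problem size: efficiency = T(1) / (n * T(n))."""
    return t1 / (n * tn)

def weak_scaling_efficiency(t1, tn):
    """Problem size scaled with core count: efficiency = T(1) / T(n)."""
    return t1 / tn

if __name__ == "__main__":
    # Hypothetical timings (seconds), not measured data from this work
    t1 = 1000.0         # single-core run
    t64_strong = 20.0   # 64 cores, same total number of atoms
    t64_weak = 1100.0   # 64 cores, 64x as many atoms
    print(round(strong_scaling_efficiency(t1, t64_strong, 64), 2))  # 0.78
    print(round(weak_scaling_efficiency(t1, t64_weak), 2))          # 0.91
```

Values near 1.0 indicate near-ideal parallel efficiency; communication overhead and load imbalance pull both metrics down as the core count grows.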