In this paper, we propose a new algorithm, Multiple Combinations and Multiple Phases (MCMP), for distributing program workload across the computers of a software distributed shared memory (SDSM) cluster. The algorithm considers both computational power and memory availability when determining the number of program threads assigned to each computer. In addition, its location policy selects only the computers that contribute to performance optimization to execute user programs, rather than simply distributing program threads across all of the computers available in the cluster. We have implemented the proposed algorithm on a test bed called Teamster. Our experimental results show that the proposed algorithm improves the performance of the test application by 20 to 30% compared with the other algorithms, and that it can efficiently identify the best-fit node combinations among the experimental system configurations for the test application at little cost.