Load balancing is critical to the performance of software distributed shared memory (DSM) systems. One way to address it is dynamic thread migration at runtime. However, thread migration can increase data-consistency communication, so an effective load balancing scheme must carefully choose both the threads to migrate and their destination nodes. This paper proposes a group-based load balancing scheme to resolve this problem. The scheme classifies overloaded nodes into a sender group and lightly loaded nodes into a receiver group, and then considers all threads of the sender group and all nodes of the receiver group for each thread migration decision. Experimental results show that the group-based scheme reduces communication more than previous methods do. The paper also resolves the problem of the high overhead that group-based schemes incur. As a result, minimizing the communication added by thread migration effectively improves the performance of the test programs.
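The group-based selection step can be illustrated with a minimal sketch. It is not the paper's implementation, and every name in it is hypothetical. It assumes that a node's load is its thread count, that nodes above or below the average load form the sender and receiver groups, and that the communication added by a migration can be approximated by page-sharing overlap: pages the thread shares with threads left behind on the source node, minus pages it shares with threads already on the destination node.

```python
from typing import Dict, List, Optional, Set, Tuple

Placement = Dict[str, List[str]]   # node name -> threads running on it
PageSets = Dict[str, Set[int]]     # thread name -> shared pages it accesses


def classify_nodes(loads: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split nodes into a sender group and a receiver group.

    Assumption: the threshold is the average load across all nodes.
    """
    avg = sum(loads.values()) / len(loads)
    senders = sorted(n for n, load in loads.items() if load > avg)
    receivers = sorted(n for n, load in loads.items() if load < avg)
    return senders, receivers


def added_communication(thread: str, src: str, dst: str,
                        placement: Placement, pages: PageSets) -> int:
    """Estimate the change in communication if `thread` moves from src to dst.

    Assumed cost model: sharing lost at the source minus sharing gained
    at the destination. Lower is better; negative means a net reduction.
    """
    left_behind = set().union(*(pages[t] for t in placement[src] if t != thread))
    at_dst = set().union(*(pages[t] for t in placement[dst]))
    return len(pages[thread] & left_behind) - len(pages[thread] & at_dst)


def choose_migration(placement: Placement,
                     pages: PageSets) -> Optional[Tuple[int, str, str, str]]:
    """Evaluate every (sender thread, receiver node) pair and pick the cheapest.

    Returns (cost, thread, source, destination), or None if no node is
    overloaded or no node is lightly loaded.
    """
    loads = {node: len(threads) for node, threads in placement.items()}
    senders, receivers = classify_nodes(loads)
    candidates = [
        (added_communication(t, s, r, placement, pages), t, s, r)
        for s in senders
        for t in placement[s]
        for r in receivers
    ]
    return min(candidates) if candidates else None


# Hypothetical example: node A is overloaded, nodes B and C are lightly loaded.
example_placement = {"A": ["t1", "t2", "t3", "t4"], "B": ["t5"], "C": ["t6"]}
example_pages = {
    "t1": {1, 2}, "t2": {1, 2}, "t3": {7, 8},
    "t4": {4, 5}, "t5": {7, 8}, "t6": {9},
}
print(choose_migration(example_placement, example_pages))
# prints (-2, 't3', 'A', 'B')
```

In this example, thread t3 shares no pages with the other threads on node A but shares two pages with t5 on node B, so moving it to B is the cheapest choice. The sketch also shows where the overhead of a group-based scheme comes from: each decision evaluates every pairing of a sender-group thread with a receiver-group node.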