Master-worker applications often demand high throughput. Such an application consists of master and worker processes: the masters generate tasks, and the workers compute them. A peer may implement the master process only, the worker process only, or both. A scalable way to implement master-worker applications is to form an overlay network in which masters deliver their tasks to workers over the overlay links, and each worker either computes the tasks it receives or forwards some of them to other workers. Different overlay constructions can yield different system throughputs. In this work, we study a fundamental question: how should the overlay be structured to maximize system throughput? We first propose a simple, basic overlay formation algorithm. We then develop a number of peering strategies. The formation algorithm is flexible enough to integrate these peering strategies, yielding different types of overlays. Our performance studies show that overlays that exploit network locality can perform better.