Background: How transcription factor binding sites (TFBSs) are distributed in the promoter region has implications for gene regulation. Previous studies used the translation start codon as the reference point to infer the TFBS distribution. However, it is biologically more relevant to use the transcription start site (TSS) as the reference point. In this study, we reexamined the spatial distribution of TFBSs, investigated various promoter features that may affect the distribution, and studied the effect of TFBS distribution on transcriptional regulation.

Results: We found a sharp peak in the distribution of TFBSs at ~115 bp upstream of the TSS, but no clear peak when the translation start codon was used as the reference point. Our analysis of sequence variation data among 63 yeast strains revealed very few deletion polymorphisms in the region between the distribution peak and the TSS, suggesting that the distances between TFBSs and the TSS have been selectively constrained in evolution. As in previous studies, we found that nucleosome occupancy and the presence or absence of a TATA box in the promoter region affect the TFBS distribution pattern. In addition, we found that the 5' UTR length correlates with the TFBS distribution pattern, and we showed that the TFBS distribution pattern affects gene transcription level and plasticity.

Conclusions: The spatial distribution of TFBSs obtained using the TSS as the reference point shows a much sharper peak than the distribution obtained using the translation start codon as the reference point. The TFBS distribution pattern is affected by nucleosome occupancy and the presence of a TATA box, and it in turn affects the transcription level and transcription plasticity of the gene.
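To make the idea of a reference point concrete, the following minimal Python sketch computes strand-aware upstream distances from binding sites to a TSS and bins them into a histogram. It is an illustration only, not the pipeline used in the study: the function names, the 10 bp bin width, the 500 bp window, and the toy coordinates are all assumptions chosen for the example.

```python
from collections import Counter


def upstream_distance(site_pos, tss_pos, strand):
    """Distance in bp of a site upstream of the TSS (negative = downstream).

    On the '+' strand, upstream means smaller coordinates than the TSS;
    on the '-' strand, upstream means larger coordinates.
    """
    if strand == "+":
        return tss_pos - site_pos
    if strand == "-":
        return site_pos - tss_pos
    raise ValueError(f"unknown strand: {strand!r}")


def distance_histogram(distances, bin_width=10, max_upstream=500):
    """Count upstream distances per bin, keyed by the lower edge of each bin."""
    counts = Counter()
    for d in distances:
        if 0 <= d < max_upstream:  # keep only sites in the upstream window
            counts[(d // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


def peak_bin(hist):
    """Lower edge of the bin with the most sites."""
    return max(hist, key=hist.get)


# Hypothetical toy data: two genes on opposite strands, three sites each.
genes = [
    {"tss": 1000, "strand": "+", "sites": [885, 888, 700]},
    {"tss": 5000, "strand": "-", "sites": [5115, 5118, 5300]},
]

distances = [
    upstream_distance(site, gene["tss"], gene["strand"])
    for gene in genes
    for site in gene["sites"]
]
hist = distance_histogram(distances)
print(hist, "peak bin starts at", peak_bin(hist), "bp upstream")
```

Replacing the `tss` coordinate with the start codon coordinate in the same sketch shifts each gene's distances by its 5' UTR length; because that length varies between genes, the shift would be expected to blur a peak defined relative to the TSS.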