A new algorithm to identify cooperative transcription factor pairs and developing performance indices and web tools to expedite prediction performance evaluation and comparison

  • 賴 福柔

Student thesis: Doctoral Thesis


Transcriptional regulation of gene expression is known to be highly connected through networks of cooperative transcription factors (TFs), based on genome-wide location analysis in yeast. A small number of cooperative TFs can set up very complex spatial and temporal patterns of gene expression to accomplish combinatorial regulation of a gene simultaneously or under different conditions. Identifying cooperativity among transcription factors helps in understanding the biological relevance of the TFs under investigation. In the recent decade, various types of TF-TF interactions that contribute to positive or negative synergy in regulating gene expression have been studied and modeled, and many algorithms have been proposed to identify cooperative TF pairs using one or several types of experimental data, including ChIP-chip, TF binding site (TFBS), gene expression, TF knockout, and protein-protein interaction (PPI) data. However, nucleosome occupancy data have not yet been used for this research topic, even though several studies have revealed an association between nucleosomes and TFBSs. Furthermore, although many algorithms have been proposed, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of a lack of sufficient performance indices and adequate overall performance scores. In the first part of the dissertation, we develop a novel method to infer the cooperativity between two TFs by integrating documented TF-gene regulation, TFBS, and nucleosome occupancy data. The results show that many of our predictions are validated by the literature and that the method outperforms 11 existing methods, suggesting that it is effective in identifying cooperative TF pairs in yeast. In the second part of the dissertation, we adopt or propose eight performance indices and design two overall performance scores to compare the performance of existing algorithms for predicting cooperative TF pairs. The performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. Nevertheless, to use the framework, researchers first have to put considerable effort into constructing it, including collecting and processing multiple genome-wide datasets from the public domain, collecting the lists of predicted cooperative TF pairs from existing algorithms in the literature, and writing a large amount of code to implement the eight performance indices. To save researchers time and effort, in the third part of the dissertation we further develop a web tool that implements the performance comparison framework, featuring fast data processing, a comprehensive performance comparison, and an easy-to-use web interface. In addition, we construct a database website in which, given a TF, its cooperative TFs documented in the literature can be obtained, and each cooperative TF is provided with validating information retrieved from public databases. These supporting data for a TF-TF pair include literature support, physical or genetic protein-protein interactions, gene co-citations, common Gene Ontology (GO) terms, and common target genes. With the help of the framework, the web tool, and the web database we developed, researchers who conduct cooperative TF pair prediction can expedite their research by investigating literature-supported data early and evaluating prediction performance.
Date of Award: 2015 Jul 10
Original language: English
Supervisor: Yueh-Min Huang
