TY - JOUR
T1 - Identifying transcriptional start sites of human microRNAs based on high-throughput sequencing data
AU - Chien, Chia Hung
AU - Sun, Yi Ming
AU - Chang, Wen Chi
AU - Chiang-Hsieh, Pei Yun
AU - Lee, Tzong Yi
AU - Tsai, Wei Chih
AU - Horng, Jorng Tzong
AU - Tsou, Ann Ping
AU - Huang, Hsien Da
N1 - Funding Information:
National Science Council of the Republic of China (Contract No. NSC 98-2311-B-009-004-MY3 and NSC 99-2627-B-009-003); UST-UCSD International Center of Excellence in Advanced Bio-engineering sponsored by the Taiwan National Science Council I-RiCE Program (NSC-99-2911-I-010-101, in part); MOE ATU (in part). Funding for open access charge: National Science Council of the Republic of China.
PY - 2011/11
Y1 - 2011/11
N2 - MicroRNAs (miRNAs) are critical small non-coding RNAs that regulate gene expression by hybridizing to the 3′-untranslated regions (3′-UTRs) of target mRNAs, thereby controlling diverse biological processes at the post-transcriptional level. How miRNA genes are regulated receives considerable attention because it directly affects miRNA-mediated gene regulatory networks. Although numerous prediction models have been developed for identifying miRNA promoters or transcriptional start sites (TSSs), most of them lack experimental validation and are inadequate to elucidate relationships between miRNA genes and transcription factors (TFs). Here, we integrate three experimental datasets, including cap analysis of gene expression (CAGE) tags, TSS Seq libraries and the H3K4me3 chromatin signature, all derived from high-throughput sequencing analysis of gene initiation, to provide direct evidence of miRNA TSSs, thus establishing an experiment-based resource of human miRNA TSSs, named miRStart. Moreover, a machine-learning-based Support Vector Machine (SVM) model is developed to systematically identify representative TSSs for each miRNA gene. Finally, to demonstrate the effectiveness of the proposed resource, an important human intergenic miRNA, hsa-miR-122, is selected for experimental validation of its putative TSS owing to its high expression in normal liver. In conclusion, this work identified 847 human miRNA TSSs (292 of them are clustered to 70 TSSs of miRNA clusters) using high-throughput sequencing data from TSS-relevant experiments, and establishes a valuable resource for biologists conducting advanced research on miRNA-mediated regulatory networks.
AB - MicroRNAs (miRNAs) are critical small non-coding RNAs that regulate gene expression by hybridizing to the 3′-untranslated regions (3′-UTRs) of target mRNAs, thereby controlling diverse biological processes at the post-transcriptional level. How miRNA genes are regulated receives considerable attention because it directly affects miRNA-mediated gene regulatory networks. Although numerous prediction models have been developed for identifying miRNA promoters or transcriptional start sites (TSSs), most of them lack experimental validation and are inadequate to elucidate relationships between miRNA genes and transcription factors (TFs). Here, we integrate three experimental datasets, including cap analysis of gene expression (CAGE) tags, TSS Seq libraries and the H3K4me3 chromatin signature, all derived from high-throughput sequencing analysis of gene initiation, to provide direct evidence of miRNA TSSs, thus establishing an experiment-based resource of human miRNA TSSs, named miRStart. Moreover, a machine-learning-based Support Vector Machine (SVM) model is developed to systematically identify representative TSSs for each miRNA gene. Finally, to demonstrate the effectiveness of the proposed resource, an important human intergenic miRNA, hsa-miR-122, is selected for experimental validation of its putative TSS owing to its high expression in normal liver. In conclusion, this work identified 847 human miRNA TSSs (292 of them are clustered to 70 TSSs of miRNA clusters) using high-throughput sequencing data from TSS-relevant experiments, and establishes a valuable resource for biologists conducting advanced research on miRNA-mediated regulatory networks.
UR - http://www.scopus.com/inward/record.url?scp=82255185653&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=82255185653&partnerID=8YFLogxK
U2 - 10.1093/nar/gkr604
DO - 10.1093/nar/gkr604
M3 - Article
C2 - 21821656
AN - SCOPUS:82255185653
SN - 0305-1048
VL - 39
SP - 9345
EP - 9356
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 21
ER -