Abstract
Motivation: MicroRNAs (miRNAs) are endogenous non-coding small RNAs (of about 22 nucleotides) that play an important role in the post-transcriptional regulation of gene expression via either mRNA cleavage or translation inhibition. Several machine learning-based approaches have been developed to identify novel miRNAs from next generation sequencing (NGS) data. Most of these methods require precursor or genomic sequences as references. However, the unavailability of genomic sequences is often a limitation in miRNA discovery in non-model plants. A systematic approach to identify novel miRNAs without reference sequences is thus necessary.

Results: In this study, an effective method was developed to identify miRNAs from non-model plants based only on NGS datasets. The miRNA prediction model was trained with several duplex structure-related features of mature miRNAs and their passenger strands using a support vector machine algorithm. The accuracy on the independent test sets reached 96.61% and 93.04% for dicots (Arabidopsis) and monocots (rice), respectively. Furthermore, real small RNA sequencing data from orchids were tested in this study. Twenty-one predicted orchid miRNAs were selected for experimental validation, and 18 of them were confirmed by qRT-PCR. This novel approach was also compiled as a user-friendly program called microRPM (miRNA Prediction Model).

Availability and implementation: This resource is freely available at http://microRPM.itps.ncku.edu.tw.

Contact: [email protected] or [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
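
To illustrate what "duplex structure-related features" of a mature miRNA and its passenger strand might look like, the following is a minimal sketch in Python. The feature set, the function name `duplex_features`, and the fixed 2-nt 3' overhang are assumptions chosen for illustration; they are not the features or code used by microRPM. The resulting feature vector could, in principle, be passed to any support vector machine implementation.

```python
# Hypothetical illustration only: these features are NOT taken from microRPM.
# Assumption: the two strands form an antiparallel duplex in which the last
# `overhang` nucleotides at the 3' end of each strand are unpaired.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def duplex_features(mature, passenger, overhang=2):
    """Return simple pairing features for a mature/passenger strand pair.

    Both sequences are given 5'->3'. DNA input (T) is converted to RNA (U).
    """
    mature = mature.upper().replace("T", "U")
    passenger = passenger.upper().replace("T", "U")

    # Remove the unpaired 3' overhang from each strand.
    m_core = mature[: len(mature) - overhang]
    p_core = passenger[: len(passenger) - overhang]

    # Antiparallel pairing: read the passenger core 3'->5'.
    p_rev = p_core[::-1]
    n = min(len(m_core), len(p_rev))
    pairs = list(zip(m_core[:n], p_rev[:n]))

    wc = sum(1 for p in pairs if p in WATSON_CRICK)
    wobble = sum(1 for p in pairs if p in WOBBLE)
    mismatch = n - wc - wobble
    gc = sum(1 for base in mature if base in "GC")

    return {
        "watson_crick": wc,
        "wobble": wobble,
        "mismatch": mismatch,
        "paired_fraction": (wc + wobble) / n if n else 0.0,
        "mature_gc_content": gc / len(mature) if mature else 0.0,
        "length_difference": abs(len(mature) - len(passenger)),
    }


if __name__ == "__main__":
    # Toy sequences, not real miRNAs.
    print(duplex_features("AAAAGGGGCC", "CCCCUUUUAA"))
```

This sketch ignores bulges and internal loops; a real pipeline would more likely rely on an RNA folding tool to score the duplex.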
Original language | English |
---|---|
Pages (from-to) | 1108-1115 |
Number of pages | 8 |
Journal | Bioinformatics |
Volume | 34 |
Issue number | 7 |
DOIs | |
Publication status | Published - Apr 1 2018 |
All Science Journal Classification (ASJC) codes
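- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics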