MicroRPM: A microRNA prediction model based only on plant small RNA sequencing data

Kuan Chieh Tseng, Yi Fan Chiang-Hsieh, Hsuan Pai, Chi Nga Chow, Shu Chuan Lee, Han Qin Zheng, Po Li Kuo, Guan Zhen Li, Yu Cheng Hung, Na Sheng Lin, Wen-Chi Chang

Research output: Contribution to journal › Article

Abstract

Motivation: MicroRNAs (miRNAs) are endogenous non-coding small RNAs (of about 22 nucleotides) that play an important role in the post-transcriptional regulation of gene expression via either mRNA cleavage or translation inhibition. Several machine learning-based approaches have been developed to identify novel miRNAs from next-generation sequencing (NGS) data. Most of these methods require precursor or genomic sequences as references. However, the unavailability of genomic sequences is often a limitation in miRNA discovery in non-model plants. A systematic approach to identify novel miRNAs without reference sequences is thus necessary.

Results: In this study, a method was developed to identify miRNAs from non-model plants based only on NGS datasets. The miRNA prediction model was trained with several duplex structure-related features of mature miRNAs and their passenger strands using a support vector machine algorithm. The accuracy on the independent test sets reached 96.61% for dicots (Arabidopsis) and 93.04% for monocots (rice). Furthermore, real small RNA sequencing data from orchids were tested in this study. Twenty-one predicted orchid miRNAs were selected for experimental validation, and 18 of them were confirmed by qRT-PCR. This approach was also compiled as a user-friendly program called microRPM (miRNA Prediction Model).

Availability and implementation: This resource is freely available at http://microRPM.itps.ncku.edu.tw.

Contact: nslin@sinica.edu.tw or sarah321@mail.ncku.edu.tw

Supplementary information: Supplementary data are available at Bioinformatics online.
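To make the idea of "duplex structure-related features" concrete, the following is a minimal sketch of how a candidate mature miRNA and its passenger strand might be turned into a few numeric features. The feature set, the function name `duplex_features`, the 2-nt 3' overhang default, and the toy sequences are all assumptions made for illustration; they are not the features or code used by microRPM. In a pipeline like the one the abstract describes, vectors of this kind would be passed to a support vector machine for classification.

```python
# Hypothetical illustration only: naive duplex features for a candidate
# mature-miRNA / passenger-strand pair. Not the microRPM feature set.

# Watson-Crick pairs plus the G-U wobble pair found in RNA duplexes.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def duplex_features(mature: str, passenger: str, overhang: int = 2) -> dict:
    """Return simple features, assuming `overhang`-nt 3' overhangs on both strands.

    Both sequences are given 5'->3'. No bulges or gaps are modeled: the
    strands are aligned antiparallel position by position.
    """
    mature = mature.upper().replace("T", "U")
    passenger = passenger.upper().replace("T", "U")

    # Drop each strand's 3' overhang, then reverse the passenger so that
    # position i of the mature stem faces its antiparallel partner.
    stem_m = mature[: len(mature) - overhang]
    stem_p = passenger[: len(passenger) - overhang][::-1]

    # zip() stops at the shorter stem if the strands differ in length.
    facing = list(zip(stem_m, stem_p))
    paired = sum(1 for pair in facing if pair in PAIRS)
    mismatches = len(facing) - paired

    gc = sum(1 for base in mature if base in "GC")
    return {
        "length": len(mature),
        "paired": paired,
        "mismatches": mismatches,
        "paired_fraction": paired / len(facing) if facing else 0.0,
        "gc_content": gc / len(mature) if mature else 0.0,
    }


if __name__ == "__main__":
    # Toy sequences constructed to pair perfectly; not real miRNAs.
    print(duplex_features("GGAUCCAAUU", "UUGGAUCCAA"))
```

Real duplexes contain bulges and asymmetric loops, so a practical feature extractor would more likely rely on an RNA secondary-structure tool than on this position-by-position comparison.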

Original language: English
Pages (from-to): 1108-1115
Number of pages: 8
Journal: Bioinformatics
Volume: 34
Issue number: 7
DOI: https://doi.org/10.1093/bioinformatics/btx725
Publication status: Published, 2018 Apr 1

Fingerprint

Plant RNA
RNA Sequence Analysis
MicroRNA
RNA
MicroRNAs
Prediction Model
Sequencing
Model-based
Bioinformatics
Nucleotides
Gene expression
Support vector machines
Learning systems
Availability
Genomics
Experiments
Arabidopsis
Small Untranslated RNA
Transcriptional Regulation
Gene Expression Regulation

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Tseng, K. C., Chiang-Hsieh, Y. F., Pai, H., Chow, C. N., Lee, S. C., Zheng, H. Q., ... Chang, W-C. (2018). MicroRPM: A microRNA prediction model based only on plant small RNA sequencing data. Bioinformatics, 34(7), 1108-1115. https://doi.org/10.1093/bioinformatics/btx725
Tseng, Kuan Chieh ; Chiang-Hsieh, Yi Fan ; Pai, Hsuan ; Chow, Chi Nga ; Lee, Shu Chuan ; Zheng, Han Qin ; Kuo, Po Li ; Li, Guan Zhen ; Hung, Yu Cheng ; Lin, Na Sheng ; Chang, Wen-Chi. / MicroRPM : A microRNA prediction model based only on plant small RNA sequencing data. In: Bioinformatics. 2018 ; Vol. 34, No. 7. pp. 1108-1115.
@article{1e285949b6154169946bfb885c486c1b,
title = "MicroRPM: A microRNA prediction model based only on plant small RNA sequencing data",
author = "Tseng, {Kuan Chieh} and Chiang-Hsieh, {Yi Fan} and Hsuan Pai and Chow, {Chi Nga} and Lee, {Shu Chuan} and Zheng, {Han Qin} and Kuo, {Po Li} and Li, {Guan Zhen} and Hung, {Yu Cheng} and Lin, {Na Sheng} and Wen-Chi Chang",
year = "2018",
month = "4",
day = "1",
doi = "10.1093/bioinformatics/btx725",
language = "English",
volume = "34",
pages = "1108--1115",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "7",

}

Tseng, KC, Chiang-Hsieh, YF, Pai, H, Chow, CN, Lee, SC, Zheng, HQ, Kuo, PL, Li, GZ, Hung, YC, Lin, NS & Chang, W-C 2018, 'MicroRPM: A microRNA prediction model based only on plant small RNA sequencing data', Bioinformatics, vol. 34, no. 7, pp. 1108-1115. https://doi.org/10.1093/bioinformatics/btx725

MicroRPM : A microRNA prediction model based only on plant small RNA sequencing data. / Tseng, Kuan Chieh; Chiang-Hsieh, Yi Fan; Pai, Hsuan; Chow, Chi Nga; Lee, Shu Chuan; Zheng, Han Qin; Kuo, Po Li; Li, Guan Zhen; Hung, Yu Cheng; Lin, Na Sheng; Chang, Wen-Chi.

In: Bioinformatics, Vol. 34, No. 7, 01.04.2018, p. 1108-1115.

TY - JOUR

T1 - MicroRPM

T2 - A microRNA prediction model based only on plant small RNA sequencing data

AU - Tseng, Kuan Chieh

AU - Chiang-Hsieh, Yi Fan

AU - Pai, Hsuan

AU - Chow, Chi Nga

AU - Lee, Shu Chuan

AU - Zheng, Han Qin

AU - Kuo, Po Li

AU - Li, Guan Zhen

AU - Hung, Yu Cheng

AU - Lin, Na Sheng

AU - Chang, Wen-Chi

PY - 2018/4/1

Y1 - 2018/4/1

UR - http://www.scopus.com/inward/record.url?scp=85045845770&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045845770&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btx725

DO - 10.1093/bioinformatics/btx725

M3 - Article

C2 - 29136092

AN - SCOPUS:85045845770

VL - 34

SP - 1108

EP - 1115

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 7

ER -

Tseng KC, Chiang-Hsieh YF, Pai H, Chow CN, Lee SC, Zheng HQ et al. MicroRPM: A microRNA prediction model based only on plant small RNA sequencing data. Bioinformatics. 2018 Apr 1;34(7):1108-1115. https://doi.org/10.1093/bioinformatics/btx725