Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm

Chih Hung Hsieh, Tien-Hao Chang, Cheng Hao Hsueh, Chi Yeh Wu, Yen Jen Oyang

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Background: MicroRNAs (miRNAs) are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they do not depend on homology information and provide broader applications than comparative approaches. Kernel based classifiers such as support vector machine (SVM) are extensively adopted in these ab initio approaches due to the prediction performance they achieved. On the other hand, logic based classifiers such as decision tree, of which the constructed model is interpretable, have attracted less attention.

Results: This article reports the design of a predictor of pre-miRNAs with a novel kernel based classifier named the generalized Gaussian density estimator (G2DE) based classifier. The G2DE is a kernel based algorithm designed to provide interpretability by utilizing a few but representative kernels for constructing the classification model. The performance of the proposed predictor has been evaluated with 692 human pre-miRNAs and has been compared with two kernel based and two logic based classifiers. The experimental results show that the proposed predictor is capable of achieving prediction performance comparable to those delivered by the prevailing kernel based classification algorithms, while providing the user with an overall picture of the distribution of the data set.

Conclusion: Software predictors that identify pre-miRNAs in genomic sequences have been exploited by biologists to facilitate molecular biology research in recent years. The G2DE employed in this study can deliver prediction accuracy comparable with the state-of-the-art kernel based machine learning algorithms. Furthermore, biologists can obtain valuable insights about the different characteristics of the sequences of pre-miRNAs with the models generated by the G2DE based predictor.
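The abstract's idea of building a classifier from a few representative Gaussian kernels, each of which can be inspected directly, can be illustrated with a minimal sketch. This is not the authors' G2DE algorithm: it swaps in the simplest related technique, a class-conditional density classifier that fits one full-covariance Gaussian per class from the sample mean and covariance. The function names (`fit_gaussian`, `log_density`, `fit`, `predict`) and the synthetic two-feature data are assumptions made for illustration only, not pre-miRNA features from the study.

```python
import numpy as np

def fit_gaussian(X):
    """Fit one full-covariance Gaussian to the rows of X."""
    mean = X.mean(axis=0)
    # A small ridge keeps the covariance invertible on small samples.
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mean, cov

def log_density(X, mean, cov):
    """Log of the multivariate normal density at each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of each row from the mean.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def fit(X, y):
    """Return {label: (log_prior, mean, cov)}, one Gaussian per class."""
    model = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean, cov = fit_gaussian(Xc)
        model[label] = (np.log(len(Xc) / len(X)), mean, cov)
    return model

def predict(model, X):
    """Assign each row of X to the class with the highest log posterior score."""
    labels = list(model)
    scores = np.column_stack([
        model[k][0] + log_density(X, model[k][1], model[k][2])
        for k in labels
    ])
    return np.array(labels)[scores.argmax(axis=1)]

# Synthetic stand-in data (hypothetical, NOT from the paper):
# two well-separated clusters in a 2-D feature space.
rng = np.random.default_rng(0)
neg = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
pos = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(200, 2))
X = np.vstack([neg, pos])
y = np.array([0] * 200 + [1] * 200)

model = fit(X, y)
accuracy = (predict(model, X) == y).mean()

# The fitted means and covariances are the whole model, so they can be
# read off directly as a summary of where each class sits.
for label, (log_prior, mean, cov) in model.items():
    print(label, np.round(mean, 2))
```

Because the model consists only of one mean vector and one covariance matrix per class, printing them gives the kind of overall picture of the data distribution that the abstract attributes to a small set of kernels. A closer approximation would use several components per class, which this sketch does not attempt.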

Original language: English
Article number: S52
Journal: BMC Bioinformatics
Volume: 11
Issue number: SUPPL. 1
DOI: 10.1186/1471-2105-11-S1-S52
Publication status: Published - 2010 Jan 18

Fingerprint

MicroRNA
Density Estimation
Estimation Algorithms
MicroRNAs
Precursor
Classifiers
kernel
Predictors
Classifier
Performance Prediction
Molecular biology
Decision trees
RNA
Logic
Gene expression
Learning algorithms
Untranslated RNA
Support vector machines
Decision Trees
Transcriptional Regulation

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Hsieh, Chih Hung; Chang, Tien-Hao; Hsueh, Cheng Hao; Wu, Chi Yeh; Oyang, Yen Jen. / Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm. In: BMC Bioinformatics. 2010; Vol. 11, No. SUPPL. 1.
@article{d2ba84397b9a41488558ee77042ed42b,
title = "Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm",
author = "Hsieh, {Chih Hung} and Tien-Hao Chang and Hsueh, {Cheng Hao} and Wu, {Chi Yeh} and Oyang, {Yen Jen}",
year = "2010",
month = "1",
day = "18",
doi = "10.1186/1471-2105-11-S1-S52",
language = "English",
volume = "11",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "SUPPL. 1",
}

Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm. / Hsieh, Chih Hung; Chang, Tien-Hao; Hsueh, Cheng Hao; Wu, Chi Yeh; Oyang, Yen Jen.

In: BMC Bioinformatics, Vol. 11, No. SUPPL. 1, S52, 18.01.2010.

TY - JOUR

T1 - Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm

AU - Hsieh, Chih Hung

AU - Chang, Tien-Hao

AU - Hsueh, Cheng Hao

AU - Wu, Chi Yeh

AU - Oyang, Yen Jen

PY - 2010/1/18

Y1 - 2010/1/18

AB - Background: MicroRNAs (miRNAs) are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they do not depend on homology information and provide broader applications than comparative approaches. Kernel based classifiers such as support vector machine (SVM) are extensively adopted in these ab initio approaches due to the prediction performance they achieved. On the other hand, logic based classifiers such as decision tree, of which the constructed model is interpretable, have attracted less attention. Results: This article reports the design of a predictor of pre-miRNAs with a novel kernel based classifier named the generalized Gaussian density estimator (G2DE) based classifier. The G2DE is a kernel based algorithm designed to provide interpretability by utilizing a few but representative kernels for constructing the classification model. The performance of the proposed predictor has been evaluated with 692 human pre-miRNAs and has been compared with two kernel based and two logic based classifiers. The experimental results show that the proposed predictor is capable of achieving prediction performance comparable to those delivered by the prevailing kernel based classification algorithms, while providing the user with an overall picture of the distribution of the data set. Conclusion: Software predictors that identify pre-miRNAs in genomic sequences have been exploited by biologists to facilitate molecular biology research in recent years. The G2DE employed in this study can deliver prediction accuracy comparable with the state-of-the-art kernel based machine learning algorithms. Furthermore, biologists can obtain valuable insights about the different characteristics of the sequences of pre-miRNAs with the models generated by the G2DE based predictor.

UR - http://www.scopus.com/inward/record.url?scp=75149198083&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=75149198083&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-11-S1-S52

DO - 10.1186/1471-2105-11-S1-S52

M3 - Article

VL - 11

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - SUPPL. 1

M1 - S52

ER -