A SPARSE NEGATIVE BINOMIAL CLASSIFIER WITH COVARIATE ADJUSTMENT FOR RNA-SEQ DATA

Tanbin Rahman, Hsin En Huang, Yujia Li, An Shun Tai, Wen Ping Hsieh, Colleen A. McClung, George Tseng

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Supervised machine learning methods are increasingly used in biomedical research and clinical practice. In transcriptomic applications, RNA-seq has become the dominant platform and has gradually replaced traditional microarrays because of its reduced background noise and increased digital precision. Most existing machine learning methods, however, are designed for the continuous intensities of microarray data and are not suitable for RNA-seq count data. In this paper we develop a negative binomial model within the generalized linear model framework, with double regularization for gene and covariate sparsity, to accommodate three key elements: adequate modeling of count data with overdispersion, gene selection, and adjustment for covariate effects. The proposed sparse negative binomial classifier (snbClass) is evaluated in simulations and in two real applications, multidisease postmortem brain tissue RNA-seq data and cervical tumor miRNA-seq data, to demonstrate its superior performance in prediction accuracy and feature selection.
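To make the three elements named in the abstract concrete, the following is a minimal sketch, not the authors' snbClass formulation. It assumes a log-link mean with a per-gene baseline, a class effect and a single scalar covariate, a per-gene dispersion `phi` (variance = mu + phi * mu^2), and L1 penalties standing in for the "double regularization". All function and parameter names (`nb_logpmf`, `penalized_nll`, `lam_gene`, `lam_cov`) are hypothetical.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion phi, so that variance = mu + phi * mu**2."""
    r = 1.0 / phi  # "size" parameter of the negative binomial
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))


def penalized_nll(counts, labels, covariate, base, class_eff, cov_eff,
                  phi, lam_gene, lam_cov):
    """Negative log-likelihood plus two L1 penalties (assumed form).

    counts[i][g]    : count for sample i, gene g
    labels[i]       : class index of sample i
    covariate[i]    : one scalar covariate per sample (assumption)
    base[g]         : baseline log-mean of gene g
    class_eff[k][g] : class-k effect on the log-mean of gene g
    cov_eff[g]      : covariate coefficient of gene g
    phi[g]          : dispersion of gene g
    """
    nll = 0.0
    for i, row in enumerate(counts):
        for g, y in enumerate(row):
            log_mu = (base[g] + class_eff[labels[i]][g]
                      + cov_eff[g] * covariate[i])
            nll -= nb_logpmf(y, math.exp(log_mu), phi[g])
    # Penalty 1: shrink class effects toward zero (gene selection).
    gene_pen = lam_gene * sum(abs(b) for row in class_eff for b in row)
    # Penalty 2: shrink covariate coefficients toward zero (covariate sparsity).
    cov_pen = lam_cov * sum(abs(c) for c in cov_eff)
    return nll + gene_pen + cov_pen
```

In this sketch a gene whose class effects are all shrunk to zero no longer distinguishes the classes and is effectively deselected, while a zero covariate coefficient means no adjustment is applied for that gene. The penalties actually used in the paper may differ from the plain L1 terms assumed here.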

Original language: English
Pages (from-to): 1071-1089
Number of pages: 19
Journal: Annals of Applied Statistics
Volume: 16
Issue number: 2
Publication status: Published - 2022 Jun

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modelling and Simulation
  • Statistics, Probability and Uncertainty
