TY - JOUR
T1 - Sample size calculation for differential expression analysis of RNA-seq data under Poisson distribution
AU - Li, Chung I.
AU - Su, Pei Fang
AU - Guo, Yan
AU - Shyr, Yu
PY - 2013
Y1 - 2013
N2 - Sample size determination is an important issue in the experimental design of biomedical research. Because of the complexity of RNA-seq experiments, however, the field currently lacks a sample size method widely applicable to differential expression studies utilising RNA-seq technology. In this report, we propose several methods for sample size calculation for single-gene differential expression analysis of RNA-seq data under Poisson distribution. These methods are then extended to multiple genes, with consideration for addressing the multiple testing problem by controlling false discovery rate. Moreover, most of the proposed methods allow for closed-form sample size formulas with specification of the desired minimum fold change and minimum average read count, and thus are not computationally intensive. Simulation studies to evaluate the performance of the proposed sample size formulas are presented; the results indicate that our methods work well, with achievement of desired power. Finally, our sample size calculation methods are applied to three real RNA-seq data sets.
UR - http://www.scopus.com/inward/record.url?scp=84885030144&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84885030144&partnerID=8YFLogxK
U2 - 10.1504/IJCBDD.2013.056830
DO - 10.1504/IJCBDD.2013.056830
M3 - Article
C2 - 24088268
AN - SCOPUS:84885030144
SN - 1756-0756
VL - 6
SP - 358
EP - 375
JO - International Journal of Computational Biology and Drug Design
JF - International Journal of Computational Biology and Drug Design
IS - 4
ER -