One-step extrapolation of the prediction performance of a gene signature derived from a small study

Liang-Yi Wang, Wen Chung Lee

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Objective: Microarray-related studies often involve a very large number of genes and a small sample size. Cross-validation or bootstrapping is therefore imperative to obtain a fair assessment of the prediction/classification performance of a gene signature. A deficiency of these methods is the reduced training sample size, caused by the partition process in cross-validation and by sampling with replacement in bootstrapping. To address this problem, we aim to obtain a prediction performance estimate that strikes a good balance between bias and variance and has a small root mean squared error. Methods: We propose to make a one-step extrapolation from the fitted learning curve to estimate the prediction/classification performance of the model trained on all the samples. Results: Simulation studies show that the method strikes a good balance between bias and variance and has a small root mean squared error. Three microarray data sets are used for demonstration. Conclusions: Our method is advocated for estimating the prediction performance of a gene signature derived from a small study.
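
The core idea above (fitting a learning curve to performance estimates obtained at reduced training sizes, then extrapolating one step to the full sample size) can be illustrated with a minimal sketch. This is not the authors' exact procedure: the inverse-power-law curve form, the logistic-regression classifier, the synthetic data, and the chosen training sizes are illustrative assumptions only.

# Illustrative sketch only (not the paper's exact algorithm): estimate
# classification accuracy at several reduced training sizes by repeated
# train/test splitting, fit a parametric learning curve, and extrapolate
# one step to the full sample size n.
import numpy as np
from scipy.optimize import curve_fit
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit

def learning_curve(n, a, b, c):
    # Assumed inverse-power-law form: accuracy(n) = a - b * n**(-c)
    return a - b * np.power(n, -c)

# Synthetic small-sample, high-dimensional data (stand-in for microarray data).
X, y = make_classification(n_samples=60, n_features=200, n_informative=10,
                           random_state=0)
n_full = X.shape[0]

train_sizes = np.array([20, 30, 40, 50])   # reduced training sizes
mean_acc = []
for m in train_sizes:
    # Repeatedly train on m samples and score on the held-out remainder.
    splitter = StratifiedShuffleSplit(n_splits=50, train_size=int(m),
                                      random_state=0)
    scores = [LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
              .score(X[te], y[te]) for tr, te in splitter.split(X, y)]
    mean_acc.append(np.mean(scores))

# Fit the learning curve to the observed points and extrapolate to n_full.
params, _ = curve_fit(learning_curve, train_sizes, mean_acc,
                      p0=[0.9, 1.0, 0.5], maxfev=10000)
print("Extrapolated accuracy at n = %d: %.3f"
      % (n_full, learning_curve(n_full, *params)))

In this sketch, the "one-step" extrapolation amounts to evaluating the fitted curve at the full sample size n rather than at the largest reduced training size.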

Original language: English
Article number: e007170
Journal: BMJ Open
Volume: 5
Issue number: 4
DOI: 10.1136/bmjopen-2014-007170
Publication status: Published - 2015 Jan 1
Externally published: Yes

Fingerprint

  • Sample Size
  • Genes
  • Learning Curve
  • Datasets

All Science Journal Classification (ASJC) codes

  • Medicine (all)

Cite this


Wang, Liang-Yi; Lee, Wen Chung. One-step extrapolation of the prediction performance of a gene signature derived from a small study. BMJ Open, Vol. 5, No. 4, e007170, 2015. BMJ Publishing Group. ISSN 2044-6055. DOI: 10.1136/bmjopen-2014-007170. PMID: 25888476. Scopus: 84928264580.
