Fast Bayesian variable screenings for binary response regressions with small sample size

Research output: Contribution to journal › Article

Abstract

Screening procedures play an important role in data analysis, especially in high-throughput biological studies where the datasets consist of more covariates than independent subjects. In this article, a Bayesian screening procedure is introduced for the binary response models with logit and probit links. In contrast to many screening rules based on marginal information involving one or a few covariates, the proposed Bayesian procedure simultaneously models all covariates and uses closed-form screening statistics. Specifically, we use the posterior means of the regression coefficients as screening statistics; by imposing a generalized g-prior on the regression coefficients, we derive the analytical form of their posterior means and compute the screening statistics without Markov chain Monte Carlo implementation. We evaluate the utility of the proposed Bayesian screening method using simulations and real data analysis. When the sample size is small, the simulation results suggest improved performance with comparable computational cost.
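The abstract does not give the closed-form expression, so the following is only a minimal sketch of the general idea (rank covariates by a closed-form posterior mean, with no MCMC), not the authors' derivation. It assumes a Gaussian working model with unit noise variance applied to the centered binary response, and a ridge-stabilized g-prior whose precision is (X'X + ridge·I)/g so that it stays well defined when there are more covariates than subjects. The function names, the `ridge` parameter, and the default g = n are all hypothetical choices made for illustration.

```python
import numpy as np


def screening_statistics(X, y, g=None, ridge=1.0):
    """Closed-form posterior means under an illustrative Gaussian working model.

    Assumed prior (not taken from the paper): beta ~ N(0, g * (X'X + ridge*I)^{-1}).
    With unit noise variance, the posterior mean solves
        (X'X + (X'X + ridge*I) / g) beta = X'y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    g = float(n) if g is None else float(g)  # assumption: unit-information style g = n

    # Center covariates and response so no intercept term is needed.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    gram = Xc.T @ Xc
    prior_precision = (gram + ridge * np.eye(p)) / g
    posterior_precision = gram + prior_precision  # positive definite when ridge > 0
    return np.linalg.solve(posterior_precision, Xc.T @ yc)


def screen(X, y, k, **kwargs):
    """Return the indices of the k covariates with the largest |posterior mean|."""
    stats = screening_statistics(X, y, **kwargs)
    order = np.argsort(-np.abs(stats), kind="stable")
    return order[:k], stats


# Simulated example with more covariates (p = 200) than subjects (n = 40).
rng = np.random.default_rng(0)
n, p = 40, 200
X = rng.standard_normal((n, p))
eta = 2.0 * X[:, 0] - 2.0 * X[:, 1]  # only the first two covariates matter
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

selected, stats = screen(X, y, k=10)
print(selected)
```

All covariates enter the linear system jointly, which is the contrast with marginal screening that the abstract draws; the cost is one p × p solve rather than a sampler run.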

Original language: English
Pages (from-to): 2708-2723
Number of pages: 16
Journal: Journal of Statistical Computation and Simulation
Volume: 87
Issue number: 14
DOI: 10.1080/00949655.2017.1341887
Publication status: Published - 2017 Sep 22

Fingerprint

Binary Response
Small Sample Size
Screening
Regression
Posterior Mean
Covariates
Statistics
Regression Coefficient
Data analysis
Binary Response Model
Probit
Logit
Small sample
Sample size
Binary response
Markov Chain Monte Carlo
Simulation Methods
Markov processes
High Throughput
Computational Cost

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modelling and Simulation
  • Statistics, Probability and Uncertainty
  • Applied Mathematics

Cite this

@article{53b43ed692704d999f78ce1aba243a5c,
title = "Fast Bayesian variable screenings for binary response regressions with small sample size",
author = "Chang, {S. M.} and Tzeng, {J. Y.} and Chen, {R. B.}",
year = "2017",
month = "9",
day = "22",
doi = "10.1080/00949655.2017.1341887",
language = "English",
volume = "87",
pages = "2708--2723",
journal = "Journal of Statistical Computation and Simulation",
issn = "0094-9655",
publisher = "Taylor and Francis Ltd.",
number = "14",
}

TY - JOUR

T1 - Fast Bayesian variable screenings for binary response regressions with small sample size

AU - Chang, S. M.

AU - Tzeng, J. Y.

AU - Chen, R. B.

PY - 2017/9/22

Y1 - 2017/9/22

AB - Screening procedures play an important role in data analysis, especially in high-throughput biological studies where the datasets consist of more covariates than independent subjects. In this article, a Bayesian screening procedure is introduced for the binary response models with logit and probit links. In contrast to many screening rules based on marginal information involving one or a few covariates, the proposed Bayesian procedure simultaneously models all covariates and uses closed-form screening statistics. Specifically, we use the posterior means of the regression coefficients as screening statistics; by imposing a generalized g-prior on the regression coefficients, we derive the analytical form of their posterior means and compute the screening statistics without Markov chain Monte Carlo implementation. We evaluate the utility of the proposed Bayesian screening method using simulations and real data analysis. When the sample size is small, the simulation results suggest improved performance with comparable computational cost.

UR - http://www.scopus.com/inward/record.url?scp=85021297390&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021297390&partnerID=8YFLogxK

U2 - 10.1080/00949655.2017.1341887

DO - 10.1080/00949655.2017.1341887

M3 - Article

AN - SCOPUS:85021297390

VL - 87

SP - 2708

EP - 2723

JO - Journal of Statistical Computation and Simulation

JF - Journal of Statistical Computation and Simulation

SN - 0094-9655

IS - 14

ER -