Shrinkage regression-based methods for microarray missing value imputation

Hsiuying Wang, Chia Chun Chiu, Yi Ching Wu, Wei-Sheng Wu

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Background: Missing values are common in microarray data, which usually contain more than 5% missing values, with up to 90% of genes affected. Inaccurate estimation of missing values reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, regression-based methods are very popular and have been shown to outperform other types of methods on many test microarray datasets.

Results: To further improve the performance of regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in microarray data and select genes similar to the target gene by Pearson correlation coefficients. In addition, our methods incorporate the least squares principle, use a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the adjusted coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimates on six test microarray datasets than the existing regression-based methods.

Conclusions: Imputation of missing values is a very important aspect of microarray data analysis because most downstream analyses require a complete dataset. Exploring accurate and efficient methods for estimating missing values has therefore become an essential issue. Because the proposed shrinkage regression-based methods provide accurate missing value estimates, they are competitive alternatives to the existing regression-based methods.
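The pipeline outlined in the abstract (select similar genes by Pearson correlation, fit a least squares regression, shrink the coefficients, predict the missing entry) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `impute_entry`, the parameters `k` and `shrink`, the use of absolute correlation, the restriction to fully observed candidate genes, and the positive-part Stein-type shrinkage factor are all assumptions made for illustration; the abstract does not specify the shrinkage estimator.

```python
import numpy as np

def impute_entry(data, gene, sample, k=10, shrink=None):
    """Estimate data[gene, sample] (rows = genes, columns = samples).

    Hypothetical sketch: candidates are genes with no missing values,
    ranked by absolute Pearson correlation with the target gene.
    """
    target = data[gene]
    obs = ~np.isnan(target)      # samples where the target gene is observed
    obs[sample] = False          # never use the entry being estimated

    # Candidate genes: fully observed rows other than the target (assumption).
    complete = np.where(~np.isnan(data).any(axis=1))[0]
    complete = complete[complete != gene]

    # Pearson correlation between the target and each candidate gene.
    y = target[obs]
    yc = y - y.mean()
    cand = data[complete][:, obs]
    cc = cand - cand.mean(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (cc @ yc) / np.sqrt((cc ** 2).sum(axis=1) * (yc ** 2).sum())
    top = complete[np.argsort(-np.abs(r))[:k]]   # NaN correlations sort last

    # Least squares fit of the centered target on the centered similar genes.
    X = data[top][:, obs].T                      # shape (n_obs, k)
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)

    if shrink is None:
        # Positive-part Stein-type factor, used here as a stand-in (assumption).
        n, p = Xc.shape
        fitted = Xc @ beta
        resid = yc - fitted
        sigma2 = resid @ resid / max(n - p - 1, 1)
        fit = fitted @ fitted
        shrink = float(np.clip(1.0 - (p - 2) * sigma2 / fit, 0.0, 1.0)) if fit > 0 else 0.0

    # Predict the missing entry with the shrunken coefficients.
    return y.mean() + (data[top, sample] - x_mean) @ (shrink * beta)
```

With `shrink=1.0` the sketch reduces to ordinary least squares imputation, and with `shrink=0.0` it falls back to the mean of the target gene's observed values, so the factor interpolates between the two.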

Original language: English
Article number: S11
Journal: BMC Systems Biology
Volume: 7
Issue number: 6
DOIs: https://doi.org/10.1186/1752-0509-7-S6-S11
Publication status: Published - 2013 Dec 13

Fingerprint

Missing Values
Imputation
Microarrays
Shrinkage
Microarray
Regression
Genes
Microarray Data
Testing
Microarray Analysis
Gene
Shrinkage Estimation
Pearson Correlation
Correlation Structure
Coefficient
Inaccurate
Least-Squares Analysis
Correlation coefficient
Estimate
Least Squares

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Modelling and Simulation
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Wang, Hsiuying ; Chiu, Chia Chun ; Wu, Yi Ching ; Wu, Wei-Sheng. / Shrinkage regression-based methods for microarray missing value imputation. In: BMC systems biology. 2013 ; Vol. 7, No. 6.
@article{94a31a7d257e47bc94b52148c9bcc37e,
title = "Shrinkage regression-based methods for microarray missing value imputation",
abstract = "Background: Missing values commonly occur in the microarray data, which usually contain more than 5{\%} missing values with up to 90{\%} of genes affected. Inaccurate missing value estimation results in reducing the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. Results: To further improve the performances of the regression-based methods, we propose shrinkage regression-based methods. Our methods take the advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. Besides, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Conclusions: Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.",
author = "Hsiuying Wang and Chiu, {Chia Chun} and Wu, {Yi Ching} and Wei-Sheng Wu",
year = "2013",
month = "12",
day = "13",
doi = "10.1186/1752-0509-7-S6-S11",
language = "English",
volume = "7",
journal = "BMC Systems Biology",
issn = "1752-0509",
publisher = "BioMed Central",
number = "6",

}

TY - JOUR

T1 - Shrinkage regression-based methods for microarray missing value imputation

AU - Wang, Hsiuying

AU - Chiu, Chia Chun

AU - Wu, Yi Ching

AU - Wu, Wei-Sheng

PY - 2013/12/13

Y1 - 2013/12/13

N2 - Background: Missing values commonly occur in the microarray data, which usually contain more than 5% missing values with up to 90% of genes affected. Inaccurate missing value estimation results in reducing the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. Results: To further improve the performances of the regression-based methods, we propose shrinkage regression-based methods. Our methods take the advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. Besides, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Conclusions: Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.

UR - http://www.scopus.com/inward/record.url?scp=84908510618&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84908510618&partnerID=8YFLogxK

U2 - 10.1186/1752-0509-7-S6-S11

DO - 10.1186/1752-0509-7-S6-S11

M3 - Article

C2 - 24565159

AN - SCOPUS:84908510618

VL - 7

JO - BMC Systems Biology

JF - BMC Systems Biology

SN - 1752-0509

IS - 6

M1 - S11

ER -