TY - JOUR
T1 - swCAM
T2 - Estimation of subtype-specific expressions in individual samples with unsupervised sample-wise deconvolution
AU - Chen, Lulu
AU - Wu, Chiung Ting
AU - Lin, Chia Hsiang
AU - Dai, Rujia
AU - Liu, Chunyu
AU - Clarke, Robert
AU - Yu, Guoqiang
AU - Van Eyk, Jennifer E.
AU - Herrington, David M.
AU - Wang, Yue
N1 - Publisher Copyright:
© The Author(s) 2021. Published by Oxford University Press. All rights reserved.
PY - 2022/3/1
Y1 - 2022/3/1
AB - Motivation: Complex biological tissues are often a heterogeneous mixture of several molecularly distinct cell subtypes. Both subtype compositions and subtype-specific (STS) expressions can vary across biological conditions. Computational deconvolution aims to dissect patterns of bulk tissue data into subtype compositions and STS expressions. Existing deconvolution methods can only estimate averaged STS expressions in a population, while many downstream analyses such as inferring co-expression networks in particular subtypes require subtype expression estimates in individual samples. However, individual-level deconvolution is a mathematically underdetermined problem because there are more variables than observations. Results: We report a sample-wise Convex Analysis of Mixtures (swCAM) method that can estimate subtype proportions and STS expressions in individual samples from bulk tissue transcriptomes. We extend our previous CAM framework to include a new term accounting for between-sample variations and formulate swCAM as a nuclear-norm and ℓ2,1-norm regularized matrix factorization problem. We determine hyperparameter values using cross-validation with random entry exclusion and obtain a swCAM solution using an efficient alternating direction method of multipliers. Experimental results on realistic simulation data show that swCAM can accurately estimate STS expressions in individual samples and successfully extract co-expression networks in particular subtypes that are otherwise unobtainable using bulk data. In two real-world applications, swCAM analysis of bulk RNA-seq data from brain tissue of cases and controls with bipolar disorder or Alzheimer's disease identified significant changes in cell proportion, expression pattern and co-expression module in patient neurons. Comparative evaluation of swCAM versus peer methods is also provided.
UR - http://www.scopus.com/inward/record.url?scp=85125441137&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125441137&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btab839
DO - 10.1093/bioinformatics/btab839
M3 - Article
C2 - 34904628
AN - SCOPUS:85125441137
SN - 1367-4803
VL - 38
SP - 1403
EP - 1410
JO - Bioinformatics
JF - Bioinformatics
IS - 5
ER -