TY - JOUR

T1 - Avoiding pitfalls in L1-regularised inference of gene networks

AU - Tjärnberg, Andreas

AU - Nordling, Torbjörn E.M.

AU - Studham, Matthew

AU - Nelander, Sven

AU - Sonnhammer, Erik L.L.

N1 - Publisher Copyright: © The Royal Society of Chemistry 2015.

PY - 2015/1/1

Y1 - 2015/1/1

N2 - Statistical regularisation methods such as LASSO and related L1-regularised regression methods are commonly used to construct models of gene regulatory networks. Although they can theoretically infer the correct network structure, they have been shown in practice to make errors, i.e. to leave out existing links and include non-existing links. We show that L1 regularisation methods typically produce a poor network model when the analysed data are ill-conditioned, i.e. when the gene expression data matrix has a high condition number, even if the data contain enough information for correct network inference. However, when these methods fail, the correct structure of network models can still be obtained for informative data, i.e. data with a signal-to-noise ratio high enough that existing links can be proven to exist, by using least-squares regression and setting small parameters to zero, or by using robust network inference, a recent method that takes the intersection of all non-rejectable models. Since available experimental data sets are generally ill-conditioned, we recommend checking the condition number of the data matrix to avoid this pitfall of L1-regularised inference, and also considering alternative methods.

AB - Statistical regularisation methods such as LASSO and related L1-regularised regression methods are commonly used to construct models of gene regulatory networks. Although they can theoretically infer the correct network structure, they have been shown in practice to make errors, i.e. to leave out existing links and include non-existing links. We show that L1 regularisation methods typically produce a poor network model when the analysed data are ill-conditioned, i.e. when the gene expression data matrix has a high condition number, even if the data contain enough information for correct network inference. However, when these methods fail, the correct structure of network models can still be obtained for informative data, i.e. data with a signal-to-noise ratio high enough that existing links can be proven to exist, by using least-squares regression and setting small parameters to zero, or by using robust network inference, a recent method that takes the intersection of all non-rejectable models. Since available experimental data sets are generally ill-conditioned, we recommend checking the condition number of the data matrix to avoid this pitfall of L1-regularised inference, and also considering alternative methods.

UR - http://www.scopus.com/inward/record.url?scp=84918811986&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84918811986&partnerID=8YFLogxK

U2 - 10.1039/c4mb00419a

DO - 10.1039/c4mb00419a

M3 - Article

C2 - 25377664

AN - SCOPUS:84918811986

VL - 11

SP - 287

EP - 296

JO - Molecular BioSystems

JF - Molecular BioSystems

SN - 1742-206X

IS - 1

ER -