TY - GEN
T1 - Identification of biomarkers and signatures in protein data
AU - Nordling, Torbjorn E.M.
AU - Padhan, Narendra
AU - Nelander, Sven
AU - Claesson-Welsh, Lena
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/10/22
Y1 - 2015/10/22
AB - The correct diagnosis of cancer patients conventionally depends on the pathologist's experience and ability to distinguish cancer tissue from normal tissue under a microscope. Advances in technology for measuring the abundance of, e.g., proteins and mRNAs in tissue samples make it interesting to search for an optimal subset of these for classification of samples as cancer or normal. We discuss issues in the identification of biomarkers that provide distinct signatures for predicting whether tissue is cancerous or normal, exemplified by our recent study of cancer signalling signatures in human colon cancer characterised with regard to protein abundance using high-sensitivity isoelectric focusing. We show that the optimal subset for separation of cancer tissues from normal tissues does not contain any of the proteins in the top quintile in terms of significant difference between the groups according to the Mann-Whitney U-test or correlation to the diagnosis. In fact, one of the proteins belongs to the tertile with the lowest significance and correlation. This highlights the weakness of the practice of only looking for significant differences in the abundance of individual proteins and raises the question of how many lifesaving discoveries have been missed because of it. We also demonstrate how Monte Carlo simulations of the separation with random class assignment can be used to calculate p-values for observing any specific separation by chance and to select the optimal number of proteins in the subset based on these p-values. Both selection of the optimal number of biomarkers and calculation of p-values corrected for multiple hypothesis testing are essential to obtain a subset of biomarkers that yields robust predictions for clinical use.
UR - http://www.scopus.com/inward/record.url?scp=84959062990&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959062990&partnerID=8YFLogxK
U2 - 10.1109/eScience.2015.46
DO - 10.1109/eScience.2015.46
M3 - Conference contribution
AN - SCOPUS:84959062990
T3 - Proceedings - 11th IEEE International Conference on eScience, eScience 2015
SP - 411
EP - 419
BT - Proceedings - 11th IEEE International Conference on eScience, eScience 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 11th IEEE International Conference on eScience, eScience 2015
Y2 - 31 August 2015 through 4 September 2015
ER -