TY - JOUR
T1 - Sorting multiple classes in multi-dimensional ROC analysis
T2 - Parametric and nonparametric approaches
AU - Li, Jialiang
AU - Chow, Yanyu
AU - Wong, Weng Kee
AU - Wong, Tien Yin
N1 - Funding Information:
The research was partially supported by National Medical Research Council NMRC/CBRG/0014/2012 and ARF R-155-000-130-112. Weng Kee Wong worked on this manuscript when he was a visiting fellow and a member of the scientific advisory board for a six-month workshop on the design and analysis of experimental designs at The Sir Isaac Newton Institute at Cambridge, England. He would like to thank the Institute for the support during his repeated visits in the latter half of 2011. The authors report no conflict of interest.
PY - 2014/2
Y1 - 2014/2
AB - In large-scale data analysis, such as in a microarray study to identify the most differentially expressed genes, diagnostic tests are frequently used to classify and predict subjects into their different categories. Frequently, these categories do not have an intrinsic natural order even though the quantitative test results have a relative order. As identifying the correct order for a proper definition of accuracy measures is important for a high-dimensional receiver operating characteristic (ROC) analysis, we propose rigorous and automated approaches to sort out the multiple categories using simple summary statistics such as means and relative effects. We discuss the hypervolume under the ROC manifold (HUM), its dependence on the order of the test results and the minimum acceptable HUM values in a general multi-category classification problem. Using a leukemia data set and a liver cancer data set, we show how our approaches provide accurate screening results when we have a large number of tests.
UR - http://www.scopus.com/inward/record.url?scp=84893293591&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893293591&partnerID=8YFLogxK
U2 - 10.3109/1354750X.2013.868516
DO - 10.3109/1354750X.2013.868516
M3 - Review article
C2 - 24329017
AN - SCOPUS:84893293591
SN - 1354-750X
VL - 19
SP - 1
EP - 8
JO - Biomarkers
JF - Biomarkers
IS - 1
ER -