TY - JOUR
T1 - Linear Approximation of F-Measure for the Performance Evaluation of Classification Algorithms on Imbalanced Data Sets
AU - Wong, Tzu Tsung
N1 - Publisher Copyright: © 1989-2012 IEEE.
PY - 2022/2/1
Y1 - 2022/2/1
N2 - Accuracy is a popular measure for evaluating the performance of classification algorithms tested on ordinary data sets. When a data set is imbalanced, F-measure will be a better choice than accuracy for this purpose. Since F-measure is calculated as the harmonic mean of recall and precision, it is difficult to find the sampling distribution of F-measure for evaluating classification algorithms. Since the values of recall and precision are dependent, their joint distribution is assumed to follow a bivariate normal distribution in this study. When the evaluation method is k-fold cross validation, a linear approximation approach is proposed to derive the sampling distribution of F-measure. This approach is used to design methods for comparing the performance of two classification algorithms tested on single or multiple imbalanced data sets. The methods are tested on ten imbalanced data sets to demonstrate their effectiveness. The weight of recall provided by this linear approximation approach can help us to analyze the characteristics of classification algorithms.
UR - http://www.scopus.com/inward/record.url?scp=85123625656&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123625656&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2020.2986749
DO - 10.1109/TKDE.2020.2986749
M3 - Article
AN - SCOPUS:85123625656
SN - 1041-4347
VL - 34
SP - 753
EP - 763
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 2
ER -