TY - GEN
T1 - Patterns discovery on complex diagnosis and biological data using fuzzy latent variables
AU - Yin, Zong Xian
AU - Chiang, Jung Hsien
N1 - Funding Information:
This work was supported by the National Natural Science Foundation of China.
PY - 2007
Y1 - 2007
N2 - This paper proposes a new clustering algorithm referred to as the Possibilistic Latent Variables (PLV) clustering algorithm. This algorithm provides a powerful tool for the analysis of complex data, such as clinical diagnosis and biological expression data, due to its robustness to various data distributions and its accuracy in establishing appropriate groups from data. The algorithm combines a distribution model with the concept of fuzzy degrees. Compared to the expectation-maximization (EM) algorithm, a well-known distribution-estimating algorithm, the PLV algorithm has the considerable advantage that it can be applied to various data types, i.e. it is not restricted solely to Gaussian data distributions. Additionally, the proposed algorithm performs better than the well-known fuzzy clustering algorithm, i.e. the FCM algorithm, in that it can address compact regions rather than simply dividing objects into several equal populations. The performance of the proposed algorithm is verified by conducting clustering tasks on several medical diagnosis and biological expression datasets.
AB - This paper proposes a new clustering algorithm referred to as the Possibilistic Latent Variables (PLV) clustering algorithm. This algorithm provides a powerful tool for the analysis of complex data, such as clinical diagnosis and biological expression data, due to its robustness to various data distributions and its accuracy in establishing appropriate groups from data. The algorithm combines a distribution model with the concept of fuzzy degrees. Compared to the expectation-maximization (EM) algorithm, a well-known distribution-estimating algorithm, the PLV algorithm has the considerable advantage that it can be applied to various data types, i.e. it is not restricted solely to Gaussian data distributions. Additionally, the proposed algorithm performs better than the well-known fuzzy clustering algorithm, i.e. the FCM algorithm, in that it can address compact regions rather than simply dividing objects into several equal populations. The performance of the proposed algorithm is verified by conducting clustering tasks on several medical diagnosis and biological expression datasets.
UR - http://www.scopus.com/inward/record.url?scp=34548769247&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548769247&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2007.367903
DO - 10.1109/ICDE.2007.367903
M3 - Conference contribution
AN - SCOPUS:34548769247
SN - 1424408032
SN - 9781424408030
T3 - Proceedings - International Conference on Data Engineering
SP - 576
EP - 585
BT - 23rd International Conference on Data Engineering, ICDE 2007
T2 - 23rd International Conference on Data Engineering, ICDE 2007
Y2 - 15 April 2007 through 20 April 2007
ER -