Many gene selection methods have been proposed to identify a subset of genes that yields high prediction accuracy in cancer classification, and most give all genes the same preference. However, many biological reports have pointed out that mutated or flawed genes, called risk genes, can be a major cause of a specific disease. This study proposes a gene selection method based on the risk genes identified in biological reports. The information provided by risk genes can reduce the time complexity of gene selection and increase the accuracy of cancer classification. The method consists of two stages. Because all risk genes must be selected, the first stage removes genes whose expression levels or functions are similar to those of risk genes. The second stage divides the remaining genes into clusters and performs gene selection and gene replacement based on the clustering results. In tests on four microarray data sets, our gene selection method outperforms those proposed in previous studies, and genes with the potential to be new risk genes are presented.
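The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the clustering procedure, or the selection criterion, so the sketch assumes absolute Pearson correlation for similarity, a greedy threshold-based grouping for clustering, and a simple two-class mean-separation score for picking one gene per cluster. The function names (`remove_similar_to_risk`, `cluster_genes`, `select_genes`) and the `threshold` parameter are hypothetical, and the gene replacement step is omitted.

```python
import numpy as np


def remove_similar_to_risk(X, risk_idx, threshold=0.9):
    """Stage 1 (assumed form): drop non-risk genes whose expression profile
    is highly correlated with any risk gene.

    X is a samples-by-genes matrix; returns the indices of the remaining
    non-risk genes in ascending order.
    """
    corr = np.corrcoef(X, rowvar=False)  # gene-by-gene Pearson correlation
    risk = list(risk_idx)
    remaining = []
    for g in range(X.shape[1]):
        if g in risk:
            continue  # risk genes are always kept, handled separately
        if np.max(np.abs(corr[g, risk])) < threshold:
            remaining.append(g)
    return remaining


def cluster_genes(X, genes, threshold=0.9):
    """Stage 2a (assumed form): greedy grouping. A gene joins the first
    cluster whose first member it correlates with, else starts a new one."""
    corr = np.corrcoef(X, rowvar=False)
    clusters = []
    for g in genes:
        for members in clusters:
            if abs(corr[g, members[0]]) >= threshold:
                members.append(g)
                break
        else:
            clusters.append([g])
    return clusters


def select_genes(X, y, risk_idx, threshold=0.9):
    """Stage 2b (assumed form): keep all risk genes, then add one gene per
    cluster, chosen by a two-class mean-separation score (labels 0 and 1)."""
    remaining = remove_similar_to_risk(X, risk_idx, threshold)
    clusters = cluster_genes(X, remaining, threshold)

    def separation(g):
        a, b = X[y == 0, g], X[y == 1, g]
        return abs(a.mean() - b.mean()) / (a.std() + b.std() + 1e-12)

    representatives = [max(members, key=separation) for members in clusters]
    return sorted(risk_idx) + representatives
```

Under these assumptions, the correlation threshold controls both how aggressively genes redundant with risk genes are removed and how coarse the clusters are; a real implementation would likely tune these separately and could incorporate functional annotations, as the abstract suggests.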
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Artificial Intelligence