A new approach to generating virtual samples to enhance classification accuracy with small data-a case of bladder cancer

Liang Sian Lin, Susan C. Hu, Yao San Lin, Der Chiang Li, Liang Ren Siao

Research output: Article, peer-reviewed

3 citations (Scopus)

Abstract

In the medical field, researchers are often unable to obtain, within a short period of time, the number of samples needed to build a stable data-driven forecasting model for classifying a new disease. To address this small-data learning problem, many studies have shown that generating virtual samples to augment the training data is an effective way to improve forecasting models built on small datasets. One of the most popular methods in these studies is the mega-trend-diffusion (MTD) technique, which is widely used in various fields. The effectiveness of the MTD technique depends on the degree of data diffusion, but the estimated diffusion is seriously affected by extreme values. In addition, the MTD method fits data with only a unimodal triangular membership function, whereas real-world data may come from multiple distributions. This paper therefore proposes a distance-based mega-trend-diffusion (DB-MTD) technique that estimates the degree of data diffusion with less impact from extreme values while allowing for data drawn from multiple distributions. The proposed method assumes that the data are fitted by triangular and trapezoidal membership functions, which are used to generate virtual samples. A possibility evaluation mechanism is also proposed to measure the applicability of the virtual samples. In the experiments, two bladder cancer datasets are used to verify the effectiveness of the proposed DB-MTD method. The results show that the proposed method outperforms other virtual sample generation (VSG) techniques on classification and regression tasks for small bladder cancer datasets.
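For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the unimodal, triangular MTD idea for a single numeric attribute. It is not the DB-MTD method of the paper, whose details the abstract does not give. The bound formula (range center, skewness weights, and the 1e-20 constant) follows the commonly cited form of MTD and is an assumption here, as are the function names and the accept-by-membership sampling step.

```python
import math
import random
import statistics


def mtd_bounds(values):
    """Return (a, cl, b): diffused lower bound, center, diffused upper bound.

    Assumed form of the classic MTD estimate; not taken from the abstract.
    """
    lo, hi = min(values), max(values)
    cl = (lo + hi) / 2.0                       # center of the observed range
    var = statistics.variance(values) if len(values) > 1 else 0.0
    if var == 0.0:
        return lo, cl, hi                      # degenerate case: nothing to diffuse
    n_l = sum(1 for v in values if v < cl)     # observations below the center
    n_u = sum(1 for v in values if v > cl)     # observations above the center
    skew_l = n_l / (n_l + n_u)                 # share of data on the left side
    skew_u = n_u / (n_l + n_u)                 # share of data on the right side
    spread = -2.0 * math.log(1e-20)
    a = cl - skew_l * math.sqrt(spread * var / n_l)
    b = cl + skew_u * math.sqrt(spread * var / n_u)
    # Never let the diffused domain be narrower than the observed range.
    return min(a, lo), cl, max(b, hi)


def triangular_membership(x, a, cl, b):
    """Triangular membership value of x on [a, b] with peak at cl."""
    if x <= a or x >= b:
        return 0.0
    if x <= cl:
        return (x - a) / (cl - a)
    return (b - x) / (b - cl)


def generate_virtual_values(values, n, rng=None):
    """Draw n virtual values, accepting each candidate with probability
    equal to its membership value (an illustrative acceptance rule)."""
    rng = rng or random.Random(0)
    a, cl, b = mtd_bounds(values)
    if a == b:
        return [cl] * n
    out = []
    while len(out) < n:
        x = rng.uniform(a, b)
        if rng.random() <= triangular_membership(x, a, cl, b):
            out.append(x)
    return out


if __name__ == "__main__":
    observed = [1.0, 2.0, 2.5, 4.0, 7.0]       # made-up toy values
    print(mtd_bounds(observed))
    print(generate_virtual_values(observed, 5))
```

Because the bounds depend on the sample variance and on the observed minimum and maximum, a single extreme value can widen the diffused domain considerably, which is the sensitivity the abstract says DB-MTD is meant to reduce.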

Original language: English
Pages (from-to): 6204-6233
Number of pages: 30
Journal: Mathematical Biosciences and Engineering
Volume: 19
Issue number: 6
Publication status: Published - 2022

All Science Journal Classification (ASJC) codes

  • Modelling and Simulation
  • Agricultural and Biological Sciences (all)
  • Computational Mathematics
  • Applied Mathematics
