Building Robust Models for Small Data Containing Nominal Inputs and Continuous Outputs Based on Possibility Distributions

  • 史 其仕

Student thesis: Doctoral Thesis

Abstract

Building statistically robust models from small data is challenging for standard learning algorithms. In previous studies, virtual sample generation (VSG) techniques have been verified as effective in meeting this challenge. However, most VSG techniques were developed for numerical datasets and classification problems. Therefore, to address situations where the dataset has nominal inputs and continuous outputs, this study proposes a systematic VSG procedure that creates new samples based on theories of fuzziness and diffusion. First, based on the concept of the encoding process in the M5’ model tree, the study presents a procedure for extracting the fuzzy relations between nominal inputs and continuous outputs. Further, following the idea of nonparametric operations, it employs trend similarity to represent the fuzzy relations between inputs and outputs. Then, possibility distributions of the inputs and outputs are built based on these fuzzy relations. Finally, virtual samples are created based on these distributions and their possibility values. The experiments use five public datasets, two prediction models, and two other VSG techniques. The experimental results show that models trained on small data augmented with virtual samples created by the proposed method outperform those trained with samples from the other VSG techniques.
Date of Award: 2019
Original language: English
Supervisor: Der-Chiang Li
