Rapid advances in information technology have driven substantial progress in artificial intelligence applications, particularly the widespread use of deep learning in domains such as speech, image, and text recognition. However, real-world data distributions pose significant problems, including data imbalance, which can easily bias a machine learning model toward the class with more data and produce inaccurate classification or prediction results. Effectively addressing data imbalance is therefore a pressing research topic. Generative Adversarial Networks (GANs) can address data imbalance but are prone to vanishing gradients, and recent work has thus focused on improving the GAN architecture to resolve this problem. The present research extends these efforts, applying the C4.5, Random Forest, Support Vector Machine, K-Nearest Neighbor, and Naïve Bayes classification algorithms to a single imbalanced traffic collision dataset to identify methods for improving prediction results. Experimental results show that classification performance improves significantly after data augmentation using the Synthetic Minority Oversampling Technique (SMOTE), GAN, Conditional GAN, and Gaussian Discriminant Analysis GAN, compared with the non-augmented dataset. In addition, augmentation with the Gaussian Discriminant Analysis GAN, paired with the Naïve Bayes classifier, produces the dataset with the best classification performance for traffic accident prediction at highway intersections.
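To make the augmentation idea concrete, the following is a minimal sketch of interpolation-based oversampling in the spirit of the Synthetic Minority Oversampling Technique. It is not the authors' implementation: the function name `smote_like_oversample`, the parameter `k`, and the toy data are all hypothetical, chosen only to show how synthetic minority rows can be generated until the two classes are the same size.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Toy, made-up data: 8 majority rows versus 3 minority rows.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate just enough synthetic rows to match the majority class size.
new_rows = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

Because every synthetic row lies on a segment between two real minority rows, the augmented class stays inside the region the minority data already occupies. GAN-based augmentation, by contrast, learns a generator for new samples and is not shown here.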
IEEE Transactions on Intelligent Transportation Systems
Published - Oct 1, 2022