Email security level classification of imbalanced data using artificial neural network: The real case in a world-leading enterprise

Jen Wei Huang, Chia Wen Chiang, Jia Wei Chang

Research output: Contribution to journal (Article, peer-reviewed)

7 Citations (Scopus)

Abstract

Email is far more convenient than traditional mail for delivering messages. However, in business settings it is susceptible to information leakage. This problem can be alleviated by classifying emails into different security levels using text mining and machine learning. In this research, we developed a scheme in which a neural network extracts information from emails and transforms it into a multidimensional vector. Email text is processed as bi-grams to train the document vector, and the document vectors then undergo under-sampling to deal with the problem of data imbalance. Finally, the security label of each email is classified using an artificial neural network. The proposed system was evaluated in an actual corporate setting. The results show that the proposed feature extraction approach represents email data more effectively than existing methods in terms of true positive rate and F1-score.
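The bi-gram and under-sampling steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which under-sampling strategy was used, so random under-sampling to the size of the smallest class is assumed here, and the neural document-vector training and the classifier are omitted. The function names, example emails, and labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def word_bigrams(text):
    """Return the adjacent word pairs of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def undersample(samples, labels, seed=0):
    """Randomly keep the same number of samples per class.

    Assumption: every class is reduced to the size of the smallest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: four "public" emails and one "confidential" email.
emails = [
    "lunch menu for friday",
    "team outing next week",
    "parking lot closed monday",
    "office party photos attached",
    "draft merger terms attached",
]
labels = ["public", "public", "public", "public", "confidential"]

features = [word_bigrams(text) for text in emails]
balanced = undersample(features, labels)
class_counts = Counter(label for _, label in balanced)
```

After under-sampling, `class_counts` holds one sample per class for this toy data, so a classifier trained on `balanced` would no longer see the majority class four times as often as the minority class.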

Original language: English
Pages (from-to): 11-21
Number of pages: 11
Journal: Engineering Applications of Artificial Intelligence
Volume: 75
Publication status: Published - 2018 Oct

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering
