Static PE Malware Type Classification Using Machine Learning Techniques

Shao Huai Zhang, Cheng Chung Kuo, Chu Sing Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, machine learning techniques have become increasingly popular and have been applied to malware detection research. However, most of this research still focuses on binary classification, which predicts whether a file is benign or malicious. Only a small fraction addresses malware type detection or malware family classification. This work uses several machine learning models to build static malware type classifiers for PE-format files. A recently released dataset for Windows malware detection is relabeled into multiple classes via VirusTotal, and several efficient and scalable machine learning models are considered. The evaluation results show that our best model, random forest, achieves a micro-averaged F1 score of 0.96 and a macro-averaged F1 score of 0.89, which is better than the model used in the referenced work.
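The gap between the two reported averages is easier to read with a small worked example. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function name `f1_scores` and the toy labels are hypothetical and are not taken from the paper's dataset. It illustrates how micro-averaging pools all decisions, so frequent classes dominate, while macro-averaging weights every class equally, so a weak rare class pulls the score down.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical imbalanced toy data: 8 "trojan" samples, 2 "worm" samples,
# with one "worm" misclassified as "trojan".
y_true = ["trojan"] * 8 + ["worm"] * 2
y_pred = ["trojan"] * 8 + ["trojan", "worm"]

micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.2f}, macro F1 = {macro:.2f}")
# prints: micro F1 = 0.90, macro F1 = 0.80
```

In this toy case a single error on the rare class lowers the macro average far more than the micro average, which is one plausible reading of a 0.96 micro score alongside a 0.89 macro score on an imbalanced multi-class dataset.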

Original language: English
Title of host publication: Proceedings - 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 81-86
Number of pages: 6
ISBN (Electronic): 9781728131597
DOI: https://doi.org/10.1109/ICEA.2019.8858297
Publication status: Published - 2019 Aug
Event: 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA 2019 - Tainan, Taiwan
Duration: 2019 Aug 30 to 2019 Sep 1

Publication series

Name: Proceedings - 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA 2019

Conference

Conference: 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA 2019
Country: Taiwan
City: Tainan
Period: 19-08-30 to 19-09-01

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Health Informatics
  • Communication
  • Social Sciences (miscellaneous)

Cite this

Zhang, S. H., Kuo, C. C., & Yang, C. S. (2019). Static PE Malware Type Classification Using Machine Learning Techniques. In Proceedings - 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA 2019 (pp. 81-86). [8858297] (Proceedings - 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICEA.2019.8858297