Deep Hashing for Malware Family Classification and New Malware Identification

Yunchun Zhang, Zikun Liao, Ning Zhang, Shaohui Min, Qi Wang, Tony Q.S. Quek, Mingxiong Zhao

Research output: Article, peer-reviewed


Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, detecting malware effectively on a large-scale sample set and identifying zero-day or new malware variants remain significant challenges. To address these challenges, a deep hashing-based malware classification model is designed for malware identification. It consists of two parts: ResNet50-based deep hashing for malware retrieval, and voting-based malware classification. In the first part, multiple deep hashing models are built by extracting the high-layer outputs (feature maps) of a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance (that is, the dissimilarity) between hash values computed for malware samples from different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K most similar samples. In the second part, two voting schemes, majority voting and Hamming-distance-based voting, are proposed to identify malware from the retrieved results. The experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models achieve their best results with a 48-bit hash code. Furthermore, Hamming-distance-based voting implemented with RNDPN outperforms the other models in malware classification, achieving 96.5% classification accuracy and 85.7% accuracy in identifying new or zero-day malware.
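The retrieve-then-vote pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the Hamming-distance-based vote is weighted or how new malware is flagged, so everything below is an assumption made for illustration: hash codes are stored as Python integers, each neighbor's vote is weighted by `1 - distance / n_bits`, and a sample is labeled `"unknown"` when its nearest neighbor is farther than a hypothetical `new_threshold`. The function names and the toy 8-bit codes are likewise hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def top_k(query: int, database, k: int):
    """Return the k (distance, family) pairs closest to the query code."""
    scored = sorted((hamming(query, code), family) for code, family in database)
    return scored[:k]


def majority_vote(neighbors) -> str:
    """Pick the family that occurs most often among the retrieved neighbors."""
    counts = Counter(family for _, family in neighbors)
    return counts.most_common(1)[0][0]


def distance_weighted_vote(neighbors, n_bits: int) -> str:
    """Assumed weighting: closer neighbors contribute larger votes."""
    scores = defaultdict(float)
    for dist, family in neighbors:
        scores[family] += 1.0 - dist / n_bits
    return max(scores, key=scores.get)


def classify(query: int, database, k: int, n_bits: int, new_threshold: int) -> str:
    """Assumed rule: flag the sample as new if even its nearest neighbor is far."""
    neighbors = top_k(query, database, k)
    if neighbors[0][0] > new_threshold:
        return "unknown"
    return distance_weighted_vote(neighbors, n_bits)


# Toy database of (8-bit hash code, family label) pairs; values are made up.
database = [
    (0b00001111, "A"),
    (0b00001110, "A"),
    (0b11110000, "B"),
    (0b11110001, "B"),
    (0b11100000, "B"),
]

query = 0b00001101
neighbors = top_k(query, database, k=5)
by_majority = majority_vote(neighbors)
by_distance = distance_weighted_vote(neighbors, n_bits=8)
known = classify(query, database, k=5, n_bits=8, new_threshold=2)
novel = classify(0b01010101, database, k=5, n_bits=8, new_threshold=2)
```

With these toy values, retrieving all five neighbors makes the two schemes disagree: family B wins the plain majority by count, while the two family A codes sit much closer to the query and win the distance-weighted vote. This is the kind of case in which weighting by Hamming distance could matter, although the paper's actual voting rule may differ from this sketch.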

Pages (from-to): 1
Journal: IEEE Internet of Things Journal
Publication status: Accepted/In press - 2024

All Science Journal Classification (ASJC) codes

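  • Signal Processing
  • Information Systems
  • Hardware and Architecture
  • Computer Science Applications
  • Computer Networks and Communications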

