An information retrieval approach for malware classification based on Windows API calls

Julia Yu Chin Cheng, Tzung Shian Tsai, Chu Sing Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Citations (Scopus)

Abstract

Automated malware toolkits allow for easy generation of new malicious programs. These new executables carry similar malicious code and demonstrate similar malicious behavior on infected hosts. To speed up malware detection, determining whether a sample belongs to a known family or is a new species of malware has become a critical issue in the security industry. In this paper, we propose a new approach to precisely classify malicious executables by employing information retrieval theory. Dynamic analysis of a sample's sequence of Windows API function calls produces the corresponding parameters and values, which are used as input to a standard TF-IDF weighting scheme to identify malware families by their behavioral characteristics. Irrelevance reduction is developed to filter out non-relevant features and improve the accuracy of malware classification. Finally, a similarity measure is used to determine the malware family most similar to the tested samples.
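
The pipeline described in the abstract (API-call features, TF-IDF weighting, similarity to a family) can be illustrated with a minimal sketch. The abstract does not give the paper's exact tokenization, weighting formula, irrelevance-reduction procedure, or similarity measure, so everything below is an assumption: each feature is a hypothetical `ApiName|parameter` string, the weight is the textbook `tf * log(N / df)`, and cosine similarity stands in for the unspecified similarity measure. The family names and traces are invented for illustration only.

```python
import math
from collections import Counter


def build_family_vectors(families):
    """Compute TF-IDF vectors for each family's list of API-call tokens.

    Assumption: weight = (term count / trace length) * log(N / document frequency).
    """
    n = len(families)
    df = Counter()
    for tokens in families.values():
        df.update(set(tokens))  # count each token once per family
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = {}
    for name, tokens in families.items():
        tf = Counter(tokens)
        vectors[name] = {t: (c / len(tokens)) * idf[t] for t, c in tf.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(sample_tokens, family_vectors, idf):
    """Return the most similar family and all similarity scores."""
    tf = Counter(sample_tokens)
    # Tokens never seen in any family get weight 0 (an assumption of this sketch).
    vec = {t: (c / len(sample_tokens)) * idf.get(t, 0.0) for t, c in tf.items()}
    scores = {fam: cosine(vec, fv) for fam, fv in family_vectors.items()}
    return max(scores, key=scores.get), scores


# Hypothetical traces: "ApiName|parameter" tokens from dynamic analysis.
families = {
    "family_a": ["RegSetValueExW|Run", "WriteFile|payload", "LoadLibraryW|kernel32"],
    "family_b": ["InternetOpenUrlW|http", "send|socket", "LoadLibraryW|kernel32"],
    "family_c": ["CryptEncrypt|key", "DeleteFileW|doc", "LoadLibraryW|kernel32"],
}
family_vectors, idf = build_family_vectors(families)

sample = ["InternetOpenUrlW|http", "LoadLibraryW|kernel32", "send|socket"]
best, scores = classify(sample, family_vectors, idf)
print(best, scores)
```

In this toy data, `LoadLibraryW|kernel32` occurs in every family, so its IDF is zero and it contributes nothing to any score. That loosely resembles the goal of filtering non-relevant features, but it is not the paper's irrelevance-reduction method, which the abstract does not describe.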

Original language: English
Title of host publication: Proceedings - International Conference on Machine Learning and Cybernetics
Publisher: IEEE Computer Society
Pages: 1678-1683
Number of pages: 6
ISBN (Electronic): 9781479902576
Publication status: Published - 2013
Event: 12th International Conference on Machine Learning and Cybernetics, ICMLC 2013 - Tianjin, China
Duration: 2013 Jul 14 to 2013 Jul 17

Publication series

Name: Proceedings - International Conference on Machine Learning and Cybernetics
Volume: 4
ISSN (Print): 2160-133X
ISSN (Electronic): 2160-1348

Other

Other: 12th International Conference on Machine Learning and Cybernetics, ICMLC 2013
Country/Territory: China
City: Tianjin
Period: 13-07-14 to 13-07-17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Human-Computer Interaction
