A data compacting technique to reduce the NetFlow size in botnet detection with BotCluster

Chun Yu Wang, Yu Cheng Chen, Shih Hao Fuh, Feng Min Cho, Ta Chun Lo, Jyh Biau Chang, Qi Jun Cheng, Ce Kuen Shieh

Research output: Conference contribution

Abstract

Big data analytics helps us find potentially valuable knowledge, but as the size of the dataset increases, the computing cost also grows exponentially. In our previous work, BotCluster, we designed a pre-processing filtering pipeline for data reduction, including a whitelist filter and a flow loss-response rate (FLR) filter, which was intended to remove irrelevant noise and reduce the computing overhead. However, we still face a data redundancy phenomenon in which some identical feature vectors appear repeatedly. In this paper, we propose a data compacting approach that aims to reduce the input volume while keeping enough representative feature vectors to satisfy the criteria of DBSCAN (density-based spatial clustering of applications with noise). It purges the redundant vectors according to a purging threshold and keeps the primary representatives. Experimental results show that the average data reduction ratio is about 81.34%, while the precision decreases only slightly, by 1.6% on average, and the results still share 99.88% of their IPs with the previous system.
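The abstract does not spell out the purging rule, so the following is only a minimal sketch of one plausible reading: keep at most a fixed number of copies of each identical feature vector and discard the rest. The names `compact_vectors`, `purge_threshold`, and `reduction_ratio` are hypothetical and do not come from the paper. Tying the threshold to DBSCAN's minimum-points parameter, so that a heavily repeated vector can still form a core point on its own, is likewise an assumption.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

FeatureVector = Tuple[float, ...]


def compact_vectors(
    vectors: Iterable[Sequence[float]], purge_threshold: int
) -> List[FeatureVector]:
    """Keep at most `purge_threshold` copies of each distinct vector.

    Hypothetical reading of the abstract's "purging threshold": copies
    beyond the threshold are treated as redundant and dropped, while the
    first copies are kept as representatives (first-seen order).
    """
    if purge_threshold < 1:
        raise ValueError("purge_threshold must be at least 1")
    seen: Counter = Counter()
    kept: List[FeatureVector] = []
    for vector in vectors:
        key = tuple(vector)  # hashable form, used to detect exact duplicates
        if seen[key] < purge_threshold:
            seen[key] += 1
            kept.append(key)
    return kept


def reduction_ratio(original_count: int, compacted_count: int) -> float:
    """Fraction of the input removed by compaction (0.0 for empty input)."""
    if original_count == 0:
        return 0.0
    return 1.0 - compacted_count / original_count


if __name__ == "__main__":
    # Toy input: one vector repeated 10 times, one twice, one once.
    flows = [(1.0, 2.0)] * 10 + [(3.0, 4.0)] * 2 + [(5.0, 6.0)]
    # Assumption: the threshold equals the DBSCAN minimum-points setting (4 here).
    compacted = compact_vectors(flows, purge_threshold=4)
    ratio = reduction_ratio(len(flows), len(compacted))
    print(f"{len(flows)} -> {len(compacted)} vectors, reduction {ratio:.1%}")
```

In this sketch, vectors that occur no more often than the threshold pass through unchanged, so only the heavily repeated vectors shrink. The reduction ratio actually achieved therefore depends on how skewed the duplicate counts are in the real NetFlow data.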

Original language: English
Title of host publication: BDCAT 2019 - Proceedings of the 6th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
Publisher: Association for Computing Machinery, Inc
Pages: 81-84
Number of pages: 4
ISBN (electronic): 9781450370165
DOIs
Publication status: Published - Dec 2, 2019
Event: 6th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2019 - Auckland, New Zealand
Duration: Dec 2, 2019 to Dec 5, 2019

Publication series

Name: BDCAT 2019 - Proceedings of the 6th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies

Conference

Conference: 6th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2019
Country/Territory: New Zealand
City: Auckland
Period: 2019-12-02 to 2019-12-05

All Science Journal Classification (ASJC) codes

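  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Decision Sciences (miscellaneous)
  • Information Systems and Management
  • Communication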
