TY - GEN
T1 - A Two-Phase Multi-Class Botnet Labeling Approach for Real-World Traffic
AU - Lo, Ta Chun
AU - Yang, Shan Hong
AU - Chang, Jyh Biau
AU - Chen, Chung Ho
AU - Shieh, Ce Kuen
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Within the realm of cybersecurity, botnets represent an increasingly formidable threat, characterized by diverse types exhibiting distinct behavioral patterns and characteristics. This study addresses the imperative need for real-time botnet activity detection by introducing a multi-class labeling system tailored for real-world network traffic. Employing clustering algorithms and a semi-supervised learning framework, this system efficiently labels benign traffic and performs multi-class labeling for various botnet traffic categories. Hierarchical Density-based Spatial Clustering of Applications with Noise (HDBSCAN) is harnessed for clustering both synthetic and real-world datasets, significantly enhancing labeling coverage. The remaining traffic is designated as 'unknown' and subjected to identification through a semi-supervised learning approach. A comparative analysis underscores the superiority of HDBSCAN over Density-based Spatial Clustering of Applications with Noise (DBSCAN), successfully clustering an additional 11% of data. Remarkably, our system exhibits substantial advancements in data labeling when juxtaposed with prior research efforts. This research introduces an effective solution for botnet labeling in the context of network security, thereby enhancing the capacity for detecting and mitigating malicious botnet activities.
UR - http://www.scopus.com/inward/record.url?scp=85189933653&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85189933653&partnerID=8YFLogxK
U2 - 10.1109/ICAIIC60209.2024.10463248
DO - 10.1109/ICAIIC60209.2024.10463248
M3 - Conference contribution
AN - SCOPUS:85189933653
T3 - 6th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2024
SP - 685
EP - 690
BT - 6th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2024
Y2 - 19 February 2024 through 22 February 2024
ER -