An optimized hybrid ensemble machine learning model combining multiple classifiers for detecting advanced persistent threats in networks
Document Type
Article
Publication Title
Journal of Big Data
Abstract
Advanced Persistent Threats (APTs) are among the most dangerous cyberattacks due to their stealth, persistence, and ability to evade traditional intrusion detection systems. Existing machine learning models often fall short in identifying such threats, particularly because they fail to capture temporal dependencies and rely on monolithic feature spaces that limit adaptability. This study proposes a novel, optimized hybrid ensemble machine learning model for detecting APTs, using the realistically simulated Unraveled dataset, which captures long-term, stealthy attack behaviors often missed by conventional datasets. The model integrates Long Short-Term Memory (LSTM) networks, K-Nearest Neighbors (KNN), and Logistic Regression (LR) to leverage the strengths of each. A key novelty lies in the logical division of the top 21 predictive features across the three classifiers based on their suitability for temporal, statistical, and relational patterns. These features were identified using feature selection techniques including Information Value (IV), Weight of Evidence (WoE), and XGBoost. The initial ensemble model achieved 97.12% accuracy, demonstrating its effectiveness even before optimization. After fine-tuning of the LSTM and LR components, accuracy improved to 99.94%. This gain of 2.82 percentage points highlights the critical role of strategic feature partitioning and individualized model tuning in enhancing APT detection. The proposed approach offers a scalable and interpretable solution to address the complex nature of APTs and strengthens the robustness of modern intrusion detection systems.
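The abstract names Weight of Evidence (WoE) and Information Value (IV) as feature selection techniques without giving their form. The following is a minimal sketch of the standard textbook calculation for one pre-binned feature, not the authors' implementation. The function name `information_value`, the `smoothing` parameter, and the sign convention (attack share over benign share) are assumptions made for illustration.

```python
import math
from collections import Counter


def information_value(bins, labels, smoothing=0.5):
    """Return (woe_by_bin, iv) for one binned feature.

    bins   : bin identifier per sample (hypothetical, e.g. "low"/"high")
    labels : 1 for attack traffic, 0 for benign traffic (assumed encoding)
    smoothing : additive constant per bin so empty cells do not yield log(0)
    """
    attack = Counter(b for b, y in zip(bins, labels) if y == 1)
    benign = Counter(b for b, y in zip(bins, labels) if y == 0)
    categories = sorted(set(bins))

    # Smoothed totals, so the per-bin shares still sum to 1.
    total_attack = sum(attack.values()) + smoothing * len(categories)
    total_benign = sum(benign.values()) + smoothing * len(categories)

    woe, iv = {}, 0.0
    for c in categories:
        p = (attack[c] + smoothing) / total_attack  # share of attacks in bin c
        n = (benign[c] + smoothing) / total_benign  # share of benign in bin c
        woe[c] = math.log(p / n)                    # WoE of bin c
        iv += (p - n) * woe[c]                      # IV sums the weighted WoE
    return woe, iv


# Toy data: one feature that separates the classes, one that does not.
labels = [1, 1, 0, 0]
woe_sep, iv_sep = information_value(["a", "a", "b", "b"], labels)
woe_flat, iv_flat = information_value(["a", "b", "a", "b"], labels)
```

Under this sketch, features would be ranked by IV and the highest-ranked ones retained. How the paper combines IV, WoE, and XGBoost importances to arrive at its 21 features is not stated in the abstract.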
DOI
10.1186/s40537-025-01272-w
Publication Date
12-1-2025
Recommended Citation
Ibrahim, Nadim; Rajalakshmi, N. R.; Sivakumar, V.; and Sharmila, L., "An optimized hybrid ensemble machine learning model combining multiple classifiers for detecting advanced persistent threats in networks" (2025). Open Access archive. 11829.
https://impressions.manipal.edu/open-access-archive/11829