AI Algorithms for Anomaly Detection in Computer Networking
Ahmed Alkhuzaee *
Department of Electrical and Computer Engineering, King Abdulaziz University, Jeddah, Saudi Arabia.
*Author to whom correspondence should be addressed.
Abstract
This study investigates lightweight artificial intelligence models for anomaly detection in computer networking. K-Nearest Neighbour and Logistic Regression classifiers were evaluated on the UNSW-NB15 and NSL-KDD datasets under three configurations: a baseline, Recursive Feature Elimination, and Recursive Feature Elimination combined with Condensed Nearest Neighbour. Data pre-processing comprised cleaning, encoding, scaling, and the removal of attributes considered unsuitable for classification. Recursive Feature Elimination with a Random Forest estimator was applied to identify informative features, and Condensed Nearest Neighbour was used to remove redundant training instances. Model performance was assessed using accuracy, precision, recall, F1-score, and runtime. The results indicate that K-Nearest Neighbour generally outperformed Logistic Regression across the evaluated settings. On the UNSW-NB15 dataset, K-Nearest Neighbour reached 90.29% accuracy and an F1-score of 89.56% with 10 selected features, while Logistic Regression achieved 88.13% accuracy and an F1-score of 87.41% with 30 features. On the NSL-KDD dataset, K-Nearest Neighbour achieved 79.97% accuracy and an F1-score of 75.80% with 30 features. Combining Recursive Feature Elimination with Condensed Nearest Neighbour reduced the runtime of K-Nearest Neighbour on UNSW-NB15 from 86.77 s to 1.07 s, at the cost of lower accuracy and F1-score. Overall, the findings suggest that feature selection and instance reduction can support computationally efficient intrusion detection while preserving acceptable classification performance.
Keywords: Anomaly detection, computer networking, intrusion detection, lightweight machine learning, recursive feature elimination, condensed nearest neighbour, K-nearest neighbour, logistic regression, network security, computational efficiency
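The instance-reduction step named in the abstract can be illustrated with a minimal sketch of Condensed Nearest Neighbour in its classic 1-nearest-neighbour form (Hart's rule). This is not the study's implementation: the function names (nearest_label, condense, make_clusters), the synthetic two-cluster data, and the random seeds are all assumptions made for illustration, and only the Python standard library is used. The idea is to keep a training sample only if the samples already kept would misclassify it, so that the retained subset still labels every discarded sample correctly.

```python
import math
import random


def nearest_label(store, point):
    """Return the label of the stored sample closest to `point` (1-NN)."""
    closest = min(store, key=lambda sample: math.dist(sample[0], point))
    return closest[1]


def condense(samples, seed=0):
    """Hart-style condensed nearest neighbour (illustrative sketch).

    `samples` is a list of (feature_tuple, label) pairs. The returned
    subset classifies every discarded sample correctly under 1-NN.
    """
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)

    kept = [order[0]]            # start the store with one sample
    kept_set = {order[0]}
    changed = True
    while changed:               # repeat until a full pass adds nothing
        changed = False
        for i in order:
            if i in kept_set:
                continue
            features, label = samples[i]
            store = [samples[j] for j in kept]
            if nearest_label(store, features) != label:
                kept.append(i)   # misclassified, so it carries information
                kept_set.add(i)
                changed = True
    return [samples[j] for j in kept]


def make_clusters(n_per_class=200, seed=42):
    """Synthetic two-class data (hypothetical stand-in for real traffic)."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (5.0, 5.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    return data


data = make_clusters()
reduced = condense(data)
print(f"training instances: {len(data)} -> {len(reduced)}")
```

Because K-Nearest Neighbour compares each query against every stored training instance, a smaller store means fewer distance computations per prediction, which is consistent with the runtime reduction reported in the abstract. The retained subset depends on the visiting order, so different seeds can give different subsets and slightly different accuracy.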