Intrusion Detection Using Data Mining
Abstract
Intrusion detection offers a viable and practical defense strategy for today's large, modern computer and network systems. Intrusion monitoring systems analyze host audit trails and network traffic, with a primary focus on the timely identification of potential threats. The rapid growth of data and internet connectivity has increased security vulnerabilities, making intrusion detection a critical defense mechanism for large networks. Intrusions encompass actions such as peeping, snooping, and spying, which threaten network security and data integrity. Data mining is a tool for extracting valuable insights from vast datasets, aiding prediction and knowledge discovery. Cloud computing, a pivotal technology, offers on-demand resources but introduces security challenges, including data storage, access control, and integration with existing security measures. Intrusion detection and prevention are vital for mitigating risks and ensuring network and data security in the evolving digital landscape. This paper presents a comparative analysis of various classification techniques to determine their performance in the context of a Network Intrusion Detection System (NIDS). The primary goal is to identify the model that achieves the highest accuracy and precision while minimizing the false positive rate. The dataset, obtained from Kaggle, has been meticulously created and audited and comprises raw TCP/IP dump data that simulates a LAN environment. It is divided into two subsets, one for training and one for testing, which contain 42 and 41 features, respectively, with each feature serving a distinct purpose in data processing and analysis. The study explores the application of these classification techniques to network intrusion detection and evaluates their predictive accuracy and ease of implementation.
The work employed a two-stage classification (Anomaly-Misuse) approach with Decision Tree, Random Forest, KNN, SVM, and Naive Bayes as classification methods. The results demonstrated a 100% accuracy rate in both stages, with Random Forest outperforming other classifiers. This approach exhibited high detection efficiency, reduced time consumption, and cost-effectiveness, albeit with a slightly higher false positive rate. These findings underscore the efficacy of Random Forest in network intrusion detection.
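The evaluation criteria named in this abstract (accuracy, precision, and false positive rate) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `binary_metrics` and the label encoding (1 for an intrusion, 0 for normal traffic) are assumptions made for illustration, and the labels in the usage lines are invented rather than drawn from the paper's dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and false positive rate.

    Assumed encoding (not from the paper): 1 = intrusion, 0 = normal.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one labelled record is required")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1  # intrusion correctly flagged
        elif pred == 1 and truth == 0:
            fp += 1  # normal traffic wrongly flagged
        elif pred == 0 and truth == 0:
            tn += 1  # normal traffic correctly passed
        else:
            fn += 1  # intrusion missed

    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Precision is undefined when nothing is flagged; report 0.0 then.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # FPR is undefined when there is no normal traffic; report 0.0 then.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Invented labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

Applying the same function to each classifier's predictions on the test subset would give directly comparable figures, which is the kind of comparison the abstract describes.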
Authors: A.R. Adelusi, A. Oronti, O. Abereowo, O.Y. Ogunlola, B.I. Alese
Published in: International Conference for Internet Technology and Secured Transactions (ICITST-2023)
- Date of Conference: 13-15 November 2023
- DOI: 10.20533/ICITST.2023.0026
- ISBN: 978-1-913572-63-1
- Conference Location: St Anne’s College, Oxford University, UK