Systematic Review

Towards achieving lightweight intrusion detection systems in Internet of Things, the role of incremental machine learning: A systematic literature review

[version 1; peer review: 1 not approved]
PUBLISHED 24 Nov 2022

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

While the benefits of the Internet of Things (IoT) cannot be overstated, its computational constraints make it challenging to deploy the security methodologies used in traditional computing systems. Together, these benefits and constraints have made IoT systems attractive targets for cyber-attacks. One way to mitigate these attacks is to detect them. In this study, a Systematic Literature Review (SLR) has been conducted to analyze the role of incremental machine learning in achieving lightweight intrusion detection for IoT systems. The study analyzed existing incremental machine learning approaches used in designing intrusion detection systems (IDSs) for IoT ecosystems, emphasizing the incremental methods used in detecting intrusions, the datasets used to evaluate these methods, and how each method achieves lightweight status. The SLR outlined the contributions of each study, focusing on their strengths and gaps, the datasets used, and the incremental machine learning model used. This study revealed that incremental learning approaches to detecting intrusions in IoT systems are in their infancy. From 2010 to 2022, a total of twenty-one (21) studies on IDSs using incremental machine learning met our selection criteria, of which eight (8) targeted IoT systems. In addition to reviewing the literature, we offer suggestions for improving existing solutions and achieving lightweight IDS for IoT systems. We also discuss some problems with making lightweight IDS for IoT systems and areas where more research could be done in the future.

Keywords

Internet of Things, Incremental Machine Learning, Online Machine Learning, Intrusion Detection System, Anomaly Detection, Network Security

Introduction

The past three decades have seen a massive paradigm shift in computing technology, driven mainly by increased computing power and communication speed. The latter has enabled the development of intelligent devices that can communicate with each other. These devices make up the Internet of Things (IoT) ecosystem, which is deployed in cities, healthcare, energy, agriculture, transportation, and industry. The IoT has become a household name because of the benefits it brings to these domains. However, these benefits have also made the IoT ecosystem attractive to cyber attackers. Over the years, researchers have proposed security measures such as encryption, authentication, data confidentiality, access control, and privacy to secure IoT environments. Despite these measures, IoT systems remain vulnerable and highly susceptible to cyber-attacks.

An alternative approach to fighting these attacks is to detect them using Intrusion Detection Systems (IDS). In traditional computer networks, an IDS monitors the network's activities. Although intrusion detection has been well explored in traditional computing and network systems since the 1980s, it is still in its infancy in IoT security [1]. The computational constraints of IoT systems make it impractical to deploy traditional IDS in IoT environments. Although many IDS solutions have been proposed, most use offline or batch machine learning models, which makes them computationally expensive and difficult to deploy on IoT devices. An alternative is to build IDS that learn from data streams, which can keep computational usage minimal.

In this paper, we present a systematic review of incremental ML-based IDS in IoT systems. Several surveys and review articles have covered IDS for IoT, but to the best of our knowledge, none of them focuses on the role of incremental machine learning and how it can lead to lightweight IDS suitable for the IoT ecosystem. This study covers work on incremental ML-based IDS for IoT systems from 2010 to 2022; this period was adopted because we wanted to analyze the research trend since 2010. We used the methods identified by [2,3] to conduct our study. This work differs from existing studies by exploring how incremental machine learning methods achieve lightweight IDS that fit the computational constraints of IoT systems. The study further considers some of the general problems facing the implementation of IDS in IoT environments.

A total of 168 studies were returned based on our search criteria, but only eight (8) were found to be IoT-based after applying our inclusion and exclusion criteria. Because this number was small, we decided to include studies that satisfied the inclusion criteria but were not IoT-based, which yielded a further 13 studies. This brought the total number of studies considered in this SLR to 21. However, most of our analyses focus on the IoT-based studies, since they are this study's objective. The primary contributions of our study are as follows:

  • This work conducts a comprehensive systematic literature review of incremental ML methods in designing IDS for IoT systems.

  • This work also provides a detailed analysis and discussion of how incremental ML models could effectively fit into real-time IDSs for IoT systems.

  • Furthermore, the study analyzes the strengths and weaknesses of the IoT-based articles considered in this systematic literature review.

  • This work also identifies the most critical problems with IDS research in IoT systems and suggests future research.

The rest of the paper is structured as follows. Sections 2 and 3 discuss related works and the research methodology employed. Sections 4 and 5 present the findings, challenges, and future research directions. Section 6 discusses validity threats, while Section 7 concludes the study.

Related works

This section presents some survey and review papers that are closely related to our study. We considered existing survey and review studies focusing on intrusion detection in IoT systems for the past five years, 2017–2022. We considered the past five years because most surveys and SLRs for IoT-based intrusion detection systems that are of interest to our study were done during this period.

The authors of [4] conducted a comprehensive survey of the latest intrusion detection systems designed specifically for IoT systems. The study focused on the methods and features implemented in each study while providing insights into the various architectures used in IoT and some emerging vulnerabilities. The authors also looked at factors that affect the performance of IDS in intelligent environments. Some factors identified were detection accuracy, false positive rate, energy consumption, processing time, and the overall performance overhead.

The authors of [5] also presented a review of IDS for IoT environments, focusing on the techniques and deployment strategies used by each of the studies included in their work. The authors also considered the validation strategies and the datasets used in the respective works covered in their study. Moreover, the study discussed some challenges facing intrusion detection in IoT systems.

Benkhelifa et al. [6] presented a survey that captures the practices and challenges facing intrusion detection systems in the Internet of Things. The authors considered various IDS solutions used in IoT environments. Each solution identified in the study was treated as a strategy for improving the detection methods.

Mishra et al. [7] presented a study that compares models that detect and prevent distributed denial of service attacks. The study also discussed the different classifications of methods, models, and datasets used to build IDS. It looked at research challenges in IDS, proposed some solutions to mitigate these challenges, and presented some areas that can be considered for future studies.

The authors of [8] provided an overview of the current security challenges of the IoT and how these challenges can be solved using IDS. The study also explored future challenges in IoT and how they can be addressed using intrusion detection.

In a similar study, the authors of [9] presented a review of machine learning-based intrusion detection systems in IoT environments, discussing various Machine Learning (ML) approaches used in designing IDS with emphasis on their advantages and disadvantages. The authors concluded their study by looking at some of the research challenges and possible future directions for work on IDS in IoT.

Arshad et al. [10] conducted a comprehensive study of existing intrusion detection systems for IoT systems using three parameters, namely, computational overhead, energy consumption, and privacy implications. The study also identified some open challenges in this area.

In another study, the authors of [11] conducted a systematic review of the literature to examine existing works in anomaly-based intrusion detection that use deep learning (DL) techniques. The study also discussed the challenges faced by DL-based anomaly detection in the IoT domain and some areas that can be considered for future work.

Similarly, the authors of [1] conducted a survey of intrusion detection in IoT environments with a focus on the detection methods, placement strategy, security threats, and validation strategy.

Seyfollahi et al. [12] reviewed machine learning techniques used in designing intrusion detection systems for the Routing Protocol for Low-Power and Lossy Networks (RPL). The study also identified open issues and challenges related to this domain.

The authors of [13] gave an overview of intrusion detection systems for IoT networks and presented some suggestions for future work that could help make IoT networks more secure.

Chaabouni et al. [14] also conducted a survey that sought to classify IoT security threats and challenges. The study analyzed and compared state-of-the-art network intrusion detection systems (NIDS) in the context of IoT networks, considering the architecture, detection methodologies, validation strategies, and deployed algorithms.

Saranya et al. [15] analyzed the performance of machine learning models used in the design of IDS for IoT systems. Besides the fact that none of these surveys or SLRs considered incremental ML-based IDSs, they had other gaps which have been considered in our study.

A summary of the related literature is shown in Table 1 below.

Table 1. Summary of Related Literature.

| SN | Study | Type of Study | Year of Publication | Research Gap |
|---|---|---|---|---|
| 1 | [1] | Survey | 2017 | The study did not report on the strengths and weaknesses of the papers considered in the study. |
| 2 | [14] | Survey | 2019 | The study did not report on the strengths and weaknesses of the papers considered in the study. The study also did not report on how these IDSs methodologies impact IoT resources. |
| 3 | [15] | Review | 2020 | The primary objective of the study is not focused on IoT environment. The study did not report on the strengths and weaknesses of the papers considered in the study. |
| 4 | [12] | Survey | 2021 | The study is focused on IDSs for RPL routing protocol. |
| 5 | [11] | Survey | 2021 | The study only considered anomaly-based IDSs in IoT that use deep learning approaches. |
| 6 | [10] | Review | 2020 | The study did not report on the strengths and weaknesses of the papers considered in the study. The study also did not report the impact these IDSs methodologies have on IoT devices. |
| 7 | [9] | Review | 2020 | The study did not report on the strengths and weaknesses of the papers considered in the study. The study also did not report the impact these IDSs methodologies have on IoT devices. |
| 8 | [8] | Review | 2019 | The study did not analyse the strengths and weaknesses of the papers selected for the study. The study also did not report the impact these IDSs methodologies have on IoT devices. |
| 9 | [7] | Review | 2021 | The study did not report the impact these IDSs methodologies have on IoT devices. |
| 10 | [6] | Review | 2018 | The study focuses only on the architectural design and detection approaches used in Intrusion Detection Systems. |
| 11 | [5] | Review | 2021 | The study did not analyse the strengths and weaknesses of the papers selected for the study. The study also did not report the impact these IDSs methodologies have on IoT devices. |
| 12 | [4] | Review | 2018 | The study primarily focuses on the general overview of IDS without considering the specific challenges IoT-based IDS faces. The study did not analyse the strengths and weaknesses of the papers selected for the study. The study also did not report the impact these IDSs methodologies have on IoT devices. |
| 13 | [13] | Review | 2019 | The study did not analyse the strengths and weaknesses of the papers selected for the study. The study also did not report the impact these IDSs methodologies have on IoT devices. |

Research method

In this section, we outline the method used in this study, which follows [2,3] and the general principles for conducting systematic reviews. The methodology proposed by [2] and [3] has five steps:

  • The formulation of crucial research questions.

  • The formulation of the search process

  • The formulation of the general criteria for the selection of articles.

  • The data extraction process, and

  • The execution of analysis and classification

Research questions

The following four research questions were considered in selecting the various papers used in this study.

  • RQ1: What is the primary contribution of the paper?

  • RQ2: What incremental or online machine learning algorithm was used in this study?

  • RQ3: How does the proposed method handle data, feature, or concept drift?

  • RQ4: How do the proposed IDS handle the computational constraints of IoT systems?

RQ1 focuses on the primary contribution of each of the papers considered in our study. We looked at studies that used incremental or online machine-learning approaches to deploy intrusion detection in IoT environments. The goal is to provide readers and researchers with an overview of the problem and how it is addressed.

RQ2 examines which incremental or online machine-learning algorithm was used in each study.

RQ3 focuses on how the method proposed in RQ2 handles data, feature, or concept drift. Static models are generated by machine learning using historical data. However, once in production, ML models become unreliable, obsolete, and degrade over time. Changes in data distribution may occur during production, resulting in biased predictions. User behavior may have changed compared to the baseline data used to train the model, or there may have been additional factors in real-world interactions that influenced the predictions. Data drift is a significant cause of model accuracy deterioration over time.
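As a minimal sketch of the data drift described above, the following compares a recent window of one feature against the training baseline by measuring how far the window mean has moved, in units of standard error. The feature (packet size), the window sizes, and the threshold of 4.0 are illustrative assumptions and are not taken from any of the reviewed studies.

```python
import random
import statistics


def mean_shift_score(baseline, window):
    """Return how many standard errors the window mean lies from the baseline mean."""
    mu = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    standard_error = sigma / (len(window) ** 0.5)
    return abs(statistics.fmean(window) - mu) / standard_error


def has_drifted(baseline, window, threshold=4.0):
    """Flag drift when the window mean is more than `threshold` standard errors away."""
    return mean_shift_score(baseline, window) > threshold


rng = random.Random(0)
# Hypothetical feature: packet size in bytes observed while training the model.
baseline = [rng.gauss(500.0, 50.0) for _ in range(2000)]
# Production traffic drawn from the same distribution, then from a shifted one.
stable_window = [rng.gauss(500.0, 50.0) for _ in range(200)]
shifted_window = [rng.gauss(650.0, 50.0) for _ in range(200)]

print("stable window drifted: ", has_drifted(baseline, stable_window))
print("shifted window drifted:", has_drifted(baseline, shifted_window))
```

A check of this kind only looks at the input distribution of a single feature; it says nothing about whether the relationship between features and labels has changed.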

The fourth research question (RQ4) aims to answer how the methods or models proposed in each of the studies handle the computation constraints of IoT devices. One limitation of IoT devices is their limited computational resources, which is one reason why traditional IDS cannot be deployed in IoT environments. It is in this regard that we looked at how each study handled the resource constraints of IoT systems while building an IDS for the same environment.

Protocol and phases of the study

This work was conducted using the guidelines stipulated in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [16]. To adapt the PRISMA guidelines to computer science, we combined them with the guidelines proposed by Kitchenham [17]. Figure 1 below shows the flow diagram of the inclusion and exclusion process.

Figure 1. PRISMA flow diagram [18].

Inclusion and exclusion criteria

In this study, we considered articles published in peer-reviewed journals. In order for an article to be included in our study, it must fulfill seven criteria, which are elaborated on in Table 2.

Table 2. Inclusion and exclusion criteria.

| SN | Criteria | Justification |
|---|---|---|
| 1 | The study must not be a review or survey paper but an original research paper. | Review and survey papers will not fully answer our research questions. |
| 2 | The proposed IDS must use incremental or online machine learning methods and must be deployed in either IoT or non-IoT environments. | The study seeks to analyse incremental ML based IDS in IoT. Therefore, papers included in the study must use incremental ML approach to solve IDS problems in IoT systems. |
| 3 | The article must be written in English. | The English language was the common medium of communication for all authors involved in this study. |
| 4 | The IDS model proposed must be evaluated using a real world dataset or network traffic. | The study intends to inform readers about the applicability of the proposed solutions, which can be accomplished when these solutions are properly evaluated. |
| 5 | The study must be a full-length paper. | Short papers like abstracts may not cover all the important aspects of a study. Some details of proposed solutions could be left out, as well as evaluation details. |
| 6 | The study should have been published from 2010 to 2022. | The period considered for this SLR was from 2010 to 2022. |
| 7 | The study has to be published in a peer reviewed journal and must not be a conference proceeding. | Journal articles are rigorously peer reviewed. |
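
The seven criteria in Table 2 can be read as a single conjunctive filter over candidate papers. The sketch below is one hypothetical encoding of that filter; the `Paper` record and its field names are assumptions introduced for illustration and do not describe any tool used in this study.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical fields, one per criterion in Table 2.
    is_review_or_survey: bool
    uses_incremental_ml: bool
    language: str
    evaluated_on_real_data: bool
    is_full_length: bool
    year: int
    venue_type: str  # for example "journal" or "conference"


def is_included(paper: Paper) -> bool:
    """Return True only if the paper satisfies all seven criteria."""
    return (
        not paper.is_review_or_survey          # criterion 1
        and paper.uses_incremental_ml          # criterion 2
        and paper.language == "English"        # criterion 3
        and paper.evaluated_on_real_data       # criterion 4
        and paper.is_full_length               # criterion 5
        and 2010 <= paper.year <= 2022         # criterion 6
        and paper.venue_type == "journal"      # criterion 7
    )


candidate = Paper(False, True, "English", True, True, 2021, "journal")
print(is_included(candidate))
```

Because the criteria are combined with a logical AND, failing any single one (for example, a conference paper from 2021) excludes the paper.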

Quality assessment criteria

To eliminate bias and make our study easily reproducible, we used a quality assessment procedure based on [17]. Quality assessment criteria (QAC) play a vital role in conducting systematic literature reviews: they provide a process that sharpens the criteria for selecting research papers. The QAC were applied using a set of quality assessment questions (QAQs). The QAQs were used to create a checklist against which we compared each paper to ensure that it met the QAC and answered our RQs. If a study answers a question from the checklist, we mark it "Yes"; if it does not, we mark it "No". Some papers only partially answer a question; such responses are marked "P". Scores were assigned to each answer: a "Yes" is worth one point, a "No" is worth zero points, and a "P" is worth 0.5 points. Each paper is evaluated against the five QAQs, and the marks are summed. We selected papers whose total was above 2.5, that is, at least 3.0. This cut-off was chosen because we did not want to include papers that only partially (50%) answered the quality assessment questions formulated for this study. Table 3 and Table 4 below show the quality assessment questions and the quality evaluation results we used in this study.

Table 3. Quality assessment questions.

| Number | Quality Assessment Questions (QAQ) |
|---|---|
| QA 1 | Are the research's goals or objectives clearly stated? |
| QA 2 | Is there any response to the posed RQs in the paper? |
| QA 3 | Is there any connection between the objectives, methodology, experimentation, and conclusion? |
| QA 4 | Is there an experimental validation in the study to answer the research question? |
| QA 5 | Are the study's findings compared to other works? |

Table 4. Quality evaluation of the selected studies.

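| SN | Study | QA1 | QA2 | QA3 | QA4 | QA5 | Total Score |
|---|---|---|---|---|---|---|---|
| 1 | [19] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 2 | [20] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 3 | [21] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 4 | [22] | Yes | P | Yes | Yes | Yes | 4.5 |
| 5 | [23] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 6 | [24] | Yes | P | Yes | Yes | Yes | 4.5 |
| 7 | [25] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 8 | [26] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 9 | [27] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 10 | [28] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 11 | [29] | Yes | P | Yes | Yes | Yes | 4.5 |
| 12 | [30] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 13 | [31] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 14 | [32] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 15 | [33] | Yes | Yes | Yes | Yes | Yes | 4.5 |
| 16 | [34] | Yes | Yes | Yes | Yes | Yes | 5.0 |
| 17 | [35] | Yes | P | Yes | Yes | Yes | 4.5 |
| 18 | [36] | Yes | P | P | Yes | Yes | 4.0 |
| 19 | [37] | Yes | Yes | Yes | Yes | Yes | 4.5 |
| 20 | [38] | Yes | P | Yes | Yes | Yes | 4.5 |
| 21 | [39] | Yes | P | Yes | Yes | Yes | 4.5 |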
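
The scoring rule described in this subsection (one point for "Yes", 0.5 for "P", zero for "No", and selection when the total is above 2.5) can be written as a short worked example. The function names are assumptions introduced here; the three answer patterns are those listed in Table 4 for studies 19, 22, and 36, and the last pattern is a made-up borderline case.

```python
SCORES = {"Yes": 1.0, "P": 0.5, "No": 0.0}
THRESHOLD = 2.5  # a paper is kept only when its total is above this value


def total_score(answers):
    """Sum the points for the five quality assessment answers of one paper."""
    return sum(SCORES[answer] for answer in answers)


def is_selected(answers, threshold=THRESHOLD):
    """Apply the selection rule: keep the paper if its total exceeds the threshold."""
    return total_score(answers) > threshold


examples = {
    "study 19": ["Yes", "Yes", "Yes", "Yes", "Yes"],
    "study 22": ["Yes", "P", "Yes", "Yes", "Yes"],
    "study 36": ["Yes", "P", "P", "Yes", "Yes"],
    "hypothetical borderline paper": ["Yes", "P", "P", "P", "No"],
}
for name, answers in examples.items():
    print(name, total_score(answers), is_selected(answers))
```

Since scores move in steps of 0.5, "above 2.5" is equivalent to "at least 3.0", so the hypothetical borderline paper with a total of exactly 2.5 is not selected.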

Information sources and selection process

We manually searched six research databases for the articles included in this study. The databases considered are as follows:

  • IEEE Xplore

  • ScienceDirect

  • Wiley

  • ACM Digital Library

  • MDPI

  • Springer

The search process involved five keywords: incremental learning, online machine learning, internet of things, intrusion detection, and anomaly detection. The keywords were connected using the words "AND" and "OR." Generally, the search terms were framed as "Internet of Things AND Incremental Learning AND Intrusion Detection OR Anomaly Detection OR Online Machine Learning." The search terms were targeted at the authors' keywords provided in each paper.
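
The search string above mixes AND and OR without parentheses, so its meaning depends on how each database resolves operator precedence. As a hedged illustration, the sketch below encodes one plausible grouping, Internet of Things AND (Incremental Learning OR Online Machine Learning) AND (Intrusion Detection OR Anomaly Detection), and applies it to a paper's author keywords. The grouping and the function name `matches_query` are assumptions made for illustration; the text does not state how the terms were grouped.

```python
def matches_query(keywords):
    """Check author keywords against one explicit (assumed) grouping of the search terms."""
    kw = {k.strip().lower() for k in keywords}
    in_iot = "internet of things" in kw
    is_incremental = bool(kw & {"incremental learning", "online machine learning"})
    is_detection = bool(kw & {"intrusion detection", "anomaly detection"})
    return in_iot and is_incremental and is_detection


print(matches_query(["Internet of Things", "Online Machine Learning", "Anomaly Detection"]))
print(matches_query(["Internet of Things", "Deep Learning", "Intrusion Detection"]))
```

Writing the grouping out explicitly makes the search reproducible across databases that would otherwise parse the unparenthesized string differently.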

Results

In this section, we present the results of the systematic literature review carried out.

Publications by journal

Table 5 shows the research databases considered in our study and the number of articles each returned for the period under review, before applying the inclusion and exclusion criteria. The search returned a total of 159 articles. IEEE Xplore returned 68 results, ScienceDirect returned 21 results, and Wiley returned 8 results. MDPI, Springer, and ACM returned 8, 44, and 10 results, respectively. Table 6 shows the number of articles considered in this study after applying the inclusion and exclusion criteria and our quality assessment criteria. A total of twenty-one (21) articles were selected from the six databases. IEEE Xplore had nine (9) publications, ScienceDirect had six (6), MDPI had three (3), Wiley had three (3), and Springer and ACM Digital Library had none.

Table 5. Publications by journal before applying inclusion and exclusion criteria.

| SN | Journal | Number of Publications |
|---|---|---|
| 1 | IEEE | 68 |
| 2 | Science Direct | 21 |
| 3 | Wiley | 8 |
| 4 | MDPI | 8 |
| 5 | Springer | 44 |
| 6 | ACM | 10 |

Table 6. Publications by Journal after applying inclusion and exclusion criteria.

| SN | Journal | Number of Publications |
|---|---|---|
| 1 | IEEE | 9 |
| 2 | Science Direct | 6 |
| 3 | MDPI | 3 |
| 4 | Wiley | 3 |
| 5 | Springer | 0 |
| 6 | ACM | 0 |

Contributions of each study

The parameters considered in determining the contribution of a study are how it handles drift adaptation, the lightweight status of its model, the running time of the model, and the memory consumption of the model. The parameters considered for the contribution of the studies are shown in Figure 1 below.

In Table 7, we present the contributions of each study in terms of drift adaptation, the lightweight status of models, the running time of models, and the memory consumption of models. Only the 8 IoT-based studies were considered in this analysis. Two of the eight (8) studies deployed solutions that could handle drifts in either data or concepts. Seven of the eight studies focused on designing lightweight models. We also looked at how each of the studies handled computational complexity. Four (4) out of the eight (8) studies reported time complexity, while only two out of the eight (8) studies reported the space complexity of their proposed model. None of the eight IoT-based studies reported the energy consumption of their proposed model.

Table 7. Contributions of each IoT based study.

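| SN | Study | Drift adaptation | Lightweight model | Model running time | Memory consumption | Energy consumption | Computational complexity |
|---|---|---|---|---|---|---|---|
| 1 | [26] | | | | | | |
| 2 | [30] | | | | | | |
| 3 | [29] | | | | | | |
| 4 | [28] | | | | | | |
| 5 | [39] | | | | | | |
| 6 | [27] | | | | | | |
| 7 | [31] | | | | | | |
| 8 | [21] | | | | | | |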

Strength and weakness of each study

In Table 8 and Table 9, we present a summary of the strengths and weaknesses of the IoT-based intrusion detection studies considered in our study. We chose to report on the strengths and weaknesses of the IoT-based IDSs because they are the core of our study.

Table 8. Strength and weakness of each study.

| SN | Study | Models used | Strengths | Weakness |
|---|---|---|---|---|
| 1 | [26] | Incremental Support Vector Machine | This article is unique in that it employs classifier selection to determine whether the one-class SVM classification is reasonably reliable. | The study did not report on how the proposed method would impact on the computational resource of cyber-physical systems. |
| 2 | [30] | Online Sequential Extreme Learning Machine; Recursive least squares based classifiers; Ensemble learning | The research presents a general-purpose, online learning, decentralized anomaly detection framework with a diverse set of local anomaly detection algorithms and computational resources that are compatible with the stringent limitations of embedded platforms commonly used in WSNs. | Although the study used a simulator to calculate the computational complexity of the various methods, it did not report the actual CPU and memory consumption of their proposed model. |
| 3 | [29] | Online Deep Learning; Principal Component Analysis | Using a deep neural network that adjusts neural network sizes dynamically based on the Hedge weighting mechanism. As new data becomes available, the goal is to encourage continuous learning and model adaptation. | Even though the study's primary focus is detecting intrusions under data and concept drifts, it is important to report how the method used to detect drifts affects the model's memory and training time. |
| 4 | [27] | Convolutional Neural Network | To reduce the overhead on the centralized edge classifier, a distributed IDS concept is proposed, resulting in the shortest possible latency between the pre-processing and decision-making phases. The study reported on the time complexity of the method used. | The space complexity wasn't reported. The dataset used for the experimental validation is a non IoT-based dataset. |
| 5 | [31] | Adaptive Random Forest; Hoeffding Adaptive Tree | Using an incremental learning approach to detect botnet attacks in IoT environments. | The study did not report on time and memory consumption of the proposed method. The study did not report the framework and libraries used to build the proposed model. |
| 6 | [28] | Light Gradient Boosting Machine; Optimized Adaptive and Sliding Windowing; Particle Swarm Optimization | The study proposed Optimized Adaptive Sliding Windowing (OASW), a novel drift adaptation method, to address the problem of concept drifting. | The study only focused on binary classification. |
| 7 | [39] | Online incremental Support Vector Data Description; Adaptive Sequential Extreme Learning Machine | On IIoT devices, a lightweight NIDS based on an online incremental Support Vector Data Description anomaly detection system and an Adaptive Sequential Extreme Learning Machine on a multi-access edge computing server is proposed. | The proposed method's time and memory consumption were not reported in the study. |

Table 9. Strength and weakness of each study.

| SN | Study | Models used | Strengths | Weakness |
|---|---|---|---|---|
| 8 | [21] | Online Growing Random Trees | The study proposes an iterative anomaly detection method for data streams based on tree ensembles. This unsupervised technique adds a tree growth procedure that can incorporate new data information into the existing model on a continuous basis. | The proposed method's time and memory consumption were not reported in the study. |

Datasets used for validation

This study also considered the datasets used for experimental validation in the 8 IoT-based papers. These are the N-BaIoT, NSL-KDD, KDD CUP 99, UNSW-NB15, IoTID20, DS2OS traffic traces, Intel Lab, Sensorscope, and Secure Water Treatment datasets. Among them, N-BaIoT, Intel Lab, UNSW-NB15, and IoTID20 are based on IoT traffic. Table 10 shows the summary of the datasets used in each study.

Table 10. Dataset Used for validation.

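| Dataset | Papers | Count |
|---|---|---|
| N-BaIoT | [31] | 1 |
| NSL-KDD | [28], [27] | 2 |
| KDD CUP 99 | [27] | 1 |
| UNSW-NB15 | [39] | 1 |
| IoTID20 | [28] | 1 |
| DS2OS traffic traces | [29] | 1 |
| Intel Lab | [30] | 1 |
| Sensorscope | [30] | 1 |
| Secure Water Treatment | [21], [26] | 2 |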

Number of publications per year

In this subsection, we analyze the number of publications per year that met our established quality assessment criteria. The number of publications per year is shown in Table 11. No publication met the criteria of our study in the years 2010, 2012, 2016, and 2018. The years 2011, 2014, 2015, 2017, and 2019 recorded one publication each. The highest number of publications was recorded in 2020, with nine (9) publications. There were 2 publications each in 2013 and 2021, and 3 publications in 2022. Turning to the IoT-based studies, there was one (1) publication in 2015, two (2) publications in 2020, 3 publications in 2021, and 2 publications in 2022.

Table 11. Publication by Year.

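| SN | Year of Publication | Number of Publications |
|---|---|---|
| 1 | 2010 | 0 |
| 2 | 2011 | 1 |
| 3 | 2012 | 0 |
| 4 | 2013 | 2 |
| 5 | 2014 | 1 |
| 6 | 2015 | 1 |
| 7 | 2016 | 0 |
| 8 | 2017 | 1 |
| 9 | 2018 | 0 |
| 10 | 2019 | 1 |
| 11 | 2020 | 9 |
| 12 | 2021 | 2 |
| 13 | 2022 | 3 |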

Challenges and directions for future work

In this section, we present some challenges we identified based on the analysis of our study. To begin with, we found that 2 of the IoT-based studies used datasets (NSL-KDD and KDD CUP 99) that are no longer relevant when designing modern-day IDS. Additionally, these datasets are non-IoT-based. Therefore, we recommend that future work use datasets from IoT environments to build and evaluate IDS for IoT systems.

Secondly, we discovered that seven (7) of the 8 IoT-based studies designed lightweight IDS for IoT systems. However, only one reported on the proposed system's memory consumption, and three (3) reported on the running time of the proposed methods. Only one study reported the computational complexity of the model used in designing their proposed IDS.

Additionally, in designing lightweight IDSs for IoT systems, parameters such as the time complexity, space complexity, and power consumption of the proposed IDS should be evaluated. The portability of an IoT-based IDS is as important as its accuracy, precision, or recall. Therefore, we propose that future work include performance metrics that measure the time and space complexity and power consumption of the proposed methods. Furthermore, none of the IoT-based studies considered in this work deployed the proposed IDS on an IoT device. It is crucial not only to model IDSs for IoT ecosystems but also to deploy these models on IoT devices. Deploying models on real devices helps to evaluate parameters such as space complexity and energy consumption. It also helps to evaluate the model's performance on drift adaptation and to determine its accuracy in production environments. We recommend that future studies on IDS for IoT systems incorporate model deployment on physical devices to evaluate how these models will perform in production environments.
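
As a rough illustration of how running time and memory could be reported alongside accuracy, the sketch below wraps a detection routine with Python's standard `time.perf_counter` and `tracemalloc`. The helper `profile` and the stand-in routine `score_batch` are hypothetical names introduced here, not part of any reviewed system. Note that `tracemalloc` only tracks memory allocated by the Python interpreter, and that energy consumption cannot be measured this way; it would need measurement on the target device.

```python
import time
import tracemalloc


def profile(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def score_batch(samples):
    # Hypothetical stand-in for an IDS inference step: flags values far from the mean.
    mean = sum(samples) / len(samples)
    return [abs(x - mean) > 3.0 for x in samples]


flags, seconds, peak_bytes = profile(score_batch, [float(i % 10) for i in range(10_000)])
print(f"time: {seconds:.4f} s, peak Python heap: {peak_bytes / 1024:.1f} KiB")
```

Figures obtained on a development machine are only indicative; the same measurement would have to be repeated on the constrained device where the IDS is meant to run.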

Concept drift in machine learning refers to a situation in which the statistical properties of the target variable change over time. In other words, the meaning of the input data used to train the model has changed significantly over time, but the model in production is unaware of the change and thus cannot make accurate predictions. Although incremental machine learning has the advantage of detecting concept drifts, only 2 of the 8 IoT-based studies considered in this work addressed concept drift adaptation. Network traffic is usually dynamic, and attackers try to circumvent IDSs by changing the signatures of known attacks, which leads to a change in the target variable. Future IDSs for the IoT ecosystem must focus on detecting drifts and learning from those drifts with minimal human intervention.
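
To make the effect of concept drift concrete, the sketch below trains a tiny online perceptron on a synthetic stream, then flips the labels to simulate a concept change. A copy that stops learning is compared with one that keeps updating in a test-then-train loop. The perceptron is a deliberately simple stand-in chosen for this illustration; it is not a model from any of the reviewed studies, and the two-point data stream is synthetic.

```python
class OnlinePerceptron:
    """Binary classifier updated one sample at a time (labels are 0 or 1)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return 1 if sum(wi * xi for wi, xi in zip(self.w, x)) + self.b > 0 else 0

    def learn(self, x, y):
        error = y - self.predict(x)  # -1, 0, or 1
        if error:
            self.w = [wi + self.lr * error * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * error


def make_stream(n, flipped):
    """Alternate two well-separated points; `flipped` swaps their labels (a concept change)."""
    stream = []
    for i in range(n):
        x = [1.0] if i % 2 == 0 else [-1.0]
        label = 1 if x[0] > 0 else 0
        stream.append((x, 1 - label if flipped else label))
    return stream


def accuracy(model, stream, keep_learning):
    """Test each sample first, then optionally train on it."""
    correct = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)
        if keep_learning:
            model.learn(x, y)
    return correct / len(stream)


before = make_stream(200, flipped=False)
after = make_stream(200, flipped=True)

static, adaptive = OnlinePerceptron(1), OnlinePerceptron(1)
accuracy(static, before, keep_learning=True)    # both models learn the first concept
accuracy(adaptive, before, keep_learning=True)

print(accuracy(static, after, keep_learning=False))   # 0.0: frozen model is wrong on every sample
print(accuracy(adaptive, after, keep_learning=True))  # 0.99: recovers after two mistakes
```

Real traffic drifts far less cleanly than this toy stream, which is why the reviewed drift-aware studies pair incremental learners with explicit drift detection or windowing.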

Furthermore, the datasets used in the IoT-based studies, like most datasets used in modeling IDSs, are imbalanced, which can give models high accuracy but low precision. To address this challenge, more studies can be conducted on creating balanced datasets from IoT systems. Moreover, unlike IDSs for traditional computing, which primarily focus on detection speed, precision, and accuracy, IDSs for the IoT ecosystem need to balance accuracy, speed, and precision against a small footprint and low energy consumption. Therefore, researchers must look at these parameters holistically to ensure that proposed IDSs for IoT systems can be deployed in such environments. It is recommended that future work in this domain focus on using models that are not computationally intensive in designing IDSs for IoT systems.
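
The effect of class imbalance on accuracy and precision can be seen in a small worked example. The confusion-matrix counts below are invented for illustration (1,000 flows, of which only 10 are attacks) and do not come from any of the reviewed datasets.

```python
def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical detector on 990 benign and 10 attack flows:
# it raises 20 alerts, of which 5 are real attacks.
result = metrics(tp=5, fp=15, fn=5, tn=975)
print(result)  # {'accuracy': 0.98, 'precision': 0.25, 'recall': 0.5}
```

Here the detector is 98% accurate even though three out of four alerts are false alarms and half of the attacks are missed, which is why accuracy alone is a weak indicator on imbalanced traffic.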

Finally, the results of the experimental validations in the IoT-based studies considered in this SLR show that incremental learning is capable of achieving the lightweight status that most IDSs for IoT systems seek to attain. However, more studies using incremental learning in the IoT ecosystem are needed to determine whether it can deliver IDSs that combine high speed, high accuracy, low energy consumption, and minimal space complexity.

Threats to validity of the study

Threats to validity affected the data extraction process and the quality assessment of the papers chosen for this SLR. Following 40, we divided the threats into four categories: internal, external, construct, and conclusion validity. Each threat is briefly described below.

  • Internal validity: This threat concerns the implementation of the SLR protocol, which includes the search terms, the data extraction process, the research method, and the quality assessment criteria.

  • Construct validity: Construct validity relates to how the search strings are constructed, how the research questions are formulated, which online databases are selected, and which inclusion and exclusion criteria are applied. The search string used in this study was designed to comprehensively address the research questions.

  • External validity: External validity concerns the degree to which the SLR results reflect the topic under review. We mitigated this threat by repeating the procedure used in our study.

  • Conclusion validity: The nature of an SLR makes it impossible to capture all relevant studies that answer the formulated research questions, so some papers have probably been missed. Using inclusion and exclusion criteria lessens the effect of personal bias and subjectivity.

Limitations of our study

We discuss some of the study’s limitations in this section. The research focused on a few carefully selected but highly referenced databases in the field of study. We admit that, like most SLRs, we had difficulty locating all of the papers associated with this study, and some papers were left out as a result. The method used in this study is meant to support our research on incremental machine learning-based intrusion detection in IoT systems.

This study’s analysis is limited to incremental machine learning-based intrusion detection systems in the Internet of Things and does not represent a complete analysis of the individual papers. We made every effort to analyze the papers presented in this work in order to answer the research questions posed in this study.

Conclusion

This study comprehensively analyzed incremental machine learning-based intrusion detection systems in the Internet of Things (IoT). Its aim was to understand existing work in this domain and to provide suggestions on how future work on IDSs for IoT systems can be enhanced. IoT has not only become a household name through its application in smart homes but has also been used in domains such as agriculture, healthcare, transportation, cities, and grid systems. While the advantages of IoT cannot be downplayed, its computational constraints make it difficult to deploy security methodologies that have been deployed in traditional computing systems. The study examined the existing state-of-the-art incremental machine learning approaches used to design lightweight intrusion detection systems for IoT environments, the datasets used, and how these studies design IDSs without overburdening the computational resources of IoT devices. As the number of things connected to the internet increases, researchers must use various methods to ensure their security. The application of ML and DL to intrusion detection has proven to be an effective mitigation strategy on traditional computers, and the trend of current research suggests that it will also become an effective strategy for detecting intrusions in IoT environments.

How to cite this article
Agbedanu PR, Musabe R, Rwigema J et al. Towards achieving lightweight intrusion detection systems in Internet of Things, the role of incremental machine learning: A systematic literature review [version 1; peer review: 1 not approved]. F1000Research 2022, 11:1377 (https://doi.org/10.12688/f1000research.127732.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
Open Peer Review

Reviewer Report 15 May 2023
Yan Naung Soe, Department of Electrical and Information Engineering, Universitas Gadjah Mada, Yogyakarta, Special Region of Yogyakarta, Indonesia 
Not Approved
The authors conducted a review for the lightweight purpose of IoT-based intrusion detection systems. It is interesting, but the following concerns have to be addressed.
  • Many typos are found.
     
HOW TO CITE THIS REPORT
Soe YN. Reviewer Report For: Towards achieving lightweight intrusion detection systems in Internet of Things, the role of incremental machine learning: A systematic literature review [version 1; peer review: 1 not approved]. F1000Research 2022, 11:1377 (https://doi.org/10.5256/f1000research.140269.r171434)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
