
F1000Research

2046-1402

F1000 Research Limited

London, UK

10.12688/f1000research.161643.2

Research Article

Articles

Internet of things attack detection using machine learning algorithms

[version 2; peer review: 2 approved with reservations]

Abebe Anduamlak (a, 1; https://orcid.org/0009-0001-1853-6984): Conceptualization, Data Curation, Formal Analysis, Investigation, Methodology, Resources, Software, Supervision, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

Gebeyehu Seffi (2): Formal Analysis, Methodology, Resources, Software, Validation, Writing – Review & Editing

Alem Abebaw (3): Conceptualization, Investigation, Methodology, Resources, Supervision, Writing – Review & Editing

1 Computer Science, Debre Tabor University, Debre Tabor, Amhara, Ethiopia

2 Computer Science, Bahir Dar University, Bahir Dar, Amhara, Ethiopia

3 Information Technology, Debre Tabor University, Debre Tabor, Amhara, Ethiopia

a anduamlak09@gmail.com

No competing interests were disclosed.

26 2 2026

2025

230

24 2 2026

2026

This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

The rapid growth of the Internet of Things (IoT) has brought transformative benefits across industries, yet it also presents significant security challenges due to the proliferation of connected devices.

Methods

This study proposes an artificial intelligence (AI) model leveraging machine learning algorithms to detect and classify multiple types of IoT attacks, including distributed denial of service (DDoS), reconnaissance, brute force, spoofing, and Mirai attacks, using the CICIoT2023 dataset. The dataset was divided into training and testing sets to ensure accurate performance assessment. After training, the models were tested, and their effectiveness was evaluated through metrics like accuracy and confusion matrices.

Results and conclusions

Among the algorithms used, the decision tree model outperformed the others, achieving an accuracy of 98.34%. In contrast, the Bayes classifier, support vector machine (SVM), and logistic regression achieved accuracies of 92%, 91.5%, and 75%, respectively. These results highlight the potential of machine learning techniques for detecting various IoT attacks and for enhancing IoT security. In future work, we will seek to improve the performance of the IoT attack detection model by using larger datasets and appropriately parameterized deep learning algorithms.

Keywords: Internet of Things; cyber-attacks; Internet of Things security; machine learning

The author(s) declared that no grants were involved in supporting this work.

Revised Amendments from Version 1

We revised the manuscript by expanding the literature review section and adjusting the conclusion section.

1. Introduction

The Internet of Things (IoT) is a network of hundreds of millions of devices that can communicate with one another with little user intervention. An IoT attack is a type of cyber-attack that targets systems made up of physical things, cars, buildings, and other objects integrated with software that allows them to exchange or collect data. ¹ As described by Anwer et al., ² about 28 billion IoT devices were in use in 2018. By 2022, this number was predicted to reach 49.1 billion, and the IoT is projected to reach a size of approximately ten trillion. The IoT is understood as a set of mechanisms connected via servers, sensors, and different software. ²

According to a report by the director of the Ethiopian Information Network Security Administration (INSA), the agency saved 23.2 billion birr by defending against cyber-attacks. During 2022/2023, more than 6,859 cyber-attacks occurred, of which 6,768 were resolved. Banking and financial institutions, national intelligence security services, media institutions, selected governmental institutions, regional offices, and health and higher institutions were the most frequently targeted. According to the report, website attacks, malware attacks, port scans, distributed denial of service (DDoS), and structured query language (SQL) injection were the most frequently occurring types of attack in Ethiopia during 2022/23. ³

Producing IoT security data that is useful for real applications is difficult for several reasons. One of the primary issues is the need for a vast network made up of multiple real IoT devices, akin to the topologies of actual IoT applications. Because of the widespread adoption of IoT, its inherent mobility, and standardization limitations, numerous researchers have investigated the risks that IoT devices pose to large corporations and smart cities. As a result, smart mechanisms that can automatically detect suspicious activity on IoT devices connected to local networks are required. ^{2,4} The pervasive growth of the IoT creates an expanding attack surface for malicious actors. Detecting these attacks effectively is crucial for securing IoT systems and protecting sensitive data. This paper explores the use of machine learning (ML) for attack detection in IoT environments, focusing on the challenge of imbalanced datasets and potential solutions.

The IoT has become a crucial component of today’s technological landscape, as it allows various devices and systems to connect and communicate with each other over the Internet. This interconnected network of devices has revolutionized many industries, including healthcare, transportation, manufacturing, and smart homes. The IoT has become increasingly significant in today’s world by connecting everyday objects to the Internet, automating tasks and processes, enhancing data-driven decision-making, and creating new opportunities.

However, the widespread adoption of IoT devices has also introduced new security challenges and vulnerabilities. IoT devices are often designed with limited processing power and memory, which makes them more susceptible to attacks and attractive targets for hackers seeking unauthorized access or control. Additionally, many IoT devices lack robust security features, such as encryption and secure authentication mechanisms, and their interconnectedness raises privacy concerns, making them easy targets for cybercriminals. Attacks targeting IoT devices include malware, DoS attacks, man-in-the-middle attacks, botnet attacks, and physical attacks. These devices collect vast amounts of personal data, and inadequate security can lead to serious privacy breaches. Many are integrated into critical infrastructure, meaning attacks can cause widespread disruption and economic damage. Compliance with regulations is essential to avoid legal and reputational consequences. Security flaws in one device can compromise entire networks, emphasizing the need for robust protection. High-profile breaches can erode consumer trust, hinder adoption, and result in significant financial losses. If security risks are not addressed, innovation in IoT may slow down. Ensuring long-term sustainability requires continuous investment in security measures, and collaboration among organizations, developers, and policymakers is crucial for a secure IoT ecosystem.

The main contributions of this work are summarized as follows:

(1) Prominent result: The proposed model evaluates the performance of ML algorithms on an imbalanced dataset and achieves a prominent result. Moreover, the authors compared the results with those of existing related works and observed improved performance.

(2) Automation and efficiency: ML algorithms can analyze large amounts of IoT network data more quickly and accurately than manual methods. This could enable the detection of attacks in real time, enhancing the security of IoT systems.

(3) Scalability: As the number of IoT devices continues to grow rapidly, ML-based systems can scale efficiently to handle large networks with numerous devices, ensuring comprehensive attack identification and protection.

2. Related works

Several scholars have used various methodologies to study cyber-attack detection.

Anwer et al. ² outlined a methodology for identifying suspicious network activity. The proposed framework was applied to the NSL KDD dataset, and the results were compared in terms of training and prediction time, specificity, and accuracy. Using a random forest (RF) algorithm, they achieved a performance of 85.34%.

The study in ⁵ assessed several detection techniques using the recently created Bot-IoT dataset. Seven distinct ML algorithms were employed during the implementation stage, and the majority demonstrated exceptional performance. New features were also extracted from the Bot-IoT dataset during deployment.

The authors of ⁶ used six algorithms, namely RF, logistic regression (LR), SVM, Naïve Bayes (NB), K-nearest neighbors (KNN), and multilayer perceptron (MLP), to conduct a comparative analysis of IoT cyber-attack detection techniques.

To detect attacks and anomalies in IoT systems effectively, the authors of ⁷ compared the performance of several ML models: LR, SVM, decision tree (DT), RF, and artificial neural network (ANN).

The authors of ⁸ performed IoT behavior classification, monitoring the expected IoT behaviors and evaluating the efficacy of their optimally selected classifiers against the superset of specialized classifiers by applying them to their IoT traffic traces.

The study in ⁹ attempted to secure IoT devices by employing a Raspberry Pi as a honeypot that mimics IoT devices in order to verify the user’s intent, examine various attack patterns, and shield IoT devices from known threats. The purpose of these honeypots is to protect the various protocols in IoT devices that are susceptible to attack.

The authors of ¹⁰ created a novel, realistic IoT attack dataset using an extended topology made up of multiple real IoT devices, with IoT devices acting as both attackers and victims. They carried out, recorded, and gathered information from 33 attacks against IoT devices, categorized into seven types, and showed how they could be replicated. Using the CICIoT2023 dataset, they assessed how well ML and deep learning algorithms classified and detected benign or malicious IoT network traffic.

The authors of ¹¹ applied a hybrid deep learning technique to the problem of imbalanced data classification in attack detection. They proposed a hybrid model that combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to enhance classification performance, and they drew attention to the difficulties that imbalanced datasets present for identifying attacks precisely. According to the authors, CNNs are useful for extracting spatial properties from the data, whereas LSTM networks are better at extracting temporal dependencies from sequential data. The hybrid model was compared experimentally with conventional ML methods on imbalanced attack datasets. The results showed that the hybrid deep learning approach outperformed traditional methods in detecting attacks in imbalanced datasets, demonstrating the effectiveness of combining CNNs and LSTM networks for improved classification accuracy.

The review in ¹² explains in detail the many ML methods that are employed to identify IoT botnets. It emphasizes the significance of IoT security, given that botnets pose an increasing threat in the IoT ecosystem, and it covers the ML techniques and algorithms that have been proposed to identify and mitigate IoT botnet threats. To give readers an understanding of the current state of this research field, the review carefully assesses the advantages and disadvantages of the different methodologies. Overall, the paper is a valuable resource for those working on botnet detection and IoT security.

The study in ¹³ examined how ML approaches applied to the security of Industrial Internet of Things (IIoT) systems are affected by imbalanced datasets. It investigated how class imbalance affects the performance and accuracy of ML models in identifying security vulnerabilities in IIoT environments. Within the framework of IIoT security, it addressed several problems associated with imbalanced datasets, including minority class misclassification and biased model predictions. Additionally, the study suggested possible approaches to these problems in order to improve the efficacy of ML-based security mechanisms in IIoT systems. Overall, it provided valuable insights into the implications of imbalanced datasets for IIoT security and offered recommendations for improving the robustness and reliability of security measures in industrial IoT settings.

According to Radanliev et al., ¹⁴ as data strategies evolve, dependency models have become increasingly valuable for managing contemporary cyber risk challenges. These models aid in cyber risk estimation and general impact assessments by illustrating the intricate relationships among various digital components. The literature underscores the importance of a comprehensive understanding of cyber risks, particularly in relation to the Internet of Things (IoT), where conventional assessment methods may fall short. The paper advocates for the adoption of innovative risk assessment and management strategies that can effectively address the unique challenges presented by emerging IoT cyber threats. By utilizing these methods, the cybersecurity community can enhance its defenses and better navigate the constantly shifting landscape of digital vulnerabilities.

According to Radanliev et al., ¹⁵ the study focuses on the role of AI-based Bill of Materials (BOMs) in ensuring the trustworthiness and quality of AI systems, evaluating CHERI’s security features for addressing cybersecurity threats, and using AI techniques to identify and analyze threats, exploits, and vulnerabilities in Software Bill of Materials (SBOMs). The results indicate that combining CHERI with AI BOMs significantly improves the security and transparency of AI systems. This integration not only aids in identifying and mitigating specific threats and vulnerabilities but also enhances trust and security within AI systems, highlighting the potential of AI-driven approaches to bolster the security of SBOMs.

However, the security issues of IoT have not yet been fully addressed, and further investigation is required. Therefore, in this paper we focus on these issues, aiming to improve on the performance of existing works and to evaluate other ML algorithms.

3. Methods

This study followed the steps illustrated in the proposed IoT attack detection architecture, shown in Figure 1, to conduct rigorous experiments.

Figure 1. Proposed model architectures of IoT attack detection.

This figure has been created by the author.

3.1 Dataset information

One of the most frequent problems faced by ML researchers is locating reliable datasets with the necessary properties. Regardless of the size of the dataset, selecting a specific learning technique is less crucial than creating a well-cleaned, representative dataset. ¹⁶ In our investigation, we used an IoT attack dataset drawn from CICIoT2023, comprising a total of 221,834 instances recorded as comma-separated values (CSV) files. We extracted 42 relevant features, and each instance was labeled as one of six classes: Benign Traffic, DDoS, Spoofing, SQL Injection, Recon, or Mirai. This dataset was selected for three key reasons: i) it contains 42 attributes extracted from different categories of IoT attack features; ii) it contains 221,834 instances that are cleaned, imbalanced, and contain the required features, as shown in Table 1; iii) it includes the raw data, so new features can be generated as needed.

Table 1. Dataset information.

IoT attack classes	Number of instances	Dataset source
Mirai	50,632	Canadian Institute for Cybersecurity CICIoT2023
Recon	6,094
SQL Injection	185
Benign Traffic	21,102
DDoS	137,941
Spoofing	5,880
Total dataset	221,834
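
The degree of imbalance in Table 1 can be made concrete with a few lines of arithmetic. The following is a minimal sketch and not part of the authors' pipeline: the names `class_counts`, `class_share`, and `majority_baseline` are illustrative assumptions, while the counts themselves are copied from Table 1. It also reports the accuracy that a trivial classifier would reach by always predicting the largest class, which is a useful reference point when reading accuracy figures on imbalanced data.

```python
# Class counts copied from Table 1 (CICIoT2023 subset used in this study).
class_counts = {
    "Mirai": 50_632,
    "Recon": 6_094,
    "SQL Injection": 185,
    "Benign Traffic": 21_102,
    "DDoS": 137_941,
    "Spoofing": 5_880,
}

total = sum(class_counts.values())

# Share of each class in the whole dataset, as a fraction in [0, 1].
class_share = {name: count / total for name, count in class_counts.items()}

# Largest and smallest classes, and the ratio between their sizes.
majority = max(class_counts, key=class_counts.get)
minority = min(class_counts, key=class_counts.get)
imbalance_ratio = class_counts[majority] / class_counts[minority]

# Accuracy of a hypothetical classifier that always predicts the majority class.
majority_baseline = class_share[majority]

if __name__ == "__main__":
    for name, share in sorted(class_share.items(), key=lambda kv: -kv[1]):
        print(f"{name:15s} {class_counts[name]:>8,d} {share:7.2%}")
    print(f"total instances: {total:,d}")
    print(f"{majority} to {minority} ratio = {imbalance_ratio:.0f}:1")
    print(f"majority-class baseline accuracy = {majority_baseline:.2%}")
```

Under these counts, the largest class (DDoS) alone accounts for roughly 62% of the instances, so an accuracy figure is informative only to the extent that it exceeds that baseline.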

3.2 Data preprocessing and feature selection

Preprocessing and feature extraction are important steps for ensuring the effectiveness of ML approaches to IoT attack detection on an imbalanced dataset. We implemented data cleaning, dimensionality reduction, and data splitting. To ensure the quality and reliability of the dataset, we handled missing values, outliers, and inconsistencies.

Feature selection involves selecting and transforming relevant features from the raw data to improve the performance of the ML model. We extracted 42 informative features using principal component analysis (PCA).
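
The cleaning steps mentioned in this subsection (handling missing values and outliers) can be illustrated as follows. This is a hedged sketch only: the article does not state which imputation or outlier rule was used, so median imputation and clipping at 1.5 times the interquartile range are assumptions chosen because they are common defaults, and the function names and sample column are hypothetical. The PCA step is not shown.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values lying more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]


# Hypothetical feature column with one missing value and one extreme value.
raw_column = [1.0, 2.0, None, 3.0, 4.0, 100.0]
cleaned_column = clip_outliers_iqr(impute_median(raw_column))

if __name__ == "__main__":
    print(cleaned_column)
```

Clipping keeps every instance in the dataset, which matters when a minority class such as SQL Injection has very few rows; dropping outlier rows instead would be an alternative design choice with a different trade-off.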

3.3 Train-test dataset split ratios

The dataset must be split into training and testing sets before it is fed to the learning algorithms, because the learning models are expected to be evaluated on unseen data to assess how well they can predict new IoT threats. Most studies employ a train-test split ratio of 80%:20%. ¹⁷ However, there is no consensus among researchers on which split ratio to use for a given number of dataset instances. This study therefore adopted the 80%:20% split ratio, which yielded improved training and testing accuracy, for each classifier.

Accordingly, of the total dataset, 80% (177,467 instances) was used for training and 20% (44,367 instances) was used for testing model performance.
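
The split sizes quoted above follow directly from the dataset total. The sketch below reproduces that arithmetic and, as an assumption that goes beyond what the article states, also shows how many test instances each class would receive if the same ratio were applied within every class (a stratified split). The helper name `split_sizes` is hypothetical; the counts come from Table 1.

```python
def split_sizes(n_total, test_percent=20):
    """Return (n_train, n_test) for a test percentage, rounding half up."""
    n_test = (n_total * test_percent + 50) // 100
    return n_total - n_test, n_test


# Class counts copied from Table 1.
class_counts = {
    "Mirai": 50_632,
    "Recon": 6_094,
    "SQL Injection": 185,
    "Benign Traffic": 21_102,
    "DDoS": 137_941,
    "Spoofing": 5_880,
}

# Overall 80%:20% split of the full dataset.
n_train, n_test = split_sizes(sum(class_counts.values()))

# Stratified variant (an assumption): apply the same ratio within each class.
stratified_test = {name: split_sizes(count)[1] for name, count in class_counts.items()}

if __name__ == "__main__":
    print(f"train = {n_train:,d}, test = {n_test:,d}")
    for name, n in stratified_test.items():
        print(f"{name:15s} test instances: {n:,d}")
```

With stratification, the smallest class would contribute only a few dozen test instances, so its per-class results rest on very little data regardless of the overall test set size.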

3.4 Implementation tools and algorithms

The study conducted extensive experiments using Python to train and test the proposed supervised ML algorithms on high-speed computing resources. Python was chosen as the implementation language because of its abundance of libraries and packages tailored for ML research.

We employed four well-known ML algorithms, namely decision tree (DT), SVM (with default parameters and with a sigmoid kernel), LR, and Naïve Bayes (NB), ^{18–21} to identify IoT attacks.

DTs are versatile and intuitive models that make predictions by recursively splitting the data based on different features. They are interpretable and can handle both categorical and numerical data. We used the default values of the DT parameters, such as maximum depth, minimum samples per leaf, splitting criterion, and maximum features per split.
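
To illustrate what "recursively splitting the data" means at a single node, the sketch below searches one numeric feature for the threshold that minimizes the weighted Gini impurity of the two resulting groups. It is an illustrative assumption, not the authors' code: the article does not name the splitting criterion, and Gini impurity is used here only because it is a common default. The feature values and labels are made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_c ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, float("inf"))
    for t in sorted(set(values))[:-1]:  # the largest value cannot split the data
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical feature (e.g. a packet rate) and labels, for illustration only.
rates = [10, 12, 15, 200, 220, 250]
labels = ["Benign", "Benign", "Benign", "DDoS", "DDoS", "DDoS"]
threshold, impurity = best_threshold(rates, labels)

if __name__ == "__main__":
    print(f"split at rate <= {threshold}, weighted Gini = {impurity:.3f}")
```

A full decision tree repeats this search over every feature and then recurses into each side of the chosen split until a stopping rule, such as a maximum depth or a minimum number of samples per leaf, is met.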

SVM is a powerful algorithm that separates data points into different classes by finding the best hyperplane that maximizes the margin between the classes. The default parameters refer to the default values set by the algorithm, which may vary depending on the implementation. SVM can also utilize different kernels, such as the sigmoid kernel, which allows for non-linear separation of data points. The sigmoid kernel maps the data into a higher-dimensional space to find a decision boundary.

Despite its name, LR is a classification algorithm rather than a regression algorithm. It calculates the probability of an instance belonging to a certain class using a logistic function and is commonly used for binary classification problems. We set three parameters: the regularization parameter (lambda), which controls the degree of regularization, penalizing complex models and reducing overfitting; the solver, which selects the gradient descent algorithm used to find the optimal model parameters; and the maximum number of iterations allowed for the solver to find the optimal parameters.
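
The roles of the logistic function, the regularization parameter, and the iteration limit can be seen in a small from-scratch example. This is a minimal sketch under stated assumptions: it fits a one-feature, two-class model by plain batch gradient descent with an L2 penalty, whereas the study addresses a six-class problem with settings that the article does not report. The parameter names `lam`, `learning_rate`, and `max_iter` and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lam=0.01, learning_rate=0.1, max_iter=500):
    """Fit a one-feature binary logistic regression by batch gradient descent.

    lam is the L2 regularization strength; max_iter caps the number of updates.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(max_iter):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n + lam * w
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(x, w, b):
    """Predict class 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical, linearly separable toy data (one feature, two classes).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = fit_logistic(xs, ys)

if __name__ == "__main__":
    print(f"w = {w:.3f}, b = {b:.3f}")
    print(predict(1.5, w, b), predict(-1.5, w, b))
```

A larger `lam` pulls the weight toward zero and flattens the decision boundary, which is the overfitting control described above; extending the model to several classes (for example with a one-vs-rest scheme) is not shown here.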

Naïve Bayes is a probabilistic classifier that calculates the probability of an instance belonging to a particular class based on Bayes’ theorem, assuming that all features are independent. We used the following key parameters to implement the Naïve Bayes algorithm for IoT attack detection: (i) the smoothing parameter (alpha), which adds a small value to the feature counts so that no estimated probability is zero, improving stability, especially with sparse data; and (ii) feature selection, in which choosing the subset of features most relevant for classification can improve performance and interpretability.
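
The effect of the smoothing parameter can be shown on a single categorical feature. The sketch below is an illustrative assumption rather than the authors' implementation: it uses additive (Laplace) smoothing, the usual meaning of alpha in count-based Naïve Bayes variants, and the article does not say which variant was used. The function name and the protocol values are hypothetical.

```python
from collections import Counter


def smoothed_likelihoods(feature_values, categories, alpha=1.0):
    """Estimate P(value | class) for one categorical feature with additive smoothing.

    feature_values: values of the feature observed within a single class.
    categories: every value the feature can take.
    alpha: constant added to each count (alpha=0 disables smoothing).
    """
    counts = Counter(feature_values)
    denom = len(feature_values) + alpha * len(categories)
    return {c: (counts[c] + alpha) / denom for c in categories}


# Hypothetical protocol values seen in one attack class; "ICMP" never occurs.
observed = ["TCP"] * 8 + ["UDP"] * 2
categories = ["TCP", "UDP", "ICMP"]

probs = smoothed_likelihoods(observed, categories, alpha=1.0)
unsmoothed = smoothed_likelihoods(observed, categories, alpha=0.0)

if __name__ == "__main__":
    print("alpha = 1:", probs)
    print("alpha = 0:", unsmoothed)
```

Without smoothing, the unseen value receives probability zero, and because Naïve Bayes multiplies per-feature probabilities, a single zero would rule out the class entirely; with alpha = 1, the unseen value keeps a small non-zero probability.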

4. Experimental result evaluation

4.1 Evaluation metrics

It is critical to specify performance metrics appropriate for the task at hand when assessing ML models. We employed two widely used performance metrics, accuracy and the confusion matrix, to assess our findings. ²²

Accuracy is calculated as the number of correct predictions (TP + TN) divided by the total number of instances (P + N), where TP and TN are the numbers of true positives and true negatives, and P and N are the numbers of actual positive and actual negative instances. The best accuracy is 1.0, and the worst is 0.0. ²²

$$\text{Accuracy} = \frac{TP + TN}{P + N} \quad (1)$$
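
Equation (1) translates directly into code. The sketch below is illustrative only: the function names and all counts are hypothetical and are not results from this study. It computes accuracy from binary counts and also shows the multi-class generalization, in which the correct predictions are the diagonal entries of the confusion matrix.

```python
def accuracy_binary(tp, tn, fp, fn):
    """Equation (1): correct predictions divided by all predictions."""
    positives = tp + fn  # P: instances whose true class is positive
    negatives = tn + fp  # N: instances whose true class is negative
    return (tp + tn) / (positives + negatives)


def accuracy_multiclass(matrix):
    """Multi-class accuracy: diagonal (correct) counts divided by all counts."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical counts, for illustration only.
binary_acc = accuracy_binary(tp=50, tn=40, fp=5, fn=5)
multi_acc = accuracy_multiclass([[8, 1, 1], [0, 9, 1], [2, 0, 8]])

if __name__ == "__main__":
    print(f"binary accuracy = {binary_acc:.2f}")
    print(f"multi-class accuracy = {multi_acc:.4f}")
```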

4.2 Experimental results and comparisons

To attain better performance results, we conducted data preprocessing techniques. The dataset is transformed into a structure appropriate for ML using pre-processing data transformation techniques. ²³ To make the dataset more accurate and efficient, this stage also involves cleaning it by deleting any irrelevant or corrupted data.

We employed various supervised ML techniques, including LR, DT, SVM, and NB, to carry out this investigation. DT outperformed the other ML algorithms, achieving an accuracy of 98.34%, as shown in Table 2.

Table 2. Applied ML algorithm performance result.

Machine learning algorithms	Accuracy %	Remark
Decision tree (DT)	98.34%
Support Vector Machine (SVM)	91.5%	With default hyperparameters
Support Vector Machine (SVM)	69.27%	With sigmoid kernel
Logistic Regression (LR)	75%
Naïve Bayes (NB)	92%

Accuracy is one of the most relevant performance evaluation metrics for ML and deep learning algorithms. As shown in Table 2, DT was the highest-performing algorithm, followed by NB and SVM with default parameters. SVM with a sigmoid kernel achieved the lowest accuracy, 69.27%, making it the least effective algorithm. Despite its high accuracy, NB was notably slower than the other algorithms. The performance results are shown graphically in Figure 2.

Figure 2. Machine learning approach performance applied to the CICIoT2023 dataset.

In addition to accuracy, the confusion matrix was used to evaluate performance. A confusion matrix is an N x N matrix, where N is the number of target classes, that is used to assess how well a classification model performs by comparing the model’s predicted outcomes with the actual target values. The confusion matrices obtained with the SVM, LR, NB, and DT algorithms are shown in Figure 3.
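
How an N x N confusion matrix is built from true and predicted labels can be sketched as follows. The example is hypothetical: the labels are invented to show a pattern that can arise with imbalanced data and are not taken from Figure 3, and the function names are assumptions. Rows correspond to the true class and columns to the predicted class.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Return an N x N matrix where entry [i][j] counts instances of
    true class i that were predicted as class j."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly (row-wise)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Hypothetical labels: the single rare-class instance is misclassified.
classes = ["DDoS", "Mirai", "SQL Injection"]
y_true = ["DDoS"] * 6 + ["Mirai"] * 3 + ["SQL Injection"] * 1
y_pred = ["DDoS"] * 6 + ["Mirai"] * 2 + ["DDoS"] * 2

cm = confusion_matrix(y_true, y_pred, classes)
recall = per_class_recall(cm)
overall_accuracy = sum(cm[i][i] for i in range(len(cm))) / len(y_true)

if __name__ == "__main__":
    for name, row, r in zip(classes, cm, recall):
        print(f"{name:15s} {row} recall = {r:.2f}")
    print(f"overall accuracy = {overall_accuracy:.2f}")
```

In this toy case the overall accuracy is 0.80 even though the rare class is never detected, which is why the per-class rows of a confusion matrix complement a single accuracy figure on imbalanced data.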

Figure 3. Confusion matrix obtained in the identification process conducted using different machine learning models (SVM (A), LR (B), NB (C), and DT (D)).

In addition to comparing the ML algorithms deployed in this work with one another, the authors also compared them with existing related works, as shown in Table 3. In most cases, the proposed approach improved on the reported performance, although several limitations and challenges in this domain require further investigation.

Table 3. Result comparison from the related works.

Related works	Title of related work	Methods used	Performance %
⁵	Internet of Things Cyberattacks Detection Using Machine Learning	NB	79%
²	Attack Detection in IoT Using Machine Learning	SVM, RF	85.34%
⁴	Cyberattack Detection Using Machine Learning	KNN & RF	88%
⁷	Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches	DT, RF & ANN	99.4%
¹⁰	Botnet Attack Detection in IoT Using Machine Learning Technique	DT, LR	94%
Our proposed work	Artificial intelligence model for internet of things attack detection using machine learning algorithms	DT, NB, SVM, LR	98.34%

5. Conclusions

IoT security attacks have been a prominent issue in recent years. This paper aimed to design a multi-class IoT attack detection model using ML algorithms. Four supervised ML algorithms, namely DT, SVM, LR, and NB, were employed to identify IoT attacks. The recent CICIoT2023 dataset from the Canadian Institute for Cybersecurity, which contains imbalanced instances and six classes, was used for designing and evaluating the proposed model. The dataset was split in an 80%:20% ratio for training and testing the model, respectively. The experiments were conducted using Python in Google Colab.

To evaluate model performance, we reported the accuracy of each algorithm in tabular form and computed a confusion matrix for each. DT attained the highest prediction accuracy, 98.34%, outperforming NB at 92%, SVM at 91.5%, and LR at 75%. Our model achieves higher accuracy in predicting these IoT attacks than most of the benchmark ML classification approaches with which it was compared.

In the area of IoT threat detection, the proposed model offers several contributions, including addressing imbalanced data issues, enhancing detection precision, increasing awareness of imbalanced data, improving performance, and indicating future directions for the area. The results could therefore contribute to the domain of IoT security by enhancing security, reducing response time, and enabling adaptive defense. Work on IoT attack identification using ML approaches holds great promise for improving IoT security.

The findings from the multi-class IoT attack detection model highlight several urgent actions for the industry, including the need to expand and diversify datasets for reliability, build resilience against adversarial attacks, enhance detection precision through continual algorithm refinement, promote awareness and education on IoT security challenges, and foster collaboration among stakeholders, researchers, and cybersecurity experts to strengthen defense mechanisms.

The design of IoT security attack detection systems faces several limitations. Firstly, the dataset used may be too small or homogeneous, affecting the reliability and general applicability of the assessments. Secondly, adversarial attacks can manipulate IoT network traffic, potentially evading machine learning detection systems and exploiting vulnerabilities within the models or their input data, making accurate detection challenging. Lastly, the study relied solely on machine learning algorithms rather than incorporating deep learning methods, which are crucial for enhancing performance with larger datasets.

Based on these limitations, our future work will focus on improving the performance of the IoT attack detection model by using larger datasets and appropriately parameterized deep learning algorithms.

Ethics and consent

Ethical approval and consent were not required.

Data availability

All necessary data are available from Kaggle and can be downloaded after completing the CIC dataset download form for “CIC_IOT_Dataset2023”: https://www.unb.ca/cic/datasets/iotdataset-2023.html.

References

Abdul-Qawy, Magesh, Tadisetty: The Internet of Things (IoT): An Overview. Int. J. Eng. Res. Appl. 2015.

Tharwat: Classification assessment methods. Applied Computing and Informatics. 2020;17(1):168–192. doi: 10.1016/j.aci.2018.08.003

Belay: Web Security Vulnerability Analysis of Ethiopian Government Offices. 2nd World Conference on Engineering and Technology. Brussels, Belgium: 2021.

Soka: Cyber attack assessment report in Ethiopia during 2023. Addis Abeba: INSA-የኢንፎርሜሽን መረብ ደህንነት አስተዳደር; 2023.

Haseeb, Mansoori, Al-Sahaf: IoT Attacks: Features Identification and Clustering. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). Western Sydney: 2020.

Jadel Alsamiri: Internet of Things Cyber Attacks Detection using Machine Learning. International Journal of Advanced Computer Science and Applications (IJACSA). 2019;10. doi: 10.14569/IJACSA.2019.0101280

Mohammed AHK, Jebamikyous H-H: IoT Cyber-Attack Detection: A Comparative Analysis. ACM. 2021. doi: 10.1145/3460620.3460742

Deepthi Reddy: Cyber Attacks Detection using Machine Learning. Neuroquantology. 2022.

Hasan, Islam, Zarif MII: Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches. Elsevier B.V.; 2019. doi: 10.1016/j.iot.2019.100059

Pashamokhtari, Batista, Gharakheili: Efficient IoT Traffic Inference: from Multi-View Classification to Progressive Monitoring. ACM Transactions on Internet of Things. 2023. doi: 10.1145/3625306

Goyal, Krishna, Kumar: Detection And Prevention Of Cyber Attacks On Multi-purpose IoT Devices Using Honeypot. 2nd International Conference on “Advancement in Electronics & Communication Engineering (AECE 2022)”. 2022.

Neto ECP, Dadkhah, Ferreira: CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment. Sensors. 2023;23:5941. PMID: 37447792. doi: 10.3390/s23135941. PMCID: PMC10346235

Anwer, Khan, Farooq: Attack Detection in IoT using Machine Learning. Engineering, Technology & Applied Science Research. 2021;11(3):7273–7278. doi: 10.48084/etasr.4202

Radanliev, De Roure, Maple: AI security and cyber risk in IoT systems. Front. Big Data. 2024;7. PMID: 39449740. doi: 10.3389/fdata.2024.1402745. PMCID: PMC11499169

Radanliev, Santos, Brandon-Jones: Capability hardware enhanced instructions and artificial intelligence bill of materials in trustworthy artificial intelligence systems: analyzing cybersecurity threats, exploits, and vulnerabilities in new software bills of materials with artificial intelligence. J. Def. Model. Simul. Appl. Methodol. Technol. 2024;23(1):147–175. doi: 10.1177/15485129241267919

Laurent Sindayigaya: Machine Learning Algorithms: A Review. International Journal of Science and Research (IJSR). 2022;11:1127–1133. ISSN 2319-7064. doi: 10.21275/SR22815163219

Sarke: Machine Learning: Algorithms, Real-World Applications and Research Directions. Springer Nature Singapore. doi: 10.1007/s42979-021-00592-x

Manisha KCJ, Manjramkar: Cyber Security Using Machine Learning Techniques. Advances in Computer Science Research. 2023. doi: 10.2991/978-94-6463-136-4_59

Shaukat, Luo, Chen: Cyber Threat Detection Using Machine Learning Techniques: A Performance Evaluation Perspective. 2020 International Conference on Cyber Warfare and Security (ICCWS). Islamabad, Pakistan: 2020.

Kibreab Adane: Machine learning and deep learning based phishing websites detection: the current gaps and next directions. Review of Computer Engineering Research. 2022;9(1):13–29. doi: 10.18488/76.v9i1.2983

Abdullahi, Baashar, Alhussian: Detecting Cybersecurity Attacks in Internet of Things Using Artificial Intelligence Methods: A Systematic Literature Review. Electronics. 2022;11(2):198. doi: 10.3390/electronics11020198

Nazir, Zhu: Advancing IoT security: A systematic review of machine learning approaches for the detection of IoT botnets. Journal of King Saud University - Computer and Information Sciences. 2023;35(10):101820. doi: 10.1016/j.jksuci.2023.101820

Zolanvari, Teixeira, Jain: Effect of Imbalanced Datasets on Security of Industrial IoT Using Machine Learning. 2018 IEEE International Conference on Intelligence and Security Informatics (ISI). 2018.

10.5256/f1000research.197072.r464887

Reviewer response for version 2

Gonaygunta Hari (1), Referee

1 University of the Cumberlands, Williamsburg, USA

Competing interests: No competing interests were disclosed.

17 3 2026

2026

This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

recommendation

approve-with-reservations

Summary of the Article:

The article applies several machine learning algorithms to identify and categorize diverse forms of cybersecurity attacks on Internet of Things (IoT) systems. The IoT is growing rapidly, with estimates running to tens of billions of connected devices, and security threats are growing in proportion. The purpose of the study is to build a multi-class attack detection framework by training different machine learning algorithms on the CICIoT2023 dataset, which contains a large volume of attack data of various types, including DDoS, reconnaissance, spoofing, Mirai, SQL injection, and many others.

A. Develop Literature Review and Framework of Contributions.

Issue: The article could be strengthened by comparing and contrasting its findings with recent advances in IoT security and attack detection, particularly AI and deep learning techniques aimed at cybersecurity and at resilience to adversarial attacks and imbalanced data.

Recommendations:

Compare the paper's results with recent developments in IoT security and attack detection, particularly approaches that use AI and deep learning models to defend against attacks.

Also compare the models used in the paper with recent models and approaches, including CNNs and LSTMs.

B. Address Dataset Limitations and Generalizability.

Issue: Although the authors used 221,834 instances from the CICIoT2023 dataset, the size of this subset should be described in more detail and justified.

Recommendations:

Discuss the shortcomings of the dataset. For example, is it representative of actual IoT traffic? Do the attack patterns vary enough?

To demonstrate how well the models generalize, evaluate them on other datasets or on real traffic.

Report any skew in the dataset and how your feature selection and augmentation techniques affect it.
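As an illustration of the skew report requested above, the following is a minimal sketch that uses only the Python standard library. The function names and the label values are hypothetical and are not taken from the paper or from CICIoT2023; the same summary could be produced before and after feature selection or augmentation to show their effect.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, share)} ordered from most to least frequent."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: (count, count / total)
        for label, count in counts.most_common()
    }


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


# Hypothetical labels for illustration only, not real CICIoT2023 data.
labels = ["DDoS"] * 6 + ["Mirai"] * 3 + ["Benign"] * 1

for label, (count, share) in class_distribution(labels).items():
    print(f"{label:<8} {count:>3} {share:6.1%}")
print("imbalance ratio:", imbalance_ratio(labels))
```

A ratio close to 1 suggests balanced classes, while a large ratio indicates that accuracy alone may overstate performance on the rare classes.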

Minor Points:

It would help to share more information about the hyperparameters that were tuned for each machine learning algorithm, for example the maximum depth of the decision tree or the kernel function used in the SVM.

More information on the feature selection procedure would also help, for example which features were retained and the steps followed in applying PCA.
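One possible shape for the details requested in these two points is a small reproducibility record. The sketch below is an assumption, not the authors' procedure: the hyperparameter values, the eigenvalues, and the helper `components_for_variance` are all hypothetical. It picks the smallest number of principal components whose cumulative explained-variance ratio reaches a chosen threshold and stores that next to the model settings.

```python
import json
from itertools import accumulate


def components_for_variance(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative
    explained-variance ratio reaches `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical covariance eigenvalues, for illustration only.
eigenvalues = [4.0, 3.0, 2.0, 1.0]

# Hypothetical settings; the paper does not state these values.
record = {
    "decision_tree": {"max_depth": 10},
    "svm": {"kernel": "rbf", "C": 1.0},
    "pca": {
        "variance_threshold": 0.8,
        "n_components": components_for_variance(eigenvalues, 0.8),
    },
}
print(json.dumps(record, indent=2))
```

Publishing a record like this alongside the results would let readers rerun each model with the same settings.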

Conclusion:

The article is well grounded in experiment and supports its claims with results showing that the models, decision trees in particular, detect attacks with high accuracy. Incorporating the changes and enhancements above would further strengthen its scientific validity and its relevance to IoT security.

Is the work clearly and accurately presented and does it cite the current literature?

Yes

If applicable, is the statistical analysis and its interpretation appropriate?

Yes

Are all the source data underlying the results available to ensure full reproducibility?

Partly

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Yes

Are sufficient details of methods and analysis provided to allow replication by others?

Yes

Reviewer Expertise:

AI, Blockchain, Cybersecurity, Quantum Computing, ML

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

10.5256/f1000research.177702.r370251

Reviewer response for version 1

Petar Radanliev (Referee), https://orcid.org/0000-0001-5629-6857, University of Oxford, Oxford, England, UK

Competing interests: No competing interests were disclosed.

17 3 2025

2025

recommendation

approve-with-reservations

The article is well-structured and well-written. It deserves consideration for indexing. There are some corrections, which I outline in more detail below:

The article is a bit short. I am not certain about the journal page limit, but if you have space, try to expand it with a focus on contribution. One way to strengthen your contribution is to extend your review and compare your work with existing literature and knowledge. For example, you have done a great job reviewing so many articles, but only a few address cyber risk from future developments in new technologies such as AI, which seems to be all the rage at the moment. There are recent articles that review relevant literature, for example on the related topic of cybersecurity threats, exploits, and vulnerabilities in new software bills of materials with artificial intelligence (see [Ref 1]) and on the related topic of ‘AI security and cyber risk in IoT systems’ (see [Ref 2]). It would be interesting to see a few sentences reviewing and comparing your work in relation to these recent studies on related topics.

- In the conclusion, could you highlight which urgent measures the industry can take to adapt to these findings?

I hope the comments and feedback are helpful, and well done for writing such an interesting article. I am looking forward to reading the updated version.

Is the work clearly and accurately presented and does it cite the current literature?

Yes

If applicable, is the statistical analysis and its interpretation appropriate?

I cannot comment. A qualified statistician is required.

Are all the source data underlying the results available to ensure full reproducibility?

Partly

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Yes

Are sufficient details of methods and analysis provided to allow replication by others?

Yes

Reviewer Expertise:

AI security, IoT, cyber risk, blockchain security, post-quantum cryptography.

References

1. Capability hardware enhanced instructions and artificial intelligence bill of materials in trustworthy artificial intelligence systems: analyzing cybersecurity threats, exploits, and vulnerabilities in new software bills of materials with artificial intelligence. The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology. 2024. 10.1177/15485129241267919

2. AI security and cyber risk in IoT systems. Front Big Data. 2024;7:1402745. PMID 39449740. 10.3389/fdata.2024.1402745

Anduamlak Abebe, Computer Science, Debre Tabor University, Debre Tabor, Amhara, Ethiopia

Competing interests: No competing interests were disclosed.

17 3 2025

Thank you for your constructive comments. We acknowledge the reviewer’s concerns regarding the expansion of the literature review and the comparison with existing knowledge. We also acknowledge the reviewer’s concerns regarding the conclusion section.

We will revise the manuscript as per your comments.