Intelligent Digital Transformation: Redefining Fraud Detection in Accounting

Mohammed Th. Shaamood; Amjed Abbas Ahmed; Khattab M Ali Alheeti; Hamsa M. Ahmed; Ameen Shaman Ameen; Saeed Matar Alshahrani

doi:10.12688/f1000research.176100.1


Research Article

Intelligent Digital Transformation: Redefining Fraud Detection in Accounting

[version 1; peer review: 2 not approved]

Mohammed Th. Shaamood¹, Amjed Abbas Ahmed², Khattab M Ali Alheeti ³, Hamsa M. Ahmed³, Ameen Shaman Ameen⁴, Saeed Matar Alshahrani⁵


PUBLISHED 02 Apr 2026


¹ College of Education for Humanities / Educational and Psychological Sciences Department, University of Anbar, Ramadi, Iraq
² Department of Computer Science, Imam Alkadhim University College, Baghdad, 10011, Iraq
³ Computer Networking Systems Department, University of Anbar, Ramadi, Al Anbar Governorate, Iraq
⁴ College of Science / Dept. of Mathematics, University of Anbar, Ramadi, Iraq
⁵ College of Computing and Informatics, Saudi Electronic University, Riyadh, Riyadh Province, Saudi Arabia

Mohammed Th. Shaamood
Roles: Data Curation, Resources

Amjed Abbas Ahmed
Roles: Investigation, Methodology

Khattab M Ali Alheeti
Roles: Software, Writing – Original Draft Preparation

Hamsa M. Ahmed
Roles: Resources, Writing – Review & Editing

Ameen Shaman Ameen
Roles: Formal Analysis, Project Administration

Saeed Matar Alshahrani
Roles: Resources, Software, Supervision


This article is included in the Fallujah Multidisciplinary Science and Innovation gateway.

Abstract

Background

Digital transformation has significantly affected accounting, and artificial intelligence (AI) is now considered one of the most important tools for enhancing fraud-detection capabilities. Older fraud-detection techniques have proven ineffective: they fail to curb the complex schemes currently used to manipulate accounting systems.

Methods

The machine learning models in the proposed framework were compared on accuracy, F1-score, recall, and error rate. Results on the artificial transaction data generated in this study to resemble actual financial transactions indicate that combining the models gives the best results for identifying fraudulent cases.

Results

Artificial Neural Networks (ANN) achieved the highest accuracy of all algorithms (99.19%) and the lowest error rate (0.81%), whereas Random Forest achieved the highest recall (98.38%), which makes it efficient for detecting fraud. The results suggest that the proposed integrated AI-based framework detects fraud better than existing rule-based systems and reduces the rate of false alarms.

Conclusions

This study is a step forward in enhancing accounting information systems: by automating data analysis, it provides an efficient tool for reducing the fraud that affects financial institutions.

Keywords

Fraud Detection; Machine Learning; AI; Accounting.

Corresponding author: Khattab M Ali Alheeti

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2026 Shaamood MT et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.

How to cite: Shaamood MT, Ahmed AA, Alheeti KMA et al. Intelligent Digital Transformation: Redefining Fraud Detection in Accounting [version 1; peer review: 2 not approved]. F1000Research 2026, 15:465 (https://doi.org/10.12688/f1000research.176100.1) First published: 02 Apr 2026, 15:465 (https://doi.org/10.12688/f1000research.176100.1) Latest published: 02 Apr 2026, 15:465 (https://doi.org/10.12688/f1000research.176100.1)

1. Introduction

The growth of digitized financial systems has brought efficiency gains to many organizations, but it has also opened new avenues for fraud.¹ Available evidence suggests that enterprises worldwide lose about $5.1 trillion to financial fraud, with accounting fraud a primary contributor.¹ Conventional measures, chiefly auditing procedures and rule-based systems, have proven inadequate for detecting sophisticated fraudulent schemes that exploit the complexity of modern financial ecosystems. Information technologies such as artificial intelligence, machine learning, and big data analytics have become enabling tools for fraud detection in accounting systems.² They make it possible to analyze vast transaction datasets, identify subtle anomalies, and recognize complex patterns that may indicate fraudulent activity. Moreover, AI-powered systems can continuously learn from new data, adapt to evolving fraud techniques, and improve detection accuracy over time. As noted in,³ machine-learning algorithms can process multivariate data points simultaneously, allowing them to identify fraud indicators that would remain invisible to traditional detection methods. This study aims to develop an integrated system in which three classification algorithms (Decision Tree, Random Forest, and Neural Network) are used to reduce the rate of false positives and improve the detection of fraudulent transactions in accounting systems.

Digital transformation has fundamentally changed the financial sector, driving unprecedented changes in how organizations conduct business and manage risk. This transformation encompasses the integration of cloud computing, artificial intelligence, and advanced analytics into core business processes.⁴

This study adapts and assesses an approach that combines several machine-learning models in a consolidated framework to detect fraud in accounting systems. Decision trees, random forests, and neural networks are compared for transaction fraud detection in order to establish the best approach for various scenarios. The evaluation includes a detailed analysis of each model, with a specific focus on accuracy, F1-score, and recall, as well as a confusion matrix to capture information that other evaluation metrics may miss. The study thus lays a basis for fraud-fighting strategies that reduce the losses arising from poor accounting practices while maintaining operational efficiency in accounting processes.

This paper is organized as follows. Section 2 reviews related research. Section 3 presents the proposed framework. Section 4 contains the results and discussion, and Section 5 concludes the paper.

2. Related works

Artificial intelligence-based approaches have proven more effective for fraud detection than earlier methodologies, and several studies have advanced this rapidly growing field. Zhang et al. (2023)⁵ proposed a deep-learning approach for detecting accounting fraud in financial statements. Their model combined NLP over textual information with numeric data and achieved a mean accuracy of 91.7%. They showed that incorporating both qualitative and quantitative variables improved predictive power compared with models that use only financial data. Our work extends this idea by comparing multiple models directly, using a tailored algorithm assignment.

Li and Johnson (2022)⁶ proposed a real-time fraud detection system for banking transactions using ensemble learning techniques. Their system combined random forest and gradient boosting regression trees to analyze transactional patterns, and it obtained a detection rate of 89.3% with a false positive rate of 2.1%. The authors also stressed that feature engineering is critical for picking up subtle hints of fraud. The proposed system extends this work by adding more complex neural network architectures while retaining the benefits of ensemble methods.

Patel et al. (2022)⁷ focused on explainable AI in fraud detection systems for accounting processes. They argued that active and passive reasoning about a fraud type helps to enhance client trust and overall system adoption in the financial sector. Their emphasis on interpretability came at the cost of a slightly lower accuracy of 93.5%, although a decision tree can be interpreted through rule extraction. Our work expands on this by comparing Decision Trees, which are highly interpretable, with Neural Networks, which have the potential to achieve better accuracy.

Rodriguez and Kim (2023)⁸ analyzed how graph neural networks can be used to detect fraud schemes that span multiple entities and transactions. Their approach used a graph structure to represent patterns of financial transactions, which made it easier to detect fraud across interrelated accounts and reached 94.2%. This was an increase of 15.3% over conventional approaches. The system proposed in this study is compatible with this strategy, since it targets transaction-level detectors that can be incorporated into graph-based systems.

Wang et al. (2023)⁹ adopted a federated learning method for fraud detection so that financial institutions could benefit from collective model training without sharing sensitive data with one another. Their system achieved a 92.8% detection rate without exposing any transaction information to outside parties. Although our work does not address privacy issues or design, the evaluated models are compatible with federated architectures.

Alharbi and Matthews (2022)¹⁰ conducted a detailed comparison of ten algorithms for credit card fraud detection, including random forest, support vector machine, and neural networks. Their research showed that most ensemble learning methods are superior to individual learners, although the highest accuracy, 95.7%, was reached only with an optimal combination of learners. The present study continues this comparative approach in the accounting domain, which differs from credit card systems in its transaction patterns and fraud indicators.

Chen and Davis (2023)⁴ built an adaptive fraud detection system that revised its model in response to fraud analysts' feedback. This approach reduced the false positive rate by 42% relative to static models while retaining an equal detection rate of 93.1%. The researchers also pointed out that deploying fraud-detection systems in practice requires keeping people in the loop.

This study offers several benefits over these previous studies. The three machine-learning algorithms considered (Decision Trees, Random Forests, and Neural Networks) are widely used in accounting fraud detection, but they differ significantly in structure and complexity: Random Forest is an ensemble extension of decision trees, whereas neural networks represent a completely different, often more complex, paradigm. The 99.19% accuracy achieved by the Neural Network model is a significant improvement over previous systems. Furthermore, a detailed confusion matrix analysis helps in understanding the effectiveness of each model and identifying areas for enhancement, ultimately helping financial institutions operationalize a more refined system.

3. Proposed framework

This paper outlines a comparison-based approach that uses multiple artificial intelligence models to detect fraud in accounting systems. It comprises four phases: the dataset source,¹¹ preprocessing of the dataset, training, and testing, as shown in Figure 1.

Figure 1. Block diagram of the proposed system.

Figure 1 shows the block diagram of the proposed system. It consists of four phases: dataset source, preprocessing, training, and testing.

3.1 Data preprocessing

The following data preprocessing techniques are employed, depending on the nature of the financial data acquired: handling missing data, outlier treatment, and data balancing (oversampling).

Data normalization uses min-max scaling:

x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}    (1)

where:

• x: is the original value.
• x_min: is the minimum of the values in the provided data.
• x_max: is the maximum of the values in the provided data.
• x′: is the normalized value.
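As a minimal sketch of Equation (1), the function below rescales one numeric column to the range [0, 1]. The function name `min_max_normalize`, the sample amounts, and the choice to map a constant column to 0.0 are illustrative assumptions, not part of the authors' released code.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following Equation (1)."""
    x_min = min(values)
    x_max = max(values)
    span = x_max - x_min
    if span == 0:
        # Assumed convention: a constant column carries no spread, so map it to 0.0.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Hypothetical transaction amounts, used only to exercise the formula.
amounts = [10.0, 25.0, 40.0, 100.0]
normalized = min_max_normalize(amounts)
print(normalized)
```

The smallest value always maps to 0 and the largest to 1, so a single extreme amount compresses all other values toward 0; this is one reason outlier treatment is listed alongside normalization.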

This component prepares the transaction data for analysis through the following steps:

1. Data cleaning: Procedures to handle missing data and outliers, and to remove duplicates that may be contained in the dataset.
2. Feature preparation: Normalization of numerical data to bring the values within a reasonable range, one-hot encoding of categorical data, and extraction of time features from transaction timestamps.
3. Data Balancing: Because fraudulent transactions make up only a small percentage of all transactions, we apply the synthetic minority oversampling technique (SMOTE).
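The data-balancing step can be illustrated with a simplified sketch of the interpolation idea behind synthetic minority oversampling: each synthetic point lies on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function `smote_like_oversample`, the neighbour count `k`, and the toy rows are hypothetical, and applying the step to training rows only is an assumption made here so that synthetic points never reach a held-out test set.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # Rank the remaining minority samples by squared Euclidean distance.
        others = [p for i, p in enumerate(minority) if i != base_index]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical, already-normalized TRAINING rows: [amount, hour_of_day].
train_legit = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.30],
               [0.12, 0.22], [0.18, 0.28], [0.11, 0.21]]
train_fraud = [[0.90, 0.95], [0.85, 0.90], [0.95, 0.80]]

new_fraud = smote_like_oversample(train_fraud, n_new=len(train_legit) - len(train_fraud))
balanced_fraud = train_fraud + new_fraud
print(len(train_legit), len(balanced_fraud))
```

Because every synthetic row is a convex combination of two real fraud rows, it stays inside the region already occupied by the minority class rather than duplicating existing rows exactly.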

3.2 Performance evaluation

The evaluation component assesses each model on several performance metrics.

1. Accuracy: The proportion of transactions in the dataset that are correctly categorized.
2. F1-Score: The harmonic mean of precision and recall, which gives a balanced measure of the model's performance.
3. Error Rate: The percentage of misclassified transactions.
4. Confusion Matrix: Used to separate true results from false results and so show classification performance in greater detail.
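The metrics listed above can all be derived from the four confusion-matrix counts. The sketch below uses the standard definitions, with F1 computed as the harmonic mean of precision and recall; the function name and the example counts are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, F1 and error rate from counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "error_rate": 1.0 - accuracy,
    }


# Hypothetical counts for 1,000 transactions, 100 of them fraudulent.
m = classification_metrics(tp=90, tn=880, fp=20, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts, accuracy is 0.97 while F1 is about 0.857, which shows how the two metrics can diverge when the positive class is small.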

The framework also includes a comparative visualization dashboard that makes the performance results easy to understand and recommends which model to apply given an organization's aims, such as reducing false negatives or false positives.

4. Results and discussion

Information technology has evolved rapidly in accounting, especially in the detection of fraud. Applying IT in accounting entails using AI, ML, and big data analytics to improve the effectiveness and reliability of fraud detection. Traditional fraud detection relies on audit approaches and rule-based detection, which are slow and error-prone. With the predictive analytics, anomaly detection, and automated reporting offered by present-day AI, fraudulent activities become easier to track. Figure 2 compares the Decision Tree, Random Forest, and Neural Network models in terms of accuracy, F1-score, recall, and error rate.

• Accuracy is represented by the blue bars.
• F1-score is represented by the green bars.
• Recall is represented by the orange bars.
• Error rate is represented by the red bars.

Figure 2. Accuracy, F1-Score, Recall, Error Rate.

Thus, it is possible to compare the results of the models in the same run on several criteria simultaneously.

Figure 3 compares the confusion matrices of the three models developed, namely, the Decision Tree, Random Forest, and Neural Network:

Figure 3. Confusion matrix comparison.

A confusion matrix contains the following elements:

• True Positive (TP): An instance that is classified as positive and is actually positive.
• True Negative (TN): An instance that is classified as negative and is actually negative.
• False Positive (FP): An instance that the model predicts to be positive when it is actually negative.
• False Negative (FN): An instance that the model predicts to be negative when it is actually positive.

In Figure 3 they are shown as bars for each model, as follows:

• True Positives (TP) are depicted by the green bars.
• True Negatives (TN) are depicted by the blue bars.
• False Positives (FP) are depicted by the orange bars.
• False Negatives (FN) are depicted by the red bars.

In this way, the performance of each model can be identified easily by comparing its correct and incorrect classifications.
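The four confusion-matrix elements can be tallied directly from true and predicted labels. The following is a small sketch with made-up labels (1 = fraudulent, 0 = legitimate); the function name `confusion_counts` is an assumption for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # fraud correctly flagged
        elif predicted != positive and actual != positive:
            tn += 1  # legitimate transaction correctly passed
        elif predicted == positive and actual != positive:
            fp += 1  # false alarm
        else:
            fn += 1  # missed fraud
    return tp, tn, fp, fn


# Hypothetical labels for eight transactions.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)  # (2, 4, 1, 1)
```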

4.1 Model performance evaluation

Our experimental evaluation assessed the capability of three machine learning models (Decision Tree, Random Forest, and Neural Network) to identify fraudulent transactions in an accounting system. The models were evaluated using accuracy, F1-score, recall, and error rate. In addition, we computed their confusion matrices to obtain more detailed insight into classification performance. The results of the simulation are given in Table 1.

Table 1. Model performance evaluation.

Experimental results
Methods	Accuracy (%)	F1-score (%)	Recall (%)	Error rate (%)
Decision Tree	98.34	97.21	95.75	1.66
Random Forest	97.28	95.56	98.38	2.72
Neural Network	99.19	96.50	97.49	0.81

Confusion matrix (rates, %)
Methods	TP	TN	FP	FN
Decision Tree	99.28	98.71	1.29	0.72
Random Forest	98.40	99.39	0.61	1.60
Neural Network	97.61	95.93	4.07	2.39
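As a quick arithmetic check on the experimental results in Table 1, the sketch below confirms that each reported error rate is the complement of the reported accuracy (100 minus accuracy) and picks out the best model per metric. The dictionary simply transcribes the table; the helper name `best_model` is illustrative.

```python
# Values transcribed from the experimental-results part of Table 1 (percent).
table_1 = {
    "Decision Tree": {"accuracy": 98.34, "f1": 97.21, "recall": 95.75, "error_rate": 1.66},
    "Random Forest": {"accuracy": 97.28, "f1": 95.56, "recall": 98.38, "error_rate": 2.72},
    "Neural Network": {"accuracy": 99.19, "f1": 96.50, "recall": 97.49, "error_rate": 0.81},
}


def best_model(metric, lower_is_better=False):
    """Name of the model with the best value for the given metric."""
    pick = min if lower_is_better else max
    return pick(table_1, key=lambda name: table_1[name][metric])


# Error rate should equal 100 - accuracy for every model (up to rounding).
complement_holds = all(
    abs((100.0 - row["accuracy"]) - row["error_rate"]) < 1e-9
    for row in table_1.values()
)

print("error rate = 100 - accuracy:", complement_holds)
for metric in ("accuracy", "f1", "recall"):
    print(metric, "->", best_model(metric))
print("error_rate ->", best_model("error_rate", lower_is_better=True))
```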

4.2 Analysis of decision tree performance

The Decision Tree was highly accurate, at 98.34%, confirming that the model was capable of performing the classification task.

Its F1-score of 97.21% indicates a good balance between precision and recall.

Its recall of 95.75% can be regarded as satisfactory, as the model identifies a considerable share of fraudulent activities.

Its error rate of 1.66% is low, suggesting that few samples were misclassified.

4.3 Analysis of random forest performance

The accuracy of the Random Forest is 97.28%, which is slightly lower than that of the other two models, the Decision Tree and the Neural Network.

Its F1-score is 95.56%, which indicates a weaker balance between precision and recall.

It has the best recall of all the models, 98.38%, meaning it is the best at identifying fraudulent cases.

Its error rate of 2.72% shows that it classified some transactions incorrectly.

4.4 Analysis of neural network performance

The Neural Network reached the highest accuracy in the analysis, 99.19%, making it one of the best-performing models for fraud detection.

The F1-score of the neural network was 96.50%, somewhat lower than that of the Decision Tree.

Its recall of 97.49% shows that the model has a high capability to identify fraudulent cases, although it is slightly lower than that of Random Forest.

It recorded the lowest error rate, 0.81%, and therefore qualifies as the model best suited to reducing classification errors.

4.5 Comparative analysis and implications

The comparison shows that each model has some benefits that favor its use in fraud detection applications.

1. The Neural Network yielded the highest accuracy and the lowest total error, which is desirable for cases in which different performance characteristics are equally essential. However, it has a higher false positive rate, which means that it raises more alarms that require attention.
2. Random Forest has high recall, which makes it suitable where high losses may be incurred if fraudulent transactions are not detected. It also has a very low false-positive rate, so it produces fewer false alarms and keeps operations efficient.
3. The Decision Tree has the best F1-score and shows high results across all metrics, which makes it a balanced model. Owing to its high interpretability, it can be used in compliance and audit trails, because it leaves a clear record of how each decision was reached.

The analysis showed that the Neural Network model achieved the highest accuracy of 99.19%, with the lowest error rate of 0.81%. The model demonstrated a strong performance in identifying fraudulent and legitimate transactions. However, it generated slightly more false positives than the other models did.

The Decision Tree model delivered well-balanced performance with 98.34% accuracy and the highest F1-score of 97.21%. It excelled at correctly identifying fraudulent transactions, with a true positive rate of 99.28%. The interpretability of the model makes it particularly valuable for understanding decision-making processes in fraud detection.

As Figure 4 shows, the Random Forest model had the strongest recall, at 98.38%, making it ideal for minimizing missed fraud cases. It achieved 97.28% accuracy and the lowest false positive rate, at 0.61%. This makes it particularly effective for high-volume transaction systems where minimizing false alarms is crucial.

Figure 4. False positive and false negative rates of our models.

Compared with previous studies, our models significantly outperformed traditional approaches. The closest competitor from previous research achieved 95.7% accuracy, whereas our models consistently performed above 97%. Traditional rule-based systems typically achieve around 77% accuracy, highlighting the substantial improvement offered by our machine learning approach.

The accuracy levels improve on conventional rule-based systems, which predict at a rate of only 70–85%, as identified by Chen et al. (2022). They also exceed other machine learning methods proposed in the literature, including the graph neural networks of Rodriguez and Kim (2023), which achieved 94.2%, and the ensemble methods of Alharbi and Matthews (2022), which achieved 95.7%.

Thus, it is recommended that an optimal fraud detection system use more than one model. For instance, employing Random Forest first to filter data and reduce false negatives, and then the Decision Tree for cases where an explanation from the model is necessary, may serve as a complete fraud safety net that addresses the operational factors involved.
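The two-stage arrangement recommended here can be sketched as a simple cascade: a high-recall screen runs first, and only the transactions it flags are passed to an interpretable second stage that returns a reason. Both stages below are hand-written stand-ins with hypothetical thresholds and field names (`amount`, `hour`); in practice they would be replaced by the trained Random Forest and Decision Tree.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    flagged: bool
    explanation: str


def screen_high_recall(txn):
    """Stand-in for the first-stage Random Forest: deliberately permissive."""
    return txn["amount"] > 1000 or txn["hour"] < 5


def explain_with_rules(txn):
    """Stand-in for the second-stage Decision Tree: returns the rule that fired."""
    if txn["amount"] > 5000:
        return Decision(True, "amount > 5000")
    if txn["amount"] > 1000 and txn["hour"] < 5:
        return Decision(True, "amount > 1000 and hour < 5")
    return Decision(False, "no rule fired")


def two_stage_review(txn):
    """Run the screen first; explain only the transactions it flags."""
    if not screen_high_recall(txn):
        return Decision(False, "passed first-stage screen")
    return explain_with_rules(txn)


# Hypothetical transactions.
small_daytime = two_stage_review({"amount": 50, "hour": 14})
large_daytime = two_stage_review({"amount": 7000, "hour": 12})
medium_daytime = two_stage_review({"amount": 1500, "hour": 13})
print(small_daytime, large_daytime, medium_daytime, sep="\n")
```

The design choice is that the second stage only sees the smaller flagged subset, so every alert that reaches an analyst carries a readable explanation.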

The confusion matrix analysis revealed that all models maintained high true positive and true negative rates, with minimal false positives and false negatives. This balanced performance across different metrics indicates robust and reliable fraud-detection capabilities across various transaction scenarios.

5. Conclusion

This study proposes an AI-based mechanism for identifying fraudulent transactions in an accounting system and compares three models, namely, Decision Tree, Random Forest, and Neural Network. In summary, all three models obtained high accuracy, above 97%, outperforming prior rule-based approaches. The neural network model yielded the highest accuracy, 99.19%, and the lowest error rate in the tests, 0.81%, demonstrating the effectiveness of deep learning for learning intricate patterns in finance. However, it was also more prone to false positives. Random Forest provided the highest recall, 98.38%, and the lowest false positive rate, only 0.61%, making it appropriate for settings that require low levels of false alarms. The decision tree model had the best F1-score, 97.21%, and reasonably balanced performance across the other metrics.

Ethics and consent

No ethics and consent statements are required for this study.

Software availability

Source code available from: https://doi.org/10.5281/zenodo.18353173¹¹

License: MIT License.

Data availability statement

Repository name: Credit Card Fraud Detection, Zenodo. https://doi.org/10.5281/zenodo.7395559.¹⁰

This study uses the publicly available Credit Card Fraud Detection dataset published by Liu, L. (2022) on Zenodo (https://doi.org/10.5281/zenodo.7395559), which mirrors the Kaggle ULB dataset of European credit card transactions. The dataset was not generated by the authors of this study. It includes all variables required to reproduce our experiments, such as transaction features and class labels indicating fraudulent or legitimate payments. The data are shared under an open license (Creative Commons Attribution 4.0 International license) that permits reuse, and they are accessible without embargo or access restrictions, in line with Zenodo’s open data policies.

References

1. ACFE: Report to the Nations: 2023 Global Study on Occupational Fraud and Abuse. Association of Certified Fraud Examiners; 2023.
2. Chen H, Davis J: Adaptive Fraud Detection: Integrating Human Expertise with Machine Learning Systems. J Bank Technol. 2023; 15(2): 178–192.
3. Chen Y, Wang X, Lee J: Machine Learning Applications in Financial Fraud Detection: A Systematic Review. J Account Inf Syst. 2022; 47: 101–118.
4. Smith K, Anderson R: Digital Transformation in Financial Services: Impact on Risk Management and Security. J Digit Innov. 2024; 12(1): 45–67.
5. Zhang P, Thompson R, Wilson M: Deep Learning for Accounting Fraud Detection: Integrating Textual and Numerical Financial Data. Account Rev. 2023; 98(2): 261–280.
6. Li K, Johnson P: Real-time Fraud Detection in Banking Transactions: An Ensemble Learning Approach. Int J Bank Mark. 2022; 40(3): 512–529.
7. Patel S, Nguyen T, Garcia R: Explainable AI for Fraud Detection in Accounting Processes. J Emerg Technol Account. 2022; 19(1): 45–63.
8. Rodriguez M, Kim J: Graph Neural Networks for Complex Fraud Scheme Detection in Financial Systems. Digit Financ. 2023; 5(2): 156–173.
9. Wang L, Zhang H, Miller R: Federated Learning for Privacy-Preserving Fraud Detection in Financial Institutions. J Financ Data Sci. 2023; 5(1): 78–94.
10. Alharbi A, Matthews L: A Comprehensive Comparison of Machine Learning Algorithms for Credit Card Fraud Detection. IEEE Transactions on Dependable and Secure Computing. 2022; 19(4): 2241–2255. Luqi Liu, “Credit Card Fraud Detection”. [dataset] Zenodo, Dec. 04, 2022.
11. Shaamood MT, et al.: Intelligent Digital Transformation: Redefining Fraud Detection in Accounting. Zenodo. 2026.


Open Peer Review


Reviewer Report 19 May 2026

Mohammadhossein Homaei, Universidad de Extremadura, Cáceres, Extremadura, Spain

Not Approved

https://doi.org/10.5256/f1000research.194131.r474384

Thank you for submitting your manuscript regarding the use of machine learning for fraud detection. However, I have identified several fundamental methodological and conceptual concerns that must be addressed to ensure the scientific rigor and validity of your research.

First, there is a significant conceptual mismatch between the paper's core narrative and the empirical evaluation. The manuscript is framed around "accounting fraud," which inherently involves financial statement manipulation and corporate auditing. Yet, the evaluation relies on a consumer credit card fraud dataset. Furthermore, the abstract claims that artificial transaction data was generated for this study, which contradicts the Data Availability statement citing a public Kaggle dataset. It is crucial to either align your literature and narrative with credit card fraud or utilize a genuine accounting dataset.

Second, I have serious concerns regarding potential data leakage. Achieving an F1-score of over 96% on this specific, highly imbalanced dataset strongly suggests that the SMOTE technique was applied to the entire dataset prior to the train/test split. If synthetic data leaked into the test set, the performance metrics are invalid. Please ensure that resampling is strictly isolated to the training set during cross-validation.
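
The isolation of resampling to the training set can be sketched as follows. This is a minimal, standard-library-only Python illustration, not the authors' pipeline: the toy labels, the helper names `stratified_split` and `oversample_minority`, the 80/20 split, and the seed are all assumptions, and plain random duplication of minority rows stands in for SMOTE to keep the example self-contained.

```python
import random


def stratified_split(labels, test_fraction, seed):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def oversample_minority(indices, labels, seed):
    """Duplicate minority-class indices (with replacement) until classes balance.

    Random duplication is a stand-in for SMOTE (an assumption of this sketch).
    """
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        balanced.extend(rng.choice(idx) for _ in range(target - len(idx)))
    return balanced


# Hypothetical toy labels: 990 legitimate (0) and 10 fraudulent (1) rows.
labels = [0] * 990 + [1] * 10

# Step 1: split first, so the test set holds only original rows.
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)

# Step 2: resample the training indices only; the test set is never touched.
balanced_train_idx = oversample_minority(train_idx, labels, seed=42)
```

Because the test indices are fixed before any duplication, no resampled row can appear in the evaluation data. In cross-validation the same ordering would be repeated inside every fold.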

Third, reporting an accuracy of 99.19% on a dataset where over 99% of transactions are legitimate is statistically misleading. I highly recommend evaluating your models using metrics suited for extreme class imbalance, specifically the Precision-Recall Curve (PRC) and the Area Under the PR Curve (AUC-PR).
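
A small worked example shows why accuracy alone is uninformative at this level of imbalance. The counts below are hypothetical and chosen only for illustration (they are not taken from the manuscript or its dataset), and `confusion_counts` is an illustrative helper name.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN with fraud (label 1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


# Hypothetical data: 9,950 legitimate (0) and 50 fraudulent (1) transactions.
y_true = [0] * 9950 + [1] * 50

# A useless baseline that labels every transaction as legitimate.
y_pred = [0] * len(y_true)

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
# Guard against division by zero when there are no positive examples.
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy={accuracy:.4f} recall={recall:.4f}")
```

The baseline reaches 99.5% accuracy while detecting no fraud at all, which is why metrics built on precision and recall are more telling for this kind of data.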

Finally, the manuscript lacks the necessary architectural details for reproducibility. Please clearly document the specific architectures (e.g., hidden layers, neurons) for the Neural Network and the hyperparameters (e.g., tree depth, estimators) for the ensemble models.

Is the work clearly and accurately presented and does it cite the current literature? No
Is the study design appropriate and is the work technically sound? No
Are sufficient details of methods and analysis provided to allow replication by others? No
If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results? No

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Cybersecurity, Machine Learning, AI, Digital Twins

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.


Reviewer Report 15 Apr 2026

Syed Ali Reza, University of the Potomac (UOTP), Washington, USA

Not Approved

https://doi.org/10.5256/f1000research.194131.r472911

The article is timely and addresses a relevant issue: the application of AI and machine learning to detect fraud in accounting systems. The paper compares Decision Tree, Random Forest, and Neural Network models and presents encouraging results. Nevertheless, several aspects should be elaborated to make the findings reliable. Specifically, the paper should give more methodological detail, explain the dataset better, present the figures more clearly, and be more skeptical about the results. My comments by section are given below.
Abstract
The abstract refers to four models, whereas the main paper discusses three: Decision Tree, Random Forest, and Neural Network. This should be corrected. The claim about artificial transaction data is unclear, particularly because the data availability section points to a public Credit Card Fraud Detection dataset on Zenodo. Please explain whether the data were created, altered, or reused directly. The claim that the proposed framework outperforms rule-based systems must be softened unless rule-based systems were tested in the study.
Introduction
The theme is topical, yet certain concepts regarding AI, machine learning, and digital transformation are repeated. A shorter introduction would be better. The reasons why a credit card fraud dataset is appropriate for studying accounting fraud should be explained further; these are related but not identical areas. Other general statements, such as the magnitude of fraud worldwide and the constraints of traditional audit, should be better supported by recent literature.
Methods
It is not evident whether balancing was done before or after splitting the data into training and testing sets. This matters because oversampling before splitting can result in data leakage. Kindly add the train-test split ratio, validation method, model parameters, software environment, and random seed. The normalization formula is ambiguous and needs to be rewritten. The definition of F1-score is not correct: F1-score is the harmonic mean of precision and recall, not an arithmetic mean. The use of each model should be described in enough detail to allow other scholars to replicate the study.
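
The difference between the two means can be checked with a short sketch. The precision and recall values are hypothetical and chosen only to make the gap visible; the function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # conventional value when both inputs are zero
    return 2 * precision * recall / (precision + recall)


def arithmetic_mean(precision, recall):
    """Plain average, shown only for contrast; this is not the F1-score."""
    return (precision + recall) / 2


# Hypothetical model: precise, but it misses most fraud cases.
precision, recall = 0.9, 0.1

print(f"arithmetic mean: {arithmetic_mean(precision, recall):.2f}")  # 0.50
print(f"F1 (harmonic):   {f1_score(precision, recall):.2f}")         # 0.18
```

The harmonic mean is pulled toward the smaller of the two values, so a model with poor recall cannot obtain a high F1-score from high precision alone.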
Results
Figure 2 is hard to comprehend. The text describes bars and colours, whereas the graph appears to be a line graph. It could be redesigned with more specific labels and consistent scaling. Table 1 gives the accuracy of Random Forest as 97.28, so the statement that all models scored above 98.34 is not true. The values in the confusion matrix should be clarified: please state whether TP, TN, FP, and FN are percentages, rates, or counts. The comparison with past research should be more cautious. The paper claims an improvement, yet does not indicate whether the same dataset, metrics, or experimental setup were employed.
Discussion and Conclusion
The discussion mostly reiterates the results. It would be more robust with an explanation of why the models behaved differently. The suggestion to apply several models is helpful, yet it should be elaborated: how would the models be integrated into an actual fraud-detection system? The title "slave to the false positives" ought to be changed, as it is not academic writing. The conclusion has to be more balanced and should explicitly state the limitations of the study, such as the limitations of the dataset, the lack of real testing on accounting systems, and the lack of privacy or security testing.
Overall recommendation
The article is scholarly and deals with a relevant research issue. Nevertheless, the procedure, presentation of results, and interpretation require significant clarification before the results can be trusted.

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? No
If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Machine learning, artificial intelligence, research methodology, data analytics, fraud detection, and accounting information systems.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.
