PestNeuroVision: A mobile application based on convolutional neural networks (CNNs) and computer vision for the detection of agricultural pests in the Cañete Valley, Lima, Peru

Alexander Leandro-Mendoza; Gustavo Fernández-Gutiérrez; Juan Chávez-Saldaña; Jesus Guevara-Ramos; Alex Pacheco

doi:10.12688/f1000research.180260.1


Software Tool Article


[version 1; peer review: awaiting peer review]

Alexander Leandro-Mendoza¹, Gustavo Fernández-Gutiérrez¹, Juan Chávez-Saldaña¹, Jesus Guevara-Ramos¹, Alex Pacheco¹


PUBLISHED 02 Jun 2026


¹ Professional School of Systems Engineering, Universidad Nacional de Cañete, San Vicente de Cañete, Lima, 15701, Peru

Alexander Leandro-Mendoza
Roles: Conceptualization, Methodology, Software, Writing – Original Draft Preparation, Writing – Review & Editing

Gustavo Fernández-Gutiérrez
Roles: Conceptualization, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing

Juan Chávez-Saldaña
Roles: Conceptualization, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing

Jesus Guevara-Ramos
Roles: Conceptualization, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing

Alex Pacheco
Roles: Conceptualization, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing


This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract


Background

Environmental degradation has increased the frequency of agricultural pests, resulting in crop yield losses of 10–30% in tropical and subtropical regions. In the Cañete Valley, Lima, Peru, species such as Spodoptera frugiperda, Liriomyza huidobrensis, and Bemisia tabaci pose critical threats to agriculture. Traditional pest monitoring methods are slow, subjective, and imprecise. To address this problem, this study proposes PestNeuroVision, a mobile application that implements convolutional neural networks (CNNs) and computer vision via the YOLO11s model for agricultural pest detection through local and autonomous inference.

Methods

The dataset consisted of 900 insect photographs uniformly distributed across nine agricultural pest classes present in the Cañete Valley. The images were divided into training (70%), validation (15%), and testing (15%) subsets. The YOLO11s model was trained using transfer learning and fine-tuning. The application was developed under the Model-View-ViewModel (MVVM) architectural pattern using Kotlin, integrating the trained algorithm in TensorFlow Lite format for local inference.

Results

The model achieved a precision of 92.4%, recall of 87.7%, mAP@50 of 91.7%, and mAP@50–95 of 78.0%, reaching 100% recall for adult specimens of Ceratitis capitata, Dione juno, Ligyrus gibbosus, and Spodoptera frugiperda. The application successfully executed detection-related functions, such as local inference on images, detection history management, technical species consultation, and generation of statistical charts for population fluctuation analysis.

Conclusions

PestNeuroVision demonstrates that the implementation of CNNs and computer vision on mobile devices is a viable technical solution for automating phytosanitary field monitoring. This proposal constitutes a technical contribution to precision agriculture.

Keywords

Agricultural pests, mobile application, convolutional neural networks, computer vision, YOLO11s, detection, precision agriculture

Corresponding author: Alexander Leandro-Mendoza

Competing interests: No competing interests were disclosed.

Grant information: This work was funded by the Directorate of Innovation and Technology Transfer of the Vice-Presidency for Research of the Universidad Nacional de Cañete (UNDC) under the "Research Contest for the Development of Innovations and Intellectual Property" [contract number 015-2024].
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2026 Leandro-Mendoza A et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Leandro-Mendoza A, Fernández-Gutiérrez G, Chávez-Saldaña J et al. PestNeuroVision: A mobile application based on convolutional neural networks (CNNs) and computer vision for the detection of agricultural pests in the Cañete Valley, Lima, Peru [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:859 (https://doi.org/10.12688/f1000research.180260.1) First published: 02 Jun 2026, 15:859 (https://doi.org/10.12688/f1000research.180260.1) Latest published: 02 Jun 2026, 15:859 (https://doi.org/10.12688/f1000research.180260.1)

Introduction

Accelerated climate change and ecosystem degradation have increased the frequency and intensity of crop pests, posing a growing threat to the viability of production systems.¹ This phenomenon causes yield losses in both tropical and subtropical regions, with values ranging from 10% to 30%, directly impacting agrarian economic stability.² The Peruvian context is no stranger to this situation, as the presence of various agricultural infestations has historically been documented. Such is the case for the lepidopteran Spodoptera frugiperda, which has been found extensively in production plots and is considered a direct threat to maize crop profitability because of its polyphagous nature and in-field aggressiveness.³ Similarly, the presence of the dipteran Liriomyza huidobrensis has been reported in the Cañete Valley, a pest native to Peru that causes severe damage to various vegetable and flower crops worldwide.⁴ Furthermore, the hemipteran Bemisia tabaci has been reported in cassava and sweet potato fields in this locality and is classified as highly detrimental and difficult to control because of its high resistance to chemical pesticides.⁵ These species are part of the nine pest classes addressed in this study.

Traditional pest control methods are subjective, imprecise, and labor-intensive.⁶ Furthermore, they require advanced knowledge and specialized resources, which makes them inefficient at meeting the demands of modern farming.⁷ Given this problem, accurate monitoring is an essential pillar of precision agriculture and sustainable agrochemical management.⁸ Early identification of harmful insects is essential to prevent severe damage.⁹ Therefore, automated detection tools are fundamental for optimizing field resources and ensuring effective control of phytophagous insect pests.¹⁰

Deep learning has progressed significantly in pest detection, surpassing conventional methods.¹¹ Convolutional Neural Networks (CNNs) have emerged as disruptive technologies for the identification of harmful entomofauna and crop diseases.¹² Computer vision is a fundamental technological tool in smart farming, particularly for detecting agricultural pests.¹³ In this sense, these advancements are presented as promising solutions for the efficient detection of harmful insects in farming.¹⁴ YOLO11, an architecture based on these technologies, incorporates C3k2 blocks, SPPF modules, and a C2PSA spatial attention mechanism, which allows the algorithm to prioritize key areas of the image, improving the detection of objects with varying dimensions and locations.¹⁵ These technical capabilities are ideal for the present study because they facilitate the identification of entomological agents with diverse morphologies and sizes. However, the previously described models face implementation challenges on resource-constrained equipment because of their high computational load and memory requirements.¹⁶ This operational limitation highlights the need to develop lightweight and optimized solutions for mobile devices.¹⁷

In response to this need, PestNeuroVision, a native Android mobile application, was developed with the objective of detecting agricultural pests in the Cañete Valley, Lima, Peru. This system operates through image processing, using images from the gallery or captured with the device camera. Unlike existing cloud-based solutions, this software employs CNNs and computer vision, specifically through the YOLO11s model, to execute inferences entirely on the device and offline, guaranteeing its functionality in areas without connectivity. In addition to automating detection, the application allows for the local storage of the results obtained from this process, consultation of technical information per insect, and generation of statistical charts of population fluctuations. These functionalities enable the traceability of the collected data and analysis of infestation levels. Thus, PestNeuroVision contributes to the development of precision mobile technologies with a direct focus on smart agriculture and research.

Methods

This research addressed the detection of nine classes of agricultural pests in the Cañete Valley, Lima, Peru. These categories were based on seven species, two of which were analyzed independently in their larval and adult stages. The insects considered for this study were: Bemisia tabaci (adult), Ceratitis capitata (adult), Dione juno (larva and adult), Ligyrus gibbosus (adult), Liriomyza huidobrensis (adult), Myzus persicae (nymph), and Spodoptera frugiperda (larva and adult). These are illustrated in Figure 1.

Figure 1. Agricultural pests analyzed in this research.

Source: Own elaboration based on images obtained from iNaturalist. Note: The mosaic displays eight of the nine pest classes addressed in this study. The image of the species Liriomyza huidobrensis is not included because of copyright restrictions; to view the photograph, please access the original repository link, which is detailed in the “Data availability” section of this paper.

Image credits:

Bemisia tabaci (adult), Mihajlo Tomić, iNaturalist, observation 60075120, CC BY 4.0.

Ceratitis capitata (adult), Jesse Rorabaugh, iNaturalist, observation 69185885, CC0 1.0.

Dione juno (larva), Alberto Reyes Bautista, iNaturalist, observation 194123388, CC BY 4.0.

Dione juno (adult), Sarah Angulo, iNaturalist, observation 240437816, CC0 1.0.

Ligyrus gibbosus (adult), Sinaloa Silvestre, iNaturalist, observation 225035301, CC0 1.0. Second photo in the iNaturalist observation gallery (thumbnail position 2).

Myzus persicae (nymph), Jesse Rorabaugh, iNaturalist, observation 38478019, CC0 1.0.

Spodoptera frugiperda (larva), Megan Kossa, iNaturalist, observation 17408158, CC BY 4.0. Third photo in the iNaturalist observation gallery (thumbnail position 3).

Spodoptera frugiperda (adult), henrya, iNaturalist, observation 309965749, CC0 1.0.

Implementation

Development technologies

Data labeling was performed using Label Studio v1.22.0. The model was developed using Python v3.12.12 within the Google Colaboratory (Colab) runtime environment, operating on Linux Ubuntu 22.04.5 LTS with an NVIDIA Tesla T4 GPU (15 GB VRAM) and CUDA 13.0. Ultralytics v8.4.21 was employed to implement the YOLO11s architecture, and PyTorch v2.10.0 was used to generate the heatmaps.

The mobile application v1.0.0 was developed using Kotlin v2.0.21 within the Android Studio Narwhal v2025.1.1 integrated development environment (IDE). API 30 (Android 11) was established as the minimum SDK level, and API 36 (Android 16) was set as the target to ensure compatibility with most devices currently in use. SQLite v3.28.0 and Room v2.6.0 were employed for local database creation and management, MPAndroidChart v3.1.0 was used for statistical chart visualization, and TensorFlow Lite v2.14.0 was utilized for the integration of the YOLO11s model.

Development phases

PestNeuroVision adopted a phased development approach inspired by the models of Fernández-Gutiérrez et al.¹⁸ and Ramos-Miller and Pacheco.¹⁹ Both studies applied a five-phase methodology. The former implemented the stages of data preprocessing and preparation, stratified division and calculation of class weights, model design and configuration, training and validation, and integration with the interface and local deployment to develop a CNN-based image-processing system for dermatological diagnostics. The latter structured the process into analysis, planning, implementation, review, and deployment to build an inventory control web-based system. These sequential approaches enable the traceability of system requirements and ensure software functionality for pest detection.

• Phase 1: Obtaining the dataset
Images were obtained from three primary sources: iNaturalist, Roboflow and Kaggle. Additionally, to supplement classes with lower representation, photographs were incorporated from Google Images via web scraping. This procedure was based on the method described by Xu et al.,²⁰ who used Google Image Search for data collection. The initial collection comprised 729 images distributed unevenly across nine classes, as detailed in Table 1 and Figure 2.

Table 1. Initial dataset composition.

Class	Life stage	Total images
Bemisia tabaci	Adult	100
Ceratitis capitata	Adult	75
Dione juno	Larva	25
Dione juno	Adult	100
Ligyrus gibbosus	Adult	54
Liriomyza huidobrensis	Adult	100
Myzus persicae	Nymph	75
Spodoptera frugiperda	Larva	100
Spodoptera frugiperda	Adult	100

Figure 2. Initial dataset distribution.

Source: Own elaboration. Note: A significant imbalance was evident in the collected data. Dione juno (larva) exhibited the lowest number of representative samples.

• Phase 2: Dataset preprocessing

Data augmentation

The obtained images were resized to 640 × 640 pixels, the default input size of YOLO11s. To balance the number of images per class, data augmentation was applied using the ImageDataGenerator tool from Python's tensorflow.keras library. The transformations consisted of geometric variations that simulated real-world field capture conditions. Table 2 lists the parameters used in this process. The final dataset was balanced with 100 images per class, resulting in a total of 900 samples.

Table 2. Parameters used for data augmentation.

Parameter	Value
rotation_range	30°
zoom_range	0.3
width_shift_range/height_shift_range	0.1
horizontal_flip/vertical_flip	True
fill_mode	reflect
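
The balancing arithmetic can be checked with a short standard-library sketch. It assumes that augmentation was used only to fill each class up to 100 images, which the text implies but does not state outright. The dictionary `AUGMENTATION_KWARGS` mirrors Table 2 using the keyword names of `ImageDataGenerator`; TensorFlow itself is deliberately not imported, and the helper name `augmentation_deficits` is hypothetical.

```python
# Per-class image counts taken from Table 1 (initial dataset).
INITIAL_COUNTS = {
    "Bemisia tabaci (adult)": 100,
    "Ceratitis capitata (adult)": 75,
    "Dione juno (larva)": 25,
    "Dione juno (adult)": 100,
    "Ligyrus gibbosus (adult)": 54,
    "Liriomyza huidobrensis (adult)": 100,
    "Myzus persicae (nymph)": 75,
    "Spodoptera frugiperda (larva)": 100,
    "Spodoptera frugiperda (adult)": 100,
}
TARGET_PER_CLASS = 100  # balanced size reported for the final dataset

# Table 2 expressed as keyword arguments. With TensorFlow installed these
# could be passed as ImageDataGenerator(**AUGMENTATION_KWARGS).
AUGMENTATION_KWARGS = {
    "rotation_range": 30,
    "zoom_range": 0.3,
    "width_shift_range": 0.1,
    "height_shift_range": 0.1,
    "horizontal_flip": True,
    "vertical_flip": True,
    "fill_mode": "reflect",
}


def augmentation_deficits(counts, target):
    """Return how many synthetic images each class needs to reach the target."""
    return {name: max(target - n, 0) for name, n in counts.items()}


deficits = augmentation_deficits(INITIAL_COUNTS, TARGET_PER_CLASS)
total_initial = sum(INITIAL_COUNTS.values())
total_final = total_initial + sum(deficits.values())

print(f"initial images: {total_initial}")
print(f"augmented images needed: {sum(deficits.values())}")
print(f"final images: {total_final}")
```

Under this assumption, Dione juno (larva) requires the most synthetic samples, which is consistent with the imbalance noted for Figure 2.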

Data labeling

All 900 images were labeled: each visible insect instance was annotated individually with a bounding box. The resulting annotations were exported in a plain text format (.txt) that is compatible with YOLO. Table 3 summarizes the composition of the final dataset, including the total number of annotated instances per class, which reflects the variability in the number of specimens per image.

Table 3. Final dataset composition.

Class	Life stage	Total Images	Labeled instances	Annotation density (Inst./image)
Bemisia tabaci	Adult	100	596	5.96
Ceratitis capitata	Adult	100	101	1.01
Dione juno	Larva	100	783	7.83
Dione juno	Adult	100	101	1.01
Ligyrus gibbosus	Adult	100	132	1.32
Liriomyza huidobrensis	Adult	100	110	1.10
Myzus persicae	Nymph	100	1151	11.51
Spodoptera frugiperda	Larva	100	114	1.14
Spodoptera frugiperda	Adult	100	102	1.02

Although each class was balanced at 100 images, the number of instances per photograph varied according to the nature of each insect because some species were gregarious and others solitary, generating inherent variability in the annotation density.
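
For readers unfamiliar with the YOLO text format, the following sketch shows how one bounding box becomes one label line (class index followed by the normalized center, width, and height) and how the annotation density in Table 3 is obtained. The function names, the example box, and the class index 0 are illustrative assumptions, not values taken from the authors' export.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a YOLO label line.

    YOLO labels store: class x_center y_center width height,
    with all four geometric values normalized to the range [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def annotation_density(instances, images):
    """Average number of labeled instances per image, as reported in Table 3."""
    return round(instances / images, 2)


# Hypothetical box on a 640 x 640 image, assigned to a hypothetical class 0.
line = to_yolo_line(0, (160, 160, 480, 320), 640, 640)
print(line)

# Densities recomputed from the instance counts in Table 3.
print(annotation_density(596, 100))   # Bemisia tabaci (adult)
print(annotation_density(1151, 100))  # Myzus persicae (nymph)
```

Each image has one such .txt file, with one line per visible specimen, which is why gregarious species produce far longer label files than solitary ones.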

• Phase 3: Model development
Architecture selection
YOLO11s (small variant) was used as the CNN architecture and computer vision model for pest detection. This algorithm, which features 9.2 M parameters and 16.7 billion FLOPs, achieved an mAP@50–95 of 47.0%, compared to the mAP@50–95 of 47.2% obtained by YOLO26s, which has 9.5 M parameters and 20.7 billion FLOPs.²¹ Based on these values, YOLO11s requires 19% fewer FLOPs at a cost of 0.2 mAP points, indicating a lower computational load per inference. Furthermore, in comparison with YOLO11m, YOLO11l, and YOLO11x, this model achieves an adequate balance between precision and computational efficiency.²² Therefore, these characteristics make it an appropriate option for integration into mobile devices.
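
The 19% figure follows directly from the two FLOP counts quoted above. A minimal check, using only the values stated in this section (the dictionary keys are illustrative names):

```python
# Figures quoted in the text for the two candidate models.
YOLO11S = {"params_m": 9.2, "gflops": 16.7, "map50_95": 47.0}
YOLO26S = {"params_m": 9.5, "gflops": 20.7, "map50_95": 47.2}

# Relative FLOP saving of YOLO11s with respect to YOLO26s, in percent.
flops_reduction_pct = (YOLO26S["gflops"] - YOLO11S["gflops"]) / YOLO26S["gflops"] * 100

# Accuracy given up in exchange, in mAP@50-95 points.
map_gap = YOLO26S["map50_95"] - YOLO11S["map50_95"]

print(f"FLOP reduction: {flops_reduction_pct:.1f}%")
print(f"mAP@50-95 gap: {map_gap:.1f} points")
```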

Model development

The dataset was randomly divided into three subsets: 70% for training, 15% for validation, and 15% for testing. The paths for each split, as well as the number and names of the classes, were recorded in a data.yaml file, which was subsequently read by the YOLO11s model to locate the images and identify the target categories.
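
A minimal sketch of such a split and of the resulting data.yaml is given below. The file names, folder layout, class identifiers, and the reuse of seed 50 for shuffling are assumptions made for illustration; the paper reports only the 70/15/15 proportions and that the file stores the split paths and the class count and names.

```python
import random

# Hypothetical class identifiers; the authors' actual names and order are not given.
CLASS_NAMES = [
    "bemisia_tabaci_adult", "ceratitis_capitata_adult", "dione_juno_larva",
    "dione_juno_adult", "ligyrus_gibbosus_adult", "liriomyza_huidobrensis_adult",
    "myzus_persicae_nymph", "spodoptera_frugiperda_larva",
    "spodoptera_frugiperda_adult",
]


def split_dataset(items, train=0.70, val=0.15, seed=50):
    """Shuffle reproducibly and cut into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains becomes the test set
    )


def build_data_yaml(root, names):
    """Render a dataset description in the layout Ultralytics reads."""
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        f"nc: {len(names)}",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


image_names = [f"img_{i:04d}.jpg" for i in range(900)]  # placeholder file names
train_set, val_set, test_set = split_dataset(image_names)
yaml_text = build_data_yaml("datasets/pests", CLASS_NAMES)

print(len(train_set), len(val_set), len(test_set))
print(yaml_text)
```

With 900 images, the split yields a 135-image test subset, matching the size reported in the Results section.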

To reduce the training convergence time, transfer learning was employed using pre-trained weights from the YOLO11s backbone, followed by fine-tuning to adapt the algorithm to the morphological patterns of the nine defined pest classes.

The model was trained for 200 epochs using a seed of 50 to ensure reproducibility and was configured to process input images with dimensions of 640 × 640 pixels. These hyperparameters were explicitly defined. The remaining values were maintained at their default settings as established by Ultralytics. Table 4 details the configuration used for the training process.

Table 4. Hyperparameters used for model training.

Hyperparameter	Value
batch	16
conf	null
dropout	0.0
epochs*	200
imgsz*	640
iou	0.7
lr0	0.01
lrf	0.01
momentum	0.937
optimizer	auto
patience	100
seed*	50
warmup_epochs	3.0
weight_decay	0.0005

* Explicitly defined for this study; all other values are Ultralytics defaults.
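
The training run described above could be launched roughly as follows. This is a sketch rather than the authors' script: only the three explicitly defined hyperparameters are passed, the `data.yaml` path and the `yolo11s.pt` weights file name are assumptions, and the Ultralytics import is deferred so that the configuration can be inspected without the package installed.

```python
# Only the explicitly defined values from Table 4; everything else is left
# to the Ultralytics defaults, as described in the text.
TRAIN_ARGS = {
    "data": "data.yaml",  # assumed location of the dataset description
    "epochs": 200,
    "imgsz": 640,
    "seed": 50,
}


def train_yolo11s(train_args=None, weights="yolo11s.pt"):
    """Fine-tune YOLO11s starting from pre-trained weights.

    Requires the `ultralytics` package; the import is local so that this
    module can still be loaded on machines without it.
    """
    from ultralytics import YOLO

    model = YOLO(weights)  # loads the pre-trained small variant
    return model.train(**(train_args or TRAIN_ARGS))


if __name__ == "__main__":
    # Print the configuration; call train_yolo11s() to start an actual run.
    for key, value in TRAIN_ARGS.items():
        print(f"{key} = {value}")
```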

Model explainability

Eigen-CAM was employed to visualize the features identified by the model following inference. This explainability algorithm generated two heatmaps for each sample subjected to the detection process, the original and its inverse ($1 - \mathrm{CAM}$), because results tend to vary with image composition. The one showing the highest intensity concentrated on the specimen was selected.
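
The paper does not say whether this selection was made visually or programmatically. One possible automated criterion, sketched here purely as an assumption, is to compare the mean activation inside the detected bounding box for the original map and for its inverse and keep the larger. The toy 4 × 4 map and all function names are hypothetical.

```python
def invert_cam(cam):
    """Return the inverse heatmap, 1 - CAM, for a map with values in [0, 1]."""
    return [[1.0 - value for value in row] for row in cam]


def mean_inside_box(cam, box):
    """Mean activation over the cells covered by box = (x_min, y_min, x_max, y_max).

    The maximum indices are exclusive, as in Python slicing.
    """
    x_min, y_min, x_max, y_max = box
    values = [cam[y][x] for y in range(y_min, y_max) for x in range(x_min, x_max)]
    return sum(values) / len(values)


def select_heatmap(cam, box):
    """Keep whichever map (original or inverse) is more intense on the specimen."""
    inverse = invert_cam(cam)
    if mean_inside_box(cam, box) >= mean_inside_box(inverse, box):
        return "original", cam
    return "inverse", inverse


# Toy map in which the specimen (central 2 x 2 cells) is dark and the background bright.
cam = [
    [0.9, 0.9, 0.8, 0.9],
    [0.9, 0.1, 0.2, 0.9],
    [0.8, 0.2, 0.1, 0.9],
    [0.9, 0.9, 0.9, 0.8],
]
specimen_box = (1, 1, 3, 3)
label, chosen = select_heatmap(cam, specimen_box)
print(label, round(mean_inside_box(chosen, specimen_box), 2))
```

In this toy case the inverse map is selected, illustrating why generating both versions is useful when the principal component highlights the background instead of the insect.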

• Phase 4: Mobile application development
The mobile application adopted the Model-View-ViewModel (MVVM) software architecture pattern, as shown in Figure 3. This framework offers superior performance compared to the Model-View-Presenter (MVP), specifically in terms of efficient CPU and memory utilization in Android environments.²³
The Model layer managed persistence and local processing through three components: DAO interfaces to execute SQL statements for detection logs, the pest catalog, and user data; repositories to centralize data access; and the TensorFlow Lite model as the inference engine. Figure 4 illustrates the physical design of the database.
The View layer managed the user interface through three resources: layouts to structure the screens; drawables to store graphic resources, icons, and backgrounds; and state selectors to define the visual behavior of components based on user interactions.
The ViewModel layer served as an intermediary between the View and Model. Using the LiveData observable, user interface updates were managed in response to any changes occurring in the data repository.
• Phase 5: Integration of the trained model into the application
Following training, the YOLO11s model was exported to the TensorFlow Lite (.tflite) format and integrated into the Android Studio project. This implementation enabled the application to detect pests in images sourced from the device’s gallery or camera, projecting the inference results onto the processed samples.

Figure 3. MVVM software architecture pattern adopted by the mobile application.

Source: Own elaboration. Note: The diagram illustrates the data flow and separation of concerns between the layers of the MVVM pattern. The local inference engine (.tflite) for pest detection was integrated into this architectural design.

Figure 4. Physical database design of the application.

Source: Own elaboration. Note: The diagram details the relational structure of the tables designed to manage detection logs, user data, insect catalog, and pest control measures.

Operation

The PestNeuroVision application runs locally on Android devices and does not require an Internet connection. The requirements for proper operation are detailed below.

Minimum mobile device requirements (end user)

• Operating System: Android 11 (API 30) or higher.
• Processor: Octa-core 2.0 GHz or higher.
• RAM: 4 GB or higher.
• Storage capacity: 150 MB or higher.

Local development environment requirements (developer)

• Operating System: Windows 10 or higher, Linux Ubuntu 22.04 LTS or higher, or macOS 13 or higher.
• Processor: Intel Core i5 or equivalent.
• RAM: 8 GB or higher.
• Storage capacity: 10 GB or higher.
• IDE: Android Studio Narwhal v2025.1.1 or higher.
• Physical device: Recommended (the application uses the device’s camera and gallery).
• Emulator (Optional): Android Virtual Device (AVD), included in Android Studio. Functional for gallery testing, although without camera support.

Execution from the development environment

PestNeuroVision is deployed from the Android Studio IDE, where the project is loaded and dependencies resolved automatically by Gradle. The connection to the mobile device is established via USB or Wi-Fi, with debugging mode previously enabled from the developer options of the Android operating system. Optionally, an AVD emulator can be used, which is functional for gallery image testing, although without camera support. The application is launched by pressing the “Run” button or the Shift + F10 keys in the development environment.

The login credentials for the PestNeuroVision application are as follows:

• User: admin
• Password: admin

Workflow

PestNeuroVision operates in three sequential stages. In the input stage, the user selects an image from the device gallery or captures one directly with the device camera. The application accepts standard formats (JPEG, JPG, PNG, WebP) and prefers square images (width = height); in any case, the software internally resizes the image to 640 × 640 pixels before prediction. During the inference stage, the model processes the image on the local hardware to locate pests. Finally, in the output stage, the detections are visualized as bounding boxes overlaid on the original image, each with its class name and confidence level. If required, the user can store these results in the local database, making them available for later consultation in the History module.
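
Because inference runs on a 640 × 640 copy while the boxes are drawn on the original image, the predicted coordinates have to be mapped back to the original resolution. The sketch below assumes a plain stretch resize without letterbox padding, which the paper does not specify; the application itself is written in Kotlin, so this Python version only illustrates the arithmetic, and the function names are hypothetical.

```python
MODEL_SIZE = 640  # side length of the square model input, in pixels


def scale_box_to_original(box, original_width, original_height, model_size=MODEL_SIZE):
    """Map a box from model space back onto the original image.

    Assumes the image was stretched to model_size x model_size, so each
    axis has its own scale factor.
    """
    x_min, y_min, x_max, y_max = box
    scale_x = original_width / model_size
    scale_y = original_height / model_size
    return (x_min * scale_x, y_min * scale_y, x_max * scale_x, y_max * scale_y)


def format_label(class_name, confidence):
    """Text drawn next to a box: class name plus confidence as a percentage."""
    return f"{class_name} {confidence:.0%}"


# Hypothetical detection on a 1280 x 960 photograph.
box_on_photo = scale_box_to_original((160, 160, 480, 320), 1280, 960)
label = format_label("Bemisia tabaci (adult)", 0.96)
print(box_on_photo, label)
```

A stretch resize distorts non-square photographs, which may be why square inputs are preferred.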

Ethical considerations

The images used in this study were exclusively employed for academic and research purposes. The system was designed as a technical support tool and not as a substitute for the professional judgment of an expert. As there were no human participants, biological samples, or personal data, ethical approval and informed consent were not required.

Results

Model performance

The YOLO11s model was evaluated on the test subset, which consisted of 135 images and 368 annotated instances. The global metrics recorded a precision of 92.4%, recall of 87.7%, mAP@50 of 91.7%, and mAP@50–95 of 78.0%. Regarding the processing speed on an NVIDIA Tesla T4 GPU, per-image latencies of 4.8, 9.4, and 3.5 ms for preprocessing, inference, and post-processing were achieved, respectively. Table 5 presents the evaluation results disaggregated by class.

Table 5. Evaluation metrics disaggregated by class.

Class	Images	Instances	Precision (%)	Recall (%)	mAP@50 (%)	mAP@50–95 (%)
All	135	368	92.4	87.7	91.7	78.0
Bemisia tabaci (adult)	19	41	96.4	85.4	92.8	70.3
Ceratitis capitata (adult)	11	11	94.5	100.0	99.5	94.3
Dione juno (adult)	13	13	92.9	100.0	99.5	92.5
Dione juno (larva)	12	77	80.1	77.9	77.6	58.9
Ligyrus gibbosus (adult)	12	13	92.9	100.0	98.4	87.1
Liriomyza huidobrensis (adult)	21	25	100.0	91.0	96.0	85.8
Myzus persicae (nymph)	15	148	92.4	74.2	86.1	54.8
Spodoptera frugiperda (adult)	17	17	89.5	100.0	99.0	90.9
Spodoptera frugiperda (larva)	15	23	93.3	60.9	76.8	67.7
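
The "All" row of Table 5 appears to be the unweighted (macro) mean of the nine per-class rows rather than an instance-weighted average, as the following check suggests. It also sums the three per-image latency components reported above. All numbers are copied from this section; only the variable and function names are introduced here.

```python
# Per-class (precision, recall, mAP@50, mAP@50-95) in percent, from Table 5.
PER_CLASS = {
    "Bemisia tabaci (adult)": (96.4, 85.4, 92.8, 70.3),
    "Ceratitis capitata (adult)": (94.5, 100.0, 99.5, 94.3),
    "Dione juno (adult)": (92.9, 100.0, 99.5, 92.5),
    "Dione juno (larva)": (80.1, 77.9, 77.6, 58.9),
    "Ligyrus gibbosus (adult)": (92.9, 100.0, 98.4, 87.1),
    "Liriomyza huidobrensis (adult)": (100.0, 91.0, 96.0, 85.8),
    "Myzus persicae (nymph)": (92.4, 74.2, 86.1, 54.8),
    "Spodoptera frugiperda (adult)": (89.5, 100.0, 99.0, 90.9),
    "Spodoptera frugiperda (larva)": (93.3, 60.9, 76.8, 67.7),
}

# Per-image latency on the Tesla T4, in milliseconds.
LATENCY_MS = {"preprocess": 4.8, "inference": 9.4, "postprocess": 3.5}


def macro_average(rows):
    """Unweighted mean of each metric across classes, rounded to one decimal."""
    columns = zip(*rows.values())
    return tuple(round(sum(column) / len(rows), 1) for column in columns)


overall = macro_average(PER_CLASS)
total_latency_ms = sum(LATENCY_MS.values())
images_per_second = 1000 / total_latency_ms

print("macro average:", overall)
print(f"total latency: {total_latency_ms:.1f} ms (~{images_per_second:.1f} images/s)")
```

Because the mean is unweighted, the three lower-scoring classes (Dione juno larva, Myzus persicae nymph, and Spodoptera frugiperda larva) pull the global figures down as much as any other class, regardless of how many instances they contain.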

Training and validation curves

The training loss curves (train/box_loss, train/cls_loss, and train/dfl_loss) showed a sustained and consistent reduction throughout the 200 epochs, converging without signs of pronounced overfitting. Validation losses (val/box_loss, val/cls_loss, and val/dfl_loss) followed a similar downward trend, with greater variability in the intermediate epochs, which is expected given the small size of the validation set, and stabilized toward the end of training. No divergence was observed between the training and validation losses by the end of training.

Confusion matrix

The normalized confusion matrix recorded a perfect classification rate (100.0%) for four classes, corresponding to the adult stage of Ceratitis capitata, Dione juno, Ligyrus gibbosus, and Spodoptera frugiperda. High correct classification rates were also obtained for Liriomyza huidobrensis (adult) and Bemisia tabaci (adult) at 92.0% and 90.0%, respectively.

The lowest recall rates were recorded for Spodoptera frugiperda (larva), Myzus persicae (nymph), and Dione juno (larva) at 61.0%, 79.0%, and 83.0%, respectively. Additionally, 4.0% of Spodoptera frugiperda (larva) instances were misclassified as adults. Regarding the background, a false positive rate of 54.0% was obtained for Dione juno (larva) and a 30.0% false negative rate for Spodoptera frugiperda (larva).
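
To make the normalization concrete, the sketch below converts raw counts for one true class into the rates shown in a normalized confusion matrix, assuming (as the note to Figure 6 implies) that each true-class column is scaled to sum to 1. The counts are hypothetical: they were chosen only so that the rounded rates match the values reported for Spodoptera frugiperda (larva) and its 23 test instances, and they are not taken from the authors' output.

```python
def normalize_column(counts):
    """Scale the raw counts of one true class so that they sum to 1."""
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# HYPOTHETICAL outcomes for the 23 test instances whose true class is
# Spodoptera frugiperda (larva); keys name the predicted outcome.
larva_column = {
    "Spodoptera frugiperda (larva)": 14,  # correctly detected
    "Spodoptera frugiperda (adult)": 1,   # confused with the adult stage
    "other class": 1,                     # confused with another pest
    "background": 7,                      # missed entirely (false negative)
}

rates = normalize_column(larva_column)
for label, rate in rates.items():
    print(f"{label}: {rate:.2f}")
```

Read this way, the diagonal entry is the recall of the class, and the "background" entry is the share of specimens that the model failed to detect at all.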

Pest detection

Figure 7 shows the performance of the YOLO11s model on eight of the nine classes of the dataset. Confidence levels exceeding 90.0% were observed for most species, with peak values of 96.0% for Bemisia tabaci (adult) and 95.0% for Dione juno (larva). The model demonstrated the capability to detect multiple instances simultaneously in scenarios with high population densities, such as in the cases of Bemisia tabaci (adults) and Myzus persicae (nymphs). In the latter class, the scores ranged between 34.0% and 84.0%.

Figure 5. Training and validation curves of the YOLO11s model.

Source: Automatically generated by the YOLO11s model after training. Note: The graphs show the evolution of the loss functions (box, cls, dfl) and performance metrics (precision, recall, mAP@50, and mAP@50–95) over 200 epochs.

Figure 6. Normalized confusion matrix of the YOLO11s model on the test set.

Source: Automatically generated by the YOLO11s model after training. Note: The values on the main diagonal represent the correct classification rate for each class, whereas the off-diagonal values indicate the proportion of samples misclassified between classes or erroneously assigned to the background. All data are presented in decimal format within the interval [0, 1], where, for example, a value of 0.96 equals a 96% recall rate for that category.

Figure 7. Detection of the eight pest classes using the YOLO11s model.

Source: Own elaboration based on images generated by the YOLO11s model after the inference. Note: The bounding boxes indicate the localized region, and the numerical values represent the confidence level of each prediction in decimal format [0, 1], where, for example, a value of 0.96 corresponds to a 96% certainty. The image of the species Liriomyza huidobrensis is not included because of copyright restrictions.

Image credits:

Bemisia tabaci (adult), Mihajlo Tomić, iNaturalist, observation 60075120, CC BY 4.0.

Ceratitis capitata (adult), karuquebec, iNaturalist, observation 104783008, CC0 1.0. Third photo in the iNaturalist observation gallery (thumbnail position 3).

Dione juno (larva), Pietro, iNaturalist, observation 243189131, CC BY 4.0.

Dione juno (adult), Sarah Angulo, iNaturalist, observation 240437816, CC0 1.0.

Ligyrus gibbosus (adult), Sinaloa Silvestre, iNaturalist, observation 225035301, CC0 1.0. Second photo in the iNaturalist observation gallery (thumbnail position 2).

Myzus persicae (nymph), Jesse Rorabaugh, iNaturalist, observation 5287581, CC0 1.0. Second photo in the iNaturalist observation gallery (thumbnail position 2).

Spodoptera frugiperda (larva), Eben Preston, observation 236906070, CC0 1.0.

Spodoptera frugiperda (adult), Fernando Sessegolo, observation 69886018, CC0 1.0.

Eigen-CAM heatmaps

Figure 8 shows the heatmaps for three representative cases. In Bemisia tabaci (adult), the maximum activation was concentrated on the specimens’ bodies, with minimum values in the background. In Spodoptera frugiperda (larva) and Ceratitis capitata (adult), the activation was low-intensity and diffuse, distributed across both the insects’ bodies and the leaf surfaces.

Figure 8. Eigen-CAM heatmaps generated for three representative classes.

Source: Own elaboration based on the images generated by the Eigen-CAM algorithm. Note: The left column shows the original image, the central column contains the generated heatmap, and the right column contains the heatmap superimposed on the original image. Red indicates maximum activation, and blue indicates minimum activation.

Image credits:

Bemisia tabaci (adult), Mike Bowie, iNaturalist, observation 185192259, CC0 BY 4.0.

Spodoptera frugiperda (larva), Megan Kossa, iNaturalist, observation 17408158, CC BY 4.0. Third photo in the iNaturalist observation gallery (thumbnail position 3).

Ceratitis capitata (adult), Sebastián Fornés, iNaturalist, observation 108328659, CC BY 4.0.

Use Cases

Case 1. Pest detection using the PestNeuroVision application

The process begins when the user accesses the main PestNeuroVision module. In the central area, the user selects an image (from the gallery or camera) in JPEG, JPG, PNG, or WebP format and presses the “Run Detection” button, after which the application performs the inference locally. As a result, the bounding boxes, class names, and confidence percentages for each detection are displayed on the image. Finally, the user clicks the “Save Detection” button to persist the data in the system. Figure 9 illustrates this use case.

Figure 9. Mobile application workflow for pest detection.

Source: Own elaboration. Note: (A) Interface prior to image upload. (B) Imported photograph. (C) Inference results: identification of two specimens of Bemisia tabaci (adult) with 96% and 94% confidence.

Image credits: Bemisia tabaci (adult), Mihajlo Tomić, iNaturalist, observation 60075120, CC BY 4.0.

Case 2. Consultation of detection history

The detection history module displays a list of records ordered chronologically, where each list item contains an ID, the date and time of the executed inference, the names of the identified insects, and the number of instances located. Additionally, a search filter based on the scientific name of the pest is implemented. Upon expanding a record, a panel is displayed detailing the name of each specimen, the number of individuals detected, and the options to view the detection photograph (View Photo) and delete the record (Delete). Figure 10 illustrates this use case.

Figure 10. Detection history.

Source: Own elaboration. Note: (A) General detection history. (B) Data filtering by the scientific name of the species. (C) Photographic record of processed detection.

Image credits: Bemisia tabaci (adult), Mihajlo Tomić, iNaturalist, observation 60075120, CC BY 4.0.

Furthermore, tapping on a species’ name grants access to the technical consultation section, which details its essential information: reference photograph, common and scientific nomenclature, risk level, physical description, and control measures for mitigation. This functionality serves as a supporting tool for the user; however, the final judgment of an agricultural professional remains essential. Figure 11 illustrates the described interface.

Figure 11. Interface for technical information consultation on a specific pest.

Source: Own elaboration. Note: (A) Photographic record of the species. (B) Biological description, risks, and control measures.

Image credits: Bemisia tabaci (adult), Mihajlo Tomić, iNaturalist, observation 60075120, CC BY 4.0.

Case 3. Statistical information

The statistics module displays a summary of the detections recorded by the application. This interface integrates three types of representations: line, bar, and pie charts.

The line chart shows the temporal evolution of the detections of each pest over the last seven days, allowing for the identification of fluctuations in their presence. Furthermore, it includes a multi-selection filter to display charts based on one or more selected species. This tool is fundamental for comparative analyses of pest population growth. Figure 12 illustrates the components described above.

Figure 12. Line chart of pest population fluctuations.

Source: Own elaboration. Note: Temporal evolution of detections for (A) one, (B) four, and (C) nine classes of agricultural pests is shown.
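The data behind the line chart can be sketched as a per-species daily series over a seven-day window. The snippet below is an illustrative Python sketch under stated assumptions: `seven_day_series` is a hypothetical helper, the events are invented, and the window is assumed to include the current day. The `species` argument plays the role of the multi-selection filter.

```python
from collections import Counter
from datetime import date, timedelta


def seven_day_series(events, species, today):
    """Aggregate detections into one value per day for each selected species.

    events:  iterable of (day, species_name, instance_count) tuples
    species: names selected in the multi-selection filter
    today:   last day of the seven-day window (assumed inclusive)
    """
    days = [today - timedelta(days=offset) for offset in range(6, -1, -1)]
    totals = Counter()
    for day, name, count in events:
        totals[(name, day)] += count
    # Missing (species, day) pairs read as 0 without being stored.
    return {name: [totals[(name, day)] for day in days] for name in species}


# Invented sample data; the last event falls outside the window.
events = [
    (date(2025, 1, 6), "Bemisia tabaci (adult)", 2),
    (date(2025, 1, 6), "Bemisia tabaci (adult)", 1),
    (date(2025, 1, 8), "Myzus persicae (nymph)", 4),
    (date(2025, 1, 1), "Bemisia tabaci (adult)", 5),
]
series = seven_day_series(
    events,
    ["Bemisia tabaci (adult)", "Myzus persicae (nymph)"],
    today=date(2025, 1, 8),
)
```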

The bar chart displays the volume of detections per species. Tapping on one of the bars displays the pest name and the number of registered instances in the lower, adjacent section. Furthermore, it incorporates a temporal segmentation filter for visualizing data by day, week, or month. Figure 13 illustrates the components described above.

Figure 13. Bar chart of detection volume per species.

Source: Own elaboration. Note: The distribution of detections per pest is shown.

The pie chart displays the proportion of each species relative to the total. Tapping on a section of the circle displays the pest name and the corresponding percentage in the lower, adjacent section. Furthermore, it incorporates a temporal segmentation filter for visualizing data by day, week, or month. Figure 14 illustrates the components described above.

Figure 14. Pie chart of the proportion of each species relative to the total.

Source: Own elaboration. Note: The percentage of the number of detections per pest is shown.
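The bar and pie charts rest on the same aggregation: total detections per species within the selected period, and each species' share of that total. The following Python sketch illustrates the arithmetic; `species_totals`, `species_shares`, and the window lengths in `PERIOD_DAYS` (in particular a 30-day month) are assumptions made for this example, and the events are invented.

```python
from collections import Counter
from datetime import date, timedelta

# Assumed window lengths for the day / week / month filter.
PERIOD_DAYS = {"day": 1, "week": 7, "month": 30}


def species_totals(events, period, today):
    """Sum instance counts per species inside the selected period (bar chart)."""
    start = today - timedelta(days=PERIOD_DAYS[period] - 1)
    totals = Counter()
    for day, name, count in events:
        if start <= day <= today:
            totals[name] += count
    return dict(totals)


def species_shares(totals):
    """Convert per-species totals into percentages of the grand total (pie chart)."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: 100.0 * n / grand_total for name, n in totals.items()}


# Invented sample data for illustration only.
events = [
    (date(2025, 1, 8), "Bemisia tabaci (adult)", 3),
    (date(2025, 1, 5), "Myzus persicae (nymph)", 1),
    (date(2024, 12, 1), "Spodoptera frugiperda (larva)", 6),
]
week_totals = species_totals(events, "week", today=date(2025, 1, 8))
week_shares = species_shares(week_totals)
```

With these sample events, the weekly view contains three detections of Bemisia tabaci (adult) and one of Myzus persicae (nymph), so the pie chart would show 75% and 25%.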

These statistical representations provide a comprehensive overview of the population growth of entomological infestations over specific periods. Consequently, they serve as decision-making support tools for integrated pest management (IPM).

The modular design and clear interfaces of PestNeuroVision ensure the adaptability of the application to diverse agricultural environments. This system represents a technical contribution to the field of precision agriculture, where technology and fieldwork converge to optimize IPM under the criteria of efficiency and sustainability.

Discussion

The model achieved a precision of 92.4% and an mAP@50 of 91.7%, demonstrating robust detection performance across the nine target classes. These results are consistent with those of Li et al.,²⁴ who obtained a precision of 89.9% and an mAP@50 of 93.7% using a YOLOv8s-based CNN architecture to detect agricultural pests and diseases. However, the detector developed in this study showed higher precision (92.4% vs 89.9%), which corresponds to a smaller share of false positives among its detections. Furthermore, this study aligns with the findings of Dai et al.,²⁵ who reported an mAP@50 of 87.0% using an improved algorithm derived from YOLO11 for pest identification. These data validate the efficacy of the proposed solution, establishing a solid technical foundation for its use as a precision agricultural tool.
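The link between the 92.4% figure (reported as precision in the Conclusion) and false positives follows from the standard definitions of precision and recall for object detectors. These formulas are not stated in this section and are given here as the usual convention, where TP, FP, and FN denote true positives, false positives, and false negatives at the chosen IoU threshold:

```latex
\[
P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}
\]
```

For a fixed number of true positives, a higher P implies fewer false positives. A comparison of P between studies is only indicative when the underlying datasets differ.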

High confidence levels were obtained following the inferences performed on images of the nine pest classes. These results demonstrate the model’s robust capability to distinguish the diverse morphological characteristics of insects. However, greater precision variability (between 34.0% and 84.0%) was observed in Myzus persicae (nymph), which may be explained by their small body size or the presence of adjacent or overlapping individuals. These findings align with those of Yang et al.,²⁶ who, while developing a YOLO11-based pest detection architecture, concluded that the algorithm was affected by high specimen concentration and occlusion.

The heatmaps generated by Eigen-CAM showed patterns consistent with the quantitative metrics obtained in two of the three cases analyzed. In Bemisia tabaci (adult), the highest intensity was concentrated on the specimens’ bodies, a result consistent with its 96.4% precision and 92.8% mAP@50. Conversely, for Spodoptera frugiperda (larva) and Ceratitis capitata (adult), the attention was spread diffusely across the insect and the surrounding leaves. The interpretation of this dispersion differs between the two species: for Spodoptera frugiperda (larva), it is congruent with its 60.9% recall and 67.7% mAP@50–95, whereas for Ceratitis capitata (adult), it is at odds with its high mAP@50 of 99.5%. Because Eigen-CAM does not use class information or gradients to weight relevant features,²⁷ its activations can shift toward prominent textures in scenes with visually dominant backgrounds. This behavior is consistent with the limitations reported by Dusza et al.,²⁸ who stated that no CAM method provides universally reliable explanations.
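To illustrate why Eigen-CAM is class-agnostic, the sketch below computes a heatmap from a layer's activations alone, by projecting each spatial position onto the first principal direction of the activation matrix. It is a simplified pure-Python illustration, not the implementation used in this study: `eigen_cam` is a hypothetical function, the principal direction is found by power iteration on the Gram matrix instead of a full SVD, no mean-centering is applied, and non-negative (post-ReLU style) activations are assumed. No class score or gradient appears anywhere in the computation.

```python
import math


def eigen_cam(activations):
    """Class-agnostic saliency map from activations of shape C x H x W.

    activations: list of C channels, each an H x W list of non-negative floats.
    Returns an H x W map scaled to [0, 1].
    """
    channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Rows are spatial positions, columns are channels.
    rows = [
        [activations[c][i][j] for c in range(channels)]
        for i in range(height) for j in range(width)
    ]

    # Gram matrix G = X^T X; its dominant eigenvector is the first
    # right singular vector of X.
    gram = [
        [sum(r[a] * r[b] for r in rows) for b in range(channels)]
        for a in range(channels)
    ]

    # Power iteration for the dominant eigenvector.
    v = [1.0] * channels
    for _ in range(200):
        w = [sum(gram[a][b] * v[b] for b in range(channels)) for a in range(channels)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    # Project every position onto the principal direction.
    proj = [sum(r[c] * v[c] for c in range(channels)) for r in rows]
    if sum(proj) < 0:  # resolve the sign ambiguity of the eigenvector
        proj = [-p for p in proj]
    proj = [max(p, 0.0) for p in proj]
    peak = max(proj) or 1.0
    return [[proj[i * width + j] / peak for j in range(width)] for i in range(height)]


# Toy activations: both channels respond strongly at the center,
# one channel responds weakly at the top-left corner.
channel_0 = [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
channel_1 = [[0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 0.0]]
heatmap = eigen_cam([channel_0, channel_1])
```

In this toy case the map peaks wherever the activations are jointly strongest, regardless of which class the detector eventually predicts; a visually dominant background that excites many channels would attract the map in the same way.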

The main limitations of this study are the limited dataset volume and the variable instance density per image. Furthermore, model validation in real agricultural environments is required to evaluate detection accuracy outside of controlled conditions.

In future work, the detection catalog will be expanded to include new species and foliar diseases. Additionally, the data volume will be increased to improve the precision of the proposed model.

Conclusion

PestNeuroVision demonstrated the feasibility of applying CNNs and computer vision for agricultural pest detection on offline-operating mobile devices. The global metrics of the proposed model recorded a precision of 92.4%, recall of 87.7%, mAP@50 of 91.7%, and mAP@50–95 of 78.0%. Furthermore, per-image latencies of 4.8, 9.4, and 3.5 ms in preprocessing, inference, and postprocessing were obtained, respectively. These results demonstrate the robust performance of the detector across the nine entomological classes addressed.
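As a worked example of what these latencies imply, the short Python snippet below sums the three stages and derives an approximate throughput. It assumes the stages run sequentially for each image and ignores any other overhead (image loading, rendering), neither of which is specified here; the variable names are illustrative.

```python
# Per-image latencies reported above, in milliseconds.
STAGE_LATENCY_MS = {
    "preprocessing": 4.8,
    "inference": 9.4,
    "postprocessing": 3.5,
}

# Assumption: stages execute one after another for each image.
total_ms = sum(STAGE_LATENCY_MS.values())
images_per_second = 1000.0 / total_ms

print(f"{total_ms:.1f} ms per image, about {images_per_second:.1f} images/s")
```

Under these assumptions the pipeline takes about 17.7 ms per image, or roughly 56 images per second.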

Likewise, PestNeuroVision integrates these capabilities into three main modules: pest detection in images from the device’s gallery or camera, detection history management, and statistical visualization of population fluctuations. This solution enables the automation of the phytosanitary monitoring workflow in the field without connectivity dependence, establishing itself as an operational tool for IPM.

However, this study has certain limitations, such as the reduced dataset volume obtained from public repositories, which is restricted to 100 samples per class, and the variable instance density per image. These factors may affect the generalization capability of the model in uncontrolled environments. Future research should address these challenges by incorporating data captured directly in the field, expanding the number of pest classes, validating the results with farmers under real-world conditions, and evaluating the accuracy and performance across a broader diversity of Android devices.

Finally, PestNeuroVision demonstrates that the implementation of CNNs and computer vision on Android mobile devices constitutes a viable and operational solution for automating entomological monitoring. Its offline capability ensures execution in agricultural areas where connectivity is unstable or non-existent. In this sense, this proposal represents a technical contribution to precision agriculture.