Research Article

Convolutional neural networks for real-time wood plank detection and defect segmentation

[version 1; peer review: 1 approved with reservations]
PUBLISHED 23 Mar 2023

Abstract

Background: Defect detection and segmentation on product surfaces has become one of the most important steps in industrial quality control, and many sophisticated hardware and software tools are used for this purpose. Real-time classification and detection of defects is now a crucial requirement, yet most algorithms and deep neural network architectures require expensive hardware to perform inference in real time. This necessitates the design of architectures that are lightweight and suitable for deployment in industrial environments.
Methods: In this study, we introduce a novel method for detecting wood planks on a fast-moving conveyor and segmenting their surface defects in real time using a convolutional neural network (CNN). A backbone network is trained on a large-scale image dataset, and a dataset of 5000 images is created with proper annotation of wood planks and defects. In addition, a data augmentation technique is employed to enhance the accuracy of the model. Furthermore, we examine both statistical and deep learning-based approaches for identifying and separating defects.
Results: Our plank detection method achieved a mean average precision of 97%, and defect segmentation reached a global pixel accuracy of 96%. This performance is achieved in real time: the system runs at 30 frames per second (FPS) without sacrificing accuracy.
Conclusions: The results of our study demonstrate the potential of our method not only in industrial wood processing applications but also in other industries where materials undergo similar processes of defect detection and segmentation. By utilizing our method, these industries can expect to see improved efficiency, accuracy, and overall productivity.

Keywords

Artificial Intelligence, Convolutional Neural Networks, Data Augmentation, Defect Detection, Deep Learning, Neural Networks.

Introduction

Defect detection and segmentation are crucial steps for quality control in automated production lines in various industries such as wood, textiles, and medicine.13 These processes involve the use of machine-learning algorithms to identify and classify defects, which can be surface imperfections, structural issues, or other abnormalities that affect product quality. Although it is easy for humans to detect and recognize defects on product surfaces, machines are not always accurate at performing this task. Therefore, industries require automatic defect detection systems in their quality control processes, ensuring that only high-quality products are released for sale. These systems take input images and produce segmentation of areas containing defects.4 The location of the defect is crucial in these applications, and real-time processing is important in industrial environments.

There have been significant developments in deep learning-based defect detection in recent years, with attempts to create generic datasets5 that can be used for all types of defect detection systems. However, every application requires datasets from a specific industry domain, such as wood or metal.1,6 If a model is trained on a specific defect dataset, it may not produce equally accurate results on other defect datasets, as the product surface may have a different background color or different types of defects.7 Methods8 have been developed that are trained on different datasets and use knowledge transfer to perform defect detection on other datasets; however, generic defect detection does not seem to work well for all types of defects.

Our paper introduces a new approach for the automated detection and segmentation of defects on rapidly moving wood plank surfaces. Our method first detects the wood plank itself, and the extracted plank image is then passed to another module, where defect segmentation is performed in real time. The input frame is divided into two parts, each with a region of interest (ROI). When a plank enters the first ROI and its front side is captured, it is assigned an identification number (ID). The machine then flips the plank, and it enters the second ROI, where the other side is captured under the same ID. Defects are detected and segmented on both sides, and the final classification of the plank is determined based on the severity of the detected defects. We classify wood planks into six categories based on their quality, with 1 indicating the highest quality and 6 the lowest. Table 3 shows the defect types and the corresponding severity of the defects. Our approach uses both sides of the wood plank to achieve better results than previous research, which typically used only one side of the wood surface or employed multiple cameras, resulting in increased hardware costs.9,10 Figure 1 provides a visual representation of the industrial environment in which a wood plank moves on a conveyor, making it easy to understand how our approach is implemented.


Figure 1. Industrial scenario: A wood plank is being conveyed on a conveyor belt.
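Below is a simplified, Python-style sketch of the two-ROI workflow just described. The ROI coordinates, the detector, segmenter and tracker interfaces, and the severity table are hypothetical placeholders introduced only for illustration; this is not the authors' implementation.

```python
# Hypothetical sketch of the two-ROI workflow; all interfaces are placeholders.
def process_frame(frame, detector, segmenter, tracker, rois, severity_of):
    grades = {}
    for side, roi in enumerate(rois):                  # ROI 0: front, ROI 1: back
        crop = frame[roi.top:roi.bottom, roi.left:roi.right]
        for box in detector(crop):                     # planks found in this ROI
            plank_id = tracker.assign_id(box, side)    # same ID for both sides
            plank = crop[box.top:box.bottom, box.left:box.right]
            defects = segmenter(plank)                 # per-pixel defect classes
            worst = max((severity_of[d] for d in defects.classes), default=0)
            # final grade = highest severity seen on either side of the plank
            grades[plank_id] = max(grades.get(plank_id, 0), worst)
    return grades                                      # plank ID -> class 1 (best) ... 6 (worst)
```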

Our contribution in this paper is threefold:

  • We propose a novel CNN-based approach that can accurately detect wood planks in real-time and perform segmentation of any surface defects present on the planks.

  • We created a dataset of 5000 labeled images for wood defect segmentation.

  • We trained our novel model, deployed it on an edge device, and optimized it for use in industrial settings.

In the section “Literature review”, we review the most recent methods for defect detection and segmentation, with a focus on the wood industry. In the section “Methods”, we describe our novel method, and in the section “Dataset”, we explain the details of our dataset. The training of our model is described in the section “Training”, and we evaluate our method in the section “Evaluation”. Finally, we conclude the paper and sketch future research ideas in the section “Conclusion and future work”.

Literature review

Defect detection and segmentation

The objective of defect detection and segmentation is to automatically identify defect patterns on product surfaces during quality control. This is an important task in industrial quality control and manufacturing, and numerous methods have been developed to address it. These include k-means clustering,11 active contours,12 region growing,13 and graph cuts,14 as well as deep learning techniques such as CNNs,15 encoder-decoder models,16 R-CNNs,17 recurrent neural network models,18 and generative adversarial network models.19 However, most of these approaches operate on a single image and are executed offline, and they are not efficient enough for real-time industrial applications, where efficiency is a key consideration.

Traditional methods

Several techniques based on statistical methods have been suggested for identifying defects on various materials and surfaces. For example, the method described in20 employed fuzzy connected components to identify defects on strip steel surfaces by calculating the maximum and sum of fuzzy connected areas, resulting in a detection rate of 96.8%. In,10 an artificial vision-based system for evaluating the quality of slate slabs is presented, using 3D and 2D color data that are processed and analyzed to detect six specific traits. An unsupervised approach to identifying defects in images was proposed in.21 The method focuses on surface texture and utilizes low-rank representation combined with a texture prior. However, its effectiveness is partially contingent on the quality of the prior map, and it assumes that the defects are in the foreground, implying that if the background is more prominent than the defect, the method may not be able to detect it. The method in22 employs random decision forests for defect detection, combining feature extraction and classification techniques to detect defects in fabrics. The advantage of random decision forests is their ability to handle both continuous and discrete variables, resist overfitting as a classifier, and perform efficiently on large datasets. In,23 traditional classification methods such as local binary patterns and gradient local binary patterns were used to classify defects on the surface of birch veneer. However, the proposed method classifies only two types of defects, cracks and mineral lines, and fails to classify other forms of defects, which is a key limitation. To determine the location of defects in an image, the authors of24 proposed the use of gradient local binary patterns. This method leverages the non-continuity of pixels within a local area, reducing the potential area in which a defect can exist, improving accuracy, and saving time for further defect detection. The detection of defects on complex patterned surfaces, such as fabrics, has also been explored using traditional statistical methods.25

Deep learning based methods

In addition to traditional methods, deep learning-based techniques have been used to address the problem of defect detection. Many deep learning models were originally trained to detect a variety of objects, such as people, animals, and cars, in real-world scenes.26 These models are typically trained on large-scale image datasets, such as Microsoft Common Objects in Context (MS-COCO)27 and the ImageNet Large Scale Visual Recognition Challenge (ILSVRC).28 For defect detection and other industrial quality assurance tasks, these pre-trained models can be re-trained using transfer learning to adapt them to the target task of defect detection and classification.26 Many recent methods for defect detection and classification in industrial settings use transfer learning and train CNN-based deep learning models, such as ResNet,29 RetinaNet,30 AlexNet,31 DenseNet,32 VGG1633 and GoogleNet,34 to detect, classify, and segment defects. The results show that CNN-based deep learning methods significantly improve the prediction accuracy of defect detection compared with traditional methods.

A wood knot detection and classification method based on a residual network, called TL-ResNet34,1 was proposed. The authors report that TL-ResNet34 is far more accurate than other methods for detection and classification. A weakly supervised CNN-based method for detection and classification was proposed in.35 The model was trained using a small number of labeled images. One limitation of this method is underfitting, in which the model fails to detect and classify different types of defects. An automatic visual inspection system9 has been proposed that detects and classifies defects on wood surfaces; its main contribution is the speed optimization of the defect identification task. The results showed that data augmentation and transfer learning techniques can be combined to achieve good results, with a pre-trained ResNet152 neural network model achieving an average accuracy of 80.6%.

A deep regression and classification-based framework for defect detection was developed in,36 which has four modules: detection, false positive reduction, connected component analysis, and classification. The proposed method has good accuracy, but it is too computationally intensive even for small input images and is thus limited to offline use. Another method37 combines neural architecture search with one of the best-known instance segmentation methods, Mask R-CNN,38 for the detection and segmentation of defects on the surfaces of wood veneers. For detecting defects on wood surfaces, this method is more accurate and faster than other available techniques; however, the segmentation task requires a significant amount of time, making it unsuitable for real-time industrial inspection systems. Most of the proposed methods focus only on accuracy, and very few have improved efficiency. An improved single shot detector39 based method was proposed to improve detection efficiency, but the trained model detected only a few types of defects and was therefore limited. A mixed-FCN method was proposed in,15 an improved fully convolutional network (FCN) for the detection and recognition of wood defects that outperforms existing methods while requiring little or no image preprocessing for feature extraction; however, the model was trained to identify only six types of wood defects. An FCN and region-based convolutional neural network (R-CNN) method40 was proposed to detect and segment building cracks. Like other detection and segmentation methods, it is limited by its low performance in real-time applications. An improved CNN-based method for weld classification was proposed in.41 This method uses image convolution to enhance edge features and combines them with integral images to create a more accurate segmentation. The algorithm can extract the weld edge and divide the region quickly and accurately while keeping the processing time within real-time requirements.

In,42 a CNN-based method named U-Net was modified by replacing its softmax layer with a random forest to detect small surface defects with high accuracy. This method is slow and limited to an offline setup. A CNN-based method was proposed in43 to segment defects on standing trees using LIDAR (light detection and ranging) data. The input to this method is point cloud data, from which a mesh is reconstructed; the reconstructed mesh is then used to create a relief map, which is taken as input to the U-Net for segmentation. This method is computationally expensive and suitable only for offline applications. To reduce the computation time of CNNs, a method was proposed in44 that uses a non-subsampled shearlet transform (NSST) to preprocess images before passing them to the CNN for detection and classification. This method has the advantage of faster training; however, its inference speed is not suitable for industrial applications.

In addition to CNN-based deep learning methods, auto-encoders and generative adversarial networks have also been used for defect detection. A dual auto-encoder generative adversarial network (GAN) method45 was proposed for defect detection in different types of products. The GAN has the benefit of generating a large amount of data that can be used for training, which makes the model more accurate at predicting defects in unseen data.

In the literature, most researchers have used classical image processing methods or deep learning methods to extract features to detect and classify defect locations. Most of these methods are offline and unsuitable for real-time industrial use. These methods have many limitations, such as inference speed and detection accuracy. In all these methods, the focus is only on detecting and classifying defects on the plank/wood surface, and none have focused on detecting the wood plank itself. Therefore, we propose a novel CNN method that can detect and classify wood planks and then detect, classify, and segment defects on wood plank surfaces, which can be deployed in industrial environments and outperforms all these methods.

Methods

Automatically detecting wood planks and separating them from the background on a fast-moving conveyor in real time is a challenging task, and segmenting defects on the surface of wood planks in real time adds further complexity. We propose a novel method consisting of a backbone network that performs feature extraction, a detection algorithm for wood plank detection, and a segmentation module that performs defect segmentation in real time. Finally, each plank is classified into one of six categories, from level 1 to 6, depending on the severity of the defects on its surface. Figure 2 shows the architecture of the proposed method. Details of these networks and modules are described in the following sub-sections.


Figure 2. The overall system architecture.

Backbone network

High prediction performance in CNN training requires a substantial amount of annotated data, but acquiring such a large quantity of data can be challenging and expensive, especially for image labeling.46 To address this issue, transfer learning is often used when only a limited amount of labeled data is available, and it has proven to be an effective solution. In this approach, the backbone network is first trained on a large dataset.

The backbone network, also known as the baseline network, is responsible for extracting features from input images. There are several state-of-the-art deep CNNs, such as VGG,33 GoogleNet,34 AlexNet,31 and ResNet,29 that can be used as feature extractors or backbone networks for object detection. These networks are known for their accuracy, but they may not be as efficient in terms of inference speed.

We use MobileNetV347 as the backbone network for feature extraction. The network employs depthwise separable convolutions instead of traditional convolutions to reduce computational complexity. A standard convolution mixes spatial and channel information in a single operation and requires a large number of multiplications. By contrast, a depthwise separable convolution splits the standard convolution into two separate operations: a depthwise convolution that applies a filter to each input channel and a 1x1 pointwise convolution that combines the outputs of the depthwise step. This approach results in a smaller model size, fewer parameters, and faster computation.
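The following minimal PyTorch sketch contrasts a standard convolution with a depthwise separable one; the channel counts and input size are illustrative only.

```python
import torch
import torch.nn as nn

in_ch, out_ch = 40, 112  # illustrative channel counts

# Standard 3x3 convolution: mixes space and channels in one operation.
standard = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

# Depthwise separable convolution: per-channel 3x3 depthwise convolution
# followed by a 1x1 pointwise convolution that mixes the channels.
separable = nn.Sequential(
    nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch),  # depthwise
    nn.Conv2d(in_ch, out_ch, kernel_size=1),                          # pointwise
)

params = lambda m: sum(p.numel() for p in m.parameters())
x = torch.randn(1, in_ch, 64, 64)
assert standard(x).shape == separable(x).shape
print(params(standard), "vs", params(separable))  # ~40k vs ~5k parameters
```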

To improve the classification accuracy, we pre-trained MobileNetV3 on the large-scale MS-COCO image dataset.27 The training and validation subsets of the COCO 2019 object detection dataset, comprising over 200,000 images, were downloaded; the dataset is freely available on the COCO website. The last three layers, together with the fully connected layers, are then fine-tuned on the wood plank and defect dataset. Figure 3 shows the MobileNetV3 block, which contains an input, an expansion convolution, depthwise convolutions, a projection layer, and an output layer. A residual connection is established if the input and output have the same number of channels.


Figure 3. MobileNetV3 building block (image from47).

Wood plank detection

We use the Faster R-CNN48 network with a MobileNetV3 backbone to detect wood planks. The detector extracts features from different layers of the pre-trained backbone network and sends them to a region proposal network (RPN) and a region of interest (ROI) pooling module. The RPN is responsible for identifying the locations of potential wood planks in the input image, whereas the ROI pooling module extracts fixed-size window features and passes them to the final two fully connected layers for class and bounding box predictions. The network takes an input image of size H×W with three color channels. The first layer is a 3x3 convolutional layer with a stride of two, which reduces the spatial dimensions of the image by half and outputs 40 feature maps. Hardswish49 (Equation 1) is used as the nonlinear activation function. The subsequent layers are a series of mobile inverted residual bottleneck (MBConv) blocks. Each block consists of a depthwise convolutional layer followed by a pointwise convolutional layer, with skip connections between the input and output of the block. The MBConv2 block has a stride of two, while the rest have a stride of one. The dilation for all convolutional layers is one, meaning that no dilation is applied. After the MBConv blocks, a feature pyramid network (FPN) is used to generate feature maps of different spatial resolutions. The FPN addresses the problem of scale variance in plank detection, where planks of different sizes may require different spatial resolutions for detection. The FPN in this network produces 256 feature maps with a spatial resolution of H/16×W/16. The RPN generates object proposals at different locations and scales based on the feature maps from the FPN. The RoIAlign layer then extracts features from each proposal and feeds them into two parallel fully connected layers: a classification layer (Cls) that outputs a probability distribution over the classes (including background) for each proposal, and a bounding box regression layer (BBx) that predicts the offsets to the default bounding boxes for each proposal.

The final output of the network is a set of class probabilities and bounding box offsets for each anchor box, which is used to generate the final plank detections. The configuration details of the detection module are shown in Table 1.

Table 1. Details of the layers in the detection module with a MobileNetV3 backbone.

Layer | Output size | Kernel/stride/padding | Dilation | Activation
Input | H×W×3 | - | - | -
Convolution 1 | H/2×W/2×40 | 3×3/2 | 1 | Hardswish
Bottleneck 1 | H/2×W/2×24 | 3×3/1 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 2 | H/4×W/4×40 | 3×3/2 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 3 | H/4×W/4×40 | 3×3/1 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 4 | H/4×W/4×40 | 3×3/1 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 5 | H/8×W/8×112 | 3×3/2 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 6 | H/8×W/8×112 | 3×3/1 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 7 | H/8×W/8×160 | 3×3/1 | 1 | Hardswish
Feature pyramid network | H/16×W/16×256 | - | - | -
Region proposal network | - | - | - | -
Region of interest align | - | - | - | -
Classification layer | N_anchors×(C+1) | - | - | Softmax
Bounding box regressor | N_anchors×4 | - | - | Linear
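For illustration, a close off-the-shelf analogue of this detector is available in torchvision, which ships a Faster R-CNN with a MobileNetV3-Large + FPN backbone pre-trained on MS-COCO. The sketch below (assuming torchvision ≥ 0.13) re-heads it for two classes, background and plank; it approximates, but is not identical to, the configuration in Table 1.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# COCO-pretrained Faster R-CNN with a MobileNetV3-Large + FPN backbone.
model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn(weights="DEFAULT")

# Replace the box predictor head for the plank task: background + plank.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

model.eval()
with torch.no_grad():
    # Detection models take a list of 3xHxW tensors and return, per image,
    # a dict with 'boxes', 'labels' and 'scores'.
    detections = model([torch.rand(3, 480, 640)])
```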

Defect segmentation

In addition to wood plank detection, defect segmentation involves identifying and separating each pixel of the wood plank that belongs to a defect. Each defect type is labeled with a specific color. To extract features at multiple scales and to reduce the computational complexity of the model at the inference stage, we use atrous convolution operations and the atrous spatial pyramid pooling (ASPP) module in the segmentation part. Atrous convolution is a type of convolution in which the filter kernel is dilated with zeros before being applied to the input signal. This gives the operation a larger receptive field without increasing the number of parameters. The ASPP module is a multi-scale pooling operation that utilizes atrous convolutions at different dilation rates to capture information at multiple scales; it is used in semantic segmentation to improve the network's ability to capture objects of varying sizes. The ASPP module consists of parallel atrous convolution branches with different dilation rates, followed by global average pooling, which aggregates information across all spatial positions. The output of each branch is then concatenated to form the final feature representation.
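A minimal PyTorch sketch of such an ASPP block is shown below. The dilation rates follow Table 2 (1, 2, 3, 6), but the channel widths and the exact branch composition are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Parallel atrous branches at several dilation rates plus a global
    average pooling branch, concatenated and projected back down."""
    def __init__(self, in_ch, out_ch, rates=(1, 2, 3, 6)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch,
                      kernel_size=1 if r == 1 else 3,
                      padding=0 if r == 1 else r,
                      dilation=r, bias=False)
            for r in rates
        ])
        # Image-level branch: global average pooling + 1x1 convolution.
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.pool(x), size=(h, w),
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))

aspp = ASPP(in_ch=160, out_ch=256)
out = aspp(torch.randn(1, 160, 32, 32))   # same spatial size, 256 channels
```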

The first layer of the network is a 3x3 convolutional layer with a stride of two, which reduces the size of the image and increases the number of channels; this layer helps to extract features from the input image. The MobileNetV3 blocks are then used to extract further features. These blocks use depthwise convolutions to reduce the number of parameters in the network while maintaining good performance. Hardswish49 (Equation 1) is used as the activation function. By using MobileNetV3 blocks, the network is able to learn features that are specifically relevant for identifying defects in images. After the backbone, the ASPP module is used to capture information at multiple scales. This module applies atrous convolutions with different dilation rates to the feature map in order to capture information at different scales, which helps the network identify defects of different sizes and shapes. The decoder module is then used to upsample the feature map back to the original resolution of the input image, creating a segmentation map that matches the size and shape of the input. Finally, the logits layer outputs a probability distribution over the classes of interest, namely the defect and non-defect regions of the image. The Softmax activation function50 then creates a probability distribution that identifies the locations of defects in the image. Table 2 shows the complete configuration of the segmentation module along with the backbone network.

(1)
\mathrm{HardSwish}(x) = \begin{cases} 0 & \text{if } x \le -3 \\ x & \text{if } x \ge 3 \\ x(x+3)/6 & \text{otherwise} \end{cases}

Table 2. Details of the layers in the segmentation module with a MobileNetV3 backbone.

Layer | Output size | Kernel/stride/padding | Dilation | Activation
Input | H×W×3 | - | - | -
Convolution | H/2×W/2×40 | 3×3/2 | 1 | Hardswish
Bottleneck 1 | H/2×W/2×24 | 3×3/1 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 2 | H/4×W/4×40 | 3×3/2 | 1 | Hardswish
Mobile inverted residual bottleneck convolution 3 | H/4×W/4×40 | 3×3/1 | 2 | Hardswish
Mobile inverted residual bottleneck convolution 4 | H/8×W/8×40 | 3×3/2 | 2 | Hardswish
Mobile inverted residual bottleneck convolution 5 | H/8×W/8×40 | 3×3/1 | 4 | Hardswish
Mobile inverted residual bottleneck convolution 6 | H/8×W/8×40 | 3×3/1 | 4 | Hardswish
Mobile inverted residual bottleneck convolution 7 | H/8×W/8×112 | 3×3/1 | 4 | Hardswish
Mobile inverted residual bottleneck convolution 8 | H/8×W/8×112 | 3×3/1 | 8 | Hardswish
Mobile inverted residual bottleneck convolution 9 | H/8×W/8×160 | 3×3/1 | 8 | Hardswish
Atrous spatial pyramid pooling | H/8×W/8×960 | - | 1, 2, 3, 6 | -
Decoder | H/4×W/4×40 | - | - | -
Logits | H×W×C | - | - | Softmax
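As with the detector, torchvision provides a close off-the-shelf analogue of this segmentation module, DeepLabV3 with a MobileNetV3-Large backbone and an ASPP head. The sketch below uses it purely for illustration; it is not the exact architecture of Table 2, and the choice of seven output classes (six defect types plus background) is an assumption.

```python
import torch
import torchvision

# MobileNetV3-Large backbone + ASPP head (DeepLabV3), assuming torchvision >= 0.13.
model = torchvision.models.segmentation.deeplabv3_mobilenet_v3_large(
    weights_backbone="DEFAULT", num_classes=7)

model.eval()
with torch.no_grad():
    logits = model(torch.rand(1, 3, 256, 512))["out"]  # shape: (1, 7, 256, 512)

probs = logits.softmax(dim=1)   # per-pixel class probabilities
mask = probs.argmax(dim=1)      # per-pixel defect label map
```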

Dataset

Our dataset consists of labeled images of planks and defects. For a period of one month, a machine vision camera was installed at a wood-processing facility in Finland to capture images. The planks move very fast on the conveyor; therefore, we used multi-threading and hardware acceleration to handle the high frame rates and capture sharp images. A total of 5000 images were collected with this equipment. The images were visually inspected and discarded if they were blurry or contained no plank. The dataset contains six classes of defects (Table 3) for defect segmentation and two classes for plank detection (plank or background), and it is split into training, validation, and testing subsets (Table 4). To segment the defects accurately and reduce the number of false positives, we labeled the wood planks themselves and created a dataset for the detection method. We then created a dataset for the defect segmentation model, which takes the extracted plank image as input and draws a segmentation mask for each defect. We are currently unable to share the full dataset; therefore, an example dataset has been created so that it is possible to test the proposed methods.51

Table 3. Defect types, severity and properties.

Defect type | Severity | Properties
Knot | 1 | Small defects with the least severity.
Stain | 2 | Medium-size defects of variable size.
Branch | 3 | Medium-size defect; can be visible on both sides of the plank.
Lines | 4 | Narrow and long defect.
Edge | 5 | Always appears on the edges of the plank.
Area | 6 | The most severe defect; covers a large area of the plank.

Table 4. The quantity of images in each subset of the dataset for various types of defects.

Defect type | Training | Validation | Testing
Knot | 1580 | 316 | 237
Branch | 1200 | 240 | 180
Area | 1270 | 254 | 190
Lines | 1340 | 268 | 201
Edge | 1150 | 230 | 172
Stain | 1350 | 270 | 202

Defect classes

We categorized each plank according to different defect levels. A plank can contain multiple defects of different classes, and each defect type is assigned a severity level. The final classification of a plank is determined by the highest severity among its detected defects. Figure 4 shows the different defect types, and Table 3 lists the defect types, their severity, and their properties.
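As a concrete illustration of this grading rule, the snippet below maps the defect types detected on a plank to its final class using the severities from Table 3; returning 0 for a defect-free plank is an assumption, as the text does not specify that case.

```python
# Severities from Table 3.
SEVERITY = {"knot": 1, "stain": 2, "branch": 3, "lines": 4, "edge": 5, "area": 6}

def plank_class(detected_defects):
    """detected_defects: iterable of defect type names found on both sides."""
    severities = [SEVERITY[d] for d in detected_defects]
    return max(severities) if severities else 0   # 0 = no defect detected (assumed)

print(plank_class(["knot", "stain", "edge"]))      # -> 5
```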


Figure 4. Defect types: from left to right (1) Area (2) Branch (3) Edge (4) Knots (5) Line (6) Stain.

Data annotation and labeling

We used the CVAT (Computer Vision Annotation Tool)52 to label the images in our dataset. This open-source tool was deployed on an Amazon Web Services53 virtual machine. The complete dataset was uploaded to storage and then loaded into CVAT. Each frame was labeled individually by drawing a rectangle around the plank for the wood plank dataset and a polygon around the defects for the defect segmentation dataset, and assigning a class name. After annotating the training set, the JSON (JavaScript Object Notation) annotation labels and images were downloaded for model training. The dataset is then passed through an augmentation process.

Data augmentation

To make the model generalize better at inference time, we used data augmentation, which included multiple image operations such as random flip, shear, translation, and rotation. The PyTorch54 library has built-in functions to perform these operations. After this process, the dataset contained an increased number of images; Table 4 shows the number of images in each subset after the data augmentation process. The model is then trained on the enlarged training set.
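One way to realize these operations with torchvision transforms is sketched below. The specific ranges for rotation, translation, and shear are assumptions (the paper does not report them), and for detection and segmentation the same geometric transform must also be applied to the boxes or masks, which is omitted here for brevity.

```python
from torchvision import transforms

# Illustrative augmentation pipeline: random flip, rotation, translation, shear.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomAffine(degrees=10,            # rotation range (assumed)
                            translate=(0.1, 0.1),  # translation range (assumed)
                            shear=5),              # shear range (assumed)
    transforms.ToTensor(),
])
```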

Training

We used the PyTorch54 (RRID:SCR_018536) distributed data-parallel training module to train the model on multiple graphical processing units (GPUs). The model was trained on two NVIDIA Quadro RTX 8000 48GB GPUs. The initial learning rate was set to 0.1, with a batch size of 128 images per GPU, a learning-rate decay of 0.01 every 5 epochs, and RMSprop as the optimizer with eps=0.0316. We initially configured the number of epochs to 300; however, after observing the loss on the validation set, we implemented early stopping, and training ended after 200 epochs.
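The optimizer and schedule could be configured as follows; interpreting the decay of 0.01 every 5 epochs as a StepLR with gamma=0.01 is an assumption, and `model` stands for whichever network is being trained, wrapped in DistributedDataParallel for the two-GPU setup.

```python
import torch

# Hyperparameters as reported above; `model` is a placeholder for the network.
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.1, eps=0.0316)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.01)

# scheduler.step() is called once per epoch after the training loop;
# early stopping on the validation loss ended training at ~200 of 300 epochs.
```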

Quantization and pruning

Network quantization is a method of reducing the number of bits per weight of the network, and some hardware supports faster inference with a quantized network. We deployed our model on an Nvidia Jetson Xavier device for real-time inference; the device supports quantized models by default. We tested the model with 8-bit integer (INT8) precision. The number of parameters was reduced by 40%, and the inference speed increased by approximately 3x. Quantization-aware training was performed by refining the non-quantized model pre-trained on the MS-COCO dataset, with activations and weights quantized to lower precision, preserving the overall accuracy of the model while making it faster for real-time inference. To make the model more efficient, we also pruned it to remove redundant elements in the network; this did not affect the accuracy of detection and segmentation in our results.
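An illustrative PyTorch recipe combining magnitude pruning with quantization-aware training is shown below. The actual deployment path on the Jetson (e.g. TensorRT) is not described in the text, the 40% pruning amount simply mirrors the reported parameter reduction, and eager-mode quantization normally also requires QuantStub/DeQuantStub wrappers around the model, so treat this strictly as a sketch.

```python
import torch
import torch.nn.utils.prune as prune
from torch.ao.quantization import get_default_qat_qconfig, prepare_qat, convert

# 1) Prune 40% of the weights in every convolutional layer by L1 magnitude.
for module in model.modules():                       # `model` is a placeholder
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.4)
        prune.remove(module, "weight")               # bake the zeros in permanently

# 2) Quantization-aware training: insert fake-quantization observers,
#    fine-tune, then convert to an INT8 model.
model.train()
model.qconfig = get_default_qat_qconfig("fbgemm")
qat_model = prepare_qat(model)
# ... fine-tune qat_model on the wood-plank dataset for a few epochs ...
int8_model = convert(qat_model.eval())
```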

Evaluation

To evaluate the performance of the model for plank detection, we used the mean average precision (mAP) metric (Equation 2) on the plank test set; for defect segmentation, we used the mean intersection over union (mIoU) and global pixel accuracy on the defect test set. mIoU evaluates the similarity between the predicted segmentation mask and the ground truth segmentation mask, while global pixel accuracy evaluates the overall performance of a semantic segmentation model by measuring the proportion of pixels in an image that are correctly classified. Given a predicted segmentation mask and the ground truth segmentation mask for an image, the global pixel accuracy is the ratio of correctly classified pixels to the total number of pixels in the image. The plank detection mAP was 97% and the defect segmentation mIoU was 76%, with a global pixel accuracy of 96%.

(2)
\mathrm{mAP} = \frac{1}{|\mathrm{classes}|}\sum_{c \,\in\, \mathrm{classes}} \frac{TP_c}{FP_c + TP_c}
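The segmentation metrics above can be computed from a per-pixel confusion matrix, and Equation 2 is an average of per-class precision; a small NumPy sketch (using random masks as stand-in data) is given below.

```python
import numpy as np

def pixel_confusion(gt, pred, num_classes):
    """Per-pixel confusion matrix: rows = ground truth, columns = prediction."""
    idx = num_classes * gt.reshape(-1) + pred.reshape(-1)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def global_pixel_accuracy(cm):
    return np.diag(cm).sum() / cm.sum()               # correct pixels / all pixels

def mean_iou(cm):
    tp = np.diag(cm)
    union = cm.sum(0) + cm.sum(1) - tp
    return np.nanmean(tp / np.maximum(union, 1))      # averaged over classes

def map_equation_2(tp, fp):
    """Equation 2: mean over classes of TP_c / (FP_c + TP_c)."""
    return np.mean(tp / (fp + tp))

gt = np.random.randint(0, 7, (256, 512))              # stand-in ground truth mask
pred = np.random.randint(0, 7, (256, 512))            # stand-in predicted mask
cm = pixel_confusion(gt, pred, num_classes=7)
print(global_pixel_accuracy(cm), mean_iou(cm))
```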

Results and discussion

The network takes an input image and utilizes a region proposal network (RPN) to identify possible regions of interest, in this case regions containing wood planks, which are then classified and their bounding box coordinates refined. A classifier and a bounding box regressor are then used to further refine the identified planks and classify them into their respective classes. The output of this process is a set of detected wood planks with class labels and bounding box coordinates. The bounding box surrounding each plank is then cropped from the output image and passed to the segmentation module. The segmentation module receives an image containing a wood plank as input and focuses on identifying and isolating any defects present on the plank.

After processing the input image, the segmentation module produces an image in which each defect is highlighted with a segmentation mask and labeled with its respective class color. Using this information, each wood plank is then classified according to the categories listed in Table 3.

Figure 5 shows an input image with two defined ROIs. Figure 6 and Figure 7 show the outputs from the detection and segmentation modules, respectively. The final output after plank classification is shown in Figure 8.


Figure 5. Input image with two regions of interest highlighted in green.


Figure 6. Detection module output from second ROI.


Figure 7. Output from segmentation module.


Figure 8. Final output showing the plank ID on the top left, the severity or final classification, and defect segmentation with unique colors and labels.

Deployment and testing

To evaluate the real-time performance of the trained model, we deployed it in a wood-processing facility where wood planks move rapidly on a conveyor. We used the Nvidia Jetson AGX Xavier as the processing unit for real-time video encoding, model inference, decoding, and producing output to an external monitor or a remote sink. For video capture, we used an Allied Vision camera system with an adjustable lens. The capture frame rate was tested with settings ranging from 15 to 50 FPS; the ideal frame rate for capturing the fast-moving wood planks was 25-30 FPS, which avoids blurriness and produces sharp frames. The model performed very well without any latency issues at 30 FPS and carried out real-time wood plank detection, defect segmentation, and final plank classification.
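A rough throughput check of the kind used to verify such real-time operation is sketched below; `model` is a placeholder for the deployed module (torchvision detection models expect a list of images rather than a batched tensor), and the resolution, warm-up and repeat counts are arbitrary choices.

```python
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.eval().to(device)                      # `model` is a placeholder
frame = torch.rand(1, 3, 480, 640, device=device)    # dummy input frame

with torch.no_grad():
    for _ in range(10):                              # warm-up iterations
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.time()
    for _ in range(100):
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()

print("FPS:", 100 / (time.time() - start))
```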

The hardware specifications for the industrial setup are shown in Table 5.

Table 5. Hardware specifications.

Name | Technical specification
CPU | 8-core NVIDIA Carmel Armv8.2 64-bit CPU
GPU | 512 NVIDIA CUDA cores
Memory | 64GB 256-bit LPDDR
Storage | 32GB eMMC 5.1
Camera | 1800 U-511c Sony IMX547 color 1/1.8" 2472x2064
Lens | Allied Vision C-6-F2.8-6MP-T1-1.8

Conclusion and future work

In this study, we proposed a novel method for the automatic detection of wood planks and the segmentation of their defects. Our method employs several techniques to improve the accuracy of the results. First, data augmentation was used to increase the number of images in the dataset; this involved applying image operations such as random flip, shear, translation, and rotation to each image. Second, we utilized transfer learning to improve the ability of our model to detect planks and segment defects on plank surfaces. The results showed that these techniques were effective in achieving high accuracy in the detection and segmentation tasks: a mean average precision of 97% and a global pixel accuracy of 96% were achieved. Moreover, our method demonstrated real-time performance at 30 frames per second, making it suitable for industrial wood processing applications. These results also suggest that our method has the potential for application to other industrially processed materials.

As future work, further research could investigate the transferability of our method to other materials and industries, which requires creating new, properly annotated datasets and training the model on them. Moreover, our approach is specific to detecting and segmenting surface defects; future work could explore the detection and segmentation of other types of defects, such as internal defects. Future research could also focus on the robustness and adaptability of our approach to changes in lighting conditions, camera angles, and other environmental factors; one potential approach is to incorporate online learning techniques, which would allow the model to adapt to environmental changes over time. Finally, our method relies on a convolutional neural network for detecting and segmenting surface defects. Although this has proven effective, other deep learning architectures could be explored in future work: for example, recurrent neural networks could be used to analyze temporal data from the conveyor belt, and attention mechanisms could be employed to selectively focus on regions of an image that are more likely to contain defects.
