bwimage: A package to describe image patterns in natural structures

Currently R is the most popular software for data analyses among biologists. Here, we present bwimage, a package designed to describe patterns from black and white images. The package can be used for a wide range of applications. We implemented functions previously described in the literature to calculate parameters designed originally, but not exclusively, for vegetation structures. Additionally, we propose a new parameter: the aggregation index. We demonstrate applications for field work, providing examples that range from calculation of canopy openness, description of patterns in vertical vegetation structure, to patterns in bird nest structure. We provide advice and illustrated examples of how to produce high quality images for analyses.

This article is included in the RPackage gateway.

Introduction
The facility to obtain high quality digital images creates the opportunity to measure natural variables using image analyses. Black and white pictures have frequently been used to understand patterns in field ecology, especially in plant biology studies 1 . However, the use of plant image analyses software is not easily extended to other biological fields for several reasons. Free programs are uncommon and paid software normally has threshold algorithms that were specifically designed for vegetation pictures 2 . Thus, a flexible method that would allow the application of such analyses to other subjects would be welcome. For example, despite the relatively well reported descriptions of bird nests and egg morphology in Del Hoyo and collaborators 3 (but see Xiao, Hu 4 ), there are no well-established approaches to estimate nest wall openness patterns.
Currently, R software 5 allows users to migrate from data processing based on combinations of different software (with the possibility of having costly licensing, software-specific files, incompatibility between operating systems and lack of updates) to a free, single cross-platform software. Here, we introduce bwimage, a package for R that can be used to analyze patterns in black and white images from natural structures. We provide data examples for applications and descriptions of routines for processing of black and white images.

Amendments from Version 2
Typos were fixed, and Dr. Roy Francis was included in the Acknowledgments section.
Any further responses from the reviewers can be found at the end of the article.

Implementation
Bwimage's analysis of images is based on the transformation of a picture ("jpeg" and "png" files are allowed) into a binary matrix (Figure 1). For each pixel, the intensity of red, green, blue, or the average of these three channels (argument channel) is compared to a threshold (argument threshold_value). If the selected intensity is less than the threshold (default is 50%) the pixel will be set as black, otherwise it will be white. Beyond RGB intensity, in PNG images the alpha channel is used to set transparent pixels, i.e. alpha channel values above the threshold (argument threshold_value; default is 50%) will set the pixel as transparent. In the data matrix, the value one represents black pixels, zero represents white pixels and NA represents transparent pixels. For high resolution files, i.e. files with large numbers of pixels in width and height, we suggest reducing the resolution to create a smaller matrix, as this strongly reduces GPU usage and time necessary to run analyses. However, by reducing resolution, the accuracy of data description will also be lowered. Figure 2 compares different resamplings from a figure of 2500×2500 pixels. If the user is not acquainted with scale and threshold processing and/or images were captured under different light conditions, we recommend scaling the images and applying threshold algorithms in image editing software, such as GIMP 6 , and then using the resulting images with the bwimage package.
Figure 1. General approach for image analysis in the bwimage package. A) An image of a natural structure is obtained with digital photography; here we used an image from a canopy. B) The image is converted into a binary matrix, using the functions threshold_color (for a single image) or threshold_image_list (for two or more images). In the data matrix the value one represents black pixels, zero represents white pixels and NA represents transparent pixels.
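The thresholding rule described in this section can be illustrated with a minimal base R sketch. It is not the code used inside bwimage: the function name binarize and its transparency argument are hypothetical, the image is assumed to be a height × width × 3 array with intensities between 0 and 1, and transparency is assumed to be supplied as a separate matrix in which 0 means opaque and 1 means fully transparent.

```r
# Sketch only: `binarize` and `transparency` are illustrative names,
# not part of the bwimage API.
binarize <- function(img, threshold = 0.5, transparency = NULL) {
  # Average the three colour channels for every pixel (height x width matrix).
  intensity <- apply(img[, , 1:3, drop = FALSE], c(1, 2), mean)
  # Pixels darker than the threshold become 1 (black), all others 0 (white).
  out <- ifelse(intensity < threshold, 1, 0)
  # Assumed transparency layer: values above the threshold become NA.
  if (!is.null(transparency)) {
    out[transparency > threshold] <- NA
  }
  out
}

# Toy 2 x 2 image with made-up intensities.
img <- array(0, dim = c(2, 2, 3))
img[1, 1, ] <- c(0.9, 0.9, 0.9)  # bright pixel
img[1, 2, ] <- c(0.1, 0.2, 0.1)  # dark pixel
img[2, 1, ] <- c(0.6, 0.4, 0.8)  # channel mean above 0.5
img[2, 2, ] <- c(0.2, 0.2, 0.2)  # dark, but marked transparent below

transparency <- matrix(0, nrow = 2, ncol = 2)
transparency[2, 2] <- 1

bw <- binarize(img, transparency = transparency)
print(bw)
```

In this toy case the bright pixel and the pixel with a channel mean above 0.5 are coded as white (0), the dark pixel as black (1), and the transparent pixel as NA, matching the coding of the data matrix described above.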
Several metrics can be calculated with the functions presented in Table 1. We implemented functions to calculate parameters designed originally, but not exclusively, for vegetation structures (described by Zehm et al. 2003) and propose a new parameter: the aggregation index. The aggregation index is a standardized estimation of the average proportion of same-color pixels around each image pixel. First, the proportion of same-color neighboring pixels (SCNP) is calculated for each pixel (marginal lines and columns are excluded). Next, the SCNP values of all pixels are averaged; then, given the proportion of black and white pixels, the number of pixels in height and width, and the location of transparent pixels (when present), the maximum and minimum possible aggregation indexes are calculated. Finally, the observed aggregation is standardized to a scale where the minimum possible value is set at zero and the maximum possible value is set at one (Figure 3).
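The steps of the aggregation index can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: the neighborhood is assumed to be the four orthogonal neighbors, the function names scnp_mean and standardise are hypothetical, and the minimum and maximum possible values are passed in as placeholder numbers because their derivation is not reproduced here.

```r
# Mean proportion of same-colour neighbours over interior pixels.
# Assumption: four orthogonal neighbours; marginal rows/columns are excluded.
scnp_mean <- function(m) {
  nr <- nrow(m)
  nc <- ncol(m)
  stopifnot(nr >= 3, nc >= 3)
  centre <- m[2:(nr - 1), 2:(nc - 1), drop = FALSE]
  neighbours <- list(
    m[1:(nr - 2), 2:(nc - 1), drop = FALSE],  # pixel above
    m[3:nr,       2:(nc - 1), drop = FALSE],  # pixel below
    m[2:(nr - 1), 1:(nc - 2), drop = FALSE],  # pixel to the left
    m[2:(nr - 1), 3:nc,       drop = FALSE]   # pixel to the right
  )
  same  <- matrix(0, nr - 2, nc - 2)
  valid <- matrix(0, nr - 2, nc - 2)
  for (nb in neighbours) {
    ok    <- !is.na(nb) & !is.na(centre)  # skip transparent (NA) pixels
    same  <- same + (ok & nb == centre)
    valid <- valid + ok
  }
  keep <- valid > 0
  mean(same[keep] / valid[keep])
}

# Rescale the observed value so that min maps to 0 and max maps to 1.
standardise <- function(observed, min_possible, max_possible) {
  (observed - min_possible) / (max_possible - min_possible)
}

uniform <- matrix(1, 4, 4)                                # one colour only
checker <- outer(1:4, 1:4, function(i, j) (i + j) %% 2)   # checkerboard
halves  <- cbind(matrix(1, 4, 2), matrix(0, 4, 2))        # two blocks

observed <- scnp_mean(halves)
# 0.5 and 1 are placeholder bounds, not values computed by bwimage.
index <- standardise(observed, min_possible = 0.5, max_possible = 1)
```

A single-color matrix gives the highest average SCNP and a checkerboard the lowest, which is the intuition behind rescaling the observed value between the minimum and maximum attainable for a given image.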

Operation
Bwimage is written in the R programming language 5 , and can be run on Windows, Mac OS X, and Linux systems. The package is available at the CRAN repository, and the development releases are available at GitHub 7 . The bwimage CRAN page documents package dependencies. Input images must be in one of the following formats: PNG, JPG, or JPEG.

Use cases
Canopy openness is one of the most essential ecological parameters for a field ecologist. In the bwimage package, canopy openness can be calculated based on a single picture. To illustrate, we demonstrate below how to analyze a canopy image with the bwimage package. The photo was taken with a digital camera placed on the ground, perpendicular to the ground. Canopy closure can be calculated by estimating the total amount of vegetation in the canopy. Canopy openness is equal to one minus the canopy closure. For this example, we used the original image from Figure 1. The original image file is provided as Underlying data 8 .
canopy_matrix <- threshold_color("canopy.JPG", compress_method = "proportional", compress_rate = 0.1)
Several metrics to describe vertical vegetation complexity can be calculated by the bwimage package (see Table 1). Here we provide examples based on an image (Figure 2A) from a vegetation plot of 30×100 cm 1 . The original image file is provided as Underlying data 9 . On the 100 cm side of this plot we placed a panel of 100×100 cm, covered with white cloth, and perpendicular to the ground. A plastic canvas of 50×100 cm was used to cover the vegetation along a narrow strip in front of a camera positioned on a tripod at a height of 55 cm. A photograph of the portion of standing vegetation against the white cloth was taken.
Variation in egg and nest morphology provides relevant information concerning bird life history that has frequently been used to answer ecological [10][11][12] and evolutionary questions [13][14][15] . Here we analyze examples that address the quantification of nest wall openness and the aggregation of nest wall holes, using a nest of the blue-black grassquit (Volatinia jacarina) deposited in the museum collection Coleção Ornitológica Marcelo Bagno, at Universidade de Brasília (register number COMB-N682). Figure 4 describes how to produce a high-quality image to describe patterns in bird nest wall openness. The original image file used is provided as Underlying data 16 .
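For readers who want to see the arithmetic behind these use cases, the following base R sketch computes closure, openness and a per-row profile directly from a binary matrix coded as described in the Implementation section. It is only an illustration: the helper names closure, openness and row_profile are hypothetical, the toy matrix is made up, and excluding transparent (NA) pixels from the denominator is an assumption of this sketch. In practice the package's own functions (Table 1) would be applied to the matrix returned by threshold_color.

```r
# Proportion of black pixels among non-transparent pixels.
closure <- function(m) mean(m == 1, na.rm = TRUE)

# Openness is one minus the closure, as stated in the text.
openness <- function(m) 1 - closure(m)

# Proportion of black pixels in each row (first row = top of the image),
# a simple way to look at vertical structure.
row_profile <- function(m) rowMeans(m == 1, na.rm = TRUE)

# Made-up 4 x 5 matrix: 1 = black, 0 = white, NA = transparent.
toy <- matrix(c(
  0, 0, 0, 0, 0,
  0, 1, 0, 0, 0,
  1, 1, 0, 1, 0,
  1, 1, 1, 1, NA
), nrow = 4, byrow = TRUE)

toy_openness <- openness(toy)
toy_profile  <- row_profile(toy)
```

The same openness calculation applies to a canopy photograph and to a nest wall image; only the interpretation of black pixels (vegetation or nest material) changes. The row profile shows how black pixels concentrate toward the bottom rows, as expected for vegetation photographed against a white panel.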

Conclusions
The bwimage package provides accessible and simple methods for ecologists and field researchers to describe patterns from black and white digital images. It is a flexible method that allows the application of image analyses to an exceptionally broad range of research subjects. Bwimage's analysis relies on a simple computational routine: the transformation of a picture ("jpeg" or "png" files) into a binary matrix, followed by the analysis itself. Several metrics can be calculated by the bwimage package. We implemented functions previously described in the literature and, additionally, we proposed a new parameter: the aggregation index, which generates a standardized estimate of the average proportion of same-color pixels around each image pixel.

Open Peer Review
I have tried running the tool on two different OS and tried out most of the functions using several images. Generally, the tool and functions work as described. The documentation is reasonably well written and easy to follow.
R is quite lacking in image analysis, therefore this package could be a useful little addition.
For effectively following the workflow described here, optimal thresholding is the key. The thresholding offered with the package is too basic and the user has to rely on external tools.
To compare the resulting metrics, two images need to be exactly comparable. The zoom/crop/scaling of the image, the lighting conditions, shadow etc would most likely render the results incomparable. This is something to be handled by external tools as well.
My opinion is that image recognition and classification has to be implemented as some sort of machine learning algorithm rather than any manual thresholding for practical real world application. See relevant example .
No implementation to work with large images (sparse matrices etc). The only available option is to scale down the image and work with a smaller image. Issues with small image size are briefly discussed in the manuscript.
Summary functions would be nice. Some simple functions to visualise the results would be nice. For example just to preview the thresholded image, I used image(t(apply(img_thres,2,rev))).
Benchmarking: It's hard to say how good the results are. It would've been nice to see a comparison of this tool to other existing tools in terms of efficiency as well as accuracy. For example, how does SCNP compare to other measures of aggregation (used in this field)? It would be nice to see some sort of use case on a real world dataset rather than a few isolated images.
A detailed HTML vignette showcasing all the functionality and typical use case workflows. The r-bloggers tutorial was a good start but I wouldn't consider that as a stable location. Perhaps on github along with the source?
In figure 2, dimension of sub plot I is missing in the caption.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular Biology, Bioinformatics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

12 November 2019
© 2019 Chianucci F. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Francesco Chianucci
The Council for Agricultural Research and Economics (CREA) - Research Centre for Forestry and Wood, Arezzo, Italy
The topic is interesting. However, I think the package does simple things which many other similar packages do, and omits some intermediate steps required to refine the analysis.
First, the authors should provide a list of thresholding (either single or dual) methods to binarize images. The single Otsu, the minimum algorithm, and the two-corner method are some of the algorithms which should be developed. The package could also use some dependencies on other packages to implement these thresholding methods. For instance, the rtiff package contains the function autoThreshold, which implements the Ridler-Calvard thresholding. I also suggest taking a look at the caiman package, which contains tools for canopy image analysis, see also below.
Second, users can also process single channel images or decide to use a single channel (for example, the blue channel is frequently used for canopy images). So the user should have flexibility in choosing the image feature to process (setting the channel, setting the threshold).
Canopy images are usually fish eye images, and therefore they require correction for lens projection. In addition, canopy openness should be weighted for zenith angle in fish eye images. Similarly, gap fraction is required for each zenith ring. Thus, the package should allow users to work with fish eye (circular) images, namely to correct for lens distortion, set the circular inner mask, and extract information for zenith angle ranges (inner rings). I suggest taking a look at the caiman package to inherit its functions.
An interesting attribute of the package is the identification of row or column gaps (holes), namely continuous sequences of white pixels. In my view, a very interesting output would be the identification and the labelling of all holes in the image, along with their summary statistics (e.g. number of holes, size of each hole in number of pixels, average, sd and so on). I think the package will improve strongly if the authors can implement such a function. This would allow the extraction of canopy attributes from images.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Partly

No competing interests were disclosed.

We appreciate all of your helpful comments, and we are certain that they have significantly improved the original version of the package. In the new version (1.2) of the bwimage package published on CRAN we: i) introduced two new functions (stretch and compress) that provide an additional tool to distort images. These functions are applications of algorithms for mapping images from circle to square, and vice versa, adapted from Lambers (2016, J of Computer Graphics Techniques); and ii) included the possibility of processing single channel images.
We considered adding other thresholding packages to the bwimage dependencies; however, we decided to avoid cross-dependency between packages because it would make the package heavier and could bring compatibility issues with R updates. Thus, we chose to create a tutorial (shared on https://www.r-bloggers.com/using-bwimge-r-package-to-describe-patterns-in-images-of-natural-structures/ ) to provide an overview of the bwimage package and to demonstrate how to apply threshold algorithms from the autothresholdr package before analysis with bwimage. A comparison of vegetation density estimates from a bush image submitted to different thresholding algorithms is provided. Note: the current version of the autothresholdr package (1.3.5) provides 17 threshold algorithms through the function auto_thresh, covering a wide range of applications.
We did not mention this in the previous version of this article, but bwimage version 1.0 already had a function (hole_section_data) to summarize hole statistics (i.e. number of holes, mean hole size, sd, minimum and maximum size). This function was designed to be used inside a loop or with apply-family functions. By combining hole_section_data with a loop, users can collect a summary of hole statistics. The size of each section is obtained by the hole_section function, which returns the size and map of each hole. We added information about this function to Table 1 of the manuscript. We also provide an example of how to apply hole_section_data to a set of 12 images in the above-mentioned tutorial.
With my best regards.
Sincerely, Carlos Biagolini-Jr

Competing Interests: No competing interests were disclosed.