Discrimination of SARS-CoV-2 strains using coloured s-LASCA imaging of GB-speckles developed for the gene “S” nucleotide sequences

Background: A recent bioinformatics technique converts nucleotide sequences into 2D speckle patterns, called GB-speckles (gene-based speckles). All classical techniques of speckle optics, namely speckle interferometry, subtraction of speckle images, and speckle correlometry, have been applied to the processing of GB-speckles. This represents a considerable improvement in the present tools of bioinformatics. Methods: Colour s-LASCA imaging of virtual laser GB-speckles, a new method for highly discriminating typing of pathogenic viruses, has been developed. The method has been adapted to detect natural mutations in nucleotide sequences of the gene «S», which encodes the spike glycoprotein of SARS-CoV-2, as the molecular target. Results: The degree of colouring of the images of virtual laser GB-speckles generated by s-LASCA can be described by a specific value, R. If the nucleotide sequences compared using this approach are completely identical, then the three components of the resulting colour image will be identical, and therefore the value of R will be equal to zero. However, if there are even minimal differences in the compared nucleotide sequences, then the value of R will be positive. Conclusion: The high effectiveness of colour images of GB-speckles generated by s-LASCA has been demonstrated for discrimination between different variants of the SARS-CoV-2 spike glycoprotein gene.


Introduction
As is well known, laser speckles are formed when laser light diffracts on random objects. 1-3 Recently, the possibility of transforming a nucleotide sequence into a pattern of 2D speckles has been demonstrated. [4][5][6][7][8][9] This new type of speckle pattern has been called "GB-speckles" (gene-based speckles). 5,7,9 Changes in the structure of the GB-speckles can reflect even negligible changes in the nucleotide sequence caused by natural mutations. This allows single-nucleotide polymorphisms (SNPs) to be detected with high precision using virtual GB-speckles. In addition, the accuracy of the diagnosis can be improved further by increasing the Fourier transform area. 10 Substantial progress in the area of GB-speckles has been reported in recent years. According to previously published reports, [4][5][6][7][8][9]11 applying speckle-optics methods, such as speckle interferometry, subtraction of speckle images, and speckle correlometry, to the processing of GB-speckles considerably extends the current bioinformatics toolbox. This may prove crucial for improving existing routine methods of laboratory diagnostics of infectious diseases. GB-speckles as a technique open new horizons in digital biology. 12,13 Recently, model GB-speckle patterns of nucleotide sequences of the omp1 genes have been composed for two different Chlamydia spp., namely Chlamydia trachomatis and Chlamydia psittaci, covering at least six genovars (D, E, F, G, J and K). 4,5 Probability density functions and correlation properties of spatial intensity fluctuations for the relevant GB-speckle patterns have been studied. [5][6][7] As has been shown in previous studies, [4][5][6][7]9 the presence of natural mutations in the analysed strains, including single SNP cases, can be easily detected using methods of speckle optics.
More recently, the algorithm for encoding nucleotide sequences of C. trachomatis into a two-dimensional GB-speckle pattern has been optimized; 4,6 and the speckle-interferometric technique may give rise to ultra-fast optical processors of DNA sequences. 4 This is ensured by the formation of a distinctive system of interference fringes, which appears in the model interference pattern whenever a mutation of any type is present. Additionally, the method of virtual phase-shifting speckle interferometry was reported to be effective 11 for investigating polymorphism of the C. trachomatis omp1 gene. This approach allowed the detection of SNPs in the C. trachomatis omp1 gene, including both a single SNP and a combination of several SNPs, in bacterial strains with genetic mutations (11 known subtypes in total). 6 The GB-speckle format has been successfully applied to transform the nucleotide sequences of the genes encoding the serine proteases, the well-known omptin family proteins, within the Enterobacteriaceae. These proteins have been found on the surface of several bacterial agents causing different enteric infections, such as salmonellosis, shigellosis, yersiniosis, and escherichiosis. 7 Further, the phase and the relevant two-dimensional intensity distributions of GB-speckles have been obtained for various strains of viral pathogens, namely lumpy skin disease virus of cattle (LSDV) and sheeppox virus (SPPV). 8 Additionally, interference patterns generated by the superposition of the relevant GB-speckle fields, and the differences between their images, have been successfully investigated to reveal minimal differences between the initial viral nucleotide sequences.
A new bioinformatics approach has been proposed very recently: 14 the processing of GB-speckles via the s-LASCA technique (spatial Laser Speckle Contrast Analysis). As has been demonstrated, using s-LASCA imaging in the processing of GB-speckles makes it possible to increase the sensitivity of the proposed approach compared with current bioinformatics strategies. 15 It has been shown in Ref. 16 that the generation of GB-speckles combined with the s-LASCA imaging method is very effective for analysing nucleotide polymorphism in several genes of C. trachomatis.
This paper is devoted to the development of a new technique: coloured s-LASCA imaging of GB-speckles. This technique is an improved version of the previously suggested "greyscale" s-LASCA imaging, which was recently developed especially for GB-speckles. Nucleotide sequences of some target genes of SARS-CoV-2 have been successfully processed using coloured s-LASCA imaging and compared on the basis of the analysis of GB-speckles. Natural mutations in the compared genes have been reliably and accurately detected. The official reference sequences were taken from the GISAID database.

Methods
Algorithm for the total conversion of a nucleotide sequence to a colour GB-speckle structure, processed by the s-LASCA imaging technique
Initial processing of nucleotide sequence
First, the sequence of letters of the original one-dimensional nucleotide sequence was converted into a sequence of numbers, each of the four letters being replaced by one of the numbers 1 to 4, in accordance with the rule used previously. 4 It should be emphasized that the specific correspondence between the letters and the numbers is not critical, as noted earlier; 6 thus, other assignments could equally have been applied to the encoding. Next, all possible triad combinations are generated. As a result, a complete set of all triads is formed: (1 1 1), (1 1 2), (1 1 3), ..., (4 4 4). The number of all possible combinations of four numbers combined in triads is 64.
Then, a discrete value, h, is assigned to each triad in accordance with the simple algorithm described previously. 4 This algorithm was implemented in Matlab R2015a (RRID:SCR_001622); an open access alternative is Julia. The value of h is a positive integer in the range from 1 to 64, and each triad of the original nucleotide sequence is associated with exactly one h value. For example, the combination (1 1 1) corresponds to h = 1, (1 1 2) to h = 2, (1 1 3) to h = 3, (1 1 4) to h = 4, (1 2 1) to h = 5, (1 2 2) to h = 6, and so on. The last combination, (4 4 4), corresponds to h = 64. Finally, a square matrix $H_{n,m}$ was formed from the one-dimensional array h. The physical significance of the matrix $H_{n,m}$ is that each of its elements represents the local height of a virtual rough surface corresponding to the local content of the analysed genetic construct. The resulting virtual rough surfaces can be used to model the speckle structures corresponding to particular nucleotide sequences.
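The encoding steps described above can be illustrated with a minimal sketch. The letter-to-number assignment, the use of non-overlapping triads, and the row-by-row filling of the largest square matrix are assumptions made here for illustration only; the function names are hypothetical and do not come from the original Matlab implementation.

```python
# Sketch: nucleotide sequence -> triad heights h (1..64) -> square matrix H.
# ASSUMPTIONS (not from the source): the letter-to-number mapping below,
# non-overlapping triads, and row-major filling of the largest square.
import math

LETTER_TO_NUMBER = {"a": 1, "c": 2, "g": 3, "t": 4}  # hypothetical assignment


def triad_height(n1, n2, n3):
    """Map a triad of numbers in 1..4 to h in 1..64: (1 1 1)->1, (4 4 4)->64."""
    return 16 * (n1 - 1) + 4 * (n2 - 1) + (n3 - 1) + 1


def sequence_to_heights(seq, step=3):
    """Convert a nucleotide string into the one-dimensional array h."""
    numbers = [LETTER_TO_NUMBER[ch] for ch in seq.lower()]
    return [triad_height(*numbers[i:i + 3])
            for i in range(0, len(numbers) - 2, step)]


def heights_to_matrix(h):
    """Fill the largest n x n matrix row by row; leftover values are dropped."""
    n = math.isqrt(len(h))
    return [h[r * n:(r + 1) * n] for r in range(n)]


heights = sequence_to_heights("aaaaacaagaatacattt")
H = heights_to_matrix(heights)
```

With this ordering, the triad index follows the same progression as the examples in the text: (1 2 1) gives h = 5 and (1 2 2) gives h = 6. Setting `step=1` would instead use overlapping triads, which is an equally plausible reading of the procedure.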
The two-dimensional speckle pattern corresponding to each specific sequence was generated by modelling the diffraction of a coherent beam with a square cross-section on a virtual scattering surface whose microrelief is described by the matrix $H_{n,m}$. At each point of the virtual diffuser (in the beam scattering plane), a phase modulation $U_{n,m} = \exp(-2\pi j H_{n,m}/64)$ is introduced ($j$ is the imaginary unit). The surface is illuminated at normal incidence, and the phase of the illuminating beam is constant.
It is assumed that the speckles are formed in the far diffraction zone and are described in the Fraunhofer approximation. In this case, the amplitude of the scattered field is the Fourier transform of the field in the diffraction plane, evaluated at the spatial frequencies 10

$$\nu_x = \frac{X_o}{\lambda z}, \qquad \nu_y = \frac{Y_o}{\lambda z},$$

where $X_o$ and $Y_o$ are the coordinates in the observation plane, $z$ is the distance between the scattering plane and the observation plane, and $\lambda$ is the wavelength. The illuminating radiation is completely monochromatic; thus, $\lambda$ = const. In this situation, the structure of the speckles does not depend on the wavelength or on $z$. Only the sizes of the GB-speckles depend on these values; their average size is determined by the ratio

$$\langle d \rangle \approx \frac{\lambda z}{a}, \qquad (6)$$

where $a$ is the size of the illuminated fragment of the virtual surface. 17 It is important to emphasize that the ratio $\lambda/a$ characterizes the diffraction angular divergence of a laser beam in the far field, and the product of this divergence angle and the distance $z$ travelled by the light is equal to the lateral size of the beam. Thus, the diameter of the undisturbed laser beam (this value is on the right side of expression (6)) and the average speckle size are approximately equal to each other in any observation plane. In other words, when the parameters $z$ and $\lambda$ change, the size of all speckles changes proportionally and synchronously. At the same time, the structures of the speckle patterns in all observation planes are completely similar: only their scale changes from plane to plane, not the shape of the speckles or their location in the speckle pattern.
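The diffraction model described above can be sketched numerically as follows. This is an illustrative sketch only: a plain discrete Fourier transform on the sample grid is assumed to stand in for the Fraunhofer integral, no zero padding or physical scaling by $\lambda z$ is applied, and the function names and the example matrix are hypothetical.

```python
# Sketch: virtual phase screen U = exp(-2*pi*j*H/64) and its far-field
# (Fraunhofer) pattern via a naive 2-D discrete Fourier transform.
# ASSUMPTION: a plain DFT on the sample grid stands in for the Fourier
# transform of the field; no zero padding or physical scaling is applied.
import cmath
import math


def phase_screen(H):
    """Phase modulation introduced at each point of the virtual diffuser."""
    return [[cmath.exp(-2j * math.pi * h / 64) for h in row] for row in H]


def dft2(U):
    """Naive 2-D DFT, O(N^4); adequate only for small illustrative matrices."""
    n, m = len(U), len(U[0])
    out = [[0j] * m for _ in range(n)]
    for p in range(n):
        for q in range(m):
            total = 0j
            for r in range(n):
                for c in range(m):
                    angle = -2 * math.pi * (p * r / n + q * c / m)
                    total += U[r][c] * cmath.exp(1j * angle)
            out[p][q] = total
    return out


def speckle_intensity(H):
    """Intensity |A|^2 of the scattered field in the observation plane."""
    A = dft2(phase_screen(H))
    return [[abs(a) ** 2 for a in row] for row in A]


# Hypothetical 4 x 4 height matrix with values in 1..64.
H_example = [[1, 2, 3, 4], [5, 64, 17, 33], [9, 48, 2, 60], [21, 7, 40, 12]]
I_example = speckle_intensity(H_example)
```

Because the screen modulates only the phase, the total energy in the pattern is fixed by the number of samples, whatever the sequence; the sequence determines only how that energy is distributed among the speckles. For realistic matrix sizes a fast Fourier transform routine would replace the naive loop.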
Generating GB-speckles
The procedure for converting the original nucleotide sequence into a GB-speckle structure is shown below, using the example of the hCoV-19/cat/USA/TX-TAMU-078/2020|2020-07-29 gene (gene #1).
It is important to emphasize that no experimental studies were carried out in this work, only computer modelling. The scheme for calculating GB-speckles for radiation diffracting on a virtual scattering surface is described in detail in Ref. 9.
s-LASCA imaging of GB-speckles
The s-LASCA technique has been applied to the processing of GB-speckles. s-LASCA is based on the analysis of a single realization of static speckles. 3,16 In this case, the whole realization of the speckle field is divided into square zones, typically of 5×5 or 7×7 pixels each. For each zone, the contrast of GB-speckles was calculated using the simplest formula:

$$C = \frac{\sigma_I}{\langle I \rangle},$$

where $I$ is the intensity of the GB-speckles, which varies from point to point, $\langle I \rangle$ is its mean value over the zone, and $\sigma_I$ is the standard deviation of the intensity fluctuations. After the contrast $C$ has been calculated at each point, the LASCA image is constructed. Here, the size of the subarea for the local contrast calculation was 2×2 pixels. As has been demonstrated, 14 this subarea size is close to optimal.
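The local contrast calculation can be sketched as follows. The sketch assumes non-overlapping square zones, the population standard deviation, and a contrast of zero for a zone with zero mean intensity; these choices and the function names are illustrative assumptions, not details taken from the original implementation.

```python
# Sketch: spatial LASCA on a single static speckle realization.
# ASSUMPTIONS: non-overlapping square zones, population standard deviation,
# and contrast 0 for a zone whose mean intensity is zero.
import statistics


def local_contrast(zone_values):
    """Speckle contrast C = sigma_I / <I> for one zone."""
    mean = statistics.fmean(zone_values)
    if mean == 0:
        return 0.0
    return statistics.pstdev(zone_values) / mean


def s_lasca(intensity, zone=2):
    """Divide the intensity map into zone x zone blocks and return C per block."""
    rows = len(intensity) // zone
    cols = len(intensity[0]) // zone
    image = []
    for br in range(rows):
        line = []
        for bc in range(cols):
            values = [intensity[br * zone + r][bc * zone + c]
                      for r in range(zone) for c in range(zone)]
            line.append(local_contrast(values))
        image.append(line)
    return image


# Hypothetical 2 x 4 intensity map: a uniform zone next to a fluctuating one.
demo = [[1.0, 1.0, 0.0, 4.0],
        [1.0, 1.0, 4.0, 0.0]]
contrast_map = s_lasca(demo, zone=2)
```

In the demo map the uniform zone has no intensity fluctuations and therefore zero contrast, whereas the zone whose standard deviation equals its mean has unit contrast. A sliding-window variant, in which the contrast is assigned to every pixel, would be an alternative reading of the procedure.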
Coloured GB-speckles
To compare three two-dimensional realizations of GB-speckles built for different genetic sequences, a colour image is constructed in which each colour component (red, green, and blue) carries its own GB-speckle structure. When all three speckle structures are identical, the colour image looks grey-scale. If the colour components differ from each other, colouring appears in the image.
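A minimal sketch of this colour composition is given below. It assumes that all three components are scaled by one common maximum, so that identical inputs produce equal red, green, and blue values; the function names and the small example arrays are hypothetical.

```python
# Sketch: combine three single-channel maps into one RGB image.
# ASSUMPTION: all three channels are scaled by one common maximum so that
# identical inputs give identical channel values (a grey-scale pixel).


def compose_rgb(red, green, blue):
    """Return rows of (r, g, b) tuples with values in 0..255."""
    peak = max(max(max(row) for row in ch) for ch in (red, green, blue))
    scale = 255.0 / peak if peak > 0 else 0.0
    return [[(round(r * scale), round(g * scale), round(b * scale))
             for r, g, b in zip(row_r, row_g, row_b)]
            for row_r, row_g, row_b in zip(red, green, blue)]


def is_grey(image):
    """True when every pixel has equal red, green, and blue values."""
    return all(r == g == b for row in image for (r, g, b) in row)


a = [[0.2, 0.8], [0.5, 1.0]]
b = [[0.2, 0.8], [0.5, 0.9]]  # differs from `a` in a single pixel
grey_image = compose_rgb(a, a, a)
coloured_image = compose_rgb(a, a, b)
```

Here `grey_image` contains only grey pixels, while the single differing value in `b` is enough to make one pixel of `coloured_image` coloured.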
In Figure 2a, the coloured speckle pattern for the intensity distribution is presented (the red component was obtained for the nucleotide sequence of gene #1, the green component for that of gene #2, and the blue component for that of gene #3). As Figure 2a demonstrates, because of the differences in the initial nucleotide sequences, a slight colouring appears in the colour speckle-pattern structure for the two-dimensional intensity distribution.
In Figure 2b, the coloured speckle pattern for the phase distribution is shown for the same nucleotide sequences: the red component for gene #1, green for gene #2, and blue for gene #3.
In this case, there is evidently a pronounced colouring over the whole image of the GB-speckle field.
Thus, colouring in the image obtained for the intensity and phase of GB-speckles is a reliable diagnostic sign of the presence of polymorphism.
A novel detection technique based on the s-LASCA images with coloured GB-speckles
Once an s-LASCA image has been obtained for each of the three compared genetic sequences, the final colour image can be constructed. An example of such an image is shown in Figure 3a.

Results and discussion
It is obvious that the image shown in Figure 3a, in comparison with the image in Figure 2a, has a more pronounced colouring over the entire field of view and is characterized by a higher contrast. From a quantitative point of view, the degree of colouring can be described by the value R, which is calculated from $Ir_i$, $Ig_i$ and $Ib_i$, the intensity values of the red, green, and blue components in each pixel $i$. Obviously, if the nucleotide sequences compared using s-LASCA imaging of GB-speckles are completely identical, then the three components of the resulting colour image will be identical, and therefore the value of R will be equal to zero. However, if there are even minimal differences in the compared nucleotide sequences, then the value of R will be positive. For example, the value of R calculated for Figure 3a is 0.1 (gene #1, gene #2 and gene #3 are compared).
The physical meaning of the parameter R is that it characterizes the degree of colouring of the picture (the GB-speckle pattern). Its bioinformatic (molecular biology) significance is that it takes positive values even when a single SNP appears in the analysed nucleotide sequences. Thus, even minimal natural mutations of the virus can be detected using the parameter R.
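The two stated properties of R (zero for identical components, positive for any difference) can be illustrated with a stand-in measure. The formula below is a hypothetical example chosen only because it has these two properties for non-negative intensities; it is not the definition of R used in this paper, and its numerical values are not comparable with those reported here.

```python
# Sketch: a stand-in colouring measure with the two properties stated for R.
# ASSUMPTION: this formula is hypothetical and is NOT the paper's definition
# of R; it is the mean channel spread divided by the mean channel intensity.


def colouring_degree(red, green, blue):
    """Return 0.0 for identical channels and a positive value otherwise."""
    spread_sum = 0.0
    level_sum = 0.0
    for row_r, row_g, row_b in zip(red, green, blue):
        for r, g, b in zip(row_r, row_g, row_b):
            spread_sum += max(r, g, b) - min(r, g, b)
            level_sum += (r + g + b) / 3.0
    return spread_sum / level_sum if level_sum > 0 else 0.0


reference = [[0.10, 0.40], [0.25, 0.90]]
mutated = [[0.10, 0.40], [0.25, 0.80]]  # one pixel changed
r_identical = colouring_degree(reference, reference, reference)
r_different = colouring_degree(reference, reference, mutated)
```

In this sketch a change in a single pixel of one component is sufficient to move the measure away from zero, which mirrors the behaviour described for R when a single SNP is present.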
It is important to note that the values of R calculated for Figure 2a and Figure 2b (coloured bare GB-speckles) are 0.049 and 0.026, respectively. This means that the value of R is at least two times higher for GB-speckles processed by the s-LASCA imaging technique.
Evidently, R is positive for all images in Figure 3; therefore, R is an important diagnostic feature for detecting the presence of SNPs in SARS-CoV-2 genes. This is the main result of this paper.

Conclusion
A fundamentally new bioinformatics technique for the reliable detection of single SNPs is proposed. The new method is based on applying the s-LASCA imaging technique to generated GB-speckles. It is established that even one SNP can be reliably detected. It has been demonstrated that the suggested technique is a very effective tool for discrimination between different variants of the SARS-CoV-2 spike glycoprotein gene.
