In silico analysis of cross reactivity among phospholipases from Hymenoptera species

Background: Phospholipases are enzymes that hydrolyze membrane lipids and have been characterized in several allergenic sources, such as Hymenoptera species. However, cross-reactivity among phospholipase allergens is poorly understood. The objective of this study was to identify potential antigenic regions involved in cross-reactivity among phospholipase allergens using an in silico approach. Methods: In total, 18 amino acid sequences belonging to the phospholipase family and derived from species of the order Hymenoptera were retrieved from the UniProt database and used for phylogenetic analysis to determine the closest molecular relationships. A multiple alignment was performed to identify conserved regions, which were matched with antigenic regions predicted by the ElliPro server. 3D models were obtained by homology modeling and were used to locate cross-reactive antigenic regions. Results: Phylogenetic analysis showed that the 18 phospholipases split into four monophyletic clades (named here A, B, C and D). Phospholipases from clade A shared an amino acid sequence identity of 79%. Antigenic patches predicted by ElliPro were located in highly conserved regions, suggesting that they could be involved in cross-reactivity in this group (Ves v 1, Ves s 1 and Ves m 1). Conclusions: We have advanced the characterization of potential antigenic sites involved in cross-reactivity among phospholipases. Inhibition assays are needed to confirm our findings.


Introduction
Allergic diseases have become a public health problem; the genetic background of patients (atopy) and environmental conditions are considered the causes of the increased risk of developing allergic diseases 1. Exposure to allergens (typically harmless antigens in the environment) also promotes an immune response mediated by IgE. Over the last few years, species belonging to the order Hymenoptera have been characterized as potential allergenic sources. They represent a common source of sensitization, with more than 200,000 species including bees, wasps, and ants. Most members of this order are cosmopolitan species, but some sensitizing species have an endemic distribution: Bombus spp. are found more frequently in central and northern Europe, whereas the yellowjacket (YJ) (Vespula spp.) and the honeybee (HB) (Apis mellifera) are allergenic sources in North America. Other wasps, such as the Polistinae, are found in southern Europe and America [2][3][4].
The allergic immune response to hymenopteran allergens has been studied in detail because of the high incidence of sting reactions to these insects. Approximately 9.2% to 28.7% of the adult population is sensitized to hymenopteran venom 5. Allergic response to Hymenoptera venom is one of the leading causes of anaphylaxis worldwide, with a frequency of 27%, as compared to medications (41%) and foods (20%) 3,6. The molecular, structural, and immunological characterization of hymenopteran venom allergens is advanced: in total, 75 allergens from 31 different species have been explored. Phospholipases are a family of allergens with clinical and biological relevance, and other proteins from this order, such as hyaluronidase and antigen V, are also considered relevant to sensitization to Hymenoptera allergens 2,7,8. Exposure to Hymenoptera allergens is associated with bites and stings; an estimated 56.6-94.5% of the general population has been bitten or stung at least once in their life 9.
Phospholipases (PLA) are a major component of the venom of these species, representing 75% of the total mass of the venom, and have been characterized as one of the main allergens in Hymenoptera 10. They are also found in the venoms of other arthropods such as chelicerates, in the venom of ophidians, and in different mammalian tissues and fluids such as pancreatic juice and arthritic synovial fluid. The superfamily includes 42 groups distributed in four types: A, B, C, and D 11.
Phospholipases belonging to class A split into two groups: class A1 hydrolyzes the phospholipid ester bond between the first acyl group and glycerol (1-acyl-sn-glycerol phosphate), while class A2 hydrolyzes the bond between the second acyl group and glycerol (2-acyl-sn-glycerol phosphate). They are a family of enzymes with different molecular weights: PLA1 has a molecular weight of 28 kDa, while PLA2 are classified as high molecular weight cytosolic PLA2 (40-85 kDa) and low molecular weight secretory PLA2 (14-18 kDa). PLA2 can hydrolyze fatty acids present in the cell membrane and other lipophilic substances, or participate in the regulation of gene expression through the production of free fatty acids, from which cyclooxygenases synthesize prostaglandins 12-14.
The structure, function, mechanisms, and cell signaling of PLA have been extensively studied; one important aspect of PLA is their capacity to induce allergic responses. Several epitopes involved in co-sensitization to PLA that share structural homology and identity have been studied, which suggests a potential role in cross-reactivity. However, this is poorly understood, and further studies are needed to complement what has been reported. The aim of this work was to explore the cross-reactivity and antigenicity of allergenic PLA using an in silico approach based on bioinformatics tools; we identified several antigenic regions that may be involved in cross-reactivity among phospholipases.
The use of bioinformatics tools in science has grown markedly; such analyses are often considered the first step before experimental studies because they provide a functional prediction. Understanding and predicting an individual's clinical cross-reactivity to allergens is key to better management and treatment and to the development of new therapies for allergy to Hymenoptera. Prediction can be performed by methods for the identification and computational mapping of specific IgE epitopes, or with epitopes reported in the Immune Epitope Database and Analysis Resource, which can help identify the conserved regions that may be affecting patients' health. Various studies have used this methodology to predict food allergen epitopes [15][16][17].
In silico methodology has also been used in other work to report possible cross-reactivity between proteins on the basis of structural or functional homology, using bioinformatics tools 18.

Selection of phospholipases and alignment
The amino acid sequences of type A phospholipases (A1 and A2) from 18 Hymenoptera species were selected according to their reported allergenic capacity. The sequences were obtained from the UniProt database (see Table 1 for a list of accession numbers). All allergens reported by the WHO/IUIS Allergen Nomenclature Sub-Committee with a complete sequence were used; incomplete sequences were not included in the analysis. Three sequences that are not reported as allergenic were chosen to observe the differences in identity and structure among several phospholipases. The degree of identity among phospholipases was determined using the PRALINE web server. The alignment was configured to use BLOSUM62 as the exchange matrix, with 3 iterations and an E-value of 0.001.
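As an illustration of what a percent identity between aligned sequences means, the following minimal Python sketch computes pairwise identity from sequences that have already been aligned. It is not the PRALINE implementation: the function names, the toy sequences, and the choice of denominator (columns in which neither sequence has a gap) are assumptions made here for illustration, and other tools may normalize differently.

```python
from itertools import combinations


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def pairwise_identities(aligned: dict) -> dict:
    """Percent identity for every unordered pair of aligned sequences."""
    return {
        (name_a, name_b): percent_identity(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


# Hypothetical toy alignment, not real phospholipase sequences.
toy_alignment = {
    "seq1": "ACD-FGH",
    "seq2": "ACE-FGH",
    "seq3": "AC--FGK",
}
for pair, identity in pairwise_identities(toy_alignment).items():
    print(pair, round(identity, 1))
```

Because the denominator changes the result, identity values are only comparable when they are computed with the same convention.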

Phylogenetic analysis
The Molecular Evolutionary Genetics Analysis (MEGA) program, version X, was used to obtain phylogenetic trees using the maximum parsimony method, with bootstrap support from 1000 replicates as a measure of reliability and robustness, under the assumption of minimum evolution. For the topology, this model uses a comparative matrix to find the similarity between the amino acids of the 18 sequences and so establish the evolutionary proximity between the species. The matrix was constructed with all the amino acid sequences of the phospholipases retrieved from the UniProt database and reported to the WHO/IUIS. Therefore, the higher the identity values found between sequences, the closer their relationship and the closer they are located in the tree. All positions containing gaps were eliminated (complete deletion). From the global comparison and the homologies, the sum of branch lengths (SBL) is reported, which determines the number of nodes and their positions, including the "groups" of the evolutionarily closest sequences. Phylogenetic sub-analyses were carried out to identify the degree of identity within the groups formed. The multiple alignment for the phylogenetic analysis was carried out using CLUSTAL W, configured with a gap opening penalty of 10.00, a gap extension penalty of 0.20, and a delay divergent cutoff of 30%.
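Two of the steps described above, complete deletion of gapped positions and bootstrap resampling, can be sketched in a few lines of Python. This is an illustrative sketch, not MEGA's code: the function names and the toy alignment are hypothetical, and building the parsimony tree for each replicate is left out.

```python
import random


def complete_deletion(aligned: dict, gap: str = "-") -> dict:
    """Remove every alignment column in which at least one sequence has a gap."""
    names = list(aligned)
    length = len(aligned[names[0]])
    kept = [i for i in range(length) if all(aligned[n][i] != gap for n in names)]
    return {n: "".join(aligned[n][i] for i in kept) for n in names}


def bootstrap_replicate(aligned: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the original length."""
    names = list(aligned)
    length = len(aligned[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(aligned[n][i] for i in columns) for n in names}


# Hypothetical toy alignment, not real phospholipase sequences.
toy = {"seq1": "AC-DEF", "seq2": "ACG-EF", "seq3": "ACGDEY"}
clean = complete_deletion(toy)
print(clean)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_replicate(clean, rng) for _ in range(1000)]
print(len(replicates))
```

In a full analysis, a tree would be inferred from each replicate, and the bootstrap support of a clade would be the fraction of replicate trees that contain it.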

Generation of 3D models
The phospholipases without a 3D structure reported in the Protein Data Bank were modeled by homology using the SWISS-MODEL server. Model quality was evaluated with several tools, including Ramachandran plots, WHATIF, and the QMEAN4 index (Qualitative Model Energy Analysis), using ProSA-web and the SWISS-MODEL server. The results were expressed as a number between 0 and 1. Higher numbers indicate higher reliability and energy values (GROMOS96 force field). ElliPro was used to predict linear and conformational epitopes on a representative phospholipase for each group. Residues with larger scores are associated with greater solvent accessibility. Only residues with a score > 0.7 were selected.
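The selection rule above can be sketched as a simple filter. The following Python example assumes a hypothetical mapping from residue position to score (ElliPro itself is a web server, and its output format is not reproduced here), keeps residues scoring above 0.7, and groups consecutive positions into linear patches. The minimum patch length of four residues reflects the "more than three residues" criterion applied in the Results; the function name and the toy scores are illustrative only.

```python
def antigenic_patches(scores: dict, threshold: float = 0.7, min_residues: int = 4) -> list:
    """Return (start, end) runs of consecutive positions scoring above threshold."""
    patches = []
    current = []

    def flush():
        # Keep the current run only if it is long enough, then start a new one.
        if len(current) >= min_residues:
            patches.append((current[0], current[-1]))
        current.clear()

    for position in sorted(scores):
        if scores[position] > threshold:
            if current and position != current[-1] + 1:
                flush()  # positions are not contiguous
            current.append(position)
        else:
            flush()
    flush()
    return patches


# Hypothetical per-residue scores, for illustration only.
toy_scores = {
    1: 0.80, 2: 0.90, 3: 0.75, 4: 0.71, 5: 0.50,
    6: 0.90, 7: 0.90, 8: 0.20,
    10: 0.80, 11: 0.80, 12: 0.80, 13: 0.80, 14: 0.80,
}
print(antigenic_patches(toy_scores))
```

Conformational (discontinuous) epitopes depend on spatial proximity in the 3D model and cannot be recovered by this one-dimensional grouping.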

Phospholipases found and phylogenetic results
We selected 15 sequences of allergenic phospholipases and three non-allergenic ones for the analysis, with 361 positions in the final dataset. The sequences were derived from several biological sources: two from bees, two from bumblebees, six from wasps, two from hornets, three from ants, and three from sources not described as allergenic (mosquito, spider, and scorpion). The allergens of bees and wasps belong to group 1 and those of ants to groups 1 and 2 (Table 1).
The phylogenetic tree had a consistency index of 0.857256, a retention index of 0.779682, and a composite index of 0.683688 (0.668387) for all sites and parsimony-informative sites (in parentheses). The phospholipase allergens showed a close relationship and formed four nodes with a high phylogenetic relationship among them (Figure 1). According to the tree, group A comprised three phospholipases A1, all belonging to the genus Vespula (Ves v 1, Ves m 1, and Ves s 1). This group shows the closest relationship of all the groups, with the shortest distance between branches. Group B contains the highest number of phylogenetically related phospholipases A2, including allergens of the genera Bombus and Apis (Bom p 1, Bom t 1, Api m 1, Api c 1) and two non-allergenic phospholipases from Parasteatoda tepidariorum (common house spider) and Centruroides hentzi (Hentz striped scorpion). Group C included four proteins, three of them from ants belonging to the genus Solenopsis (Sol i 1, Sol i 2, Sol s 2) and one from the mosquito C. quinquefasciatus. Group D contained the wasp allergens from the genus Polistes (Pol a 1, Pol d 1), P. paulista (Poly p 1), D. maculata (Dol m 1), and V. crabro (Vesp c 1).
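The three tree indices can be cross-checked against each other. Assuming, as is conventional for maximum parsimony summaries, that the composite index is the rescaled consistency index (the product of the consistency and retention indices), a short Python check reproduces one of the reported values. The variable names are illustrative and not part of the original analysis.

```python
# Values reported in the text for the maximum parsimony tree.
consistency_index = 0.857256
retention_index = 0.779682

# Assumption: composite index = consistency index x retention index.
composite_index = consistency_index * retention_index

print(round(composite_index, 6))
```

Under this assumption the product agrees with the composite index of 0.668387 given in parentheses, which suggests that the reported consistency and retention indices belong to the same set of sites as that value.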

Identification of potential cross-reactive antigenic sites
Multiple alignments were made of the phospholipases in each group obtained from the phylogenetic analyses. Of the 18 phospholipases, we built 3D models for four: Ves s 1, Sol i 1, and the phospholipases of Culex quinquefasciatus and Centruroides hentzi. The remaining proteins were reported in the UniProt database. We used the structures to better visualize the antigenic patches; the structural quality control parameters for the homology models are given in Table 3. To compare the ElliPro results, we chose the main antigenic patches with a score higher than 0.7 and more than three residues, taking as reference the epitopes of one phospholipase from each group: group A, Ves m 1; group B, Bom p 1; group C, Sol i 1; group D, Pol d 1 (Table 2). The conformational antigenic patches are shown in Figure 2. Phospholipases from group A shared an identity of 79% between their amino acid sequences (Figure 3). A total of 704 residues were identified and conserved among the phospholipases analyzed, and for this group we used Ves m 1 to identify the possible epitopes. We found three common linear antigenic patches and two conformational antigenic patches with a score greater than 0.7. The identity matrix (Table 4) also corroborates that the identity percentages remain high with other proteins outside group A.
Group B shares an identity of 35% between its amino acid sequences, but when Api c 1 is excluded, the identity increases to 64%. In total, 259 identical residues were found among the sequences. We found and included three linear epitopes and two conformational epitopes in Bom p 1 with a score > 0.7 (Figure 4).
Figure legend: Three sequences were studied, with a total of 439 residues. Unconserved positions are shown in blue and highly conserved positions in red; moderately conserved positions are shown in green and orange. A total of 251 residues were identical. The percent sequence identity was 64%.
Group C, which includes allergens from ants, showed the lowest identity, only 23%, and the highest number of gaps (600 residues missing). Sol i 1 was the protein furthest from the other Hymenoptera allergens and appears to be closely related to the wasp allergens. No common antigenic patches were detected; however, Sol i 1 presents an interesting antigenic patch with 46 residues and a score of 0.711.
For group D, 1916 residues were identical among the five allergen sequences. This group exhibits a high identity of 64%, the second highest after clade A. To identify antigenic patches in this group we used Pol d 1, finding four linear epitopes, of which only one had a valid score (Figure 5).

Discussion
Phospholipases A1 and A2 are insect allergens that provide a diagnostic benefit for differentiating genuine sensitization from cross-reactivity. However, the cross-reactivity of this group of allergens has scarcely been explored holistically. In this study, we used in silico analyses to predict the antigenic regions that could explain the cross-reactivity of phospholipases in Hymenoptera.
The 18 amino acid sequences of the allergens were aligned, and a phylogenetic analysis was carried out, which yielded four monophyletic groups (A, B, C, D). Group A showed the highest degree of identity among its amino acid sequences, 79%. All the allergens of this group belong to the genus Vespula, one of the most studied sources of wasp allergens 7,19. In group B (Bom p 1, Bom t 1, Api m 1, Api c 1), two analyses were conducted: the first included the Api c 1 allergen and gave a degree of identity of 35%, and the second excluded it and gave a higher degree of identity, 64%. This showed that the alignment of these three allergens could explain a possible cross-reactivity. Group D (Pol a 1, Pol d 1, Poly p 1, Vesp c 1, Dol m 1.02) showed a level of identity of 64%. However, analysis of conserved and affected residues showed that group A shares three antigenic regions that could contribute to cross-reactivity.
Cross-reactive carbohydrate determinants (CCD) are present in most hymenopteran venom allergens, most frequently in venom from HB and YJ, and IgE against CCD is one of the main causes of double positivity (positive test results to both venoms) in patients who are allergic to insect stings 20. Its prevalence has been described in more than 20% of patients allergic to honeybee venom; approximately one in four HB venom allergens and one in 10 YJ venom allergens have been found to be CCD-sIgE-positive. The PLA2 structure contains the insect CCDs, which are defined by the presence of an α-1,3-core fucose 21. Insect CCD causes 69% to 75% of the double-positive test results for honeybee venom (HBV) and yellowjacket venom (YJV) during allergy diagnosis 20-22. Hemmer et al. propose that Radio Allergo Sorbent Test (RAST) results to oilseed rape (OSR) pollen are a simple and practicable measure to detect sugar-specific IgE in individual sera. This could be useful to discriminate between patients who cross-react through CCD and doubly sensitized patients, who may require immunotherapy with two venoms. CCD-free allergens are now known to allow protein-based cross-reactivity to be detected without double positivity. Ves v 1, Api m 1, Dol m 1, and Pol d 1 are allergens that lack CCD-based cross-reactivity and allow diagnosis without interference 19,23,24. However, it should be clarified that these are mostly of recombinant origin, because in their purified natural form they possess CCD; for example, Api m 1 of natural origin has CCD and makes diagnosis difficult 24. On the other hand, Sol i 1 is the only hymenopteran venom PLA1 known to have CCD, which could make the specific diagnosis of fire ant allergy difficult 25.
Research on the allergenic capacity of Hymenoptera allergens has consisted mainly of individual studies, with Api m 1, Sol i 1, Pol d 1, and Ves m 1 among the most studied so far, but the possible cross-reactivity between phospholipase A1 and A2 allergens has not been evaluated holistically 2,24,26. No cross-reactivity between A. mellifera, S. invicta and V. vulgaris was detected, which supports our results, since there was no relationship between these allergens. However, when they were analyzed along with other allergens, a certain degree of identity was maintained between these proteins, suggesting possible cross-reactivity without CCD. Table 5 shows the presence or absence of CCD and a comparison between the reported clinical cross-reactivities and the cross-reactivities obtained here.
Group A (Ves m 1, Ves s 1 and Ves v 1) is the most representative group; cross-reactivity among Vespula spp. is strong because of the similarities in venom composition. Other authors demonstrated that Ves v 1 also shows an identity of 54% with Poly p 1, the lowest among the allergens studied, and a study carried out in Spain with 59 patients previously diagnosed with vespid allergy found that there could be double sensitization to Ves v 1 and Pol d 1, because 31% of the patients could not be clearly defined as sensitized only to Vespula or only to Polistes 28,29. Consequently, the different Vespula venoms cross-react strongly, which would explain the high degree of identity found in this study (group A, 79%). Of the three proteins, only Ves v 1 has been described as a CCD allergen, showing that this interaction between the Vespula phospholipases could be CCD-independent and related only to protein structure 19. The quaternary structure of the three Vespula phospholipases is also very similar, suggesting that both linear and conformational epitopes may be present (Figure 6A). Therefore, we suggest that fragment inhibition studies be carried out to identify the possible antigenic peptides described in this study.
Group B showed a degree of identity of 35%; however, when we performed the alignment without the Api c 1 allergen, the degree of conservation between Api m 1, Bom p 1 and Bom t 1 increased to 64%. So far, we have found no further information about possible cross-reactivity among these allergens. In this group, Api m 1 is the best characterized allergen; it contains the insect cross-reactive carbohydrate determinants (CCD), which are defined by the presence of an α-1,3-core fucose 30.
For years, the detection of Api m 1 CCD has made it difficult to differentiate HB allergy from YJ allergy. However, in vitro detection of immunoreactive sIgE to these insects showed double positivity in up to 59% of patients 24. PLA2s are important venom allergens in other members of the genera Apis and Bombus and have been shown to share homology.
The PLA2 of A. cerana (Api c 1) has been little explored but has been described as having high identity with other phospholipases, such as that of A. mellifera (95%) 26. In our study, this identity was not preserved when these sequences were compared with those of the genus Bombus: the identity we found was very low, and the sequences were more conserved when Api c 1 was excluded from the alignment 31. Studies conducted on the genus Bombus found that the primary sequences of Bom t 1 and Api m 1 have an identity of 53% and that their three-dimensional structures show low conservation of protein surfaces 32. However, the allergens selected from group B in our study showed high conservation and structural homology, which could lead to cross-reactivity (Figure 6B).
Group C was the only group that included both phospholipases A1 and A2 in the clade, so a low identity was expected. The ant phospholipases Sol i 1, Sol i 2 and Sol s 2 showed an identity of 23% with the other phospholipases in the alignment of primary sequences. This low identity is not enough to explain cross-reactivity in silico, even though the allergen Sol i 1 has been extensively analyzed and other studies suggest that it may cross-react with Centruroides species 33,34.
The phylogenetic analyses reported in this study revealed that Sol i 1 is the most divergent member of the currently identified hymenopteran venom PLA1 group. As noted, Sol i 1 is in a group (group C) completely isolated from the clade consisting of allergenic wasp PLA1 (group A) and showed no structural homology (Figure 6C). Furthermore, in multiple alignments, the fire ant allergen exhibits the lowest level of sequence identity. However, studies have shown cross-reactivity between Sol i 1 and its wasp counterparts, with amino acid sequence identity levels of 38% with Ves m 1, 36% with Ves v 1, 40% with Dol m 1, 35% with Pol d 1, and 36% with Poly p 1 35. In contrast, a recent study suggests that peptide-based cross-reactivity between Sol i 1 and the PLA1 of Polistinae wasps does not occur, because the alignments and the phylogenetic and structural analyses showed that it is the allergen furthest from its counterparts, with the lowest level of identity among the sequences studied (36%) and the highest RMSD value (0.172) 29.
Several works have attempted to demonstrate cross-reactivity between A1 phospholipases 29,36. PLA1-based cross-reactivity among the venoms of eight Hymenoptera was analyzed, and the primary sequence of Poly p 1 was reported to share 36% identity with Sol i 1, 74% with Pol d 1 and 71% with Pol a 1. In our study, no relationship was found between Poly p 1 and Sol i 1. However, group D, which contained the Polistes species (Pol a 1 and Pol d 1), Poly p 1, Dol m 1 and Vesp c 1, showed a high degree of identity (64%) and structural homology (Figure 6D), enough to explain cross-reactivity 29. Cross-reactivity between Dol m 1, Ves v 1 and Pol a 1 has been investigated in mice; partial cross-reactivities in the T-cell epitopes of homologous vespid allergens were found, which supports our findings 7,29,36.
Three non-allergenic phospholipases (from Centruroides hentzi, Parasteatoda tepidariorum, and Culex quinquefasciatus) were included to adjust the phylogenetic analysis. In the results, these phospholipases separated into two clades and showed some affinity for some of the phospholipase allergens.
One study identified allergens in the venom of common striped scorpions. Eleven patients with scorpion venom allergy were assessed; four had a history of anaphylaxis (with positive skin test responses) to imported fire ant (IFA) venom and at least two others had a history of large local reactions, suggesting that there could be cross-reactivity between proteins of these species. This association would be clinically relevant 29. It shows that, although these proteins are not described as allergens, studies are needed to verify their capacity to trigger sensitization.
Bioinformatic studies are high-impact tools. They are currently recognized as a first step in an investigation, since in silico analyses approximate the expected results, allow predictions or models, and serve as the basis for larger projects. In our study, we show possible antigenic regions involved in cross-reactivity between phospholipases A1 and A2. Based on the in silico analysis, these proteins have a high degree of identity, and three antigenic regions were found that would explain possible co-sensitization.
It should also be noted that our study has some weaknesses that could explain the lack of cross-reactivity between some of the allergens evaluated. For the phospholipases whose tertiary structure is unknown, we modeled the structure on homologous proteins; however, in silico constructions are not exact, and the natural form may differ from the one we propose. Likewise, the epitopes require further confirmation through experimental studies in biological models, both in vivo and in vitro.

Conclusion
Potential antigenic sites that could generate cross-reactivity between the phospholipases analyzed in this study were identified. The identity between these proteins from different species is relatively high, which indicates that cross-reactivity between them is possible and could, in most cases, be frequent. These results support component-based diagnostic testing for venom allergy, and targeted mutagenesis tests are needed to confirm the relevance of these sites to the allergenic capacity of phospholipases.

Data availability
All data underlying the results are available as part of the article and no additional source data are required.

Open Peer Review
Some answers to the reviewer's comments were not implemented in the text and they should be.
Regarding the comment: "Answer: we accept the observation and proceed to change constitutive antigenic parch from conformational epitopes in the second version", conformational antigenic patches (instead of constitutional or constitutive epitopes) should be used all over the manuscript.
Legend to figure 2: Sol i 1 (not Sol I 1). "the Vespid basal is sequence" needs to be corrected.
Regarding the answer to this point: "9. Discussion, third paragraph: "double positivity" needs to be explained (69% at 75% is not clear).
What is OSR? (spell out)."
bioinformatic tool and multiple alignment. There are some considerations that the authors must address to provide a more solid argument, and further analysis must be performed.
○ Why did the authors used ElliPro server and not another one? Is there any advantage of this tool compared to others?
○ Prediction of antigenic regions should be performed using different tools and individual analysis of accessibility, hydrophobicity, etc. And then compared the results to provide the "most probable" antigenic region.

○ The authors use the term "constitutional antigenic patches". What does this mean? Is it a concept used by the bioinformatic tool? If so, they must explain what this is about.

○ Comparisons between aa sequences result in percentages of identity as a way to express the level of conservation or homology. However, proteins like phospholipases have variable molecular weights, which means that they aren't similar in the whole length of their sequences. The authors must make clear that a portion of each protein is used to make the statements related to the identities and that the analysis is restricted to these areas.
○ The discussion is too long given the presented results. Sometimes the information provided is repetitive. For example, the second paragraph of the discussion is essentially previously mentioned in the results section and can be eliminated.

○ The third paragraph in the discussion section refers to carbohydrate determinants, which are not even superficially explored in this study. Although the observation referred to CCD can be mentioned in this manuscript, is not necessary to discuss this issue too much since there are no analysis for this matter in this paper.

○ A similar evaluation of all the information provided in the discussion should be done. It is difficult to understand what the authors are trying to argue and how this is closely related to the actual results. A shortened and more precise discussion could greatly improve the manuscript.

Minor comments:
○ English and grammar needs to be revised. Many sentences need to be re-written to better disclose the message that the authors are trying to say, for example:
Allergic immune response from hymenopteran allergens has been studied in detail due to a high incidence of sting reactions to these insects…
A closed relationship among phospholipase allergens as shown, formed four nodes with a high phylogenetic relationship among them…
○ A better use of "commas" and "periods" must be done in order to improve the writing as well. For example: Molecular, structural, and immunological characterization of hymenopteran venom allergens is advanced, in total 75 allergens from 31 different species have been explored, and since phospholipases are a family of allergens with clinical and biological relevance, some proteins belonging to this order such as hyaluronidase and antigen V are also considered relevant to sensitization to this allergenic
○ In figure 3, what does "identities residues" mean?
○ If a total of 704 residues were identified and conserved among the phospholipases, why is figure 3 showing around 330 only?

○ We built four 3D models of the 18 phospholipases Ves s 1, Sol i 1, Culex quinquefasciatus and Centruroides hentzi… are you referring to "18" or "4" proteins?
○ In group B (Bom p 1, Bom t 1, Api m 1, Api c 1) two analyses were conducted, the first with the presence of the Api c 1 allergen where a degree of identity of (35%) was found and the second without the allergen… why are you using parenthesis?

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Immunology, allergy, recombinant allergens, mosquito and house dust mite allergy, somatic hypermutation and class switch recombination.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
We have addressed all the suggestions; below are the comments.
Answers to the comments:
1. Why did the authors used ElliPro server and not another one? Is there any advantage of this tool compared to others?
Answer: it is a tool that provides greater robustness to predictive models and has been shown to be a better server for conformational epitopes.
2. Prediction of antigenic regions should be performed using different tools and individual analysis of accessibility, hydrophobicity, etc. And then compared the results to provide the "most probable" antigenic region.
Answer: Given the robustness of the program, we do not consider the use of other tools.
3. The authors use the term "constitutional antigenic patches". What does this mean? Is it a concept used by the bioinformatic tool? If so, they must explain what this is about.
Answer: in the second version we will change the term to conformational epitopes for clarity.
4. Comparisons between aa sequences result in percentages of identity as a way to express the level of conservation or homology. However, proteins like phospholipases have variable molecular weights, which means that they aren't similar in the whole length of their sequences. The authors must make clear that a portion of each protein is used to make the statements related to the identities and that the analysis is restricted to these areas.
Answer: The PRALINE server adjusts for sequence length based on the coverage between the proteins. The conserved areas will be shown in the images.
5. Consequently, was the homology model made for the whole Ves s 1, Sol i 1, C. quinquefasciatus and C. hentzi phospholipases, or just for a portion of them?
Answer: Yes, the whole protein model was made.
6. In the methods section, which templates were used by SWISS MODEL to predict the 3D model?
Answer: SWISS MODEL predicts tertiary structures using homologous proteins with known tertiary structure as templates. The server reports which template protein is used to build each new model. These template proteins will be shown in the second version.
7. How was the superimposition of 3D structures presented in figure 5 performed? The approach should be indicated in the methods.
Answer: It was done using the matchmaker tool of the PyMOL program.
8. The discussion is too long given the presented results. Sometimes the information provided is repetitive. For example, the second paragraph of the discussion is essentially previously mentioned in the results section and can be eliminated.
Answer: we consider it important to make a short summary of the results for a better understanding of the discussion.
9. The third paragraph in the discussion section refers to carbohydrate determinants, which are not even superficially explored in this study. Although the observation referring to CCD can be mentioned in this manuscript, it is not necessary to discuss this issue at length, since there is no analysis of this matter in this paper.
Answer: We preserved the information on CCD because it was suggested by the editor and another reviewer.
9. A similar evaluation of all the information provided in the discussion should be done. It is difficult to understand what the authors are trying to argue and how this is closely related to the actual results. A shortened and more precise discussion could greatly improve the manuscript.
Answer: Thanks, we will take it into account for the second version.
Minor comments:
1. English and grammar need to be revised. Many sentences need to be rewritten to better convey the message that the authors are trying to express, for example:
-Allergic immune response from hymenopteran allergens has been studied in detail due to a high incidence of sting reactions to these insects…
-A closed relationship among phospholipase allergens as shown, formed four nodes with a high phylogenetic relationship among them…
-A better use of commas and periods is needed in order to improve the writing as well. For example: Molecular, structural, and immunological characterization of hymenopteran venom allergens is advanced, in total 75 allergens from 31 different species have been explored, and since phospholipases are a family of allergens with clinical and biological relevance, some proteins belonging to this order such as hyaluronidase and antigen V are also considered relevant to sensitization to this allergenic.
Answer: English grammar was reviewed, as well as the use of periods and commas.
Answer: These could also be called conserved residues; they correspond to the amino acids recognized as causing cross-reactivity, which show high identity between the different sequences analyzed.
3. If a total of 704 residues were identified and conserved among the phospholipases, why is figure 3 showing around 330 only?
Answer: The PRALINE tool used for the alignment reports the number of residues taken into account in the analysis; we calculated how many of those residues were conserved across the analyzed sequences. Thus, in figure 3, 934 residues were taken into account, of which 704 were conserved; this corresponds to 79% of the total residues.
4. We built four 3D models of the 18 phospholipases Ves s 1, Sol i 1, Culex quinquefasciatus and Centruroides hentzi… are you referring to "18" or "4" proteins?

General comments:
The manuscript by Emiliani et al. reports an in silico analysis of cross-reactivity among phospholipases from Hymenoptera species. Four groups or clades are identified and further analyzed by sequence alignments and epitope prediction. However, the potential IgE cross-reactivity between pairs of proteins within and between different clades is not clear from the analysis. The discussion needs to be revised to better explain the results in relation to clinical observations reported in the literature.
First paragraph of results: From Table 1 it seems that 15 allergenic and 3 non-allergenic sequences were selected. The way it is phrased, it seems as if 18 allergenic sequences were selected. The numbers of species do not add up to 18 (5 bees, 6 wasps, 3 ants, 3 other = 17). Are yellow jackets and hornets considered wasps? If so, there should be 8 wasps and 4 bees.
Table. "has a score is" needs correction.
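Maybe an overlay of a representative molecule from each of the two groups showing the areas that are shared in green could be helpful in the results section to illustrate this (and blue and yellow areas in the respective molecules).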
Discussion, third paragraph: "double positivity" needs to be explained (69% at 75% is not clear).
The discussion is hard to follow, maybe in part because it is not clear from the results if cross-reactivity is expected or not among different species. It might be interesting to have a table in the results section showing the 4 groups with their proteins (in first row and first column) and indicating if clinical cross-reactivity has been observed (also CCD presence or absence if it applies), next to expected cross-reactivities from the results.

Minor comments:
○ English grammar needs revision. For example, in Abstract: "phospholipase allergens", "18 amino acid sequences", "shared an amino acid sequence identity".
○ Some terms need explanation for the readers to understand (for example: "empty spaces", "length of the branches", "nodes" in first paragraph, page 4).
○ Page 3, second column, 3rd paragraph, line 10: sentence with "areas" is vague. Does "area" mean epitopes or something else? Line 11: "Various studies have been carried out…"
○ Page 5, first paragraph: P. paulista. Other species names should be in italics as well all over the manuscript (i.e. Legend to Figure 6).

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
conformational epitopes in the second version.
4. Legend to figure 2: Sol i 1 (not Sol I 1). "the Vespid basal is sequence" does not seem correct.
Answer: The name of the antigen, Sol i 1, was corrected both in the legend and in paragraph 6.
5. Page 5, 3rd paragraph: how can 704 residues be conserved among 337 residues per sequence (in the alignment of Figure 3)? The authors should explain where the number 704 comes from. Similarly, the same applies to 259 and 1916 residues from sequences that are shorter in Figures 4 and 5, respectively. Usually, percent identity between two sequences is a better way to express homology than stating the total number of residues conserved across all the sequences (this is what the authors seem to be presenting). A better way for the authors to show homologies among several sequences is a Percent Identity Matrix, which shows all the percentages of identity between pairs of proteins.
Answer: The PRALINE tool used for the alignment reports the number of residues taken into account in the analysis; we calculated how many of those residues were conserved across the analyzed sequences. Thus, in figure 3, 934 residues were taken into account, of which 704 were conserved; this corresponds to 79% of the total residues, and the same applies to the 259 and 1916 residues of figures 4 and 5. The percent identity matrix is included in the second version.
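For readers unfamiliar with the percent identity matrix requested in comment 5, the following is a minimal sketch of how such a matrix can be computed from sequences that are already aligned. It is not the authors' PRALINE workflow: the sequence names and the toy alignment are hypothetical, and the denominator convention (only columns where both sequences have a residue) is an assumption, since several conventions are in use.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the shorter sequence length) also exist.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def identity_matrix(alignment):
    """Build a symmetric {name: {name: percent identity}} table."""
    names = list(alignment)
    matrix = {n: {n: 100.0} for n in names}
    for p, q in combinations(names, 2):
        pid = percent_identity(alignment[p], alignment[q])
        matrix[p][q] = pid
        matrix[q][p] = pid
    return matrix


# Hypothetical toy alignment, not real phospholipase sequences.
toy_alignment = {
    "seq_A": "MKT-LLVA",
    "seq_B": "MKTALLIA",
    "seq_C": "MRT-LFVA",
}

if __name__ == "__main__":
    table = identity_matrix(toy_alignment)
    for name, row in table.items():
        print(name, {other: round(value, 1) for other, value in row.items()})
```

Each cell reports identity for one pair of sequences, so the values never exceed 100% and do not depend on how many sequences are in the alignment, which is the property the reviewer asks for.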
Answer: Corrected.
7. Linear and continuous epitopes are the same. In page 8, first paragraph: do the authors mean to use linear and continuous in the same line?
Answer: Yes, they are the same; for illustrative purposes, the term "linear epitope" is used in the second version.
8. Discussion, end of paragraph 2: Do groups A and D share antigenic areas? (what are "affected residues"?). If so, the explanation is not clear, and not shown in the results.
Maybe an overlay of a representative molecule from each of the two groups showing the areas that are shared in green could be helpful in the results section to illustrate this (and blue and yellow areas in the respective molecules).
Answer: In our study, we did not find antigenic areas directly related to cross-reactivity between group A and D proteins. However, the study by Hoffman et al. (2005) showed that there are levels of identity between the proteins of these groups, and they also include Sol i 1, found in group C. Therefore, we used this information to compare the results found.
Answer: "Double positivity" refers to possible cross-reactivities caused by the recognition of different substances contained in the analyzed insects. For example, some bees and wasps present double cross-reactivity, given by the cross-reactive carbohydrate determinants (CCD) and by phospholipases; because these are different substances but both are recognized in the insects, this is called "double positivity". The percentages show that 69 to 75% of the tests performed for bee venom and yellow jacket venom give double positivity due to the presence of CCD and phospholipases, which complicates allergy diagnosis.
10. The discussion is hard to follow, maybe in part because it is not clear from the results if cross-reactivity is expected or not among different species. It might be interesting to have a table in the results section showing the 4 groups with their proteins (in first row and first column) and indicating if clinical cross-reactivity has been observed (also CCD presence or absence if it applies), next to expected cross-reactivities from the results.
Answer: Thanks for the suggestion; we will take it into account in reviewing the results and discussion, in order to clarify the central ideas and make the information clearer for the reader. We have made the suggested table for clarity.
Answer: English grammar will be reviewed.
2. Some terms need an explanation for readers to understand (for example: "empty spaces", "length of branches", "nodes" in the first paragraph, page 4).
Answer: -The empty spaces are areas of different sequences where there is no relationship between their amino acids.
-The branch length or branch distance refers to the relationship between the allergens shown in the tree. The longer the branch, the less closely related the allergens; the shorter the branch, the closer the relationship.
-"Nodes" is used as a synonym for the groups or clades that make up the phylogenetic tree.
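As an illustration of the branch-length explanation above, the sketch below sums branch lengths along the path connecting two leaves of a tree (often called the patristic distance). The tree, its labels and its branch lengths are hypothetical and are not taken from the manuscript's phylogeny.

```python
# Hypothetical toy tree: child -> (parent, branch length to parent).
TOY_TREE = {
    "leaf_1": ("node_X", 0.25),
    "leaf_2": ("node_X", 0.5),
    "node_X": ("root", 0.5),
    "leaf_3": ("root", 2.0),
}


def path_to_root(tree, name):
    """Return {ancestor: cumulative distance from `name`}, including `name` itself."""
    distances = {name: 0.0}
    total = 0.0
    while name in tree:
        parent, length = tree[name]
        total += length
        distances[parent] = total
        name = parent
    return distances


def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path connecting a and b.

    The path passes through the closest shared ancestor, which is the
    shared ancestor that minimizes the combined distance.
    """
    from_a = path_to_root(tree, a)
    from_b = path_to_root(tree, b)
    shared = set(from_a) & set(from_b)
    return min(from_a[n] + from_b[n] for n in shared)


if __name__ == "__main__":
    print(patristic_distance(TOY_TREE, "leaf_1", "leaf_2"))
    print(patristic_distance(TOY_TREE, "leaf_1", "leaf_3"))
```

In this toy tree, leaf_1 and leaf_2 sit on short branches under the same node and are therefore closer to each other than either is to leaf_3, which sits on a long branch.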
Answer: "sensitization to this allergen" was changed to "sensitization to hymenoptera allergens" because the central idea revolves around the diversity of allergens of this order.
4. Page 3, second column, third paragraph, line 10: the sentence with "areas" is vague. Does "area" mean epitopes or something else? Line 11: "Several studies have been carried out ..."
Answer: The term was changed to "conserved regions", which refers to epitopes (the regions causing sensitization). The statement on line 11 is there to show that other valid studies have been done.
5. Page 5, first paragraph: P. paulista. Names of other species should also be italicized throughout the manuscript (i.e., Legend to Figure 6).
Answer: We italicized the names that were missing.
Legends in Figures 3, 4 and 5 need correction (highly conserved sequences identified, conserved in the middle).
Answer: Conserved between the sequence with high identity
8. The verb "Consistent" can be removed from the legend to the figure
Answer: The verb was removed.

Competing Interests:
No competing interests were disclosed.