New insights about the serine/threonine protein kinase substrates from Mycobacterium tuberculosis using molecular docking, quantum similarity analysis and DFT calculations [version 1; peer review: 1 approved]

Background: The protein kinases present in the human body have received a lot of attention because of the interest in their use as therapeutic targets. However, little is known about the protein kinases associated with tuberculosis. For these reasons, this research investigates a new point of view regarding the crystallized serine/threonine protein kinases Pkn A, B and G of Mycobacterium tuberculosis. Methods: The conformational analysis shows a DFG-in motif in Pkn B and G and a DFG-out motif in Pkn A. For all the protein kinases that have been studied, the gatekeeper residue is methionine. A study of the protein kinases with their ligands was also conducted to find new insights on the binding site with a series of ligands associated to protein kinases Pkn A, B and G through molecular docking. The residues with hydrogen bonds on the hinge zone of Pkn A are GLU96 and VAL 98, of Pkn B are GLU 93 and VAL 95 and of Pkn G are GLU233 and VAL235. Results: The results show the H-bond acceptor and H-bond donor sites on the hinge zone to all ligands, establishing a structural model of the ligands on the active site with two or three interactions in this zone. This interaction model was validated using density functional theory calculations (by means of net charges and images of the electrostatic potential) and molecular quantum similarity analysis, showing a high correlation between the electronic and steric effects in each ATP complex studied. Conclusions: In this work we can see that the interactions of the hinge zone are characterized by the key factor of one or two H-bonds acceptors and one H-bond donor in the ligands of this zone. The quantum similarity analysis shows good correlation between the Open Peer Review


Introduction
Protein kinases (PKs) are a key in controlling proliferation and differentiation in eukaryotic cells in living organisms. They are enzymes that catalyze the protein phosphorylation process. An important reason to analyse the protein phosphorylation process is that it represents an attractive drug target in a variety of fatal diseases such as cancer 1 . The PKs present in the human body have been widely used in the field of drug design because they play an important role as therapeutic targets in these diseases. However, little is known about the serine/ threonine protein kinases (STPKs) involved in tuberculosis. For this reason, we propose a study about crystalized STPKs Pkn A, B and G, to gain new insights in the mechanism of the phosphorylation process.
To date, one of the main questions in the tuberculosis field is 'what are the protein substrates of the STPKs?' According to Av-Gay et al. 2 , to deal with this topic we need to study the interactions of crystalized STPKs Pkn A, B and G with a particular series of ligands to each PK. The competitive ATP ligands used are a series of compounds for Pkn A reported by Sipos et al. 3 , ligands for Pkn B reported by Székely et al. 4 , Loughheed et al. 5 and Chapman et al. 6 and ligands for Pkn G reported by Sipos et al. 3 and Pato et al. 7 . These ligands were selected with the goal of finding new insights into the inhibition of oxidative phosphorylation. Therefore, we are researching new pharmacological alternatives for this disease through the structural models of the ligands on the ATP sites.
Another aim of this study is to conduct molecular quantum similarity (MQS) analysis using a quantum similarity field to understand the electronic and structural patterns involved in stabilization of the active site of the inhibitors for each PK studied. Therefore, a hybrid quantum mechanics/molecular mechanics (QM/MM) approach was used to carry out a structure-activity relationship analysis of the inhibitor set for Pkn A, Pkn B and Pkn G.
Protonation states were identified using PropKa utility at a physiological pH 12 . After, restrained molecular minimization using the Impact Refinement (Impref) module 13 was carried out, taking into account the heavy atoms restrained within a rootmean-square deviation (RMSD) of 0.18 Å from the initial coordinates. The main characteristics of Pkn A and B consist of a transmembrane receptor with a tyrosine kinase domain protruding into the cytoplasm. For Pkn G, the unique multidomain topology of this PK reveals a central kinase domain that is flanked by N-and C-terminal rubredoxin and tetratrico-peptide repeat domains, respectively 13,14 .
The molecular datasets used for the docking study were reported by Sipos et al. 4 , Székely et al. 5 , Lougheed et al. 6 , Chapman et al. 7 and finally Pato et al. 8 (see Table 1- Table 6).
To select this molecular dataset, the structural diversity and uniform distribution of IC 50 was taken into account. Logarithmic IC 50 (μM) (pIC 50 =−log IC 50 ) was employed as the dependent variable instead of IC 50 . The 3D molecular structures of the compounds were drawn and optimized using Maestro 15 and then prepared with the LigPrep module 16 , where ionization/ tautomeric states were predicted at physiological pH conditions   Docking studies were performed with the software Glide 18 using standard precision (SP) mode and default parameters. The open-source software AutoDock Vina could also be used to obtain similar results. The docking grid was generated with standard protocol centered at the co-crystallized ligand. Due to that, the scaling factor for the van der Waals interactions is very important it was taken as 0.8. The radii of nonpolar protein atoms were selected to allow the binding of larger ligands. Further, the induced fit docking (IFD) workflow 19 was also used to generate a good number of conformations of the receptor suitable for binding ligands that had strange orientations. It allows the protein to undergo side-chain movements, backbone movements, or both, upon ligand docking. The docking procedure has four steps where accuracy is ensured by Glide's scoring function and Prime's: (i) The first step for Glide docking is carry out into the rigid receptor to generate a good ensemble of poses.
(ii) Analysis of the protein using Prime 21 . Taking into account side-chain prediction module followed by a structure energy minimization of each protein/ligand complex pose.
(iii) Analysis through re-docking of the ligand into low energy induced-fit structures taking in account the previous step using default Glide settings (no vdW scaling).
(iv) Finally, the binding energy is estimated using (IFDScore) by accounting for the docking energy (GScore), as well as receptor strain and some solvation terms (Prime energy).
To obtain the docking outcomes, the dynamics simulation of 100ns in vacuo using the GROMOS11 force field set was used Compounds to Pkn G reported by Pato et al. 8 on the protein-ligand complex, using Gromacs 5.1.2 20 to analyze the stability of the protein-ligand complexes 22 .

MQS analysis
In recent studies, MQS methodology has shown good results in understanding steric and electronic effects of ligands 23,24 .
The MQS field was introduced by Carbó and co-workers 30 years ago [25][26][27][28][29][30] and has been used to understand the steric and electronic effect on the molecular sets. The mathematic relationship to obtain quantum similarity Z AB between two compounds A and B, with electron density ρ A (r 1 ) and ρ B (r 2 ) is defined using the minimization of the expression for the Euclidean distance as: ( ) ( ) Therefore, the Euclidean distance is the square root of the integral on the squared differences between the densities ρ A (r 1 ) and ρ B (r 2 ). In this context, Z AB represents the similarity measure between the electronic densities of the compounds A and B, and Z AA and Z BB are the self-similarity of compounds A and B 31 . The similarity index very often used is the Carbó index 31 . This index is expressed as follows: In Equation 2, (Ω) represents an operator. A simple way to make quantum similarity measures (QSM) [29][30][31] involving two density functions, in the most usual form, is expressed as an integral: In this equation ρ A (r 1 ) and ρ B (r 2 ) are the density functions to quantum objects A and B, while Ω(r 1 ,r 2 ) is a positive operator that defines weight. The Dirac delta function: δ(r 1r 2 ) is used to obtain the overlap similarity measure and to measure the Coulomb similarity the Coulomb operator: |r 1r 2 | -1 is used. These operators are the most useful for obtaining similarity comparisons between ligands.

Molecular alignment
An important factor when carrying out the QSM is an optimal molecular alignment. As the integrals associated to the QSM produce, in any case, real positive results, the relative position problem can be approached using a maximal QSM. An overlap QSM can be expressed according to the equation: In this equation, it is implicitly supposed that ρ B (r) is translated and rotated by the six possible ways: (T;φ) and are shown as explicit parameters in this integral 31 . This key factor was applied using the topo geometrical superposition algorithm (TGSA) 31 .

Density functional theory (DFT) calculations
The calculations of this study were performed with the GAUSSIAN 09 program 32 . The open-source software AutoDock Vina could also be used to obtain similar results. The structures were optimized using the B3LYP functional (DFT exchange-correlation method) 33,34 and the 6-31G(d,p) basis set 35 . The contour lines of the molecular electrostatic potential have been obtained with the GaussView program 36 .

Results and discussion
Pkn A, B and G: docking of their ATP-competitive inhibitors The conformational analysis of the STPKs defines the active and inactive form. These forms are also known as the DFG-in and DFG-out states. They are distinguishable by a flip of the Asp and Phe orientation in the activation loop as well as conformational changes in the surrounding loops 37 . The main difference between DFG-in and DFG-out conformations is that the different position of the DFG residue in the "out" form results in the opening of an additional cavity, the allosteric site, which can be hydrophobic in nature according to Zuccotto et al. 38 . In Pkn A the Asp side chain is buried and the Phe side chain is flipped outside the pocket. Therefore, Pkn A becomes DFGout. However, in the STPKs Pkn B1 (crystal: 1MRU), Pkn B2 (crystal: 2FUM) and Pkn B3 (crystal: 3F69), the Asp side chain is exposed and the Phe side chain is buried (DFG-in), allowing ATP binding. Pkn G has DLG (Asp-Leu-Gly) instead of DFG (see Figure 1); the conformation is DLG-in. An important residue of STPKs is the gatekeeper. Their size determines the access of small, medium and large inhibitors to the back cavity even from the DFG-in state. In these crystallized PKs, the gatekeeper residues are MET95 for Pkn A, MET92 for Pkn B1 (crystal: 3F69), Pkn B2 (crystal: 1MRU), Pkn B3 (crystal: 2FUM), and MET232 for Pkn G. The methionine is a medium gatekeeper residue, according to Zuccotto et al. 38 and allows access to small, medium and large inhibitors. This characteristic is important because it increases the range of molecular diversity in the development of new inhibitors.
In Figure 1D, 1E and 1F we can see the hydrogen-bondacceptor maps in red mesh. Pkn A shows a small red contour on the hinge zone ( Figure 1D), a medium contour near to the DFG motif and a large contour near to the helix-αC and C-terminal loop. This is unlike Pkn B, which, using the crystal 3F69 (Figure 1E), shows only acceptor maps near to the DFG motif and in the C-terminal loop. Pkn G ( Figure 1F) has large acceptor contours involving the hinge zone, DLG zone, helix-αC and C-terminal loop. Figure 1G, 1H and 1I shows the hydrogen-bond donor maps in blue mesh. In Pkn A ( Figure 1G) we can see medium contours to hydrogen-bond donor on the hinge zone and C-terminal loop and a large contour near the DFG motif. In contrast, Pkn B ( Figure 1H) shows only a large contour near to the DFG motif. Pkn G ( Figure 1I) has small contours to hydrogen-bond donors along the hinge zone and near the DLG motif, helix-αC, C-terminal and N-terminal loops.
Analyses of hydrophobicity are shown in Figure 1J, 1K and 1L. Figure 1J shows a large contour of hydrophobicity for Pkn A near to the hinge zone and small contours on the C-terminal loop. Pkn B ( Figure 1K) has only a small contour on the C-terminal loop. In contrast, Pkn G ( Figure 1J) shows a large contour near the hinge zone, DLG motif and small contours near to the helix-αC and C-terminal loop. These physicalchemistry properties are important to describe the non-covalent stabilization on the active sites and selectivity of the ligands in the active pocket.  forces. Further, some of the higher scoring ligands forming hydrogen bonds and aromatic-aromatic interactions with the amino acid residues are related to the hinge zone. Figure 2 shows the docking interactions of the Pkn A inhibitors from Table 1. These compounds have groups with electron withdrawing resonance effects such as nitro, amine, carbonyl groups and electron donating groups such as methylene.
In Figure 2A we can see the hydrogen bonds on the hinge zone to compound 1 ( Table 1). This compound has two interactions on the hinge zone with the residues Glu96, Val98 of 2.29 Å and 2.54 Å, respectively. Another hydrogen bond can be formed with the residue Asn104 of 1.98 Å. Compounds 2, 3 and 4, see Figure 2B, have differences only in the substituent group, in the case of 3, this is a pyrrole ring and in 4, is the dimethylamine group. In this series, the most active compound is 2, with pIC 50 -1.839. These compounds show two hydrogen bonds on the hinge zone with residues Glu96 and Val98 of 1.31 Å and 2.30 Å to 2, 1.71 Å and 2.35 Å to 3 and 1.81 Å and 2.21 Å to 4. Another hydrogen bond can also be formed with the residue Gly145 of 2.22 Å to 2, 1.81 Å to 3 and 1.95 Å to 4. From these results, we can see the key factor to the stabilization of the active site in this ligand set, with the main characteristic of one H-bond acceptor and H-bond donor in this zone. Figure 3 (left) and Figure 4 show, respectively, the atomic charges calculated by the CHelpG population and electrostatic potential analyses of compound 1 in some areas of special interest. The net charges obtained for H22, O24, O30 and O31 are consistent with the formation of hydrogen bonds shown in Figure 2A. The electrostatic potential surfaces shown in Figure 4 also justify the formation of hydrogen bonds with Glu96, Val98 and Asn104. Figure 3 (right) and Figure 5 show, respectively, the atomic charges and electrostatic potential of compound 4. The atomic charges of H27, H42 and H15 justify very well the formation of hydrogen bonds with Glu96, Val98 and Asn104 that can be seen in Figure 2B. An equivalent conclusion can be found from the electrostatic potential surfaces shown in Figure 5.

Pkn B: Descriptions of the structural characteristics of ATP binding sites and docking of inhibitors
The compounds to Pkn B have groups with electron withdrawing resonance effects, such as carbonyl, hydroxyl, amine and methoxy groups, with big rings central to its stabilization on the active site. Figure 6 shows the docking results of the Pkn B inhibitors from Table 2-Table 5, grouped by series, taking into account their functional groups and structural similarities. Figure 6A shows compound 5, which has interactions in the hinge zone with the residues Glu93 (2.63Å) and Val95 (-CO: 2.46Å, -NH: 1.99Å, 2.16Å), see Table 7. Compound 5 has dual effects, interacting with Pkn B (pIC 50 : -0.585) and Pkn G (pIC 50 : 1.301), see Table 2. Figure 6B shows the interaction with compound 6, which has interactions with the residue Val95 (-CO: 1.31Å, -NH: 2.48Å). In these interactions the -NH is H-bond donor and the -CO is H-bond acceptor; these interactions can be related to the pIC 50 of 0.569 for this compound. Figure 6C shows the interaction in the hinge zone to of compounds 7, 8 and 9. These compounds have two interactions in the hinge zone with the residue Val95. The most active compound of this series is 8 (pIC 50 : 0.585) with its interactions with the residue Val95 (-CO: 1.80Å, -NH: 2.15Å), see Table 7. Figure 6D shows the interactions of the compound 10; this compound has two interactions in the hinge zone with the residue Val95 (-NH: 2.33Å, -C=O: 2.09Å). Another H-bond happens with the residues Asp96 (1.71Å) Asp102 (-C=O: 2.42Å, -OH: 2.10Å) and Lys140 (2.01Å). The interactions with the two guanidine groups are through the aspartates. This is very interesting because these kinds of interactions could differentiate one kinase from another. Figure 6E shows the interactions in the hinge zone of the compounds 11, 12, 17, 18, 19, 20 and 23-53. This series of compounds have three H-bonds in the hinge zone with the residues Glu93 and Val95. The most active compound from this series is 52 (pIC 50 : 1.638). It has  Table 2-Table 5. interactions with residues Glu93 (2.21Å) and Val95 (-CO: 2.33Å, -NH: 1.85Å), see Table 7. In this series there are two -CO H bond acceptors and one -NH H bond acceptor. Figure 6F shows the interactions in the hinge zone of the compounds 13 and 22; these compounds have two interactions in the hinge zone with the residue Val95 (-CO: 2.39Å, -NH: 2.42Å for 13 and -CO: 2.27Å, -NH: 1.90Å for 22), see Table 7. The most active compound is 22 (pIC 50 : 1.187). Finally, Figure 6G shows the interactions in the hinge zone of the compounds 14, 15, 16, 21 and 54-57. This series of compounds also has three interactions in the hinge zone with the residues Glu93 and Val95. The most active compound in this series is 57 (pIC 50 : 1.398), see Table 5.
Taking in account the results from Table 7, the key factor to the stabilization of the active site of these ligands is one or two -CO (H bond acceptor) and one -NH (H bond acceptor) groups in this zone. Figure 7 shows the atomic charges calculated with the CHelpG population analysis (upper left) and the electrostatic potential (upper right and down) of compound 5 in some areas of special interest. The net charges obtained for H11, O13, H21 and H22 are consistent with the formation of hydrogen bonds shown in Figure 6A. The electrostatic potential of surfaces shown in Figure 7 also justify the formation of hydrogen bonds with Glu93 and Val95. Figure 8 shows the atomic charges calculated by CHelpG population analysis (left) and the electrostatic potential (right) of compound 6 in some areas of special interest. The net charges obtained for H40 and O41 are consistent with the formation of hydrogen bonds shown in Figure 6B. The electrostatic potential of surfaces shown in Figure 8 also justify the formation of hydrogen bonds with Val95. Figure 9 shows the atomic charges calculated by CHelpG population analysis (left) and the electrostatic potential (right) of compound 8 in some areas of special interest. The net charges obtained for N9 and H13 are consistent with the formation of hydrogen bonds shown in Figure 6C. The  electrostatic potential of surfaces shown in Figure 9 also justify the formation of hydrogen bonds with Val95. Figure 10 shows the atomic charges calculated by CHelpG population analysis (left) and the electrostatic potential (upper  right and lower) of compound 10 in some areas of special interest. The net charges obtained for O29, N31, N16, H64 and H76 are consistent with the formation of hydrogen bonds shown in Figure 6D. The electrostatic potential of surfaces shown in Figure 10 also justify the formation of hydrogen bonds with Val95 (O29 and N31), Lys140 (N16) and Asp102 (H64 and H76). Figure 11 shows the electrostatic potential generated by compound 13 in some areas of special interest. The electrostatic potential of surfaces shown in Figure 11 are consistent with the interactions shown in Figure 6F and justify the formation of hydrogen bonds of H34 and N36 with Val95. Figure 12 shows the atomic charges calculated with the CHelpG population analysis (left) and the electrostatic potential (right) of compound 15. The net charges obtained for H31, H38 and N39 are consistent with the formation of hydrogen bonds shown in Figure 3G. The electrostatic potential of surfaces shown in Figure 12 also justify the formation of hydrogen bonds with Glu93 (H38) and Val95 (H31 and N39). Pkn G: Descriptions of the structural characteristics of ATP binding sites and docking of inhibitors The Pkn G inhibitors from Table 6 have a common structure and three interactions in the hinge zone with the residues Glu233 and Val 235 (see Figure 13).
The most active compound of this series is 66 (pIC 50 : 2.000). This compound has interactions with the residue Glu93 (2.59Å) and Val95 (-CO: 2.24Å, -NH: 2.17Å), see Table 8.  Figure 14 shows the atomic charges calculated by CHelpG population analysis (left) and the electrostatic potential (right) of compound 58. The net charges obtained for H16, O11 and H14 are consistent with the formation of the hydrogen bonds shown in Figure 13. The electrostatic potential of surfaces shown in Figure 14 also justify the formation of hydrogen bonds with Glu233 (H16) and Val235 (O11 and H14).

MQS analysis
The MQS analysis was conducted to understand the correlation between the ATP ligands associated to each kinase studied and the stabilization process in the active site. This correlation allows postulation of the structural models of the ATP complexes.
For the Pkn A ligands (see Figure 15), the correlation between the steric and electronic effects is R 2 =0,9487 using the coulomb and overlap operators to determine electronic and steric effects, respectively. This correlation shows the nature of the stabilization of these ligands on the active site and its non-covalent character. Figure 16 shows the correlation of Pkn B ligands (see Table 2- Table 5). This correlation shows the electronic and steric effects along the ligand set associated with the active site. The high values in the Carbó indexes give an idea about the ATP complex generated and stabilized.
In Figure 17 is shown the correlation between the steric and electronic effects associated with the ATP complex of Pkn G (see Table 6). This complex has a correlation of R 2 =0,7357. In this complex, the steric and electronic effects also are very relevant and the stabilization in the active site has a high non-covalent character.  In Figure 15- Figure 17 a strong correlation between the two operators used to determine electronic and steric effects is shown. These effects can be related to the stabilization process and non-covalent interactions. In this way, the quantum   similarity analysis allows us study the ATP complex and understand the steric and electronic relationship along the ligand set.

Conclusions
Inadequate dosing and incomplete treatment regimens, coupled with the ability of the tuberculosis bacilli to cause latent infections that are tolerant of currently used drugs, have fueled the rise of multi-drug resistant tuberculosis. To deal with this problem, there is currently an increased interest in kinase inhibitors with the main aim of proposing new anti-tuberculosis drugs. For these reasons, in this work we conducted a study of the crystalized STPKs Pkn A, B1, B2, B3 and G. The conformational analysis shows the DFG-in state of Pkn B, G, and DFG-out state of Pkn A. Methionine is the gatekeeper residue in all the kinases studied.
To understand the interactions of the active site of the protein kinases from Mycobacterium tuberculosis, a docking study using the protein kinases Pkn A, B and G was conducted. In these protein kinases, the C-term domain is extracellular and N-term is cytosolic. The cytosolic domain is almost the same for all the kinases and the extracellular domain is different. Ligands bind to the kinase through the formation of two to three hydrogen bonds with the backbone residues of the hinge region connecting the kinase N-and C-terminal loops. The residues with hydrogen bonds on the hinge zone of Pkn A are GLU96 and VAL 98, of Pkn B are GLU 93 and VAL 95, and finally of Pkn G are GLU233 and VAL235. From these results, it is possible to see that the interactions of the hinge zone are characterized by the key factor of one or two H-bonds acceptors and one H-bond donor in the ligands of this zone. On the other hand, the DFT calculations performed with some of the most representative compounds have supported the previous conclusions. Finally, the quantum similarity analysis shows good correlation between the steric and electronic effects in each ATP complex. In this sense, the outcomes presented in this work can be used to develop new protein substrates of the STPKs and find new tuberculosis treatments.

Data availability
All data underlying the results are available as part of the article and no additional source data are required.