Interaction of N-3-oxododecanoyl homoserine lactone with transcriptional regulator LasR of Pseudomonas aeruginosa: Insights from molecular docking and dynamics simulations

Background In 2017 World Health Organization announced the list of the most dangerous superbugs and among them is Pseudomonas aeruginosa, which is an antibiotic resistant opportunistic human pathogen as well as one of the ‘SKAPE’ pathogens. The central problem is that it affects patients suffering from AIDS, cystic fibrosis, cancer, burn victims etc. P. aeruginosa creates and inhabits surface-associated biofilms. Biofilms increase resistance to antibiotics and host immune responses, because of those current treatments are not effective. It is imperative to find new antibacterial treatment strategies against P. aeruginosa, but detailed molecular properties of the LasR protein are not clearly known to date. In the present study, we tried to analyse the molecular properties of the LasR protein as well as the mode of its interactions with autoinducer (AI) the N-3-oxododecanoyl homoserine lactone (3-0-C12-HSL). Results We performed docking and molecular dynamics (MD) simulations of the LasR protein of P. aeruginosa with the 3-0-C12-HSL ligand. We assessed the conformational changes of the interaction and analysed the molecular details of the binding of the 3-0-C12-HSL with LasR. A new interaction site of the 3-0-C12-HSL with LasR protein was found, which involves interaction with conservative residues from ligand binding domain (LBD), beta turns in the short linker region (SLR) and DNA binding domain (DBD). It will be referenced as the LBD-SLR-DBD bridge interaction or “the bridge”. We have also performed LasR monomer protein docking and found a new form of dimerization. Conclusions This study may offer new insights for future experimental studies to detect the interaction of the autoinducer with “the bridge” of LasR protein and a new interaction site for drug design.


Introduction
Pseudomonas aeruginosa (P. aeruginosa) is a Gram-negative, monoflagellated, obligatory aerobic, Bacillus of the family Pseudomonadaceae and was first introduced at the end of the nineteenth century (1,2). It has been found in environments such as soil, water, in human bodies, in animals, sewage, plants, and hospitals (3). P. aeruginosa is an opportunistic human pathogen, because it rarely infects healthy persons. The central problem is that it infects patients who are immunologically compromised, like those suffering from AIDS, cystic fibrosis, cancer or severe wound from burning (2,4). This organism is a major issue and it is known to cause about 40-60% deaths in hospitals. About 90% of cystic fibrosis deaths are considered to be due to the secondary infection by this pathogen (5). The pathogenicity of P. aeruginosa occurs due to the synthesis of virulence factors like proteases, hemolysins, exotoxin A, production of antibiotic pyocyanin, Hydrogen Cyanide (HCN), Type II secretion system, ramnolipids and biofilm formation by this organism (6). Biofilm formation is characteristic to nearly all bacteria where During colonization and invasion, both the pathogen and the host are exposed to molecules released by each other like bacterial AIs or hosts stress hormones and cytokines. The mechanisms and receptors that might be involved in cross-talk between microbial pathogens and their hosts are not well described to date. The activity of the LasR protein is regulated by the AI ligand N-3-Oxododecanoyl Homoserine Lactone (3-O-C12-HSL) which when binds to the LasR protein helps the protein to dimerize. There remains a need for understanding of the LasR monomer, because till date there is no molecular detail information about LasR monomer behaviour. Therefore, in this work we tried to analyze the mechanism of interactions of the 3-O-C12-HSL with LasR protein. So far this is the first report that shows that the 3-O-C12-HSL can interact as well with the beta turns in the viccinity of DNA binding domain. This study may therefore be used to develop new therapeutics to prevent the infections caused by P. aeruginosa by targeting both the LBD as well as the beta turns in the viccinity of DBD and inhibit activation of genes.

Sequence analysis of LasR
The methodology was based on a previous study (1). The Amino acid sequence of LasR protein of P. aeruginosa was obtained from UniprotKB (Sequence ID P25084). The crystal structure of the N-terminal LBD of LasR protein (amino acid residues 7 to 169) from P. aeruginosa was available in the Brookhaven Protein Data Bank (PDB) (PDB ID: 3IX3, chain -A). However, in order to have a better understanding of LasR protein properties, the structure of the entire LasR protein was needed.
Therefore, the amino acid sequence was used to search in the PDB databank to find suitable template(s) for homology modeling using BLAST (15). From this BLAST search, the best match was found to be the crystal structure of QscR, bound to 3-O-C12-HSL (PDB code 3SZT, chain -A) (16) with 30% sequence identity, 97% query coverage and E-value of 1 × 10-25. However, the other similar structures which were obtained from BLAST search results were as follows: 1. 1FSE, chain-A with 37% sequence identity, 28% query coverage and E-value of 2 × 10−4. It is the crystal structure of Bascillus subtilis regulatory protein gene.
3. 3ULQ, chain-B with 32% sequence identity, 25% query coverage and E-value of 0.09. It is the crystal structure of the Anti Activator Rapf complexed with the response regulator coma DNA binding domain.

Modeling of LasR monomer
The protein LasR is a 239 residue long protein. The homology modeling of LasR protein was done using the HHPred web server (17). The HHPred web server uses a multi-template modeling approach and the templates, used for the homology modeling of LasR, are 3SZT_A, 3IX3_A, 1FSE_A and 3ULQ_B. These structures were also identified by BLAST (16) to be the suitable templates for homology modeling of LasR protein. The final model of LasR protein spanning the amino acid residues from 1 to 6 and 170 to 239 was built using the aforementioned templates. The root mean squared deviation (RMSD) of the built model was checked by superposing the coordinates of the modeled protein on to the template 3SZT, chain -A and it was found to be 1.512 Ǻ. The stereo-chemical fitness of the modeled LasR protein was checked by PROCHECK (18) (Figure S1,S2 of the Supporting Information), Verify3D (19) and Ramachandran Plots. The structure was found to be stereo-chemically fit.More than 96.65% of the amino acid residues in the final modeled structure had a 3D-2D score >0.2 as specified by Verify3D( Figure S3 of the Supporting Information) for a good computational model (Eisenberg et al., 1997). No amino acid residues were found to be in the disallowed regions of the Ramachandran Plot ( Figure S1 of the Supporting Information). The PROSA value of this model was -6.69 which represents a good model quality for homology models (20), although the obtained model is slightly different (1), it is still viable.

Building the model of the ligand
The LasR protein is known to exert its function when it gets attached to the ligand, 3-O-C12-HSL which is the AI (Figure 1). Thus to study the mechanism of the ligand binding interactions with LasR, the three dimensional model of the ligand was generated. The model was obtained from PubChem web-server (21).
The ligand parameters were calculated for the General Amber Force Field (22) by using the acpype tool (23) with AM1-BCC partial charges (24).

Molecular Dynamics Simulations of LasR protein systems.
MD simulations of all systems were conducted with the GROMACS suite, version 5.1.2 (25), utilizing the Amber ff99SB-ILDN force field (26).
In all cases, Short-range nonbonded interactions were cut off at 1.4 nm, with long-range electrostatics calculated using the particle mesh Ewald (PME) algorithm. (27,28) Dispersion correction was applied to energy and pressure terms to account for truncation of van der Waals terms. Periodic boundary conditions were applied in all directions. Set 1. LasR monomer simulation. In order to generate the starting structure of LasR monomer before docking, it was placed in a dodecahedron box of TIP3P water (29),to which 100 mM NaCl was added, including neutralizing counterions. Following two steepest descents minimization, LasR monomer was equilibrated in two steps, with position restraints applied to peptide heavy atoms throughout. The first phase involved simulating for 200 ps under a constant volume (NVT) ensemble. Protein and nonprotein atoms were coupled to separate temperature coupling baths, and temperature was maintained at 300 K using the Berendsen weak coupling method(31.).Following NVT equilibration, 200 ps of constant-pressure (NPT) equilibration were performed, also using weak coupling (31) to maintain pressure isotropically at 1.0 bar. Production MD simulations were conducted for 100 ns in the absence of any restraints. For this data collection period, the V-rescale(32) was used to maintain temperature, and the Parrinello-Rahman barostat (33,34) was used to isotropically regulate pressure. This combination of thermostat and barostat ensures that a true NPT ensemble is sampled. (32) Set 2. Molecular dynamics simulations using docked structures. It was shown that docking has its limitations (35). For this reason after finishing molecular docking simulations we chose the structures of LasR bound to 3-O-C12-HSL to perform molecular dynamics (MD) simulations. A time step of 2 fs was used during heating and the production run; coordinates were recorded every 1 ps. A 300 ns MD trajectory for each system was obtained under constant pressure at 300 K. Further details concerning the model building and simulation protocol can be found above.

LasR-3-O-C12-HSL ligand blind docking simulations
To build the LasR-ligand complex, the autoinducer ligand 3-O-C12-HSL was docked with LasR monomer using Autodock Vina(36), based on a simple scoring function and rapid gradient-optimization conformational search. However, it should be noted that AutoDock Vina currently uses several simplifications that affect the obtained results. The most significant simplification is the use of a rigid receptor. Vina provides a parameter called 'Exhaustiveness' to change the amount of computational effort used during a docking experiment. Note that the default exhaustiveness value of the original Vina is 8 and the authors claim that it "should increase the time linearly and decrease the probability of not finding the minimum exponentially" (36), hence increasing this value leads to an exponential increase in the probability of finding the energy minima.
Principal component(PCA) (37) and cluster analysis using K-means algorithm (38) was performed ( Figure 2 and Figure S4 and Figure S5 of the Supporting Information). The results demonstrate that the number of interaction sites don't change in the interval using exhaustiveness from 1024 to 4096. Exhaustiveness value of 1024 was chosen as it provides good results and good speed.
After the exhaustiveness was chosen, 100 independent docking calculations were carried out with different initial seeds within a box of dimension that was used before. All parameters were set to default, except the exhaustiveness value, which we increased to a value of 1024 to allow thorough sampling of the docked configurations and maximum number of binding modes to generate set to 20.

Analysis of docking conformations and trajectories
We performed the analysis of docking conformations and trajectories by a combination of Autodock Vina (36), Gromacs (25) in addition to custom python scripts, which uses pandas(39), scikit-learn (40) and MDTraj (41).
Here is the list of program packages we used for analysis:  The MDTraj python library (41) was used for the trajectory analysis.
 Plot visualization was done using matplotlib (42) and Seaborn package (43). We used the following techniques for the analysis:  Principal component analysis is an unsupervised statistical technique used to emphasize variation and bring out strong patterns in a dataset. It's often used to make data easy to explore and visualize.. Developed about 100 years ago by Pearson and Hotelling (47), it remains a popular dimension reduction technique (47). The PCs describe concerted atomic displacements and can highlight major conformational changes between the structures. These motions are essential for protein function (48). The dynamics in this low-dimensional subspace -spanned by the first few PCs -was termed "essential dynamics" (49). PC analysis, performed on Cartesian coordinates has proven to be a valuable tool for studying conformational changes.
 Cluster analysis is another unsupervised technique for grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups (clusters). by minimizing intra-cluster and maximizing inter-cluster differences. Therefore, most clustering algorithms require a measure of similarity, or "distance", of objects. Clustering algorithms can be divided into partitional and hierarchical clustering approaches (50).

NMR calculations
Finally, the trajectory of LasR monomer simulation was used for the calculation of chemical shifts. Sparta+ (51) and ShiftX2 (52) were used to predict the chemical shifts of backbone atoms of the protein. In the prediction process, waters were removed from each trajectory using mdtraj's wrapper functions (41).

Protein-Protein Docking
In general, computational docking approaches strive to sample all possible interactions between two proteins to discern the biologically-relevant interaction. When docking homology models, it is best if there is an experimental evidence to suggest the general interaction site (within~10 Å). Representative structures from molecular dynamics simulations were used for protein-protein docking using Cluspro (53). From the experimental X-ray data it was found that 'Trp152', 'Lys155' and 'Asp156' from H10 play an important role during dimerization.
The distances between 'Trp152' from chain A and 'Asp156 from chain B was 0.276 nm, 'Asp156 from chain A and 'Lys155' from chain B was 0.279nm. These residues were used as attraction constraints. In its basic operation, ClusPro outputs the structures at the centers of the 10 most populated clusters

Entire flowchart
The entire methodology is presented as a flowchart for the better understanding of the reader: • Sequence of LasR taken from UniprotKB(sequence ID P25084). Protein BLAST of LasR was performed to find out suitable templates for homology modeling.
• Homology modeling done using HH-pred web server. The modeled structure was validated using Verify 3D, Procheck, Prosa.
• The structure of 3-O-C12-HSL obtained from pubchem web server.
• PCA and cluster analysis performed on docking conformations.
• Extraction of centroid conformations from cluster analysis.
• Ligand parameters generated using Acpype interface in the framework of the AMBER force field.
• Using centroid conformations as starting points for molecular dynamics simulations using Gromacs.
• Analysis of molecular dynamics trajectory files using MDTraj.

Compactization of LasR without 3-O-C12-HSL
We performed a simulation run of 100 ns using a standard MD protocol for the assesemnt of conformational changes of LasR monomer. To assess the overall stability of the molecule, the root-mean-square-deviation (RMSD) of the protein atoms was calculated with reference to the initial frame through time MD runs ( Figure S6 of the Supporting Information). Another suitable measurement for the stability of the LasR monomer structure is radius of gyration( Figure S7 of the Supporting Information).
When analyzing MD simulations, Principal Component Analysis (PCA) (47) is commonly applied to visualize the motions that the system is exploring through the simulation. Each of these motions, or Principal Components (PCs), is responsible for an amount of variance in the original data that is quantified by its associated eigenvalue.
Thus, usually the first handful of PCs are enough to capture more than 70% of the variance in the original trajectory (the exact amount depends on the original trajectory data set). Though PC analysis reveals the main motions contained in an MD trajectory, it does not partition the snapshots into distinct conformational classes (50). This can be achieved by clustering the PC data. For the selection of initial starting structure of the LasR for docking study we performed cluster analysis on the LasR monomer run for the selection of a starting point. By identifying a distinct representative structure of the most populated cluster, this will allow to perform blind docking on the whole structure. Additionally, this cluster analysis permits assess what are the recurrent conformations that LasR adopts throughout the MD runs. For the clustering analysis, we chose the Agglomerative Clustering algorithm with Ward linkage (54) as the most appropriate as they are deterministic, allowing for reproducibility of resulting clusters, thus minimizing the amount of bias.
We performed several rounds of Agglomerative clustering with Ward linkage using different parameters (details can be found in the Computational methods section). The accuracy of the different clustering rounds was assessed by the help of the Davies-Bouldin Index(DBI) (55), Silhouette Score (56), Dunn Index (57) and the pseudo-F statistic(pSF or Calinski Harabasz) (58) metrics. These four metrics were plotted vs. the cluster count that was achieved using different number of cluster quantity parameters( Figure S8 of the Supporting Information). An optimal number of clusters were chosen for each region, simultaneously accounting for a low DBI, High Silhouette, High Dunn Index and high pSF values.
The distribution of clusters over simulation is divisive (Figure 4). Clearly separated in all four cases are: cluster 4 (black) at the beginning of the simulation (after equilibration),cluster 2 (green) in the middle, cluster 1 (lightgreen) and cluster 3 (darkblue) at its end. In this study, the clustering in the subspace defined by the first two PCs (Figure 4) provides the most coherent picture. Moreover, this clustering is supported by a good DBI, Dunn,Silhoutte Score and psF value ( Figure S8 of the Supporting Information). It also shows clearly that the simulation has converged ( Figure 5).

Figure 5: RMSD over simulation time and color-coded by clusters obtained from
Ward-linkage algorithm for the the complete data .The colors correspond to the clusters as defined in Figure 4.
For the validation of the quality of molecular dynamic simulation, theoretical NMR shifts were calculated using Sparta+ (51) and ShiftX2 (52). For the NMR shifts calculation, snapshots from cluster 3 (Figure 4 and Figure 5) were used.
In Figure 6 and Figure 7 you can see that there is a strong linear relationship between experimental and simulated NMR shift values ( Figure S9 of the Supporting Information), thus this demonstrates the quality of MD simulation.  After that for docking study the representative structure( Figure 8) was extracted from cluster 3 (Figure 4 and Figure 5). Images generated with UCSF Chimera (46) The representative structure of LasR system and its differences from the homology modeled structure are highlighted in Figure 8. Secondary structure analysis of the representative structure was performed as well using PDBSum (59) ( Figure S10 of the Supporting Information).
To our knowledge, this is the most complete model used for long-time scale MD simulations to date, containing the full-length LasR molecule (residues 1 to 239). This is also the first study to include the dynamics of the complete C-terminal region of LasR, which has proven to have interesting modulations of the N-terminal region of LasR dynamics, as will be discussed later.

Docking analysis of with LasR monomer
In this study, molecular docking approach was used to inspect the possible binding modes of 3-O-C12-HSL in LasR monomer protein. PCA and cluster analysis were performed on docking data (Fig. 8). The results show three binding sites, cluster 2 corresponds to experimental data, while cluster 1 and cluster 3 do not. (Figure 9). These results clearly support the findings of Bottomley et al. (13), where it was also demonstrated that 3-O-C12-HSL binds to LBD.
We performed several rounds of K-Means clustering with different parameters (details can be found in the Computational methods section). The accuracy of the different clustering rounds was assessed using the DBI (55), Dunn Index (56), Silhouette score (57) and the pSF (58) metrics. These four metrics were plotted vs. the cluster count ( Figure S10 of the Supporting Information). An optimal number of clusters were chosen for docking results, simultaneously accounting for a low DBI, high Dunn, high Silhoeutte and high pSF values.
We generated 2000 docked poses and performed representative structure extraction for use in MD simulations of the LasR 3-O-C12-HSL binding sites. The resulting representatives from each cluster are shown in Figure 10. These cluster representative structures were produced by finding the centroid conformations. Representative structures from each cluster were extracted. The binding energy for representative structure of cluster 1 is -5.4 kcal/mol, as for the mean binding affinity for the whole cluster is -5.257 ±0.233 kcal/mol ( Figure S12 of the Supporting Information). Cluster 1 contains 839 docked poses from 2000, about 41.95%. For cluster 2 the binding affinity for the representative structure is -5.1 kcal/mol and for the whole cluster -5.593 ±0.386 kcal/mol ( Figure S13 of the Supporting Information) and this one corresponds to the experimental binding site data (13). Cluster 2 contains 864 docked poses from 2000, about 43.2%. For cluster 3, the representative structure features the highest binding affinity -5.7 kcal/mol and for the whole cluster -5.264 ±0.27 kcal/mol ( Figure S14 of the Supporting Information). Cluster 3 contains 297 docked poses from 2000, about 14.85%, an interesting region that to our knowledge has never been previously studied.
There is significant literature suggesting that docking experiments are not suitable to predict the binding affinity of protein-ligand complexes, but they can still provide important information (60). For example, Warren et al. found that docking programs often identified the structure of the crystallographic ligand. However, the scoring functions were often not able to identify the structure of the crystallographic ligand as the lowest energy pose (60), as well as the work of Chen (35).

Binding Modes of 3-O-C12-HSL.
We performed three 300 ns simulations using standard MD protocol. Overall, 900ns of aggregate simulation time was used for the data analysis of 3-O-C12-HSL interaction with LasR monomer. The representative structures were taken from docking results( Figure 10) and used as starting points for MD simulation with LasR. Simulations were conducted for sufficient time to allow the positions of 3-O-C12-HSL to reach equilibrium in LasR molecule.
To assess the overall stability of the molecule, the mass-weighted root-mean-square-deviation (RMSD) of the backbone atoms was calculated with reference to the initial frame through time for the different independent MD runs. Figure 11 shows that Simulation 2 and 3 experience a substantial RMSD deviation from the initial starting point. Simulation 2 corresponds to cluster 2 in docking simulations, while Simulation 3 corresponds to cluster 3( Figure 10). Simulation 1 which corresponds to cluster 1, the molecule of 3-O-C12-HSL did not fixate and reach equilibrium, so further research was not performed (Figure 11).
Simulation 2 shows that after 230ns, the structure becomes stable.While Simulation 3 changes its conformation in 100ns and becomes stable till 300ns. Another more suitable measurement of the flexibility of the LasR monomer is the per-residue root-mean-square fluctuation (RMSF). In Figure 12 the average per-residue RMSF for each cluster simulation runs is plotted.  Residues that participate in hydrophobic interactions are shown on Figure 14 and Figure  S13. PCA, cluster and hydrogen bond analysis were also performed on simulation 3 (Fig. S14). Over the course of cluster 2, 3-O-C12-HSL with LasR established an average of 1.506±0.742 hydrogen bonds,for cluster 3 the average 0.228±0.492 (Fig. S13),while in cluster 1 the average is 0.652±0.654. Over the course of simulation 3, 3-O-C12-HSL establishes hydrogen bonds and large number of hydrophobic contacts with amino acid side chains in the beta turns in the viccinity of DNA binding domain of LasR protein ( Figure S15 of the Supporting Information). In simulation 3, 3-O-C12-HSL forms hydrogen bonds mainly with 'Lys182' and 'Leu177' of the beta turns in the viccinity of DNA binding domain (Fig. S15 from supplementary material). RMSD analysis of the conformations between LasR monomer and LasR bound to 3-O-C12-HSL in DBD is equal to 1.677 Ångström (Figure 15). Residues that participate in hydrogen and hydrophobic interactions are shown on Fig. 16. Figure 16: Formation of hydrogen bonds of OdDHL with Lys182 and Leu177, from a centroid snapshot of cluster 2 in simulation II( Figure S9). Putative hydrogen bonds are shown as black lines. Image was generated with PyMOL (44).
Complete data sets for MD simulations are available as Supporting Information (Figures S15-S20).

Validation of MD models by protein docking
Given that our findings are based on computational simulations, it is therefore necessary to apply the complementary method to validate these results. It is known that 3-O-C12-HSL to LasR helps in the process of dimerization and binds to the corresponding promoter DNA region as a dimer. Thus, the monomeric LasR-3-O-C12-HSL complexes were subjected to dimerization. We performed the protein docking experiment using ClusPro 2.0 (53, 62-64) due to its success in the CAPRI (Critical Assessment of Predicted Interaction). Each centroid conformation was extracted from simulations 2 and 3 and was used for docking. For the selection of the model we used the approach as recommended by the authors of ClusPro (53), suggesting to find the most populated clusters rather than the lowest-energy structures and use the centers of the clusters as putative models.
From simulation 2 the top model contains 122 members and the scores for the docking model were -1440.7 for the center and -1517.9 for the lowest energy, suggesting a favorable binding mode. (Fig. 17)

Conclusion
From the simulations it can be safely concluded that the AI ligand 3-O-C12-HSL, can bind to LBD and to the beta turns in the viccinity of DBD of LasR protein. The interaction with the beta turn is a novel site. Both conformational transitions favour dimerization, so this raises the question what would be the roles of different binding sites,it could be a two step activation or resistance mechanism.This part needs further research. This study also sheds light on the patterns of interactions of the autoinducer 3-O-C12-HSL, ligand with LasR protein. Results from this study may be used as a first-hand guide for future drug development endeavors.