Tessellate & Montage: Molecular analytics of cyclic conformations

The conformations and shapes of macromolecular structures in biological and synthetic materials often define the macroscopic functions of the systems. Tessellate and Montage provide a standardized toolset for rapid reporting of large datasets allowing comparisons of cyclic molecule conformations (ring pucker) from structural databases and simulation trajectory data. This facilitates an understanding of the dynamic transition between common conformations and the flexible range in a ring that underlies molecular behaviour and recognition properties.


Introduction
Cyclic molecules and systems consisting of cyclic molecules play central roles in several chemical and biological processes 1,2. RNA, DNA, proteins and carbohydrates are polymers that all contain cyclic moieties that exhibit wide-ranging conformational variety. For example, the puckering of fructose illustrates the value of determining conformational statistics, as recognition and binding to fructokinase is due to pucker specificity. Similarly, the different sugar puckers of ribose in RNA and deoxyribose in A-DNA and Z-DNA account for the molecular shape of the respective nucleotide helices 2. In DNA-protein complexes any kink in the DNA axis is accompanied by a change in deoxyribose pucker, demonstrating the important role pucker plays in molecular structure and flexibility 3,4.
In enzymatic transition states for glycosyl hydrolases and transferases, conformations change to provide the correct electronic state for the reaction to take place 5 .
Conformational analyses, which recognise shapes (Figure 1), are important for finding transition state leads, understanding molecular recognition events, ligand fitting in macrocycles and for molecular pattern matching.
We created a standardized toolset called  1, however these tools do not provide methods for calculating the ring pucker of macrocycles, nor do they have the reporting capability for comparing multiple types of molecular data sets in a single report. Existing pucker models include perpendicular displacements from a mean plane 7,8, intracyclic torsions 9,10 and triangular tessellation 11-15. A further aim was to ensure that these tools (Tessellate and Montage) function independently from informatics workflow systems, while also being readily incorporated and fully functional as part of systems such as Galaxy 21 and the Glycome Analytics Platform 22.

Methods
Tessellate is a package for quantifying ring conformation and qualifying the conformer in terms of a canonical conformer (e.g. chair, boat). Developed in Python, Tessellate uses BioPython 23 for accessing Protein Data Bank files. Tessellate can be used with Visual Molecular Dynamics (VMD) through a provided Tcl script. The results of this analysis are visualised using Montage.
Montage is a fork of the excellent MultiQC 6 and provides reporting templates and functionality for computational chemistry and chemical glycobiology. Previously we prototyped reporting within a web platform; the aim here was to ensure that the visual reports connect easily into workflow systems such as Galaxy 21 and GAP 22, and are also transferable outside of these platforms.

Implementation
These tools were designed to work in Linux.
Triangular tessellation for ring sizes 5-8 - Implemented as per the following papers 11,14,15. Protein and water residues in the input structures are ignored by default.
Nomenclature and ring ordering - Conformational qualification depends on a consistent ring ordering according to the IUPAC definition 24. Residue atoms are matched to an internal ligand dictionary and aligned in the best possible cyclic order to prevent atom ordering mistakes.
Ring finding algorithm - Cycles are detected using the Smallest Set of Smallest Rings (SSSR) ring perception algorithm 25.
Conformational quantitation - Coordinates are matched to the coordinates of known canonical conformers. The best match is returned.
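The best-match step described under conformational quantitation can be illustrated with a nearest-neighbour lookup. This is a minimal sketch rather than Tessellate's own implementation: it assumes that each ring has already been reduced to a short vector of pucker coordinates, the function name `best_match` is hypothetical, and the reference values in `CANONICAL` are illustrative placeholders, not the values used by Tessellate.

```python
import math

# Hypothetical reference table: conformer label -> pucker coordinates (degrees).
# These numbers are illustrative placeholders, NOT the values used by Tessellate.
CANONICAL = {
    "4C1": (35.0, 35.0, 35.0),
    "1C4": (-35.0, -35.0, -35.0),
    "planar": (0.0, 0.0, 0.0),
}


def best_match(pucker, reference=CANONICAL):
    """Return (label, distance) for the reference conformer closest to `pucker`."""

    def distance(a, b):
        # Euclidean distance between two pucker coordinate vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    label = min(reference, key=lambda name: distance(pucker, reference[name]))
    return label, distance(pucker, reference[label])


label, dist = best_match((30.0, 33.0, 36.0))
print(label)
```

Returning the distance alongside the label lets a caller flag rings that sit far from every canonical conformer instead of silently forcing them into the nearest class.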

Usage example:
The workflow is as follows:
1. Choose molecular data, e.g. from the PDB, or use VMD to create a list of atomic coordinates from molecular simulation data.
4. Open the report using a web browser.

Use cases
Simulation - Ribose in vacuo biased molecular dynamics calculation
This use case follows the conformational changes of ribose during a biased free energy simulation. The analysis can be used to identify patterns in conformational change and to confirm adequate sampling of phase space (Figure 2). VMD was used to open and read coordinate and trajectory files, and to write out the atomic coordinates of the five-membered ribose ring. The atomic coordinates are read in by Tessellate, producing pucker coordinates stored as JSON. MultiQC creates an HTML report from the JSON input.
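The JSON hand-off between the pucker calculation and the report can be inspected directly. The sketch below tallies conformer labels from a trajectory and converts them to percentages. It assumes, purely for illustration, that the JSON is a list of per-frame records with a `conformer` field; the actual Tessellate output schema may differ, and the function name `conformer_percentages` is hypothetical.

```python
import json
from collections import Counter


def conformer_percentages(json_text, key="conformer"):
    """Return {label: percentage}, ordered from most to least populated.

    Assumes `json_text` holds a JSON list of records, each carrying the
    conformer label under `key` (a hypothetical schema, see the note above).
    """
    records = json.loads(json_text)
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.most_common()}


# Four made-up frames of a five-membered ring trajectory.
example = (
    '[{"frame": 0, "conformer": "3E"}, {"frame": 1, "conformer": "3E"},'
    ' {"frame": 2, "conformer": "2E"}, {"frame": 3, "conformer": "3E"}]'
)
print(conformer_percentages(example))  # {'3E': 75.0, '2E': 25.0}
```

Reporting percentages rather than raw counts makes trajectories of different lengths directly comparable.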

Comparative conformational character of DNA and RNA
Nucleotides in A-DNA and Z-DNA tend to be C3'-endo (3E), but in B-DNA the tendency is to C2'-endo (2E); this accounts for the different molecular shape of the respective nucleotide helices. To explore this, PDB structures that are representative of A, B and Z DNA (1ANA, 3BSE, and 2DCG) were analysed (see Operation section and Figure 3).

Cyclodextrins
An analysis of alpha cyclodextrin (ACD) from the PDB shows the glucose monomer conformations are mostly 4C1 (85.9%) with few deviations (Figure 4). The ring pucker of the macrocycle, as defined by treating the centre of geometry for each glucose in cyclodextrin as an atom, varies between planar (P) and half chair (H, e.g. 3H4, 5H4, 2H3) conformations.
Results from the three use cases can be found in Supplementary File 1.
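The macrocycle construction used in the cyclodextrin use case, where each glucose is reduced to its centre of geometry and that point is then treated as an atom, can be sketched as follows. This is a minimal illustration and not Tessellate's own code: the function names are hypothetical, and the input is assumed to be a list of monomer rings given in macrocycle order, each ring being a list of (x, y, z) tuples.

```python
def centre_of_geometry(coords):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(point[axis] for point in coords) / n for axis in range(3))


def macrocycle_pseudo_ring(monomer_rings):
    """Collapse each monomer ring to one pseudo-atom at its centre of geometry.

    The returned list of points can be passed to the same pucker analysis
    that is applied to an ordinary ring of atoms.
    """
    return [centre_of_geometry(ring) for ring in monomer_rings]


# Two toy "monomers": a planar square and a two-point segment.
rings = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)],
    [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
]
print(macrocycle_pseudo_ring(rings))  # [(1.0, 1.0, 0.0), (11.0, 0.0, 0.0)]
```

For alpha cyclodextrin this would yield six pseudo-atoms, one per glucose, so the macrocycle is analysed as a six-membered ring.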

Conclusions
Tessellate enables users to compile datasets from the reservoir of PDB and simulation-produced data. From this ensemble of experimentally and computationally determined ring structures, the relation between cyclic conformational preference and other variables can be discovered using Montage.

Review summary: Overall, the article is acceptable and generally well written. The tools described are likely to be useful to persons studying ring conformations. But a few points need to be addressed before it should be considered complete, and the article would be enhanced by some other modifications.

Attention to these points is required
Provenance information should be provided for the simulation data. That the data are made available is very good, but readers also need to know where they came from. It is important that readers be able to determine the extent to which the data should be considered only as illustrations or whether they can be considered as valid scientific data. A list of reporting requirements for simulation data is recommended near the end of "Carbohydrate force fields", available at: Foley et al. If the data are from a separately published, peer-reviewed paper, it is sufficient to reference the paper.
It is important to fully caption figures. A table listing the meanings of all the pucker abbreviations used in the figures is essential. The table could be in the text or given as a key in one or more of the figures. This is for persons out of field as well as experts who are simply from another branch. For example, I do not know what UAP means; someone who has not studied the subject would be very lost. I expect that the meanings are all in referenced papers, but it is kind not to make readers consult references simply to understand such a significant element in a figure. Include addresses, and references in the text, for all websites, software and databases listed, even the ones that 'everyone knows'. For example, although PDB is defined in the abbreviations section, there is no indication to a user how to find the Protein Data Bank. Also, VMD is referenced in a table, but it should be referenced at its first mention in the text as well. There might be other similar instances.
Notes about the figures: The graphs need better titles. A title like "Tessellate: Pucker series for five" can be made sense of after some thought, but one has to look around a moment to realize it means a five-membered ring, as opposed to a 5 nanosecond simulation or some such. Also, since the whole paper is about Tessellate and Montage, it is not necessary to include the word unless it is being compared to something else. A title like "Pucker results for a Tessellate five-membered ring" would be better. Please explicitly say somewhere that you are assuming a constant puckering amplitude. A five-membered ring requires two parameters, a six-membered ring requires three, and so forth.
Related to that, the phrase "Numeric Pucker 1D" in Figure 2 is not defined. The x-axis for the top graph in Figure 2 says "Time or sequence index." Which is it? If it is a time, what is the unit? Increase the font size on the axes where they are very small. I would make them at least the size of the title of the first graph in Figure 2.

Attention to these points would enhance the paper
A little reorganization in the Methods section would make the article more accessible and possibly open the tools up for a wider audience. In particular, the subsection titled Implementation is confusing. It starts with a prerequisite (use Linux), which probably goes in the Operation section. Then, it lists a number of features. While the analysis method used for each feature is also described, which is very good, I think the section would be better titled "Available Analytical Protocols" or even "Features", and the line about Linux removed as it is made clear in Operation.
In cases like those presented in Figure 4, I would find it more useful to present the data as a fraction or percentage rather than raw counts. But, this is merely my opinion and recommendation.
It is certainly unreasonable to expect every paper to reference every other paper with any relevance, but you might take a look at one that I have considerable knowledge of, "BFMP: A Method for Discretizing and Visualizing Pyranose Conformations", available here: Makeneni et al. For that paper, we spent a lot of time thinking of ways to communicate information about pucker in 6-membered rings, and I think we came up with some nice graphical methods. You might consider including some of them regardless of whether you like the classification scheme.

The authors present two Python-based packages for the analysis of ring conformations in molecular systems and for the visualization of the related data. The Tessellate package implements various algorithms for the detection of cycles of different size (5 to 8 included) and represents a handy tool, which pairs nicely with its visualization counterpart, Montage, to produce and analyze aggregate data regarding the structure of the cycles. I tested the code, which is freely available, and was able to reproduce the examples reported in the article. I think that these packages will prove useful and that this article should be indexed in F1000Research. Some comments regarding possible (optional) improvements:
1) Tessellate can not only be run as a standalone command, but can also be used as a Python module. This can be beneficial to the interoperability with other libraries, and probably a couple of lines could be added to the text to show an example of how to do that.

2) If possible, it would be helpful to sort the elements of the histograms generated by Montage by size, so that the largest population is at the bottom of the stack, followed by the second largest, and so on.
3) Regarding interoperability: output in the pandas data frame format would probably be a useful addition.

Is the rationale for developing the new software tool clearly explained? Yes
Is the description of the software tool technically sound? Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Scatter plot and histogram of ring pucker from timeseries data.

F1000Research 2017, 6
Figure 3. Histogram of ring pucker in different types of DNA; the planar conformation is due to the nitrogen base.

doi: 10.5256/f1000research.14386.r35646
B. Lachele Foley, Complex Carbohydrate Research Center, The University of Georgia, Athens, GA, USA
Relevant reviewer expertise: conformational properties of carbohydrates, especially puckering of 6-member rings; Linux; scientific computing, especially molecular modeling and the proper communication of simulation methods; communication of technical information to interdisciplinary and non-specialist audiences; some biochemistry of carbohydrates.
Article summary: The authors present a pair of software tools, Tessellate and Montage, for the analysis of ring conformations (typically referred to as ring pucker). The first tool, Tessellate, analyzes data containing coordinates of atoms bonded cyclically. The output of Tessellate can be processed by Montage to produce visual representations.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes
Competing Interests: No competing interests were disclosed.
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

21 December 2017 Referee Report
doi:10.5256/f1000research.14386.r28851
Marcello Sega, Faculty of Physics, University of Vienna, Vienna, Austria