Relative prolixity in journals with different citation impact values: an evidence-based scientific writing assessment [version 1; peer review: 2 approved with reservations, 1 not approved]

Background: Scientific writing guidelines recommend that a scientific text should be straightforward, without prolixity, and informative, without obscurity. However, the extent to which researchers follow these recommendations is unknown. Considering that the most cited journals provide more detailed instructions for authors, we aimed to investigate the degree of relative prolixity (i.e., length versus amount of information) among journals with different citation impact scores. Methods: We analyzed journals whose articles follow the classic Introduction, Method, Results, and Discussion structure, written in English and with a CiteScore value ≥ 0.01 classified in the ‘Pharmaceutical Science’ area. Relative prolixity was calculated as the ratio between the number of characters and the number of citations contained in the introductory section of original articles. Additionally, we collected the number of paragraphs and words. Results: The number of characters, words and citations in the Introduction section were significantly higher in the journals with higher CiteScore values. The median number of paragraphs in the Introduction was not affected by the citation impact of the journals. The degree of relative prolixity in the Introduction section of the articles was negatively correlated with the CiteScore values. Conclusions: Articles published in journals with higher CiteScore values have lower degrees of relative prolixity (i.e., shorter texts to transmit a certain amount of information) and obscurity.


Introduction
A scientific text should be straightforward, without prolixity (Matthews & Matthews, 2014), and informative, without obscurity (Annesley, 2010a; Hirst et al., 2015). However, the extent to which researchers follow these recommendations is unknown. Considering that the most cited journals provide more detailed instructions for authors (Gasparyan et al., 2017), we hypothesized that articles published in such journals would be less prolix and more informative than those published in journals with fewer citations. Therefore, the objective of this study was to investigate the degree of relative prolixity (i.e., length versus amount of information) among journals with different CiteScore™ values.

Experimental design
To assess the degree of relative prolixity of original articles among journals with different citation impact scores, we chose to analyze journals from a specific area whose publication quality had recently been assessed: the 'Pharmaceutical Science' area indexed in the Scopus database (Bohannon, 2013; Xia et al., 2015). We evaluated 101 journals (of 305) whose articles follow the classic Introduction, Method, Results, and Discussion (IMRAD) structure, are written in English, and have a CiteScore value ≥ 0.01.
The degree of relative prolixity was calculated as the ratio between the number of characters, which increase prolixity (Hirst et al., 2015), and the number of citations to any reference, which make the text more informative (Annesley, 2010b; Katz, 2009). Additionally, we collected the number of paragraphs and words. These analyses were done in the Introduction because this is the only section for which scientific writing literature recommends a length limit. To count the number of characters (including spaces), words, paragraphs and citations, the entire body of text of the Introduction section of each article was copied and pasted into Microsoft Office Word 2010 (Microsoft Corporation, Redmond, WA, USA). Each journal was evaluated using the median of the last three published articles in 2018.
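The ratio described above can be illustrated with a short sketch. This is not the authors' procedure (they counted in Microsoft Word); the function names and the sample counts are hypothetical, and taking the median of the per-article ratios (rather than the ratio of the median counts) is an assumption, since the text does not say which was used.

```python
from statistics import median


def relative_prolixity(n_characters: int, n_citations: int) -> float:
    """Characters (including spaces) per citation in one Introduction."""
    if n_citations <= 0:
        # The ratio is undefined for an Introduction without citations.
        raise ValueError("n_citations must be positive")
    return n_characters / n_citations


def journal_prolixity(articles) -> float:
    """Median relative prolixity over a journal's sampled articles.

    `articles` is an iterable of (n_characters, n_citations) pairs.
    Assumption: the median is taken over per-article ratios.
    """
    return median(relative_prolixity(chars, cites) for chars, cites in articles)


# Hypothetical counts for three articles of one journal (not study data).
sample = [(3000, 20), (4200, 21), (2500, 10)]
print(journal_prolixity(sample))  # ratios are 150, 200 and 250 -> prints 200.0
```

A higher value means more text per citation, which is what the paper calls a higher degree of relative prolixity.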

Data analysis
Journals were grouped into quartiles based on the distribution of their CiteScore values (Fernandez-Llimos, 2018). Differences between groups were assessed using the Kruskal-Wallis test, and the correlation between the journals' CiteScore and the degree of relative prolixity was assessed using Spearman's rank correlation test, both in SPSS software version 21 (IBM, Armonk, NY, USA). For all analyses, p values below 0.05 were considered statistically significant. Data are presented as mean ± standard error of the mean.
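For readers without SPSS, the quantities behind these tests can be sketched in plain Python. This is an illustration only: the function names and the data are hypothetical, the rank-based quartile assignment is an assumption (the text does not state how quartile boundaries were set), the Kruskal-Wallis statistic is shown without tie correction, and p-values are omitted because they require a chi-square or t distribution.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def quartile_groups(scores):
    """Assign each journal to group 0..3 by the rank of its score."""
    n = len(scores)
    return [min(3, int((r - 1) * 4 / n)) for r in average_ranks(scores)]


def kruskal_wallis_h(values, groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    ranks = average_ranks(values)
    n = len(values)
    total = 0.0
    for g in set(groups):
        member_ranks = [r for r, gg in zip(ranks, groups) if gg == g]
        total += sum(member_ranks) ** 2 / len(member_ranks)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical journal-level values (not study data).
citescore = [0.2, 0.5, 1.1, 1.6, 2.2, 2.9, 3.8, 5.0]
prolixity = [260, 240, 250, 200, 210, 180, 190, 150]

groups = quartile_groups(citescore)
print("quartile of each journal:", groups)
print("Kruskal-Wallis H:", round(kruskal_wallis_h(prolixity, groups), 3))
print("Spearman rho:", round(spearman_rho(citescore, prolixity), 3))
```

Reporting the coefficient itself, and not only its p-value, shows how strong the association is as well as whether it is statistically significant.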

Results
The mean CiteScore values of the 'Pharmaceutical Science' journals in the four quartiles were 0.34, 1.37, 2.40 and 4.14, respectively. The median number of paragraphs in the Introduction of articles did not differ significantly between quartiles and ranged from 3.92 to 4.84 (p = 0.102; Figure 1a). Both the number of characters (including spaces) and the number of citations (21, on average) in the Introduction were significantly higher in the journals with the highest CiteScore (p < 0.001; Figure 1b and c). The number of words in the Introduction increased gradually from the first to the fourth quartile (p < 0.001), averaging 442.69, 512.24, 591.00, and 721.60 words, respectively. No differences were detected between the first and second quartiles or between the second and third quartiles (p > 0.05).
The degree of relative prolixity in the Introduction of the articles presented a negative correlation with the CiteScore value of the journals (p = 0.017; Figure 2).

Discussion
This study showed that articles published in pharmaceutical science journals with higher CiteScore values have an Introduction with a lower degree of relative prolixity and more characters, words, and citations. A low degree of relative prolixity matches the consensual recommendation that the [...]
In conclusion, articles in pharmaceutical science journals with higher CiteScore values show lower degrees of relative prolixity (i.e., shorter texts to transmit a certain amount of information) and obscurity.

Grant information
The author(s) declared that no grants were involved in supporting this work.

1. I'm not sure why the authors have looked at the number of paragraphs and number of characters as outcomes in themselves. The length of the introduction doesn't tell us about writing quality or wordiness. A long piece can be well-written and a short piece can be poorly written. Also, different journals have different constraints (such as constraints on word counts) that may affect the length of the introduction. Thus, I don't think these outcomes are very informative, and they need not be highlighted in graphics.

2. Why were characters, numbers, and citations analyzed by CiteScore quartile but relative prolixity analyzed treating CiteScore as a continuous variable? In Figure 1, I expected to see a Figure 1d that showed the average prolixity per quartile.

3. This paper would be strengthened by considering other metrics of readability beyond just the ratio of characters to citations. There are numerous online tools that allow one to measure readability with validated measures such as the Flesch Reading Ease Scale. See, for example, this tool: http://www.checktext.org/. Examining the correlation between CiteScore and readability scores would add to the impact of this paper.

4. Journals sometimes have limits on the number of references, which would influence the number of citations appearing in the introduction section. The authors should state whether any of the journals examined had such limitations.

5. The aim of this study was to gauge the association between a journal's CiteScore and a measure of the journal's prolixity. But only 3 samples per journal were taken. Given that prolixity may vary widely from paper to paper within the same journal, I believe that a larger sample size per journal would have strengthened this study. It's not clear that the last three papers published are going to be representative of all studies in a journal. The authors should comment on how they arrived at the choice of 3 studies per journal.

6. The authors should give the magnitude of the correlation coefficient between CiteScore and prolixity. I believe it's about -0.20, which would be considered an extremely weak correlation even though it is statistically significant. The authors should comment on the fact that the correlation is weak. Also, there is one study with high prolixity (>600) that is making it hard to see the pattern of correlation in Figure 2; consider presenting the graph both with and without this point.

7. Bar graphs are not ideal. I would recommend that the authors replace bar graphs with box plots with individual points overlaid, as these are more informative.
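As an illustration of the readability metric mentioned in comment 3, the Flesch Reading Ease score can be approximated in a few lines. This is a rough sketch and not a validated tool: the constants are those of the standard Flesch formula, but the syllable counter is a crude vowel-group heuristic, and both function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, dropping one for a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(1, count)


def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))


# Hypothetical sentences, used only to show the direction of the score.
plain = "The cat sat. The dog ran."
dense = "Pharmaceutical investigations necessitate comprehensive methodological considerations."
print(flesch_reading_ease(plain) > flesch_reading_ease(dense))
```

Higher scores indicate easier text, so short sentences with short words score above long, polysyllabic ones.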

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly

© 2019 Ghasemi A. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Asghar Ghasemi
Endocrine Physiology Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran

The paper by Nascimento et al. analyzed whether the "relative prolixity" of published papers in the field of pharmaceutical science is related to the quality of journals (assessed by CiteScore values). Although efforts to improve scientific writing are acknowledged, in my opinion the paper is subject to the criticisms below.

Major issues:
The index that the authors used to assess the prolixity of the published papers does not seem valid enough to support a legitimate conclusion. The authors did not provide a good background for the index. I could not find any information about a "relative prolixity index" in the references that the authors cite (Hirst et al., 2015; Annesley, 2010; Katz, 2009). There are other scores that the authors could use, e.g. the "Fog index" (Gunning, 1969), which was developed by Robert Gunning to test the readability of a paragraph or passage.
The conclusion of the study is mostly based on the correlation between CiteScore and relative prolixity; the number of journals with a high CiteScore (~10) seems to be low (~10), which affects the correlation coefficient. In addition, correlation per se cannot provide a basis for a conclusion. In fact, the authors conclude that the more references in the Introduction, the lower the relative prolixity. This is not straightforward, as other factors, such as the topic presented and the number of hypotheses being tested, can affect the number of references in the Introduction section of a paper. I suggest considering the relevance of the citations in the papers studied.
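For reference, the Fog index mentioned in the first major issue is commonly stated as follows. The symbols are introduced here for illustration only: W is the number of words, S the number of sentences, and C the number of "complex" words (three or more syllables) in the passage.

```latex
\text{Fog index} = 0.4 \left( \frac{W}{S} + 100 \cdot \frac{C}{W} \right)
```

Unlike the ratio of characters to citations, this score depends only on sentence length and word complexity, so it measures readability independently of how many references a passage cites.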