A survey exploring biomedical editors’ perceptions of editorial interventions to improve adherence to reporting guidelines

Background: Improving the completeness of reporting of biomedical research is essential for improving its usability. For this reason, hundreds of reporting guidelines have been created in the last few decades but adherence to these remains suboptimal. This survey aims to inform future evaluations of interventions to improve adherence to reporting guidelines. In particular, it gathers editors’ perceptions of a range of interventions at various stages in the editorial process.
Methods: We surveyed biomedical journal editors that were knowledgeable about this topic. The questionnaire included open and closed questions that explored (i) the current practice of their journals, (ii) their perceptions of the ease of implementation of different interventions and the potential effectiveness of these at improving adherence to reporting guidelines, (iii) the barriers and facilitators associated with these interventions, and (iv) suggestions for future interventions and incentives.
Results: Of the 99 editors invited, 24 (24%) completed the survey. Involving trained editors or administrative staff was deemed the potentially most effective intervention but, at the same time, it was considered moderately difficult to implement due to logistic and resource issues. Participants believed that checking adherence to guidelines goes beyond the role of peer reviewers and were concerned that the quality of peer review could be compromised. Reviewers are generally not expected to focus on reporting issues but on providing an expert view on the importance, novelty, and relevance of the manuscript. Journals incentivising adherence, and publishers and medical institutions encouraging journals to take action to boost adherence were two recurrent themes.
Conclusions: Biomedical journal editors generally believed that engaging trained professionals would be the most effective, yet resource intensive, editorial intervention. Also, they thought that peer reviewers should not be asked to check RGs. Future evaluations of interventions can take into account the barriers, facilitators, and incentives described in this survey.


Amendments from Version 2
We now report the results for ease of implementation and effectiveness as medians and quartiles instead of means and standard deviations. In Figure 1, we have removed the means and now display the medians as blue horizontal lines. We have also removed the black points representing the individual scores, as they obscured the medians in the graphs. Figure 1 is now clearer and cleaner.
In the Discussion, we have added further justifications on whether we believed that sample size and response rates were adequate for the purpose of the survey. We have modified the Conclusions both in the abstract and at the end of the paper, linking them to the survey aims (perceptions, barriers, facilitators…). We have also moved to the end of the Discussion section what was reported in the old Conclusions.

Introduction
Transparent and accurate reporting of research is essential for increasing the usability of available research evidence. Reporting guidelines (RGs) can be useful tools to help authors report research methods and findings in a way that they can be understood by readers, replicated by researchers, used by health care professionals to make clinical decisions, and included in systematic reviews 1 . Since the inception in 1996 of the Consolidated Standards of Reporting Trials (CONSORT) for the reporting of randomised controlled trials (RCTs) 2 , more than 400 RGs for different study types, data, and clinical areas have been developed. These RGs can be found in the library of the Enhancing the Quality and Transparency Of Health Research (EQUATOR) Network 3 .
Biomedical authors' adherence to RGs has been observed to be suboptimal 4 . Consequently, in recent years various stakeholders have proposed, and sometimes evaluated, the impact of different types of interventions to improve this adherence. These interventions were identified and classified in a recently published scoping review 5 . We found that the strategies most widely used by journals have been shown not to have the desired effect 6-9 and this highlighted the need for the implementation and evaluation of the other interventions proposed 5 .
This paper reports a survey that aimed to inform the future evaluation of interventions to improve adherence to RGs. In particular, we focused on interventions that can be implemented at various points in the editorial process. Our specific objectives were to explore the perceived ease of implementation of different interventions and the potential effectiveness of these at improving adherence to RGs; to map the barriers and facilitators associated with these interventions; to determine possible solutions to overcome the barriers described; and to identify further editorial interventions that could be implemented and subsequently evaluated.

Participants
Purposive sampling was used to recruit biomedical editors that were expected to be knowledgeable and experienced in the topic we aimed to explore. We recruited participants not based on their representativeness of all medical journals but on the fact that they were "information-rich cases" 10 .
Participants were sampled from three sources: (i) editors of journals that had published studies describing interventions to improve adherence to RGs identified in our scoping review 5 , (ii) members of the Methods in Research on Research (MiRoR) Network with current editorial positions and (iii) editors of the top-10 journals (based on impact factor) of BMJ Publishing Group which, apart from being one of the partner institutions of MiRoR, has published the main RGs 2,11-13 and has traditionally performed research to improve the transparency and quality of biomedical publications 14 . The authors of this survey who met the eligibility criteria were excluded as potential participants.

Recruitment
The survey was only open to editors that we invited to participate. We contacted three editors (including the editors-in-chief) of each of the sampled journals, as well as individual editors from group (ii) above. By replying to our invitation email, participants could suggest further editors that they considered could contribute to the survey. To contact editors not known to us we sought email addresses in the public domain.
The survey was not advertised on any website.

Survey administration
The survey was administered by SurveyMonkey 15 and was open between 27 November 2018 and 24 February 2019. Participants were sent a personalised email inviting them to complete an online survey investigating their opinions about different editorial interventions to improve author adherence to RGs. Each invitation was tied to a unique email address. Two reminders to complete the survey were sent to nonresponders at four and eight weeks after the initial mailing.
Participants could edit their responses while completing the survey. However, they could not re-enter the survey once it was completed as no two entries from the same IP address were allowed. We did not offer any incentives for completing the survey.

Response rates
We recorded the view rate of the invitation email (subjects opening the invitation email/subjects invited), the response rate (subjects completing the survey/subjects invited), and the completion rate (subjects completing the survey/subjects completing the first question of the survey).
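As an illustration of the three definitions above, the following minimal Python sketch computes each rate from the counts reported in the Results section. The helper name `rate` is hypothetical, and the sketch assumes that the 25 editors who "started" the survey are those who completed its first question.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Counts taken from the Results section of this paper.
invited = 99    # editors invited
opened = 42     # editors who opened the invitation email
started = 25    # editors who completed the first question (assumption: same as "started")
completed = 24  # editors who completed the survey

view_rate = rate(opened, invited)           # opened / invited
response_rate = rate(completed, invited)    # completed / invited
completion_rate = rate(completed, started)  # completed / started

print(f"View rate: {view_rate:.0f}%")
print(f"Response rate: {response_rate:.0f}%")
print(f"Completion rate: {completion_rate:.0f}%")
```

Note that the response rate and the completion rate share a numerator but use different denominators, which is why a low response rate (24%) can coexist with a high completion rate (96%).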

Questionnaire development
Our previous scoping review 5 identified 31 interventions targeting different stakeholders in the research process. For use in this survey we chose a smaller subset of nine interventions that could be implemented during the editorial process as our focus was on journal editors' perceptions (see Box 1).

Box 1. Interventions included and their targets
B. Interventions targeting peer reviewers:
• Instruct peer reviewers to use the appropriate RGs when assessing a manuscript (Intervention 6)
• Instruct peer reviewers to scrutinise the completed RG checklist submitted by the authors and check its consistency with the information reported in the manuscript (Intervention 7)
C. Interventions targeting editorial staff:
• An evaluation of the completeness of reporting by a trained editor (or editorial assistant), who would return incomplete manuscripts to authors before considering the manuscript for publication (Intervention 8)
D. Interventions targeting authors, peer reviewers, and editors:
• Training for authors, peer reviewers, and editors on the importance, content, and use of RGs (e.g. The EQUATOR Network toolkits) (Intervention 9)

The survey combined open and closed response questions to seek participants' perceptions of a series of interventions to improve authors' adherence to RGs that could potentially be implemented during the editorial process. We pilot tested the draft survey questionnaire with two collaborators of the MiRoR project who currently hold editorial positions. They were asked to review the survey for its clarity and completeness and to provide suggestions on how to improve its structure.
Based on feedback from the pilot we decided not to include the intervention "Implementation of the automatic tool Statreviewer 16 " since participants were not aware of this software and stated that their perceptions would strongly depend on details about how it operates which are not publicly available.
We structured the questionnaire (see Figure S1, Extended data 17 ) as follows:
• Part 1: Current practice. Participants were asked to describe the measures their journal currently takes to improve adherence to RGs.
• Part 2: Perceptions of nine potential interventions. Participants were asked to indicate on 5-point Likert scales (i) how easy it would be (or was) to implement these interventions at their journals (1-very difficult, 2-moderately difficult, 3-neither difficult nor easy, 4-moderately easy, 5-very easy) and (ii) how effective they thought the interventions would be (or were) at improving adherence to RGs if these were implemented at their journals (1-very ineffective, 2-moderately ineffective, 3-neither ineffective nor effective, 4-moderately effective, 5-very effective). We included images to clarify meanings and context to prompt participants to think about the benefits and drawbacks of the interventions. Free text boxes were included so participants could justify their responses.
• Part 3: Identifying the barriers and facilitators. Participants were asked to choose which intervention they considered potentially the most effective for their journal at improving adherence to RGs. They were asked to describe (i) why they thought that intervention would be the most effective, (ii) what the main difficulties in implementing that intervention would be, and (iii) how they would try to overcome these difficulties.
• Part 4: Further interventions. Participants were asked for further suggestions of possible interventions, including modifications and combinations of the interventions previously discussed.
The survey was distributed over 18 pages with 1 to 3 items per page. These items were not randomised.

Data analysis
For quantitative data (Part 2 of the questionnaire), we used R version 3.6.0 18 . As these data were ordinal, we calculated medians together with the 1st and 3rd quartiles. We excluded from the analysis one questionnaire where the participant just opened the survey and left without answering any question. We did not exclude any questionnaire based on the amount of time that the participant needed to complete it.
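The summary statistics described above can be sketched as follows. This is an illustrative Python example only: the analysis in this study was run in R, the scores below are hypothetical and not taken from the survey dataset, and the function name `summarise_likert` is invented for the sketch. The paper does not state which quantile definition was used; the example assumes linear interpolation between order statistics (the "inclusive" method of Python's `statistics.quantiles`), which should correspond to the default of R's `quantile` function.

```python
from statistics import median, quantiles


def summarise_likert(scores):
    """Return (Q1, median, Q3) for a list of ordinal Likert scores (1 to 5)."""
    if len(scores) < 2:
        raise ValueError("at least two scores are needed to compute quartiles")
    # n=4 yields the three quartile cut points; the middle one is the median,
    # so only the first and third are kept here.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q1, median(scores), q3


# Hypothetical responses from nine editors for a single intervention.
hypothetical_scores = [1, 2, 3, 3, 4, 4, 5, 5, 5]

q1, med, q3 = summarise_likert(hypothetical_scores)
print(f"median = {med} (Q1: {q1}, Q3: {q3})")
```

Medians and quartiles are preferred over means and standard deviations here because Likert scores are ordinal: the distance between "moderately easy" and "very easy" is not guaranteed to equal the distance between any other pair of adjacent categories.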
For qualitative information, the lead investigator (DB) used the software program NVivo 12 19 . We mapped the barriers and facilitators for each of the interventions explored, as well as other key themes such as the incentives for the use of RG and the implementation of further editorial strategies. The initial mapping made by the lead investigator was discussed with another investigator (SS) and subsequently refined.
For Part 1 of the survey (Current practice) the unit of analysis was the journal, and therefore editors of the same journal were grouped. This was because participants' answers represented an overarching policy and not an individual's opinion. For all other parts of the survey (Part 2 to Part 5), we analysed editors' responses independently, regardless of their journal.

Ethics approval & informed consent
The Research Committee of the Governing Council of the Universitat Politècnica de Catalunya (UPC) granted ethical approval for this study (Reference EC 01, Date 2 May 2018).
In the invitation email, we informed survey participants that (i) the completion of the survey indicated consent to participate, (ii) they were free to stop and withdraw from the study at any time without providing a reason, (iii) the estimated time to complete the survey was 15 minutes, (iv) any identifiable information obtained in connection with this survey would remain confidential, and (v) the results would be submitted for publication and the anonymised dataset would be made publicly available in the Zenodo repository. The original dataset was kept in a password-protected folder in Google Drive.

Reporting guidelines
We consulted the Checklist for Reporting of Results of Internet E-Surveys (CHERRIES) 20 and the Consolidated criteria for Reporting of Qualitative research (COREQ) 21 guidelines to produce this research report.

Results
Of the 99 editors invited, 42 opened the invitation (view rate 42%), and 24 completed the survey (response rate 24%) from the 25 who started it (completion rate 96%). The average time spent completing the survey was 15 minutes (SD = 8.5 minutes). Among the 24 participants who completed the survey, nine (37%) worked for seven different journals that had published studies on improving adherence to RGs, seven (29%) worked for five top-10 BMJ journals, four (17%) were members of the MiRoR Network that hold editorial positions in four journals, and a further four (17%) were suggested by other participants based on their expertise on the topic and were editors of three different journals. The 20 journals represented in the survey are listed in Table 1.
Participants had a variety of editorial roles (editor-in-chief, senior editor, associate editor or others). Most of them were involved in manuscript decision-making and had less than 15 years of experience as journal editors (see Table 2). The anonymised responses from all 24 participants can be accessed in Zenodo 22 .

Current practice
Respondents worked at 19 journals. Most respondents' journals (11/19, 58%) request authors to submit a completed RG checklist with page numbers indicating where the items are addressed when they submit their manuscript. A further seven (37%) instruct but do not request authors to do it, and one (5%) does not request or instruct authors. Among the journals requesting the submission of checklists, four (4/11, 36%) also explicitly ask peer reviewers to use the completed RGs when assessing manuscripts, one (1/11, 9%) asks peer reviewers general questions about the completeness of reporting, and one performs an evaluation of the completeness of reporting by a trained editor using RGs before the initial decision is made on the manuscript. We observed no incongruences between the answers of editors from the same journal. Some respondents mentioned that in their journals (n=4) the interventions described were only applicable to the study types corresponding to the most established RGs (CONSORT, PRISMA 11 , or STROBE 12 ) for trials, observational studies and systematic reviews respectively.

Perceptions of nine potential interventions
The median scores for perceived ease of implementation and potential effectiveness for each intervention are shown in Figure 1.
The two most common interventions were considered the easiest ones to implement: the median scores (1st, 3rd quartiles) for requesting authors to submit checklists with page numbers (Intervention 1) and for asking peer reviewers to use RGs (Intervention 6) were 5 (Q1: 4, Q3: 5) and 4 (Q1: 3, Q3: 5), respectively. By contrast, interventions related to training (Intervention 9), editor involvement in checking completeness of reporting (Intervention 8) and reformatting of the text based on RG requirements (Intervention 4, Intervention 5) were considered the most difficult to implement.

Identifying the barriers and facilitators
This section presents the perceived barriers and facilitators of the interventions considered and editors' suggestions for making the interventions more effective. Table S1 in Extended data 17 shows a full description of these.

A) Interventions targeting authors (1-5)
The main barrier associated with all of the interventions targeting authors was that authors only have to state their adherence to the relevant RG, and this does not equate to actual compliance. Moreover, it is resource intensive for journals to check that these requirements are appropriately met by authors. Some editors highlighted that Interventions 3, 4, and 5 would involve special formatting of the submitted manuscript, which could be cumbersome for authors given that manuscripts are often submitted to multiple journals with different formats before being accepted. This is particularly relevant for journals with high rejection rates as it could cause frustration for authors. Some participants mentioned logistical issues as their journal's manuscript tracking system is not set up to accommodate these interventions. In addition, changes in the manuscript's format could be incompatible with the journal's house style. Intervention 1 was generally considered quick and straightforward for authors, but several participants indicated that there is published empirical evidence of little effectiveness if the checklist is not assessed by a trained editor or administrator 5-8 .
As Interventions 3, 4, and 5 force authors to tailor the manuscript to RG requirements, participants reported that these could make editors' and peer reviewers' jobs easier as the manuscript would be better structured. Importantly, readers would also be able to locate information more easily. Some editors pointed out that, to make these interventions effective, journals would need to provide templates to authors or to integrate these interventions in the submission system. However, some of these interventions (Interventions 2 and 5) were seen as more effective if they were implemented earlier on in the research process, prior to writing the manuscript.

B) Interventions targeting peer reviewers (6, 7)
Most respondents were negative about the potential effectiveness of implementing the two interventions targeting peer reviewers (Intervention 6 and 7) as they felt these would create too much additional work for reviewers. Participants were concerned that the quality of peer review could be compromised as reviewers are not expected to focus on reporting issues but on providing an expert view on the importance, novelty and relevance of the manuscript. Furthermore, peer reviewers may not know which RGs to use and, even if they do, the effectiveness would be dependent on their willingness to use RGs and their expertise in applying them. Several participants indicated that this work should be delegated to paid editorial staff.

C) Interventions targeting editorial staff (8)
This intervention was considered difficult to implement but potentially effective. The main facilitating factor for its successful implementation was that it is performed by a paid or trained professional, which lends credibility to the intervention, reduces the workload of unpaid peer reviewers, and avoids authors overclaiming adherence. The main barriers outlined for this intervention were (i) the budget issues the journal would need to face to train or hire additional editorial staff that could perform the evaluation, especially if the journal receives a large volume of manuscripts, (ii) the editorial delays it may cause, and (iii) the potential inefficiency of assistant editors or administrators having to delegate decisions in case of doubt, given that sometimes assessing completeness of reporting is a subjective task.
To make this intervention more feasible for journals, editors suggested that the completeness of reporting evaluation could be performed only for manuscripts that are sent out for peer review and it could be focused on a few core items (different for each RG) that would enable reproducibility. If this intervention was implemented in a journal that requires the submission of a completed checklist, editors could take advantage of the checklist to locate information.

Figure 1. […] the box to the furthest datum within that distance. Interventions whose names are shown in red target authors, those in brown target peer reviewers, the one in grey targets editors or administrative staff, and the one in green targets all these stakeholders. Box 1 shows a detailed explanation of each intervention.

D) Interventions targeting authors, peer reviewers and editors (9)
Training was seen as a potentially effective intervention but difficult to implement. Some participants highlighted that training with follow-up sessions would be resource intensive for journals, and especially difficult to enforce. One participant mentioned that credits (such as CME credits 23 ) could be used to recognise hours of training. Because editorial staff are sometimes based in different places and time zones, it is crucial to consider flexible forms of training, such as online courses. As an example, the EQUATOR Network Toolkits section provides resources for authors, peer reviewers and journal editors 24 . However, some participants emphasised that training should also be delivered by research institutions and medical centres.

Further interventions and incentives for authors and journals
When asked about further potentially effective interventions that were not discussed in the survey, some editors mentioned StatReviewer, a reading tool that automatically assesses adherence to RGs and is currently under evaluation 16 . Other respondents also mentioned the possibility of combining some of the interventions discussed in the survey, such as requiring the submission of checklists and having trained editors check the responses against the information reported in the manuscript.
Moreover, several incentives for authors were listed, including (i) discounts on article processing charges (APCs) for authors that comply with RG requirements, (ii) academic institutions including RG use in the promotion and tenure files, and (iii) credits (such as CME credits 23 ) to recognise hours of training on the use of RGs. Journals could also be encouraged to implement certain interventions if (i) there is empirical evidence that these interventions actually improve the reporting quality of the papers or (ii) publishers or the International Committee of Medical Journal Editors (ICMJE) mandate these as a condition of submission to their journals. Even if some of these interventions are proven to be effective, some respondents reported that it is essential to convince publishers that improving the quality of reporting is a worthy investment to resource.

Discussion
This survey explores biomedical journal editors' perceptions of the practical aspects of the implementation of different interventions to improve adherence to RGs.
Several messages arise from this study. First of all, most editors agreed that the most effective way to improve adherence to RGs is for journals to involve trained editors or administrative staff. Interventions targeting these stakeholders were considered to be difficult to implement for most journals, either because of logistic or resource issues. However, improving the performance of editorial staff is critical 25 and has been shown to have a positive impact on completeness of reporting in the context of a dentistry journal 26 . To make this type of intervention more feasible, journals could implement it only for manuscripts that are sent out for peer review. The editorial staff could also take advantage of the RG checklists submitted by authors, which could be automatically populated with text using specific software such as the tool proposed by Hawwash et al. 27
Most editors considered that checking reporting issues is beyond the role of peer reviewers. Given the voluntary nature of peer review, requiring reviewers to use RGs would cause an additional workload that could compromise the overall quality of the reviews. If checking reporting issues becomes a standard exercise for peer reviewers, some editors are concerned that peer reviewers may be less likely to comment on important aspects of a manuscript, such as its novelty, clinical interest or implications. Furthermore, as finding peer reviewers is becoming increasingly difficult for editors 28 , these requirements could make them even less willing to review papers. Additionally, some editors considered that the average peer reviewer does not have enough expertise to go over RG requirements.
We observed that the interventions perceived as potentially most effective at improving adherence to RGs appear to be more difficult to implement. Conversely, the most common strategies seem to have been implemented based on their feasibility and not on their potential to improve completeness of reporting. This could be one of the reasons why they have failed to achieve the desired results 6-9 . Some of our respondents insisted that a key element is that journals, universities, and medical institutions find ways to incentivise authors' compliance with RGs. At the same time, the scientific community needs to find ways to convince publishers that improving the quality of reporting is a worthy investment so that publishers can encourage their journals to adopt strategies to boost completeness of reporting. A recent article indicates that implementing RGs through the editorial process may increase the number of citations to the research reported 29 .
A common observation by the survey participants was that the effectiveness of the interventions proposed could depend on the types of articles considered. While RGs for randomised trial protocols, randomised trials or systematic reviews are more established, some others, including most RG extensions, are not well known to the stakeholders involved in the publication process. For this reason, it is important for journals to be clear in their "Instructions for Authors" on what RGs they mandate.
It is worth noting that, regardless of how checklists are implemented in the editorial process and who has to engage to make the interventions successful, the evaluation of completeness of reporting is a subjective task. This is mainly because RGs were not originally designed as evaluation tools but as guidance for authors on how to report their research. For this reason, evaluators could sometimes have different views on whether authors are providing enough information to consider that certain RG items are adequately reported.
This study is subject to several limitations. The response rate was low (24%). However, researchers in health science have witnessed a gradual decrease in survey participation over time 30 , especially among health professionals, owing to demanding work schedules and the increasing frequency of being approached for surveys 31 . Some recent surveys in the field of peer review show even lower response rates (10-20%) among researchers, peer reviewers and readers 32,33 . It is also noteworthy that we took a pragmatic approach to identifying relevant editors, and the sample was small because few editors have conducted or published research on improving adherence to RGs. Whilst n=24 is a small number, the detailed and rich responses that we received showed a high level of engagement with the topic. Although we had the option of increasing the sample size by contacting more editors at a lower level of hierarchy in the journals we targeted, we decided not to do so based on the response rate of the survey. That approach would have changed our sampling frame and we would potentially have had less experienced editors commenting. We took that decision as the purpose of the survey was to tap into the experience of those who had tried interventions or had shown interest in this area, instead of seeking a representative sample of editors.
Connected with this, we could expect survey participants to be more prone to adopt interventions than general biomedical editors. However, their experience could also make them more critical of certain strategies that appear to be more effective than they actually are. This could be the case for the intervention of requesting authors to submit checklists on manuscript submission, which has become popular among medical journals despite having little or no impact on completeness of reporting 6-9 . Editors with less experience of editorial strategies to improve adherence to RGs might expect authors and peer reviewers to respond to certain interventions in a different way than they would do.
We encourage researchers to perform further evaluations of interventions in collaboration with biomedical journals, such as the RCT our research team is currently conducting 34 . Our study aims to evaluate the effect on completeness of reporting of a trained researcher assessing during peer review the consistency between the CONSORT checklists submitted by authors and the information reported in the manuscript, and providing authors with a report indicating any inconsistencies found.
Providing high quality evidence of the effectiveness of different interventions at improving adherence to RGs and discussing how to make them less burdensome are key aspects needed to convince all stakeholders that this effort is worth it.

Conclusions
Biomedical journal editors generally believed that engaging trained professionals in the process of checking adherence to RGs would be the most effective, yet moderately resource intensive, editorial intervention. Also, they thought that standard peer reviewers should not be asked to check RG requirements.
Future evaluations of interventions to improve adherence to RGs can take into account the barriers, facilitators, and incentives for implementing editorial interventions that are described in this survey.

Data availability
This project contains the following extended data:
• Figure S1: Survey questionnaire (Complete version of the survey questionnaire used in this project)
• Table S1: Barriers, facilitators and possible improvements of the included interventions (Table containing the […]

The authors took care to respond to all the reviewers' questions. However, there are still three points that were not adequately addressed, and I hope I can make some suggestions to help them solve these problems. I also found two additional problems that came up in this second version of the manuscript.
Of the 20 points I commented on in my last report, the authors adequately responded to or solved the problems in items 1, 3-5, 7, 9-17, 19 and 20. Thank you for that. Below you can find comments on the remaining items (I used the same numbers as before to make it easier to locate them) and the two additional ones.
2) Sample. The authors explained that they purposely searched for editors experienced with reporting guidelines. As much as I don't agree with this approach, at least it was clearly stated now.
However, the explanation about the sample size as a study limitation still does not seem satisfactory. The authors added a comment in the discussion section attributing the sample size to a low response rate in surveys in general. But if the authors knew there was going to be a low response rate (and they cited very useful references for that), why did they stop recruiting? Why not try to approach other editors/journals? The authors should expand the explanation to respond to two questions: What were the constraints in increasing the sample? Why did you stop recruiting?
Even choosing to use a "not representative sample", can we trust that we should listen to the opinions of these 24 editors (a minority among the 99 invited) to say something is feasible or not, for example? What is the effect of this small sample size on the "take-home messages" of this study? These points were not addressed in the paragraph about study limitations, and I think they should. Also: who were the editors who did not respond? Do you have information about them that you could show in a table? If you can find and show a pattern of response (for example, editors from high-visibility journals would have less interest in responding?), then we can understand better this low response rate.

6) Figure 1. Figure 1 improved a lot, thank you. Now it shows outliers and the types of interventions are clearer too (I wonder if you could put these "nicknames" in Box 1 too!). However, it is still a bit difficult to read/interpret. I believe the problem is that you chose to show means when you probably should have used medians (more appropriate to scores and to data that is not normally distributed). I suggest the authors try to check for data distribution, try to generate this figure with means, and that they also describe this method in the Methods -Data analysis section.
8) Conclusions. The abstract improved, well done! However, the conclusions seem not to be conclusions at all, not based on your data.
The first sentence is your (and even mine!) opinion, but it is not supported by your data. The second sentence tells about your aims, not your conclusions.
The same is valid for the conclusions at the end of the paper: they show valid comments/assumptions but nothing based on your data.
Useful conclusions should be linked to your aims (to survey editors' perceptions) and read like "the editors think that"... or "the perceptions of the editors are...".

18) References: you fixed most of them, but there are still references incomplete. Please review # 23, 24 and 28 and add the URLs.

Additional concerns:
Regarding Data Analysis: I noticed you added text in this section about "excluding one questionnaire that was terminated early". That called my attention: is this one of the 24 (and then we have 23 responded questionnaires) or was it a 25th questionnaire? Also: how did you approach missing data? By excluding the participants? Was there any other case?
Minor: Thank you for providing the names of the journals in Table 1

Thanks a lot, Patricia, for your extra comments and the time you took to make them. Please see below our response to them:

Minor comments
References: you fixed most of them, but there are still references incomplete. Please review # 23, 24 and 28 and add the URLs.
In our previous revised manuscript, we had already added the URLs for all Internet references. However, it seems that the copy-editors of F1000 did not explicitly show them and just included a link to them under "Reference source".
Regarding Data Analysis: I noticed you added text in this section about "excluding one questionnaire that was terminated early". That called my attention: is this one of the 24 (and then we have 23 responded questionnaires) or was it a 25th questionnaire? Also: how did you approach missing data? By excluding the participants? Was there any other case?
We excluded from analysis one questionnaire (the 25th) where the participant just opened the survey (and therefore SurveyMonkey registered it) but left without answering any question. For this reason, we decided not to count it as a participant. There were no further cases like this. We have made that more explicit in the text (first paragraph of Data analysis, first paragraph of Results).

Table 1. Thank you for providing the names of the journals in Table 1. However, I suspect "Francis and Taylor" is not a journal name. Please, check (maybe there is a confusion with the publisher's name "Taylor & Francis", but what was the specific journal?).
We have checked and confirmed that this was a mistake. Thanks to this, we have realised that for one of the existing journals we had two editors instead of one. We have subsequently adjusted the sample description.

Major comments
2) Sample. The authors explained that they purposely searched for editors experienced with reporting guidelines. As much as I don't agree with this approach, at least it was clearly stated now. However, the explanation about the sample size as a study limitation still does not seem satisfactory. The authors added a comment in the discussion section attributing the sample size to a low response rate in surveys in general. But if the authors knew there was going to be a low response rate (and they cited very useful references for that), why did they stop recruiting? Why not try to approach other editors/journals? The authors should expand the explanation to respond to two questions: What were the constraints in increasing the sample? Why did you stop recruiting? Even choosing to use a "not representative sample", can we trust that we should listen to the opinions of these 24 editors (a minority among the 99 invited) to say something is feasible or not, for example? What is the effect of this small sample size on the "take-home messages" of this study? These points were not addressed in the paragraph about study limitations, and I think they should. Also: who were the editors who did not respond? Do you have information about them that you could show in a table? If you can find and show a pattern of response (for example, editors from high-visibility journals would have less interest in responding?), then we can understand better this low response rate.

We have added further justifications on whether we believed that sample size and response rates were adequate for the purpose of the survey: "Despite having the option to increase the sample size by contacting more editors at a lower level of hierarchy in the journals we targeted, we decided not to do it based on the response rate of the survey. That approach would have changed our sampling frame and we would potentially have had less experienced editors commenting. We took that decision as the purpose of the survey was to tap into the experience of those who had tried interventions or had shown interest in this area, instead of seeking a representative sample of editors." We found no pattern of response among the journals/editors surveyed; participation was pretty homogeneous and covered journals of a variety of impact factors and areas, as can be seen in Table 1.

6) Figure 1. Figure 1 improved a lot, thank you. Now it shows outliers and the types of interventions are clearer too (I wonder if you could put these "nicknames" in Box 1 too!). However, it is still a bit difficult to read/interpret. I believe the problem is that you chose to show means when you probably should have used medians (more appropriate to scores and to data that is not normally distributed). I suggest the authors try to check for data distribution, try to generate this figure with means, and that they also describe this method in the Methods -Data analysis section.
We are now reporting the results for the ease of implementation/effectiveness in terms of medians and quartiles instead of means and standard deviations. We have indicated this in the Methods-Data analysis section. Regarding Figure 1, we have removed the means and we are now displaying the medians as blue horizontal lines. Moreover, we decided to remove the black points representing the individual scores as they do not allow us to see the medians in the graphs. In this way, Figure 1 is clearer and cleaner.
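As a minimal sketch of the kind of summary described above, assuming hypothetical ratings on a 1 to 5 scale (the values and the function name `summarise` are illustrative and are not taken from the survey data), the median and quartiles for one intervention could be computed as follows:

```python
from statistics import median, quantiles

# Hypothetical 1-5 ratings for a single intervention.
# Illustrative values only; these are NOT data from the survey.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]


def summarise(values):
    """Return the median and the first and third quartiles of ordinal scores."""
    # method="inclusive" treats the values as the whole population of
    # responses; this choice is an assumption, not the authors' stated method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"median": median(values), "q1": q1, "q3": q3}


summary = summarise(scores)
print(summary)
```

Unlike the mean and standard deviation, these three values do not assume that the ratings are normally distributed, which is the reason given for the change.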

8) Conclusions. The abstract improved, well done! However, the conclusions seem not to be conclusions at all, not based on your data. The first sentence is your (and even mine!) opinion, but it is not supported by your data. The second sentence tells about your aims, not your conclusions. The same is valid for the conclusions at the end of the paper: they show valid comments/assumptions but nothing based on your data. Useful conclusions should be linked to your aims (to survey editors' perceptions) and read like "the editors think that"... or "the perceptions of the editors are...".

We have modified the Conclusions both in the abstract and at the end of the paper, linking them to the survey aims (perceptions, barriers, facilitators…). We have also moved to the end of the Discussion section what was reported in the old Conclusions.
No competing interests were disclosed.

Positive points:
This is a much-needed survey, and I am happy and thankful that the MiRoR team invested their time and expertise in undertaking it. The subject of the adoption of reporting guidelines in the editorial process is sensitive, relevant and worth exploring. Hearing editors is a crucial step for the development of interventions aimed to improve the quality of biomedical research reporting. I congratulate the authors for the initiative.
The main result of the study (that the interventions perceived as effective are the most difficult to implement and vice-versa) points to the need to verify feasibility before proposing interventions, something that we should all think about. I hope the authors find these suggestions helpful and constructive, and that they find these to be positive contributions.

Major concerns: methodology
1) Sample characteristics. Although the authors should be praised for the idea of asking editors about interventions to improve reporting, I have a concern about what editors should be surveyed. If one wants to explore the perceived ease of the implementation of one intervention, is it adequate to choose a population of editors whom we know are already "knowledgeable about the topic"?
If these editors are aware of the reporting quality crisis and the reporting guidelines in general, wouldn't these be more prone to adopt interventions that aim to improve the quality of publications? Therefore, aren't these the people who would say "yes" to any intervention proposed?
Should the authors not interview editors who have never heard about reporting guidelines too? Wouldn't this heterogeneous sample be more representative of the scenario of difficulties -or the ease -of implementing reporting guidelines checks in the editorial process? Are the study results only applicable to journals that already endorse or require reporting guidelines?
I would like to see comments about this limitation in the discussion if the authors agree with this or even if they do not.
2) Sample size. My second most important concern is about sample size. In the Methods section, the authors do not explain how they planned the sample size and based on what assumptions. 24% seems to be a very low response rate. Do the authors believe 24 is enough? This limitation should be discussed.
3) Effectiveness. When you asked your participants about their opinions on "how effective" something would be, what was the definition of effectiveness? Would it be their perception about the intervention improving the adherence to reporting guidelines checklists? Was it a more subjective view of "manuscript quality"?
4) Statreviewer. The authors mention in the text, in Methods: "Based on feedback from the pilot we decided not to include the intervention "Implementation of the automatic tool Statreviewer14" since participants were not aware of this software and stated that their perceptions would strongly depend on details about how it operates which are not publicly available." However, on page 7, Results, they state: "The implementation of reading tools that automatically assess adherence to RGs, such as Statreviewer14, were seen as potentially interesting interventions." After all, was this intervention Statreviewer evaluated (as in Results) or was it not (as in Methods)? Do you mean they mentioned other tools, such as Penelope, GoodReports, Cobweb or others? Or were they referring to Statreviewer?

Major concerns: reporting
5) Journals. You interviewed 24 editors from 20 journals. What are these journals? I did not see a list of their names, and I believe it is important.

Another suggestion for Figure 1 (here, just a suggestion) is to add something to remind the readers what the interventions are. As Box 1 is far from the Figure, it is difficult to remember what each intervention is. It would be useful to have "nicknames", for example:
1. Page numbers
2. Checklist
3. Highlight
4. Subheadings
5. COBWEB
6. Use RG
7. Check RG
8. Staff
9. Training

7) Peer reviewers' checks. The abstract is, in general, well written and clear. However, I got intrigued by this sentence: "Participants believed that checking adherence to guidelines goes beyond the role of peer reviewers and could decrease the overall quality of reviews." I was surprised that peer reviewers checking reporting guidelines adherence would decrease the overall quality of the paper. Then I reread it in the paper, but here put in a complete sentence, that makes sense: "Participants were concerned that the quality of peer review could be compromised as reviewers should focus on the manuscript's content and not on the reporting issues." Meaning, I suppose, that peer reviewers would focus on checking reporting guidelines and not on their "expert view" of the study. I get it, but it is not very clear, neither in the text, nor in the abstract, and I invite the authors to rewrite both. Especially because "reporting guidelines" are about content too. They point out the content elements that should not be absent from papers, so using the word "content" in the manuscript text is not really appropriate. Please, re-evaluate.
8) As I said above, the abstract is good, and there is room for growth (you have now 232 words). However, I could not find the conclusions in the abstract helpful. Could you please inform better what you did conclude? What "points raised in the survey" should be considered?
9) Checklists as forms. ..."assessing completeness of reporting is a subjective task". This is one of the most important sentences in this manuscript, it tells us about using reporting guidelines checklists as evaluation tools -for what they have not been created. I saw no discussion about this in your paper.
10) Discussion. The discussion of the paper is short, and it lacks several points pointed above. Mainly, it does not discuss the limitations and potential bias of this study. Discussing limitations is a requirement from F1000Research journal.
11) You mention you did analyse qualitative information using NVivo. How did you do it?
12) There are at least eight items from the CHERRIES checklist that I could not see in your paper, and you might want to report, even if in a separate (or supplementary) box, from data protection, preventing multiple entries, to randomisation of items, and others.
13) From Box 1, I do not quite get the difference between Interventions 1 and 2. It seems that both ask authors to submit a populated/completed RG checklist. Is it that #1 asks for page numbers and #2 does not?
14) In my opinion, the sentences from "To contact editors..." to "improve adherence to RGs" belong more to the "Participants" section than to the "Procedure". Especially the first sentence. Also "Participants could suggest further editors that they considered could contribute to the survey" is something that I would consider as a snowballing or recruitment method, not the survey procedure per se. So, I would suggest putting everything related to recruitment together in one section, to make it easier for your reader (who might be interested in undertaking a similar survey) to find information on such a critical issue as recruitment. Also: how could participants suggest? Was it a specific question in the SurveyMonkey form?
15) Were participants given an estimation of the time required to answer the survey before they began?

Minor: about citations
16) Reference number 1 (EQUATOR Network website) is not a proper, complete reference. It points to the home page of the website, not to a document, and as so, it does not show from where the authors collected that information (the reader cannot find them in the home page

Thank you for the opportunity of reading and evaluating this paper.

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

how it operates which are not publicly available." However, on page 7, Results, they state: "The implementation of reading tools that automatically assess adherence to RGs, such as Statreviewer14, were seen as potentially interesting interventions." After all, was this intervention Statreviewer evaluated (as in Results) or was it not (as in Methods)? Do you mean they mentioned other tools, such as Penelope, GoodReports, Cobweb or others? Or were they referring to Statreviewer?
We have clarified in the results section ("Further interventions and incentives for authors and journals") that Statreviewer was not included in the survey but mentioned by some participants when asked about further potentially effective interventions.

Major concerns: reporting
5) Journals. You interviewed 24 editors from 20 journals. What are these journals? I did not see a list of their names, and I believe it is important.
The 20 journals represented are now listed in Table 1. We have also improved the description of the sample in the first paragraph of the Results section.

7) Peer reviewers' checks. The abstract is, in general, well written and clear. However, I got intrigued by this sentence: "Participants believed that checking adherence to guidelines goes beyond the role of peer reviewers and could decrease the overall quality of reviews." I was surprised that peer reviewers checking reporting guidelines adherence would decrease the overall quality of the paper. Then I reread it in the paper, but here put in a complete sentence, that makes sense: "Participants were concerned that the quality of peer review could be compromised as reviewers should focus on the manuscript's content and not on the reporting issues." Meaning, I suppose, that peer reviewers would focus on checking reporting guidelines and not on their "expert view" of the study. I get it, but it is not very clear, neither in the text, nor in the abstract, and I invite the authors to rewrite both. Especially because "reporting guidelines" are about content too. They point out the content elements that should not be absent from papers, so using the word "content" in the manuscript text is not really appropriate. Please, re-evaluate.
We completely agree with the point you raise. We have re-written this point in the Abstract, Results and Conclusions. "Participants believed that checking adherence to guidelines goes beyond the role of peer reviewers and were concerned that the quality of peer review could be compromised. Reviewers are generally not expected to focus on reporting issues but on providing an expert view on the importance, novelty, and relevance of the manuscript."

8) As I said above, the abstract is good, and there is room for growth (you have now 232 words). However, I could not find the conclusions in the abstract helpful. Could you please inform better what you did conclude? What "points raised in the survey" should be considered?
We have re-written and expanded the Conclusions in the abstract. The abstract is now 295 words.

9) Checklists as forms. ..."assessing completeness of reporting is a subjective task". This is one of the most important sentences in this manuscript, it tells us about using reporting guidelines checklists as evaluation tools -for what they have not been created. I saw no discussion about this in your paper.
We have now discussed this point in the paragraph before the limitations of the study.

10) Discussion. The discussion of the paper is short, and it lacks several points pointed above. Mainly, it does not discuss the limitations and potential bias of this study. Discussing limitations is a requirement from F1000Research journal.
We have now included such a section in the Discussion section.

11) You mention you did analyse qualitative information using NVivo. How did you do it?
We have now provided further information on how we did the qualitative analysis with NVivo ("Data analysis" section).

12) There are at least eight items from the CHERRIES checklist that I could not see in your paper, and you might want to report, even if in a separate (or supplementary) box, from data protection, preventing multiple entries, to randomisation of items, and others.
We have now included the information corresponding to the missing items throughout the whole Methods section.

13) From Box 1, I do not quite get the difference between Interventions 1 and 2. It seems that both ask authors to submit a populated/completed RG checklist. Is it that #1 asks for page numbers and #2 does not?
#1 asks for page numbers and #2 for text from the manuscript. Some people may think that if you nudge authors to have to copy-paste text into the checklist they may tend to take it more seriously than if you just ask them to put a page number. We have slightly modified their wording in Box 1 to make it clearer.

14) In my opinion, the sentences from "To contact editors..." to "improve adherence to RGs" belong more to the "Participants" section than to the "Procedure". Especially the first sentence. Also "Participants could suggest further editors that they considered could contribute to the survey" is something that I would consider as a snowballing or recruitment method, not the survey procedure per se. So, I would suggest putting everything related to recruitment together in one section, to make it easier for your reader (who might be interested in undertaking a similar survey) to find information on such a critical issue as recruitment. Also: how could participants suggest? Was it a specific question in the SurveyMonkey form?
We have restructured that part of the manuscript and divided it into "Recruitment" and "Survey administration". We have also provided the information requested.

15) Were participants given an estimation of the time required to answer the survey before they began?
We have now included this information ("Survey administration" subsection).

18) Reference 14 also lacks authors and titles. References 13, 14, 16, 21 and 22 lack a lot of elements necessary for readers to be able to find them. When working in a printed version (as I am now), you cannot see what the document is, where it is published or by whom -these are available only by those reading online and able to "click" on the links.
Done.

19) I would change the sentence: "Raw survey results are given as Underlying data20." To something clearer and that does not require the reader to go to the reference list to understand what it is. "Table XX in the supplementary file X20 shows the responses from all 24 participants."
Done.

20) Has the study by

A limitation of the study is that the results are based upon a survey return rate of just 24%. The returns are likely biased favouring editors with stronger views on reporting guidelines. In addition, it is not clear whether all ten of the top-ten journals were represented, or whether one or more journals dominate the response rate. In addition it is not completely clear how many participants came from the MiRoR Network, or journals previously publishing studies on Reporting Guidelines.

Figure 1 is hard work to interpret and might be presented more usefully as two separate graphs. In addition, the colour scheme could be changed to highlight those interventions considered "Very easy" or "Moderately easy" to implement, and those ranked as "Very effective" or "Moderately effective". (Note that the proportions of each score could still be retained.)

However, this is a useful addition to the literature offering valuable insight into some views on guidelines and candidate interventions promoting closer adherence to guidelines. The paper would be improved by a paragraph on Limitations of the Study.

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly

We thank you for your time and comments on the manuscript. Please find below a point by point response to them:
A limitation of the study is that the results are based upon a survey return rate of just 24%. The returns are likely biased favouring editors with stronger views on reporting guidelines.
We have now reflected on this point in the Discussion section.
In addition, it is not clear whether all ten of the top-ten journals were represented, or whether one or more journals dominate the response rate. In addition it is not completely clear how many participants came from the MiRoR Network, or journals previously publishing studies on Reporting Guidelines.
We have now added that information to the first paragraph of the Results section.

Figure 1 is hard work to interpret and might be presented more usefully as two separate graphs. In addition, the colour scheme could be changed to highlight those interventions considered "Very easy" or "Moderately easy" to implement, and those ranked as "Very effective" or "Moderately effective". (Note that the proportions of each score could still be retained.)
In order to make Figure 1 easier to interpret, we have replaced the bar plot with a box plot. We have also plotted the individual and mean scores for each intervention and each outcome. Everything is explained in the legend. However, we prefer not to separate it into two graphs as we believe it is interesting for the readers to be able to compare the scores that each intervention obtains for each of the two outcomes.

The paper would be improved by a paragraph on Limitations of the Study.
We have now included a section on Limitations of the study in the Discussion section.
Competing Interests: No competing interests were disclosed.