Learning from Cochrane systematic reviews: what improvements do these suggest for the design of trials?

Background: Many randomised trials have serious methodological flaws that fatally undermine their results, which makes the research wasteful. This is of concern for many, including those doing systematic reviews that include trials. Cochrane systematic reviews have a section called 'Implications for research', which allows authors of the review to present their conclusions on how future research might be improved. Looking at these conclusions might highlight priority areas for improvement.
Methods: We focused on the Cochrane Schizophrenia Review Group and the Multiple sclerosis and rare diseases of the central nervous system Review Group (the MS Review Group). Reviews with citation dates between 2009 and 2019 were identified and the recommendations of review authors in 'Implications for research' were put into categories.
Results: Between 2009 and 2019 we identified 162 reviews for the Schizophrenia Review Group and 43 reviews for the MS Review Group. We created 22 categories of recommendations in total, of which 12 were common to both groups. The five most used categories were the same for both: better choice of outcomes; better choice of intervention/comparator; longer follow-up; larger sample size; use of validated scales. Better choice of outcomes and/or intervention/comparator was recommended in over 50% of reviews. Longer follow-up and larger sample size were recommended in over a third, with use of validated scales being suggested in around a fifth of reviews. There was no obvious pattern of improvement over time for trials included in systematic reviews published by both groups.
Conclusions: We suggest that trialists working in these and other areas ask themselves, or are compelled to do so by others (e.g. funders), why they have chosen their outcomes, intervention and comparator, whether follow-up is long enough, if the sample size is big enough and whether the scales they choose to measure their outcomes are appropriate.


Introduction
Randomised trials are at the heart of evidence-informed healthcare systems. Not all trials, however, are created equal. Some are excellent, others have serious methodological flaws that fatally undermine their results and make the research wasteful [1][2][3]. This is of particular importance for the systematic reviews that should be used to underpin decision making in health care, and systematic reviewers often identify relevant issues and suggest ways to improve the quality of future trials. Cochrane is an international organisation that aims to provide high-quality information to support health decisions by systematically reviewing research, especially from randomised trials to investigate the effects of healthcare interventions. It is organised across more than 50 Cochrane Review Groups, each of which looks after a particular area of health.
Cochrane pays great attention to the methodological quality of both the reviews and the studies they include [4]. Cochrane systematic reviews all have a section called 'Implications for research', which allows the authors of the review to present their conclusions on how future research might be improved, for example by discussing the types of interventions that need more evaluation, or the outcomes which should, or should not, be measured and reported [5].
The work described here looked at the 'Implications for research' sections of reviews published by two Cochrane Review Groups between 2009 and 2019 (2019 was incomplete at the time of the work). Our aims were to:
1. Categorise the recommendations made in 'Implications for research'.
2. Explore whether there have been changes over time and between the two groups.

Methods
We focused on the Schizophrenia Review Group and the Multiple sclerosis and rare diseases of the central nervous system Review Group (which we call the MS Review Group hereafter). The Schizophrenia Review Group was chosen because it has a long-standing interest in what is written in the 'Implications for research' section and we expected it to have good, consistent reporting. The MS group was selected because one of us (SP) had a special interest in this clinical area.
The work was split into two stages:

Results
For the period 2009 to 2019, we identified 162 reviews for the Schizophrenia Review Group and 43 reviews for the MS Review Group. A wide variety of interventions were covered by these reviews, including drugs, educational and behavioural interventions, and other therapies such as physical exercise and acupuncture. The median number of included studies in reviews from the Schizophrenia Group was seven (range 0 to 174); for the MS Group it was five (range 0 to 45).
We created 22 categories in total, of which 12 were common to both review groups. The five most used categories were the same for both groups (better choice of outcomes, better choice of intervention/comparator, longer follow-up, larger sample size and use of validated scales). However, the ranking of each category in that top five list varied from one year to the next; no category was consistently most used (Figure 1 and Figure 2). There was no obvious pattern of improvement over time for trials included in systematic reviews published by both groups.
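The tallying behind statements of this kind can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the records below are invented toy data, and the field names (`group`, `year`, `categories`) and the function names `category_shares` and `yearly_ranking` are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, NOT data from the study: one entry per review,
# holding the review group, the citation year and the recommendation
# categories coded from its 'Implications for research' section.
reviews = [
    {"group": "Schizophrenia", "year": 2009, "categories": {"outcomes", "sample size"}},
    {"group": "Schizophrenia", "year": 2009, "categories": {"outcomes", "follow-up"}},
    {"group": "Schizophrenia", "year": 2010, "categories": {"follow-up"}},
    {"group": "MS", "year": 2009, "categories": {"outcomes"}},
    {"group": "MS", "year": 2010, "categories": {"validated scales", "follow-up"}},
]


def category_shares(records):
    """Return {category: proportion of reviews that used it}."""
    if not records:
        return {}
    counts = Counter(cat for r in records for cat in r["categories"])
    return {cat: n / len(records) for cat, n in counts.items()}


def yearly_ranking(records, group):
    """Return {year: categories ordered from most to least used} for one group."""
    by_year = defaultdict(Counter)
    for r in records:
        if r["group"] == group:
            by_year[r["year"]].update(r["categories"])
    # Ties are broken alphabetically so the ordering is deterministic.
    return {
        year: [cat for cat, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))]
        for year, c in sorted(by_year.items())
    }


shares = category_shares(reviews)
ranking = yearly_ranking(reviews, "Schizophrenia")
```

Comparing the ordered lists in `ranking` from one year to the next is one simple way to see whether any category is consistently the most used.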

Conclusions
There is substantial overlap in the types of recommendation made in the 'Implications for research' sections of systematic reviews done by the Cochrane Schizophrenia and MS Review Groups. These were easier to identify in the Schizophrenia Review Group's reviews because of their consistent approach to presenting implications for research in accordance with published guidance [6]. Many of their reviews also routinely include a suggested design for a future trial in this section.
The five most frequently made recommendations are the same for both groups, with better choice of outcomes being top of the list and used in over half the 205 reviews included in our study. Looking across the decade to 2019, there is no obvious pattern of decrease in the areas of methodology that need to be improved in trials included in systematic reviews published by both groups.
Previous research found that small, underpowered studies make up the entirety of evidence in most meta-analyses reported by Cochrane reviews [7]. Only 35% of the reviews in our sample mentioned sample size in the 'Implications for research' section, which suggests that reviewers may be underestimating the size of this problem. The most frequent issue raised in our work was the choice of outcome, a persistent problem that led to the COMET initiative to facilitate the development of core outcome sets [5,8]. Cochrane reviewers need to continue to highlight trial design problems; indeed they could perhaps do so more often, especially around sample sizes.

Limitations
It is possible that different researchers would have categorised the implications for research differently, although we did use two reviewers working independently and there was little disagreement. Selecting other Cochrane review groups may have led us to conclude that different recommendations were most common but without doing that work we have no way of knowing this.
Despite looking at just two Cochrane review groups, we believe that our findings are likely to be generalisable to other areas of health and health care but, at a minimum, a good start for the 2020s would be for researchers planning trials in schizophrenia and multiple sclerosis to ask themselves (or be compelled to do so by others, e.g. funders) the following questions:
1. Why did we choose these outcomes?
2. Why is this the right intervention and comparator?

© 2020 Gluud C. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Christian Gluud
The Copenhagen Trial Unit, Copenhagen University Hospital, Copenhagen, Denmark
Pirosca and colleagues have examined the 'Implications for research' section of 205 Cochrane reviews published between 2009 and 2019 from two editorial groups. The five most used categories were the same for both editorial groups: better choice of outcomes; better choice of intervention/comparator; longer follow-up; larger sample size; use of validated scales. Better choice of outcomes and/or better choice of intervention/comparator was recommended in over half of reviews; longer follow-up and larger sample size were recommended in over a third; and use of validated scales was suggested in around a fifth of reviews.
I think the authors should also discuss their interesting findings from the point of view: how good are the Cochrane authors? E.g. we know from numerous studies (e.g., Rebecca M Turner et al.) that 80% to 90% of all Cochrane reviews do not reach a fair meta-analytic sample size providing the necessary power to assess medium, large and small (but still clinically relevant) intervention effects. How come then that almost 66% of reviews do not mention larger sample sizes?

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Evidence based clinical practice.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 22 Jun 2020
Shaun P. Treweek, University of Aberdeen, Aberdeen, UK

Response to comments on F1KR00CDE F1R-VER24991-A 22/6/2020
Thanks for the reviewers' comments on our manuscript. Our responses are below. We have made the necessary changes to the manuscript using track changes so that they can be easily seen. We have also submitted a clean version with the changes accepted.

Reviewer #2
I think the authors should also discuss their interesting findings from the point of view: how good are the Cochrane authors? E.g. we know from numerous studies (e.g., Rebecca M Turner et al.) that 80% to 90% of all Cochrane reviews do not reach a fair meta-analytic sample size providing the necessary power to assess medium, large and small (but still clinically relevant) intervention effects. How come then that almost 66% of reviews do not mention larger sample sizes? We have added the following text to the 'Conclusions' section: Previous research found that small, underpowered studies make up the entirety of evidence in most meta-analyses reported by Cochrane reviews. Only 35% of the reviews in our sample mentioned sample size in the 'Implications for research' section, which suggests that reviewers may be underestimating the size of this problem. The most frequent issue raised in our work was the choice of outcome, a persistent problem that led to the COMET initiative to facilitate the development of core outcome sets. Cochrane reviewers need to continue to highlight trial design problems; indeed they could perhaps do so more often, especially around sample sizes.
Competing Interests: No competing interests were disclosed.

Livia Puljak
Center for Evidence-Based Medicine and Health Care, Catholic University of Croatia, Zagreb, Croatia
I have reviewed with interest the manuscript titled "What improvements do Cochrane systematic reviewers suggest for the design of trials? [version 1; peer review: awaiting peer review]". The authors report a study based on a very elegant idea, analysing content of the "Implications for research" section in selected Cochrane reviews. I would like to suggest the following minor revisions:
Title: I would suggest revising in a way to reflect the fact that these recommendations came from Cochrane reviews. When I first read the title, I initially thought that this manuscript reports results of a survey among Cochrane authors.

Abstract
"Reviews with citations between 2009 and 2019 were identified" -should this be "reviews published between 2009 and 2019"?
"We created 22 categories" -I would suggest revising "categories" into "categories of recommendations".
Conclusion: I am not sure that it is sufficient to recommend to trialists to "ask themselves". The problems that were mentioned (i.e. recommendations) were consistent over the analysed period. I think it would be more meaningful to conclude that we need to analyse the effects of interventions that will force/motivate trialists to change their actions when designing clinical trials.

Methods
It would be useful to have a section called Data extraction, to describe all the data that were extracted. For example, in the beginning of the Results, the authors have mentioned type of interventions, and number of studies included in analysed reviews, but this was not mentioned as extracted in the Methods.

Results
The Results section is rather short. I would appreciate more text in this section, e.g. about categories specific only to each review group analysed. However, I understand that this is a brief report, so I would not insist on this if it is not feasible for this type of article.
In the Discussion, the authors mentioned that some categories were "clearer", but I am not really sure what this means. I also do not see anything mentioned about that in the Methods, and Results.
In the Abstract, the authors wrote "There was no obvious pattern of improvement over time for trials included in systematic reviews published by both groups", and in the Results, the authors wrote "However, the use of these five categories varied over time for each group". Is this supposed to be the same? The expression "varied" is very non-specific.

Discussion
I would prefer to see this explained in more detail; I am not sure I understand completely what it means: "These were clearer in the Schizophrenia Review Group's reviews because of their structured approach to presenting implications for research in accordance with published guidance". What kind of "structured approach" is this?
Is this a formal requirement, is there a reference to be used for this sentence: "Their reviews also routinely include a suggested design for a future trial in this section."
"cited in over half" -I would suggest to revise "mentioned (or used) in over half".
Conclusion statement at the end of the manuscript: the same comment as for the conclusion statement at the end of the Abstract.
Perhaps the authors should mention some other resources that the trialists should use to improve some of these aspects, such as core outcome sets.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 22 Jun 2020
Shaun P. Treweek, University of Aberdeen, Aberdeen, UK

Response to comments on F1KR00CDE F1R-VER24991-A 22/6/2020
Thanks for the reviewers' comments on our manuscript. Our responses are below. We have made the necessary changes to the manuscript using track changes so that they can be easily seen. We have also submitted a clean version with the changes accepted.

Reviewer #1
Title: I would suggest revising in a way to reflect the fact that these recommendations came from Cochrane reviews. When I first read the title, I initially thought that this manuscript reports results of a survey among Cochrane authors. We have changed the title to "Learning from Cochrane systematic reviews: what improvements do these suggest for the design of trials?"
"Reviews with citations between 2009 and 2019 were identified" -should this be "reviews published between 2009 and 2019"? In the first draft of the report we had 'published' but we have decided to change to 'citations' because of the way Cochrane makes its reviews available. By 'citation' we mean that we examined the most recent version of any review that was first published or was updated in 2009 to 2019. This is important because reviews from before 2009 were still "published" during the decade, but were not updated in that time period.
"We created 22 categories" -I would suggest revising "categories" into "categories of recommendations". Done.
Conclusion: I am not sure that it is sufficient to recommend to trialists to "ask themselves". The problems that were mentioned (i.e. recommendations) were consistent over the analysed period. I think it would be more meaningful to conclude that we need to analyse the effects of interventions that will force/motivate trialists to change their actions when designing clinical trials. We have added a few sentences, including the available resources for researchers to change their actions, to the 'Conclusions' section: There are resources that can help with the above. For example, COMET (http://www.comet-initiative.org) can help with outcome choice and PRECIS-2 can help with design decisions around comparators and follow-up. Discussions with staff at clinical trial units and other research support centres are also likely to improve designs. Change is slow at present and initiatives to encourage, or force, trialists to consider these questions would be welcome, especially from funders.
In the abstract, we have also added a few words to address this comment (addition in italics, which is also included in the main Conclusions): We suggest that trialists working in these and other areas ask themselves, or are compelled to do so by others (e.g. funders),…
It would be useful to have a section called Data extraction, to describe all the data that were extracted. For example, in the beginning of the Results, the authors have mentioned type of interventions, and number of studies included in analysed reviews, but this was not mentioned as extracted in the Methods. We have added a new file in the repository including all the data we extracted so that readers can see the data we extracted. We have added the following text to the 'Methods' section: Additionally we extracted information on the number of included participants and studies, the risk of bias etc. (See Data Availability for a file containing the full list).
Results section is rather short. I would appreciate to read more text in this section, i.e. about categories specific only to each review group analyzed. However, I understand that this is brief report, so I would not insist on this, if this is not feasible for this type of article.
As this is a Brief Report, we have decided to keep the 'Results' section to the point and we have not repeated in words information that is in the table. We have added some extra detail though, see response to Comment #8.
In the Discussion, the authors mentioned that some categories were "clearer", but I am not really sure what this means. I also do not see anything mentioned about that in the Methods, and Results. We have replaced this text with: However, the ranking of each category in that top five list varied from one year to the next; no category was consistently most used (Figures 1 and 2). There was no obvious pattern of improvement over time for trials included in systematic reviews published by both groups.
In the Abstract, the authors wrote "There was no obvious pattern of improvement over time for trials included in systematic reviews published by both groups", and in the Results, the authors wrote "However, the use of these five categories varied over time for each group". Is this supposed to be the same? The expression "varied" is very non-specific. The two sentences refer to different things: the one in the abstract to general improvement over time, the one in Results to the position of a given category in the top five from one year to the next. We have modified the text, see response to Comment #7.
I would prefer to see this explained in more detail; I am not sure I understand completely what it means: "These were clearer in the Schizophrenia Review Group's reviews because of their structured approach to presenting implications for research in accordance with published guidance". What kind of "structured approach" is this? Consistency was the key advantage and we have changed the wording of this sentence accordingly: These were easier to identify in the Schizophrenia Review Group's reviews because of their consistent approach to presenting implications for research in accordance with published guidance.
Is this a formal requirement, is there a reference to be used for this sentence: "Their reviews also routinely include a suggested design for a future trial in this section." We are not aware of this being a formal requirement, or of a reference we can point readers to. We have modified the sentence so that it now comes across as an observation rather than review group policy: Many of their reviews also include a suggested design for a future trial in this section.
"cited in over half" -I would suggest to revise "mentioned (or used) in over half". We have changed to 'used in over half'.
Conclusion statement at the end of the manuscript: the same comment as for the conclusion statement at the end of the Abstract. We have addressed this issue. See response to comment 4.
Perhaps the authors should mention some other resources that the trialists should use to improve some of these aspects, such as core outcome sets. See response to Comment #4.
Competing Interests: No competing interests were disclosed.