The use of ‘PICO for synthesis’ and methods for synthesis without meta-analysis: protocol for a survey of current practice in systematic reviews of health interventions

Introduction: Systematic reviews involve synthesis of research to inform decision making by clinicians, consumers, policy makers and researchers. While guidance for synthesis often focuses on meta-analysis, synthesis begins with specifying the 'PICO for each synthesis' (i.e. the criteria for deciding which populations, interventions, comparators and outcomes are eligible for each analysis). Synthesis may also involve the use of statistical methods other than meta-analysis (e.g. vote counting based on the direction of effect, presenting the range of effects, combining P values) augmented by visual display, tables and text-based summaries. This study examines these two aspects of synthesis.
Objectives: To identify and describe current practice in systematic reviews of health interventions in relation to: (i) approaches to grouping and definition of PICO characteristics for synthesis; and (ii) methods of summary and synthesis when meta-analysis is not used.
Methods: We will randomly sample 100 systematic reviews of the quantitative effects of public health and health systems interventions published in 2018 and indexed in the Health Evidence and Health Systems Evidence databases. Two authors will independently screen citations for eligibility. Two authors will confirm eligibility based on full text, then extract data for 20% of reviews on the specification and use of PICO for synthesis, and the presentation and synthesis methods used (e.g. statistical synthesis methods, tabulation, visual displays, structured summary). The remaining reviews will be confirmed as eligible and data extracted by a single author. We will use descriptive statistics to summarise the specification of methods and their use in practice. We will compare how clearly the PICO for synthesis is specified in reviews that primarily use meta-analysis and those that do not.
Conclusion: This study will provide an understanding of current practice in two important aspects of the synthesis process, enabling future research to test the feasibility and impact of different approaches.


Introduction
Systematic reviews provide a method for collating and synthesising research, and are used to inform decision making by clinicians, consumers, policy makers and researchers 1 . In health intervention research, the synthesis component of systematic reviews is often narrowly considered as the use of statistical methods to combine the results of studies, primarily meta-analysis, and much of the available guidance focuses on this approach. However, 'synthesis' can be considered more broadly as a process, beginning with defining the review questions, planning the groups to be compared, examining the characteristics of the available studies and their data, and applying appropriate methods to present and synthesise quantitative data from among multiple options (see Figure 1). Decisions made early in the process have important impacts on the information included in the synthesis, and meta-analysis may not always be possible or appropriate.
In this study, we plan to examine two intertwined aspects of synthesis that commonly challenge authors of systematic reviews examining the effects of health interventions (identified in italics in Figure 1): approaches to planning how studies will be grouped for synthesis within the review (the 'PICO (Population, Intervention, Comparator, Outcome) for each synthesis'); and the application of methods other than meta-analysis to summarise and synthesise quantitative results (hereafter described as 'other synthesis methods'). There has been limited examination of the range of approaches used to define the PICO for each synthesis and which other synthesis methods are used in current practice. Yet, these are essential aspects of the synthesis in systematic reviews.
Recent guidance published in the Cochrane Handbook for Systematic Reviews of Interventions 2-4 has outlined proposed methods for specifying the PICO for each synthesis and a range of other synthesis methods. Reporting guidance for 'synthesis without meta-analysis' (SWiM) has also been published 5 , covering these topics. However, further research is required to understand current practice and investigate how review authors approach the PICO for each synthesis and other synthesis methods.
We now expand on the concept of 'PICO for each synthesis' and methods for synthesising and presenting findings other than meta-analysis.

PICO for each synthesis
In reviews of the effects of interventions, authors commonly use the 'PICO' framework to prespecify the populations, interventions, comparators and outcomes that will be used to determine whether studies are eligible for the review 6 . While this definition of the 'PICO for the review' is viewed as a core component of a systematic review, more specific criteria are likely to be needed to define which groups of studies will contribute to each analysis within a review: the 'PICO for each synthesis'. The PICO for each synthesis can be considered an operationalisation of the review objectives.

Figure 1. Steps in evidence synthesis are to plan synthesis, explore data and conduct synthesis. Key issues examined in this study identified in italics. PICO = Population, Intervention, Comparator, Outcome.

Amendments from Version 1
This protocol has been revised in response to peer review feedback. The title has been revised to more clearly communicate our proposed study design. The Background has been revised to provide a more extensive discussion of our methods of interest: specifying the 'PICO for each synthesis' and methods for synthesis and summary other than meta-analysis. We discuss how diversity and heterogeneity in included studies relate to decisions about grouping studies for synthesis. We clarify how our focus on methods to summarise and synthesise quantitative results in the absence of meta-analysis relates to methods commonly described as 'narrative synthesis'. We provide additional detail on how we propose to identify our concepts of interest in practice through our data collection (content previously limited to our data dictionary, which was provided as Extended data). We provide additional detail on our screening, piloting and data extraction methods. The Abstract has also been revised to reflect these changes and provide additional detail.
Any further responses from the reviewers can be found at the end of the article.

The process for defining the PICO for each synthesis ideally involves identifying characteristics (e.g. of the intervention or population) that may be expected to modify the intervention effect; clearly labelling and defining groups based on these characteristics (these may be based on an existing classification system if available); and planning how these groups will be used in synthesis and reporting. Groups may be analysed together in an overall synthesis, or they may be considered in separate syntheses 4 . Within an overall analysis, the defined groups may be used to explore any differences in the estimated effects (i.e. to explore statistical heterogeneity through the use of subgroup analysis). An example demonstrating the distinction between the PICO for the review and the PICO for each synthesis is presented in Box 1.

Box 1. Example: PICO for the review and PICO for each synthesis
In a review of psychosocial interventions for smoking cessation 7 , the PICO for the review included any psychosocial intervention in pregnant women to help them stop smoking.
One of the objectives of the review was to examine "the effectiveness of the main psychosocial intervention strategies in supporting women to stop smoking in pregnancy (i.e. counselling, health education, feedback, social support, incentives, exercise)". In order to meet this objective, a series of syntheses were presented within the review to assess the effects of each intervention strategy. So, for example, the PICO for the first synthesis presented included any counselling intervention for women during pregnancy compared to usual care, measuring the outcome of smoking abstinence in late pregnancy.
Another objective was to determine whether psychosocial interventions were effective in general. To address this objective, all intervention types were included in a single meta-analysis. Within this analysis, single, multi-component, and tailored interventions were presented as subgroups, to examine whether intervention effects were modified by having multiple or tailored components.

Providing such definition has important advantages. Creating a consistent language to describe different groups or interventions can increase clarity of terminology for readers, allowing authors to compare features between the included studies and make consistent, transparent decisions about grouping similar studies for inclusion in a synthesis 3 .
The PICO for synthesis also provides a framework for examining similarities and differences in the characteristics of studies contributing to each analysis, facilitating qualitative synthesis of characteristics needed to interpret results. This qualitative synthesis is a particularly important feature of reviews where there is diversity in study characteristics that may explain findings (e.g. intervention complexity, different study designs) 8 . Such diversity sometimes triggers a decision not to use meta-analysis, and instead adopt alternative methods to synthesise and present findings. In these circumstances, it is common for authors to refer to their synthesis methods as 'narrative' 9 , reflecting the integration of the synthesis of quantitative results from studies with the qualitative synthesis of study characteristics. In this study, we distinguish between these elements and, in the section that follows, focus on the methods used to combine quantitative data on intervention effects using a statistical technique and to present the results of these analyses.

Synthesising and presenting findings without meta-analysis
Many systematic reviews examining the effects of health interventions use meta-analysis of effect estimates to combine the results of studies 9,10 . However, it is estimated that between 35% and 56% of systematic reviews do not use meta-analysis at all 9,10 , and a larger percentage of reviews do not use meta-analysis for at least some outcomes. The reasons for not undertaking meta-analysis vary, but a commonly reported reason is that the included studies do not report data that is amenable to meta-analysis 9,11 . For example, studies may report effect estimates without a measure of variance, or only report the direction of effect, or they may report different effect measures that cannot be transformed into a common effect measure 2 . Diversity of study characteristics and the presence of statistical heterogeneity are other reasons given for not meta-analysing, but these are more contentious. The first brings into question whether any synthesis is appropriate, while the second may be addressed by using extensions to meta-analysis (e.g. meta-regression, prediction intervals) that attempt to explain or encompass heterogeneity 2,11 .
When meta-analysis of effect estimates is not possible, a range of summary and other synthesis methods are available (see examples in Table 1). These methods include alternative statistical synthesis methods, such as presenting summary statistics (e.g. range of effects), combining P values, and vote counting based on direction of effect. These synthesis methods may be augmented using tables, visual display (e.g. harvest plots, albatross plots) and, where synthesis is not appropriate, structured summaries of the results of individual studies 2,12 .
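To make the statistical methods named above concrete, the following is a minimal Python sketch of three of them: summarising the range of effects, vote counting based on direction of effect (with an exact sign test), and combining P values using Fisher's method. The function names and the example data are hypothetical and are not drawn from any review in our sample; the sketch assumes one effect estimate, one direction and one one-sided P value per study.

```python
import math
import statistics


def summarise_effects(effects):
    """Summary statistics for study-level effect estimates (median, IQR, range)."""
    q1, _, q3 = statistics.quantiles(effects, n=4)
    return {
        "median": statistics.median(effects),
        "iqr": (q1, q3),
        "range": (min(effects), max(effects)),
    }


def vote_count_sign_test(directions):
    """Vote counting based on direction of effect.

    directions: +1 if a study favours the intervention, -1 if it favours
    the comparator. Returns (number favouring intervention, number of
    studies, two-sided exact binomial P value under a null of 0.5).
    """
    votes = [d for d in directions if d != 0]
    n = len(votes)
    k = sum(1 for d in votes if d > 0)
    tail = min(k, n - k)
    one_tail = sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    return k, n, min(1.0, 2 * one_tail)


def fisher_combined_p(p_values):
    """Combine one-sided P values with Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom the
    upper tail has the closed form used below.
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total


# Hypothetical data for illustration only.
effects = [0.10, 0.25, -0.05, 0.40, 0.15]
print(summarise_effects(effects))
print(vote_count_sign_test([1 if e > 0 else -1 for e in effects]))
print(fisher_combined_p([0.04, 0.20, 0.60, 0.01, 0.12]))
```

As noted in the text, these methods answer a narrower question than meta-analysis: the sign test and Fisher's method address whether there is evidence of an effect in at least one study, not how large the average effect is.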
Other synthesis methods provide more limited information for health care decision making in comparison to meta-analysis (for example, providing information on the likely direction of effect, rather than an estimate of its magnitude 2 ). Nevertheless, structured summary or synthesis approaches may be preferable to simply presenting an unstructured description of study-level results, in which there is a risk that authors may privilege the results of some studies over others without appropriate justification, possibly introducing bias 9 .
Importantly, the use of other synthesis methods may alter the nature of the question answered by the review and the type of reasoning used to reach conclusions 2,13 .

Research context
We are unaware of other studies that have explicitly examined approaches to defining the PICO for each synthesis and planning comparisons. One cross-sectional study collected data on which PICO characteristics (e.g. population) were used to group studies for presentation or analysis within systematic reviews 9 . However, this study did not capture more detailed information on the basis of these groupings (e.g. was the population grouped by clinical disease characteristics, age or socioeconomic status), nor precisely how these groups were used in the synthesis.
Previous studies have examined the synthesis methods used in systematic reviews, and have estimated the percentage of reviews with and without meta-analysis 10,11,20 . One study examined systematic reviews of public health interventions that did not use meta-analysis in further detail 9 . They captured data on the use and reporting of "narrative" (text-based) synthesis and methods to investigate heterogeneity, but specific details of the synthesis methods used in the reviews were not captured. Another study examined the use of outcome groupings in synthesis and the use of methods other than meta-analysis, but the study was limited to Cochrane systematic reviews published before 2012 21 .

Objectives
The objectives of this study are to identify and describe current practice in systematic reviews examining the quantitative effects of public health and health systems interventions in relation to:
1. Approaches to grouping and definition of PICO characteristics for synthesis.
2. Methods of summary and synthesis when meta-analysis is not used.
Here we report the proposed methods for a cross-sectional study of a sample of systematic reviews.

Table 1. Examples of data items to be collected from sample, including systematic review characteristics, PICO for each synthesis and summary and synthesis methods. PICO = Population, Intervention, Comparator, Outcome. The complete draft data dictionary is available as Extended data 19 .
• Synthesis methods (e.g. meta-analysis, descriptive statistics, combining P values, vote counting based on statistical significance, vote counting based on the direction of effect)
• Methods to investigate or encompass heterogeneity (e.g. subgroup analysis, meta-regression, prediction intervals, non-parametric methods)
• Presentation methods (e.g. tables, forest plots, box-and-whisker plots, bubble plots, albatross plots, harvest plots, effect direction plots, stacked bar plots, funnel plots)
• Methods used to select among multiple effect estimates eligible for a synthesis
• Reporting of changes to planned methods

Overview
We will identify a sample of systematic reviews examining the quantitative effects of public health or health systems interventions. We will identify and describe the methods used to define the PICO for each synthesis and the methods used to summarise and synthesise results, including meta-analysis and other methods. Two authors will undertake study selection. One author will undertake data extraction, and a second author will conduct independent data extraction from a subset of studies. Any amendments or additions to this protocol will be reported in resulting publications.

Eligibility criteria
We will include systematic reviews that meet the following criteria:
1. A study that aims to synthesise the results of primary studies, states eligibility criteria for inclusion of studies, and reports a search strategy to identify potentially eligible studies.
2. Examines quantitative effects of any public health or health systems intervention, including policies, programs and strategies, as well as treatments and elements of care.
3. Includes at least one comparison with at least two studies, where a comparison is defined as examining the effect on an outcome of an intervention compared with a specific alternative.

4. Published in English.
We will exclude systematic reviews that:
1. Synthesise the results of other systematic reviews, such as overviews of reviews.
2. Answer questions that are not about effectiveness, for example prevalence, association, unplanned environmental exposures, prognosis, diagnosis and research methodology.
Our criterion for deciding that a review is 'systematic' is intentionally inclusive compared to available definitions 10,22,23 . This is because we are explicitly interested in identifying systematic reviews with a range of methods, and not only those meeting a minimum standard of methods or reporting.
Our focus is on systematic reviews of public health and health systems interventions. Reviews in these areas are likely to feature diversity in included populations and settings, as well as intervention complexity 24 . They are likely to include a range of study designs in addition to randomised trials, which in turn creates diversity in the effect measures used. Systematic reviews of public health and health systems interventions are more likely than other reviews to use synthesis methods other than meta-analysis 9,10 .

Sample size
For reasons of feasibility, we will restrict the number of included reviews to 100. A sample of this size will allow us to estimate the proportion of reviews that use, for example, a particular synthesis or presentation method to within a maximum margin of error of 10%. This assumes a prevalence of 50%, but for a smaller or larger prevalence, the margin of error will be smaller. We anticipate that the proportion of reviews included in our sample that contain no meta-analyses will be approximately 50% 9 .
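The arithmetic behind the stated margin of error can be checked with a short sketch. It assumes the usual normal-approximation (Wald) formula for a 95% confidence interval around a single proportion; the protocol does not state which formula was used, so this choice and the function name are our assumptions for illustration.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation 95% CI for a proportion p with n reviews."""
    return z * math.sqrt(p * (1.0 - p) / n)


# The margin is widest at a prevalence of 50% and narrows on either side.
for prevalence in (0.1, 0.3, 0.5, 0.7, 0.9):
    moe = margin_of_error(prevalence, n=100)
    print(f"prevalence={prevalence:.1f}  margin of error=+/-{moe:.3f}")
```

With 100 reviews and a prevalence of 50%, the half-width is 1.96 × 0.05 = 0.098, which is the "maximum margin of error of 10%" quoted above.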

Search strategy
Records of all the systematic reviews published during 2018 will be obtained from two databases of systematic reviews: Health Evidence and Health Systems Evidence (see Table 2). These databases index systematic reviews of public health and health systems interventions, respectively. Some reviews identified by the search may have final citations outside 2018, for example arising from the difference between the date of online first publication and final publication in an issue of the journal, or the time lag between publication and indexing in a database. In these cases, the reference information will be updated to reflect the final citation, but reviews will not be excluded.

Study selection
The records of systematic reviews retrieved from the two databases will initially be stored in EndNote and duplicate records removed. The selection and data extraction processes will then proceed using EPPI-Reviewer 25 . Reviews will be randomly selected from this larger set using EPPI-Reviewer's random selection function, and screened for eligibility until our target sample of 100 is met.
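The logic of sampling until the target is met can be sketched as follows. This is an illustration only and not EPPI-Reviewer's implementation: the function name, the fixed seed and the `is_eligible` callback (which stands in for the human screening decision) are all hypothetical.

```python
import random


def sample_until_target(record_ids, is_eligible, target=100, seed=2018):
    """Screen de-duplicated records in random order until `target` are included.

    Returns the included record ids and the number of records screened.
    If fewer than `target` records are eligible, all eligible records are returned.
    """
    rng = random.Random(seed)  # fixed seed so the screening order is reproducible
    order = list(record_ids)
    rng.shuffle(order)

    included, screened = [], 0
    for record_id in order:
        if len(included) >= target:
            break
        screened += 1
        if is_eligible(record_id):
            included.append(record_id)
    return included, screened


# Hypothetical example: 1,000 records, of which every second one is eligible.
included, screened = sample_until_target(range(1000), lambda r: r % 2 == 0)
print(len(included), screened)
```

Screening in a single random order means that the included reviews remain a random sample of the eligible records, regardless of how many ineligible records are passed over along the way.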
Records will be independently screened by two authors (MC and one of SB or JM) based on the title and abstract, and any clearly ineligible records excluded. The full text of potentially eligible systematic reviews will then be retrieved and assessed against the eligibility criteria by one author (MC). A second author (either SB or JM) will independently assess the full text of a sample of 20% of records. At each stage, we will resolve any disagreements by consensus, and consult a third author if consensus is not possible.
For each included systematic review, any protocol or registration record referred to in the review will be retrieved. In addition, protocols will be retrieved for any systematic reviews published in the Cochrane or Campbell Libraries, as they are a requirement of publication in these journals.

Data extraction and management
We will develop a data extraction form drawing on a previous methodological study that has examined synthesis and presentation methods used in systematic reviews 21 , as well as frameworks and methods outlined in relevant guidance 2-4 .
We will collect data relating to the review characteristics, PICO characteristics used to group studies for each synthesis, and the synthesis methods used. Examples of data to be collected are presented in Table 1. The complete draft data dictionary is available online as Extended data 19 . Both explicit methods described in the review and implicit methods observed in textual descriptions, tables and figures will be coded. Both planned and implemented methods will be collected where these differ.
In seeking to map current practice, we note that terms such as 'narrative synthesis' can be applied to a wide range of approaches, and will seek to identify specific components in our included reviews rather than relying on broad descriptive terms. We will collect: • Sources of guidance referred to in the text.
• Methods of summary and synthesis explicitly specified in the Methods section.
• Methods of summary and synthesis used in practice (whether named or used implicitly in the text).
• Specific elements that may appear within a text-based summary approach, such as the use of consistent effect measures across studies, the use of non-parametric summary statistics such as ranges, various methods of vote counting, and the use of PICO groupings to structure text or tables.
• Explicit statements by the authors that they have been unable to implement planned PICO groupings or synthesis methods, their stated reasons for this, and what changes they made to their methods in response.
One author (MC) will extract data from all included reviews, and a second author (either SB or JM) will extract data independently on a sample of 20% of the included reviews (including those with and without meta-analysis). We will pilot test the data extraction form and coding guidance on five reviews to ensure we capture all items, to refine the items and guidance when we uncover ambiguity or a lack of clarity, and to achieve a shared understanding of the form. This will be achieved using an iterative process, where we discuss discrepancies and ambiguities as extraction is completed on each review, and revise the data extraction form and coding guidance in response to these discussions. Duplicate data extraction on the selected sample will then proceed, and agreement will be assessed at the end of this phase. For any data items in which a high degree of inconsistency is observed, duplicate data extraction will be undertaken for a further random sample of reviews. During the final, single data extraction phase, any uncertainties arising will be discussed with three authors (MC, SB, JM) and consensus reached.
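The protocol does not name the statistic that will be used to assess agreement between extractors. As one possibility, the sketch below computes simple percent agreement and Cohen's kappa for a single categorical data item coded by two authors. Both the choice of statistics and the example codes are assumptions made for illustration.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of reviews on which the two extractors gave the same code."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two extractors for one data item."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:  # both extractors used a single identical category
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for one item (e.g. "intervention groups defined?") on five reviews.
first_extractor = ["yes", "yes", "no", "no", "yes"]
second_extractor = ["yes", "no", "no", "no", "yes"]
print(percent_agreement(first_extractor, second_extractor))
print(cohens_kappa(first_extractor, second_extractor))
```

Calculating agreement item by item, rather than across the whole form, would make it easier to identify the specific data items for which further duplicate extraction is warranted.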
We will limit our data collection to information contained in the published report(s) of the systematic review, including protocols and registry records, and will not contact authors to obtain additional information.

Analysis
We will calculate descriptive summary statistics to characterise the extent to which authors specify their PICO for synthesis, and the synthesis and presentation methods. For example, we will calculate the percentage of reviews in which intervention groups are listed by name, are defined in enough detail for replication, and have an explicit role in the planned synthesis. The same percentages will be calculated for the other PICO elements.
For dichotomous or categorical data, we will calculate percentages and frequencies. For continuous or count data, we will calculate the means (with standard deviations) and medians (with interquartile ranges). We will examine whether approaches used to group the PICO for each synthesis are associated with the type of synthesis method by calculating differences in percentages between groups with 95% confidence intervals. Data will be tabulated and summarised in figures. Analyses will be undertaken using Stata 28 .
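The comparison of percentages between groups can be illustrated with a minimal sketch. It is written in Python rather than Stata, and it assumes a simple normal-approximation (Wald) interval for the difference between two independent proportions; the protocol does not specify the interval method, and the counts shown are hypothetical.

```python
import math


def diff_in_proportions(x1, n1, x2, n2, z=1.96):
    """Difference in proportions (group 1 minus group 2) with a Wald 95% CI.

    x1, x2: number of reviews with the characteristic in each group.
    n1, n2: number of reviews in each group.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts: reviews that clearly define intervention groups,
# among 50 reviews using meta-analysis and 50 using other synthesis methods.
diff, lower, upper = diff_in_proportions(30, 50, 20, 50)
print(f"difference={diff:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

With groups of around 50 reviews each, intervals of this kind will be wide, which is consistent with the limitation on precision acknowledged in the Discussion.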

Dissemination
The findings of the research outlined in this protocol will be published. Associated datasets, data collection forms and analyses not included in any publication will be made publicly available via an online repository.

Study status
At submission of this protocol, the search had been conducted and screening of abstracts completed. Full text screening and piloting of the data extraction form were in progress.

Discussion
In this study, we will examine the choice of methods for two intertwined elements of synthesis in systematic reviews: the approaches used to define and group PICO characteristics, and the types of synthesis methods other than meta-analysis. The results of our study will provide a snapshot of these practices, and highlight where improvements may be required in the application and reporting of the methods. Further, the study will provide a baseline assessment prior to release of recent guidance published in the Cochrane Handbook for Systematic Reviews of Interventions 2-4 , against which future assessments can be compared.
There are several strengths to our study. Our sample of systematic reviews is likely to be representative of public health and health systems intervention reviews because the source databases from which we will select our sample, and our inclusion criteria, place no restrictions on the intervention type or other features of the systematic reviews (e.g. the type of included study designs). A further strength is that our data extraction items are based on pre-existing frameworks to classify both the PICO groupings and methods of summary and synthesis. This will ensure that we are capturing specific methods and enhance the consistency of our data extraction.
There are some possible limitations in our proposed methods. For some items, the sample size may not be large enough to yield precise estimates of the percentage of systematic reviews that use particular methods. In addition, we will not undertake independent full text screening and data extraction of all studies by two authors, leaving some risk that data will be missed or misclassified. However, the review team has extensive experience in systematic reviews of public health and health services interventions, having written guidance for, coauthored, and edited many such reviews. While this will not mitigate missing information in the papers, it will help with making judgments required in the data extraction. Given that the aim of our study is to gain a broad understanding of current practice, we think this is unlikely to have an important impact on conclusions.
When complete, the findings of this study will be published and communicated at conferences, in addition to dissemination through international networks of researchers and authors of methodological guidance in the field of systematic reviews.
Authors of systematic reviews face challenges in the organisation and analysis of data, including the complexity of grouping studies for comparison, and synthesis methods when meta-analysis is not available. This protocol outlines the methods for a cross-sectional study that aims to examine the approaches used to define and group PICO characteristics, and the types of synthesis methods other than meta-analysis in a sample of systematic reviews of public health and health services interventions.

Data availability
Underlying data
No underlying data are associated with this article.

Extended data

6. This is a very thoughtful and clear response, thank you; no further comments.
7. The changes are clear, no further comment.
8. My apologies for missing the link to the "Extended Data". The authors are correct that this is exactly what I was looking for. The changes to make this information more prominent are welcome. No further comment.
9. I thank the authors for their patient and thorough response, no further comment.
10. The authors make a strong case for single reviewer extraction being appropriate in this study and I am convinced they have introduced appropriate safeguards to ensure sufficiently accurate extraction, no further comment.
11. The authors present a good argument for not conducting a more detailed piloting process. My concern arose partly from my missing the extended data from the original manuscript, which has been addressed, and partly from my experience as an editor working with relatively inexperienced review teams. For the latter, piloting during protocol development tends to be really important as it throws up important issues in the validity and application of the planned methods. For experienced teams using familiar methods, these concerns would diminish, so I am satisfied with the authors' response. No further comment.

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Systematic review methods in environmental health research.

Version 1
Reviewer Report 05 November 2020 https://doi.org/10.5256/f1000research.26991.r73015 © 2020 Whaley P. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

General comments
The authors present a protocol for a planned systematic review of meta-analysis and "other" methods for synthesising evidence in systematic reviews of public health and health systems interventions. It is definitely a novel, interesting, and worthwhile project of potentially very high value. I find myself in full agreement with the authors that "other" methods are underutilised and underdeveloped in SRs and a survey of current practices will provide valuable evidence for improvements in this area.
However, I am not convinced that the proposed approach is the finished article. Besides some relatively trivial issues with the clarity of the objectives, I am particularly concerned about the lack of detail on how the authors are going to classify the methods employed in the included SRs. There is no comprehensive code book presented, so it is not possible to judge if the authors are gathering information which is going to optimally feed their intended analysis. The authors also discuss how various elements of the methodology will be piloted, but since this will resolve the issues with the protocol, I would strongly advocate that piloting be conducted now, and the protocol revised on the basis of that. I would definitely like to see a piloted code-book presented as part of the protocol. I think this is very promising work, just incomplete. It is also very challenging to communicate some of the issues as written feedback. I hope what I write below makes sense and is fair (or at least is constructive). If I over-explain in places, it is not because I am assuming the authors to be ignorant of their own subject matter but am attempting to provide context for a reader of the published peer-review. Since this review is not anonymous, I am sure the authors can track down my contact details easily enough. I would be more than happy to discuss my comments over the phone if they had any questions.

Specific comments
I am not completely sure if the title clearly and succinctly captures exactly what the review is about. For example, I see why the authors call it a "cross-sectional study", but it seems an odd occasion on which to use the term. Would it be plainer to call it something like: "A survey of current evidence synthesis practices in quantitative systematic reviews of public health and health systems interventions: study protocol" (I'm sure the authors can improve 1. on this!).
2. The abstract is quite difficult to follow; I had to read the paper in order to really grasp what the abstract was about. The authors might want to reconsider how they are summarising their research, maybe focusing on how PICOs structure SRs and syntheses in the introduction. They should clearly state the objective (it's too compressed at the moment). The Methods section gives a lot of relatively trivial information about the extraction strategy, yet only has one sentence for the difficult and interesting part, which is the methodology for comparing approaches (what does it mean by their intent to "compare approaches" here? The summary seems too compressed.).
3. I wonder if the concept of "synthesis" is clearly enough introduced, or if it could be more directly introduced and defined at the outset. Currently, the concept is introduced after "however" in sentence 3 of para 1, contextualised by the claim of synthesis methods as often being narrowly considered. This seems back-to-front; would it be better to state what "synthesis" means, qualify this as being broader than some understandings, and then introduce the role of the PICO in structuring evidence synthesis (whether narrative or quantitative)?
4. I am not sure about the terminology in splitting synthesis into meta-analysis vs. "other synthesis methods". There is a rich history of development and analysis of "other" methods, which I am sure the authors are aware of but they could perhaps use more fully. Often, the difference is defined as quantitative vs. either non-quantitative methods, narrative synthesis methods, or synthesis without meta-analysis ("SWIM"). This feels like contextual information that would be useful to the reader (given the authors' assumption that there is a lack of awareness of this in the discipline) and might result in a choice of phrasing which better reflects established conventions.
5. This is a more trivial point. The authors state "Recent guidance published in the Cochrane Handbook for Systematic Reviews of Interventions has outlined proposed options in these two areas"; it would be helpful if these other areas were briefly indicated.
6. I am not sure if the discussion of reasons for not conducting meta-analysis is sufficiently nuanced. Certainly, lack of numeric data would challenge meta-analysis, but is a common issue not also excess heterogeneity between study designs? (Maybe this is less important for clinical SRs than it is in my field of environmental health.) Either way, in such cases narrative or SWIM methods come to the fore. They are complex, and I worry they are not sufficiently acknowledged in the rather broad statement that various synthesis methods such as harvest plots, etc. "may be preferable to textual description of the results in which there is a risk that authors may privilege the results of some studies over others without appropriate justification, possibly introducing bias". Echoing an earlier comment, I feel that this skates over some quite well-developed narrative or SWIM methodology which seeks to counter these concerns when providing textual summarisation. Introducing more of this theory I think is important for the protocol, coding the included SRs, and interpreting the methods in those SRs. (Note I use the term "narrative" in the sense of synthesis without quantitative methods, not in the sense of a narrative vs. systematic review.)
7. I wonder if it is worth the authors clarifying up-front that this is a survey of SRs of quantitative data, just to be clear that qualitative research is not the target of investigation. I think I only resolved this at the point of reading the eligibility criteria. Likewise the focus on public health and health systems interventions.
8. I am not clear as to how the authors are going to code and classify "other" methods. The authors have not provided a code book, so the utility and validity of their coding methodology cannot be evaluated. It seems like a potentially complex challenge, particularly if studies do not define their synthesis methods (it may be impossible for them to do so if there is no agreed way of categorising PICO-based synthesis approaches). The code definitions and criteria should be defined as far as possible in advance. I also note that in Table 2, "Examples of data collection items", the examples for summary and synthesis methods are almost exclusively related to quantitative methods and few examples of SWIM methods are given. This suggests to me that more planning around classifying these "other" methods is needed, to anticipate what they are and how they are accommodated in the analysis. In terms of providing a code book, as handling editor, I gave the authors of this protocol a similarly hard time as I am giving the present authors.
9. I am also unclear as to how the concept of "PICO for each synthesis" is operationalised in the data extraction methods and the synthesis approach the authors will themselves be using. Introducing this concept seems a central concern of the introduction, yet by the methods section, it seems to have faded into the background. How is this concept going to structure the extraction and analysis, such that after conducting this survey we know how authors of SRs are using PICO statements in developing the synthesis components of their SRs?

10. I am not convinced that having just one person coding such complex data is sufficient. Coding decisions are likely to be difficult and require discussion. The authors seem to acknowledge this by requiring that 20% of extraction and coding be double-checked, but they provide no plan for what to do if that double-checking finds inconsistency. Is 80% agreement enough? Will the second extractor be trained to the same level as the first? If not, what does disagreement between the primary and secondary reviewer even mean? If agreement is low, do they intend to revise the coding criteria or seek to resolve disagreement? Will they double-check everything to ensure consistency across the full set of extracted data? I suspect it would be simpler, if more time-consuming, to train two extractors to an equal degree, code in parallel, then discuss the results to achieve considered consistency.

11. In terms of piloting, I would be much more comfortable if this was conducted and reported as part of the protocol development process. Since piloting is part of planning, I am not personally of the view that it is sufficient to state in a protocol that something will be piloted. Doing the piloting now would also help answer quite a few of the questions I have above.

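1. I am not completely sure if the title clearly and succinctly captures exactly what the review is about. For example, I see why the authors call it a "cross-sectional study", but it seems an odd occasion on which to use the term. Would it be plainer to call it something like: "A survey of current evidence synthesis practices in quantitative systematic reviews of public health and health systems interventions: study protocol" (I'm sure the authors can improve on this!).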
Response: We used the term 'cross-sectional' based on the title of other studies that have also used survey sampling methodology to sample the reviews (e.g. Page, et al. 1 ). However, we agree with the reviewer's suggestion that 'survey' more accurately captures the study design. In addition, we have revised the title to be plainer, while still capturing the specific systematic review practices to be examined.

Changes made:
○ The revised title is: "The use of 'PICO for synthesis' and methods for synthesis without meta-analysis: protocol for a survey of current practice in systematic reviews of health interventions"

2. The abstract is quite difficult to follow; I had to read the paper in order to really grasp what the abstract was about. The authors might want to reconsider how they are summarising their research, maybe focusing on how PICOs structure SRs and syntheses in the introduction. They should clearly state the objective (it's too compressed at the moment). The Methods section gives a lot of relatively trivial information about the extraction strategy, yet only has one sentence for the difficult and interesting part, which is the methodology for comparing approaches (what does it mean by their intent to "compare approaches" here? The summary seems too compressed.).

4. I am not sure about the terminology in splitting synthesis into meta-analysis vs. "other synthesis methods". There is a rich history of development and analysis of "other" methods, which I am sure the authors are aware of but they could perhaps use more fully. Often, the difference is defined as quantitative vs. either non-quantitative methods, narrative synthesis methods, or synthesis without meta-analysis ("SWIM"). This feels like contextual information that would be useful to the reader (given the authors' assumption that there is a lack of awareness of this in the discipline) and might result in a choice of phrasing which better reflects established conventions.
Response: The split between meta-analysis and 'other methods' mirrors that in our work in the Cochrane Handbook for Systematic Reviews of Interventions, 2 but we now see that 'other methods' is too vague in this context. We have revised the heading to be more explanatory (see list of changes below).
Terminology is something that we have debated at length with our co-authors as contributors to the Cochrane Handbook 2-6 and the SWiM reporting guidance. 7 We agree that it is important to follow convention, but in our view, there is a need to increase conceptual clarity in this area through clearer delineation of the steps and methods captured under the umbrella of 'narrative synthesis' (and synonyms such as 'non-quantitative' and SWiM). Used correctly, these terms are broader than our concept of 'other synthesis methods', by which we mean the methods used to present and aggregate quantitative results from two or more studies (for clarification of scope, see response to Comment 8). For this reason, we don't use 'narrative synthesis' (or synonyms) for these methods, but have made revisions to address these conceptual issues.
Changes made:
○ In the paragraph starting "Recent guidance …" (Introduction, para 3) we have added and referenced the following: "Reporting guidance for 'synthesis without meta-analysis' (SWiM) has also been published …" 7
○ At the end of the section on 'PICO for synthesis', we have added the following paragraph: "The PICO for synthesis also provides a framework for examining similarities and differences in the characteristics of studies contributing to each analysis, facilitating qualitative synthesis of characteristics needed to interpret results. This qualitative synthesis is a particularly important feature of reviews where there is diversity in study characteristics that may explain findings (e.g. intervention complexity, different study designs). 6 Such diversity sometimes triggers a decision not to use meta-analysis, and instead adopt alternative methods to synthesise and present findings. In these circumstances, it is common for authors to refer to their synthesis methods as 'narrative', 8 reflecting the integration of the synthesis of quantitative results from studies with the qualitative synthesis of study characteristics. In this study, we distinguish between these elements and, in the section that follows, focus on the methods used to combine quantitative data on intervention effects using a statistical technique and to present the results of these analyses."
○ Changed section heading in Introduction from 'Other synthesis methods' to 'Synthesising and presenting findings without meta-analysis'. Note that we include the term 'presenting' to encompass structured summary of individual study results and visual presentation of data.

5. This is a more trivial point. The authors state "Recent guidance published in the Cochrane Handbook for Systematic Reviews of Interventions has outlined proposed options in these two areas"; it would be helpful if these other areas were briefly indicated.
Response: Regarding the point about diversity (heterogeneity), we agree that this is a common reason given for not using meta-analysis, so we have addressed this in the manuscript (see list of changes below).
Regarding consideration of narrative/SWiM methods, as the reviewer notes, this issue relates closely to that covered in our response to Comment 4, where we consider the scope of narrative synthesis in relation to our project. We agree that there is benefit in expanding on these conceptual issues, as done in the new paragraph (see Comment 4). We provide further clarification of our perspective below.
It is not our intention to minimise the complexity of methods encompassed by the concept of 'narrative synthesis'. Instead, we believe that the components and process of narrative synthesis need to be disentangled to provide clear guidance for review authors on how to plan, conduct and report their methods. Following the lead of authors such as Melendez Torres, 9 and our own work on the Cochrane Handbook 2-5 and SWiM, 7 we see three main components commonly conceived as part of a narrative synthesis:
1. the planning work done to decide how studies will be grouped to address the review questions (the PICO for each synthesis),
2. the qualitative analysis of study characteristics (PICO/PECO and study design features) done to prepare for synthesis, interpret and explain the quantitative synthesis findings, and
3. the methods used to present and synthesise quantitative results from studies.
Our study will examine (1) and (3) of the above. It is not within the scope of this study to examine (2) or the process by which authors integrate the findings from (2) and (3). Although these latter elements are common in reviews reporting 'narrative synthesis', in our view they are essential features of any review where there is diversity (and complexity) of study characteristics that must be considered in order to identify and explain variation in effects across studies. While this may be described as the 'narrative' synthesis that 'tells the story' of the evidence, it can be done irrespective of whether meta-analysis is used or not.

Changes made:
○ Addition of new paragraph addressing the concept of narrative synthesis and indicating that diversity of study characteristics may be a trigger for not using meta-analysis (as per comment 4) ('PICO for each synthesis', final para).
○ Added the following to address diversity: "Diversity of study characteristics and the presence of statistical heterogeneity are other reasons given for not meta-analysing, but these are more contentious. The first brings into question whether any synthesis is appropriate, the second may be addressed by using extensions to meta-analysis (e.g. meta-regression, prediction intervals) that attempt to explain or encompass heterogeneity." ('Synthesising and presenting findings without meta-analysis', para 1)
○ Edited the sentence that reads '…textual description of the results …' to 'Nevertheless, structured summary or synthesis approaches may be preferable to simply presenting an unstructured inventory of study-level results…'. We made this change to avoid any misunderstanding that by 'textual description' we are referring to narrative synthesis. ('Synthesising and presenting findings without meta-analysis', para 3)
○ Amended our text to distinguish between structured summaries of the results of individual studies, and more unstructured summaries that are common in reviews.

7. I wonder if it is worth the authors clarifying up-front that this is a survey of SRs of quantitative data, just to be clear that qualitative research is not the target of investigation. I think I only resolved this at the point of reading the eligibility criteria. Likewise the focus on public health and health systems interventions.
Response: We have revised the text throughout to specify our focus on reviews of quantitative data and public health and services interventions, including the Abstract and Objectives.
Response: In addition to the example data collection items presented in the table, we provided a link to our draft data dictionary as Extended Data in the original version of this paper ('Data extraction and management' section and Data Availability statement, see https://doi.org/10.26180/5edb178961d68). The data dictionary provides the items, their response options, and a description of the items along with some guidance on coding. The dictionary covers both the data items relevant to the PICO for each synthesis (sections 2-6) and synthesis, summary and presentation methods (section 7). We believe the information provided in our data dictionary is as comprehensive as the example suggested by the reviewer.
As noted in our response to Comments 4 and 6, our focus is on a subset of methods considered within the broader concept of narrative synthesis (or SWiM): specifically, the planning of groups for synthesis (PICO for synthesis) and the methods used to present and synthesise quantitative results. The items we have included to capture the synthesis, summary and presentation methods are aligned with the methods we have outlined in Chapter 12 of the Cochrane Handbook, 'Synthesizing and presenting findings using other methods' 2 (https://training.cochrane.org/handbook/current/chapter-12#section-12-2). We have successfully used these items in a previous study. 10
Changes made:
○ Moved the reference to the data dictionary to the first paragraph of the 'Data extraction and management' section to increase its prominence.
○ Added additional detail of the data items in Table 1 (note that the former Table 2 has been renumbered as it is now referenced earlier in the text), and added a reference to the full data dictionary in the table's caption.
○ Added the following paragraph to the 'Data extraction and management' section: "In seeking to map current practice, we note that terms such as 'narrative synthesis' can be applied to a wide range of approaches, and will seek to identify specific components in our included reviews rather than relying on broad descriptive terms. We will collect:
    ○ Sources of guidance referred to in the text.
    ○ Specific elements that may appear within a text-based summary approach, such as the use of consistent effect measures across studies, the use of non-parametric summary statistics such as ranges, various methods of vote counting, and the use of PICO groupings to structure text or tables.
    ○ Explicit statements by the authors that they have been unable to implement planned PICO groupings or synthesis methods, their stated reasons for this, and what changes they made to their methods in response."

9. I am also unclear as to how the concept of "PICO for each synthesis" is operationalised in the data extraction methods and the synthesis approach the authors will themselves be using. Introducing this concept seems a central concern of the introduction, yet by the methods section, it seems to have faded into the background. How is this concept going to structure the extraction and analysis, such that after conducting this survey we know how authors of SRs are using PICO statements in developing the synthesis components of their SRs?
Response: Details of how the PICO for synthesis is operationalised are summarised in Table 1, with full details provided in the draft data dictionary. We will undertake a descriptive analysis to characterise the extent to which authors specify their PICO for synthesis, and the basis for their PICO. For example, we will report the percentage of reviews in which intervention groups are listed by name, are defined in enough detail for replication, and have an explicit role in the planned synthesis. Similarly, these percentages will be calculated for the other PICO elements.
For each PICO element and study design, we capture any groupings identified, whether an explicit role in synthesis is reported, whether the groupings were specified at the level of detail required for replication, whether the specified groupings were used in practice, whether additional new groupings were used in practice that were not specified, and statements by the authors describing any change in the planned groupings.
In addition, we describe the basis of the groupings (such as whether groupings are based on existing taxonomies, or whether they are based on e.g. disease categories, health equity characteristics, time of measurement, etc.) using categories drawn from the frameworks provided for each PICO element in Chapter 3, Section 3.2 of the Cochrane Handbook for Systematic Reviews of Interventions 4 (available free online at https://training.cochrane.org/handbook/current/chapter-03#section-3-2).
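As an illustration of the descriptive analysis outlined above, the sketch below computes, for each PICO element, the percentage of reviews meeting each coded criterion. It is a minimal sketch under stated assumptions: the record layout, the field names (`listed_by_name`, `replicable`, `explicit_role`) and the example values are hypothetical and are not taken from the data dictionary.

```python
from collections import defaultdict

# Hypothetical extraction records: one row per review and PICO element.
# Field names and values are illustrative only, not the actual data items.
records = [
    {"review": "R1", "element": "intervention", "listed_by_name": True,  "replicable": True,  "explicit_role": True},
    {"review": "R2", "element": "intervention", "listed_by_name": True,  "replicable": False, "explicit_role": False},
    {"review": "R3", "element": "intervention", "listed_by_name": False, "replicable": False, "explicit_role": False},
    {"review": "R4", "element": "intervention", "listed_by_name": True,  "replicable": True,  "explicit_role": False},
    {"review": "R1", "element": "outcome",      "listed_by_name": True,  "replicable": False, "explicit_role": True},
    {"review": "R2", "element": "outcome",      "listed_by_name": True,  "replicable": True,  "explicit_role": True},
]

ITEMS = ("listed_by_name", "replicable", "explicit_role")


def summarise(rows):
    """Return {element: {item: (count, n, percent)}} for yes/no coded items."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for row in rows:
        totals[row["element"]] += 1
        for item in ITEMS:
            counts[row["element"]][item] += bool(row[item])
    return {
        element: {
            item: (counts[element][item], n, 100.0 * counts[element][item] / n)
            for item in ITEMS
        }
        for element, n in totals.items()
    }


summary = summarise(records)
for element, items in summary.items():
    for item, (k, n, pct) in items.items():
        print(f"{element:<12} {item:<15} {k}/{n} ({pct:.0f}%)")
```

Reporting the count and denominator alongside each percentage keeps the summary interpretable when some items are not applicable to every review.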
Changes made:
○ Added additional detail of the data items collected to examine the PICO for each synthesis in Table 1 (previously Table 2).
○ Added the following text to the analysis section: "We will calculate descriptive summary statistics to characterise the extent to which authors specify their PICO for synthesis, and the synthesis and presentation methods."
Response: We recognise that a potential limitation of the review is that we will not undertake double data extraction of all reviews; this is due to limited resources. However, we are investing substantial time and effort in the piloting stage to refine the items and guidance, and achieve a shared understanding of the form (see below for the changes made in this regard). The team undertaking the data extraction have extensive experience in systematic reviews of public health and health services interventions, having written guidance for, co-authored, and edited many such reviews. While we appreciate the experience of the team does not mitigate missing information in the papers (a point which we note in the discussion of the revised protocol), it does help with making the judgments required in the data extraction. Finally, we believe that the consequences of some errors in the data extraction in a methodological review such as this are less serious than those in reviews examining the effects of interventions and environmental exposures, where misleading results arising from errors in the data extraction can impact policy decisions, patients' health outcomes and quality of care.
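As an illustration only, an item-level agreement check on a duplicate-extracted sample could be sketched as follows. The protocol does not state which agreement measure will be used; percent agreement and Cohen's kappa are assumptions made here for the example, and the item names and coded values are hypothetical.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of reviews on which the two extractors gave the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty, equal-length lists of codes")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each extractor's marginal code frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        # Both extractors used one identical code throughout; kappa is
        # undefined, so report full agreement by convention.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical codes from two extractors for ten duplicate-extracted reviews.
duplicate_sample = {
    "vote_counting_used": (
        ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"],
        ["yes", "yes", "no", "yes", "yes", "no", "yes", "no", "no", "no"],
    ),
    "intervention_groups_listed": (
        ["yes"] * 9 + ["no"],
        ["yes"] * 10,
    ),
}

results = {
    item: (percent_agreement(a, b), cohen_kappa(a, b))
    for item, (a, b) in duplicate_sample.items()
}
for item, (agreement, kappa) in results.items():
    print(f"{item:<28} agreement={agreement:.2f} kappa={kappa:.2f}")
```

In this made-up data the second item has high raw agreement but a kappa of zero because one code dominates, so an item-level check of this kind may need more than a single raw-agreement threshold.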
Changes made:
○ We have revised the 'Data extraction and management' section of the paper to provide more detail on our proposed approach: "One author (MC) will extract data from all included reviews, and a second author (either SB or JM) will extract data independently on a sample of 20% of the included reviews (including those with and without meta-analysis). We will pilot test the data extraction form and coding guidance on five reviews to ensure we capture all items, to refine the items and guidance when we uncover ambiguity or a lack of clarity, and to achieve a shared understanding of the form. This will be achieved using an iterative process, where we discuss discrepancies and ambiguities as extraction is completed on each review, and revise the data extraction form and coding guidance in response to these discussions. Duplicate data extraction on the selected sample will then proceed, and agreement will be assessed at the end of this phase. For any data items in which a high degree of inconsistency is observed, duplicate data extraction will be undertaken for a further random sample of reviews. During the final, single data extraction phase, any uncertainties arising will be discussed with three authors (MC, SB, JM) and consensus reached." (para 3)

○ We have added the following text to the Discussion: "However, the review team has extensive experience in systematic reviews of public health and health services interventions, having written guidance for, co-authored, and edited many such reviews. While this will not mitigate missing information in the papers, it will help with making judgments required in the data extraction. Given that the aim of our study is to gain a broad understanding of current practice, we think this is unlikely to have an important impact on conclusions." (para 3)

11. In terms of piloting, I would be much more comfortable if this was conducted and reported as part of the protocol development process. Since piloting is part of planning, I am not personally of the view that it is sufficient to state in a protocol that something will be piloted. Doing the piloting now would also help answer quite a few of the questions I have above.
Response: We agree that piloting is an important aspect of a systematic review. Aspects of the data extraction for the present study (i.e. other synthesis methods items) were based on a previous study we undertook, where we had high concordance in coding across reviewers (including both experienced methodologists, and those with limited experience). In the 'Data extraction and management' section, we note that the current data extraction form is based on a previous study.
We note that the reviewer suggests that piloting should be conducted and reported as part of the protocol development process. We are not aware that there is an established convention for this. For example, it is not a requirement in Cochrane review protocols (which outline the methods for reviews evaluating the effects of healthcare interventions) to report a completed piloting process.
Our protocol as submitted provides our planned methods at a point in time. We will report changes to those methods in our final report, including the finalised data dictionary arising from the piloting. We have used this approach in other methods reviews. [11][12][13] For example, in Arnup et al, 11 we undertook a review examining statistical methods used in cluster-randomised crossover trials. In a supplementary file to that paper, Table S1, we outlined changes in methods from protocol, and included the final data extraction form (which incorporated modifications from piloting).
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.