Keywords
reporting bias, publication bias, selective reporting, outcome reporting, research methods, meta-research, interrupted time series, discrepancies in reporting
Interrupted time-series (ITS) studies are commonly used to examine the effects of interventions targeted at populations. Suppression of ITS studies or results within these studies, known as reporting bias, has the potential to bias the evidence-base on a particular topic, with potential consequences for healthcare decision-making. Therefore, we aim to determine whether there is evidence of reporting bias among ITS studies.
We will conduct a search for published protocols of ITS studies and reports of their results in PubMed, MEDLINE, and Embase up to December 31, 2022. We will contact the authors of the ITS studies to seek information about their study, including submission status, data for unpublished results, and reasons for non-publication or non-reporting of certain outcomes. We will examine whether there is evidence of publication bias by assessing whether time to publication is influenced by the statistical significance of the study’s results for the primary research question, using Cox proportional hazards regression. We will examine whether there are discrepancies in outcomes by comparing those specified in the protocols with those in the reports of results, and we will examine whether the statistical significance of an outcome’s result is associated with how completely that result is reported, using multivariable logistic regression. Finally, we will examine discrepancies in methods between protocols and reports of results, comparing the data collection processes, model characteristics, and statistical analysis methods. Discrepancies will be summarized using descriptive statistics.
These findings will inform systematic reviewers and policymakers about the extent of reporting biases and may inform the development of mechanisms to reduce such biases.
We have made minor changes to the Abstract and the main text's Introduction, in response to reviewers' comments. We also elaborated on the Ethical considerations and Data availability sections. For more details, please refer to our response to the reviewers' reports.
An interrupted time series (ITS) study is a non-randomized design, commonly used to evaluate interventions targeted at populations (e.g., the effect of introducing tobacco plain packaging laws on the number of calls to smoking cessation helplines)1 when randomized trials are not practical or, in some circumstances, not ethical. This design can be less susceptible to bias than other non-randomized designs, such as before-after designs.2–5 The use of ITS designs in public health has been increasing over time,6–8 but the design gained particular prominence during the COVID-19 pandemic, when it was used to evaluate the effectiveness of COVID-19 reduction strategies (e.g., lockdowns)9,10 as well as their impact on non-COVID-19 conditions.11–13 The design has primarily been used in high-income countries, and to a more limited extent in low- and middle-income countries.
In an ITS study, measurements of an outcome variable are often collected continuously over time and aggregated (using summary statistics such as counts or proportions) within, generally, regular time intervals (e.g., weekly, monthly) for analysis. The ‘interruption’ separates the time series into pre- and post-interruption segments. The underlying time trend in the pre-interruption segment can be estimated and used to predict what would have occurred in the post-interruption period had the interruption not occurred (referred to as the ‘counterfactual’). Differences between the predicted trend and the observed post-interruption trend can be used to quantify the effects of the interruption, such as an immediate change at the time of interruption (‘level change’) or the change in slope between pre- and post-interruption (‘slope change’).4,14,15
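For illustration, a common segmented regression model for a single-interruption ITS can be written as follows (a minimal sketch in our own notation; the designs discussed here are not limited to this model):

```latex
Y_t = \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - t_0) D_t + \varepsilon_t
```

Here, $Y_t$ is the aggregated outcome at time $t$, $D_t$ is an indicator equal to 1 after the interruption at time $t_0$ and 0 before, $\beta_1$ is the pre-interruption slope, $\beta_2$ estimates the level change, and $\beta_3$ estimates the slope change; the counterfactual is the pre-interruption trend $\beta_0 + \beta_1 t$ projected into the post-interruption period.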
Systematic reviews may be undertaken to collate and synthesize evidence on the effects of interventions. In reviews that examine the effects of policy interventions, ITS studies are likely to be eligible because evidence from randomized trials may be limited or unavailable. A key factor underpinning the validity of the findings from systematic reviews is the extent of reporting bias in the evidence-base.
Reporting bias can arise when there is suppression of entire studies (known as publication bias) or of results within studies that are unfavorable to the study hypotheses (known as selective reporting bias), due to the nature of the results themselves (i.e., based on their direction, magnitude, or P-value).16,17 Reporting bias has the potential to bias conclusions drawn in systematic reviews (with or without meta-analyses), with potential consequences for healthcare decision-making.18–20 Unlike randomized trials, for which prospective registration is required by ethics committees and journals21,22 and many trial registries exist, ITS studies are subject to no such requirements, and no such registries exist.23,24 As a result, the same drivers are not in place to prespecify outcomes and analysis plans, or to publish unfavorable results, for ITS studies. Therefore, reporting bias may exist to a greater extent in ITS studies.
The extent of reporting bias among ITS studies is unknown. We aim to determine the extent of reporting bias among ITS studies based on the following three objectives:
1. To examine whether the publication of ITS studies is influenced by their results (i.e., publication bias).
2. To examine (i) whether there are discrepancies between outcomes specified in protocols of ITS studies and reports of their results and (ii) whether there is evidence of selective result reporting among ITS studies.
3. To examine whether there are discrepancies in the reporting of methods between protocols and reports of ITS studies.
We will conduct a search for published protocols of ITS studies indexed in three bibliographic databases (PubMed, MEDLINE, and Embase via Ovid) and in the JMIR Research Protocols, each from inception to December 31, 2022. For bibliographic databases, we will use a sensitivity-maximizing search filter developed by our team for ITS studies.25
Eligible protocols include protocols and statistical analysis plans of ITS studies, including studies that plan to conduct an ITS analysis alongside other planned analyses (e.g., qualitative analysis or cost-effectiveness modelling). For the purpose of our study, an ITS study is defined as one in which (a) there are at least two segments separated by a clearly defined interruption (i.e., an intervention or an exposure), (b) there are at least three data points for at least two of the segments, and (c) each data point represents a summary statistic (e.g., a mean or rate) of individual observations collected from a group of individuals (e.g., within a country, state, hospital, or other unit) within a period of time (e.g., weekly or monthly). Both controlled and uncontrolled ITS studies will be considered eligible. We will only include ITS study protocols written in English.
One author (PYN) will screen all titles, abstracts, and potentially eligible full-text reports. A 10% random sample of full-text reports deemed ineligible and all full-text reports deemed eligible by the first author will be independently screened by the second author (JEM, ST, EK, or MJP). Discrepancies will be resolved through discussion between the authors or through team discussions.
For each ITS study protocol meeting the inclusion criteria for our study, we will search for corresponding report(s) of the results using the following approaches: 1) searching Ovid MEDLINE and Embase for the study’s title and acronyms, trial registration number (if reported), and any co-author publications between the first author and last author; 2) checking for updates on registration sites, such as ClinicalTrials.gov; and 3) using forward citation searching tools, such as Web of Science’s Cited Reference Search. For the purpose of this study, ‘report(s) of the results’ are defined as any peer-reviewed report that provides quantitative results for any outcome collected as part of the ITS. Other reports related to the study that present results for data not collected as part of the ITS will not be included. Methods papers using results from an ITS study will be excluded if they are not referenced in the protocol. If multiple eligible reports of results are found, we will include all of them. For each protocol, we will search for a report of the results only if at least 6 months have passed since the date of completion of data collection (as stated or implied in the ITS study protocol) or, if not specified, since the date of publication of the protocol. We will not search for results if recruitment or data collection is confirmed to be ongoing at the time of the search.
Data extraction will be conducted using standardized extraction forms created in Research Electronic Data Capture (REDCap)26 hosted at Monash University. Information will be collected from the protocols and reports of the results (and all their supplementary files) and from journal websites. Five authors (PYN, JEM, EK, MJP, and ST) will pilot the data extraction forms on five ITS studies to refine the items and achieve a shared understanding of the forms. One author (PYN) will extract the data for the remaining studies, and a second author (JEM, EK, MJP, or ST) will independently extract data for a random sample of 10% of the studies. For any items where we observe a high degree of inconsistency, we will undertake double data extraction on a further randomly selected sample of studies. In addition, we will hold weekly meetings with all the authors to discuss any uncertainty arising during data extraction. Discrepancies will be resolved, and necessary amendments to the data extraction form will be made following these discussions. The data collection forms are summarized in Table 1.
From each ITS study protocol, one author (PYN) will identify the primary ITS research question(s), which will be used to determine which results are eligible for our assessment of publication and outcome/result reporting bias (see Box 1 for an example). In determining the primary ITS research question (or questions) from the protocol, we will consider only the population/setting, the interruption group(s), and the comparator group(s) elements of the research question, and not the planned outcomes to be measured (note that we use the term ‘group’ to refer to interventions or exposures that occur in different time periods or segments). In a simple ITS with one interruption, only one comparison can be made (between the pre- and post-interruption periods); in an ITS with multiple interruptions, more than two comparisons are possible (Figure 2). Furthermore, the impact of the interruption may be assessed in different populations. In reporting the research questions, authors may or may not be explicit about all elements or comparisons of primary interest. The primary ITS question elements will be those reported by the authors as ‘primary,’ or those first reported in the protocol.
Suppose an interrupted time series (ITS) has three segments. The first segment is the pre-intervention control; the second is a minimal implementation of intervention A; and the third is an intensive implementation of intervention A.
The intervention is evaluated in two populations (population 1 and 2).
If the authors stated “Our primary aim is to compare the intensive implementation A versus control”, then for our assessment of publication and outcome/result reporting bias, we would consider any results pertaining only to this comparison, but for any population (since the population was not stated in the aim).
If the authors stated “Our primary aim is to examine the effect of intervention A”, then for our assessment of publication and outcome/result reporting bias, we would consider any result pertaining to the comparisons ‘minimal intervention A versus control’, ‘intensive intervention A versus control’, and ‘intensive intervention A versus minimal intervention A’, for any population.
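To make the eligible comparison sets in this second scenario concrete, the following minimal sketch (Python; segment and population labels are illustrative, taken from Box 1) enumerates the pairwise comparisons:

```python
from itertools import combinations

# Hypothetical three-segment ITS from Box 1; labels are illustrative only.
segments = ["control", "minimal intervention A", "intensive intervention A"]
populations = ["population 1", "population 2"]

# When the stated aim does not restrict the comparison or the population,
# every pairwise comparison of segments is eligible, in each population.
for population in populations:
    for reference, comparison in combinations(segments, 2):
        print(f"{population}: {comparison} versus {reference}")
# Three segments yield 3 pairwise comparisons per population (6 in total).
```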
For each ITS study, we will record details of the methods, including the data collection process, model characteristics, and statistical analysis methods, as presented in both the protocol and report(s) of the results. We will record the eligibility criteria to select participants for the ITS, whether data collection is retrospective or prospective (or both), the start and end dates of the data to be collected, and whether the data were collected by the study authors or external to the study (e.g., collected as part of an administrative database). Data collection will be classified as retrospective if data were already available at the time of the protocol and as prospective if data will be collected during the study period. We will collect information to describe all time segments, including the start and end dates and the number of data points per segment. We will extract the model structure (if reported) or attempt to determine the model structure based on reported information (e.g., whether level change and/or slope change was modelled, whether the impact of the interruption was immediate or delayed, and how the interruption period was incorporated in the analysis). For statistical analysis methods, we will record the estimation method (e.g., ordinary least squares, restricted maximum likelihood), methods to deal with autocorrelation, seasonality, non-stationarity, and outliers; any adjustment for covariates; and use of control group and subgroup analyses.
In addition, we will record information related to the status of publication of the study results, as well as the journal name and date of publication for the protocol and all reports of the ITS results.
For all ITS outcomes addressing the primary research question(s), we will record the following details of the outcome, as presented in both the protocol and report(s) of the results: (a) the description of the outcome, (b) whether a higher value indicates benefit or harm, (c) whether the outcome was specified as a primary or secondary outcome, or neither; (d) the data type of the individual-level observations (e.g., dichotomous, continuous); and (e) the data type of the summary statistic used to aggregate the individual-level observations (e.g., count, percentage) within each period.
For all ITS results of the primary research question(s), we will record (a) the effect measures reported (e.g., immediate level change, slope change) and (b) the available details about the results, including: effect estimates, confidence intervals (along with the level, e.g., 95%), exact P-values, the statistical significance threshold (e.g., 5%), and the direction of the effect estimate (e.g., “favouring interruption group”).
We will contact the corresponding authors to seek unpublished information about their studies when there are (i) no report(s) of the results or (ii) missing or incompletely reported results for the primary research questions (as defined in “Study-level information”). We will contact the authors using the email address provided in the ITS study protocol or report(s) of the results. We will send up to three reminders to each author at a minimum of two weeks apart in the case of non-response. If the corresponding author does not respond, we will attempt to contact the other authors.
Once the study authors provide informed consent to participate in the study, we will provide them with an electronic survey via Qualtrics. The survey will seek information on whether data collection and analysis were completed, whether the study has been submitted to a journal, date of submission, publication and/or rejection, reasons for not submitting and potential reasons for rejection, and the name of the corresponding journal.
For each ITS study, we will ask if the authors are willing to share information pertaining to the outcomes and results relevant to the primary ITS research question(s). We will ask the authors to share the reasons for not reporting an outcome, reporting an outcome that was not specified in the protocol, or for inconsistencies in labelling the primary status of the outcome. For each result that was not reported or not fully reported, we will ask the authors whether the result was statistically significant (at the 5% significance threshold) and whether the effect estimate favors the interruption or the comparator group.
The data received from the authors will be stored in a secure location. In reporting results from our analyses, individual studies or their results will not be identifiable.
Only ITS protocols for which we have searched for report(s) of the results will be included in this analysis. Studies with ongoing recruitment or data collection, those confirmed to be abandoned, and those for which we have confirmed that the analysis has not been undertaken, will be excluded from our analysis.
We will calculate the percentage of protocols that do not have a report of ITS results and summarize the reasons why the results were not published. We define a ‘report of ITS results’ as a peer-reviewed report that includes results for any outcome pertaining to the primary ITS research question(s), as defined in “Study-level information”. We define ‘results’ as quantitative results (e.g. effect estimate, 95% confidence interval, P-value), or qualitative statements about the statistical significance, P-value or direction of the effect estimate. Where data are available, each result will be classified as statistically significant (P-value <0.05 or, if absent, the 95% confidence interval not including the null) or not statistically significant (P-value ≥0.05 or, if absent, 95% confidence interval including the null). The direction of the result will be classified as “favouring interruption group” (i.e. showing beneficial effects or reducing harm) or “favouring comparator group”.
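A minimal sketch of this classification rule follows (Python; the function and argument names are ours, not from the protocol):

```python
def is_statistically_significant(p_value=None, ci_low=None, ci_high=None,
                                 null=0.0):
    """Apply the rule above: P < 0.05, or, if the P-value is absent,
    a 95% confidence interval that excludes the null value.
    Note: the null is 0 for difference measures and 1 for ratio measures."""
    if p_value is not None:
        return p_value < 0.05
    if ci_low is not None and ci_high is not None:
        return not (ci_low <= null <= ci_high)
    return None  # insufficient information to classify the result

# Example: a level-change estimate with 95% CI (0.2, 0.9) around a null of 0
print(is_statistically_significant(ci_low=0.2, ci_high=0.9))  # True
```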
We will undertake a multivariable Cox proportional hazards regression to determine whether a statistically significant effect estimate that favors the interruption group is associated with time to publication. Although statistical significance is not recommended for interpreting results, it is still widely used by researchers and journal editors to assess whether findings are potentially worth publishing17,27 and, therefore, may influence time to publication.28 Hazard ratios and 95% confidence intervals from the Cox regression analysis will be reported. Time to publication of results will be defined based on when the protocol is submitted relative to data collection. For studies where data collection was retrospective relative to the protocol submission, the time to publication will be from the protocol’s submission date to the date of publication of the results. If the date of protocol submission has not been reported, we will substitute it with the date of protocol publication. For studies in which the protocol submission occurs prior to or during data collection, the time to publication will be from the last date of the data collection period to the date of publication of the results. If there is more than one report of the ITS results, we will calculate the time to publication for each report. Protocols for which results for the primary research question(s) are not available will be censored on the date of the last search for results. The following potential confounding factors will be included as covariates in the model: type of funding (government, not-for-profit, industry, undisclosed, or no funding), presence of prospective registration, and timing of data collection relative to the date of the protocol (retrospective, prospective, or both). We will adjust for potential correlations arising from clustering of results within studies, which might arise from studies having multiple outcomes, multiple comparisons, or multiple results (Figure 2).
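As an illustration of this analysis, the following is a hedged sketch on simulated data using the lifelines package; all variable names are ours, and cluster-robust (sandwich) standard errors by study stand in for the clustering adjustment described above:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical data: one row per result; study_id identifies the source study.
rng = np.random.default_rng(0)
n = 150
df = pd.DataFrame({
    "sig_favouring_interruption": rng.integers(0, 2, n),
    "prospective_registration": rng.integers(0, 2, n),
    "study_id": rng.integers(0, 50, n),
})
# Simulated time to publication (months); shorter when results are significant.
df["time_to_publication"] = rng.exponential(
    24 / (1 + df["sig_favouring_interruption"])
)
df["published"] = rng.integers(0, 2, n)  # 0 = censored at the last search date

cph = CoxPHFitter()
# cluster_col requests robust standard errors by study, a simple way to
# account for multiple results nested within the same study.
cph.fit(df, duration_col="time_to_publication", event_col="published",
        cluster_col="study_id")
cph.print_summary()  # hazard ratios with 95% confidence intervals
```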
Studies with at least one ITS report will be eligible for this analysis. For the primary research question(s), as defined in “Study-level information”, we will classify the study as having discrepancies in the reporting of outcomes if any of the following apply (see the sketch after this list):
• Any outcome specified in the protocol was not mentioned in the report of the results.
• Any outcome reported in the report of the results was not pre-specified in the protocol.
• Any outcome was inconsistently labelled (e.g., described as ‘primary’ in the protocol but ‘secondary’ in the report of the results, or not given a label in the protocol but labelled as ‘primary’ or ‘secondary’ in the report of the results). If an outcome is used in a power calculation, it will be considered the primary outcome.
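These checks reduce to simple set comparisons, as the following sketch shows (Python; outcome names and labels are hypothetical):

```python
# Outcomes mapped to their 'primary'/'secondary' labels (None = unlabelled).
protocol = {"road fatalities": "primary", "hospital admissions": "secondary"}
report   = {"road fatalities": "secondary", "helpline calls": None}

omitted    = set(protocol) - set(report)   # specified but never reported
unplanned  = set(report) - set(protocol)   # reported but not prespecified
relabelled = {o for o in set(protocol) & set(report)
              if protocol[o] != report[o]} # e.g. 'primary' -> 'secondary'

has_discrepancy = bool(omitted or unplanned or relabelled)
print(omitted, unplanned, relabelled, has_discrepancy)
```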
For each effect estimate of each outcome, we will classify the results using the approach proposed by Chan et al.29 as follows:
• Fully reported – sufficient data are reported to include a result in a meta-analysis, that is, an effect estimate and a measure of precision (e.g., 95% confidence interval, standard error); or
• Partially reported – insufficient data are reported to include a result in a meta-analysis (e.g., an effect estimate is reported without any measure of precision); or
• Qualitatively reported – only a statement about the statistical significance or the direction of the result (e.g., “there was no significant effect on road fatalities”), or only a P-value, is reported; or
• Unreported – an outcome is mentioned in the protocol but no result is reported.
We will calculate the percentage of ITS studies that have discrepancies in the reporting of outcomes; the percentage of results that are fully reported, partially reported, qualitatively reported, or unreported; and summarize the reasons provided by the study authors for any discrepancy in reporting outcomes or failure to report any result. We will conduct a multilevel multivariable logistic regression to determine whether a statistically significant effect estimate that favors the interruption group is associated with full reporting of the results. We will fit two models, one unadjusted and one adjusted for the following potential confounders: type of funding (government, not-for-profit, industry, undisclosed, no funding) and outcome status (primary/secondary/unspecified).30 Both models will be adjusted for potential correlations arising from clustering of results within studies, which may arise from studies with multiple outcomes, multiple effect estimates for each outcome, and multiple comparisons (in series with more than two segments). For both models, odds ratios and 95% confidence intervals will be reported.
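The following is a minimal sketch of the adjusted analysis on simulated data (Python, statsmodels); variable names are ours, and cluster-robust standard errors by study are used as a simple stand-in for the planned multilevel model:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per result; study_id identifies the source study.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "study_id": rng.integers(0, 40, n),
    "sig_favouring_interruption": rng.integers(0, 2, n),
    "outcome_status": rng.choice(["primary", "secondary", "unspecified"], n),
})
# Simulate an association between significance and full reporting.
logit_p = -0.5 + 1.0 * df["sig_favouring_interruption"]
df["fully_reported"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

model = smf.logit(
    "fully_reported ~ sig_favouring_interruption + C(outcome_status)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["study_id"]})
print(np.exp(model.params))      # odds ratios
print(np.exp(model.conf_int()))  # 95% confidence intervals for the ORs
```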
Studies with at least one ITS report will be eligible for this analysis. If there are multiple reports of results for the primary research question(s), we will select the report with the most detailed methods for comparison with the protocol. When this decision is not clear, the report will be selected via discussion between two authors (PYN and EK, ST, MJP, or JEM). For the primary research question(s), as defined in “Study-level information”, we will compare the planned methods in the protocol and the primary report of the results to identify discrepancies in any aspect of the ITS methods, including the data collection process, model characteristics, and statistical analysis methods (see Table 1 for details). All discrepancies recorded will be reviewed by at least two statisticians (AF, EK, JEM, or ST) to judge whether the discrepancy is important; that is, whether the discrepancy could potentially change the result. A set of rules on what is considered an important discrepancy for each aspect of the methods will be determined via consensus among all the authors prior to data analysis. We will report the percentage of ITS studies that have any discrepancies and the percentage that have important discrepancies for each aspect of the methods. We will also summarize the rationales provided for the discrepancies, as reported by the study authors.
Selective non-publication of studies and selective non-reporting of results can bias conclusions drawn in systematic reviews (with or without meta-analyses).18–20,31 Left unaddressed, reporting biases in the ITS literature have the potential to lead to the implementation of large-scale interventions that are, at best, not effective and, at worst, harmful. Similar to other non-randomized studies, the ITS literature is likely to be more prone to reporting biases than randomized trials because of a lack of mechanisms to encourage publication and reporting of all results, such as study registries, making registration a condition of publication, and guidelines for transparent reporting.23,24,32 Furthermore, the analysis of ITS designs involves many decisions, such as the choice of model structure, the unit of time for aggregating observations, the statistical methods, and whether and how to adjust for autocorrelation and other potential confounders, which can yield varied results depending on the choices made.4,15 Such multiplicity in analysis decisions provides an opportunity for study authors to report the most favorable results.
To our knowledge, this is the first study to systematically assess reporting biases among ITS studies across a range of topics and to assess discrepancies in the reporting of methods between protocols and reports of their results. Knowledge of the prevalence of reporting biases will be useful for systematic reviewers and other stakeholders who rely on evidence from ITS studies and meta-analyses for decision-making. Furthermore, our results may highlight the need for mechanisms to encourage complete reporting, such as the establishment of registries for non-randomized studies or planned analyses of existing datasets (e.g., administrative databases), along with incentives to register such studies and analyses. Moreover, based on our findings regarding the details of the methods most prone to discrepancies, future reporting checklists for ITS studies may incorporate recommendations for reporting these details.
Our study has some limitations. First, it is difficult to identify ITS studies at their inception. Many studies evaluating publication bias and selective reporting of results have focused on randomized trials, where the trials were identified by ethics committees.33,34 We chose not to use ethics committees as a source for identifying ITS studies because ITS studies are sometimes undertaken without ethics approval being sought; thus, using this source would likely provide an incomplete sample of ITS studies. Instead, we will construct our sample from studies with a published protocol. However, using this sample may lead to underestimation of reporting biases, given that the presence of an a priori plan is associated with a higher quality of design35 and a lower likelihood of reporting biases.36 Second, it is possible that we may miss report(s) of the results that are published in gray literature or not made public.
We will seek ethics approval from the Monash University Human Research Ethics Committee before contacting the study authors to clarify and seek missing information from their publications. A consent form will be sent to authors of the included ITS reports via email or via a Monash University approved survey provider (Qualtrics). Only data accompanied by signed consent forms will be included in the study. Findings from the quantitative analysis will be reported in aggregate to maintain study authors’ confidentiality. Data will be stored on a Monash University secure server.
Data extracted from the published ITS protocols and their associated reports will be deposited in a free-access data repository under a CC-BY license allowing reuse with attribution, and assigned a digital object identifier (DOI).