Keywords
Altmetrics, bibliometrics, publication impact
This article is included in the Research on Research, Policy & Culture gateway.
This article is included in the Interactive Figures collection.
Article-level measures of publication impact (alternative metrics or altmetrics) can help to characterise the impact of a publication among different audiences and in different contexts. Although the journal impact factor (JIF) may help to identify journals with a high readership, it is widely recognised as being a poor indicator of the quality or impact of individual research articles1,2. We have previously described a novel approach to summarising altmetrics, the EMPIRE (EMpirical Publication Impact and Reach Evaluation) Index, which uses article-level metrics to assess the impact of medical publications in terms relevant to different stakeholders3. The EMPIRE Index provides component scores for scholarly, social and societal impact, as well as a total impact score and predictive reach metrics. It provides richer information than other commonly used metrics such as the Altmetric Attention Score or JIF, with societal impact being the most distinct component score.
It is widely recognised that publication metrics vary by discipline; to facilitate the comparison of publication impact across different disciplines, field-normalised citation impacts are frequently calculated4. Metrics also vary by publication type. For example, a study found that review articles in pharmacology journals received twice as many citations as original articles5. Here, we present an exploratory investigation of whether disease indications and publication types influence the average EMPIRE Index scores.
This exploratory study investigated 12 disease indications, chosen to reflect a variety of common and rare diseases with a variety of aetiologies. Six of these were rare diseases, selected as a convenience sample of disease indications with which the authors were most familiar. No formal statistical power analysis was undertaken; however, we aimed for disease samples of approximately 1000 publications each, which would enable publication type sub-analyses. Because individual rare diseases yielded too few publications to meet this target, the six rare disease samples were pooled.
Relevant publications were identified for each disease by the appearance of the disease name in the publication title. We limited the search period to items with publication dates between 1 May 2017 and 1 May 2018, to give sufficient time for metrics to accumulate while also minimising the time-dependent variation in metrics.
The searches were conducted on PubMed between 22 June 2020 and 3 July 2020, using the following search string:
For each disease, we conducted secondary searches for each publication type using PubMed tags for those of interest (i.e. the search string above and either "review", "systematic review", "clinical trial, phase iii", "clinical trial" or "observational study"). Altmetrics were obtained for all publications from Altmetric Explorer and PlumX over the period 23 June 2020 to 11 July 2020. Altmetrics were assumed to be zero for any publication for which Altmetric Explorer did not return a result. We also obtained the journal CiteScore for all publications6.
EMPIRE Index scores were calculated for all publications as described previously3. Briefly, selected altmetrics that compose the EMPIRE Index were weighted and aggregated to form three component scores (social impact, scholarly impact and societal impact), which were then summed to form a total impact score.
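The weight-and-aggregate step described above can be sketched as follows. Note that the metric names and weights below are placeholders for illustration only; the published EMPIRE Index weights are given in the original methods paper (reference 3):

```python
def empire_scores(metrics, weights):
    """Compute EMPIRE-style component scores and their total.

    `metrics` maps altmetric names to counts for one publication;
    `weights` maps each component to {metric_name: weight}.
    Both the metric names and weights used here are illustrative only.
    """
    components = {
        comp: sum(w * metrics.get(m, 0) for m, w in wts.items())
        for comp, wts in weights.items()
    }
    # The total impact score is the sum of the three component scores.
    components["total"] = sum(components.values())
    return components

# Placeholder weights -- NOT the published EMPIRE Index values.
weights = {
    "social":    {"tweets": 0.1, "news_stories": 1.0},
    "scholarly": {"citations": 1.0, "mendeley_readers": 0.5},
    "societal":  {"policy_mentions": 5.0, "guideline_citations": 10.0},
}
scores = empire_scores(
    {"tweets": 30, "citations": 8, "policy_mentions": 1}, weights
)
```

Metrics absent from a publication's record simply contribute zero, matching the assumption above that missing Altmetric Explorer results indicate zero activity.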
Each disease area comprised a different mixture of publication types, which we expected could confound the analysis; multivariate analysis on such a heterogeneous, non-normal and zero-inflated data set is problematic. Therefore, we opted to create standardised samples through random sampling.
A sample was created for each disease area with a standardised mix of publication types chosen to maximise the total number of publications retained (the standardised publication types [SPT] set). First, the two least common publication types (phase 3 clinical trials and systematic reviews) were excluded because of the high variation between disease areas and because they are largely subsets of other publication types (clinical trials and reviews, respectively). Although the observational studies publication type was only slightly more common than systematic reviews, it was retained as it was considered to be functionally very different from clinical trials and reviews. The proportions of each of the remaining three publication types were calculated for each disease set, as well as for the overall set. Publications were then trimmed from each disease set by random sampling, as needed, to match the proportions in the overall set. The trimmed publication sets formed the SPT set.
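The proportional trimming described above can be sketched in Python. This is a simplified illustration, assuming each publication is represented as a (id, type) pair; the exact sampling procedure used in the study may differ in detail:

```python
import random
from collections import Counter

def trim_to_proportions(pubs, target_props, seed=0):
    """Randomly down-sample `pubs` (a list of (pub_id, pub_type) pairs)
    so that the mix of publication types matches `target_props`
    while retaining as many publications as possible."""
    rng = random.Random(seed)
    by_type = {}
    for pub_id, pub_type in pubs:
        by_type.setdefault(pub_type, []).append((pub_id, pub_type))
    counts = Counter(t for _, t in pubs)
    # Largest total n such that n * prop(t) <= count(t) for every type.
    n = int(min(counts[t] / p for t, p in target_props.items()))
    sample = []
    for t, p in target_props.items():
        k = min(int(round(n * p)), counts[t])
        sample.extend(rng.sample(by_type[t], k))
    return sample
```

The scarcest publication type relative to its target proportion determines the overall sample size, and the other types are then randomly trimmed to match.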
Similarly, each publication type comprised a different mix of diseases. A standardised disease areas (SDA) set was created by random sampling using a similar approach that ensured each publication type included the same mix of diseases, while maximising the total number of publications retained.
To provide an indication of public interest in each of these diseases, we downloaded weekly Google Trends data on relative interest over time for the period of interest for these diseases (May 1 2017 to May 1 2018). A score of 100 indicates the maximum interest in any week over the search period and across any of the search terms of interest. The year averages presented here are expressed relative to that maximum score.
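The rescaling of Google Trends data described above amounts to the following calculation (a sketch; the term names are hypothetical):

```python
def relative_year_averages(weekly_interest):
    """Rescale weekly Google Trends-style series so that 100 marks the
    single highest weekly value across all terms, then return each
    term's average weekly interest relative to that peak."""
    peak = max(max(series) for series in weekly_interest.values())
    return {
        term: 100 * sum(series) / (peak * len(series))
        for term, series in weekly_interest.items()
    }
```

Because every term is scaled against the same peak, the year averages are directly comparable across diseases.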
As these analyses were exploratory, we primarily provide descriptive statistics, and only minimal statistical analysis was undertaken. Intra-group differences were assessed using the Kruskal–Wallis one-way analysis of variance, a non-parametric test of whether the samples originate from the same distribution (a significant result indicates that the population median of at least one group differs from that of at least one other group).
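For reference, the Kruskal–Wallis H statistic can be computed in a few lines of pure Python. This sketch assigns mid-ranks to ties but omits the tie correction; it is a simplified illustration rather than the exact routine used in the analysis:

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (mid-ranks for ties, no tie
    correction). Under the null hypothesis, H is approximately
    chi-squared distributed with k - 1 degrees of freedom."""
    # Pool all observations, remembering which group each came from.
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    # Assign mid-ranks: tied values share the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid = (i + j + 1) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = mid
        i = j
    # Sum the ranks within each group.
    rank_sums = [0.0] * len(groups)
    for (x, gi), r in zip(pooled, ranks):
        rank_sums[gi] += r
    # H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
    return 12 / (n * (n + 1)) * sum(
        rs * rs / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
```

Because the test operates on ranks rather than raw values, it is well suited to the heavily skewed, zero-inflated score distributions described above.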
In total, 20 577 publications were identified across the 12 disease areas7, of which 5825 (28%) were tagged with one of the publication types of interest (Table 1). Table 1 also shows the Google search interest for each of these diseases.
Google search interest is the average weekly interest across the search period, and is a relative score 0–100 where 100 is the maximum score for any disease in any individual week.
The numbers of publications retained in the SDA set used for publication type comparisons (i.e. with the same disease indication composition for each publication type) are shown in Table 2.
Median EMPIRE Index scores and CiteScores for each disease in the SDA set are shown in Figure 1 and Table 3. Mean EMPIRE Index scores, shown in Figure 2, broadly reflect the median scores. Statistical analysis indicated that there was some significant variation in the medians of each component as well as the total impact score and journal CiteScore. In general, the ranking of publication type is relatively consistent across different types of impact. Notably, phase 3 clinical trials had the highest median and mean scores, while observational studies had the lowest. Systematic reviews had higher impact than reviews. Most articles across all publication types had no societal impact, and significant differences in societal impact were driven by outliers. Of note, eight of the ten publications with the highest societal impact were clinical trials, and six of those were in non-small cell lung cancer (NSCLC).
CI, confidence interval.
The interactive version (online only, accessible here: https://s3.eu-west-2.amazonaws.com/ox.em/webflow/p29ieu21/chart1.html) also shows mean EMPIRE Index scores for each disease by publication type (full set).
The numbers of publications retained in the SPT set used for disease comparisons (i.e. with the same publication type composition for each disease indication) are shown in Table 4.
Median EMPIRE Index scores and journal CiteScores for each disease in the SPT set are shown in Figure 3 and Table 5. Kruskal–Wallis testing indicated at least one significant pairwise difference in the total scores, each component score and journal CiteScore. Migraine and multiple sclerosis (MS) had the highest impact across social and scholarly component scores as well as the total impact score, while NSCLC and psoriasis had the lowest. Most articles across all diseases had no societal impact, with significant differences in societal impact driven by outliers. The eight publications with the highest societal impact were all important clinical outcomes trials (three in type 2 diabetes, three in NSCLC and one each in migraine and asthma).
MS, multiple sclerosis; NSCLC, non-small cell lung cancer; T2D, type 2 diabetes.
Mean EMPIRE Index scores for each disease in the SPT set are shown in Figure 4. The interactive version of Figure 4 (online publication only) also shows the mean EMPIRE Index scores by disease for each publication type (full data set). Mean scores do not show clear trends for differences between disease indications, although societal impact appears to be lower for asthma and MS, and higher for migraine than other diseases. The high societal impact for migraine was driven by review articles; 16 of the 23 migraine articles with societal impact scores above zero were review articles. The scholarly impact for rare diseases appears to be higher than for other disease areas, albeit with low confidence owing to small numbers of publications included.
The interactive version (online only, accessible here: https://s3.eu-west-2.amazonaws.com/ox.em/webflow/p29ieu21/chart2.html) also shows mean EMPIRE Index scores for each disease by publication type (full set). MS, multiple sclerosis; NSCLC, non-small cell lung cancer; T2D, type 2 diabetes.
This analysis found that typical EMPIRE Index scores vary across both disease indications and publication types. These results provide valuable context for interpreting the EMPIRE Index scores, and publication metrics in general, of individual publications. For example, these findings can help to determine whether a particular publication has notably high (or low) metrics.
We found considerable differences between disease areas, which broadly reflected public interest in the disease (as assessed through Google search interest). For example, the three diseases with the highest median EMPIRE Index scores, especially social impact, were migraine, MS and asthma; these also had the highest public interest. These differences were not observed in journal CiteScores, meaning that the disease areas with higher EMPIRE Index impact were not necessarily published in ‘high impact’ journals. NSCLC had low public interest (‘lung cancer’ as a general term was higher, but still lower than any of the other five major disease areas examined). Publications in NSCLC also had low median total impact scores, particularly in terms of social impact, despite being published in journals with higher median CiteScores.
Although this suggests distinct differences between diseases in terms of publication impact, it should be noted that the period of interest was only a single year. The findings could therefore have been influenced by the completion of important clinical studies, which can vary from year to year across disease areas.
A clear picture is seen for publication types, with phase 3 trials demonstrating much higher metrics than other types. The high impact of phase 3 clinical trials is to be expected, given that they are intended to provide practice-changing information. Systematic reviews had higher impact than general reviews; interestingly, this was despite their being published in journals with similar median CiteScores. This likely reflects the rigorous methodology with which systematic reviews synthesise the literature, which makes them more impactful. Observational studies had the lowest impact, suggesting that observational analyses still attract comparatively little interest.
In general, across both publication types and disease indications, median scores were higher for scholarly impact than for social or societal impact, while mean and maximal scores were broadly similar (or lower). This suggests that score distribution is more skewed for social and societal impact, with many papers generating little interest despite some scholarly impact.
A key strength of this study is the use of an automated approach to identify a large pool of publications for analysis. However, the automated process depends on the reliability of the underlying data. For example, disease areas were identified through a PubMed search on article titles, which may have excluded some relevant articles or included irrelevant ones. The PubMed search engine uses automatic term mapping, which usually makes the search more inclusive but can introduce inconsistencies8. Publication types were identified by metadata tags, but these tags are often inconsistently applied or missing. Tagging can also result in duplication; for example, some phase 3 clinical trial publications in our sample were also classified as clinical trials.
In conclusion, the EMPIRE Index successfully identified differences in impact by disease indication and publication type. This supports the notion that there is no universal gold standard metric for publications, and instead the impact of each publication needs to be evaluated in the context of the type of publication, disease area and potentially other factors. These findings should be considered when using the EMPIRE Index to assess publication impact.
Figshare: EMPIRE Index disease and publication type analysis. https://doi.org/10.6084/m9.figshare.17072435.v17
This project contains the following underlying data:
SMA metrics unlinked 11Jul20.xlsx
Psoriasis metrics unlinked 11Jul20.xlsx
NSCLC metrics unlinked 5Jul20.xlsx
NET metrics unlinked 11Jul20.xlsx
NASH metrics unlinked 11Jul20.xlsx
MS metrics unlinked 5Jul20.xlsx
Migraine metrics unlinked 5Jul20.xlsx
Google search interest (30Jul21).xlsx
DLBCL metrics unlinked 11Jul20.xlsx
Asthma metrics unlinked 5Jul20.xlsx
TSC metrics unlinked 11Jul20.xlsx
TNBC metrics unlinked 11Jul20.xlsx
T2DM metrics unlinked 5Jul20.xlsx
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
Version history: Version 5 (revision), 30 Oct 24; Version 4 (revision), 16 Sep 24; Version 3 (revision), 10 Mar 23; Version 2 (revision), 12 Apr 22; Version 1, 27 Jan 22.