F1000Prime: an analysis of discipline-specific reader data from Mendeley

Robin Haunschild; Lutz Bornmann

doi:10.12688/f1000research.6062.2

Research Note

Revised

[version 2; peer review: 1 approved with reservations, 1 not approved]

Robin Haunschild¹, Lutz Bornmann¹

PUBLISHED 08 May 2015

¹ Max Planck Institute for Solid State Research, Heisenbergstr. 1, Stuttgart, 70569, Germany

This article is included in the Research on Research, Policy & Culture gateway.

Abstract

We have used the F1000Prime recommended paper set (n = 114,582 biomedical papers) to query the number of Mendeley readers per (sub-) discipline via the Mendeley Application Programming Interface (API). Although the (sub-) discipline of Mendeley readers is self-assigned and not mandatory, we find that a large share (99.9%) of readers at Mendeley do share their (sub-) discipline. As expected, we find that most readers of F1000Prime recommended papers work in the disciplines of biology and medicine. A network analysis reveals strong connections between the disciplines of engineering, chemistry, physics, biology, and medicine.

Keywords

F1000Prime, Altmetrics, Mendeley, paper, evaluation

Corresponding author: Robin Haunschild

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2015 Haunschild R and Bornmann L. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).

How to cite: Haunschild R and Bornmann L. F1000Prime: an analysis of discipline-specific reader data from Mendeley [version 2; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2015, 4:41 (https://doi.org/10.12688/f1000research.6062.2) First published: 11 Feb 2015, 4:41 (https://doi.org/10.12688/f1000research.6062.1) Latest published: 08 May 2015, 4:41 (https://doi.org/10.12688/f1000research.6062.2)

Amendments from Version 1

In this new version of the manuscript we have added a new Conclusions section and extended the literature review, the methodological presentation, and the Discussion.

See the authors' detailed response to the review by Rodrigo Costas
See the authors' detailed response to the review by Stefanie Haustein

Introduction

Interest in the broad impact of research (Bornmann, 2012; Bornmann, 2013; King’s College London and Digital Science, 2015) has resulted in new forms of impact measurement. Traditional forms of impact measurement using bibliometrics only allow the measurement of impact on research itself. The new forms, which have been named altmetrics (an abbreviation of alternative metrics), purport to measure the impact of research on areas of society other than research by counting the mentions of papers in social media: “Alternative metrics, sometimes shortened to just altmetrics, is an umbrella term covering new ways of approaching, measuring and providing evidence for impact” (Adie, 2014, p. 349). As altmetrics, the numbers of readers (on Mendeley), micro-bloggers (on Twitter), and other consumers of research using social media are counted. Although scientometrics research on altmetrics is still in a very early phase (comparable to research on bibliometrics in the 1970s), the use of these data in research evaluation is already an issue. For example, altmetrics is considered in the Snowball Metrics project (Colledge, 2014). This project compiled a set of clearly defined indicators which will be used by participating universities (mostly Anglo-American universities) for research evaluation purposes. Also, many journals (e.g. Nature and PLoS journals) provide altmetric data for their papers on their webpages. Funders have also declared interest in using these metrics (Dinsmore et al., 2014). It seems that altmetrics will be used in practice before scientometrics research has produced standards for their reliable, fair and valid application (Weller, 2015).

This study uses one of the most important sources of altmetrics data, namely Mendeley. Mendeley “claims 3.1 million members. It was originally launched as software for managing and storing documents, but it encourages private and public social networking” (Van Noorden, 2014, p. 126). Since data from Mendeley can be retrieved via an Application Programming Interface (API) without any problems and the coverage of the scientific literature has been reported to be high (Priem, 2014), Mendeley is a very attractive data source for studying the reception of research. “Mendeley records the number of users that have listed it [i.e. an article], describing them as readers, whether or not they actually read it. Presumably, listing an article in Mendeley tends to reflect that an article has been read or will be read in the future, although there is no evidence that this assumption is true” (Thelwall & Maflahi, in press). Mohammadi & Thelwall (2014) found significant correlations between Mendeley reader counts and citation counts for the social sciences and humanities. Kraker et al. (2015) analyzed co-readership networks, and they briefly discussed the idea of bibliographic coupling and co-citation. In this study, we take a different approach from Kraker et al. (2015), i.e. our focus is on readership coupling. Overviews of Mendeley readership studies have been presented by Haustein & Larivière (2014), Thelwall & Kousha (in press), and Thelwall & Maflahi (in press).

In order to produce the underlying data set, we match Mendeley data with data from F1000Prime. F1000Prime is a database of biomedical papers and their reviews by peers. It is intended as a support tool that points researchers to the most important literature. Since it is not clear who actually reads the F1000Prime recommended papers, we investigated the disciplines of researchers (and other people) who have read these papers. We are mainly interested in two questions. First, are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested? If so, which other disciplines show interest in F1000Prime papers? Second, which disciplines read F1000Prime papers frequently or seldom together? The latter question will be answered by using social network techniques.

Methods

Peer ratings provided by F1000Prime

F1000Prime is a post-publication peer review system of papers from medical and biological journals. This service is part of the Science Navigation Group, which publishes and develops information services for the professional biomedical community and the consumer market. Papers for F1000Prime are selected by a peer-nominated global "Faculty" of leading scientists and clinicians. The Faculty members rate the papers and explain their importance. This means that only a selected set of papers from the biomedical area covered is reviewed, and most papers are not reviewed at all (Kreiman & Maunsell, 2011; Wouters & Costas, 2012).

The Faculty nowadays numbers more than 5,000 members worldwide, assisted by further associates, who are organised into more than 40 subjects. Members can choose and evaluate any paper of interest; however, “the great majority pick papers published within the past month, including advance online papers, meaning that users can be made aware of important papers rapidly” (Wets et al., 2003, p. 254). Although many papers published in popular and high-profile journals (e.g. Nature, New England Journal of Medicine, Science) are evaluated, 85% of the papers selected come from specialised or less well-known journals (Wouters & Costas, 2012). The F1000Prime database is regarded as a useful aid for researchers (and others doing research-oriented work) to obtain indications of the most relevant papers in the biomedical area: “The aim of Faculty of 1000 is not to provide an evaluation for all papers, as this would simply exacerbate the ‘noise’, but to take advantage of electronic developments to create the optimal human filter for effectively reducing the noise” (Wets et al., 2003, p. 253).

The F1000Prime publication set was provided to one of the authors in 2014. It consists of 149,227 records (papers and recommendations) with 114,582 unique papers which were published in various journals. 104,655 of those papers have a DOI and 112,983 of them have a PubMedID.

Use of the Mendeley API

Within the first half of 2014 the reference manager Mendeley provided a new version of its API. Some restrictions of the previous API were lifted. For example, the usage statistics were previously provided in relative terms and only for the top three entries (Haustein & Larivière, 2014). The new API provides results in absolute numbers and not only for the top three but for all entries. Mendeley provides access to the readership status (e.g. professor, postdoc, or student) and the distribution of the Mendeley readership across scientific disciplines as well as countries via the API. Those sets of data can be correlated with other information available about papers (e.g. citations or Twitter counts).

Before one can start to use the Mendeley API, one has to register as a Mendeley user. Afterwards, registration of the desired application is necessary (http://dev.mendeley.com). Authentication with the API is done via OAuth 2.0. The credentials are set during registration of the application.

We used R (http://www.r-project.org/) to interface with the Mendeley API. It seems to us that using other interfaces does not change the functionality or responsiveness, but we did not try other interfaces. Mendeley provides sample code for JavaScript, Python, R, and Ruby (http://dev.mendeley.com/code/sample_code.html), and all requests to the API use HTTP GET and POST requests. Therefore, we suppose that any other scripting or programming language may be used. The reply is sent in JavaScript Object Notation (JSON).
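As a rough illustration of the request pattern described above, the following Python sketch builds (but does not send) a catalog lookup for one paper. It is not the R code used in the study. The host `api.mendeley.com`, the `/catalog` path, the `view=stats` parameter, and the `Accept` media type are assumptions about the public API and may differ from the version queried here; `ACCESS_TOKEN` is a hypothetical placeholder for a token obtained via OAuth 2.0.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.mendeley.com"  # assumed host of the Mendeley API
ACCESS_TOKEN = "YOUR_OAUTH2_TOKEN"     # hypothetical placeholder, not a real token


def build_catalog_request(identifier, id_type="pmid", token=ACCESS_TOKEN):
    """Build an HTTP GET request for the reader statistics of one paper."""
    if id_type not in ("pmid", "doi"):
        raise ValueError("id_type must be 'pmid' or 'doi'")
    # urlencode escapes characters such as '/' that occur in DOIs.
    query = urlencode({id_type: identifier, "view": "stats"})
    return Request(
        f"{API_BASE}/catalog?{query}",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumed media type for catalog documents.
            "Accept": "application/vnd.mendeley-document.1+json",
        },
        method="GET",
    )


request = build_catalog_request("12345678")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) and decoding the JSON reply with `json.loads` is left out so that the sketch runs without network access or credentials.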

We requested user statistics for the F1000Prime publication set (n = 114,582 papers) using the PubMedID and DOI between the 4th and 6th of December 2014. Although the DOI (and possibly also the PubMedID) is not the unique identifier which it was intended to be (Franceschini et al., 2015), it is currently the best way to identify publications in the Mendeley API. If we could not find the PubMedID in the Mendeley database, the DOI was used. It is rather unlikely but possible that both identifiers are erroneous for the same paper. Therefore, we expect only an insignificant impact of erroneous DOIs and PubMedIDs on our results. We observed seemingly random connection problems. Sometimes those problems occurred after a few hundred or a few thousand requests. The largest chunk of requests we were able to get through the API without connection problems consisted of 47,629 papers. This large number of records contrasts with smaller chunks of requests (between 1,049 and 9,307 records). Fortunately, data retrieval through the Mendeley API is very easy to restart. Therefore, we continued data retrieval with the same publication record where the connection problem occurred, so that no data were lost due to the connection problems.
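The retrieval strategy described above (PubMedID first, DOI as fallback, restart at the record where a connection problem occurred) could be sketched in Python as follows. This is a minimal illustration, not the R code used in the study: `fake_fetch` is a hypothetical stand-in for the real API call, and the identifiers and reader counts are invented toy data.

```python
def lookup_with_fallback(paper, fetch):
    """Try the PubMedID first; fall back to the DOI if nothing is found."""
    for id_type in ("pmid", "doi"):
        identifier = paper.get(id_type)
        if identifier:
            record = fetch(id_type, identifier)
            if record is not None:
                return record
    return None


def retrieve_all(papers, fetch, results=None):
    """Process papers in order. Returns the results so far and the index at
    which a connection problem occurred (None if all papers were processed)."""
    results = [] if results is None else results
    for index in range(len(results), len(papers)):
        try:
            results.append(lookup_with_fallback(papers[index], fetch))
        except ConnectionError:
            return results, index
    return results, None


# Toy stand-in for the API: knows two papers and fails once.
CATALOG = {("pmid", "111"): {"reader_count": 5},
           ("doi", "10.1/b"): {"reader_count": 2}}
failures = {"remaining": 1}


def fake_fetch(id_type, identifier):
    if identifier == "10.1/b" and failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise ConnectionError("simulated connection problem")
    return CATALOG.get((id_type, identifier))


papers = [{"pmid": "111", "doi": "10.1/a"},
          {"pmid": "222", "doi": "10.1/b"},
          {"pmid": "333"}]

results, failed_at = retrieve_all(papers, fake_fetch)
while failed_at is not None:  # restart at the record where the problem occurred
    results, failed_at = retrieve_all(papers, fake_fetch, results)
print(results)
```

Because the list of results doubles as the checkpoint, a restart resumes exactly at the failed record and no paper is skipped or queried twice after a successful reply.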

Mendeley provides a breakdown of the user count into sub-disciplines. The possible values for disciplines and sub-disciplines can be obtained directly from the API via the GET /disciplines endpoint. Each discipline has a certain number of sub-disciplines. The sub-discipline “miscellaneous” occurs in every discipline. Each Mendeley user can select a discipline and a sub-discipline from a drop-down menu. This piece of information is not mandatory, like the user’s location.

Network analysis

Pajek is used to create the F1000Prime readership network (http://pajek.imfm.si/doku.php; de Nooy et al., 2011) applying the spring embedder of Kamada & Kawai (1989). Two reader counts from discipline A and at least two reader counts from discipline B constitute two links between both disciplines. From a matrix point of view, we have a symmetric readership coupling matrix, which has a two in row A and column B and vice versa for the aforementioned example. For detecting communities in the common readership of F1000Prime recommended papers, we used the VOS Clustering algorithm (Waltman et al., 2010), which is available in Pajek. The aim of this algorithm is to provide further insights into the structure of the network (Milojević, 2014). Our Pajek file is available free of charge at http://dx.doi.org/10.6084/m9.figshare.1386685.
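One possible reading of the link definition above is that each paper contributes min(a, b) links between two disciplines with a and b reader counts, respectively. The Python sketch below builds a symmetric readership coupling matrix under that assumption. The function name and the toy reader counts are hypothetical, and the sketch is not the procedure used to produce the Pajek file.

```python
from itertools import combinations


def readership_coupling(papers, disciplines):
    """Return a symmetric matrix of link counts between disciplines,
    assuming each paper contributes min(a, b) links per discipline pair."""
    index = {name: i for i, name in enumerate(disciplines)}
    matrix = [[0] * len(disciplines) for _ in disciplines]
    for counts in papers:
        present = [d for d in disciplines if counts.get(d, 0) > 0]
        for a, b in combinations(present, 2):
            links = min(counts[a], counts[b])
            matrix[index[a]][index[b]] += links
            matrix[index[b]][index[a]] += links  # keep the matrix symmetric
    return matrix


disciplines = ["Biology", "Medicine", "Physics"]
toy_papers = [
    {"Biology": 2, "Medicine": 3},  # two links between Biology and Medicine
    {"Biology": 1, "Physics": 1},   # one link between Biology and Physics
    {"Medicine": 4},                # single discipline: no link
]
matrix = readership_coupling(toy_papers, disciplines)
for row in matrix:
    print(row)
```

The diagonal stays at zero because readers from the same discipline do not create a link in this scheme.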

Results

It is not possible to distinguish whether a Mendeley user only bookmarked a paper or has also read the bookmarked paper. However, for clarity, we refer to bookmarks and observed reads as reader counts (following other studies). We found 6,263,913 Mendeley reader counts (on average 54.67 reader counts per paper) for the F1000Prime publication set. For 99.9% (n = 6,257,603) of those reader counts, the discipline and sub-discipline information is also available. This is a much higher percentage than for the geographical location (Haunschild et al., in press). For the F1000Prime publication set, the vast majority (74.94%) of Mendeley users are found in the “miscellaneous” sub-discipline of their respective discipline. Therefore, we added up all the readers of all sub-disciplines for each discipline.
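The aggregation step (summing the readers of all sub-disciplines within each discipline, across all papers) can be illustrated with a short Python sketch. The nested dictionary layout (discipline, then sub-discipline, then count) is an assumption about how the per-paper statistics are shaped, and the discipline names and counts are invented toy data.

```python
def discipline_totals(subdiscipline_counts):
    """Sum the sub-discipline reader counts of one paper per discipline."""
    return {discipline: sum(subs.values())
            for discipline, subs in subdiscipline_counts.items()}


def merge_totals(records):
    """Add up the per-discipline reader counts over all papers."""
    totals = {}
    for record in records:
        for discipline, count in discipline_totals(record).items():
            totals[discipline] = totals.get(discipline, 0) + count
    return totals


# Hypothetical per-paper statistics: discipline -> sub-discipline -> count.
toy_records = [
    {"Biological Sciences": {"Miscellaneous": 7, "Genetics": 3},
     "Medicine": {"Miscellaneous": 2}},
    {"Biological Sciences": {"Miscellaneous": 1}},
]
totals = merge_totals(toy_records)
print(totals)
```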

The results of our study are presented in Table 1. Nine disciplines contribute at least 1% to the reader counts of the F1000Prime publication set. The remaining 16 disciplines have less than 1% each. As expected, most readers (81.78%) of the F1000Prime literature assign themselves to the biomedical (sub-) disciplines. The other seven disciplines with a share of at least 1% together account for 15.19% of the F1000Prime readership at Mendeley. The third largest readership is found in the discipline psychology, which is partly related to medicine. After chemistry, which is also related to biology, five other disciplines show readership values above 1% within the F1000Prime literature. Those disciplines seem rather unrelated to the field of biomedical research, especially environmental sciences (according to Figure 1). The remaining 3.05% of the F1000Prime reader counts (n = 190,919) at Mendeley come from the 16 disciplines below 1%. Among these, the disciplines with the most reader counts are: social sciences (0.67%), mathematics (0.42%), electrical and electronic engineering (0.27%), education (0.22%), and materials sciences (0.21%).

Table 1. Mendeley users from different disciplines reading F1000Prime recommended papers (sorted in decreasing order).

Discipline	Mendeley reader counts (absolute)	% of Mendeley reader counts of F1000Prime recommended papers
Biology	4,149,857	66.32
Medicine	967,412	15.46
Psychology	248,746	3.98
Chemistry	182,773	2.92
Environmental sciences	169,287	2.71
Engineering	100,636	1.61
Computer and information science	96,403	1.54
Physics	88,586	1.42
Earth sciences	62,984	1.01
Social sciences	42,146	0.67
Mathematics	26,219	0.42
Electrical and electronic engineering	17,086	0.27
Education	13,471	0.22
Materials sciences	13,358	0.21
Economics	12,247	0.20
Sports and recreation	10,030	0.16
Business administration	9,307	0.15
Arts and literature	9,187	0.15
Philosophy	8,542	0.14
Management	6,689	0.11
Humanities	6,567	0.10
Astronomy and astrophysics	6,173	0.10
Linguistics	5,446	0.09
Design	2,974	0.05
Law	1,477	0.02
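The percentages in Table 1 can be recomputed from the absolute reader counts. The following Python sketch copies the counts from the table and derives the shares quoted in the text; only the variable names are introduced here for illustration.

```python
# Absolute Mendeley reader counts per discipline, copied from Table 1.
reader_counts = {
    "Biology": 4149857, "Medicine": 967412, "Psychology": 248746,
    "Chemistry": 182773, "Environmental sciences": 169287,
    "Engineering": 100636, "Computer and information science": 96403,
    "Physics": 88586, "Earth sciences": 62984, "Social sciences": 42146,
    "Mathematics": 26219, "Electrical and electronic engineering": 17086,
    "Education": 13471, "Materials sciences": 13358, "Economics": 12247,
    "Sports and recreation": 10030, "Business administration": 9307,
    "Arts and literature": 9187, "Philosophy": 8542, "Management": 6689,
    "Humanities": 6567, "Astronomy and astrophysics": 6173,
    "Linguistics": 5446, "Design": 2974, "Law": 1477,
}

total = sum(reader_counts.values())
shares = {d: 100 * n / total for d, n in reader_counts.items()}

# Share of the two biomedical disciplines.
biomedical = shares["Biology"] + shares["Medicine"]

# Disciplines contributing at least 1% and the reader counts of all others.
at_least_one_percent = [d for d, s in shares.items() if s >= 1.0]
below_count = sum(n for d, n in reader_counts.items()
                  if d not in at_least_one_percent)

print(total)                                # 6257603
print(round(biomedical, 2))                 # 81.78
print(len(at_least_one_percent))            # 9
print(below_count, round(100 * below_count / total, 2))  # 190919 3.05
```

The total matches the number of reader counts for which discipline information is available (n = 6,257,603), which confirms that the percentages refer to that subset rather than to all 6,263,913 reader counts.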

Figure 1. Network of Mendeley readers of F1000Prime recommended papers from arts and literature (AnL), astronomy and astrophysics (AsAs), biology (Bio), business administration (BuAd), chemistry (Chem), computer and information science (CIS), design (Des), earth sciences (ESci), economics (Eco), education (Edu), electrical and electronic engineering (EEE), engineering (Eng), environmental sciences (Env), humanities (Hum), law (Law), linguistics (Ling), management (Man), materials sciences (Mate), mathematics (Math), medicine (Med), philosophy (Phil), physics (Phys), psychology (Psy), social sciences (SoSc), sports and recreation (SpRe).

We also analyzed connections between the disciplines (see above). These are shown in Figure 1. A paper which is read by Mendeley users of different disciplines (e.g. biology and physics) constitutes a connection between these disciplines. Therefore, a paper which is read by Mendeley users of the same discipline does not contribute to the network system, but a paper which is read by Mendeley users of different disciplines contributes connections to the network. The size of the vertices in Figure 1 reflects the numbers of reader counts for each discipline. The thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines. One link is established between disciplines A and B if one reader count from both disciplines is observed for the same paper.

The location of the discipline vertex also indicates its connectivity. The closer the vertex is located to the center, the more connections to different disciplines are found. There are 25 disciplines and 300 links among those disciplines in the dataset. With a density of 1, the network is complete: every discipline is linked to every other discipline. The average node degree is 24. According to Figure 1, the strongest connection shows up between biology (Bio) and medicine (Med). The disciplines computer and information science (CIS), engineering (Eng), and chemistry (Chem) have rather strong connections to biology (Bio) and medicine (Med). The discipline arts and literature (AnL) shows a low number of readers (0.15%, close to sports and recreation with 0.16%) but is nevertheless well connected to other disciplines in the network.
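The density and average degree reported above follow directly from the numbers of vertices and links. The short Python sketch below shows the arithmetic, assuming an undirected network without loops or multiple lines between the same pair of vertices; the function names are illustrative.

```python
def network_density(n_vertices, n_links):
    """Observed links divided by the maximum number of links in an
    undirected network without loops: n * (n - 1) / 2."""
    possible_links = n_vertices * (n_vertices - 1) // 2
    return n_links / possible_links


def average_degree(n_vertices, n_links):
    """Each link contributes to the degree of two vertices."""
    return 2 * n_links / n_vertices


density = network_density(25, 300)
mean_degree = average_degree(25, 300)
print(density, mean_degree)  # 1.0 24.0
```

With 25 disciplines there are at most 25 × 24 / 2 = 300 distinct pairs, so 300 links mean that every pair of disciplines shares at least one paper. The structure of the network is therefore carried by the line weights rather than by the presence or absence of lines.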

The community detection algorithm detected one dominant community in the network with biology, medicine, engineering, chemistry, and physics. The reader counts contributing to the green vertices in the network are probably only weakly associated with the biomedical literature. The disciplines shown as yellow vertices amount to 87.9% of the reader counts of the F1000Prime papers, while the remaining 12.1% of the reader counts are contributed by the disciplines shown as green vertices.

Discussion

The (sub-) discipline of Mendeley readers is self-assigned and not mandatory. Still, we found that a large share (99.9%) of F1000Prime paper readers at Mendeley share their (sub-) discipline. Most readers (74.94%) assign the “miscellaneous” sub-discipline of their discipline to themselves. As the F1000Prime publication set is a collection of high-quality biomedical papers, we found – as expected – most readers in the disciplines of biology and medicine.

The network analyses revealed strong connections between engineering, chemistry, physics, biology, and medicine as well as their rather high reader percentages. These disciplines form a core set which can be differentiated from all other disciplines. In other words, besides this dominating set of disciplines no other set of strongly connecting disciplines could be identified. However, many of these other disciplines (e.g. mathematics, education, and arts and literature) are highly inter-connected (and connected with the core set), as their central location in the network indicates. Environmental sciences, psychology, and computer and information science are closely linked to biology although they are in a different community according to the employed algorithm. Chemistry and biology have a very strong link, much stronger than environmental sciences and biology. This is probably due to bio-chemical papers in the F1000Prime publication set.

Using a very specific data set, this study shows that Mendeley data can be used to investigate the readership of a set of publications in a meaningful way. Since the data set used here was from the biomedical area, the results agreed more or less with the expectations formulated beforehand. Thus, it would be interesting in future research to use the new Mendeley API in order to investigate the readership of inter-disciplinary data sets or data sets for topics (e.g. climate change) which are examined by researchers from several disciplines. Here, interesting insights can be expected using the new Mendeley API and the methods proposed in this study. For future studies, it would also be interesting to combine the discipline information from Mendeley with other user-specific data (the status group and the country of the users). This combination could lead to comparisons of discipline-specific networks between different countries (e.g. USA and India) and status groups (e.g. students and professors). However, for such comparisons the necessary data would have to be made available by Mendeley.

Conclusions

As expected, most Mendeley reader counts of F1000Prime publications can be associated with biology and medicine, although a significant percentage of reader counts originate from less related disciplines. According to the employed community algorithm 87.9% of the reader counts of the F1000Prime publications constitute the core of the readership of this publication set, while 12.1% of the reader counts are rather unexpected reader counts from less connected disciplines.

Data availability

Figshare: Mendeley reader counts for F1000Prime papers and Pajek network file. DOIs: 10.6084/m9.figshare.1301463 (Haunschild & Bornmann, 2014), 10.6084/m9.figshare.1386685 (Haunschild & Bornmann, 2015).

Author contributions

Wrote manuscript: RH and LB, Data acquisition: RH and LB, Data processing: RH, Data analysis: RH and LB, Produced graphics: LB, Manuscript revision: RH and LB.

References

Adie E: Taking the Alternative Mainstream. Profesional De La Informacion. 2014; 23(4): 349–351.
Bornmann L: Measuring the societal impact of research: research is less and less assessed on scientific impact alone--we should aim to quantify the increasingly important contributions of science to society. EMBO Reports. 2012; 13(8): 673–676.
Bornmann L: What is societal impact of research and how can it be assessed? A literature survey. J Am Soc Inf Sci Technol. 2013; 64(2): 217–233.
Colledge L: Snowball Metrics Recipe Book. Amsterdam, the Netherlands: Snowball Metrics program partners. 2014.
de Nooy W, Mrvar A, Batagelj V: Exploratory social network analysis with Pajek. New York, NY, USA: Cambridge University Press. 2011.
Dinsmore A, Allen L, Dolby K: Alternative Perspectives on Impact: The Potential of ALMs and Altmetrics to Inform Funders about Research Impact. PLoS Biol. 2014; 12(11): e1002003.
Franceschini F, Maisano D, Mastrogiacomo L: Errors in DOI indexing by bibliometric databases. Scientometrics. 2015; 102(3): 2181–2186.
Haunschild R, Bornmann L: Mendeley reader counts for F1000Prime papers. Figshare. 2014.
Haunschild R, Bornmann L: Mendeley reader counts for F1000Prime papers. Figshare. 2015.
Haunschild R, Stefaner M, Bornmann L: Who publishes, reads, and cites papers? An analysis of country information. in press.
Haustein S, Larivière V: A multidimensional analysis of Aslib proceedings – using everything but the impact factor. Aslib Journal of Information Management. 2014; 66(4): 358–380.
Kamada T, Kawai S: An algorithm for drawing general undirected graphs. Inf Process Lett. 1989; 31(1): 7–15.
King’s College London and Digital Science. The nature, scale and beneficiaries of research impact: An initial analysis of Research Excellence Framework (REF) 2014 impact case studies. London, UK: King’s College London. 2015.
Kraker P, Schlögl C, Jack K, et al.: Visualization of co-readership patterns from an online reference management system. J Inform. 2015; 9(1): 169–182.
Kreiman G, Maunsell JH: Nine criteria for a measure of scientific output. Front Comput Neurosci. 2011; 5: 48.
Milojević S: Network Analysis and Indicators. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring Scholarly Impact. Springer International Publishing. 2014; 57–82.
Mohammadi E, Thelwall M: Mendeley Readership Altmetrics for the Social Sciences and Humanities: Research Evaluation and Knowledge Flows. J Assoc Inf Sci Technol. 2014; 65(8): 1627–1638.
Priem J: Altmetrics. In B. Cronin & C. R. Sugimoto (Eds.), Beyond bibliometrics: harnessing multi-dimensional indicators of performance. Cambridge, MA, USA: MIT Press. 2014.
Thelwall M, Kousha K: Can Mendeley Bookmarks Reflect Readership? A Survey of User Motivations. J Assoc Inf Sci Technol. in press.
Thelwall M, Maflahi N: Are scholarly articles disproportionately read in their own country? An analysis of Mendeley readers. J Assoc Inf Sci Technol. in press.
Van Noorden R: Online collaboration: Scientists and the social network. Nature. 2014; 512(7513): 126–129.
Waltman L, van Eck NJ, Noyons ECM: A unified approach to mapping and clustering of bibliometric networks. J Inform. 2010; 4(4): 629–635.
Weller K: Social Media and Altmetrics: An Overview of Current Alternative Approaches to Measuring Scholarly Impact. In I. M. Welpe, J. Wollersheim, S. Ringelhan & M. Osterloh (Eds.), Incentives and Performance. Springer International Publishing. 2015; 261–276.
Wets K, Weedon D, Velterop J: Post-publication filtering and evaluation: Faculty of 1000. Learned Publishing. 2003; 16(4): 249–258.
Wouters P, Costas R: Users, narcissism and control–tracking the impact of scholarly publications in the 21st century. Utrecht, The Netherlands: SURFfoundation. 2012.

Open Peer Review

Reviewer Report 11 Aug 2015

Stefanie Haustein, École de bibliothéconomie et des sciences de l’information (EBSI), Université de Montréal, Montréal, Canada

Not Approved

https://doi.org/10.5256/f1000research.6911.r8600

The modifications that the authors have made in the current version are mostly superficial and improvements are minor. The main weaknesses remain, which is why I do not approve the indexation of this article.

To provide support for my decision, the major shortcomings are outlined below:

Research questions
The issue with the research questions remains and it is central to the paper. What is the rationale for the questions and how do they inform us regarding the reading/saving practices of Mendeley users? From my perspective the questions reflect what could very easily be derived from the dataset -- the absolute number of readers per Mendeley discipline -- instead of what would be interesting to analyze -- the Mendeley readers’ discipline in comparison to an expected value.

To be more specific:

RQ1a/b: Are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested? If so, which other disciplines show interest in F1000Prime papers?
As noted in my previous report, it is not clear what the authors expect to find. How many readers from each of the fields are “normal”? How much would be expected? Do as few as 1 reader from a discipline other than biology or medicine already indicate “interest from other disciplines”? Mohammadi & Thelwall (2014) created an expected value based on the discipline of the paper in the Web of Science. An alternative would be the discipline of the citing papers, which would actually be more suitable, as it is comparable to readers. Without such a comparison, I do not see the value of the analysis. What does it mean that two-thirds of Mendeley reader counts of F1000Prime recommended papers came from biology readers and 15% from medicine?

RQ2: Which disciplines read F1000Prime papers frequently or seldom together?
How can a discipline read? Rather: To what extent do Mendeley readers of different disciplines save the same papers? Again, this lacks a reference value without which it is impossible to assess whether certain values are “normal”.

In my previous review I had suggested using the PubMed/Medline subject classification as an expected value regarding disciplines and analyzing the knowledge flow from authors to readers on the level of disciplines. The fact that the authors of the study already use the PMID to match datasets would make it quite easy to retrieve this information, establish a benchmark, and help to answer the research questions in a more meaningful way. This has not been addressed by the authors.

Reference list
The authors did include some of the suggested references but mention them briefly rather than integrating findings and methods into their study. One example of such a passing mention is Mohammadi & Thelwall (2014), which, in my opinion, is fundamental regarding the subject as well as the methods of the study, because they analyzed the disciplines of the Mendeley readers and compared them to the discipline of the papers - an important reference value that is missing in this study. Unfortunately the authors only cite the paper with regard to correlations between readers and citations, which is not relevant to their study:
“Mohammadi & Thelwall (2014) found significant correlations between Mendeley reader counts and citation counts for the social sciences and humanities.”
As the authors noted in their reply to the review, it is true that Mohammadi & Thelwall (2014)’s paper is based on a previous restriction of the Mendeley API (to the top 3 disciplines per paper), but this does not make their methods of comparing reader and paper disciplines less relevant to the present study.

Dataset
The authors added a paragraph about the initial F1000Prime set (114,582 papers) and extended the retrieval method a bit. It is still not known how many of the queried papers were found on Mendeley and how many of them did not have any readers. Publication years, document types and particularly disciplines of these papers are not known.

As the other reviewer noted, it is mandatory to select a discipline when signing up to Mendeley, but the authors write: "Each user can select a discipline and sub-discipline from a drop-down menu. This piece of information is not mandatory, like the user's location." In their recent preprint they write: "It is optional for the users of Mendeley to provide their disciplinary affiliations (selecting from predefined sub-disciplines) and location." This is contradictory. The fact that 99.9% of readers provide a discipline but only 17.6% provide the country (according to their ISSI paper) also implies that the discipline is mandatory. More information should be provided about this.

Network analysis
The additional information provided about the construction of the network is helpful to understand the methods but also raises more questions:
"Two reader counts from discipline A and at least two reader counts from discipline B constitute two links between both disciplines. From a matrix point of view, we have a symmetric readership coupling matrix, which has a two in row A and column B and vice versa for the aforementioned example."
Does this mean that the smallest common denominator between two disciplines was chosen? That would mean: If paper x has 10 readers from discipline A and 12 readers from discipline B, and paper y has 10 readers from A and 1000 readers from B, the connection between A and B for paper x and y is 10 each, which would add up to 20 in the matrix? This method favors papers with large numbers of readers and underestimates large disciplines.

Why was it chosen over others, such as binary counts based on papers, where the connection between discipline A and B would be the number of papers that had readers from both A and B? Both methods (and other weighted approaches) have their advantages and disadvantages, which affect results. The authors should discuss this.
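The two counting schemes contrasted above can be sketched as follows. This is a minimal illustration under the minimum-based reading of the authors' description that is questioned here, not the authors' actual code; the function name `coupling` and the dictionary layout are hypothetical. The two toy papers are the ones from the example above.

```python
from collections import defaultdict
from itertools import combinations


def coupling(papers, binary=False):
    """Sum a link weight over papers for every pair of disciplines.

    papers: iterable of dicts mapping discipline -> reader count.
    binary=False: weight per paper is min(count_a, count_b).
    binary=True:  weight per paper is 1 if both disciplines have readers.
    """
    links = defaultdict(int)
    for readers in papers:
        # Only disciplines with at least one reader of this paper take part.
        present = sorted(d for d, n in readers.items() if n > 0)
        for a, b in combinations(present, 2):
            links[(a, b)] += 1 if binary else min(readers[a], readers[b])
    return dict(links)


paper_x = {"A": 10, "B": 12}
paper_y = {"A": 10, "B": 1000}

min_links = coupling([paper_x, paper_y])
binary_links = coupling([paper_x, paper_y], binary=True)
print(min_links)     # {('A', 'B'): 20}
print(binary_links)  # {('A', 'B'): 2}
```

Under the minimum-based scheme the 1000 readers from B in paper y contribute no more than the 12 readers in paper x, whereas the binary scheme ignores reader counts altogether.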

The new network graph positions the nodes of the same cluster together, but all other shortcomings remain: the most important advantage of a network visualization, namely the network structure, is still missing, as almost all nodes are connected with each other. In fact, the authors note that the density is 1: "With a density of 1, the network is rather dense." At a density of 1 it is not rather dense but as dense as it gets. A density of 1 would mean that each discipline has appeared together with each and every other at least once. Is that really the case even for the less frequent disciplines?
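For reference, the density of an undirected network without self-loops is the number of existing links divided by the number of possible links, n(n-1)/2. The sketch below illustrates this definition with a hypothetical `density` helper and a toy list of four disciplines; it does not use the study's data.

```python
def density(nodes, edges):
    """Density of an undirected graph without self-loops.

    nodes: list of node names.
    edges: iterable of (a, b) pairs; direction and duplicates are ignored.
    """
    n = len(nodes)
    possible = n * (n - 1) // 2
    # Treat (a, b) and (b, a) as the same link and drop self-loops.
    distinct = {frozenset(e) for e in edges if e[0] != e[1]}
    return len(distinct) / possible if possible else 0.0


disciplines = ["biology", "medicine", "chemistry", "physics"]
# Every discipline linked to every other one: 6 of 6 possible links.
complete = [
    (a, b) for i, a in enumerate(disciplines) for b in disciplines[i + 1:]
]
full = density(disciplines, complete)
half = density(disciplines, complete[:3])
print(full)  # 1.0
print(half)  # 0.5
```

A density of 1 is therefore the maximum: it is reached only when every pair of disciplines shares at least one link.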

Eliminating weak links would provide structure. Connections should be normalized, as the strongest connections in terms of absolute counts are always between the most frequently occurring disciplines. This will also influence the community detection. The authors note that they do not normalize the connections because different normalization methods lead to different results. This is of course true, but when discussing similarity between entities of different sizes, normalized values are still superior to absolute numbers.
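One possible way to combine both suggestions is sketched below. It is an assumption made for illustration, not a procedure taken from the study: each link weight is divided by the geometric mean of the two disciplines' total link weights (a cosine-style normalization, one of several options), and links below a cut-off are dropped. The function name `normalize_and_prune`, the threshold and the toy weights are all hypothetical.

```python
import math


def normalize_and_prune(links, threshold):
    """Normalize link weights by discipline size, then drop weak links.

    links: dict mapping (a, b) -> absolute link weight.
    threshold: minimum normalized weight a link needs to be kept.
    """
    # Total link weight per discipline serves as its size.
    strength = {}
    for (a, b), weight in links.items():
        strength[a] = strength.get(a, 0) + weight
        strength[b] = strength.get(b, 0) + weight
    # Cosine-style normalization: weight / sqrt(size_a * size_b).
    normalized = {
        (a, b): weight / math.sqrt(strength[a] * strength[b])
        for (a, b), weight in links.items()
    }
    return {pair: v for pair, v in normalized.items() if v >= threshold}


# Invented toy weights, for illustration only.
toy_links = {
    ("biology", "medicine"): 900,
    ("biology", "physics"): 100,
    ("chemistry", "physics"): 100,
}
kept = normalize_and_prune(toy_links, threshold=0.5)
for pair, value in kept.items():
    print(pair, round(value, 3))
```

In this toy case the two links with an absolute weight of 100 are treated differently after normalization: the link between the two small disciplines is kept, while the link to the large discipline falls below the cut-off.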

I still think that self-loops for the number of readers from the same discipline would provide very important information in the network graph. Currently papers with readers from one discipline only are excluded entirely. The authors note that they did not include self-loops due to limitations in Pajek. If Pajek is not able to handle self-loops, they can be easily visualized with UCInet or Gephi. These tools also have more visualization options than VOSviewer such as changing edge widths and colors.

Discussion and conclusion
The authors have extended the discussions and conclusions but due to the above-mentioned weaknesses they are not convincing. I do not see how this study shows that Mendeley data can be used meaningfully beyond the fact that Mendeley provides discipline data for its users.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Author Response 01 Sep 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

As also stated in our response to Rodrigo Costas, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research Notes include single-finding papers that can be reported with one or two illustrations (figures/tables), and lab protocols. Posters from conferences or internal meetings may be summarized as Research Notes” (http://f1000research.com/for-authors/article-guidelines).

The aim of our paper was to provide an overview of the readership of F1000Prime papers: Which scientists from life sciences and other disciplines read F1000Prime papers? The post-publication peer review system Faculty of 1000 and the journal F1000Research are very closely connected. Therefore, we selected this journal for our Research Note. As F1000Research is not a specialized scientometric journal, basic bibliometric analysis is employed in our study rather than advanced bibliometric techniques (such as normalization and more elaborate network analysis techniques). In contrast to the reviewers, we did not plan to provide insights into the meaning of altmetrics, a comparison of different network analysis techniques, or an extensive literature review. In our opinion, this would be better suited for a full paper in a different journal (specialized in scientometrics) with a focus on generalizable results.

It seems to us that both reviewers would like to see another type of study and depth of analysis than we intended. Take, for example, the recommendation to normalize the reader counts: It is no longer possible to gather reliable reference values, as our data set was gathered in December 2014 and altmetric data change very quickly. Several other recent studies presented non-normalized altmetric data (e.g. Haustein & Lariviere, 2014; Sud & Thelwall, 2015; Zahedi, Costas, & Wouters, 2014). Additionally, there are still no established procedures for normalizing altmetric data or for using normalized data in network analyses. Such methods would have to be developed and tested first. Usage of normalized data is especially customary for altmetric studies which aim towards research evaluation. As stated above, we did not plan to evaluate the impact of F1000Prime papers but to provide an overview of the disciplinary affiliations of the readership. Similarly, the other major points of the reviewers (a more extensive literature overview and a more detailed network analysis) would also require another type of study and depth of analysis than we intended.

Thus, we refrain from producing a new version of the manuscript.
Competing Interests: No competing interests were disclosed.

Reviewer Report 06 Aug 2015

Rodrigo Costas, Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands

Approved with Reservations

https://doi.org/10.5256/f1000research.6911.r8599

The authors have made some changes and modifications to the paper, improving some of the issues raised by the reviewers in their previous version. Particularly relevant are the better descriptions of the data collection and how matrices for the network analysis have been calculated.

However, some of the main critical observations raised in my previous review (and many times also pointed out by the other reviewer) still remain unchanged.

The paper still lacks solid (and to some extent relevant) research questions. In my first review I pointed to the weakness of the two research questions, which essentially remain the same. Even for answering the proposed questions one wonders why more comprehensive approaches were not used. For example, what is the value of knowing whether F1000Prime papers are read by biomedicine readers? Wouldn't it be better to compare these distributions with the overall distribution of readers of other biomedical sources? More comparative analysis could be performed in order to provide more meaningful statements and results.
Some of the new explanations also raise new questions and call for further explanation and discussion. For example, it seems that the co-occurrence readership matrix is constructed based on the number of readerships determined by the smallest discipline (when a paper is read by users from more than one discipline). Why was this approach selected? Wouldn't other approaches also be feasible? Would the results be substantially different? Have the authors considered the potential limitations of this approach? For example, this choice means that smaller disciplines (i.e. with fewer overall users in Mendeley) may appear more disconnected simply because their values will likely be smaller than those of the bigger disciplines. I think the authors could take the chance to present a more thorough discussion of the choice of this approach, which could eventually be considered critical for the future development of this type of analysis.
The network results are still not clear to me (see my previous review). What is the value or usefulness of these results? Actually, based on my previous comment, I also wonder now if the linkages with fields like Chemistry or Physics may be just the effect of the bigger size of these communities of readers in Mendeley. In other words, I wonder if this map is just the result of the size effect of the distribution of Mendeley users. I think this issue needs to be (at least) discussed.
Other aspects (also from my previous review) include that it is still not clear what the period of analysis is. Have all publications in F1000Prime been considered regardless of their year of publication? It is also not clear to me what the coverage of F1000Prime papers in Mendeley is (do all F1000Prime papers have readerships in Mendeley?). Have F1000Prime papers with zero readers been excluded from the analysis (particularly for the reporting of the average 54.67 reader counts per paper)? Also, the authors claim that “the (sub) discipline in Mendeley is self-assigned and not mandatory”. Actually, when creating a new account in Mendeley the discipline is mandatory but not the sub-discipline. This makes me wonder whether the claim that “readers (74.94%) assign the “miscellaneous” sub-discipline of their discipline” is actually correct or whether this “miscellaneous” label is added by default by Mendeley (or whether there have been changes in Mendeley’s policy for self-reporting users’ disciplinary background). I guess a more critical view should be considered here. The authors have added in the new version the claim that “this study shows that Mendeley data can be used to investigate meaningfully the readership of a set of publications”. I’m not sure what these “meaningful” uses are.

In conclusion, I maintain my previous assessment. Some elements have been improved, but some critical points, like the formulation of a relevant research question or a more critical discussion of the methodology chosen, are still lacking. The results are based on the crossing of two databases (Mendeley & F1000Prime) with no clear added value in doing so. In addition, I would like to see a more thorough discussion of the chosen approach for the readership coupling and of the value of the type of analysis of Mendeley communities presented. These were all points raised by both reviewers in the previous round and, to my understanding, they still remain unanswered in this version.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 01 Sep 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

As also stated in our response to Stefanie Haustein, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research Notes include single-finding papers that can be reported with one or two illustrations (figures/tables), and lab protocols. Posters from conferences or internal meetings may be summarized as Research Notes” (http://f1000research.com/for-authors/article-guidelines).

The aim of our paper was to provide an overview of the readership of F1000Prime papers: Which scientists from life sciences and other disciplines read F1000Prime papers? The post-publication peer review system Faculty of 1000 and the journal F1000Research are very closely connected. Therefore, we selected this journal for our Research Note. As F1000Research is not a specialized scientometric journal, basic bibliometric analysis is employed in our study rather than advanced bibliometric techniques (such as normalization and more elaborate network analysis techniques). In contrast to the reviewers, we did not plan to provide insights into the meaning of altmetrics, a comparison of different network analysis techniques, or an extensive literature review. In our opinion, this would be better suited for a full paper in a different journal (specialized in scientometrics) with a focus on generalizable results.

It seems to us that both reviewers would like to see another type of study and depth of analysis than we intended. Take, for example, the recommendation to normalize the reader counts: It is no longer possible to gather reliable reference values, as our data set was gathered in December 2014 and altmetric data change very quickly. Several other recent studies presented non-normalized altmetric data (e.g. Haustein & Lariviere, 2014; Sud & Thelwall, 2015; Zahedi, Costas, & Wouters, 2014). Additionally, there are still no established procedures for normalizing altmetric data or for using normalized data in network analyses. Such methods would have to be developed and tested first. Usage of normalized data is especially customary for altmetric studies which aim towards research evaluation. As stated above, we did not plan to evaluate the impact of F1000Prime papers but to provide an overview of the disciplinary affiliations of the readership. Similarly, the other major points of the reviewers (a more extensive literature overview and a more detailed network analysis) would also require another type of study and depth of analysis than we intended.

Thus, we refrain from producing a new version of the manuscript.
Competing Interests: No competing interests were disclosed.
Report a concern
Respond or Comment

COMMENTS ON THIS REPORT

Author Response 01 Sep 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

01 Sep 2015

Author Response

As also stated in our response to Stefanie Haustein, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research ... Continue reading As also stated in our response to Stefanie Haustein, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research Notes include single-finding papers that can be reported with one or two illustrations (figures/tables), and lab protocols. Posters from conferences or internal meetings may be summarized as Research Notes” (http://f1000research.com/for-authors/article-guidelines).

The aim with our paper was to provide an overview of the readership of F1000Prime papers: Which scientists from life sciences and other disciplines read F1000Prime papers? The post-publication peer review system Faculty of 1000 and the journal F1000Research are very closely connected. Therefore, we have selected this journal to submit our Research Note. As F1000Research is no specialized scientometric journal, basic bibliometric analysis is employed in our study rather than advanced bibliometric techniques (such as normalization and more elaborated network analysis techniques). In contrast to the reviewers, we did not plan to provide insights into meaning of altmetrics, comparison of different network analysis techniques, and extensive literature review. In our opinion, this would be better suited for a full paper in a different journal (specialized in scientometrics) with a focus of the analyses on generalizable results.

It seems to us that both reviewers would like to see another type of study and depth of analysis than we have intended. For example, the recommendation to normalize the reader counts: It is not possible anymore to gather reliable reference values now, as our data set was gathered in December 2014 and altmetric data change very quickly. Several other recent studies presented non-normalized altmetric data (e.g. Haustein & Lariviere, 2014; Sud & Thelwall, 2015; Zahedi, Costas, & Wouters, 2014). Additionally, there are still no established procedures for normalization of altmetric data and to use the normalized data for network analyses. Such methods would have to be developed and tested first. Usage of normalized data is especially customary for altmetric studies which aim towards research evaluation. As stated above, we did not plan to evaluate the impact of F1000Prime papers but to provide an overview of the disciplinary affiliations of the readership. Similarly, the other major points of the reviewers (more extensive literature overview and a more detailed network analysis) would also require another type of study and depth of analysis than we have intended.

Thus, we refrain from producing a new version of the manuscript.
As also stated in our response to Stefanie Haustein, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research Notes include single-finding papers that can be reported with one or two illustrations (figures/tables), and lab protocols. Posters from conferences or internal meetings may be summarized as Research Notes” (http://f1000research.com/for-authors/article-guidelines).

The aim with our paper was to provide an overview of the readership of F1000Prime papers: Which scientists from life sciences and other disciplines read F1000Prime papers? The post-publication peer review system Faculty of 1000 and the journal F1000Research are very closely connected. Therefore, we have selected this journal to submit our Research Note. As F1000Research is no specialized scientometric journal, basic bibliometric analysis is employed in our study rather than advanced bibliometric techniques (such as normalization and more elaborated network analysis techniques). In contrast to the reviewers, we did not plan to provide insights into meaning of altmetrics, comparison of different network analysis techniques, and extensive literature review. In our opinion, this would be better suited for a full paper in a different journal (specialized in scientometrics) with a focus of the analyses on generalizable results.

It seems to us that both reviewers would like to see another type of study and depth of analysis than we have intended. For example, the recommendation to normalize the reader counts: It is not possible anymore to gather reliable reference values now, as our data set was gathered in December 2014 and altmetric data change very quickly. Several other recent studies presented non-normalized altmetric data (e.g. Haustein & Lariviere, 2014; Sud & Thelwall, 2015; Zahedi, Costas, & Wouters, 2014). Additionally, there are still no established procedures for normalization of altmetric data and to use the normalized data for network analyses. Such methods would have to be developed and tested first. Usage of normalized data is especially customary for altmetric studies which aim towards research evaluation. As stated above, we did not plan to evaluate the impact of F1000Prime papers but to provide an overview of the disciplinary affiliations of the readership. Similarly, the other major points of the reviewers (more extensive literature overview and a more detailed network analysis) would also require another type of study and depth of analysis than we have intended.

Thus, we refrain from producing a new version of the manuscript.
Competing Interests: No competing interests were disclosed.

Version 1

PUBLISHED 11 Feb 2015

Reviewer Report 31 Mar 2015

Stefanie Haustein, École de bibliothéconomie et des sciences de l’information (EBSI), Université de Montréal, Montréal, Canada

Not Approved

https://doi.org/10.5256/f1000research.6490.r7640

The study combines data from F1000 and Mendeley to analyze the (self-reported) disciplines of Mendeley users saving articles recommended in F1000. Research questions lack focus and clarity, analysis and discussions are weak, conclusions are absent. This is mainly due to the fact that the analysis focuses on a very small component of the data (sum of reader count per discipline), despite the fact that the two datasets contain many more interesting aspects that are worth analyzing. Many relevant previous studies are ignored.

Due to the many shortcomings and weaknesses I do not approve the indexation of this article in its current state.

Major revisions:

The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?

It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.

The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?
2) Who publishes, reads, and cites papers? An analysis of country information
3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime
4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime
5) Overlay maps based on Mendeley data: The use of altmetrics for readership networks
6) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS (altmetrics) and F1000Prime (paper tags)
7) The authors also mention Haunschild, Stefaner & Bornmann (in preparation), which seems to focus on the geographic location of Mendeley users of the same dataset. Could this aspect not be integrated in the present study?
The reference list is particularly poor and not acceptable in its current form. There have been plenty of studies by Thelwall, Mohammadi, Costas, Zahedi, Kraker, Haustein and others that have evaluated Mendeley reader counts - these are completely ignored. The introduction should be additionally supported by core altmetrics publications by Priem (particularly his overview of altmetrics in Beyond Bibliometrics which includes a definition of altmetrics), Piwowar, regarding reference managers Taraborelli and Haustein & Siebenlist, as well as the above mentioned authors for research on altmetrics. Peter Kraker's work on readership networks based on Mendeley users needs to be considered as well.
The parallels between early bibliometrics research and current altmetrics lack references either to the particular bibliometrics studies or detailed discussions of this parallel, for example in Haustein, Bowman & Costas.

In addition, some of the references cited in the text are also not listed in the reference list. This needs proper revision.
The dataset is not clearly defined. What is the F1000 Prime publication dataset? How and when was it retrieved? Are those all recommendations ever made in the database? What publications do the 114,582 papers refer to (journals, publications years, document types, discipline, research field, etc.)? What is the metadata quality of these entries; in particular, how many of these have a correct DOI, PMID or both? In how far does the availability of identifiers as well as the characteristics of papers (publication year, journals) influence and bias the matching with and availability in Mendeley? Regarding the matching of results: how many unique documents do the 6,263,913 reader counts refer to? What is the percentage of documents that could not be found in Mendeley? Readers/users should be distinguished from reader counts - to avoid the implication that there are 6.2 million readers. Mendeley has around 3 million users (2.8 million as of February 2014; Haustein & Larivière), who create reader(ship) counts by adding documents to their libraries.
The description of the methods for the network analysis is too brief: How were co-occurrences calculated? How were they normalized? Due to the density of 1 (i.e., all nodes are connected), the network layout is not very meaningful and not easy to interpret, often counterintuitive. The informativeness of the network could be improved by removing weak links to obtain a more meaningful network structure, where central (i.e., well-connected) nodes are positioned in the center of the network and less important ones in the periphery. Moreover, similar nodes as detected by the clustering algorithm (yellow and green in Figure 1) should be placed close together. In addition, it would make sense to include self-loops for papers saved by users of the same discipline to highlight homogeneous user groups. As the authors use the VOSviewer clustering method, why was VOSviewer not chosen for the mapping? In my opinion, it provides much more meaningful robust networks and better visualizations than Pajek. Other alternatives are Gephi, GUESS, UCInet, etc.

Regarding the interpretation of the network, the authors state that "[t]he thicker and darker the edges between two disciplines, the more frequently [the users] have read a F1000Prime paper jointly". Should it not rather be that "[t]he thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines"?
The grouping into clusters seems to a certain extent counterintuitive: Why is environmental science grouped together with psychology instead of biology? Could this be introduced by (the lack of) normalization? These counterintuitive results need to be discussed!
The discussion needs to be extended and a conclusion is missing. It is not clear what the study actually shows/proves and in how far the few results (sum of readers per discipline) warrant a separate publication. The dataset of F1000 recommendations and Mendeley include many other pieces of interesting information such as the recommendation scores, F1000 tags (from F1000) and the geographic location and academic status of users (from Mendeley), which could be included to make the study much stronger and contribute to the understanding of readership counts and the effect of F1000 recommendations. In addition, subject classifications for the papers and locations of authors could be included to show if readers come from the same or different disciplines and countries as the paper and authors. I would also recommend the above mentioned extension of the study to include papers that were not recommended in F1000 to measure the effect of recommendations on readership counts. Combining these different aspects, one could investigate whether recommendations on F1000 lead to more diverse user groups on Mendeley in terms of discipline, country and academic status. For example, is a biology paper recommended and tagged as "good for teaching" on F1000 read by more Bachelor students from biology than a biology paper that was not recommended and tagged as such?

Minor revisions:

The first sentence "Interest in the broad impact of research (Bornmann, 2012, 2013) has resulted in new forms of impact measurements." simplifies the situation too much: there is also the technological push and publishers' interest, which resulted in the availability of new metrics, plus these metrics have not been validated as measuring impact yet. Also, the references to support interest in broad impact measures should refer to sources that show these interests such as REF etc. instead of papers by Bornmann, which claim that these interests exist.
Regarding the use of altmetrics: apart from Snowball Metrics, they are also applied in the sense that various journals now show them to indicate the "impact" and use of articles (for example, PLOS journals, Nature, Wiley journals etc.). Funders have also declared interest in using these metrics (for example, see Dinsmore, Allen & Dolby).
"Since data from Mendeley can be received by an Application Programming Interface (API) without any problems" - this is not completely true, there are a lot of issues with data quality and reliability for Mendeley, see for example: Bar-Ilan and Zahedi, Haustein & Bowman. These limitations need to be acknowledged in particular because the study is based on matching DOIs and PMIDs - Mendeley entries without these or incorrect IDs will be lost. What is the error rate introduced by using these identifiers only?
In the methods, authors should specify what was done when problems with the API connection occurred. How was it ensured that data was not lost due to these problems?
It would be helpful to add the number of unique papers and mean (+ std. dev.) number of reader counts per paper per discipline to Table 1 and include also the other disciplines with less than 1% of reader counts.

Competing Interests: No competing interests were disclosed.

Author Response 08 May 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

“The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?”

The motivation of the study is not solely based on availability of the datasets. The research questions do reflect this.

We have added the reference (Mohammadi & Thelwall). Clearly, Mohammadi & Thelwall focus on a specific publication year and on different disciplines than our study. Technical differences are highlighted in the Section Methods, Subsection Use of the Mendeley API. Mohammadi & Thelwall used the old API, which provided only the top 3 categories as percentages. Our study used the new API, where absolute reader numbers are provided and the top 3 restriction is no longer in place. All sub-disciplines with at least one reader are available in the API.

We do not know how many readers from biology are normal for the F1000Prime publication set. This is one of the reasons why we pursued this research. As we have no real expectation value for F1000Prime readers from biology, it is not possible to judge if the observed reader counts are as expected, higher, or lower.

“It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.”

While this is an interesting question, it is outside the scope of our current research question. It is also not easy (maybe even impossible) to answer. Even if a paper was recommended in F1000Prime and has a very high Mendeley reader count, we do not know whether this is due to the F1000Prime recommendation. The paper may simply be well written and interesting, and therefore both attracted many Mendeley readers and was recommended in F1000Prime.

“The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?”

Paper 1 is actually the preprint version of the current paper. We uploaded the manuscript to Figshare after submitting it to F1000Research.

“2) Who publishes, reads, and cites papers? An analysis of country information”

Paper 2 is concerned with the academic status information of Mendeley readers of F1000Prime papers. Furthermore, the type of analysis is completely different.

“3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime”

This is an old version of Paper 6.

“4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime”

Paper 4 focusses on Twitter counts provided by Altmetric.

“5) Overlay maps based on Mendeley data: The use of altmetrics for readership networks”

Paper 5 uses a different data set (WoS publication year 2012) than our current paper. It focusses on the generation of overlay maps and is already in press in a different journal.

“6) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS (altmetrics) and F1000Prime (paper tags)”

Paper 6 studied the intersection of altmetrics data from PLoS and F1000Prime publications. This intersection is rather small, with 1082 papers. Our current paper studies Mendeley reader counts of 114,582 papers, as noted in the Section Methods.

“7) The authors also mention Haunschild, Stefaner & Bornmann (in preparation), which seems to focus on the geographic location of Mendeley users of the same dataset. Could this aspect not be integrated in the present study?”

This paper is already in press and will be presented at the ISSI 2015 conference. Thus, it cannot be integrated into the present study. Furthermore, the topics of both papers are too different, so that it would not be possible to merge both into a concise article.

Although the topics of papers 2-7 might be similar (all deal with altmetrics), the focus of each paper is very different. In many cases, the data set is also very different.

“The reference list is particularly poor and not acceptable in its current form. There have been plenty of studies by Thelwall, Mohammadi, Costas, Zahedi, Kraker, Haustein and others that have evaluated Mendeley reader counts - these are completely ignored. The introduction should be additionally supported by core altmetrics publications by Priem (particularly his overview of altmetrics in Beyond Bibliometrics which includes a definition of altmetrics), Piwowar, regarding reference managers Taraborelli and Haustein & Siebenlist, as well as the above mentioned authors for research on altmetrics. Peter Kraker's work on readership networks based on Mendeley users needs to be considered as well.

The parallels between early bibliometrics research and current altmetrics lack references either to the particular bibliometrics studies or detailed discussions of this parallel, for example in Haustein, Bowman & Costas.”

Priem’s overview of Altmetrics in the book “Beyond Bibliometrics” is already referenced in the text. Haustein, S., & Larivière, V. (2014) is also cited. We have extended the literature review in the new version of the manuscript somewhat. Considering that this was intended to be a shorter article, we think it is not appropriate to include an exhaustive literature review.

“In addition, some of the references cited in the text are also not listed in the reference list. This needs proper revision.”

We thank the referee for this comment. Unfortunately, a large part of the list of references was lost due to a problem with our software in the final stages between submission and publication. We have included the lost references in the revised version.

“The dataset is not clearly defined. What is the F1000 Prime publication dataset? How and when was it retrieved? Are those all recommendations ever made in the database? What publications do the 114,582 papers refer to (journals, publications years, document types, discipline, research field, etc.)? What is the metadata quality of these entries; in particular, how many of these have a correct DOI, PMID or both? In how far does the availability of identifiers as well as the characteristics of papers (publication year, journals) influence and bias the matching with and availability in Mendeley? Regarding the matching of results: how many unique documents do the 6,263,913 reader counts refer to? What is the percentage of documents that could not be found in Mendeley? Readers/users should be distinguished from reader counts - to avoid the implication that there are 6.2 million readers. Mendeley has around 3 million users (2.8 million as of February 2014; Haustein & Larivière), who create reader(ship) counts by adding documents to their libraries.”

The employed F1000Prime publication set consists of 114,582 journal articles in journals such as Nature, PNAS, Science, Cell, PLoS ONE, etc. Each paper has at least one recommendation. We have added this information in the revised version of the manuscript. We checked the DOIs and PubMedIDs. There are only two wrong (duplicated) DOIs in the publication set, and we did not find a single incorrect PubMedID. Considering that this was intended to be a shorter article, we have included only a brief note on this in the new version of the paper.

The data set is deposited at the Figshare link in the paper. We have added a new Figshare link that also provides a network file, which can be loaded in Pajek to inspect the detailed properties of the network.

We have added descriptions regarding the data set, problems of retrieval of reader data, and the relation between reader counts and unique documents.
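
A check for duplicated identifiers of the kind mentioned above can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the study: the record layout (dictionaries with `doi` and `pmid` keys), the helper name `find_duplicate_ids`, and the sample values are all assumptions made for the example. DOIs are compared case-insensitively.

```python
from collections import Counter

def find_duplicate_ids(records, key):
    """Return identifiers (lower-cased, stripped) that occur more than once."""
    counts = Counter(
        r[key].strip().lower() for r in records if r.get(key)
    )
    return sorted(i for i, n in counts.items() if n > 1)

# Hypothetical records; the real publication set has 114,582 entries.
papers = [
    {"doi": "10.1000/a1", "pmid": "111"},
    {"doi": "10.1000/A1", "pmid": "222"},  # same DOI, different case
    {"doi": "10.1000/b2", "pmid": "333"},
    {"doi": None, "pmid": "444"},          # missing DOI is skipped
]

dup_dois = find_duplicate_ids(papers, "doi")
dup_pmids = find_duplicate_ids(papers, "pmid")
print(dup_dois, dup_pmids)
```

Such a check only detects duplicates and missing values; whether a syntactically valid identifier points to the right paper would still need to be verified against an external source.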

“The description of the methods for the network analysis is too brief: How were co-occurrences calculated? How were they normalized? Due to the density of 1 (i.e., all nodes are connected), the network layout is not very meaningful and not easy to interpret, often counterintuitive. The informativeness of the network could be improved by removing weak links to obtain a more meaningful network structure, where central (i.e., well-connected) nodes are positioned in the center of the network and less important ones in the periphery. Moreover, similar nodes as detected by the clustering algorithm (yellow and green in Figure 1) should be placed close together. In addition, it would make sense to include self-loops for papers saved by users of the same discipline to highlight homogeneous user groups. As the authors use the VOSviewer clustering method, why was VOSviewer not chosen for the mapping? In my opinion, it provides much more meaningful robust networks and better visualizations than Pajek. Other alternatives are Gephi, GUESS, UCInet, etc.”

Based on these suggestions, we have replaced Figure 1 with a new version and extended the methodological description of the network analysis. VOSviewer was not chosen as the visualization program because it shows co-occurrences as shorter distances rather than as thicker connection lines. We prefer the thicker connection lines for this paper. Unfortunately, the self-loops could not be included due to system limits of Pajek.
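
To make the referee's question about co-occurrences concrete, the following Python sketch shows one possible way to count them and to drop weak links. It rests on an assumption that is not confirmed by the text here: two disciplines are taken to co-occur on a paper when each has at least one reader of that paper, and the edge weight is the number of such papers. The function names, the threshold, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(papers):
    """Count, for each pair of disciplines, the papers with readers from both."""
    edges = Counter()
    for readers in papers:
        # Disciplines with at least one reader of this paper, in a fixed order
        present = sorted(d for d, n in readers.items() if n > 0)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges

def prune(edges, min_weight):
    """Keep only edges whose weight reaches the threshold."""
    return {pair: w for pair, w in edges.items() if w >= min_weight}

# Hypothetical reader counts per discipline for three papers
papers = [
    {"Biology": 12, "Medicine": 5, "Chemistry": 1},
    {"Biology": 3, "Medicine": 9},
    {"Biology": 7, "Physics": 2, "Chemistry": 0},
]

edges = cooccurrence(papers)
strong = prune(edges, min_weight=2)
print(strong)
```

Pruning reduces the density of the network, at the cost of an arbitrary threshold that has to be justified separately.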

“Regarding the interpretation of the network, the authors state that ‘[t]he thicker and darker the edges between two disciplines, the more frequently [the users] have read a F1000Prime paper jointly’. Should it not rather be that ‘[t]he thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines’?
The grouping into clusters seems to a certain extent counterintuitive: Why is environmental science grouped together with psychology instead of biology? Could this be introduced by (the lack of) normalization? These counterintuitive results need to be discussed!”

We have revised the formulation accordingly and included more discussion on the counterintuitive results. Different normalization procedures lead to different results. Thus, we prefer to show the visualization without normalization.
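
The point that different normalization procedures lead to different results can be illustrated with a small Python example. The two measures below, the cosine measure and the association strength, are commonly used in bibliometric mapping; they are chosen here purely for illustration, and the text above does not say which procedures were actually compared. All numbers are invented.

```python
from math import sqrt

def cosine(c_ij, s_i, s_j):
    """Co-occurrence divided by the geometric mean of the two totals."""
    return c_ij / sqrt(s_i * s_j)

def association_strength(c_ij, s_i, s_j):
    """Co-occurrence divided by the product of the two totals."""
    return c_ij / (s_i * s_j)

# Hypothetical totals per discipline and co-occurrence counts per pair
totals = {"Biology": 100, "Medicine": 100, "Physics": 10, "Engineering": 10}
cooc = {("Biology", "Medicine"): 50, ("Engineering", "Physics"): 4}

def strongest(measure):
    """Return the pair with the highest normalized link strength."""
    return max(cooc, key=lambda p: measure(cooc[p], totals[p[0]], totals[p[1]]))

by_raw = max(cooc, key=cooc.get)
by_cosine = strongest(cosine)
by_assoc = strongest(association_strength)
print(by_raw, by_cosine, by_assoc)
```

In this toy case the raw counts and the cosine measure rank the Biology-Medicine link first, while the association strength ranks the Engineering-Physics link first, so the choice of normalization changes which connection appears strongest.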

“The discussion needs to be extended and a conclusion is missing. It is not clear what the study actually shows/proves and in how far the few results (sum of readers per discipline) warrant a separate publication. The dataset of F1000 recommendations and Mendeley include many other pieces of interesting information such as the recommendation scores, F1000 tags (from F1000) and the geographic location and academic status of users (from Mendeley), which could be included to make the study much stronger and contribute to the understanding of readership counts and the effect of F1000 recommendations. In addition, subject classifications for the papers and locations of authors could be included to show if readers come from the same or different disciplines and countries as the paper and authors. I would also recommend the above mentioned extension of the study to include papers that were not recommended in F1000 to measure the effect of recommendations on readership counts. Combining these different aspects, one could investigate whether recommendations on F1000 lead to more diverse user groups on Mendeley in terms of discipline, country and academic status. For example, is a biology paper recommended and tagged as "good for teaching" on F1000 read by more Bachelor students from biology than a biology paper that was not recommended and tagged as such?”

Unfortunately, the other information from Mendeley (geographic location and academic status) is completely decoupled from the sub-discipline information. Therefore, it is not possible to identify a “Bachelor student from biology” using current Mendeley data. We could create similar figures for academic status and geographic location, but they are not that interesting in this case. As a biomedical publication set is studied, the vast majority of readers are expected to come from medicine and biology. To some extent this expectation is fulfilled, but some interesting readership connections between other disciplines and biology and/or medicine are found. We have no such expectation to test regarding location or academic status.

“The first sentence 'Interest in the broad impact of research (Bornmann, 2012, 2013) has resulted in new forms of impact measurements.' simplifies the situation too much: there is also the technological push and publishers' interest, which resulted in the availability of new metrics, plus these metrics have not been validated as measuring impact yet. Also, the references to support interest in broad impact measures should refer to sources that show these interests such as REF etc. instead of papers by Bornmann, which claim that these interests exist.”

We revised this sentence.

“Regarding the use of altmetrics: apart from Snowball Metrics, they are also applied in the sense that various journals now show them to indicate the "impact" and use of articles (for example, PLOS journals, Nature, Wiley journals etc.). Funders have also declared interest in using these metrics (for example, see Dinsmore, Allen & Dolby).”

Thank you for the suggestion. We have included this into the introduction.

“‘Since data from Mendeley can be received by an Application Programming Interface (API) without any problems’ - this is not completely true, there are a lot of issues with data quality and reliability for Mendeley, see for example: Bar-Ilan and Zahedi, Haustein & Bowman. These limitations need to be acknowledged in particular because the study is based on matching DOIs and PMIDs - Mendeley entries without these or incorrect IDs will be lost. What is the error rate introduced by using these identifiers only?”

We have revised this sentence.

“In the methods, authors should specify what was done when problems with the API connection occurred. How was it ensured that data was not lost due to these problems?”

We have added a more detailed description of the retrieval procedure.
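
One generic way to guard against data loss from connection problems is to retry failed requests and to record every identifier that still fails, so that it can be requested again later. The Python sketch below illustrates this pattern only; it is not the retrieval procedure of the study and makes no real API calls. The `fetch` callable stands in for whatever function performs the request, and `flaky_fetch`, the identifiers, and the returned fields are hypothetical.

```python
import time

def fetch_with_retry(fetch, identifier, max_attempts=5, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetch(identifier); on a connection error wait and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(identifier)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up; the caller records the failure
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff

def retrieve_all(fetch, identifiers, **kwargs):
    """Return (results, failed) so that no identifier is dropped silently."""
    results, failed = {}, []
    for identifier in identifiers:
        try:
            results[identifier] = fetch_with_retry(fetch, identifier, **kwargs)
        except ConnectionError:
            failed.append(identifier)
    return results, failed

# Simulated service: every first request fails, one identifier always fails.
calls = {}
def flaky_fetch(doi):
    calls[doi] = calls.get(doi, 0) + 1
    if doi == "10.1000/down" or calls[doi] < 2:
        raise ConnectionError("simulated outage")
    return {"doi": doi, "reader_count": 42}

results, failed = retrieve_all(
    flaky_fetch, ["10.1000/a1", "10.1000/down"],
    max_attempts=3, sleep=lambda seconds: None,  # no real waiting in the demo
)
print(results, failed)
```

Keeping the list of failed identifiers separate from the results makes it possible to distinguish papers that could not be retrieved from papers that have no readers.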

“It would be helpful to add the number of unique papers and mean (+ std. dev.) number of reader counts per paper per discipline to Table 1 and include also the other disciplines with less than 1% of reader counts.”

We have also added the other disciplines with less than 1% of the readers. Adding the number of unique papers, the mean number of readers, and the standard deviations as well would make the table much harder to read. All raw data are deposited at a Figshare link so that readers interested in other types of analysis can perform them on their own.
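
For readers who want the additional statistics, they can be derived from per-paper reader counts along the following lines. This Python sketch assumes a simple layout (one dictionary of discipline-to-count per paper), which is not necessarily the format of the deposited data, and it computes the mean and sample standard deviation only over papers with at least one reader from the discipline. The function name and sample values are hypothetical.

```python
from statistics import mean, stdev

def discipline_stats(papers):
    """Summarise reader counts per discipline across papers."""
    per_discipline = {}
    for readers in papers:
        for discipline, n in readers.items():
            if n > 0:
                per_discipline.setdefault(discipline, []).append(n)
    stats = {}
    for discipline, counts in per_discipline.items():
        stats[discipline] = {
            "papers": len(counts),    # unique papers with >= 1 reader
            "readers": sum(counts),   # total reader count
            "mean": mean(counts),
            # Sample standard deviation needs at least two values
            "sd": stdev(counts) if len(counts) > 1 else 0.0,
        }
    return stats

# Hypothetical reader counts per discipline for three papers
papers = [
    {"Biology": 12, "Medicine": 5},
    {"Biology": 4, "Medicine": 9},
    {"Biology": 8},
]

stats = discipline_stats(papers)
print(stats["Biology"])
```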
“The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?”

The motivation of the study is not solely based on availability of the datasets. The research questions do reflect this.

We have added the reference (Mohammadi & Thelwall). Clearly, Mohammadi & Thelwall focus on a specific publication year and different disciplines than our study. Technical differences are highlighted in the Section Methods, Subsection Use of the Mendeley API. Mohammadi & Thelwall used the old API where only the top 3 categories in percentages. Our study used the new API where absolute reader numbers are provided and the top 3 restriction is no longer in place. All sub-disciplines with at least one reader are available in the API.

We do not know how many readers from biology are normal for the F1000Prime publication set. This is one of the reasons why we pursued this research. As we have no real expectation value for F1000Prime readers from biology, it is not possible to judge if the observed reader counts are as expected, higher, or lower.

“It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.”

While this is an interesting question, it is outside the scope of our current research question. Also, it is not easy (maybe even impossible) to answer it. Even if a paper was recommended into F1000Prime and has a very high Mendeley count, we do not know if this is due to the F1000Prime recommendation or not. Maybe, the paper is well written and interesting, attracted many Mendeley reader counts and was recommended into F1000Prime.

“The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?”

The paper 1 is actually the preprint version of the current paper. We uploaded the manuscript to Figshare after submitting it to F1000Research.

“2) Who publishes, reads, and cites papers? An analysis of country information”

The paper 2 is concerned about the academic status information of Mendeley readers of F1000Prime papers. Furthermore, the type of analysis is completely different.

“3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime”

This is an old version of Paper 6.

“4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime”

The paper 4 focusses on Twitter counts provided by Altmetric.

“5) Overlay maps based on Mendeley data: The use of altmetrics for readership networks”

The paper 5 uses a different data set (WoS publication year 2012) than our current paper. It focusses on the generation of overlay maps and is already in press in a different journal.

“6) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS (altmetrics) and F1000Prime (paper tags)”

The paper 6 studied the intersection of altmetrics data from PLoS and F1000Prime publications. This intersection is rather small with 1082 papers. Our current paper studies Mendeley reader counts of 114,582 papers as noted in the Section Methods.

“7) The authors also mention Haunschild, Stefaner & Bornmann (in preparation), which seems to focus on the geographic location of Mendeley users of the same dataset. Could this aspect not be integrated in the present study?”

This paper is already in press and will be presented at the ISSI 2015 conference. Thus, it cannot be integrated into the present study. Furthermore, the topics of both papers are too different, so that it would not be possible to merge both into a concise article.

Although the topics of the papers 2-7 might be similar (all deal with altmetrics), the focus of each paper is very different. In many cases, also the data set is very different.

“The reference list is particularly poor and not acceptable in its current form. There have been plenty of studies by Thelwall, Mohammadi, Costas, Zahedi, Kraker, Haustein and others that have evaluated Mendeley reader counts - these are completely ignored. The introduction should be additionally supported by core altmetrics publications by Priem (particularly his overview of altmetrics in Beyond Bibliometrics which includes a definition of altmetrics), Piwowar, regarding reference managers Taraborelli and Haustein & Siebenlist, as well as the above mentioned authors for research on altmetrics. Peter Kraker's work on readership networks based on Mendeley users needs to be considered as well.

The parallels between early bibliometrics research and current altmetrics lack references either to the particular bibliometrics studies or detailed discussions of this parallel, for example in Haustein, Bowman & Costas.”

Priem’s overview of altmetrics in the book “Beyond Bibliometrics” is already referenced in the text. Haustein, S., & Larivière, V. (2014) is also cited. We have somewhat extended the literature review in the new version of the manuscript. Because this was intended to be a shorter article, we do not think it is appropriate to include an exhaustive literature review.

“In addition, some of the references cited in the text are also not listed in the reference list. This needs proper revision.”

We thank the referee for this comment. Unfortunately, a large part of the reference list was lost because of a problem with our software in the final stages between submission and publication. We have restored the lost references in the revised version.

“The dataset is not clearly defined. What is the F1000 Prime publication dataset? How and when was it retrieved? Are those all recommendations ever made in the database? What publications do the 114,582 papers refer to (journals, publications years, document types, discipline, research field, etc.)? What is the metadata quality of these entries; in particular, how many of these have a correct DOI, PMID or both? In how far does the availability of identifiers as well as the characteristics of papers (publication year, journals) influence and bias the matching with and availability in Mendeley? Regarding the matching of results: how many unique documents do the 6,263,913 reader counts refer to? What is the percentage of documents that could not be found in Mendeley? Readers/users should be distinguished from reader counts - to avoid the implication that there are 6.2 million readers. Mendeley has around 3 million users (2.8 million as of February 2014; Haustein & Larivière), who create reader(ship) counts by adding documents to their libraries.”

The employed F1000Prime publication set consists of 114,582 journal articles in journals such as Nature, PNAS, Science, Cell, PLoS ONE, etc. Each paper has at least one recommendation. We have added this information in the revised version of the manuscript. We checked the DOIs and PubMedIDs. There are only two wrong (duplicated) DOIs in the publication set, and we did not find a single wrong PubMedID. Because this was intended to be a shorter article, we have included a brief note on this in the new version of the paper.

The data set is deposited at the Figshare link in the paper. A new Figshare link has been added; the deposit now also contains a network file that can be loaded in Pajek to see detailed properties of the network.

We have added descriptions of the data set, the problems with retrieving reader data, and the relation between reader counts and unique documents.
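
For readers who want to repeat the identifier check on the deposited data, a minimal sketch follows. It is not the procedure we ran; the record layout (dictionaries with "doi" and "pmid" keys) and the function name find_duplicate_ids are assumptions made for illustration. DOIs are compared case-insensitively.

```python
from collections import Counter


def find_duplicate_ids(records, key):
    """Return identifiers that occur more than once under `key`.

    `records` is a list of dicts (hypothetical layout); missing or empty
    identifiers are skipped. Values are lower-cased before counting so that
    DOIs differing only in case are treated as the same identifier.
    """
    counts = Counter(
        rec[key].strip().lower()
        for rec in records
        if rec.get(key)
    )
    return {ident: n for ident, n in counts.items() if n > 1}


# Toy records, invented for illustration only.
papers = [
    {"doi": "10.1000/a1", "pmid": "100"},
    {"doi": "10.1000/A1", "pmid": "101"},  # same DOI, different case
    {"doi": "10.1000/b2", "pmid": "102"},
    {"doi": None, "pmid": "103"},          # missing DOI is ignored
]

duplicate_dois = find_duplicate_ids(papers, "doi")
duplicate_pmids = find_duplicate_ids(papers, "pmid")
```

In this toy set the check flags one duplicated DOI and no duplicated PubMedID.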

“The description of the methods for the network analysis is too brief: How were co-occurrences calculated? How were they normalized? Due to the density of 1 (i.e., all nodes are connected), the network layout is not very meaningful and not easy to interpret, often counterintuitive. The informativeness of the network could be improved by removing weak links to obtain a more meaningful network structure, where central (i.e., well-connected) nodes are positioned in the center of the network and less important ones in the periphery. Moreover, similar nodes as detected by the clustering algorithm (yellow and green in Figure 1) should be placed close together. In addition, it would make sense to include self-loops for papers saved by users of the same discipline to highlight homogeneous user groups. As the authors use the VOSviewer clustering method, why was VOSviewer not chosen for the mapping? In my opinion, it provides much more meaningful robust networks and better visualizations than Pajek. Other alternatives are Gephi, GUESS, UCInet, etc.”

Based on these suggestions, we have replaced Figure 1 with a new version and extended the methodological description of the network analysis. VOSviewer was not chosen as the visualization program because it shows co-occurrences as shorter distances rather than as thicker connection lines. We prefer the thicker connection lines for this paper. Unfortunately, the self-loops could not be included due to system limits of Pajek.
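
To make the referee's questions about co-occurrence counting and weak links concrete, here is a minimal sketch. It assumes that two disciplines co-occur on a paper when both have at least one reader for it; this definition, the data layout, and the names discipline_cooccurrence and drop_weak_links are illustrative assumptions and not necessarily the procedure described in the revised manuscript. No normalization is applied.

```python
from collections import Counter
from itertools import combinations


def discipline_cooccurrence(readers_per_paper):
    """Count, for every pair of disciplines, the papers with readers from both.

    `readers_per_paper` maps a paper id to a dict of discipline -> reader count
    (hypothetical layout). Pairs are stored in alphabetical order.
    """
    pair_counts = Counter()
    for counts in readers_per_paper.values():
        present = sorted(d for d, n in counts.items() if n > 0)
        for a, b in combinations(present, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def drop_weak_links(pair_counts, min_weight):
    """Keep only edges whose raw co-occurrence count is at least `min_weight`."""
    return {pair: w for pair, w in pair_counts.items() if w >= min_weight}


# Toy data, invented for illustration only.
readers_per_paper = {
    "paper-1": {"Biology": 12, "Medicine": 5, "Chemistry": 1},
    "paper-2": {"Biology": 3, "Medicine": 9},
    "paper-3": {"Biology": 7, "Physics": 2, "Chemistry": 0},
}

edges = discipline_cooccurrence(readers_per_paper)
strong_edges = drop_weak_links(edges, min_weight=2)
```

With a threshold of 2, only the Biology-Medicine edge survives in this toy example, which is the kind of pruning the referee suggests for a network of density 1.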

“Regarding the interpretation of the network, the authors state that ‘[t]he thicker and darker the edges between two disciplines, the more frequently [the users] have read a F1000Prime paper jointly’. Should it not rather be that ‘[t]he thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines’?
The grouping into clusters seems to a certain extent counterintuitive: Why is environmental science grouped together with psychology instead of biology? Could this be introduced by (the lack of) normalization? These counterintuitive results need to be discussed!”

We have revised the formulation accordingly and included more discussion on the counterintuitive results. Different normalization procedures lead to different results. Thus, we prefer to show the visualization without normalization.

“The discussion needs to be extended and a conclusion is missing. It is not clear what the study actually shows/proofs and in how far the few results (sum of readers per discipline) warrant a separate publication. The dataset of F1000 recommendations and Mendeley include many other pieces of interesting information such as the recommendation scores, F1000 tags (from F1000) and the geographic location and academic status of users (from Mendeley), which could be included to make the study much stronger and contribute to the understanding of readership counts and the effect of F1000 recommendations. In addition, subject classifications for the papers and locations of authors could be included to show if readers come from the same or different disciplines and countries as the paper and authors. I would also recommend the above mentioned extension of the study to include papers that were not recommended in F1000 to measure the effect of recommendations on readership counts. Combining these different aspects, one could investigate whether recommendations on F1000 lead to more diverse user groups on Mendeley in terms of discipline, country and academic status. For example, is a biology paper recommended and tagged as "good for teaching" on F1000 read by more Bachelor students from biology than a biology paper that was not recommended and tagged as such?”

Unfortunately, the other information from Mendeley (geographic location and academic status) is completely decoupled from the sub-discipline information. Therefore, it is not possible to define a “Bachelor student from biology” using current Mendeley data. We could create similar figures for academic status and geographic location, but they are not that interesting in this case. As a biomedical publication set is studied, the vast majority of readers are expected to come from medicine and biology. To some extent this expectation is fulfilled, but some interesting readership connections between other disciplines and biology and/or medicine are found. We have no such expectation to test regarding location or academic status.

“The first sentence 'Interest in the broad impact of research (Bornmann, 2012, 2013) has resulted in new forms of impact measurements.' simplifies the situation too much: there is also the technological push and publishers' interest who resulted in the availability of new metrics, plus these metrics have not been validated as measuring impact yet. Also, the references to support interest in broad impact measures should refer to sources that show these interests such as REF etc. instead of papers by Bornmann, which claim that these interests exist.”

We revised this sentence.

“Regarding the use of altmetrics: apart from Snowball Metrics, they are also applied in the sense that various journals now show them to indicate the "impact" and use of articles (for example, PLOS journals, Nature, Wiley journals etc.). Funders have also declared interest in using these metrics (for example, see Dinsmore, Allen & Dolby).”

Thank you for the suggestion. We have included this in the introduction.

“‘Since data from Mendeley can be received by an Application Programming Interface (API) without any problems’ - this is not completely true, there are a lot of issues with data quality and reliability for Mendeley, see for example: Bar-Ilan and Zahedi, Haustein & Bowman. These limitations need to be acknowledged in particular because the study is based on matching DOIs and PMIDs - Mendeley entries without these or incorrect IDs will be lost. What is the error rate introduced by using these identifiers only?”

We have revised this sentence.

“In the methods, authors should specify what was done when problems with the API connection occurred. How was it insured that data was not lost due to these problems?”

We have added a more detailed description of the retrieval procedure.

“It would be helpful to add the number of unique papers and mean (+ std. dev.) number of reader counts per paper per discipline to Table 1 and include also the other disciplines with less than 1% of reader counts.”

We have also added the other disciplines that account for less than 1% of the readers. Additionally including the number of unique papers, the mean number of readers, and the standard deviations would make the table much harder to understand. All raw data are deposited at a Figshare link so that people interested in other types of analysis can perform them on their own.
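
As an illustration of how such per-discipline statistics could be derived from the raw data, a minimal sketch follows. The data layout and the name per_discipline_stats are assumptions; a paper counts toward a discipline only if it has at least one reader from it, and the population standard deviation is used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def per_discipline_stats(readers_per_paper):
    """Return unique-paper count, mean and std. dev. of reader counts per discipline.

    `readers_per_paper` maps a paper id to a dict of discipline -> reader count
    (hypothetical layout). Zero counts are not included in a discipline's sample.
    """
    by_discipline = defaultdict(list)
    for counts in readers_per_paper.values():
        for discipline, n in counts.items():
            if n > 0:
                by_discipline[discipline].append(n)
    return {
        d: {"papers": len(v), "mean": mean(v), "std": pstdev(v)}
        for d, v in by_discipline.items()
    }


# Toy data, invented for illustration only.
readers_per_paper = {
    "paper-1": {"Biology": 12, "Medicine": 5},
    "paper-2": {"Biology": 4, "Medicine": 9},
    "paper-3": {"Biology": 8},
}

stats = per_discipline_stats(readers_per_paper)
```
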
Competing Interests: No competing interests were disclosed.

COMMENTS ON THIS REPORT

Author Response 08 May 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

“The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?”

The motivation of the study is not solely based on availability of the datasets. The research questions do reflect this.

We have added the reference (Mohammadi & Thelwall). Clearly, Mohammadi & Thelwall focus on a specific publication year and on different disciplines than our study. Technical differences are highlighted in the Methods section, subsection “Use of the Mendeley API”. Mohammadi & Thelwall used the old API, which provided only the top 3 categories as percentages. Our study used the new API, which provides absolute reader numbers and no longer imposes the top 3 restriction. All sub-disciplines with at least one reader are available in the API.
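
The practical difference between the two kinds of output can be illustrated with a small sketch that reduces absolute reader counts to a top 3 list of percentages. The function top3_percentages, the rounding to whole percentages, and the toy numbers are assumptions for illustration; they do not reproduce the exact format of either API.

```python
def top3_percentages(absolute_counts):
    """Reduce absolute reader counts to the three largest categories in percent.

    `absolute_counts` maps a category to its absolute reader count. Percentages
    are relative to all readers and rounded to whole numbers (an assumption).
    """
    total = sum(absolute_counts.values())
    if total == 0:
        return {}
    ranked = sorted(absolute_counts.items(), key=lambda kv: kv[1], reverse=True)
    return {category: round(100 * n / total) for category, n in ranked[:3]}


# Toy counts, invented for illustration only.
absolute = {"Biology": 50, "Medicine": 30, "Chemistry": 15, "Physics": 5}
reduced = top3_percentages(absolute)
```

In the reduced view the Physics readers disappear entirely and the absolute numbers can no longer be recovered, which is why absolute counts for all sub-disciplines allow a more complete analysis.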

We do not know how many readers from biology are normal for the F1000Prime publication set. This is one of the reasons why we pursued this research. As we have no real expectation value for F1000Prime readers from biology, it is not possible to judge if the observed reader counts are as expected, higher, or lower.

“It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.”

While this is an interesting question, it is outside the scope of our current research question. Also, it is not easy (maybe even impossible) to answer. Even if a paper was recommended in F1000Prime and has a very high Mendeley reader count, we do not know whether this is due to the F1000Prime recommendation. Perhaps the paper is well written and interesting, and therefore both attracted many Mendeley readers and was recommended in F1000Prime.

“The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?”

Paper 1 is actually the preprint version of the current paper. We uploaded the manuscript to Figshare after submitting it to F1000Research.

“2) Who publishes, reads, and cites papers? An analysis of country information”

Paper 2 is concerned with the academic status information of Mendeley readers of F1000Prime papers. Furthermore, the type of analysis is completely different.

“3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime”

This is an old version of Paper 6.

“4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime”

Paper 4 focuses on Twitter counts provided by Altmetric.

“The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?”

The motivation of the study is not solely based on availability of the datasets. The research questions do reflect this.

We have added the reference (Mohammadi & Thelwall). Clearly, Mohammadi & Thelwall focus on a specific publication year and different disciplines than our study. Technical differences are highlighted in the Section Methods, Subsection Use of the Mendeley API. Mohammadi & Thelwall used the old API where only the top 3 categories in percentages. Our study used the new API where absolute reader numbers are provided and the top 3 restriction is no longer in place. All sub-disciplines with at least one reader are available in the API.

We do not know how many readers from biology are normal for the F1000Prime publication set. This is one of the reasons why we pursued this research. As we have no real expectation value for F1000Prime readers from biology, it is not possible to judge if the observed reader counts are as expected, higher, or lower.

“It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.”

While this is an interesting question, it is outside the scope of our current research question. Also, it is not easy (maybe even impossible) to answer it. Even if a paper was recommended into F1000Prime and has a very high Mendeley count, we do not know if this is due to the F1000Prime recommendation or not. Maybe, the paper is well written and interesting, attracted many Mendeley reader counts and was recommended into F1000Prime.

“The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?”

The paper 1 is actually the preprint version of the current paper. We uploaded the manuscript to Figshare after submitting it to F1000Research.

“2) Who publishes, reads, and cites papers? An analysis of country information”

The paper 2 is concerned about the academic status information of Mendeley readers of F1000Prime papers. Furthermore, the type of analysis is completely different.

“3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime”

This is an old version of Paper 6.

“4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime”

The paper 4 focusses on Twitter counts provided by Altmetric.

“5) Overlay maps based on Mendeley data: The use of altmetrics for readership networks”

The paper 5 uses a different data set (WoS publication year 2012) than our current paper. It focusses on the generation of overlay maps and is already in press in a different journal.

“6) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS (altmetrics) and F1000Prime (paper tags)”

The paper 6 studied the intersection of altmetrics data from PLoS and F1000Prime publications. This intersection is rather small with 1082 papers. Our current paper studies Mendeley reader counts of 114,582 papers as noted in the Section Methods.

“7) The authors also mention Haunschild, Stefaner & Bornmann (in preparation), which seems to focus on the geographic location of Mendeley users of the same dataset. Could this aspect not be integrated in the present study?”

This paper is already in press and will be presented at the ISSI 2015 conference. Thus, it cannot be integrated into the present study. Furthermore, the topics of both papers are too different, so that it would not be possible to merge both into a concise article.

Although the topics of the papers 2-7 might be similar (all deal with altmetrics), the focus of each paper is very different. In many cases, also the data set is very different.

“The reference list is particularly poor and not acceptable in its current form. There have been plenty of studies by Thelwall, Mohammadi, Costas, Zahedi, Kraker, Haustein and others that have evaluated Mendeley reader counts - these are completely ignored. The introduction should be additionally supported by core altmetrics publications by Priem (particularly his overview of altmetrics in Beyond Bibliometrics which includes a definition of altmetrics), Piwowar, regarding reference managers Taraborelli and Haustein & Siebenlist, as well as the above mentioned authors for research on altmetrics. Peter Kraker's work on readership networks based on Mendeley users needs to be considered as well.

The parallels between early bibliometrics research and current altmetrics lack references either to the particular bibliometrics studies or detailed discussions of this parallel, for example in Haustein, Bowman & Costas.”

Priem’s overview of Altmetrics in the book “Beyond Bibliometrics” is already referenced in the text. Haustein, S., & Larivière, V. (2014) is also cited. We have extended the literature review in the new version of the manuscript somewhat. Considering that this was intended to be a shorter article, we think it is not appropriate to include an exhaustive literature review.

“In addition, some of the references cited in the text are also not listed in the reference list. This needs proper revision.”

We thank the referee for this comment. Unfortunately, a large part of the list of references got lost, due to a problem with our software in the final stages between submission and publication. We have included the lost references in the revised version.

“The dataset is not clearly defined. What is the F1000 Prime publication dataset? How and when was it retrieved? Are those all recommendations ever made in the database? What publications do the 114,582 papers refer to (journals, publications years, document types, discipline, research field, etc.)? What is the metadata quality of these entries; in particular, how many of these have a correct DOI, PMID or both? In how far does the availability of identifiers as well as the characteristics of papers (publication year, journals) influence and bias the matching with and availability in Mendeley? Regarding the matching of results: how many unique documents do the 6,263,913 reader counts refer to? What is the percentage of documents that could not be found in Mendeley? Readers/users should be distinguished from reader counts - to avoid the implication that there are 6.2 million readers. Mendeley has around 3 million users (2.8 million as of February 2014; Haustein & Larivière), who create reader(ship) counts by adding documents to their libraries.”

The employed F1000Prime publication set consists of 114,582 journal articles in journals such as Nature, PNAS, Science, Cell, PLoS ONE, etc. Additionally, there is at least one recommendation for each paper. We have added this information in the revised version of the manuscript. We checked the DOIs and PubMedIDs. There are only two wrong (duplicated) DOIs in the publication set. We did not find a single wrong PubMedID. Considering that this was intended to be a shorter article, we have included a brief note regarding this in the new version of the paper.

The data set is deposited at the Figshare link in the paper. A new Figshare link has been added; it also provides a network file that can be loaded in Pajek to see detailed properties of the network.

We have added descriptions of the data set, the problems of retrieving reader data, and the relation between reader counts and unique documents.

“The description of the methods for the network analysis is too brief: How were co-occurrences calculated? How were they normalized? Due to the density of 1 (i.e., all nodes are connected), the network layout is not very meaningful and not easy to interpret, often counterintuitive. The informativeness of the network could be improved by removing weak links to obtain a more meaningful network structure, where central (i.e., well-connected) nodes are positioned in the center of the network and less important ones in the periphery. Moreover, similar nodes as detected by the clustering algorithm (yellow and green in Figure 1) should be placed close together. In addition, it would make sense to include self-loops for papers saved by users of the same discipline to highlight homogeneous user groups. As the authors use the VOSviewer clustering method, why was VOSviewer not chosen for the mapping? In my opinion, it provides much more meaningful robust networks and better visualizations than Pajek. Other alternatives are Gephi, GUESS, UCInet, etc.”

Based on these suggestions, we have replaced Figure 1 with a new version and extended the methodological description of the network analysis. VOSviewer was not chosen as the visualization program because it shows co-occurrences as shorter distances rather than as thicker connection lines. We prefer the thicker connection lines for this paper. Unfortunately, the self-loops could not be included due to system limits of Pajek.

“Regarding the interpretation of the network, the authors state that ‘[t]he thicker and darker the edges between two disciplines, the more frequently [the users] have read a F1000Prime paper jointly’. Should it not rather be that ‘[t]he thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines’?
The grouping into clusters seems to a certain extent counterintuitive: Why is environmental science grouped together with psychology instead of biology? Could this be introduced by (the lack of) normalization? These counterintuitive results need to be discussed!”

We have revised the formulation accordingly and included more discussion on the counterintuitive results. Different normalization procedures lead to different results. Thus, we prefer to show the visualization without normalization.

“The discussion needs to be extended and a conclusion is missing. It is not clear what the study actually shows/proofs and in how far the few results (sum of readers per discipline) warrant a separate publication. The dataset of F1000 recommendations and Mendeley include many other pieces of interesting information such as the recommendation scores, F1000 tags (from F1000) and the geographic location and academic status of users (from Mendeley), which could be included to make the study much stronger and contribute to the understanding of readership counts and the effect of F1000 recommendations. In addition, subject classifications for the papers and locations of authors could be included to show if readers come from the same or different disciplines and countries as the paper and authors. I would also recommend the above mentioned extension of the study to include papers that were not recommended in F1000 to measure the effect of recommendations on readership counts. Combining these different aspects, one could investigate whether recommendations on F1000 lead to more diverse user groups on Mendeley in terms of discipline, country and academic status. For example, is a biology paper recommended and tagged as "good for teaching" on F1000 read by more Bachelor students from biology than a biology paper that was not recommended and tagged as such?”

Unfortunately, the other information from Mendeley (geographic location and academic status) is completely decoupled from the sub-discipline information. Therefore, it is not possible to define a “Bachelor student from biology” using current Mendeley data. We could create similar figures for academic status and geographic location, but they are not that interesting in this case. As a bio-medical publication set is studied, the vast majority of readers are expected to come from medicine and biology. To some extent this expectation is fulfilled, but some interesting readership connections between other disciplines and biology and/or medicine are found. We have no such expectation to test regarding location or academic status.

“The first sentence 'Interest in the broad impact of research (Bornmann, 2012, 2013) has resulted in new forms of impact measurements.' simplifies the situation too much: there is also the technological push and publishers' interest who resulted in the availability of new metrics, plus these metrics have not been validated as measuring impact yet. Also, the references to support interest in broad impact measures should refer to sources that show these interests such as REF etc. instead of papers by Bornmann, which claim that these interests exist.”

We revised this sentence.

“Regarding the use of altmetrics: apart from Snowball Metrics, they are also applied in the sense that various journals now show them to indicate the "impact" and use of articles (for example, PLOS journals, Nature, Wiley journals etc.). Funders have also declared interest in using these metrics (for example, see Dinsmore, Allen & Dolby).”

Thank you for the suggestion. We have included this in the introduction.

“‘Since data from Mendeley can be received by an Application Programming Interface (API) without any problems’ - this is not completely true, there are a lot of issues with data quality and reliability for Mendeley, see for example: Bar-Ilan and Zahedi, Haustein & Bowman. These limitations need to be acknowledged in particular because the study is based on matching DOIs and PMIDs - Mendeley entries without these or incorrect IDs will be lost. What is the error rate introduced by using these identifiers only?”

We have revised this sentence.

“In the methods, authors should specify what was done when problems with the API connection occurred. How was it insured that data was not lost due to these problems?”

We have added a more detailed description of the retrieval procedure.

“It would be helpful to add the number of unique papers and mean (+ std. dev.) number of reader counts per paper per discipline to Table 1 and include also the other disciplines with less than 1% of reader counts.”

We have also added the other disciplines with less than 1% of the reader counts. Including the number of unique papers, the mean number of readers, and the standard deviations as well would make the table much harder to understand. All raw data are deposited at a Figshare link so that people interested in other types of analysis can perform them on their own.
Competing Interests: No competing interests were disclosed.

Reviewer Report 06 Mar 2015

Rodrigo Costas, Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands

Approved with Reservations

https://doi.org/10.5256/f1000research.6490.r7642

This paper presents an analysis of F1000Prime recommended publications in combination with Mendeley users statistics. I think that in general terms the methodology is fine and the results are correct, having some descriptive interest. However, I have the following major ...

The paper lacks in my view a solid justification of its research questions. The two research questions proposed in the paper (“are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested?”, and “Which disciplines read F1000Prime papers frequently or seldom together?”) are too general and basically the results reported suggest: “1. Yes, F1000 papers are mostly saved by biomedical Mendeley users” and “2. yes, there are reasonable connections between disciplines (e.g. Biology and Medicine) while some others are not easy to understand”. After all, the reader is left with the questions "why is the analysis of the disciplines of the Mendeley users of F1000Prime recommended publications relevant? What have I learnt from this paper?". For example, do Mendeley users link F1000 papers thematically in a special manner? Or, does the use of Mendeley readerships have a different characteristic in the thematic analysis of disciplines that wouldn't be possible with other methods (e.g. bibliographic coupling)? Is the Mendeley users ‘crowdsourced’ disciplinary classification valid/useful for the classification of F1000Prime papers?
There are some methodological omissions. For example, what is the exact number of publications finally considered in the study? In the figshare dataset there are 147177 rows of data. If the article reports n=114582 papers (does this mean that Mendeley has a coverage of 78% of F1000Prime recommended papers?) Please, clarify this point. Also what are the publication years of the publications finally considered?
In the paper there is only a brief comment to the fact that Mendeley may not necessary measure actual "reads" (at the end of the introductory section). In fact there seems to be some confusion in what are "readers" and readerships (or simply the act of adding papers by Mendeley users). For example, in the results section it is stated that "we found 6,263,913 Mendeley readers". This is a bit misleading. These 6 millions are events of the act of adding documents in their Mendeley libraries by an undetermined number of different Mendeley users. I recommend to revise the consistency of the vocabulary in this regard. This clarification is important for example to understand how the matrix of Mendeley readerships is constructed (see minor comment below).
The results presented are not very surprising. Basically around 86% of F1000Prime publications are saved by Biomedical users, which is what would be expected considering the nature of F1000Prime. So what is the added value of this analysis? Are F1000 recommended publications more interdisciplinary than other biomedical publications as captured by Mendeley users? The network (Figure 1) and ‘community’ analysis are also not very informative. What does it mean that Bio, Med, Eng, etc. belong to the same community? I don't see the reason why Psy is not in the same community as Med. The authors say that biology is related to chemistry while not to environmental sciences. I don't see the logic of this result. Why chemistry is more linked to biology that Environmental sciences? (which intuitively I would expect to be related to biology). What does it mean the “central location” of engineering, material sciences and computer and information science by having connections to many other disciplines? Does it mean that users from these areas are more multidisciplinary than other Mendeley users? I think a much stronger case needs to be made to explain the value of these analyses and results.

Other minor comments include:

The Kreiman & Maunsell reference is missing.
PubMedIDs and DOIs are used as the linking element. Although using PubMedIDs and DOIs is straightforward (and they have been used in other studies), problems with the metadata and ids recorded in Mendeley have been reported and need to be acknowledged (http://www.asis.org/SIG/SIGMET/data/uploads/sigmet2014/zahedi.pdf).
It is stated that "a paper which is read by Mendeley users of different disciplines ... constitutes a connection between these disciplines". So, would this be a kind of "Mendeley readerships coupling"? For example Kraker et al. (2015) analyzed ‘co-readership’ networks and they briefly discussed the idea of bibliographic coupling and co-citation. In this paper a different approach as compared to Kraker and colleagues seems to be taken, i.e. here the focus seems to be more on readerships coupling. I think a discussion of the analytical approach would be necessary here.
When explaining the construction of the matrix for the network analysis, how are the links exactly determined? In other words, from a matrix point of view, if the disciplines are the columns and the rows, what is exactly counted in the cells? To clarify this point would be very helpful to the reader and also for other potential scholars interested in the methodology.

My main conclusion is that this paper is correct in technical terms and it has some descriptive merit, but it lacks a relevant focal point. Suggestions to improve the paper could include to turn it into a more methodological paper, so readers can learn in a step-wise mode how to produce and analyze Mendeley readerships coupling networks. Other more ambitious approaches would be to study for example if the disciplinary connections crowdsourced by Mendeley users match (or not) other classifications (e.g. Medline, WoS subject categories, etc.), thus the value of Mendeley and F1000Prime as tools to map disciplinary fields could be highlighted.

I hope my comments are useful to the authors of the paper.

Competing Interests: No competing interests were disclosed.

Author Response 08 May 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

“The paper lacks in my view a solid justification of its research questions. The two research questions proposed in the paper (“are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested?”, and “Which disciplines read F1000Prime papers frequently or seldom together?”) are too general and basically the results reported suggest: “1. Yes, F1000 papers are mostly saved by biomedical Mendeley users” and “2. yes, there are reasonable connections between disciplines (e.g. Biology and Medicine) while some others are not easy to understand”. After all, the reader is left with the questions "why is the analysis of the disciplines of the Mendeley users of F1000Prime recommended publications relevant? What have I learnt from this paper?". For example, do Mendeley users link F1000 papers thematically in a special manner? Or, does the use of Mendeley readerships have a different characteristic in the thematic analysis of disciplines that wouldn't be possible with other methods (e.g. bibliographic coupling)? Is the Mendeley users ‘crowdsourced’ disciplinary classification valid/useful for the classification of F1000Prime papers?”

The discussion section has been extended. However, we abstained from having a very long discussion section, since the paper was intended as a shorter article.

“There are some methodological omissions. For example, what is the exact number of publications finally considered in the study? In the figshare dataset there are 147177 rows of data. If the article reports n=114582 papers (does this mean that Mendeley has a coverage of 78% of F1000Prime recommended papers?) Please, clarify this point. Also what are the publication years of the publications finally considered?”

We have revised the Methods section to clarify this point. The employed F1000Prime publication set consists of 114,582 unique papers. Each paper has at least one recommendation. The papers with multiple recommendations occur multiple times. Therefore, the F1000Prime publication set has 147,177 entries. Of course, we analyzed only the 114,582 unique papers.
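The relation between entries and unique papers can be illustrated with a minimal Python sketch. The rows and the `pmid` field below are hypothetical; only the totals 147,177 and 114,582 come from the text above. The ratio of the two totals is about 78%, which matches the figure the referee read as Mendeley coverage, although it results from papers with multiple recommendations and not from coverage.

```python
# Hypothetical recommendation entries: one row per recommendation,
# so a paper with several recommendations appears several times.
entries = [
    {"pmid": "111", "recommendation": "A"},
    {"pmid": "111", "recommendation": "B"},
    {"pmid": "222", "recommendation": "C"},
]


def unique_paper_ids(rows, key="pmid"):
    """Return the set of distinct paper identifiers among the rows."""
    return {row[key] for row in rows}


papers = unique_paper_ids(entries)
print(len(entries), "entries ->", len(papers), "unique papers")

# Totals reported in the text: 147,177 entries and 114,582 unique papers.
share_unique = 114_582 / 147_177
print(f"unique papers per entry: {share_unique:.1%}")
```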

“In the paper there is only a brief comment to the fact that Mendeley may not necessary measure actual "reads" (at the end of the introductory section). In fact there seems to be some confusion in what are "readers" and readerships (or simply the act of adding papers by Mendeley users). For example, in the results section it is stated that "we found 6,263,913 Mendeley readers". This is a bit misleading. These 6 millions are events of the act of adding documents in their Mendeley libraries by an undetermined number of different Mendeley users. I recommend to revise the consistency of the vocabulary in this regard. This clarification is important for example to understand how the matrix of Mendeley readerships is constructed (see minor comment below).”

We are aware of the fact that we measure reader counts (or bookmarks to papers) and not individual readers. We have revised the parts which might have given a different impression. Also, the results section acknowledges this fact in the revised version.
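The difference between reader counts and individual readers can be sketched as follows. The user libraries are hypothetical and serve only to show that summing per-paper reader counts yields the number of bookmark events, which can exceed the number of distinct users.

```python
from collections import Counter

# Hypothetical libraries: each Mendeley user has bookmarked some papers.
libraries = {
    "user_1": {"paper_a", "paper_b"},
    "user_2": {"paper_a"},
    "user_3": {"paper_a", "paper_b", "paper_c"},
}


def reader_counts(libs):
    """Count, per paper, how many user libraries contain it."""
    counts = Counter()
    for papers in libs.values():
        counts.update(papers)
    return counts


counts = reader_counts(libraries)
total_reader_counts = sum(counts.values())  # bookmark events over all papers
distinct_users = len(libraries)             # individual people

print(total_reader_counts, "reader counts from", distinct_users, "users")
```

In this toy example, three users produce six reader counts, because each user is counted once for every paper in their library.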

“The results presented are not very surprising. Basically around 86% of F1000Prime publications are saved by Biomedical users, which is what would be expected considering the nature of F1000Prime. So what is the added value of this analysis? Are F1000 recommended publications more interdisciplinary than other biomedical publications as captured by Mendeley users? The network (Figure 1) and ‘community’ analysis are also not very informative. What does it mean that Bio, Med, Eng, etc. belong to the same community? I don't see the reason why Psy is not in the same community as Med. The authors say that biology is related to chemistry while not to environmental sciences. I don't see the logic of this result. Why chemistry is more linked to biology that Environmental sciences? (which intuitively I would expect to be related to biology). What does it mean the “central location” of engineering, material sciences and computer and information science by having connections to many other disciplines? Does it mean that users from these areas are more multidisciplinary than other Mendeley users? I think a much stronger case needs to be made to explain the value of these analyses and results.”

First, a study is also valuable if it confirms an expected result. Second, this study also discovers unexpected readership connections. We added more explanations about the network analysis in the revised version of the paper.

“The Kreiman & Maunsell reference is missing.”

Unfortunately, many references got lost in a very late stage of the initial version of the manuscript. The Kreiman & Maunsell reference was cited in the text but did not appear in the reference list. We have recovered the lost references in the revised version.

“PubMedIDs and DOIs are used as the linking element. Although using PubMedIDs and DOIs is straightforward (and they have been used in other studies), problems with the metadata and ids recorded in Mendeley have been reported and need to be acknowledged (http://www.asis.org/SIG/SIGMET/data/uploads/sigmet2014/zahedi.pdf).”

As this manuscript was intended as a short article we have included a brief note regarding this in the new version of the paper.

“It is stated that ‘a paper which is read by Mendeley users of different disciplines ... constitutes a connection between these disciplines’. So, would this be a kind of "Mendeley readerships coupling"? For example Kraker et al. (2015) analyzed ‘co-readership’ networks and they briefly discussed the idea of bibliographic coupling and co-citation. In this paper a different approach as compared to Kraker and colleagues seems to be taken, i.e. here the focus seems to be more on readerships coupling. I think a discussion of the analytical approach would be necessary here.”

We have included the reference and a corresponding discussion in the introduction of the revised version of the paper.

“When explaining the construction of the matrix for the network analysis, how are the links exactly determined? In other words, from a matrix point of view, if the disciplines are the columns and the rows, what is exactly counted in the cells? To clarify this point would be very helpful to the reader and also for other potential scholars interested in the methodology.”

We have extended the description of the network analysis in the revised version of the paper.
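One possible way to fill such a discipline-by-discipline matrix is sketched below in Python. This is an illustrative assumption, not the counting rule of the paper, which is given in its revised Methods section: here a cell is increased by one for every paper that has at least one reader count in both disciplines. The per-paper reader counts are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reader counts per paper, broken down by discipline.
paper_readers = {
    "paper_a": {"Biology": 12, "Medicine": 5, "Chemistry": 1},
    "paper_b": {"Biology": 3, "Medicine": 2},
    "paper_c": {"Physics": 4},
}


def cooccurrence(papers):
    """Count, for each unordered discipline pair, the papers saved in both.

    Assumed rule (not taken from the paper): a pair co-occurs on a paper
    when both disciplines have a reader count greater than zero.
    """
    cells = Counter()
    for by_discipline in papers.values():
        present = sorted(d for d, n in by_discipline.items() if n > 0)
        for pair in combinations(present, 2):
            cells[pair] += 1
    return cells


cells = cooccurrence(paper_readers)
for (first, second), weight in sorted(cells.items()):
    print(first, second, weight)
```

Under this rule the matrix is symmetric, so only one triangle needs to be stored. Other choices, such as weighting by the reader counts themselves, would give different edge weights.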
Competing Interests: No competing interests were disclosed.

COMMENTS ON THIS REPORT

Author Response 08 May 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

08 May 2015

Author Response

“The paper lacks in my view a solid justification of its research questions. The two research questions proposed in the paper (“are F1000Prime papers only read by people from biomedicine ... Continue reading “The paper lacks in my view a solid justification of its research questions. The two research questions proposed in the paper (“are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested?”, and “Which disciplines read F1000Prime papers frequently or seldom together?”) are too general and basically the results reported suggest: “1. Yes, F1000 papers are mostly saved by biomedical Mendeley users” and “2. yes, there are reasonable connections between disciplines (e.g. Biology and Medicine) while some others are not easy to understand”. After all, the reader is left with the questions "why is the analysis of the disciplines of the Mendeley users of F1000Prime recommended publications relevant? What have I learnt from this paper?". For example, do Mendeley users link F1000 papers thematically in a special manner? Or, does the use of Mendeley readerships have a different characteristic in the thematic analysis of disciplines that wouldn't be possible with other methods (e.g. bibliographic coupling)? Is the Mendeley users ‘crowdsourced’ disciplinary classification valid/useful for the classification of F1000Prime papers?”

The discussion section has been extended. However, we abstained from having a very long discussion section, since the paper was intended as a shorter article.

“There are some methodological omissions. For example, what is the exact number of publications finally considered in the study? In the figshare dataset there are 147177 rows of data. If the article reports n=114582 papers (does this mean that Mendeley has a coverage of 78% of F1000Prime recommended papers?) Please, clarify this point. Also what are the publication years of the publications finally considered?”

We have revised the Methods section to clarify this point. The employed F1000Prime publication set consists of 114,582 unique papers. Each paper has at least one recommendation. The papers with multiple recommendations occur multiple times. Therefore, the F1000Prime publication set has 147,177 entries. Of course, we analyzed only the 114,582 unique papers.

“In the paper there is only a brief comment to the fact that Mendeley may not necessary measure actual "reads" (at the end of the introductory section). In fact there seems to be some confusion in what are "readers" and readerships (or simply the act of adding papers by Mendeley users). For example, in the results section it is stated that "we found 6,263,913 Mendeley readers". This is a bit misleading. These 6 millions are events of the act of adding documents in their Mendeley libraries by an undetermined number of different Mendeley users. I recommend to revise the consistency of the vocabulary in this regard. This clarification is important for example to understand how the matrix of Mendeley readerships is constructed (see minor comment below).”

We are aware of the fact that we measure reader counts (or bookmarks to papers) and not individual readers. We have revised the parts which might have given a different impression. Also, the results section acknowledges this fact in the revised version.

“The results presented are not very surprising. Basically around 86% of F1000Prime publications are saved by Biomedical users, which is what would be expected considering the nature of F1000Prime. So what is the added value of this analysis? Are F1000 recommended publications more interdisciplinary than other biomedical publications as captured by Mendeley users? The network (Figure 1) and ‘community’ analysis are also not very informative. What does it mean that Bio, Med, Eng, etc. belong to the same community? I don't see the reason why Psy is not in the same community as Med. The authors say that biology is related to chemistry but not to environmental sciences. I don't see the logic of this result. Why is chemistry more strongly linked to biology than environmental sciences (which intuitively I would expect to be related to biology)? What does the “central location” of engineering, material sciences and computer and information science, with connections to many other disciplines, mean? Does it mean that users from these areas are more multidisciplinary than other Mendeley users? I think a much stronger case needs to be made to explain the value of these analyses and results.”

First, a study is also valuable if it confirms an expected result. Second, this study also reveals unexpected readership connections. We have added more explanations of the network analysis in the revised version of the paper.

“The Kreiman & Maunsell reference is missing.”

Unfortunately, many references were lost at a very late stage of preparing the initial version of the manuscript. The Kreiman & Maunsell reference was cited in the text but did not appear in the reference list. We have restored the lost references in the revised version.

“PubMedIDs and DOIs are used as the linking element. Although using PubMedIDs and DOIs is straightforward (and they have been used in other studies), problems with the metadata and ids recorded in Mendeley have been reported and need to be acknowledged (http://www.asis.org/SIG/SIGMET/data/uploads/sigmet2014/zahedi.pdf).”

As this manuscript was intended as a short article we have included a brief note regarding this in the new version of the paper.

“It is stated that ‘a paper which is read by Mendeley users of different disciplines ... constitutes a connection between these disciplines’. So, would this be a kind of "Mendeley readerships coupling"? For example Kraker et al. (2015) analyzed ‘co-readership’ networks and they briefly discussed the idea of bibliographic coupling and co-citation. In this paper a different approach as compared to Kraker and colleagues seems to be taken, i.e. here the focus seems to be more on readerships coupling. I think a discussion of the analytical approach would be necessary here.”

We have included the reference and a corresponding discussion in the introduction of the revised version of the paper.

“When explaining the construction of the matrix for the network analysis, how are the links exactly determined? In other words, from a matrix point of view, if the disciplines are the columns and the rows, what is exactly counted in the cells? To clarify this point would be very helpful to the reader and also for other potential scholars interested in the methodology.”

We have extended the description of the network analysis in the revised version of the paper.

Competing Interests: No competing interests were disclosed.

Version 2

VERSION 2 PUBLISHED 11 Feb 2015

Open Peer Review

Reviewer Status

Reviewer Reports

	Invited Reviewers
	1	2
Version 2 (revision) 08 May 15	read	read
Version 1 11 Feb 15	read	read

Rodrigo Costas, Leiden University, Leiden, The Netherlands
Stefanie Haustein, Université de Montréal, Montréal, Canada

Reviewer Report

11 Aug 2015 | for Version 2

Stefanie Haustein, École de bibliothéconomie et des sciences de l’information (EBSI), Université de Montréal, Montréal, Canada

Not Approved

Competing Interests

No competing interests were disclosed.

Responses (1)

Author Response

01 Sep 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

As also stated in our response to Rodrigo Costas, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research Notes include single-finding papers that can be reported with one or two illustrations (figures/tables), and lab protocols. Posters from conferences or internal meetings may be summarized as Research Notes” (http://f1000research.com/for-authors/article-guidelines).

The aim of our paper was to provide an overview of the readership of F1000Prime papers: Which scientists from the life sciences and other disciplines read F1000Prime papers? The post-publication peer review system Faculty of 1000 and the journal F1000Research are very closely connected. Therefore, we selected this journal for the submission of our Research Note. As F1000Research is not a specialized scientometrics journal, our study employs basic bibliometric analysis rather than advanced bibliometric techniques (such as normalization and more elaborate network analysis techniques). In contrast to what the reviewers expect, we did not plan to provide insights into the meaning of altmetrics, a comparison of different network analysis techniques, or an extensive literature review. In our opinion, this would be better suited to a full paper in a different journal (specialized in scientometrics), with the analyses focused on generalizable results.

It seems to us that both reviewers would like to see another type of study and depth of analysis than we have intended. For example, the recommendation to normalize the reader counts: It is not possible anymore to gather reliable reference values now, as our data set was gathered in December 2014 and altmetric data change very quickly. Several other recent studies presented non-normalized altmetric data (e.g. Haustein & Lariviere, 2014; Sud & Thelwall, 2015; Zahedi, Costas, & Wouters, 2014). Additionally, there are still no established procedures for normalization of altmetric data and to use the normalized data for network analyses. Such methods would have to be developed and tested first. Usage of normalized data is especially customary for altmetric studies which aim towards research evaluation. As stated above, we did not plan to evaluate the impact of F1000Prime papers but to provide an overview of the disciplinary affiliations of the readership. Similarly, the other major points of the reviewers (more extensive literature overview and a more detailed network analysis) would also require another type of study and depth of analysis than we have intended.

Thus, we refrain from producing a new version of the manuscript.

Competing Interests

No competing interests were disclosed.

Reviewer Report

06 Aug 2015 | for Version 2

Rodrigo Costas, Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands

Approved With Reservations

The paper still lacks solid (and to some extent relevant) research questions. In my first review I pointed to the weakness of the two research questions, which essentially remain the same. Even for answering the proposed questions one wonders why not using more comprehensive approaches. For example, what is the value of knowing whether F1000Prime papers are read by biomedicine readers? Wouldn't it be better to compare these distributions with the overall distribution of readers with other biomedical sources? More comparative analysis could be performed in order to be able to provide more meaningful statements and results.
Some of the new explanations also open new questions and may be subject of more explanations and discussion. For example, it seems that the co-occurrence readership matrix is constructed based on the number of readerships determined by the smallest discipline (when a paper is read by users from more than one discipline). Why is this approach selected? Wouldn't other approaches be also feasible? Would the results be substantially different? Have the authors considered the potential limitations of this approach? For example, this choice causes that smaller disciplines (i.e. with less overall users in Mendeley) may appear as more disconnected simply because their values will likely be smaller as compared to the bigger disciplines. I think the authors could have the chance to present a more thorough discussion on the choice of this approach, which could eventually be considered as critical for the future development of this type of analysis.
The network results are still not clear to me (see my previous review). What is the value or usefulness of these results? Actually, based on my previous comment, I also wonder now if the linkages with fields like Chemistry or Physics may be just the effect of the bigger size of these communities of readers in Mendeley. In other words, I wonder if this map is just the result of the size effect of the distribution of Mendeley users. I think this issue needs to be (at least) discussed.
Other aspects (also from my previous review) include that the period of analysis is still not clear. Have all publications in F1000Prime been considered regardless of their year of publication? The coverage of F1000Prime papers in Mendeley is also not clear to me (do all F1000Prime papers have readerships in Mendeley?). Have F1000Prime papers with zero readers been excluded from the analysis (particularly for the reporting of the average 54.67 reader counts per paper)? Also, the authors claim that “the (sub) discipline in Mendeley is self-assigned and not mandatory”. Actually, when creating a new account in Mendeley the discipline is mandatory but the sub-discipline is not. This makes me wonder whether the claim that “readers (74.94%) assign the “miscellaneous” sub-discipline of their discipline” is correct, or whether this “miscellaneous” label is actually added by default by Mendeley (or whether there have been changes in Mendeley’s policy for self-reporting users’ disciplinary background). I guess a more critical view should be considered here. The authors have added in the new version the claim that “this study shows that Mendeley data can be used to investigate meaningfully the readership of a set of publications”. I’m not sure what these “meaningful” uses are.
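
The construction that this report attributes to the paper (a link between two disciplines weighted by the smaller of their two reader counts on each shared paper) can be sketched as follows. This is only an illustration of the reviewer's reading, not the authors' confirmed procedure; the function name, the data layout and the counts are made up.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_min(papers):
    """Sum, over all papers, min(count_a, count_b) for each discipline pair.

    `papers` is a list of dicts mapping discipline -> reader count
    (a hypothetical layout chosen for this sketch).
    """
    weights = defaultdict(int)
    for counts in papers:
        present = sorted(d for d, n in counts.items() if n > 0)
        for a, b in combinations(present, 2):
            weights[(a, b)] += min(counts[a], counts[b])
    return dict(weights)

# Made-up reader counts per discipline for three papers.
papers = [
    {"Biology": 40, "Medicine": 25, "Chemistry": 3},
    {"Biology": 10, "Chemistry": 8},
    {"Medicine": 5, "Psychology": 2},
]
weights = cooccurrence_min(papers)
for pair, weight in sorted(weights.items()):
    print(pair, weight)
```

Because each contribution is capped by the smaller count, a discipline with few readers overall can only accumulate small link weights, which is the size effect raised in this report.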

Competing Interests

No competing interests were disclosed.

Responses (1)

Author Response

01 Sep 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

As also stated in our response to Stefanie Haustein, we have submitted our paper as a Research Note. This document type is explained as follows on the F1000Research website: “Research Notes include single-finding papers that can be reported with one or two illustrations (figures/tables), and lab protocols. Posters from conferences or internal meetings may be summarized as Research Notes” (http://f1000research.com/for-authors/article-guidelines).

The aim of our paper was to provide an overview of the readership of F1000Prime papers: Which scientists from the life sciences and other disciplines read F1000Prime papers? The post-publication peer review system Faculty of 1000 and the journal F1000Research are very closely connected. Therefore, we selected this journal for the submission of our Research Note. As F1000Research is not a specialized scientometrics journal, our study employs basic bibliometric analysis rather than advanced bibliometric techniques (such as normalization and more elaborate network analysis techniques). In contrast to what the reviewers expect, we did not plan to provide insights into the meaning of altmetrics, a comparison of different network analysis techniques, or an extensive literature review. In our opinion, this would be better suited to a full paper in a different journal (specialized in scientometrics), with the analyses focused on generalizable results.

It seems to us that both reviewers would like to see another type of study and depth of analysis than we have intended. For example, the recommendation to normalize the reader counts: It is not possible anymore to gather reliable reference values now, as our data set was gathered in December 2014 and altmetric data change very quickly. Several other recent studies presented non-normalized altmetric data (e.g. Haustein & Lariviere, 2014; Sud & Thelwall, 2015; Zahedi, Costas, & Wouters, 2014). Additionally, there are still no established procedures for normalization of altmetric data and to use the normalized data for network analyses. Such methods would have to be developed and tested first. Usage of normalized data is especially customary for altmetric studies which aim towards research evaluation. As stated above, we did not plan to evaluate the impact of F1000Prime papers but to provide an overview of the disciplinary affiliations of the readership. Similarly, the other major points of the reviewers (more extensive literature overview and a more detailed network analysis) would also require another type of study and depth of analysis than we have intended.

Thus, we refrain from producing a new version of the manuscript.

Competing Interests

No competing interests were disclosed.

Reviewer Report

31 Mar 2015 | for Version 1

Stefanie Haustein, École de bibliothéconomie et des sciences de l’information (EBSI), Université de Montréal, Montréal, Canada

Not Approved

The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?

It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.

The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?
2) Who publishes, reads, and cites papers? An analysis of country information
3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime
4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime
5) Overlay maps based on Mendeley data: The use of altmetrics for readership networks
6) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS (altmetrics) and F1000Prime (paper tags)
7) The authors also mention Haunschild, Stefaner & Bornmann (in preparation), which seems to focus on the geographic location of Mendeley users of the same dataset. Could this aspect not be integrated in the present study?
The reference list is particularly poor and not acceptable in its current form. There have been plenty of studies by Thelwall, Mohammadi, Costas, Zahedi, Kraker, Haustein and others that have evaluated Mendeley reader counts - these are completely ignored. The introduction should be additionally supported by core altmetrics publications by Priem (particularly his overview of altmetrics in Beyond Bibliometrics which includes a definition of altmetrics), Piwowar, regarding reference managers Taraborelli and Haustein & Siebenlist, as well as the above mentioned authors for research on altmetrics. Peter Kraker's work on readership networks based on Mendeley users needs to be considered as well.
The parallels between early bibliometrics research and current altmetrics lack references either to the particular bibliometrics studies or detailed discussions of this parallel, for example in Haustein, Bowman & Costas.

In addition, some of the references cited in the text are also not listed in the reference list. This needs proper revision.
The dataset is not clearly defined. What is the F1000 Prime publication dataset? How and when was it retrieved? Are those all recommendations ever made in the database? What publications do the 114,582 papers refer to (journals, publications years, document types, discipline, research field, etc.)? What is the metadata quality of these entries; in particular, how many of these have a correct DOI, PMID or both? In how far does the availability of identifiers as well as the characteristics of papers (publication year, journals) influence and bias the matching with and availability in Mendeley? Regarding the matching of results: how many unique documents do the 6,263,913 reader counts refer to? What is the percentage of documents that could not be found in Mendeley? Readers/users should be distinguished from reader counts - to avoid the implication that there are 6.2 million readers. Mendeley has around 3 million users (2.8 million as of February 2014; Haustein & Larivière), who create reader(ship) counts by adding documents to their libraries.
The description of the methods for the network analysis is too brief: How were co-occurrences calculated? How were they normalized? Due to the density of 1 (i.e., all nodes are connected), the network layout is not very meaningful and not easy to interpret, often counterintuitive. The informativeness of the network could be improved by removing weak links to obtain a more meaningful network structure, where central (i.e., well-connected) nodes are positioned in the center of the network and less important ones in the periphery. Moreover, similar nodes as detected by the clustering algorithm (yellow and green in Figure 1) should be placed close together. In addition, it would make sense to include self-loops for papers saved by users of the same discipline to highlight homogeneous user groups. As the authors use the VOSviewer clustering method, why was VOSviewer not chosen for the mapping? In my opinion, it provides much more meaningful robust networks and better visualizations than Pajek. Other alternatives are Gephi, GUESS, UCInet, etc.
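
The suggestion to remove weak links can be illustrated with a minimal sketch. The edge weights and the threshold below are invented for illustration, and the helper names are hypothetical; neither the paper nor the report specifies a cut-off.

```python
def density(n_nodes, edges):
    """Share of possible undirected links (without self-loops) that exist."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible

def drop_weak_links(edges, min_weight):
    """Keep only links whose weight reaches the (arbitrary) threshold."""
    return {pair: w for pair, w in edges.items() if w >= min_weight}

# Invented link weights between four disciplines; every pair is connected.
edges = {
    ("Biology", "Chemistry"): 11,
    ("Biology", "Medicine"): 25,
    ("Biology", "Psychology"): 1,
    ("Chemistry", "Medicine"): 3,
    ("Chemistry", "Psychology"): 1,
    ("Medicine", "Psychology"): 2,
}

full = density(4, edges)
pruned = drop_weak_links(edges, min_weight=3)
sparse = density(4, pruned)
print(f"density before: {full}, after: {sparse}")
```

With all pairs connected the density is 1 and the layout carries little structure; after pruning, only the stronger links remain. The result depends heavily on the chosen threshold, so any such cut-off would need to be justified.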

Regarding the interpretation of the network, the authors state that "[t]he thicker and darker the edges between two disciplines, the more frequently [the users] have read a F1000Prime paper jointly". Should it not rather be that "[t]he thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines"?
The grouping into clusters seems to a certain extent counterintuitive: Why is environmental science grouped together with psychology instead of biology? Could this be introduced by (the lack of) normalization? These counterintuitive results need to be discussed!
The discussion needs to be extended and a conclusion is missing. It is not clear what the study actually shows/proves and to what extent the few results (sum of readers per discipline) warrant a separate publication. The datasets of F1000 recommendations and Mendeley include many other pieces of interesting information such as the recommendation scores, F1000 tags (from F1000) and the geographic location and academic status of users (from Mendeley), which could be included to make the study much stronger and contribute to the understanding of readership counts and the effect of F1000 recommendations. In addition, subject classifications for the papers and locations of authors could be included to show if readers come from the same or different disciplines and countries as the paper and authors. I would also recommend the above mentioned extension of the study to include papers that were not recommended in F1000 to measure the effect of recommendations on readership counts. Combining these different aspects, one could investigate whether recommendations on F1000 lead to more diverse user groups on Mendeley in terms of discipline, country and academic status. For example, is a biology paper recommended and tagged as "good for teaching" on F1000 read by more Bachelor students from biology than a biology paper that was not recommended and tagged as such?

Minor revisions:

The first sentence "Interest in the broad impact of research (Bornmann, 2012, 2013) has resulted in new forms of impact measurements." simplifies the situation too much: there is also the technological push and publishers' interest, which resulted in the availability of new metrics; moreover, these metrics have not yet been validated as measuring impact. Also, the references to support interest in broad impact measures should refer to sources that show these interests, such as REF etc., instead of papers by Bornmann, which claim that these interests exist.
Regarding the use of altmetrics: apart from Snowball Metrics, they are also applied in the sense that various journals now show them to indicate the "impact" and use of articles (for example, PLOS journals, Nature, Wiley journals etc.). Funders have also declared interest in using these metrics (for example, see Dinsmore, Allen & Dolby).
"Since data from Mendeley can be received by an Application Programming Interface (API) without any problems" - this is not completely true, there are a lot of issues with data quality and reliability for Mendeley, see for example: Bar-Ilan and Zahedi, Haustein & Bowman. These limitations need to be acknowledged in particular because the study is based on matching DOIs and PMIDs - Mendeley entries without these or incorrect IDs will be lost. What is the error rate introduced by using these identifiers only?
In the methods, the authors should specify what was done when problems with the API connection occurred. How was it ensured that data was not lost due to these problems?
It would be helpful to add the number of unique papers and mean (+ std. dev.) number of reader counts per paper per discipline to Table 1 and include also the other disciplines with less than 1% of reader counts.

Competing Interests

No competing interests were disclosed.

Responses (1)

Author Response

08 May 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

“The research questions lack clarity and the motivation of the study should not solely be based on the availability of datasets (Mendeley and F1000). It should be emphasized in how far this study is different from previous work, in particular Mohammadi & Thelwall, who analyzed very similar aspects on Mendeley and, in addition, compared the discipline of users to that of the citing papers. For the present study, it is not clear what the authors expect to find (how much biology readers are normal?) and what the data is able to show: Do papers recommended on F1000 have Mendeley users from more diverse disciplines than expected?”

The motivation of the study is not solely based on availability of the datasets. The research questions do reflect this.

We have added the reference (Mohammadi & Thelwall). Clearly, Mohammadi & Thelwall focus on a specific publication year and on different disciplines than our study. Technical differences are highlighted in the Methods section, subsection Use of the Mendeley API. Mohammadi & Thelwall used the old API, where only the top 3 categories were provided, as percentages. Our study used the new API, where absolute reader numbers are provided and the top 3 restriction is no longer in place. All sub-disciplines with at least one reader are available in the API.

We do not know how many readers from biology are normal for the F1000Prime publication set. This is one of the reasons why we pursued this research. As we have no real expectation value for F1000Prime readers from biology, it is not possible to judge if the observed reader counts are as expected, higher, or lower.

“It would be much more interesting and valuable to observe the effect of being recommended on F1000 by comparing Mendeley readership counts and disciplines of users of the dataset used in this study with a control set of papers that were not recommended. This could be achieved by analyzing and comparing the data for the population of PubMed articles for a certain set of recent years: Does the F1000 recommendation provide visibility to papers that increases the number of readers on Mendeley as well as the diversity of the audience in terms of disciplines and academic status? PubMed/Medline could also provide a meaningful subject classification for papers to measure interdisciplinary knowledge flows from authors to readers.”

While this is an interesting question, it is outside the scope of our current research question. It is also not easy (maybe even impossible) to answer. Even if a paper was recommended in F1000Prime and has a very high Mendeley reader count, we do not know whether this is due to the F1000Prime recommendation. It may be that the paper is well written and interesting, attracted many Mendeley readers, and was recommended in F1000Prime.

“The authors also need to clarify in how far the present study differs and distinguishes itself from their other publications on similar topics and the same datasets:

1) Who reads F1000Prime publications?”

Paper 1 is actually the preprint version of the current paper. We uploaded the manuscript to Figshare after submitting it to F1000Research.

“2) Who publishes, reads, and cites papers? An analysis of country information”

Paper 2 is concerned with the academic status information of Mendeley readers of F1000Prime papers. Furthermore, the type of analysis is completely different.

“3) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime”

This is an old version of Paper 6.

“4) Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime”

Paper 4 focuses on Twitter counts provided by Altmetric.

“5) Overlay maps based on Mendeley data: The use of altmetrics for readership networks”

Paper 5 uses a different data set (WoS publication year 2012) than our current paper. It focuses on the generation of overlay maps and is already in press in a different journal.

“6) Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS (altmetrics) and F1000Prime (paper tags)”

Paper 6 studied the intersection of altmetrics data from PLoS and F1000Prime publications. This intersection is rather small, with 1082 papers. Our current paper studies Mendeley reader counts of 114,582 papers, as noted in the Section Methods.

“7) The authors also mention Haunschild, Stefaner & Bornmann (in preparation), which seems to focus on the geographic location of Mendeley users of the same dataset. Could this aspect not be integrated in the present study?”

This paper is already in press and will be presented at the ISSI 2015 conference. Thus, it cannot be integrated into the present study. Furthermore, the topics of both papers are too different, so that it would not be possible to merge both into a concise article.

Although the topics of papers 2-7 might be similar (all deal with altmetrics), the focus of each paper is very different. In many cases, the data set is also very different.

“The reference list is particularly poor and not acceptable in its current form. There have been plenty of studies by Thelwall, Mohammadi, Costas, Zahedi, Kraker, Haustein and others that have evaluated Mendeley reader counts - these are completely ignored. The introduction should be additionally supported by core altmetrics publications by Priem (particularly his overview of altmetrics in Beyond Bibliometrics which includes a definition of altmetrics), Piwowar, regarding reference managers Taraborelli and Haustein & Siebenlist, as well as the above mentioned authors for research on altmetrics. Peter Kraker's work on readership networks based on Mendeley users needs to be considered as well.

The parallels between early bibliometrics research and current altmetrics lack references either to the particular bibliometrics studies or detailed discussions of this parallel, for example in Haustein, Bowman & Costas.”

Priem’s overview of Altmetrics in the book “Beyond Bibliometrics” is already referenced in the text. Haustein, S., & Larivière, V. (2014) is also cited. We have extended the literature review in the new version of the manuscript somewhat. Considering that this was intended to be a shorter article, we think it is not appropriate to include an exhaustive literature review.

“In addition, some of the references cited in the text are also not listed in the reference list. This needs proper revision.”

We thank the referee for this comment. Unfortunately, a large part of the reference list was lost due to a problem with our software in the final stages between submission and publication. We have restored the lost references in the revised version.

“The dataset is not clearly defined. What is the F1000 Prime publication dataset? How and when was it retrieved? Are those all recommendations ever made in the database? What publications do the 114,582 papers refer to (journals, publications years, document types, discipline, research field, etc.)? What is the metadata quality of these entries; in particular, how many of these have a correct DOI, PMID or both? In how far does the availability of identifiers as well as the characteristics of papers (publication year, journals) influence and bias the matching with and availability in Mendeley? Regarding the matching of results: how many unique documents do the 6,263,913 reader counts refer to? What is the percentage of documents that could not be found in Mendeley? Readers/users should be distinguished from reader counts - to avoid the implication that there are 6.2 million readers. Mendeley has around 3 million users (2.8 million as of February 2014; Haustein & Larivière), who create reader(ship) counts by adding documents to their libraries.”

The F1000Prime publication set we used consists of 114,582 journal articles in journals such as Nature, PNAS, Science, Cell, PLoS ONE, etc. Each paper has at least one recommendation. We have added this information in the revised version of the manuscript. We checked the DOIs and PubMedIDs. There are only two wrong (duplicated) DOIs in the publication set, and we did not find a single incorrect PubMedID. Considering that this was intended to be a shorter article, we have included a brief note on this in the new version of the paper.

The data set is deposited at the Figshare link given in the paper. A new Figshare link has been added; it also provides a network file that can be loaded into Pajek to inspect detailed properties of the network.

We have added descriptions of the data set, the problems in retrieving reader data, and the relation between reader counts and unique documents.

“The description of the methods for the network analysis is too brief: How were co-occurrences calculated? How were they normalized? Due to the density of 1 (i.e., all nodes are connected), the network layout is not very meaningful and not easy to interpret, often counterintuitive. The informativeness of the network could be improved by removing weak links to obtain a more meaningful network structure, where central (i.e., well-connected) nodes are positioned in the center of the network and less important ones in the periphery. Moreover, similar nodes as detected by the clustering algorithm (yellow and green in Figure 1) should be placed close together. In addition, it would make sense to include self-loops for papers saved by users of the same discipline to highlight homogeneous user groups. As the authors use the VOSviewer clustering method, why was VOSviewer not chosen for the mapping? In my opinion, it provides much more meaningful robust networks and better visualizations than Pajek. Other alternatives are Gephi, GUESS, UCInet, etc.”

Based on these suggestions, we have replaced Figure 1 with a new version and extended the methodological description of the network analysis. VOSviewer was not chosen as the visualization program because it shows co-occurrences as shorter distances rather than as thicker connection lines. We prefer the thicker connection lines for this paper. Unfortunately, the self-loops could not be included due to system limits of Pajek.

“Regarding the interpretation of the network, the authors state that ‘[t]he thicker and darker the edges between two disciplines, the more frequently [the users] have read a F1000Prime paper jointly’. Should it not rather be that ‘[t]he thicker and darker the edges between two disciplines, the more frequently papers were saved by users of these two particular disciplines’?
The grouping into clusters seems to a certain extent counterintuitive: Why is environmental science grouped together with psychology instead of biology? Could this be introduced by (the lack of) normalization? These counterintuitive results need to be discussed!”

We have revised the formulation accordingly and included more discussion on the counterintuitive results. Different normalization procedures lead to different results. Thus, we prefer to show the visualization without normalization.
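The statement that different normalization procedures lead to different results can be illustrated with a minimal sketch. It assumes two common similarity measures, the cosine and the association strength, which are not necessarily the procedures tried for the paper. All discipline counts below are invented for illustration and are not taken from the data set.

```python
from math import sqrt

# Illustrative numbers only (NOT from the paper).
# raw: number of papers with reader counts from both disciplines.
raw = {
    ("biology", "medicine"): 900,
    ("biology", "chemistry"): 300,
    ("environmental science", "psychology"): 40,
}
# totals: number of papers with reader counts from each discipline.
totals = {
    "biology": 1000,
    "medicine": 950,
    "chemistry": 320,
    "environmental science": 60,
    "psychology": 50,
}

def cosine(raw, totals):
    """Cosine normalization: c_ij / sqrt(s_i * s_j)."""
    return {(a, b): c / sqrt(totals[a] * totals[b]) for (a, b), c in raw.items()}

def association_strength(raw, totals):
    """Association strength: c_ij / (s_i * s_j)."""
    return {(a, b): c / (totals[a] * totals[b]) for (a, b), c in raw.items()}

def ranking(weights):
    """Discipline pairs ordered from strongest to weakest link."""
    return sorted(weights, key=weights.get, reverse=True)

print(ranking(raw))
print(ranking(cosine(raw, totals)))
print(ranking(association_strength(raw, totals)))
```

With these invented numbers, each of the three weightings orders the links differently: small disciplines that mostly co-occur with each other move up once the counts are normalized.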

“The discussion needs to be extended and a conclusion is missing. It is not clear what the study actually shows/proofs and in how far the few results (sum of readers per discipline) warrant a separate publication. The dataset of F1000 recommendations and Mendeley include many other pieces of interesting information such as the recommendation scores, F1000 tags (from F1000) and the geographic location and academic status of users (from Mendeley), which could be included to make the study much stronger and contribute to the understanding of readership counts and the effect of F1000 recommendations. In addition, subject classifications for the papers and locations of authors could be included to show if readers come from the same or different disciplines and countries as the paper and authors. I would also recommend the above mentioned extension of the study to include papers that were not recommended in F1000 to measure the effect of recommendations on readership counts. Combining these different aspects, one could investigate whether recommendations on F1000 lead to more diverse user groups on Mendeley in terms of discipline, country and academic status. For example, is a biology paper recommended and tagged as "good for teaching" on F1000 read by more Bachelor students from biology than a biology paper that was not recommended and tagged as such?”

Unfortunately, the other information from Mendeley (geographic location and academic status) is completely decoupled from the sub-discipline information. Therefore, it is not possible to define a “Bachelor student from biology” using current Mendeley data. We could create similar figures for academic status and geographic location, but they are not that interesting in this case. As a biomedical publication set is studied, the vast majority of readers are expected to come from medicine and biology. To some extent this expectation is fulfilled, but some interesting readership connections between other disciplines and biology and/or medicine are found. We have no such expectation to test regarding location or academic status.

“The first sentence 'Interest in the broad impact of research (Bornmann, 2012, 2013) has resulted in new forms of impact measurements.' simplifies the situation too much: there is also the technological push and publishers' interest who resulted in the availability of new metrics, plus these metrics have not been validated as measuring impact yet. Also, the references to support interest in broad impact measures should refer to sources that show these interests such as REF etc. instead of papers by Bornmann, which claim that these interests exist.”

We revised this sentence.

“Regarding the use of altmetrics: apart from Snowball Metrics, they are also applied in the sense that various journals now show them to indicate the "impact" and use of articles (for example, PLOS journals, Nature, Wiley journals etc.). Funders have also declared interest in using these metrics (for example, see Dinsmore, Allen & Dolby).”

Thank you for the suggestion. We have included this into the introduction.

"’Since data from Mendeley can be received by an Application Programming Interface (API) without any problems’ - this is not completely true, there are a lot of issues with data quality and reliability for Mendeley, see for example: Bar-Ilan and Zahedi, Haustein & Bowman. These limitations need to be acknowledged in particular because the study is based on matching DOIs and PMIDs - Mendeley entries without these or incorrect IDs will be lost. What is the error rate introduced by using these identifiers only?”

We have revised this sentence.

“In the methods, authors should specify what was done when problems with the API connection occurred. How was it insured that data was not lost due to these problems?”

We have added a more detailed description of the retrieval procedure.
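As a hedged illustration of how data loss from connection problems can be avoided in general, the sketch below retries failed requests and records the papers that still could not be retrieved, so they can be requested again later. It is not the retrieval procedure described in the paper. The `fetch` argument is a hypothetical callable standing in for whatever function performs the actual API request.

```python
import time

def fetch_with_retry(fetch, paper_id, max_attempts=3, wait_seconds=1.0):
    """Call fetch(paper_id), retrying when a ConnectionError is raised.

    `fetch` is a hypothetical callable supplied by the caller; the last
    ConnectionError is re-raised once all attempts are used up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(paper_id)
        except ConnectionError:
            if attempt == max_attempts:
                raise
            time.sleep(wait_seconds)

def collect(fetch, paper_ids, max_attempts=3, wait_seconds=1.0):
    """Return (results, failed): retrieved data and the IDs to re-run later."""
    results, failed = {}, []
    for paper_id in paper_ids:
        try:
            results[paper_id] = fetch_with_retry(
                fetch, paper_id, max_attempts, wait_seconds
            )
        except ConnectionError:
            failed.append(paper_id)
    return results, failed
```

Keeping the list of failed identifiers makes it possible to distinguish papers that are absent from the database from papers that were simply not reached.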

“It would be helpful to add the number of unique papers and mean (+ std. dev.) number of reader counts per paper per discipline to Table 1 and include also the other disciplines with less than 1% of reader counts.”

We have also added the other disciplines with less than 1% of the readers. Including the number of unique papers, the mean number of readers, and the standard deviations as well would make the table much harder to understand. All raw data are deposited at a Figshare link so that people interested in other types of analysis can perform them on their own.


Competing Interests

No competing interests were disclosed.


Reviewer Report


06 Mar 2015 | for Version 1

Rodrigo Costas, Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands


Approved With Reservations

The paper lacks in my view a solid justification of its research questions. The two research questions proposed in the paper (“are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested?”, and “Which disciplines read F1000Prime papers frequently or seldom together?”) are too general and basically the results reported suggest: “1. Yes, F1000 papers are mostly saved by biomedical Mendeley users” and “2. yes, there are reasonable connections between disciplines (e.g. Biology and Medicine) while some others are not easy to understand”. After all, the reader is left with the questions "why is the analysis of the disciplines of the Mendeley users of F1000Prime recommended publications relevant? What have I learnt from this paper?". For example, do Mendeley users link F1000 papers thematically in a special manner? Or, does the use of Mendeley readerships have a different characteristic in the thematic analysis of disciplines that wouldn't be possible with other methods (e.g. bibliographic coupling)? Is the Mendeley users ‘crowdsourced’ disciplinary classification valid/useful for the classification of F1000Prime papers?
There are some methodological omissions. For example, what is the exact number of publications finally considered in the study? In the figshare dataset there are 147177 rows of data. If the article reports n=114582 papers (does this mean that Mendeley has a coverage of 78% of F1000Prime recommended papers?) Please, clarify this point. Also what are the publication years of the publications finally considered?
In the paper there is only a brief comment to the fact that Mendeley may not necessary measure actual "reads" (at the end of the introductory section). In fact there seems to be some confusion in what are "readers" and readerships (or simply the act of adding papers by Mendeley users). For example, in the results section it is stated that "we found 6,263,913 Mendeley readers". This is a bit misleading. These 6 millions are events of the act of adding documents in their Mendeley libraries by an undetermined number of different Mendeley users. I recommend to revise the consistency of the vocabulary in this regard. This clarification is important for example to understand how the matrix of Mendeley readerships is constructed (see minor comment below).
The results presented are not very surprising. Basically around 86% of F1000Prime publications are saved by Biomedical users, which is what would be expected considering the nature of F1000Prime. So what is the added value of this analysis? Are F1000 recommended publications more interdisciplinary than other biomedical publications as captured by Mendeley users? The network (Figure 1) and ‘community’ analysis are also not very informative. What does it mean that Bio, Med, Eng, etc. belong to the same community? I don't see the reason why Psy is not in the same community as Med. The authors say that biology is related to chemistry while not to environmental sciences. I don't see the logic of this result. Why chemistry is more linked to biology that Environmental sciences? (which intuitively I would expect to be related to biology). What does it mean the “central location” of engineering, material sciences and computer and information science by having connections to many other disciplines? Does it mean that users from these areas are more multidisciplinary than other Mendeley users? I think a much stronger case needs to be made to explain the value of these analyses and results.

Other minor comments include:

The Kreiman & Maunsell reference is missing.
PubMedIDs and DOIs are used as the linking element. Although using PubMedIDs and DOIs is straightforward (and they have been used in other studies), problems with the metadata and ids recorded in Mendeley have been reported and need to be acknowledged (http://www.asis.org/SIG/SIGMET/data/uploads/sigmet2014/zahedi.pdf).
It is stated that "a paper which is read by Mendeley users of different disciplines ... constitutes a connection between these disciplines". So, would this be a kind of "Mendeley readerships coupling"? For example Kraker et al. (2015) analyzed ‘co-readership’ networks and they briefly discussed the idea of bibliographic coupling and co-citation. In this paper a different approach as compared to Kraker and colleagues seems to be taken, i.e. here the focus seems to be more on readerships coupling. I think a discussion of the analytical approach would be necessary here.
When explaining the construction of the matrix for the network analysis, how are the links exactly determined? In other words, from a matrix point of view, if the disciplines are the columns and the rows, what is exactly counted in the cells? To clarify this point would be very helpful to the reader and also for other potential scholars interested in the methodology.

Competing Interests

No competing interests were disclosed.


Responses (1)

Author Response

08 May 2015

Robin Haunschild, Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569, Germany

“The paper lacks in my view a solid justification of its research questions. The two research questions proposed in the paper (“are F1000Prime papers only read by people from biomedicine or are people from other disciplines also interested?”, and “Which disciplines read F1000Prime papers frequently or seldom together?”) are too general and basically the results reported suggest: “1. Yes, F1000 papers are mostly saved by biomedical Mendeley users” and “2. yes, there are reasonable connections between disciplines (e.g. Biology and Medicine) while some others are not easy to understand”. After all, the reader is left with the questions "why is the analysis of the disciplines of the Mendeley users of F1000Prime recommended publications relevant? What have I learnt from this paper?". For example, do Mendeley users link F1000 papers thematically in a special manner? Or, does the use of Mendeley readerships have a different characteristic in the thematic analysis of disciplines that wouldn't be possible with other methods (e.g. bibliographic coupling)? Is the Mendeley users ‘crowdsourced’ disciplinary classification valid/useful for the classification of F1000Prime papers?”

The discussion section has been extended. However, we abstained from having a very long discussion section, since the paper was intended as a shorter article.

“There are some methodological omissions. For example, what is the exact number of publications finally considered in the study? In the figshare dataset there are 147177 rows of data. If the article reports n=114582 papers (does this mean that Mendeley has a coverage of 78% of F1000Prime recommended papers?) Please, clarify this point. Also what are the publication years of the publications finally considered?”

We have revised the Methods section to clarify this point. The employed F1000Prime publication set consists of 114,582 unique papers. Each paper has at least one recommendation. The papers with multiple recommendations occur multiple times. Therefore, the F1000Prime publication set has 147,177 entries. Of course, we analyzed only the 114,582 unique papers.
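The relation between the 147,177 entries and the 114,582 unique papers can be sketched as a simple deduplication by paper identifier. The toy records and field layout below are hypothetical; only the two totals come from the text above.

```python
def unique_papers(entries):
    """Return paper IDs in first-seen order, dropping repeated recommendations.

    `entries` is a list of (paper_id, recommendation) pairs, one pair per
    recommendation (hypothetical layout for illustration).
    """
    return list(dict.fromkeys(paper_id for paper_id, _ in entries))

# Toy example: the first paper was recommended twice.
entries = [
    ("pmid:1", "recommendation A"),
    ("pmid:1", "recommendation B"),
    ("pmid:2", "recommendation C"),
]
print(unique_papers(entries))

# Totals reported in the text: the ratio is the share of unique papers
# among all entries, not the Mendeley coverage of the publication set.
share_unique = 114_582 / 147_177
print(round(share_unique, 2))
```

The ratio of roughly 0.78 matches the 78% the reviewer computed, which shows that this figure reflects multiple recommendations per paper rather than missing papers.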

“In the paper there is only a brief comment to the fact that Mendeley may not necessary measure actual "reads" (at the end of the introductory section). In fact there seems to be some confusion in what are "readers" and readerships (or simply the act of adding papers by Mendeley users). For example, in the results section it is stated that "we found 6,263,913 Mendeley readers". This is a bit misleading. These 6 millions are events of the act of adding documents in their Mendeley libraries by an undetermined number of different Mendeley users. I recommend to revise the consistency of the vocabulary in this regard. This clarification is important for example to understand how the matrix of Mendeley readerships is constructed (see minor comment below).”

We are aware of the fact that we measure reader counts (or bookmarks to papers) and not individual readers. We have revised the parts which might have given a different impression. Also, the results section acknowledges this fact in the revised version.

“The results presented are not very surprising. Basically around 86% of F1000Prime publications are saved by Biomedical users, which is what would be expected considering the nature of F1000Prime. So what is the added value of this analysis? Are F1000 recommended publications more interdisciplinary than other biomedical publications as captured by Mendeley users? The network (Figure 1) and ‘community’ analysis are also not very informative. What does it mean that Bio, Med, Eng, etc. belong to the same community? I don't see the reason why Psy is not in the same community as Med. The authors say that biology is related to chemistry while not to environmental sciences. I don't see the logic of this result. Why chemistry is more linked to biology that Environmental sciences? (which intuitively I would expect to be related to biology). What does it mean the “central location” of engineering, material sciences and computer and information science by having connections to many other disciplines? Does it mean that users from these areas are more multidisciplinary than other Mendeley users? I think a much stronger case needs to be made to explain the value of these analyses and results.”

First, a study is also valuable if it confirms an expected result. Second, this study also discovers unexpected readership connections. We added more explanations about the network analysis in the revised version of the paper.

“The Kreiman & Maunsell reference is missing.”

Unfortunately, many references were lost at a very late stage of preparing the initial version of the manuscript. The Kreiman & Maunsell reference was cited in the text but did not appear in the reference list. We have restored the lost references in the revised version.

“PubMedIDs and DOIs are used as the linking element. Although using PubMedIDs and DOIs is straightforward (and they have been used in other studies), problems with the metadata and ids recorded in Mendeley have been reported and need to be acknowledged (http://www.asis.org/SIG/SIGMET/data/uploads/sigmet2014/zahedi.pdf).”

As this manuscript was intended as a short article, we have included a brief note on this in the new version of the paper.

“It is stated that ‘a paper which is read by Mendeley users of different disciplines ... constitutes a connection between these disciplines’. So, would this be a kind of "Mendeley readerships coupling"? For example Kraker et al. (2015) analyzed ‘co-readership’ networks and they briefly discussed the idea of bibliographic coupling and co-citation. In this paper a different approach as compared to Kraker and colleagues seems to be taken, i.e. here the focus seems to be more on readerships coupling. I think a discussion of the analytical approach would be necessary here.”

We have included the reference and a corresponding discussion in the introduction of the revised version of the paper.

“When explaining the construction of the matrix for the network analysis, how are the links exactly determined? In other words, from a matrix point of view, if the disciplines are the columns and the rows, what is exactly counted in the cells? To clarify this point would be very helpful to the reader and also for other potential scholars interested in the methodology.”

We have extended the description of the network analysis in the revised version of the paper.
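To make the reviewer's question about the matrix cells concrete, the following is a minimal sketch of one possible construction. It assumes that a cell counts the papers that have at least one reader count from both disciplines; this counting rule is an assumption for illustration, and the revised Methods section remains the reference for the rule actually used. The toy reader counts are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(papers):
    """Count, for each pair of disciplines, the papers saved by both.

    `papers` maps a paper ID to a dict {discipline: reader count}.
    Assumed rule: a paper adds 1 to the cell of every pair of disciplines
    that each contribute at least one reader count to that paper.
    """
    counts = Counter()
    for readers in papers.values():
        present = sorted(d for d, n in readers.items() if n > 0)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts

# Invented toy data: three papers with reader counts per discipline.
papers = {
    "paper1": {"biology": 10, "medicine": 4},
    "paper2": {"biology": 3, "medicine": 1, "chemistry": 2},
    "paper3": {"physics": 1},
}
matrix = cooccurrence(papers)
print(matrix[("biology", "medicine")])
```

Because pairs are stored in sorted order, each cell of the symmetric matrix appears once; a paper read in only one discipline, such as the third toy paper, contributes no link.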


Competing Interests

No competing interests were disclosed.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions

[1] Adie E: Taking the Alternative Mainstream. Profesional De La Informacion. 2014; 23(4): 349–351.

[2] Bornmann L: Measuring the societal impact of research: research is less and less assessed on scientific impact alone--we should aim to quantify the increasingly important contributions of science to society. EMBO Reports. 2012; 13(8): 673–676.

[3] Bornmann L: What is societal impact of research and how can it be assessed? A literature survey. J Am Soc Inf Sci Technol. 2013; 64(2): 217–233.

[4] Colledge L: Snowball Metrics Recipe Book. Amsterdam, the Netherlands: Snowball Metrics program partners. 2014.

[5] de Nooy W, Mrvar A, Batagelj V: Exploratory social network analysis with Pajek. New York, NY, USA: Cambridge University Press. 2011.

[6] Dinsmore A, Allen L, Dolby K: Alternative Perspectives on Impact: The Potential of ALMs and Altmetrics to Inform Funders about Research Impact. PLoS Biol. 2014; 12(11): e1002003.

[7] Franceschini F, Maisano D, Mastrogiacomo L: Errors in DOI indexing by bibliometric databases. Scientometrics. 2015; 102(3): 2181–2186.

[8] Haunschild R, Bornmann L: Mendeley reader counts for F1000Prime papers. Figshare. 2014.

[9] Haunschild R, Bornmann L: Mendeley reader counts for F1000Prime papers. Figshare. 2015.

[10] Haunschild R, Stefaner M, Bornmann L: Who publishes, reads, and cites papers? An analysis of country information. in press.

[11] Haustein S, Larivière V: A multidimensional analysis of Aslib proceedings – using everything but the impact factor. Aslib Journal of Information Management. 2014; 66(4): 358–380.

[12] Kamada T, Kawai S: An algorithm for drawing general undirected graphs. Inf Process Lett. 1989; 31(1): 7–15.

[13] King’s College London and Digital Science. The nature, scale and beneficiaries of research impact: An initial analysis of Research Excellence Framework (REF) 2014 impact case studies. London, UK: King’s College London. 2015.

[14] Kraker P, Schlögl C, Jack K, et al.: Visualization of co-readership patterns from an online reference management system. J Inform. 2015; 9(1): 169–182.

[15] Kreiman G, Maunsell JH: Nine criteria for a measure of scientific output. Front Comput Neurosci. 2011; 5: 48.

[16] Milojević S: Network Analysis and Indicators. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring Scholarly Impact. Springer International Publishing. 2014; 57–82.

[17] Mohammadi E, Thelwall M: Mendeley Readership Altmetrics for the Social Sciences and Humanities: Research Evaluation and Knowledge Flows. J Assoc Inf Sci Technol. 2014; 65(8): 1627–1638.

[18] Priem J: Altmetrics. In B. Cronin & C. R. Sugimoto (Eds.), Beyond bibliometrics: harnessing multi-dimensional indicators of performance. Cambridge, MA, USA: MIT Press. 2014.

[19] Thelwall M, Kousha K: Can Mendeley Bookmarks Reflect Readership? A Survey of User Motivations. J Assoc Inf Sci Technol. in press.

[20] Thelwall M, Maflahi N: Are scholarly articles disproportionately read in their own country? An analysis of Mendeley readers. J Assoc Inf Sci Technol. in press.

[21] Van Noorden R: Online collaboration: Scientists and the social network. Nature. 2014; 512(7513): 126–129.

[22] Waltman L, van Eck NJ, Noyons ECM: A unified approach to mapping and clustering of bibliometric networks. J Inform. 2010; 4(4): 629–635.

[23] Weller K: Social Media and Altmetrics: An Overview of Current Alternative Approaches to Measuring Scholarly Impact. In I. M. Welpe, J. Wollersheim, S. Ringelhan & M. Osterloh (Eds.), Incentives and Performance. Springer International Publishing. 2015; 261–276.

[24] Wets K, Weedon D, Velterop J: Post-publication filtering and evaluation: Faculty of 1000. Learned Publishing. 2003; 16(4): 249–258.

[25] Wouters P, Costas R: Users, narcissism and control–tracking the impact of scholarly publications in the 21st century. Utrecht, The Netherlands: SURFfoundation. 2012.
