What is a predatory journal? A scoping review

Background: There is no standardized definition of what a predatory journal is, nor have the characteristics of these journals been delineated or agreed upon. In order to study the phenomenon precisely, a definition of predatory journals is needed. The objective of this scoping review is to summarize the literature on predatory journals, describe its epidemiological characteristics, and to extract empirical descriptions of potential characteristics of predatory journals. Methods: We searched five bibliographic databases: Ovid MEDLINE, Embase Classic + Embase, ERIC, PsycINFO, and Web of Science on January 2nd, 2018. A related grey literature search was conducted March 27th, 2018. Eligible studies were those published in English after 2012 that discuss predatory journals. Titles and abstracts of records obtained were screened. We extracted epidemiological characteristics from all search records discussing predatory journals. Subsequently, we extracted statements from the empirical studies describing empirically derived characteristics of predatory journals. These characteristics were then categorized and thematically grouped. Results: 920 records were obtained from the search. 344 of these records met our inclusion criteria. The majority of these records took the form of commentaries, viewpoints, letters, or editorials (78.44%), and just 38 records were empirical studies that reported empirically derived characteristics of predatory journals. We extracted 109 unique characteristics from these 38 studies, which we subsequently thematically grouped into six categories: journal operations, article, editorial and peer review, communication, article processing charges, and dissemination, indexing and archiving, and five descriptors. Conclusions: This work identified a corpus of potential characteristics of predatory journals. Limitations of the work include our restriction to English language articles, and the fact that the methodological quality of articles included in our extraction was not assessed. These results will be provided to attendees at a stakeholder meeting seeking to develop a standardized definition for what constitutes a predatory journal.


Introduction
The term 'predatory journal' was coined less than a decade ago by Jeffrey Beall 1 . Predatory journals have since become a hot topic in the scholarly publishing landscape. A substantial body of literature discussing the problems created by predatory journals, and potential solutions to stop the flow of manuscripts to these journals, has rapidly accumulated [2][3][4][5][6] . Despite increased attention in the literature and related educational campaigns 7 , the number of predatory journals, and the number of articles these journals publish, continues to increase rapidly 8 . Some researchers may be tricked into submitting to predatory journals 9 , while others may do so dubiously to pad their curriculum vitae for career advancement 10 .
One factor that may be contributing to the rise of predatory journals is that there is currently no agreed upon definition of what constitutes a predatory journal. The characteristics of predatory journals have not been delineated, standardized, nor broadly accepted. In the absence of a clear definition, it is difficult for stakeholders such as funders and research institutions to establish explicit policies to safeguard work they support from being submitted to and published in predatory journals. Likewise, if characteristics of predatory journals have not been delineated and accepted, it is difficult to take an evidence-based approach towards educating researchers on how to avoid them. Establishing a consensus definition has the potential to inform policy and to significantly strengthen educational initiatives such as Think, Check, Submit 7 .
The challenge of defining predatory journals has been recognized 11 , and recent discussion in the literature highlights a variety of potential definitions. Early definitions by Beall describe predatory publishers as outlets "which publish counterfeit journals to exploit the open-access model in which the author pays" and publishers that were "dishonest and lack transparency" 1 .
Others have since suggested that we move away from using the term 'predatory journal', in part because the term neglects to adequately capture journals that fail to meet expected professional publishing standards, but do not intentionally act deceptively [12][13][14][15] . This latter view suggests that the rise of so-called predatory journals is not strictly associated with dubious journal operations that use the open-access publishing model (e.g., publishing virtually anything to earn an article processing charge (APC)), but represents a wider spectrum of problems. For example, there is the conundrum that some journals hailing from the global south may not have the knowledge, resources, or infrastructure to meet best practices in publishing although some of them have 'international' or 'global' in their title. Devaluing or black-listing such journals may be problematic as they serve an important function in ensuring the dissemination of research on topics of regional significance.
Other terms to denote predatory journals such as "illegitimate journals 9,16 ", "deceptive journals 15 ", "dark" journals 17 , and "journals operating in bad faith 13 " have appeared in the literature, but like the term "predatory journal" they are reductionist 11 and may not adequately reflect the varied spectrum of quality present in the scholarly publishing landscape and the distinction between low-quality and intentionally dubious journals. These terms have also not garnered widespread acceptance, and it is possible that the diversity in nomenclature leads to confusion for researchers and other stakeholders.
Here, we seek to address the question "what is a predatory journal?" by conducting a scoping review 18,19 of the literature. Scoping reviews are a type of knowledge synthesis that follow a systematic approach to map the literature on a topic, and identify the main concepts, theories and sources, and determine potential gaps in that literature. Guidance on their conduct is available 18-20 and guidance on their reporting is forthcoming. Our aims are twofold. Firstly, in an effort to provide an overview of the literature on the topic, we seek to describe epidemiological characteristics of all records discussing predatory journals. Secondly, we seek to synthesize the existing empirically derived characteristics of predatory journals. The impetus for this work is to establish a list of evidence-based traits and characteristics of predatory journals. This corpus of possible characteristics of predatory journals is one source that could be considered by an international stakeholders meeting to generate a consensus definition of predatory journals. Other sources will be included (e.g., 8 ).

Transparency statement
Prior to initiating this study, we drafted a protocol that was posted on the Open Science Framework prior to data analysis (please see: https://osf.io/gfmwr/). We did not register our review with PROSPERO as the registry does not accept scoping reviews. Other than the protocol deviations described below, the authors affirm that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that discrepancies from the study as planned have been explained. We briefly re-state our study methods here. Large sections of the methods described here are taken directly from the original protocol. We used the PRISMA statement 21 to guide our reporting of this scoping review.

Amendments from Version 1
• We have changed journals to "publishers" in the introduction.
• We have noted the global south issue of journals using "international" or "global" in their titles.
• We have more clearly described scoping reviews and added an additional reference.
• We believe we have given some examples in Table 3. For example, in response to the query as to the use and meaning of "persuasive language", we state "Language that targets; Language that attempts to convince the author to do or believe something".
• We have made some modifications to the limitations section of our paper. We now state "Thirdly, our focus was on the biomedical literature. Whether the publication (e.g., having an IMRAD (Introduction Methods Results And Discussion)) and peer review norms we've used apply across other disciplines is likely an important topic for further investigation."
• We have further indicated the limitations of Beall's lists for this type of research.
• We have fixed the broken link to the full search strategy (Supplementary File 1).

Search strategy
For our full search strategy please see Supplementary File 1. An experienced medical information specialist (BS) developed and tested the search strategy using an iterative process in consultation with the review team. Another senior information specialist peer reviewed the strategy prior to execution using the PRESS Checklist 22 . We searched a range of databases in order to achieve cross-disciplinary coverage. There were no suitable controlled vocabulary terms for this topic in any of the databases. We used various free-text phrases to search, including multiple variations of root words related to publishing (e.g., edit, journal, publication) and predatory practices (e.g., bogus, exploit, sham). We adjusted vocabulary and syntax across the databases. We limited results to the publication years 2012 to the present, since 2012 is the year in which the term "predatory journal" reached the mainstream literature 1 .
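As a rough illustration of how free-text phrases of this kind can be combined, the sketch below builds a Boolean query from the example root words listed above. The `*` truncation wildcard, the `build_query` helper, and the overall query shape are assumptions made for illustration only. The strategy actually run is the one in Supplementary File 1, and truncation and field syntax differ between databases.

```python
# Illustrative only: this is NOT the strategy in Supplementary File 1.
# The root words are the examples named in the text; everything else is assumed.
PUBLISHING_ROOTS = ["edit", "journal", "publication"]
PREDATORY_ROOTS = ["bogus", "exploit", "sham"]


def build_query(publishing_roots, predatory_roots, wildcard="*"):
    """Combine two concept groups: OR within a group, AND between groups."""
    def group(roots):
        # Append the (assumed) truncation wildcard so variations of each root match.
        return "(" + " OR ".join(root + wildcard for root in roots) + ")"

    return group(predatory_roots) + " AND " + group(publishing_roots)


query = build_query(PUBLISHING_ROOTS, PREDATORY_ROOTS)
print(query)
```

The publication-year limit (2012 onward) is not encoded here, since date limits are usually applied through each database's own interface or syntax.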
We also searched abstracts of relevant conferences (e.g., The Lancet series and conference "Increasing Value, Reducing Waste", International Congresses on Peer Review and Scientific Publication) and Google Scholar to identify grey literature. For the purposes of our Google Scholar search, we conducted an advanced search (on March 27, 2018) using the keywords: predatory, journal, and publisher. We restricted this search to content published from 2012 onward. A single reviewer (KDC) reviewed the first 100 hits and extracted all potentially relevant literature encountered for review, based on title. We did not review content from file sources that were from mainstream publishers (e.g., Sage, BMJ, Wiley), as we expected these to be captured in our broader search strategy.

Study population and eligibility criteria
Our study population included articles, reports, and other digital documents that discuss, characterize, or describe predatory journals. We included all study designs from any discipline captured by our search that were reported in English. This included experimental and observational research, as well as commentaries, editorials and narrative summaries in our epidemiological extraction. For extraction of characteristics of predatory journals we restricted our sample to studies that specifically provided empirically derived characteristics of predatory journals.

Screening and data extraction
Data extraction forms were developed and piloted prior to data extraction. Details of the forms used are provided in the Open Science Framework, see here: https://osf.io/p5y2k/. We first screened titles and abstracts against the inclusion criteria. We verified full-text articles met the inclusion criteria and we extracted information on corresponding author name, corresponding author country, year of publication (we selected the most recent date stated), study design (as assessed by the reviewers), and journal name. We also extracted whether or not the paper provided a definition of a predatory journal. This was coded as yes/no and included both explicit definitions (e.g. "Predatory journals are…") as well as implicit definitions.
When extracting data, we restricted our sample of articles to those that provided a definition of predatory journals, or described characteristics of predatory journals, based on empirical work (i.e., not opinion, not definitions which referenced previous work). Specifically, we restricted our sample of articles to those classed as having an empirical study design and then re-vetted each article to ensure that the study addressed defining predatory journals or their characteristics. For those articles included, we extracted sections of text statements describing the traits/characteristics of predatory journals. Extraction was done by a single reviewer, with verification conducted by a second reviewer. Conflicts were resolved via consensus. In instances where an empirically derived trait/characteristic of predatory journals was mentioned in several sections of the article, we extracted only a single representative statement.

Data analysis
Our data analysis involved both quantitative (i.e., frequencies and percentages) and qualitative (i.e., thematic analysis) methods. First, a list of potential characteristics of predatory journals was generated collaboratively by the two reviewers who conducted data extraction (KDC, NA). Subsequently, each of the statements describing characteristics of predatory journals that was extracted from the included articles was categorized using the list generated. During the categorization of the extracted statements, if a statement did not apply to a category already on the list, a new category was added. Where duplicate statements were inadvertently extracted from a single record, we categorized these only once. During the categorization and grouping process, details on the specific wording of statements from specific included records were not retained (i.e., our categories and our themes do not preserve the original wording of the extracted text).
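The counting rule described above (a new category is opened whenever a statement fits none on the list, and duplicates within a single record count once) can be sketched as follows. The record identifiers and the sample data are hypothetical, and the assignment of a statement to a category was a reviewer judgement that this sketch simply takes as given.

```python
from collections import defaultdict

# Hypothetical extracted statements as (record_id, assigned_category) pairs.
# Category wording is borrowed from the results; the records are invented.
extracted = [
    ("record_A", "Journals conduct poor quality peer review"),
    ("record_A", "Journals conduct poor quality peer review"),  # duplicate within one record
    ("record_B", "Journals conduct poor quality peer review"),
    ("record_B", "Journals are not indexed"),
]


def articles_per_category(statements):
    """Count how many distinct records contributed to each category."""
    records_by_category = defaultdict(set)  # an unseen category is added on first use
    for record_id, category in statements:
        # Using a set means a duplicate from the same record is counted only once.
        records_by_category[category].add(record_id)
    return {category: len(records) for category, records in records_by_category.items()}


counts = articles_per_category(extracted)
for category, n_articles in counts.items():
    print(f"{category}: N={n_articles} articles")
```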
Subsequently, in line with Galipeau and colleagues 23 , after this initial categorization, we collated overlapping or duplicate categories into themes. Then, two reviewers (KDC, AG) evaluated recurring themes in the work to synthesize the data. A coding framework was iteratively developed by KDC and AG by coding each characteristic statement independently and inductively (i.e., without using a theory or framework a priori). The two reviewers met to discuss these codes, and through consensus decided on the final themes and their definitions. The reviewers then went back to the data and recoded with the agreed-upon themes. Lastly, the reviewers met to compare assignment of themes to statements. Discrepancies were resolved by consensus. Two types of themes emerged: categories (i.e., features of predatory journals to which the statements referred) and descriptors (i.e., statements which described these features, usually with either a positive or negative value).

Deviations from study protocol
We conducted data extraction of epidemiological characteristics of papers discussing predatory journals in duplicate. The original protocol indicated this would be done by a single reviewer with verification. The original protocol stated we would extract information on the discipline of the journals publishing our articles included for epidemiological data extraction (as defined by MEDLINE). Instead, we used SCIMAGOJR (SJR) (https://www.scimagojr.com/) to determine journal subject areas post-hoc and only extracted this information for the included empirical articles describing empirically derived characteristics of predatory journals. For included articles, post-hoc, we decided to extract information on whether or not the record reported on funding.

Results
Search results and epidemiological characteristics
Please see Figure 1 for record and article flow during the review. The original search captured 920 records. We excluded 19 records from initial screening because they were not in English (N = 13), we could not access a full-text document (N = 5; of which one was behind a paywall at a cost of greater than $25 CAD), or the reference referred to a conference proceeding containing multiple documents (N = 1).
We screened a total of 901 title and abstract records obtained from the search strategy. Of these, 402 were included for full-text screening. 499 records were excluded for not meeting our study inclusion criteria. After full-text screening of the 402 studies, 334 were determined to have full texts and to discuss predatory journals. The remaining 68 records were excluded because: they were not about predatory journals (N = 36), did not have full texts (N = 19), were abstracts (N = 12), or were published in a language other than English (N = 1). The 334 articles included for epidemiological data extraction were published between 2012 and 2018 with corresponding authors from 43 countries. The number of publications mentioning predatory journals increased each year from 2012 to 2017 (See Table 1). The vast majority of these articles took the form of commentaries, viewpoints, letters, or editorials (78.44%).
Of the articles discussing predatory journals, only 38 specifically described a study that reported empirically derived characteristics or traits of predatory journals. These studies were published between 2014 and 2018 and produced by corresponding authors from 19 countries. The majority of these included studies were observational studies (26/38; 68.4%) (See Table 1 and Table 2).
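The record flow reported above can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names and the shortened exclusion labels are ours.

```python
# Counts are taken from the text; names and labels are illustrative.
identified = 920

excluded_before_screening = {
    "not in English": 13,
    "full text not accessible": 5,
    "conference proceeding with multiple documents": 1,
}
screened = identified - sum(excluded_before_screening.values())

excluded_at_title_abstract = 499
full_text_screened = screened - excluded_at_title_abstract

excluded_at_full_text = {
    "not about predatory journals": 36,
    "no full text": 19,
    "abstract only": 12,
    "not in English": 1,
}
included = full_text_screened - sum(excluded_at_full_text.values())

print(screened, full_text_screened, included)
```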
Five additional records obtained from the grey literature search were excluded. These records were either duplicates of studies captured in the main search or they did not provide empirically derived characteristics of predatory journals.

Mapping the data into emergent themes
The list generated to categorize the extracted statements describing characteristics of predatory journals had 109 categories. Two types of themes were identified using qualitative thematic analysis: categories and descriptors. Each statement addressed at least one of the following categories: journal operations, article, editorial and peer review, communication, article processing charges, and dissemination, indexing, and archiving. Within these categories, statements used descriptors including: deceptive or lacking transparency, unethical research or publication practices, persuasive language, poor quality standards, or high quality standards. Statements that did not include a descriptive component (i.e., were neutral) were coded as not applicable (See Table 3 for themes and definitions). Statements addressing more than one category or using more than one descriptor were coded multiple times. Below we briefly summarize the qualitative findings by category (For full results, see Table 4).
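To make the multiple-coding rule concrete, the sketch below tallies statements per category and descriptor pair. It assumes that a statement with several categories or descriptors contributes one count to every combination, which is our reading of "coded multiple times" rather than a documented rule, and the three coded statements are invented examples.

```python
from collections import Counter
from itertools import product

# Hypothetical coded statements; category and descriptor names follow the text.
coded_statements = [
    {"categories": ["Journal Operations"],
     "descriptors": ["Deceptive or Lacking Transparency"]},
    {"categories": ["Communication", "Article Processing Charges"],
     "descriptors": ["Persuasive Language"]},
    {"categories": ["Dissemination, Indexing, and Archiving"],
     "descriptors": []},  # neutral statement with no descriptive component
]


def tally_codes(statements):
    """Count statements per (category, descriptor) pair."""
    counts = Counter()
    for statement in statements:
        # Neutral statements are coded as "Not Applicable".
        descriptors = statement["descriptors"] or ["Not Applicable"]
        for pair in product(statement["categories"], descriptors):
            counts[pair] += 1
    return counts


code_counts = tally_codes(coded_statements)
for (category, descriptor), n in sorted(code_counts.items()):
    print(f"{category} | {descriptor}: {n}")
```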
Journal Operations. Predatory journal operations were described as: being deceptive or lacking transparency (19 statements), demonstrating poor quality standards (17 statements), demonstrating unethical research or publication practices (14 statements), and using persuasive language (two statements). Five statements were neutral or non-descriptive. The most common characteristics of the journal operations category were "Journals display low levels of transparency, integrity, poor quality practices of journal operations" (N=14 articles); "Contact details of publisher absent or not easily verified" (N=11 articles); and "Journals are published by/in predominantly by authors from specific countries" (N=10 articles).
Article. Articles in predatory journals were described as: demonstrating poor quality standards (six statements), demonstrating high quality standards (two statements), being deceptive or lacking transparency (three statements), and demonstrating unethical research or publication practices (three statements). Four statements were neutral or non-descriptive. The most common characteristics of the article category were: "Journals are published by/in predominantly by authors from specific countries" (N=10 articles); "Quality of articles rated as poor" (N=5 articles); and "Articles are poorly cited" (N=5 articles).
Editorial and Peer Review. The editorial and peer review process was described as: demonstrating unethical research or publication practices (eight statements), being deceptive or lacking transparency (seven statements), demonstrating poor quality standards (five statements), demonstrating high quality standards (two statements), and using persuasive language (one statement). Two statements were neutral or non-descriptive. The most common characteristics of the editorial and peer review category were: "Journals conduct poor quality peer review" (N=8 articles) and "Journals have short peer review times"; "Editorial board is not stated or incomplete"; "Editorial board lacks legitimacy (appointed without knowledge, wrong skillset)" (N=7 articles each).

Table 3. Themes and definitions.

Category
Article: Features related to articles appearing in the journal
Editorial and Peer Review: Any aspect of the internal or external review of submitted articles and decisions on what to publish
Communication: How the journal interacts with (potential) authors, editors, and readers
Article Processing Charges: Fees taken in by journal as part of their business model
Dissemination, Indexing, and Archiving: Information on how the journal disseminates articles and use of indexing and archiving tools

Descriptor
Deceptive or Lacking Transparency: Intentionally deceitful practice; Practices or processes that are not made clear to the reader; Missing information
Unethical Research or Publication Practices: Violations of accepted publication and research ethics standards (e.g., Committee on Publication Ethics guidelines)
Persuasive Language: Language that targets; Language that attempts to convince the author to do or believe something
Poor Quality Standards: Lack of rigour in journal operations; Lack of professional standards/practices; missing information; Poor quality writing or presentation (e.g., grammatical or spelling errors)
High Quality Standards: Evidence of rigour in journal operations; Evidence that professional standards/practices are being met; Clear information
Not Applicable: Neutral or non-descriptive statement

Communication. Communication by predatory journals was described as: using persuasive language (12 statements), demonstrating poor quality standards (four statements), being deceptive or lacking transparency (four statements), and demonstrating high quality standards (one statement). All communication statements were descriptive. The most common characteristic of the communications category was: "Journals solicit papers via aggressive e-mail tactics" (N=13 articles).

Article Processing Charges. Article processing charges in predatory journals were described as: being deceptive or lacking transparency (three statements), using persuasive language (two statements), demonstrating poor quality standards (one statement), demonstrating unethical research or publication practices (one statement), and demonstrating high quality standards (one statement). Two statements were neutral or non-descriptive. The most common characteristics of the article processing charges category were: "APCs are lower than at legitimate journals"; "Journal does not specify APCs"; and "Journal has hidden APCs or hidden information on APCs" (N=9 articles each).
Dissemination, Indexing, and Archiving. Dissemination, indexing, and archiving were described as: demonstrating poor quality standards (five statements), demonstrating unethical research or publication practices (one statement), and as being deceptive or lacking transparency (one statement). Seven statements were neutral or non-descriptive. The most common characteristics of the dissemination, indexing, and archiving category were: "Journals state they are open access" (N=11 articles); "Journal may be listed in DOAJ" (N=8 articles); and "Journals are not indexed" (N=7 articles).

Discussion
This scoping review identified 334 articles mentioning predatory journals, with corresponding authors from more than 40 countries. The trajectory of articles on this topic is increasing rapidly. As an example, our search captured five articles from 2012 and 140 articles from 2017. The majority of articles captured took the form of a commentary, editorial or letter; just 38 had relevant empirically derived characteristics of predatory journals. One possibility for why there is little empirical work on this topic may be that most funding agencies have not set aside funding for journalology or a related field of enquiry (research on research). There are recent exceptions to this 24 , but in general such funds are not widely available. Of the 38 studies from which we extracted data, post-hoc we examined the percentage that reported funding, and found that just 13.16% (5/38) did, 21.05% (8/38) did not, and 65.79% (25/38) did not report information on funding. Even among the five studies that reported funding, several of these were not project funding specific to the research, but rather broader university chair or fellowship support.
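The funding percentages above follow directly from the stated counts, as the short check below shows. The dictionary keys are our shorthand for the three groups, and reading "did not" as "stated that no funding was received" is an assumption.

```python
# Counts are from the text; keys are shorthand, and the reading of the
# middle group as "stated no funding" is an assumption.
funding_counts = {
    "reported funding": 5,
    "stated no funding": 8,
    "no information on funding": 25,
}

total_studies = sum(funding_counts.values())
funding_percentages = {
    group: round(100 * count / total_studies, 2)
    for group, count in funding_counts.items()
}

for group, pct in funding_percentages.items():
    print(f"{group}: {funding_counts[group]}/{total_studies} = {pct}%")
```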
A total of 109 unique characteristics were extracted from the 38 empirical articles. When examining these unique characteristics some clear contrasts emerge. For example, we extracted the characteristic "Journal APCs clearly stated" (N = 4 articles) as well as the characteristics "Journal does not specify APCs" (N = 9 articles) and "Journal has hidden APCs or hidden information on APCs" (N = 9 articles). Such apparent inconsistencies among the extracted characteristics will make it difficult to define predatory journals. Without a (consensus) definition it will be difficult to study the construct in a meaningful manner. The lack of a definition also makes policy initiatives and educational outreach imprecise and potentially less effective.
We believe a cogent next move is to invite a broad spectrum of stakeholders to a summit. Possible objectives could be to develop a consensus definition of a predatory journal, discuss how best to examine the longitudinal impact of predatory journals, and develop collaborative policy and educational outreach to minimize the impact of predatory publishers on the research community. As a starting point for defining predatory journals, those involved in a global stakeholder meeting to establish a definition for predatory journals may wish to exclude all characteristics that are common to legitimate journals. Further, one could exclude all characteristics that are conflicting, or which directly oppose one another. Another fruitful approach may be to focus on characteristics that can easily be audited to determine if journals do or do not meet the expected standards.
The unique characteristics we extracted were thematically grouped into six categories and five descriptors. Although we did identify one positive descriptor, high quality standards, the majority of descriptors were negative. Most categories (all but 'Communication') also included neutral or non-descriptive statements. The presence of both positive and neutral descriptors points to an overlap between characteristics that describe predatory journals and those that are viewed as 'legitimate', further emphasizing the challenges in defining predatory journals. The category with the most statements was 'Journal Operations' with 19 statements describing operations as deceptive or lacking transparency. The 'Communication' category had the most statements described as persuasive (11 statements), highlighting the targeted language predatory journals may use to convince the reader toward a certain action. Unethical or unprofessional publication practices described statements in all but the 'Communication' category and were most frequent in 'Journal Operations' and 'Editorial and Peer Review'. These findings point to issues of great concern in research and publishing and an urgency to develop interventions and education to protect researchers, funders, and knowledge users.
There are a number of relevant limitations of this work that should be acknowledged. Firstly, while we endeavoured to ensure our systematic search and grey literature appraisal was comprehensive, it is possible that we missed some relevant documents that would have contributed additional empirically derived characteristics of predatory journals. As an example, several authors of this manuscript recently published a paper containing relevant empirical data and predatory characteristics 2 ; however, because this work was published in a commentary format, which did not include an abstract or use the search terms in the article title, it was not picked up in our search. Indeed, part of the challenge of systematically searching on this topic is the lack of agreement on, and the diversity of, terms used to describe predatory journals. Further, reviewers deciding which articles to include based on our inclusion criteria had to make judgements on study designs and methods used. Due to inconsistent reporting and terminology, this was not always straightforward and may have resulted in inadvertent exclusions. Secondly, in keeping with accepted scoping review methodology, we did not appraise the methodological quality of the articles that were included in our extraction. This means that the characteristics extracted have not been considered in the context of the study design or methodological rigour of the work. In addition, we only extracted definitions from empirical studies describing characteristics of predatory journals. It is possible that further characteristics would have been included in our results if non-empirical research articles were not excluded. We chose to exclude these types of articles as they are more likely to be based on opinion or individual experience rather than evidence. Thirdly, our focus was on the biomedical literature. Whether the publication (e.g., having an IMRAD (Introduction Methods Results And Discussion)) and peer review norms we've used apply across other disciplines is likely an important topic for further investigation. Fourthly, some of the studies included in our review are confounded by being identified through Beall's lists, and journal publisher websites, which are considered controversial. Finally, we limited our study to English articles. It is possible that work published in other languages may have provided additional characteristics of predatory journals.
Reaching a consensus on what defines predatory journals, and what features reflect these, may be particularly useful to stakeholders (e.g., funders, research institutions) with a goal of establishing a list of vetted journals to recommend to their researchers. Such lists could be updated annually. Lists which attempt to curate predatory journals rather than legitimate journals are unlikely to achieve success given the reactive nature of this type of curation and the issue that new journals cannot easily be systematically discovered for evaluation 25 . The development and use of digital technologies to provide information about journal publication practices (e.g., membership in the Committee on Publication Ethics (https://publicationethics.org/), listing in the Directory of Open Access Journals (https://doaj.org/)) may also prove to be a fruitful approach in reducing researchers' submissions to predatory journals; empowering authors with knowledge is an important step in decision-making. Currently, researchers receive little education or support about navigating journal selection and submission processes. We envision a plug-in tool that researchers could click to get immediate feedback about a journal page they are visiting and whether it has characteristics of predatory journals. This feedback could provide them with the relevant information to determine if the journal suits their needs and/or meets any policy requirements to which they must adhere (e.g., digital preservation, indexing).
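As a very rough sketch of the kind of feedback such a tool might return, the example below checks a few easily audited facts about a journal against characteristics reported in this review. The `JournalInfo` fields, the `CHECKS` mapping, and the `feedback` function are all hypothetical. This is not a validated classifier, and which characteristics belong in such a check would depend on the consensus definition that does not yet exist.

```python
from dataclasses import dataclass


@dataclass
class JournalInfo:
    """Hypothetical facts a plug-in might collect about a journal page."""
    contact_details_listed: bool
    apcs_stated: bool
    editorial_board_listed: bool
    indexed: bool


# Characteristic wording is taken from the review's results; the mapping
# from each characteristic to a field is an assumption for illustration.
CHECKS = [
    ("contact_details_listed", "Contact details of publisher absent or not easily verified"),
    ("apcs_stated", "Journal does not specify APCs"),
    ("editorial_board_listed", "Editorial board is not stated or incomplete"),
    ("indexed", "Journals are not indexed"),
]


def feedback(journal):
    """Return the reported characteristics that this journal appears to share."""
    return [message for field_name, message in CHECKS
            if not getattr(journal, field_name)]


example_journal = JournalInfo(
    contact_details_listed=False,
    apcs_stated=True,
    editorial_board_listed=True,
    indexed=False,
)
flags = feedback(example_journal)
for flag in flags:
    print("Flag:", flag)
```

Returning the matched characteristics, rather than a single verdict, leaves the decision with the researcher, in keeping with the aim of informing journal selection rather than automating it.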

Data availability
Study data and tables are available on the Open Science Framework, see: https://osf.io/4zm3t/. It is clear from the response of the authors that they disagree that my criticism is relevant. And therefore they have chosen not to change their paper to accommodate these. In fairness this would have required a substantial rewrite. I believe my comments remain appropriate. However, I have also taken into account the more positive responses from the other reviewers. I have hence decided to agree that the paper can be indexed -even though it makes less of a contribution than what I would have liked.

Open Peer Review
Competing Interests: No competing interests were disclosed.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Version 1
13 August 2018 Referee Report doi:10.5256/f1000research.16618.r36291
Joanna Chataway, SPRU-Science Policy Research Unit, University of Sussex, Brighton, UK

This is an interesting and very useful article on a subject which, as the authors note, is widely discussed but under-researched.
The article sets out to examine data derived from a scoping review in an effort to contribute to a definition of the term 'predatory journal'. Disagreements about whether the term should be used at all are summarised early on in the article and this provides a useful backdrop to the multiple difficulties involved in defining the term.

Questions and issues raised by the article
In considering the issue of predatory journals, the authors raise questions of what might be considered characteristics of a legitimate journal. The article discusses this only partially and mainly in relation to the difficulties of distinguishing between journals which set out to mislead and which abandon the aim of publishing high-quality science entirely and, on the other hand, those which are poorly managed and run. It is not necessary, I think, to give this further and detailed consideration here, but it is important to note that there may well be other issues to consider. For example, there may well be complex relationships between the practices of legitimate journals, the unintended consequences of impact factor metrics (as noted, for example, in The Lancet special issue on 'Increasing value, reducing waste' cited by the authors) and the expansion of bad as well as good journals and publication platforms which offer alternatives. The Lancet and other critiques point to the intense competition involved in publishing in high impact journals, the need to publish for promotion and employment, and so on, as factors which drive bad practice in general and may also play a role in the rise of predatory journals.
Another issue which is only briefly mentioned in the article is whether the norms of publishing and peer review differ across different disciplines. Perhaps, given the characteristics of the existing literature, it is not possible to say much about this currently, but the authors could raise this more clearly as an issue to be considered in future research. And I think the point should be made that whilst it is common for health research articles to follow the reporting convention of 'Introduction, methods, results, discussion', this is not the case in other fields. Thus having this as a criterion for judging the quality of a journal could be misleading.

Clarification of terminology
I would encourage the authors to explain terms such as 'epidemiological characteristics' and 'scoping review' which may be familiar to those who work in health research but not perhaps to others.

Some examples?
Some of the results would have been clearer to me if examples had been included. This is particularly the case with regard to 'persuasive language'. It is unclear to me what is being referred to by that term.

Missing link?
I couldn't get the link to further details about the search strategy to work. That accounts for the 'partial' score for source data question but that may just be a problem for me and not for others.

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility?

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

Referee Expertise: Science policy

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

In response to Joanna Chataway's review: We have made some modifications to the limitations section of our paper. We now state (version 2) "Thirdly, our focus was on the biomedical literature. Whether the publication (e.g., having an IMRAD (Introduction Methods Results And Discussion) and peer review norms we've used apply across other disciplines is likely an important topic for further investigation." We have more clearly described scoping reviews (version 2). We believe we have given some examples in Table 3. For example, in response to the query as to the use and meaning of "persuasive language", we state (version 1 and version 2) "Language that targets; Language that attempts to convince the author to do or believe something". We have fixed the broken link to the full search strategy (version 2).
We have no competing interests to declare.

The paper ultimately promises more than it delivers. It presents the results of an analysis which has resulted in a set of characteristics of predatory journals derived from a scoping review of recent studies. However, the final discussion section is extremely disappointing. There is no attempt by the authors to add much value to the rather fragmented results found through the review. Part of the problem is that the characteristics listed are treated as equally weighted. Most of the authors who have written on the phenomenon of predatory journals in recent years have attempted to end up with a set of fairly authoritative and even 'objective' criteria that would by themselves be sufficient to classify a journal as predatory. Some of these characteristics would include referencing fake indexing, fake impact metrics, not being indexed in the DOAJ and a few more. In order to get to a 'consensus' view of what the key characteristics of a predatory journal are, a simple listing of all possible characteristics will not take us much further. It is perhaps then not surprising that their recommendation is for a consensus-type meeting where experts could work towards a consensus definition.
More to the point: in my view, getting to the end goal of a consensus or more widely acceptable definition would require a more theoretical, or at least conceptual, framework that is embedded in some of the work on scientific communication and publishing which stipulates what good practices in (journal) publishing are.
Unfortunately this paper does not help us much on the way to this goal.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

I have read this submission. I believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

We are sorry to disappoint Johann Mouton with our scoping review. We believe that a scoping review is a reasonable way to attempt to map the literature. Scoping reviews do not typically weight included studies. We believe our review highlights some of the disagreements in the literature about presumed relevant characteristics of presumed predatory journals.
Competing Interests: We have no competing interests to declare.
hand: non-biomedical readers may be unsure what exactly is meant.
Lastly, as much as it is very helpful to identify characteristics of predatory journals as drawn from the literature, it seems somewhat positivist to use this very limited body of literature, with its heavy use of Beall's List data, as a means to "generate a consensus definition of predatory journals." Until there is more qualitative research and more multidisciplinary and longitudinal research, as was done by Shen and Bjork, there are lacunae in the research literature. The recent articles based on the research by this team are groundbreaking but largely limit their scope to biomedical literature.

Screening and data extraction
The use of implicit and explicit definitions is very important and valuable.

Search strategy
It is possible that some research from librarians and information science scholars might have been missed. There is also some concern that if the articles are open access, they may not have been indexed in traditional databases. This concern relates to the Data Analysis section as well, since smaller, newer and open access journals may not have a Journal Impact Factor and may be excluded from SCIMAGO.

Mapping the data into emergent themes
Under the descriptor "persuasive language," the language of predatory journals targets authors and not readers. This should be explained.
What is somewhat confusing to me is separating characteristics in the literature based on the authors' perceptions or evaluations of the journals and publishers versus the actual data drawn from the journals and publishers' emails, journals, articles, and websites. Whether or not the author of the underlying articles performed cross-checking is also important.

Table 4 Characteristics
The characteristic JOURNALS HAVE SHORT PEER REVIEW TIMES isn't mapped to a descriptor, but it is a very important, common characteristic of publisher appeals to authors. It typically maps to Poor Quality Standards, although not in an absolute manner, since obviously large, quality journals can also have quick turnaround. It is unclear to me whether, because this characteristic lacks a descriptor, it may lose weight in the analysis. I note that JOURNALS HAVE SHORT/RAPID PUBLICATION TIMES is also an NA descriptor. These two facets are closely related.
ARTICLE SUBMISSION OCCURS VIA EMAIL: this may be a signal of poor standards but is often more a reflection of low budgets and the many amateurish journals that have been lumped into Beall's List.
JOURNALS DO NOT CONTAIN ANY ARTICLES: the high number of predatory journals without articles is a very important data point that should be emphasized.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

Referee Expertise: My knowledge as a scholarly communications librarian who has devoted considerable effort to writing about the topic at hand is extensive, but I am not a trained researcher in either the sciences or social sciences, so my ability to genuinely judge the methodology is limited.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

We thank Monica Berger for her thoughtful peer review of our manuscript. We have made revisions throughout (version 2):
We have further indicated the limitations of Beall's lists for this type of research; we have noted the global south issue of journals using "international" or "global" in their titles; and we have provided some clarity as to scoping reviews. It is possible we've missed some relevant literature from our review (as is the potential in any review exercise), although we believe in its current form it is both broad and multidisciplinary. As a follow-up exercise we will reach out to library/expert listservs related to this field of enquiry. We agree that the two statements without a descriptor are important; however, the length of time for a peer review or publication cannot be classified as either a positive or negative statement, and hence these statements were not given a descriptor term. While a short peer-review time could be mapped to Poor Quality Standards, we cannot assume that it is indicative of poor quality.
I have no competing interests to declare.

In response to Valerie Ann Matarese's comment, we have changed two words in the introduction (version 2).
Ross Mounce is misinformed. This is not a "literature review of opinion, and as such, one wonders what the value of the exercise is." As stated in the screening and data extraction section of the Methods of this scoping review (version 1), "we restricted our sample of articles to those that provided a definition of predatory journals, or described characteristics of predatory journals, based on empirical work (i.e., not opinion, not definitions which referenced previous work)."
We thank Edgardo Rolla for his comments on our scoping review (version 1).
Competing Interests: I have no competing interests to declare.

Discuss this Article
The definition of predatory journals attributed, in the Introduction, to Beall (2012) is imprecise. In that Nature article, Beall defined predatory publishers, not journals, and he used the term "predatory journal" just once. The correct quotation is "... predatory publishers ... publish counterfeit journals to exploit the open-access model in which the author pays. These predatory publishers are dishonest and lack transparency." This is a single definition, not "definitions" as suggested here.
Given that the purpose of this new article is to define predatory journals, it is important that the starting point be correct. I hope this change will be made in the revised version.
Competing Interests: No competing interests were disclosed.

Reader Comment 12 Jul 2018
Ross Mounce, University of Cambridge, UK
This is a literature review of opinion, and as such, one wonders what the value of the exercise is. Surely it would be better to examine the phenomenon itself, not just a synthesis of many studies of deeply variable quality about it?
One should not just treat all papers equally. The quality of evidence offered by some of the included 38 papers (Supp file 2) is extremely poor if one actually reads them. Yet they appear to be all equally weighted in terms of their evidentiary contribution. It would be tremendously interesting to examine how many of the 38 papers were actually peer-reviewed; some clearly are just correspondence and editorials. Others, such as Sorokowski et al. (2017), are published at a well-established commercial journal in a "commission-only section" (source: https://www.nature.com/nature/for-authors/other-subs) where the publisher has a clear commercial conflict of interest in sowing fear about publishing in less well-known journals and has published a whole series of commissioned, not-peer-reviewed pieces doing just that.
The claim that predatory publishers have made approximately $75 million US dollars in 2015 (alone) simply isn't credible, but it is indicative of the kind of wild extrapolation and rumour that is undertaken when some people write about predatory publishing. The figure of 75 million, I presume, derives from Shen & Bjork (2015), in which they write "Using our [estimated] data for the number of articles and average APC for 2014, our estimate for the size of the market is 74 million USD". This is an estimate from extrapolation and in no way indicative of 74 million USD actually being paid out to predatory publishers; that distinction is subtle but important. Shen & Bjork (2015) did not claim that publishers made approximately $75 million US dollars in 2015, yet through poor peer review at "non-predatory" journals the trumped-up form of the claim still enters the literature and is misleadingly repeated in headlines, e.g. "Predatory publishers earned $75 million last year, study finds" (Bohannon 2015).
Shen & Bjork (2015) itself was in many ways a good, well-reported study, with transparent open peer review, but even here, if one digs into the details, there are problems. The most notable is that for their definition of predatory publishers and inclusion in the study, Shen & Bjork (2015) used Beall's List at a particular point in time when the publisher MDPI (http://www.mdpi.com/) was on the list. In subsequent years (MDPI was removed from Mr. Beall's list on 28 October 2015), the world and Beall himself realised that MDPI was not a predatory publisher, despite one or two papers with peer review problems; most journals have famous 'clangers', even Nature, Science and PNAS. MDPI unfortunately happens to be a large publisher in terms of article volume, with 'high' APCs relative to publishers that genuinely are 'predatory' by most people's understanding. This therefore explains well how Shen & Bjork (2015) came to their astoundingly high 74 million USD estimate (which included MDPI), but similarly shows that the study is flawed because of that and needs to be revisited with MDPI and other publishers now understood not to be predatory taken out of the analysis. Specific differences in minutiae like this between the 38 studies might make each of them completely incomparable to each other! More detailed work is needed to assess comparability and quality of evidence before a synthesis is made; otherwise the analysis is garbage-in, garbage-out.
The irony is that much of the literature writing about predatory publishing is itself also of poor quality, as the examples demonstrate. The issue at hand here is poor or non-existent peer review, and unfortunately this is rather more widespread than many would like to admit.
There also seems to be no acknowledgement whatsoever in this manuscript that 'predatory journals' and