Research Note

Investigation of gut microbiome association with inflammatory bowel disease and depression: a machine learning approach

[version 2; peer review: 2 approved with reservations]
PUBLISHED 17 Apr 2019

This article is included in the Bioinformatics gateway.

Abstract

Background: Inflammatory bowel disease (IBD) is a group of chronic diseases related to inflammatory processes in the digestive tract generally associated with an immune response to an altered gut microbiome in genetically predisposed subjects. For years, both researchers and clinicians have been reporting increased rates of anxiety and depression disorders in IBD, and these disorders have also been linked to an altered microbiome. However, the underlying pathophysiological mechanisms of comorbidity are poorly understood at the gut microbiome level.
Methods: Metagenomic and metatranscriptomic data were retrieved from the Inflammatory Bowel Disease Multi-Omics Database. Samples from 70 individuals who had answered a self-reported depression and anxiety questionnaire were selected and classified by their IBD diagnosis and their questionnaire results, creating six different groups. A cross-validated random forest algorithm was applied to 90% of the individuals (training set) to retain the most important species involved in discriminating the samples without losing predictive power. The validation set, which comprised the remaining 10% of the samples, equally distributed across the six groups, was then used to evaluate the predictive power of a random forest trained using only the selected species.
Results: A total of 24 species were identified as the most informative in discriminating the 6 groups. Several of these species were frequently described in dysbiosis cases, such as species from the genus Bacteroides and Faecalibacterium prausnitzii. Despite the different compositions among the groups, no common patterns were found among samples classified as depressed. However, distinct taxonomic profiles among patients with IBD, depending on their depression status, were detected.
Conclusions: Machine learning is a promising approach for investigating the role of the microbiome in IBD and depression. Abundance and functional changes in these species suggest that depression should be considered as a factor in future research on IBD.

Keywords

Inflammatory Bowel Disease, Depression, Microbiome, Machine Learning, Random Forest, Metagenomic, Metatranscriptomic.

Amendments from Version 1

The main difference between the previous version and version 2 is that, as per Reviewer 1’s comments, we have added a more thorough description of the dataset and a workflow illustrating the k-Fold Cross Validation approach for Random Forest (new Figure 1). As per the comments of Reviewer 2, we have rerun the species identification considering all IBD samples as a single group, classified as depressed or not depressed. We have furthermore added a short description of the set of species and expanded the introduction with some of the suggested citations. However, we could not add a covariation analysis, as the metadata are quite incomplete and heterogeneous and would be likely to give misleading results.

See the authors' detailed response to the review by Yasir Suhail

Introduction

Increased depression rates have been frequently reported in patients with inflammatory bowel disease (IBD) (Graff et al., 2009), which is a major concern from a clinical standpoint, since increased levels of stress and anxiety are major drivers of IBD relapse and severity (Mawdsley & Rampton, 2006). Both IBD and depression are heavily influenced by the gut microbiome structure, which controls anti-inflammatory processes and permeability in the gut, and communicates with the brain through a complex and close relationship with the autonomic nervous system that is known as the brain-gut axis (Foster & McVey Neufeld, 2013; Luna & Foster, 2015; Martin et al., 2018).

Altered microbiomes can have a large impact on the health and development of both the gut and the brain, and alterations in the ecology of this microbiome, a process known as dysbiosis, have been separately linked to both depression and IBD (Kaur et al., 2011; Rogers et al., 2016). While IBD has become one of the main focuses of microbiome research because of its clinical relevance and complex relationship with metabolic, immune and neurological processes (Huttenhower et al., 2014), research on the effect of the microbiome on mental health is comparatively scarce, but has already shown promising results in reducing anxiety and depression symptoms using probiotics (Bravo et al., 2011; Pinto-Sanchez et al., 2017). However, the relationship between different microbiome population structures and these conditions is still poorly understood.

The availability of the large amount of data derived from the recent explosion in metagenomics and metatranscriptomics provides unique opportunities for investigation. However, it is sometimes difficult to identify informative species. Recently, machine learning algorithms have been successfully applied because they allow the identification of patterns in situations where large, multi-dimensional and heterogeneous datasets are available.

Among the several machine learning approaches available, random forest is an algorithm used for classification and regression based on an ensemble that builds a population of decision tree classifiers, such that the result of a prediction from a given set of features is the most frequent result from the different trees of the “forest” (Breiman, 2001). This is an efficient and generalist algorithm that has already been applied in several metagenomic investigations in human diseases, such as IBS (Saulnier et al., 2011).

The aim of this work was to apply the random forest approach to identify the microbiome species that may be most involved in IBD and depression outcomes and that are responsible for the most relevant changes in the population structure between patients with IBD, patients with depression and patients comorbid for both conditions, and to provide insights into how the microbiome is involved in this comorbidity.

Methods

Database generation

The datasets used for the analyses were retrieved from the Inflammatory Bowel Disease Multi-Omics Database (IBDMDB) (Schirmer et al., 2018), which is part of the Integrative Human Microbiome Project (NIH HMP Working Group et al., 2009). The IBDMDB contains a wide array of omics data (e.g., 16S and shotgun metagenomic, metatranscriptomic, proteomic and host genome data) from 132 individuals classified by IBD diagnosis as ulcerative colitis, Crohn’s disease or controls. Participants provided bi-weekly stool samples at five hospitals in the United States. Metagenomic and metatranscriptomic data were processed as described in Schirmer et al., 2018 (Abubucker et al., 2012; Truong et al., 2015).

Subject selection

From this dataset, the 70 unique participants who answered an additional self-reported depression and anxiety questionnaire during registration (the answers to which are listed in the HMP2 metadata, columns EC to EL) were selected. As the questionnaire model was not specified, individuals with raw scores over 6 on this test were considered as showing “signs of depression”. To calculate the raw scores, a severity scale was generated, with the following scores: 0, never; 1, rarely; 2, sometimes; 3, often; 4, always. The scores were then summed to give a final total. In the case of individuals who took the test multiple times, the lower score was used. We selected a low threshold in order to be able to identify putative dysbiotic individuals who were not experiencing severe depression symptoms. All the others were classified as “no sign of depression”. Combining the test result with the IBD diagnosis divided the dataset into six groups: Crohn’s disease with no detectable sign of depression (CD; n=15), Crohn’s disease with signs of depression (CDD; n=20), ulcerative colitis with no sign of depression (UC; n=4), ulcerative colitis with signs of depression (UCD; n=11), signs of depression but no inflammation (nonIBDD; n=7) and the control group: no inflammation/no depression (nonIBD; n=13). As the experimental design of the IBDMDB consisted of a longitudinal study, each subject contributed several samples to this study, and all the samples used for this analysis were sequenced by shotgun sequencing as described in Schirmer et al. (2018). The resulting metagenomic and metatranscriptomic datasets consist of 1084 and 566 samples, respectively. The final tables after pre-processing consist of 1486 columns, including participant ID, data type, diagnosis, sex, mental score, and nested columns with the relative abundances of the different taxa.
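The scoring and grouping rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the answer strings and the example questionnaires are hypothetical and are not taken from the HMP2 metadata.

```python
# Hypothetical sketch of the scoring rule; names and example answers are invented.
SEVERITY = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3, "always": 4}
THRESHOLD = 6  # raw scores strictly above this count as "signs of depression"


def raw_score(answers):
    """Sum the severity values of one completed questionnaire."""
    return sum(SEVERITY[answer.lower()] for answer in answers)


def shows_signs_of_depression(questionnaires):
    """Use the lowest raw score when a participant took the test several times."""
    lowest = min(raw_score(q) for q in questionnaires)
    return lowest > THRESHOLD


def group_label(diagnosis, questionnaires):
    """Combine the diagnosis ('CD', 'UC' or 'nonIBD') with the depression status."""
    suffix = "D" if shows_signs_of_depression(questionnaires) else ""
    return diagnosis + suffix


example = [
    ["often", "sometimes", "rarely", "always"],    # 3 + 2 + 1 + 4 = 10
    ["sometimes", "sometimes", "rarely", "often"], # 2 + 2 + 1 + 3 = 8
]
print(group_label("CD", example))
```

Taking the minimum over repeated tests makes the classification conservative: a participant is labelled as showing signs of depression only if every completed questionnaire exceeds the threshold.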

Data analysis

For each of the six groups, abundance matrices of the metagenomic data, the metatranscriptomic data, and the combination of both were used for random forest classification. Each of the datasets was divided randomly into a training set (90% of the individuals) and a validation set (10% of the individuals). Random forest analyses were performed using the library Scikit-learn 0.19.1 (Pedregosa et al., 2011) on the training sets to identify the most important species involved in discriminating the samples without losing predictive power. A 1000-fold cross-validation was performed for the combined dataset, and a 500-fold cross-validation for the metagenomic and metatranscriptomic data (see Figure 1), with one model built for each iteration, and only the most important species in the construction of each model were retained. Only models with a classification precision >80% were considered and, among these, only species that appeared in more than one model were selected. Afterwards, the validation sets were run with the selected species only to measure the possible loss of predictive capability, and the area under the receiver operating characteristic (auROC) curve for the prediction of the validation set classes was computed as a performance metric.


Figure 1. Workflow of the k-Fold Cross Validation approach for Random Forest.

First, the data are split into training and validation sets (A). The training set is iterated over by the cross-validation algorithm (B), while the validation set is set aside to test the model trained only with the reduced feature list (C).
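The workflow in panels A to C can be sketched with scikit-learn as below. This is a minimal illustration on synthetic data, not the authors' code. The number of iterations, the number of top species kept per model (`top_k`), the use of `GroupShuffleSplit` to keep each participant's samples together, and macro-averaged precision are all assumptions made for the example. Unlike the validation set described above, the hold-out here is not balanced across groups.

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in data: 40 participants x 5 samples each, 30 "species".
n_participants, per_participant, n_species = 40, 5, 30
participant = np.repeat(np.arange(n_participants), per_participant)
y = np.repeat(np.arange(n_participants) % 3, per_participant)  # one class per participant
X = rng.random((participant.size, n_species))
X[:, :3] += y[:, None]  # make the first three species informative

# (A) Hold out about 10% of the participants as the validation set.
outer = GroupShuffleSplit(n_splits=1, test_size=0.1, random_state=0)
train_idx, val_idx = next(outer.split(X, y, groups=participant))
X_train, y_train, g_train = X[train_idx], y[train_idx], participant[train_idx]

# (B) Repeated random splits inside the training set, one model per iteration.
n_iterations, top_k, min_precision = 10, 5, 0.8  # illustrative values only
inner = GroupShuffleSplit(n_splits=n_iterations, test_size=0.2, random_state=0)
counts = Counter()
for fit_idx, test_idx in inner.split(X_train, y_train, groups=g_train):
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_train[fit_idx], y_train[fit_idx])
    predicted = model.predict(X_train[test_idx])
    precision = precision_score(
        y_train[test_idx], predicted, average="macro", zero_division=0
    )
    if precision > min_precision:  # discard poorly performing models
        top = np.argsort(model.feature_importances_)[::-1][:top_k]
        counts.update(top.tolist())

# Keep species that were among the most important in more than one accepted model.
selected = sorted(s for s, c in counts.items() if c > 1)

# (C) Refit on the training set with the reduced species list, then score the
# held-out participants, which were never used for fitting or selection.
final = RandomForestClassifier(n_estimators=50, random_state=0)
final.fit(X_train[:, selected], y_train)
val_accuracy = final.score(X[val_idx][:, selected], y[val_idx])
print(selected, val_accuracy)
```

Splitting by participant rather than by sample is one way to limit leakage between repeated samples from the same individual in a longitudinal dataset.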

Statistical analysis

In order to assess the significance of the differences between the abundances of the selected species, we performed a one-way ANOVA (SciPy 1.0.0, Jones et al., 2001) followed by Tukey’s honest significant difference (HSD) post-hoc test. This test makes pair-wise comparisons between the group means to identify which classes differ. For clarity, confidence intervals for Tukey’s HSD test can be found in Supplementary Materials (Supplementary Figure 1 and Supplementary Figure 2).
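A one-way ANOVA followed by Tukey’s HSD test can be sketched as below for a single species. The abundances are synthetic and the group sizes are invented. The example uses `scipy.stats.tukey_hsd`, which is only available in recent SciPy releases and is not part of the SciPy 1.0.0 cited above, so it should be read as an assumption about tooling rather than as the implementation used in this work.

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

rng = np.random.default_rng(1)

# Synthetic relative abundances of one species in three hypothetical groups.
groups = {
    "nonIBD": rng.normal(0.10, 0.02, 30),
    "CD": rng.normal(0.10, 0.02, 30),
    "UC": rng.normal(0.20, 0.02, 30),
}
names = list(groups)

# Omnibus test: is there a difference in means among any of the groups?
f_stat, p_value = f_oneway(*groups.values())
print(f"ANOVA: F={f_stat:.2f}, p={p_value:.3g}")

# Post-hoc test: which pairs of groups differ, with 95% confidence intervals.
if p_value < 0.05:
    result = tukey_hsd(*groups.values())
    ci = result.confidence_interval(confidence_level=0.95)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            print(
                f"{names[i]} vs {names[j]}: p={result.pvalue[i, j]:.3g}, "
                f"CI=({ci.low[i, j]:.3f}, {ci.high[i, j]:.3f})"
            )
```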

The functional activity of the selected species was retrieved from the HUMAnN metatranscriptomic analyses described above. Only the pathways in which the selected species are involved and those that were different between the groups from the ANOVA test were selected and the correlation between these species was calculated using Spearman’s correlation coefficient. A significance level of 0.05 was applied for all statistical tests.
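Spearman’s rank correlation between two abundance profiles can be computed as in the short sketch below. The two vectors are synthetic and the variable names are hypothetical. The 0.05 significance level is the one stated above.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)

# Two synthetic pathway abundance profiles with a monotonic, non-linear relation.
pathway_a = rng.random(50)
pathway_b = pathway_a ** 2 + rng.normal(0, 0.01, 50)

rho, p = spearmanr(pathway_a, pathway_b)
significant = p < 0.05
print(f"rho={rho:.2f}, p={p:.3g}, significant={significant}")
```

Because the coefficient is computed on ranks, it captures monotonic relations without assuming linearity, which suits relative abundance data.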

Results and discussion

Species selection and model validation

The random forest cross-validation selection of the most informative species yielded a combined list of 24 species, as can be seen in Figure 2. The validation models for DNA, RNA and the combined dataset show micro-averaged auROC values of 0.96, 0.91 and 0.99, respectively (Supplementary Figure 3 to Supplementary Figure 5). This small loss of information suggests a relevant role of the selected species in the interaction of both conditions, while the capability of the model to classify the validation data with great accuracy shows that our model can generalize its results and is not overfitting.
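For readers unfamiliar with the metric, a micro-averaged auROC for a six-class problem can be computed as sketched below. The labels and predicted probabilities are invented for illustration and do not reproduce the reported values. Micro-averaging pools every (sample, class) decision into a single binary problem before computing the curve.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

classes = ["CD", "CDD", "UC", "UCD", "nonIBD", "nonIBDD"]
y_true = np.array(["CD", "CDD", "UC", "UCD", "nonIBD", "nonIBDD", "CD", "CDD"])

# Hypothetical class probabilities, e.g. from RandomForestClassifier.predict_proba:
# 0.5 on the true class and 0.1 elsewhere, except for one misclassified sample.
proba = np.full((len(y_true), len(classes)), 0.1)
for row, label in enumerate(y_true):
    proba[row, classes.index(label)] = 0.5
proba[7] = 0.1
proba[7, classes.index("CD")] = 0.5  # last sample (true class CDD) predicted as CD

# One-hot encode the labels, then pool all (sample, class) pairs into one ROC curve.
y_bin = label_binarize(y_true, classes=classes)
micro_auroc = roc_auc_score(y_bin, proba, average="micro")
print(f"micro-averaged auROC = {micro_auroc:.3f}")
```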


Figure 2. Venn diagram for the species selected for each dataset.

All species exhibited differences in at least one group in a one-way ANOVA (alpha=0.05, Supplementary Table 1), and no significant differences were found between DNA and RNA abundances for these species (Supplementary Table 2). This list of putative species is intended to be a trade-off between the all-relevant and minimal informative approaches. We chose this approach in order to get as broad a list as possible while avoiding artifacts related to the longitudinal nature of the dataset.

In order to assess the effect of the small sample size of group UC, the same procedure was performed grouping all samples with IBD together. As expected, we saw some differences in the species selected. However, the species that showed the strongest differences in the previous classification were also the strongest ones here, with most of the species overlapping. The interesting exception is Faecalibacterium prausnitzii, which was absent.

The non-dysbiotic microbiome

The analyses showed an increase in the number of species from the genus Bacteroides in dysbiotic groups compared with the control (nonIBD) (Figure 3), as has been reported in other dysbiotic samples (Bloom et al., 2011), with the exception of Bacteroides dorei, which is more abundant in nonIBD than in any other group. Aside from Bacteroides dorei, nonIBD samples had a higher abundance of Alistipes shahii and Ruminococcus bromii, while a typical species associated with nonIBD, Faecalibacterium prausnitzii, was significantly decreased in nonIBDD and CD.


Figure 3.

DNA (A) and RNA (B) taxonomic abundances for the selected species. Abundances were quantified by the relative abundances of their sequences, and for each level they should sum to 1 (including unclassified sequences).

Crohn’s disease abundance changes in depression

Both of the Crohn’s disease-related groups (CD and CDD) showed higher abundances of Bacteroides ovatus and Bacteroides uniformis. However, CD samples exhibited higher abundances for several specific species, including Bacteroides xylanisolvens, Parasutterella excrementihominis and Bacteroides fragilis, compared with CDD, but decreased abundance of Faecalibacterium prausnitzii, which did not differ significantly in abundance between nonIBD and CDD groups.

Ulcerative colitis changes in depression

Ulcerative colitis samples had the most distinctive microbiome profile. Several species, including Burkholderiales bacterium 1_1_47, Bacteroides eggerthii and Bacteroides finegoldii were characteristic of this group, and absent in the others, except for B. finegoldii, which was also present in a lower abundance in nonIBD samples. Only UCD samples exhibited an increased abundance of Bacteroides fragilis, Bacteroides vulgatus and Haemophilus pittmaniae, this last species being almost exclusive to the UCD group.

Non-IBD changes in depression

The nonIBDD was the group with the highest number of changes in microbiome diversity when compared with its non-depressed counterpart (Table 1). However, most of those changes followed a similar pattern in other dysbiotic groups.

Table 1. Changes between Crohn’s disease (CD), ulcerative colitis (UC) and control (nonIBD) in depressed compared with non-depressed subjects.

Increases/decreases shown are statistically significant.

Species                             CD         UC         nonIBD
Alistipes shahii                    -          -          Increase
Bacteroides ovatus                  -          -          Increase
Subdoligranulum sp.                 -          Decrease   -
Bacteroides xylanisolvens           Decrease   -          Increase
Parasutterella excrementihominis    Decrease   -          -
Burkholderiales bacterium 1_1_47    -          Decrease   -
Alistipes putredinis                -          Decrease   Decrease
Bacteroides stercoris               -          -          Increase
Faecalibacterium prausnitzii        Increase   -          Decrease
Bacteroides uniformis               Decrease   -          Increase
Bacteroides fragilis                Decrease   Increase   -
Lachnospiraceae bacterium 7_1_58    Increase   -          -
Bacteroides dorei                   -          -          Decrease
Bacteroides vulgatus                -          Increase   Increase
Ruminococcus bromii                 -          -          Decrease
Bacteroides finegoldii              Decrease   Decrease   -
Bacteroides eggerthii               -          Decrease   Increase
Parabacteroides goldsteinii         -          -          Increase
Haemophilus pittmaniae              -          Increase   -

A notable change was observed in Faecalibacterium prausnitzii, which was present in almost the same abundances in nonIBD, UCD and CDD samples and showed high variability in UC, while being significantly lower in CD and nonIBDD (Supplementary Table 3 and Supplementary Table 4). This is particularly interesting, since this species is considered to have anti-inflammatory activity. It seems counterintuitive to find a depleted population of one of the species most associated in the literature with a healthy microbiome, as opposed to an IBD one, in a group that does not show any inflammatory process. However, Parabacteroides goldsteinii was increased in nonIBDD and was depleted in all IBD groups in comparison with control samples. The genus Parabacteroides has previously been associated with anti-inflammatory activity (Neff et al., 2016; Schirmer et al., 2016), so the increase in abundance of this bacterium may explain why the nonIBDD microbiome is not associated with inflammation in the gut.

Other than Parabacteroides goldsteinii, nonIBDD samples did not contain other characteristic groups, and, more notably, none of the selected species was specific for depressed or non-depressed phenotypes.

Microbial functional activity

Regarding the functional activity of these species, seven pathways that were more abundant in dysbiotic groups than in nonIBD were identified (Supplementary Figure 1); these were correlated with each other and inversely correlated with most of the others (Supplementary Figure 2 and Supplementary Table 5). Those pathways are folate transformations II, N10-formyl-tetrahydrofolate biosynthesis, de novo L-ornithine biosynthesis, superpathway of pyridoxal 5’-phosphate biosynthesis and salvage, phosphopantothenate biosynthesis I, preQ0 biosynthesis and queuosine biosynthesis. Folate (vitamin B9) and pyridoxal 5’-phosphate (vitamin B6) deficiencies have been linked both to depression (Coppen & Bolander-Gouaille, 2005; Hvas et al., 2004; Mitchell et al., 2014), as they are key for the synthesis of several neurotransmitters, and to IBD (Pan et al., 2017; Yakut et al., 2010), although this association is not well understood and does not seem to be evidence of causation. Increased levels of L-ornithine derivatives have also been linked to depression (Zheng et al., 2010). However, although nonIBDD had the highest activity for almost all of these pathways, activity in CD and UC was also significantly increased, while functional activity in CDD was generally lower and non-significant in some pathways. Moreover, UCD did not differ from nonIBD in any of them.

This difference in functional activity again highlights the lack of a concrete pattern of gut microbiome abundance between depressed groups.

Conclusions

The random forest approach was able to successfully identify informative changes in abundance at the species level, revealing specific patterns for the depressed and non-depressed groups without losing predictive power. We believe that this approach, and machine learning in general, can be very useful in a field of research where high dimensionality is always an issue.

This work provided, to our knowledge for the first time, an overview of the differences in the bacterial communities of patients with signs of depression and of patients with both depression and inflammatory bowel disease. Our findings suggest a complex landscape of microbiome interactions, at both the population structure and functional activity levels. However, the results showed that there are distinct taxonomic profiles among patients with IBD depending on their depression status, providing further input for future investigations.

Data availability

The datasets used for the analyses were retrieved from the Inflammatory Bowel Disease Multi-Omics Database (IBDMDB) (Schirmer et al., 2018), a part of the Integrative Human Microbiome Project (NIH HMP Working Group et al., 2009).

Comments on this article (4)

Version 2 (published 17 Apr 2019)
  • Reviewer Response 02 May 2019
    Yasir Suhail, Department of Biomedical Engineering,  Yale University, Storrs, USA
    02 May 2019
    Reviewer Response
    I see the new figure added in Version 2, but I am still not sure how 1000-fold cross validation was performed on a dataset of 70 individuals. Even if only ... Continue reading
Version 1 (published 05 Jun 2018)
  • Reviewer Response 26 Jun 2018
    Yasir Suhail, Department of Biomedical Engineering,  Yale University, Storrs, USA
    26 Jun 2018
    Reviewer Response
    Expanding on Prof. Waldron's comment, the article states "Only models with a precision classification >80% were considered". If this precision is evaluated on the test/validation set then the validation set ... Continue reading
  • Author Response 14 Jun 2018
    Pedro Morell Miranda, Department of Bio and Health Informatics, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
    14 Jun 2018
    Author Response
    Dear Prof. Waldron,

    The "validation set" was only used for prediction. The validation model was still trained with the "training set". We did this in order to see how ... Continue reading
  • Reader Comment 13 Jun 2018
    Levi Waldron, Epidemiology and Biostatistics, CUNY Graduate School of Public Health and Health Policy, New York, USA
    13 Jun 2018
    Reader Comment
    Please note that by training a random forest model in your 10% "validation set", this becomes a second training set, leaving you no way to estimate the predictive accuracy of ... Continue reading
Morell Miranda P, Bertolini F and Kadarmideen HN. Investigation of gut microbiome association with inflammatory bowel disease and depression: a machine learning approach [version 2; peer review: 2 approved with reservations]. F1000Research 2019, 7:702 (https://doi.org/10.12688/f1000research.15091.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to reviewer statuses:
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Version 2
Reviewer Report 27 Jun 2019
Yasir Suhail, Department of Biomedical Engineering,  Yale University, Storrs, CT, USA 
Approved with Reservations
I think the comments raised in the previous review have not been sufficiently addressed. Perhaps a point by point discussion and reply by the authors will be helpful. Specifically:
  1. It is still not clear as to
... Continue reading
Suhail Y. Reviewer Report For: Investigation of gut microbiome association with inflammatory bowel disease and depression: a machine learning approach [version 2; peer review: 2 approved with reservations]. F1000Research 2019, 7:702 (https://doi.org/10.5256/f1000research.20778.r47383)
Reviewer Report 26 Jun 2019
Manikandan Narayanan, Department of Computer Science and Engineering (CSE) and Initiative for Biological Systems Engineering (IBSE), Robert Bosch Centre for Data Science and Artificial Intelligence (RBC-DSAI), Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India 
Approved with Reservations
It is great that authors have attempted to address major concerns from both reviewers, with additional analysis, figure and clarifications to text.

A couple follow-up points regarding the additions in version 2 remain to be addressed (as ... Continue reading
Narayanan M. Reviewer Report For: Investigation of gut microbiome association with inflammatory bowel disease and depression: a machine learning approach [version 2; peer review: 2 approved with reservations]. F1000Research 2019, 7:702 (https://doi.org/10.5256/f1000research.20778.r47384)
Version 1
Reviewer Report 04 Oct 2018
Manikandan Narayanan, Department of Computer Science and Engineering (CSE) and Initiative for Biological Systems Engineering (IBSE), Robert Bosch Centre for Data Science and Artificial Intelligence (RBC-DSAI), Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India 
Approved with Reservations
Review process:
The open review model calls for a special reviewing strategy since previous reviews/comments are already available. I read the paper and formed my independent opinions before looking at the previous reviews. Then I wrote this report to express ... Continue reading
Narayanan M. Reviewer Report For: Investigation of gut microbiome association with inflammatory bowel disease and depression: a machine learning approach [version 2; peer review: 2 approved with reservations]. F1000Research 2019, 7:702 (https://doi.org/10.5256/f1000research.16435.r37362)
Reviewer Report 25 Jun 2018
Yasir Suhail, Department of Biomedical Engineering,  Yale University, Storrs, CT, USA 
Approved with Reservations
Overview

The paper's central thesis of the correlation, or putative causative mechanism of the gut microbiome on IBD and depression, is important. The application of machine learning techniques may be suitable because of the high dimensional structure of ... Continue reading
Suhail Y. Reviewer Report For: Investigation of gut microbiome association with inflammatory bowel disease and depression: a machine learning approach [version 2; peer review: 2 approved with reservations]. F1000Research 2019, 7:702 (https://doi.org/10.5256/f1000research.16435.r35101)
  • Author Response 17 Sep 2018
    Pedro Morell Miranda, Department of Bio and Health Informatics, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
    17 Sep 2018
    Author Response
    The precision metric was used for each Cross Validation model, so each precision metric was calculated using the test split for that iteration from the original train dataset. The validation ... Continue reading
