Research Note

Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning

[version 1; peer review: 2 approved with reservations]
PUBLISHED 31 Aug 2016

This article is included in the Bioinformatics gateway.

This article is included in the Machine learning: life sciences collection.

Abstract

Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing the ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, TUBB4B genes was 78.6% accurate in 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. Minimum redundancy maximum relevance feature selection of a paclitaxel-based SVM classifier based on expression of ABCB11, ABCC1, BAD, BBC3 and BCL2L1 was 79% accurate in 53 CT patients. A random forest (RF) classifier produced a gene signature (ABCB11, ABCC1, BAD, BCL2, CYP2C8, CYP3A4, MAP4, MAPT, NR1I2, TUBB1, GBP1, OPRK1) that predicted >3 year survival with 82.4% accuracy in 420 HT patients. A similar RF gene signature showed 79.6% accuracy in 504 patients treated with CT and/or HT. These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for predicting survival after drug therapies.

Keywords

Gene expression signatures, breast cancer, chemotherapy resistance, hormone therapy, machine learning, support vector machine, random forest

Introduction

Current pharmacogenetic analysis of chemotherapy makes qualitative decisions about drug efficacy in patients (determination of good, intermediate or poor metabolizer phenotypes) based on variants present in genes involved in the transport, biotransformation, or disposition of a drug. We have applied a supervised machine learning (ML) approach to derive accurate gene signatures, based on the biochemically-guided response to chemotherapies with breast cancer cell lines [1], which show variable responses to growth inhibition by paclitaxel and gemcitabine therapies [2,3]. We analyzed stable [4] and linked unstable genes in pathways that determine their disposition. This involved investigating the correspondence between 50% growth inhibitory concentrations (GI50) of paclitaxel and gemcitabine and gene copy number, mutation, and expression, first in breast cancer cell lines and then in patients [1]. Genes encoding direct targets of these drugs, metabolizing enzymes, transporters, and those previously associated with chemo-resistance to paclitaxel (n=31 genes) were then pruned by multiple factor analysis (MFA), which indicated that expression of ABCC10, BCL2, BCL2L1, BIRC5, BMF, FGF2, FN1, MAP4, MAPT, NFKB2, SLCO1B3, TLR6, TMEM243, TWIST1, and CSAG2 could predict sensitivity in breast cancer cell lines with 84% accuracy. The cell line-based paclitaxel gene signature predicted sensitivity in 84% of patients with no or minimal residual disease (n=56; data from [5]). The present study derives related gene signatures with ML approaches that predict outcome of hormone- and chemotherapies in the large METABRIC breast cancer cohort [6].

Methods

SVM learning: Previously, paclitaxel-related response genes were identified from peer-reviewed literature, and their expression and copy number in breast cancer cell lines were analyzed by multiple factor analysis of GI50 values of these lines [2] (Figure 1). Genes with expression levels related to GI50 were used to derive SVMs by backwards feature selection for paclitaxel, tamoxifen, methotrexate, 5-fluorouracil, epirubicin, and doxorubicin (trained using the function fitcsvm in MATLAB R2014a [7] and tested with either leave-one-out or 9-fold cross-validation). These SVMs were then assessed for their ability to predict patient outcomes based on available metadata (see Figure 1 and reference 1). Interactive prediction using normalized expression values as input is available at http://chemotherapy.cytognomix.com.


Figure 1. Biochemically-inspired SVM gene signature derivation workflow.

The initial set of genes is carefully selected through the understanding of the drug and the pathways associated with it. A multiple factor analysis of the GI50 values of a training set of breast cancer cell lines and the corresponding expression levels of each gene in the initial set reduces the list of genes. Given the expression levels of each gene in the reduced set for each cell line, the method finds the optimal gene subset and the SVM that minimizes the misclassification rate by cross-validation. The SVM is evaluated on patients by classifying those with shorter survival time as resistant and those with longer survival as sensitive to hormone and/or chemotherapy. The Gaussian kernel SVM requires manual selection of two parameters, C and sigma; these parameters determine how strictly the SVM learns the training set and, if not selected properly, can lead to overfitting. A grid search evaluates a wide range of combinations of these values in parallel. The algorithm selects the C and sigma combination that leads to the lowest cross-validation misclassification rate. A backwards feature selection (greedy) algorithm is used, in which one gene of the set is left out to form a reduced gene set and the classification is then assessed; genes that maintain or lower the misclassification rate are kept in the signature. The procedure is repeated until the subset with the lowest misclassification rate is selected as the optimal subset of genes.
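The grid search and greedy backward selection described in this caption can be sketched generically. The following Python sketch is an illustration only, not the MATLAB implementation used in the study: it shows one possible reading of the selection rule (a gene is dropped whenever its removal maintains or lowers the error), and the two error functions, the parameter grids, and the `INFORMATIVE` set are hypothetical stand-ins for cross-validated SVM misclassification rates.

```python
import math
from itertools import product


def grid_search(cv_error, c_values, sigma_values):
    """Return the (C, sigma, error) triple with the lowest error on the grid."""
    best = None
    for c, sigma in product(c_values, sigma_values):
        err = cv_error(c, sigma)
        if best is None or err < best[2]:
            best = (c, sigma, err)
    return best


def backward_select(genes, subset_error):
    """Greedy backward elimination over a list of gene names."""
    current = list(genes)
    current_err = subset_error(current)
    while len(current) > 1:
        # Score every reduced set obtained by leaving exactly one gene out.
        trials = [
            (subset_error([g for g in current if g != left_out]), left_out)
            for left_out in current
        ]
        best_err, drop = min(trials)
        if best_err > current_err:
            break  # every single-gene removal raises the error: stop
        current.remove(drop)
        current_err = best_err
    return current, current_err


# Hypothetical error surface with its minimum at C = 10, sigma = 1.
def toy_cv_error(c, sigma):
    return (math.log10(c) - 1) ** 2 + math.log10(sigma) ** 2


# Hypothetical subset error: pretend only these three genes carry signal.
INFORMATIVE = {"BAD", "BCL2", "MAPT"}


def toy_subset_error(subset):
    missing = len(INFORMATIVE - set(subset))
    uninformative = len(set(subset) - INFORMATIVE)
    return 0.10 + 0.15 * missing + 0.02 * uninformative


best_c, best_sigma, best_err = grid_search(
    toy_cv_error, [0.1, 1, 10, 100], [0.1, 1, 10]
)
signature, signature_err = backward_select(
    ["ABCB1", "BAD", "BCL2", "MAPT", "TLR6"], toy_subset_error
)
```

In a real analysis, both error functions would wrap a cross-validated classifier; keeping them as plain callables makes the search logic independent of the SVM library.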

RF learning: RF was trained using the WEKA 3.7 [8] data mining tool. This classifier uses multiple random trees for classification, which are combined via a voting scheme to make a decision on the given input gene set. Figure 2 depicts the therapy outcome prediction process of a given patient using a RF consisting of a series of decision trees derived from different subsets of paclitaxel-related genes.


Figure 2. RF decision tree diagram depicts the therapy outcome prediction process of a given patient, using a RF consisting of k decision trees.

Several DTs are built using different subsets of paclitaxel-related genes. The process starts from the root of each tree and if the expression of the gene corresponding to that node is greater than a specific value, the process continues through the right branch, otherwise it continues through the left branch until it reaches a leaf node; that leaf represents the prediction of the tree for that specific input. The decisions of all trees are considered and the one with the largest number of votes is selected as the patient outcome.
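The traversal and voting steps in this caption can be illustrated with a minimal Python sketch. It is not the WEKA implementation: the tree encoding, the three toy trees, the thresholds, and the patient expression values are all hypothetical and chosen only to show the mechanics (right branch when expression exceeds the node's threshold, left branch otherwise, then a majority vote).

```python
from collections import Counter

# A tree node is either a leaf label (0 = sensitive, 1 = resistant)
# or a tuple (gene, threshold, left_subtree, right_subtree).


def predict_tree(node, expression):
    """Walk one tree from the root to a leaf and return the leaf label."""
    while not isinstance(node, int):
        gene, threshold, left, right = node
        # Greater than the threshold: go right; otherwise go left.
        node = right if expression[gene] > threshold else left
    return node


def predict_forest(trees, expression):
    """Return the label that receives the most votes across all trees."""
    votes = Counter(predict_tree(tree, expression) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical forest of k = 3 trees built on different gene subsets.
toy_trees = [
    ("BCL2", 0.5, 0, 1),
    ("MAPT", 0.0, ("BAD", 1.0, 1, 0), 0),
    ("ABCC1", 0.2, 0, 1),
]

# Hypothetical normalized expression values for one patient.
toy_patient = {"BCL2": 0.8, "MAPT": -0.3, "BAD": 0.4, "ABCC1": 0.1}

tree_votes = [predict_tree(tree, toy_patient) for tree in toy_trees]
forest_label = predict_forest(toy_trees, toy_patient)
```

Using an odd number of trees avoids ties in a two-class vote; with an even number, this sketch would return whichever tied label was counted first.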

Augmented Gene Selection: The most relevant genes (features) for therapy outcome prediction were found using the minimum redundancy and maximum relevance (mRMR) approach [9]. mRMR is a wrapper that incrementally selects genes by maximizing the average mutual information between gene expression features and classes, while minimizing their redundancies:

$$\mathrm{mRMR} = \max_{S}\left[\frac{1}{|S|}\sum_{f_i \in S} I(f_i, C) - \frac{1}{|S|^2}\sum_{f_i, f_j \in S} I(f_i, f_j)\right]$$

where $f_i$ corresponds to a feature in gene set $S$, $I(f_i, C)$ is the mutual information between $f_i$ and class $C$, and $I(f_i, f_j)$ is the mutual information between features $f_i$ and $f_j$.
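To make the criterion concrete, the following Python sketch evaluates the bracketed objective for a given gene subset and runs a simple incremental selection (relevance minus mean redundancy with the genes already chosen). This greedy rule is a common approximation to the set-level maximization and is an assumption here, not necessarily the procedure used in the study; the feature names and the discretized toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def mrmr_objective(subset, features, labels):
    """Mean relevance minus mean pairwise redundancy for one subset S."""
    size = len(subset)
    relevance = sum(mutual_information(features[f], labels) for f in subset)
    redundancy = sum(
        mutual_information(features[fi], features[fj])
        for fi in subset
        for fj in subset
    )
    return relevance / size - redundancy / size**2


def mrmr_select(features, labels, k):
    """Incrementally pick k features by relevance minus mean redundancy."""
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(k, len(features)):
        def score(name):
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        remaining = [f for f in features if f not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Hypothetical discretized expression (0 = low, 1 = high) for 8 patients.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "GENE_A": [0, 0, 0, 1, 1, 1, 1, 1],
    "GENE_A_COPY": [0, 0, 0, 1, 1, 1, 1, 1],  # fully redundant with GENE_A
    "GENE_C": [0, 1, 0, 0, 1, 0, 1, 1],
}

chosen = mrmr_select(toy_features, toy_labels, k=2)
```

In this toy data the duplicated feature is as relevant as the original but is passed over in favour of the less relevant, less redundant `GENE_C`, which is the behaviour the redundancy term is meant to produce.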

For this experiment, we used a 26-gene signature (genes ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, TUBB4B, FGF2, FN1, GBP1, NFKB2, OPRK1, TLR6, TWIST1) as the base feature set. These genes were selected (in Ref. 1) based either on their known involvement in paclitaxel metabolism, or evidence that their expression levels and/or copy numbers correlate with paclitaxel GI50 values (Table 3). mRMR and SVM were combined to obtain a subset of genes that can accurately predict patient survival outcome; here, we considered 3, 4 and 5 years as survival thresholds for breast cancer patients (Table 3).

Results and discussion

Supplemental Dataset: Predicted Treatment Response for Each Individual METABRIC Patient

Column key (CT = chemotherapy, HT = hormone therapy, RT = radiation; 0 = sensitive, 1 = resistant, - = no value):
Status: living status as coded in the dataset.
Years: time since treatment (years).
Obs: observed response at the median survival threshold of 4.4 years.
SVM: nine predicted responses with the SVM analysis methods (described in Table 1), in this order: CT and HT only, both datasets (paclitaxel, tamoxifen, methotrexate, epirubicin, doxorubicin, 5-fluorouracil); CT and/or HT, 'discovery' dataset; deceased (CT and/or HT), 'discovery' dataset; no treatment, both datasets.
RF CT, RF HT, RF CT/HT: responses with random forest (described in Table 2) for CT, HT, and CT and/or HT patients. Each entry has six characters: observed then predicted response for >3 year, >4 year, and >5 year survival.
mRMR CT, mRMR HT, mRMR CT/HT: responses with application of mRMR and SVM (described in Table 3), in the same six-character layout.

METABRIC_ID | Status | Treatment | Years | Obs | SVM | RF CT | RF HT | RF CT/HT | mRMR CT | mRMR HT | mRMR CT/HT
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
MB-0220 | d | CT | 3.27 | 1 | 11100101- | 001111 | ------ | 001010 | 001111 | ------ | 001011
MB-0259 | d | CT | 0.97 | 1 | 111111--- | 101011 | ------ | 101010 | 111111 | ------ | 111010
MB-0272 | d | CT | 10.02 | 1 | 10000100- | 010101 | ------ | 000001 | 000101 | ------ | 010000
MB-0278 | d | CT | 0.48 | 1 | 111111--- | 101011 | ------ | 111010 | 101011 | ------ | 101010
MB-0333 | d | CT | 2.18 | 1 | 111011--- | 101011 | ------ | 101010 | 111011 | ------ | 101010
MB-0346 | d | CT | 1.68 | 1 | 11111111- | 101111 | ------ | 101010 | 111111 | ------ | 101110
MB-0354 | d | CT | 0.95 | 1 | 10111100- | 101011 | ------ | 101010 | 101111 | ------ | 111111
MB-0361 | d | CT | 1.24 | 1 | 11010111- | 101111 | ------ | 101011 | 101110 | ------ | 101010
MB-0400 | d | CT | 1.85 | 1 | 11111101- | 111111 | ------ | 101010 | 111111 | ------ | 101010
MB-0446 | a | CT | 4.84 | 0 | 0011011-- | 010111 | ------ | 000010 | 000011 | ------ | 000010
MB-0467 | a | CT | 3.79 | 0 | 0001101-- | 001111 | ------ | 001010 | 001110 | ------ | 001010
MB-0564 | d | CT | 3.12 | 1 | 11011111- | 001111 | ------ | 011011 | 001110 | ------ | 001011
MB-0663 | a | CT | 4.55 | 0 | 010001--- | 000011 | ------ | 010010 | 000010 | ------ | 000010
MB-2614 | d-d.s. | CT | 5.33 | 1 | 11111100- | 000101 | ------ | 000000 | 000101 | ------ | 000001
MB-2718 | a | CT | 13.30 | 0 | 011011--- | 000000 | ------ | 000000 | 000001 | ------ | 000000
MB-2758 | d-d.s. | CT | 2.87 | 1 | 110111--- | 101010 | ------ | 101010 | 101011 | ------ | 101010
MB-4601 | d-d.s. | CT | 4.62 | 1 | 101010--- | 000010 | ------ | 000010 | 000011 | ------ | 000010
MB-4621 | a | CT | 11.66 | 0 | 111010--- | 000101 | ------ | 000000 | 010100 | ------ | 010000
MB-4715 | d-d.s. | CT | 2.89 | 1 | 11011111- | 111111 | ------ | 101010 | 111011 | ------ | 101010
MB-4731 | d-d.s. | CT | 3.69 | 1 | 111011--- | 001011 | ------ | 001010 | 001111 | ------ | 011010
MB-4746 | a | CT | 18.32 | 0 | 100000--- | 000001 | ------ | 000000 | 000000 | ------ | 000000
MB-4757 | d-d.s. | CT | 2.25 | 1 | 01111111- | 101111 | ------ | 101010 | 111111 | ------ | 101010
MB-4886 | d-d.s. | CT | 3.96 | 1 | 11111111- | 011111 | ------ | 001010 | 011111 | ------ | 011011
MB-4893 | a | CT | 14.51 | 0 | 010100--- | 010101 | ------ | 000000 | 000000 | ------ | 000000
MB-4945 | d-d.s. | CT | 1.65 | 1 | 11011110- | 111111 | ------ | 101010 | 111111 | ------ | 101010
MB-5070 | a | CT | 15.69 | 0 | 1100001-- | 000101 | ------ | 000000 | 000001 | ------ | 000000
MB-5072 | d-d.s. | CT | 4.13 | 1 | 11100001- | 010111 | ------ | 000010 | 000111 | ------ | 000010
MB-5229 | d-d.s. | CT | 2.79 | 1 | 11111101- | 101111 | ------ | 101010 | 101111 | ------ | 101010
MB-5298 | d-d.s. | CT | 6.97 | 1 | 111111--- | 000001 | ------ | 000000 | 010000 | ------ | 000000
MB-5312 | d-d.s. | CT | 3.11 | 1 | 11011011- | 011111 | ------ | 001011 | 011111 | ------ | 001010
MB-5348 | a | CT | 13.94 | 0 | 0101100-- | 000101 | ------ | 000000 | 010001 | ------ | 010000
MB-5411 | d-d.s. | CT | 2.14 | 1 | 11111111- | 101111 | ------ | 101010 | 111111 | ------ | 111010
MB-5458 | a | CT | 13.39 | 0 | 100010--- | 010100 | ------ | 000000 | 000100 | ------ | 000000
MB-5474 | a | CT | 13.93 | 0 | 0000001-- | 010101 | ------ | 000000 | 000001 | ------ | 000000
MB-5558 | a | CT | 2.06 | 0 | 101110--- | 111111 | ------ | 101010 | 111110 | ------ | 101010
MB-5625 | d-d.s. | CT | 2.65 | 1 | 00011011- | 101011 | ------ | 101010 | 111111 | ------ | 101010
MB-6058 | d-d.s. | CT | 1.86 | 1 | 111111--- | 101111 | ------ | 101010 | 101111 | ------ | 111010
MB-6059 | a | CT | 2.35 | 0 | 100001--- | 101111 | ------ | 101110 | 111011 | ------ | 101010
MB-6063 | d-d.s. | CT | 1.28 | 1 | 101110--- | 101010 | ------ | 101010 | 111011 | ------ | 101010
MB-6082 | d-d.s. | CT | 8.42 | 1 | 111010--- | 000101 | ------ | 000000 | 010000 | ------ | 000000
MB-6098 | d-d.s. | CT | 0.87 | 1 | 111010--- | 101110 | ------ | 101010 | 111110 | ------ | 111010
MB-6113 | d-d.s. | CT | 10.23 | 1 | 101111--- | 000001 | ------ | 010101 | 000000 | ------ | 000000
MB-6114 | a | CT | 5.49 | 0 | 011010--- | 000001 | ------ | 000000 | 000000 | ------ | 000000
MB-6122 | a | CT | 4.39 | 0 | 000011--- | 000010 | ------ | 000010 | 000011 | ------ | 010010
MB-6160 | a | CT | 10.27 | 0 | 001010--- | 000001 | ------ | 000000 | 000000 | ------ | 000000
MB-6246 | d | CT | 4.34 | 1 | 111111--- | 000010 | ------ | 000010 | 000011 | ------ | 000010
MB-6273 | a | CT | 8.62 | 0 | 101100--- | 000000 | ------ | 000000 | 000000 | ------ | 000000
MB-7020 | d-d.s. | CT | 5.87 | 1 | 111111--- | 000001 | ------ | 000000 | 000000 | ------ | 000000
MB-7073 | a | CT | 9.52 | 0 | 001000--- | 000001 | ------ | 000000 | 000000 | ------ | 000000
MB-7112 | d-d.s. | CT | 5.31 | 1 | 111100--- | 000000 | ------ | 000000 | 000000 | ------ | 000000
MB-7158 | d-d.s. | CT | 3.81 | 1 | 111001--- | 001011 | ------ | 001010 | 011111 | ------ | 001010
MB-7270 | a | CT | 14.39 | 0 | 000111--- | 000001 | ------ | 000000 | 000000 | ------ | 000000
MB-7275 | d-d.s. | CT | 3.65 | 1 | 111111--- | 001011 | ------ | 001011 | 011111 | ------ | 001010
MB-0005 | a | CT/HT | 8.36 | 0 | 000101--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-0158 | a | CT/HT | 4.73 | 0 | 0111001-- | ------ | ------ | 000010 | ------ | ------ | 010010
MB-0288 | d-d.s. | CT/HT | 3.24 | 1 | 111111--- | ------ | ------ | 001010 | ------ | ------ | 001010
MB-0291 | d | CT/HT | 3.22 | 1 | 11110111- | ------ | ------ | 001010 | ------ | ------ | 001010
MB-0305 | d | CT/HT | 5.22 | 1 | 011111--- | ------ | ------ | 000000 | ------ | ------ | 010000
MB-0321 | a | CT/HT | 5.26 | 0 | 0000011-- | ------ | ------ | 000000 | ------ | ------ | 000001
MB-0335 | a | CT/HT | 9.86 | 0 | 001000--- | ------ | ------ | 000000 | ------ | ------ | 010000
MB-0345 | a | CT/HT | 0.55 | 0 | 0001000-- | ------ | ------ | 101010 | ------ | ------ | 111011
MB-0392 | d-d.s. | CT/HT | 2.71 | 1 | 11111101- | ------ | ------ | 101010 | ------ | ------ | 111111
MB-0407 | a | CT/HT | 5.80 | 0 | 000000--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-0434 | d-d.s. | CT/HT | 3.74 | 1 | 01111111- | ------ | ------ | 001010 | ------ | ------ | 001111
MB-0489 | a | CT/HT | 5.35 | 0 | 100000--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-0524 | a | CT/HT | 6.64 | 0 | 1001000-- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-0529 | d | CT/HT | 3.99 | 1 | 11111101- | ------ | ------ | 001010 | ------ | ------ | 001111
MB-0559 | a | CT/HT | 9.86 | 0 | 0000001-- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-0609 | a | CT/HT | 0.99 | 0 | 0100000-- | ------ | ------ | 101011 | ------ | ------ | 101010
MB-0614 | a | CT/HT | 0.70 | 0 | 0000101-- | ------ | ------ | 111010 | ------ | ------ | 101011
MB-0899 | a | CT/HT | 4.53 | 0 | 0001010-- | ------ | ------ | 000010 | ------ | ------ | 010010
MB-4935 | a | CT/HT | 15.95 | 0 | 0111011-- | ------ | ------ | 000000 | ------ | ------ | 000001
MB-5299 | d-d.s. | CT/HT | 1.37 | 1 | 00010111- | ------ | ------ | 101010 | ------ | ------ | 101010
MB-6085 | a | CT/HT | 2.00 | 0 | 001000--- | ------ | ------ | 101010 | ------ | ------ | 101010
MB-6184 | a | CT/HT | 13.13 | 0 | 111001--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-6217 | a | CT/HT | 12.29 | 0 | 110101--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-7054 | d-d.s. | CT/HT | 2.50 | 1 | 110111--- | ------ | ------ | 101010 | ------ | ------ | 101010
MB-7057 | d-d.s. | CT/HT | 3.60 | 1 | 111110--- | ------ | ------ | 001010 | ------ | ------ | 011010
MB-7093 | a | CT/HT | 8.39 | 0 | 101000--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-7097 | d-d.s. | CT/HT | 5.35 | 1 | 111110--- | ------ | ------ | 000000 | ------ | ------ | 000000
MB-7163 | d-d.s. | CT/HT | 4.32 | 1 | 001110--- | ------ | ------ | 000010 | ------ | ------ | 000010
MB-7182 | d-d.s. | CT/HT | 4.06 | 1 | 111101--- | ------ | ------ | 000010 | ------ | ------ | 000010
MB-7194 | d-d.s. | CT/HT | 3.83 | 1 | 111110--- | ------ | ------ | 001010 | ------ | ------ | 001010
MB-7226 | d-d.s. | CT/HT | 4.38 | 1 | 111101--- | ------ | ------ | 000010 | ------ | ------ | 000010
MB-0006 | a | CT/HT/RT | 4.71 | 0 | --------- | ------ | ------ | ------ | ------ | ------ | ------
Dataset 1. Predicted treatment response for each individual METABRIC patient.
The predicted and expected responses to treatment for each individual METABRIC patient are indexed for each analysis listed in Table 1, Table 2 and Table 3. Patients sensitive to treatment are labeled ‘0’, while resistant patients are labeled ‘1’.
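Given the 0/1 coding described above, observed and predicted labels can be compared directly. The Python sketch below is illustrative only: the function name, the use of '-' as a missing-value marker to skip, and the example label strings are assumptions made for this example and are not taken from Dataset 1.

```python
def score_predictions(observed, predicted):
    """Compare two label strings ('0' = sensitive, '1' = resistant).

    Positions where either string has a value other than '0' or '1'
    (for example '-') are treated as missing and skipped.
    """
    pairs = [
        (o, p)
        for o, p in zip(observed, predicted)
        if o in ("0", "1") and p in ("0", "1")
    ]
    if not pairs:
        raise ValueError("no positions with both an observed and a predicted label")

    correct = sum(o == p for o, p in pairs)
    resistant = [p for o, p in pairs if o == "1"]
    sensitive = [p for o, p in pairs if o == "0"]
    return {
        "n": len(pairs),
        "accuracy": correct / len(pairs),
        # Fraction of observed-resistant patients predicted resistant.
        "resistant_recall": resistant.count("1") / len(resistant) if resistant else None,
        # Fraction of observed-sensitive patients predicted sensitive.
        "sensitive_recall": sensitive.count("0") / len(sensitive) if sensitive else None,
    }


# Hypothetical labels for six patients; the fifth has no value.
toy_scores = score_predictions("1100-1", "1000-1")
```

Reporting the per-class fractions alongside overall accuracy helps when one outcome class is much larger than the other, since accuracy alone can then look high for a classifier that mostly predicts the majority class.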

Table 1. SVM gene expression signature performance on METABRIC patients.

Patient treatment | # of patients | Agent: final gene signature | Accuracy (%)
--- | --- | --- | ---
CT and CT/HT combination (without radiation therapy)^1 | 84 | Paclitaxel: ABCC1, ABCC10, BAD, BIRC5, FN1, GBP1, MAPT, SLCO1B3, TMEM243, TUBB3, TUBB4B | 78.6
 | | Tamoxifen: ABCC2, ALB, CCNA2, E2F7, FLAD1, FMO1, NCOA2, NR1I2, PIAS4, SULT1E1 | 76.2
 | | Methotrexate: ABCC2, ABCG2, CDK2, DHFRL1 | 71.3
 | | Epirubicin: ABCB1, CDA, CYP1B1, ERBB3, ERCC1, MTHFR, PON1, SEMA4D, TFDP2 | 72.6
 | | Doxorubicin: ABCC2, ABCD3, CBR1, FTH1, GPX1, NCF4, RAC2, TXNRD1 | 75.0
 | | 5-Fluorouracil: ABCB1, ABCC3, MTHFR, TP53 | 71.4
CT and/or HT^1,2,3,4 | 735 | Paclitaxel: BAD, BCAP29, BCL2, BMF, CNGA3, CYP2C8, CYP3A4, FGF2, FN1, NFKB2, NR1I2, OPRK1, SLCO1B3, TLR6, TUBB1, TUBB3, TUBB4A, TUBB4B, TWIST1 | 66.1
Deceased only^2,4,5 (CT and/or HT) | 327 | Paclitaxel: ABCB11, BAD, BBC3, BCL2, BCL2L1, BIRC5, CYP2C8, FGF2, FN1, GBP1, MAPT, NFKB2, OPRK1, SLCO1B3, TMEM243 | 75.2
No treatment^1 | 304 | Paclitaxel: ABCB1, ABCB11, BBC3, BCL2L1, BMF, CYP3A4, FGF2, GBP1, MAP4, MAPT, NR1I2, OPRK1, SLCO1B3, TUBB4A, TUBB4B, TWIST2 | 73.4

Initial gene sets preceding feature selection: Paclitaxel - ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCAP29, BCL2, BCL2L1, BIRC5, BMF, CNGA3, CYP2C8, CYP3A4, FGF2, FN1, GBP1, MAP2, MAP4, MAPT, NFKB2, NR1I2, OPRK1, SLCO1B3, TLR6, TUBB1, TWIST1. Tamoxifen - ABCB1, ABCC2, ALB, C10ORF11, CCNA2, CYP3A4, E2F7, F5, FLAD1, FMO1, IGF1, IGFBP3, IRS2, NCOA2, NR1H4, NR1I2, PIAS4, PPARA, PROC, RXRA, SMARCD3, SULT1B1, SULT1E1, SULT2A1. Methotrexate - ABCB1, ABCC2, ABCG2, CDK18, CDK2, CDK6, CDK8, CENPA, DHFRL1. Epirubicin - ABCB1, CDA, CYP1B1, ERBB3, ERCC1, GSTP1, MTHFR, NOS3, ODC1, PON1, RAD50, SEMA4D, TFDP2. Doxorubicin - ABCB1, ABCC2, ABCD3, AKR1B1, AKR1C1, CBR1, CYBA, FTH1, FTL, GPX1, MT2A, NCF4, RAC2, SLC22A16, TXNRD1. 5-Fluorouracil - ABCB1, ABCC3, CFLAR, IL6, MTHFR, TP53, UCK2.

1 Surviving patients; 2 Analysis included patients in the METABRIC ‘discovery’ dataset only; 3 SVMs tested with 9-fold cross-validation, all others tested with leave-one-out cross-validation; 4 Includes all patients treated with HT, CT, or combination CT/HT, either with or without combination radiotherapy; 5 Median time after treatment until death (> 4.4 years) was used to distinguish favorable outcome, i.e., sensitivity to therapy.

Table 2. Results of applying RF to predict outcome of paclitaxel therapy.

Type of treatment | Survival years (as threshold) | # Patients | Accuracy (%) | AUC^1
--- | --- | --- | --- | ---
Chemotherapy (CT) | 3 | 53 | 56.6 | 0.534
 | 4 | | 60.4 | 0.645
 | 5 | | 58.5 | 0.645
Hormone therapy (HT) | 3 | 420 | 82.4 | 0.641
 | 4 | | 77.4 | 0.555
 | 5 | | 70.7 | 0.648
CT and/or HT | 3 | 504 | 79.6 | 0.564
 | 4 | | 73.6 | 0.527
 | 5 | | 63.5 | 0.563

Final gene expression signature (single cell spanning all rows of the table): ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, TUBB4B

1 AUC: area under the receiver operating characteristic curve; both Discovery and Validation patient datasets were analyzed.
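For readers unfamiliar with AUC, it can be computed as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The Python sketch below illustrates that definition; treating "resistant" (label 1) as the positive class and the example scores are assumptions made for illustration, and this is not the routine used to produce the tabulated values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 is taken as the positive class).
    scores: classifier scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case from each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four patients (two resistant, two sensitive).
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])
```

An AUC near 0.5 means the scores rank the two classes no better than chance, which is why accuracy and AUC can disagree when the classes are imbalanced.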

Table 3. Results of mRMR feature selection for an SVM for predicting outcome of paclitaxel therapy.

Type of treatment | Survival years (as threshold) | Number of patients^1 | Gene signature selected by mRMR | Accuracy (%) | AUC
--- | --- | --- | --- | --- | ---
Chemotherapy (CT) | 3 | 53 | BCL2, TWIST1 | 73.6 | 0.724
 | 4 | | ABCB11, ABCC1, BAD, BBC3, BCL2L1 | 79.2 | 0.793
 | 5 | | ABCB11, BAD, CYP2C8, CYP3A4, MAP2, MAPT, FGF2 | 77.4 | 0.759
Hormone therapy (HT) | 3 | 420 | ABCB11, ABCC1, BAD, BCL2, CYP2C8, CYP3A4, MAP4, MAPT, NR1I2, TUBB1, GBP1, OPRK1 | 84.0 | 0.532
 | 4 | | BBC3, MAP4, FGF2, OPRK1 | 79.3 | 0.519
 | 5 | | ABCC10, MAPT, TUBB1 | 72.6 | 0.520
CT and/or HT | 3 | 504 | ABCB11, ABCC1, BBC3, BCL2, BCL2L1, CYP2C8, CYP3A4, MAP2, MAPT, TLR6, TWIST1 | 74.6 | 0.565
 | 4 | | ABCB11, BAD, BCL2, CYP3A4, MAP2, MAP4, NR1I2, OPRK1, TWIST1 | 75.0 | 0.535
 | 5 | | TWIST1, BCL2, BMF, CYP2C8, CYP3A4, BCL2L1, BBC3, TLR6, BAD, MAP4, NR1I2, GBP1, NFKB2 | 65.9 | 0.525

1 Predicted treatment responses for individual METABRIC patients using the described ML techniques are provided in Dataset 1.

The performance of several ML techniques that distinguish paclitaxel sensitivity from resistance in METABRIC patients has been compared using the cohort's tumour gene expression datasets. SVMs generated gene signatures indicating which genes are important for treatment response in METABRIC patients. These models are more accurate for prediction of outcomes in patients receiving HT and/or CT compared to other patient groups.

SVMs and RF were trained using expression of genes associated with paclitaxel response, mechanism of action and stable genes in the biological pathways of these targets (Figure 3). SVM models for drugs used to treat these patients were derived by backwards feature selection on patient subsets stratified by treatment or outcome (Table 1). The highest SVM accuracy was found for the paclitaxel signature in patients treated with HT and/or adjuvant chemotherapy (78.6%).


Figure 3.

Schematic elements of gene expression changes associated with response to paclitaxel. Red boxes indicate genes with a positive correlation between gene expression or copy number, and resistance using multiple factor analysis. Blue demonstrates a negative correlation. Genes outlined in dark grey are those in a previously published paclitaxel SVM model (reproduced from reference 1 with permission).

The RF classifier was used to predict paclitaxel therapy outcome for patients that underwent CT and/or HT (Table 2). The best performance achieved with RF showed 82.4% overall accuracy using a 3-year survival threshold for distinguishing therapeutic resistance vs. sensitivity.

Among patients treated with CT and/or HT, the best overall accuracy and AUC (sensitivity and specificity) using mRMR feature selection for an SVM predicting outcome of paclitaxel therapy were obtained for CT patients with a 4-year survival threshold. Outcomes for HT patients with a 3-year survival threshold were predicted with 84% accuracy; however, the specificity was lower in this group. SVM combined with mRMR further improved the accuracy of feature selection and of prediction of response to hormone and/or chemotherapy based on survival time, relative to either SVM or RF alone.

While not a replication study sensu stricto, the initial paclitaxel gene set used for feature selection was the same as in our previous study [1]. Predictions for the METABRIC patient cohort, which was independent of the previous validation set [5], using either the same (SVM) or different ML methods (RF and SVM with mRMR) exhibited comparable or better accuracies than our previous gene signature [1].

These techniques are powerful tools which can be used to identify genes that may be involved in drug resistance, as well as predict patient survival after treatment. Future efforts to expand these models to other drugs may assist in suggesting preferred treatments in specific patients, with the potential impact of improving efficacy and reducing duration of therapy.

Data availability

Patient data: The METABRIC datasets are accessible from the European Genome-Phenome Archive (EGA) using the accession number EGAS00000000083 (https://www.ebi.ac.uk/ega/studies/EGAS00000000083). Normalized patient expression data for the discovery (EGAD00010000210) and validation sets (EGAD00010000211) were retrieved with permission from EGA. Corresponding clinical data were obtained from the literature [6]. While not individually curated, HT patients were treated with tamoxifen and/or aromatase inhibitors, while CT patients were most commonly treated with cyclophosphamide-methotrexate-fluorouracil (CMF), epirubicin-CMF, or doxorubicin-cyclophosphamide.

F1000Research: Dataset 1. Predicted treatment response for each individual METABRIC patient, 10.5256/f1000research.9417.d133983 [10]

how to cite this article
Rezaeian I, Mucaki EJ, Baranova K et al. Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning [version 1; peer review: 2 approved with reservations]. F1000Research 2016, 5:2124 (https://doi.org/10.12688/f1000research.9417.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Reviewer Report 03 Oct 2016
Chun-Wei Tung, School of Pharmacy, Kaohsiung Medical University, Kaohsiung, Taiwan 
Approved with Reservations
This study proposed prediction methods using SVM and RF classifiers with mRMR selected feature sets from cell line data and demonstrate its prediction ability for outcomes from METABRIC patient cohort. The classifiers with good prediction performance show the usefulness of ...
HOW TO CITE THIS REPORT
Tung CW. Reviewer Report For: Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning [version 1; peer review: 2 approved with reservations]. F1000Research 2016, 5:2124 (https://doi.org/10.5256/f1000research.10141.r16345)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Jan 2017
    Peter Rogan, Department of Biochemistry, University of Western Ontario, London, Canada
    27 Jan 2017
    Author Response
    Comment 1:What are the values of parameters for SVM and RF classifiers and the methods for parameter selection (by default or other selection methods)?

    Response: The parameter values for ...
Reviewer Report 30 Sep 2016
Elana J. Fertig, Division of Oncology Biostatistics and Bioinformatics, School of Medicine, Johns Hopkins University, Baltimore, MD, USA 
Approved with Reservations
This study develops SVM and RF algorithms built upon previously learned gene signatures of therapeutic response to breast cancer. The algorithms are applied and compared to predict patient survival under different treatment conditions in METABRIC data. The analyses and comparisons ...
HOW TO CITE THIS REPORT
Fertig EJ. Reviewer Report For: Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning [version 1; peer review: 2 approved with reservations]. F1000Research 2016, 5:2124 (https://doi.org/10.5256/f1000research.10141.r16733)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Jan 2017
    Peter Rogan, Department of Biochemistry, University of Western Ontario, London, Canada
    27 Jan 2017
    Author Response
    Comment 1: The methods require further clarification to distinguish differences between this study and the previous study as well as the parameters of the machine learning algorithms.
     
    Response: The ...