Elucidating genomic gaps using phenotypic profiles

Advances in genomic sequencing provide the ability to model the metabolism of organisms from their genome annotation. The bioinformatics tools developed to deduce gene function through homology-based methods are dependent on public databases; thus, novel discoveries are not readily extrapolated from current analysis tools with a homology dependence. Multi-phenotype Assay Plates (MAPs) provide a high-throughput method to profile bacterial phenotypes by growing bacteria in various growth conditions, simultaneously. More robust and accurate computational models can be constructed by coupling MAPs with current genomic annotation methods. is an PMAnalyzer online tool that analyzes bacterial growth curves from the MAP system which are then used to optimize metabolic models during growth simulations. in silico Using as a prototype, the Rapid Annotation using Citrobacter sedlakii Subsystem Technology (RAST) tool produced a model consisting of 1,367 enzymatic reactions. After the optimization, 44 reactions were added to, or modified within, the model. The model correctly predicted the outcome on 93% of growth experiments. 1 2,6 3 3


Introduction
Recent advancements in genomic sequencing provide high quality, deep-coverage DNA sequences for tens of thousands of bacterial genomes.To manage this breadth of data, online tools such as RAST (http://rast.nmpdr.org/) 1 leverage the SEED database 2 by using homology and genomic context to determine gene functions encoded in the DNA sequences.This automated annotation service additionally generates a raw metabolic reconstruction of the genome for use in in silico experiments.Genome-scale metabolism analyses use these reconstructions as input into data environments such as KBase, the Department of Energy Systems Knowledgebase (http://kbase.us).Several hypotheses can be tested simultaneously, e.g., protein function identification, biological behavior simulations, and metabolic network comparisons 3 .
Metabolic models are defined by the chemical reactions that characterize the vast metabolic network of an organism.Flux-balance analysis (FBA) uses these chemical reactions to provide understanding of the physiological capacity of the cell 4 .Mathematically, the stoichiometry of metabolic networks is represented by a two-dimensional numerical matrix, in which the values are the stoichiometric coefficients of the reactants and products.Each row and column in the matrix is associated with a metabolite and a metabolic reaction, respectively.For one stoichiometric reaction, the products of the reaction are given positive integers, the reactants are given negative integers, and non-associated metabolites are given zeros.Through a constraint-based approach, the FBA algorithm uses linear programming techniques to solve this system of stoichiometric coefficients, optimizing for biomass production or another objective function [4][5][6] .
The amount of published metabolic reconstructions for prokaryotic and eukaryotic organisms has increased over the past decade 3,7 .Through the increased use of next-generation sequencing and automated annotation software, metabolic models for new organisms are arising and older models are continuously being reconciled.However, a drawback to RAST and other gene annotation algorithms is the dependency on previous functional annotations.The breadth and quality of annotated functions vary among and across bacterial species, which is not accumulating as quickly as new sequences.Automatic generation of metabolic models is limited by our knowledge of cellular metabolism and biochemistry.In addition, an existing problem with gene databases is the inconsistent nomenclature used to name and define the function of a gene.Separate databases hold slightly different annotations for the same gene, which propagates into downstream tools, leading to a loss of information in analyses and mis-annotations in models built during the initial reconstruction.To bridge the gap between quality genome annotations and accurate metabolic models novel methods are needed to supplement the reconciliation process.
Multi-phenotype Assay Plates (MAPs) provide a system to quantitatively monitor microbial growth while qualitatively deducing the metabolic capabilities of a microbe across a range of conditions.The MAPs technology uses optical density to measure substrate utilization of a clonal microbial population.MAPs have been an advantageous tool in past phenotypic studies 3,[8][9][10][11][12][13][14][15][16][17][18][19][20] .Using MAPs we can measure bacterial growth in defined conditions to aid in the validation of microbial genome annotation software.
In this study, a metabolic model of Citrobacter sedlakii is built using a workflow combining experimental data and computational analysis (Figure 1).The genome of C. sedlakii was sequenced, annotated, and a metabolic reconstruction was subsequently generated using RAST and the KBase platform.Growth of C. sedlakii was measured in 96 different growth conditions and the resulting data was introduced into a novel computational pipeline, PMAnalyzer.The PMAnalyzer automatically parameterizes raw growth data and fits a logistic model of bacterial growth for each experimental condition.Observed phenotypes from the MAP experiments were used to ground truth the genome-scale metabolic model by running FBA simulations, which identified disparities in the metabolic reconstruction.Disparities were observed to have either been missed due to RAST mis-annotations or sequencing.Here, we introduce a high-throughput workflow to obtain large-scale metabolic reconstructions and reconciliations with observed growth phenotypes.

Acquisition of Citrobacter sedlakii and MAP (Multi-phenotype Assay Plate) preparation
The C. sedlakii isolate (ATCC 51115, CDC 4696-86) was provided by Dr. Marlene DeMers in the Department of Biology at San Diego State University.A glycerol frozen stock of the sample was plated on trypticase soy agar (Becton, Dickenson and Company) and incubated at 37°C for 24 hrs.A single colony was inoculated into 3 mL of trypticase soy broth (Becton, Dickenson and Company) and incubated, with shaking, at 37°C for 24 hrs.500 µL of the overnight culture was mixed with 500 µL of 30% (weight/volume) filter sterilized glycerol and transferred to a cryogenic vial (Fisher Scientific) for storage at -80°C.
C. sedlakii was grown overnight at 37°C for 24 hrs on 50% Luria-Bertani (LB) agar (Fisher Scientific) from a frozen glycerol stock.Three independent colonies, biological triplicates, were inoculated into 3 mL of a modified 3-morpholinopropane-1-sulfonate (MOPS) broth 21 (1X MOPS (40 mM MOPS + 10 mM Tricine), 0.4% glycerol, 9.5 mM NH 4 Cl, 0.25 mM NaSO 4 , 1.0 mM MgSO 4 , 1.32 mM K 2 HPO 4 , 10 mM KCl, 0.5 µM CaCl 2 , 5 mM NaCl, and 6 µM FeCl 3 ) and incubated for 24 hrs at 37˚C with agitation.Overnight cultures were centrifuged using an Eppendorf Centrifuge 5418R at maximum speed (14,000 rpm) to pellet cells and washed with 500 µL of 10 mM Tris/10 mM MgSO 4 buffer, twice.Cells were re-suspended in 1 mL of 10 mM Tris/10 mM MgSO 4 buffer and optical density at 600 nm (OD 600 ) was measured using a Beckman Coulter DU 640 spectrophotometer.All suspensions were concentrated to achieve a final optical density of approximately OD 600 = 0.1 after a fifteen fold dilution.10 µL of concentrated cells was transferred into each well of a sterile 96 well, micro-titer plate (Grenier Biosciences), which contained 60 µL of sterile water, 50 µL of 3X MOPS basal media, and 30 µL of 5X substrate (Supplementary Figure 1).Each plate was sealed with PCR grade plate film (Sigma SealPlate ® film) and incubated on a Molecular Devices Analyst GT multi-plate plate reader (Molecular Devices, LLC.).Plate reader was programmed to incubate MAPs at 37°C and measure OD 600 every 30 min, for a total of 32 hrs.Absorbance data was saved, extracted as a text file, and uploaded to the project website for data storage (http://vdm.sdsu.edu/).MOPS basal media is derived from the culture media provided by Neidhardt et al. 21, and contains 1X MOPS (40 mM MOPS + 10 mM Tricine), 0.4% glycerol*, 9.5 mM NH 4 Cl*, 0.25 mM NaSO 4 *, 1.0 mM MgSO 4 *, 1.32 mM K 2 HPO 4 *, 10 mM KCl, 0.5 µM CaCl 2 , 5 mM NaCl, 6 µM FeCl 3 .Media was prepared with sterile Milli-Q water (Milli-Q Integral Water Purification Systems, EMD Millipore) and subsequently filter sterilized using 0.22 µm Sterivex filter unit (Millipore, Inc).(*These compounds are not included depending on the basal media.For example, 0.4% glycerol is not used in the carbon basal media, while 1.0 mM MgSO 4 is replaced by 1.0 mM MgCl 2 in the sulfur basal media).
Nutrient substrates were prepared by dissolving 1.25% (w/v) of the solid compound in sterile Milli-Q water and filter sterilized with a 0.22 µm Sterivex filter unit (Millipore, Inc).Substrate stocks were stored at 5X concentrations at room temperature in sterile conical tubes.Supplementary Figure 1 contains a detailed mapping of the substrates used in the MAPs.

Sequencing and metabolic reconstruction of C. sedlakii
As part of a DNA sequencing class at San Diego State University 26 the C. sedlakii 119 genome was sequenced using 454 pyrosequencing with the GS Junior platform and assembled with Newbler version 2.7.RAST (http://rast.nmpdr.org/)was used for subsystem annotations and metabolic reconstructions 1 .Annotations were imported into the KBase environment where the metabolic model was viewed, manipulated, and used in flux-balance analysis (FBA) simulations.FBA was used to determine if the model bacteria is successful in ascertaining growth in specific conditions.The KBase command kbfba-importfbamodel was used to import the annotations into the Citrobacter_sedlakii_119 workspace and was named C.sedlakii_nogapfill.The Citrobacter_sedkakii_119 workspace and its objects are freely accessible to anyone.Using the KBase command kbfba-gapfill, initial gap-filling was performed on the model while specifying Luria-Bertani (LB) as the growth condition (the default ArgonneLBMedia formulation was used).The reconciled model was named C.sedlakii_ArgonneLB_gapfill and is provided in the workspace.The LB gap-filled model created a representative model that fulfills the general requirements needed to utilize a rich media source for growth.

PMAnalyzer pipeline
The high-throughput analysis pipeline described below was executed in a Linux command-line environment and was developed using several programming languages, including bash, Perl version 5.16 (http://www.perl.org/),and Python version 3.4.1 (http:// www.python.org/).Perl scripts were written to parse and format the MAP raw data files into the tab-delimited intermediate files.The primary analysis script (Python) used these intermediate files for modeling the growth curves.For ease of execution, a single bash program was created as a wrapper script that executes the parsing and analysis scripts as a cohesive, automated pipeline.Commandline arguments or a configuration file was used for user input and settings.All scripts are freely accessible from a Git repository at https://github.com/dacuevas/PMAnalyzer.

Fitting a logistic model to absorbance data
Phenotypic responses were recorded by measuring the optical density at 600 nm (OD 600 ) over time, which quantitatively represents the bacterial biomass concentration at each time point.The OD 600 values are plotted to form the sigmoidal shape characteristic of bacterial growth curves.This characteristic curve, highlighted by Monod 22 and modeled by Zwietering et al. 23 , consists of three phases: lag, exponential, and stationary phases.Zwietering et al. 23 interprets these phases as parameters required to model growth, using his logistic equation where y 0 (OD 600 ) is the starting optical density, λ (hr) is the lag phase, μ (OD 600 •hr -1 ) is the maximum growth rate during the exponential phase, A (OD 600 ) is the asymptote of the growth curve representing the carrying capacity of the population, and t (hr) is time.
Supplementary Figure 2 provides a visual representation of a classical growth curve.
To parameterize the growth curves median values of the replicates were used.Python's NumPy module version 1.8.1 and SciPy module version 0.14.0 24 provides several functions for optimizing nonlinear, multivariate functions.In this case, the minimize function was used in order to denote bounds and constraints on each parameter.The default Broyden, Fletcher, Goldfarb, and Shanno (BFGS) algorithm 25 was used to minimize the sum of squared error between the logistic model from (1) and the raw data.As input, the algorithm requires estimations for each growth curve phase.The estimation for the asymptote was defined as the largest OD 600 reading from three consecutive time points (2) and the maximum growth rate was defined as the largest change in OD 600 over a 1.5 hrs window (3).The estimated lag time was set at 0.5 hr.

avg y y
The result from (2) was also used as the upper bound for the minimization function.Lag time and maximum growth rate were not given an upper bound.Lower bounds for the asymptote, maximum growth rate, and lag time were 0.01, 0, and 0, respectively.

Determining growth classifications
Each well in the MAPs has a varying level of growth, including different lag times, maximum growth rates, and asymptotes.A single value that represents the overall level of bacterial growth per well was generated by adapting the logistic model with the asymptote: Here, y logistic is the value from (1) at time i, and n is the number of data values used, which is the number of OD 600 measurements recorded during the experimental run.For the C. sedlakii MAP, n equals 64.The asymptote factor A, rather than the maximum growth rate, contributes to defining growth (i.e., growth levels from wells that achieve a higher biomass yield separate from those growth levels of wells that exhibit less growth, Supplementary Figure 3).In certain instances, growth curve models were fitted with a high maximum growth rate but did not display growth (e.g., potassium sorbate, L-valine, L-lysine, L-leucine, D-aspartic acid, and L-isoleucine).
Ultimately, (4) was implemented to distill each growth curve into a single boolean variable of growth (≥ 0.5) or no growth (< 0.5).

Model reconciliation using the MAPs
The kbfba-simpheno function in KBase executes multiple fluxbalance analysis (FBA) processes in parallel.To perform this, a text file listing information on which media condition to use as the input media in each process is required.Information regarding the MAPs result, i.e., growth or no growth, for each media condition is also listed in the text file.Digital representations of rich LB media and 90 different media compositions used in the MAPs were generated as media data objects in KBase.Each media object represents a specific condition used in the MAPs.kbfba-simpheno performs a separate FBA on each media object and compares the result to the MAPs result listed in the information text file.FBA results are labeled as: Correct Positive assertions (FBA and MAPs both display growth), Correct Negative assertions (FBA and MAPs both display no growth), False Positive assertions (FBA asserts growth, MAPs display no growth), and False Negative assertions (FBA asserts no growth, MAPs display growth).Gap-fillings were attempted for conditions associated with false negative assertions and gap-generations were attempted for instances of false positive assertions.FBA and reconciliation was performed on the LB condition first in order to identify and integrate missing reactions required for growth on general, rich media.Thereafter, using the base model capable of growing on LB, FBA and reconciliation was performed on the minimal media conditions to target missing reactions in specific metabolic pathways.The minimal set of reactions determined by gap-filling was integrated into the model using the KBase function kbfba-integratesolution.To verify that the integration of new reactions produces additional correct positives and correct negatives the multi-FBA simulation was re-executed.

RAST annotations
Reaction names for functions of genes may vary between databases, thus, causing mis-annotations by RAST when used alongside other tools.When identifying functions for metabolic reconstructions, this unclear nomenclature prevents reactions to appear in the metabolic model.To correct for this, following gap-filling, all missing reactions (excluding transporters and newly-modified bidirectional reactions) were cross-checked with the SEED to find similarly named reactions.The new list consisted of a mapping of gap-filled reaction names to possible alternative names.This list of reactions was then referenced back to the C. sedlakii RAST annotations in order to determine if the reaction was identified by RAST as the alternative name but not included in the metabolic model.When the search similar nomenclature did not resolve, a search for gap-filled reactions in closely related organisms was performed; i.e., Citrobacter koseri and E. coli K12.This consisted of Protein BLAST (blastp) searches of the reactions' sequences from the SEED database against C. koseri and E. coli.Gap-filled reactions that existed in these genomes were included in the C. sedlakii model with high confidence since closely related taxonomic groups contain common genetic material and function.
The genes encoding the missing reactions may be present in low quality DNA sequences or low coverage genomic regions.Following sequence assembly, these sequences are not present within the contigs, preventing RAST from annotating the proposed function.However, neighboring genes or protein complexes may be present and annotated, suggesting that the gene in question is there but was poorly sequenced.From within related organisms, the protein sequences of these complexes were identified and searched against the C. sedlakii genome.Finding matches for neighboring genes increases the confidence of including the reaction into the model.This method is also applicable to those genes that were not sequenced or that fall between assembled contigs.

Results
C. sedlakii 119 was assembled into 320 contigs containing 4,604,104 nucleotides with an N 50 of 28,039 bp.RAST annotated the genome as containing 4,035 protein encoding genes and 76 tRNA genes over 537 different subsystems (Figure 2).Hypothetical proteins consisted of 817 (~20%) of the protein coding sequences.Membrane transport features constituted 150 out of 3,031 subsystem features.RAST listed E. coli as a close neighbor and Citrobacter koseri as the closest Citrobacter neighbor.A BLASTn alignment (expected value of 10 -4 ) 27 against the C. koseri ATCC BAA-895 genome (NC_009792.1)resulted in a whole genome coverage of 69.8% (69.6% when including plasmids (NC_009793.1 and NC_009794.1).
Growth on MAPs of each biological triplicate was followed for 32 hrs on separate days.After completion, the OD 600 readings were processed through the PMAnalyzer pipeline (including data parsing, curve-fitting, and growth level analysis) in 8 seconds.Using a 0.5 growth level (4) cutoff and manual inspection of curves falling under the cutoff, 48 out of the 90 growth conditions -35 carbon-based media and 13 nitrogen-based media -exhibited growth (Figure 3 and Table 1).
The initial metabolic model generated in KBase (C.sedlakii_nogapfill) from RAST annotations contained 1,367 reactions and 1,277 substrates.An FBA simulation using this model and specifying LB as the media source resulted in no growth, however only eight reactions were added to simulate growth.These eight were back referenced to the RAST annotations to check for mis-annotations or functional annotations not included in the model.Using the base model, the 90 well simulation resulted in no growth for all 90 growth conditions (53.3% accuracy: 48 false negatives (FN)) (Table 1).Subsequently, gap-filling was performed resulting in the reconciled model (C.sedlakii_MOPS_simpheno).The total number of substrates increased to 1,301 (+24), the total number of reactions increased to 1,399 (+32), and nine reactions were modified to be bi-directional.When performing gap-filling on multiple conditions, KBase produced a separate solution for each condition.This results in reactions appearing in multiple solutions.To find the missing set of essential reactions, the gap-fill results were parsed to locate the minimum set of reactions present in the majority of the solutions.All new and modified reactions added to the model are listed in Table 2. Note: a number of reactions were present among the FN solutions; thus, including the reactions for the 13 FN conditions    As more bacterial genomes are processed through this pipeline, mis-annotations will occur at a lower rate.In the case of C. sedlakii, six reactions out of the 19 non-transport, newly added reactions were missed due to mis-annotations; five reactions possibly located inbetween contigs; eight not bearing homology with related organisms.Feedback from mis-annotations corrects the ambiguities for future bacteria metabolic reconstructions as administrators of KBase, RAST, and the SEED database are informed of these findings.Reactions that are gap-filled and not identified as mis-annotations will be recorded for future studies.Patterns of missing reactions provide insight into why RAST is not able to identify the functions from an organism's sequences.Furthermore, closer investigation can answer if the gene encoding the protein for a particular species (or genus) does not share strong homology to its closest evolutionary neighbor.

Missing transporter proteins
Thirteen out of the 32 essential reactions added during gap-filling were transporters.For specific conditions, the only missing reaction that prevented growth during the simulation was observed to be transport proteins specific for the growth condition (see Table 2).
Transporters are readily identified using homology-based searches, however, it is difficult to accurately identify which substrate(s) are actively transported using these techniques.A drawback of RAST is its dependence on sequence homology.Henry et al. 30 indicated that poorly annotated transporters are typically missing from preliminary reconstructions using the SEED database.During metabolic reconstruction in RAST, a minimal set of reactions are uniquely chosen through an optimization equation.The equation contains a penalty parameter that favors intracellular reactions over transporters during the auto-completion step of the model reconstruction, thus further preventing transporters from being included in the draft model.

FBA false positives
False positive (FP) results were introduced during the reconciliation process.A false positive was defined as an FBA resulting in biomass production in nutrient conditions where C. sedlakii is not able to produce biomass, i.e. the MAPs assert no growth.An FP results from reactions added to the model that enable biomass production.
The FP could also be occurring through bi-directional reactions that should actually be uni-directional.Regardless, both cases reflect an under-constrained model containing reactions with incorrect directionality or with reactions that do not belong.KBase function fba-gapgen attempts to correct these issues.To perform this function, parameters pertaining to the growth condition where the FP occurs and a growth condition where a correct positive occurs are required, allowing the algorithm to correct, or remove, reactions from the model without altering the outcome of the correct positive growth condition.In this case, the KBase software was unable to determine any reactions in our model to remove with the fba-gapgen function, suggesting that the model requires manual curation for these six FP growth conditions listed in Table 1.
Models created using the discussed pipeline should be considered draft metabolic models.While the physiological experiments provide insight into an organism's metabolic capabilities for substrate utilization and the subsequent biomass formation, the biochemical properties involved are not directly assessed.The gap-filling algorithm attempts to determine the minimal set of reactions required for growth for a specific condition but these are not considered finite decisions.Gene knockout experiments, extensive literature mining, and manual curation are required to enhance the models 31 .
However, these alternatives are neither high-throughput nor considered in this pipeline but are encouraged for further studies.Crossreferencing to closely related organisms can give insight into the validity of adding a reaction to the model 28 .
The high-throughput pipeline presented combines physiological data with genomic information to result in more accurate metabolic models.Using experimentation to validate model prediction and improve model capabilities has been shown previously 18,29 , however, these processes are not streamlined to be fast or robust.Our methodology implements speed and robustness at every level of the workflow.Using a multi-plate spectrophotometer, hundreds of different growth conditions targeting several metabolic pathways results in an expansive phenotypic profile.The PMAnalyzer pipeline executes rapidly and is integrated into an automated web server where users obtain phenotypic results quickly.DNA sequencing, annotation, and draft model construction are available in a day of using RAST and KBase.Finally, model reconciliation is performed in KBase with flux-balance analysis, gap-filling, and gap-generation commands.

Conclusion
The prevalence of complete and near-complete draft genomes is increasing as DNA sequencing becomes cheaper and more robust.Unique bacterial species are now being studied and their data is becoming more readily available for interpretation.We describe a process to combine DNA sequence data and phenotypic experiments to produce a programmatic metabolic model construct.Metabolic reconstructions are built using RAST genomic annotations by implementing PMAnalyzer, a high-throughput pipeline.Biochemical reactions not captured by homology-based algorithms are highlighted using flux-balance analysis and gap identification techniques hosted openly on the KBase platform.Citrobacter sedlakii was used as a model organism to describe, test, and critique the pipeline.FBA results using this model were shown to achieve a 93% prediction accuracy, an improvement from the initial model providing a 53% prediction accuracy.

Introduction
This section is generally well written, though I think the context for this type of analysis could be better described... that is, beyond mentioning simply KBase and RAST.Perhaps providing rationale for potential advantages of using this combination of systems rather than other options would be useful for the reader.
In the final paragraph, the term "ground truth" is used to describe the phenotype data with respect to the metabolic modeling that will be performed.Foreshadowing some comments later, what is the evidence that the phenotype data are actually correct for all conditions tested?

Methods
First, thank you for the detailed description of the methods.These are clearly written overall.Some detailed comments/suggestions.
Describing growth of the initial cells and cultures from glycerol stocks, please define the shaking parameter (rpm) and define "agitation".These parameters can be critical factors for reproduction of experiments and are often organism dependent.
Second full paragraph, you have defined the contents of 1X MOPS, but you also define it two paragraphs later in the context of the recipe for the basal media.This first instance could be removed and simply refer to the recipe later described.
In the plate format for growing the cells, you indicate that plates are sealed with a PCR grade Question: F1000Research 1.

3.
In the plate format for growing the cells, you indicate that plates are sealed with a PCR grade Question: plate film.What does this do to the aerobic/anaerobic state of each well?Is there any opportunity for gas exchange during incubation?Also, is there any shaking going on during incubation on the plate reader?It might be worth mentioning about caveats of usage of carbon/nitrogen sources being limited to these conditions, which aren't exactly known.Also, the modeling could be impacted by the aerobic/anaerobic status of the environment.Was modeling performed under both conditions?Would this impact the accuracy of the modeling results?I note the storage condition was at room temperature.Are all of the substrates stable at room temperature?How long would each stock be stored prior to use for replicates?
In the "Sequencing and metabolic reconstruction..." section: The sentence that starts with "FBA was used to determine...".This is a confusing sentence.What is meant by this?
The KBase workspace, Citrobacter_sedlakii_119, does not appear to exist in the current public release of KBase (as of March 25, 2015).I also am unable to find any FBA model objects searching for various forms of and .There are not an public narratives that Citrobacter sedlakii would match the series of commands that you describe as being run and as freely accessible.This needs to be corrected, likely by building a public narrative in the current system.

Are the named commands for KBase still valid in the current production version of the system? It would be useful to include what apps and methods correspond to these commands. A public narrative in the current version of KBase would make this study replicable and easily transferred to other model systems.
The github page for the PMAnalyzer software is good... to the point, clear.
The explanation of the logistic model and absorbance data is also clear.In the description of the growth value, you end by stating that this is boiled down to a boolean growth/no growth status for each condition.I understand why this is done, given that the model reconciliation with growth phenotypes is occurring on a boolean level, but how much information is being lost by making this experimental design decision?The nature of the growth can be very important for understanding how the organism is behaving in an environment.The more immediate consequence of this decision is in the interpretation of False Negatives by the model (where the phenotype assay says "growth" and the model says "no growth").How many of the false negatives had values near the 0.5 cutoff?The allantoin example could be a growth case like this ( = 0.529, from curve_logistic_parameters.csv).The growth curve growth asymptote appears to be near 0.25 (Fig. 2a).This is very similar to values that are considered "no growth" phenotypes.Does it make sense to have the model gap fill three reactions in this case?Table 1 might be made more complete by adding a column for the value for each growth condition rather than that sitting in the supplemental data files (alternatively, highlight in the text that these values are given in that file).Related to this, in Supplemental Figure 3, you could highlight the point that represents allantoin.It would also be useful to highlight the water, negative control in the Supp.Fig. 3.
This brings me to questions about how confidence in the value is determined (if at all).I see that growth the median value of the biological replicates is used to determine the y logistic and thus the value.growth I also see that standard error is indicated in Fig. 2a graphs.However, this does not allow for statistical evaluation of the .growth Would it be better to calculate for each replicate independently growth and then determine an average value with error around these?Perhaps there is a better growth statistical approach.In any case, this comes back to being able to state some confidence in statistical approach.In any case, this comes back to being able to state some confidence in Please define "sse" in these values to aid interpretation of potentially borderline cases.the curve_logistic_parameters.csv file.
In the "RAST annotations" section, last paragraph.How does this fit in with the gap filling process for the model?Is the context information in close genomes actually used in the gap fill process, or is it a post hoc attribution of higher confidence to the gap fills that are included in the model?

Results
The statistics on the genome assembly are worse than I would expect to see.In particular, is the coverage based on alignment by blastn to a reasonable number?I can't quickly evaluate if this is typical of C. koserii different genomes.How does a low coverage (~70%) affect the outcome of Citrobacter presence/absence of genes in the annotation and subsequent modeling process.In reading the results, it appears that the majority of reactions in the network are identified, but it may be worth addressing this explicitly.
I note that you used manual inspection of growth curves just under the 0.5 cutoff... this is another area in which a statistical confidence in that value might help.If this is to become truly high throughput, manual inspection becomes untenable except in a few cases.
In the description of gap filling reactions for complex media, you mention that EC 2.2.1.7 was not found, but is likely due to a frameshift error.Was any follow up sequencing or PCR performed to confirm the error or presence/absence of the gene in ?Or even just a blastx analysis of the region?C. sedlakii In general, when you are discussing the evaluation of annotations in RAST, figures (supplemental) of key genomic regions would help the reader to evaluate the statements being made.
In the paragraph beginning with, "Using the base model, the 90 well simulation resulted in...", you have a sentence that starts with, "Note:".This is a confusing sentence and structure.What are you trying to point out here?What reactions are being referred to?What percentage of gap filled reactions are transport reactions?Stating this clearly would improve clarity.
The last statement in the Results section focuses on false positive conditions.Do you have any thoughts as to why these are coming up as FP?There is no follow up in the discussion about this.Are they central in a network of reactions, are they dual use reactions, etc.?

RAST annotations and gap-filled reactions section:
This section would also benefit from a supplemental figure that serves as an example of what is being discussed (also mentioned above).
What is the connection between KBase, RAST and SEED?How does updating in one affect the others?This question gets at an assumption in the text that the relationships among systems are known to the reader.The text could be clarified, or key references added.

FBA false positives section:
Please expand to include more specific discussion of the 6 FP reactions identified at the end of What are the values for these?Are any of them borderline? the Results section.growth

Last paragraph:
It would be good to quantify what "several" means with respect to the number of metabolic pathways being targeted.
"…available in a day of using RAST and KBase."This sentence implies that sequencing, annotation, and model reconstruction can happen in a single day.This should refer only to the use of sequence data.Also, there in no mention of the phenotype data here in this context.I think it would be better to highlight that the system allows the user to produce a reasonably robust metabolic model quickly, giving more opportunity for in depth analysis of discrepancies and manual curation of the model given the phenotypic data.
What is the link to the web service for the PMAnalyzer?

Points to Address
I've bolded several items in the above format for this review that I would consider to be major points to address and would make the manuscript stronger.Given that this is a methods paper, it is imperative that others can reproduce the work and/or employ the approach in other organism systems.Please update methods as requested above, paying particular attention to the KBase functionality and workflow.
I have read this submission.I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Non-financial competing interests include: I have worked as a collaborator with Competing Interests: several of the authors on projects related to the SEED, RAST and KBase.I have not worked directly with the lead author.Financial competing interests: None to declare.In the final paragraph, the term "ground truth" is used to describe the phenotype data with respect to the metabolic modeling that will be performed.Foreshadowing some comments later, what is the evidence that the phenotype data are actually correct for all conditions tested?
The 96-well microtiter plate technology has been established and used in many Responseresearch studies to date, some of which have been cited in this manuscript; thus, the efficacy of the data has been verified.The phenotype response taken from these growth curves is identifying growth or no growth of the bacteria in specific minimal media conditions.Classifying growth for a given sample can be difficult because there is not an established rule that translates quantitative optical density measurements into a qualitative growth/ no growth response.When a growth curve displays an ambiguous phenotype, such as described in the manuscript with the allantoin-based F1000Research displays an ambiguous phenotype, such as described in the manuscript with the allantoin-based condition, other laboratory techniques can be used to offer a more precise answer.

Methods
Describing growth of the initial cells and cultures from glycerol stocks, please define the shaking parameter (rpm) and define "agitation".These parameters can be critical factors for reproduction of experiments and are often organism dependent.
The 250 rpm shaking speed has been added to the manuscript.

Response -
In the plate format for growing the cells, you indicate that plates are sealed with a PCR grade plate film.What does this do to the aerobic/anaerobic state of each well?Is there any opportunity for gas exchange during incubation?Also, is there any shaking going on during incubation on the plate reader?It might be worth mentioning about caveats of usage of carbon/nitrogen sources being limited to these conditions, which aren't exactly known.
The PCR grade plate film still allows gas exchange to occur, and shaking does occur Responseon the plate reader.This has been clarified in the methods section of the manuscript.
Also, the modeling could be impacted by the aerobic/anaerobic status of the environment.Was modeling performed under both conditions?Would this impact the accuracy of the modeling results?
Flux-balance analysis was performed with oxygen exchange occurring in the Responsemetabolic model.I note the storage condition was at room temperature.Are all of the substrates stable at room temperature?How long would each stock be stored prior to use for replicates?Yes, all substrates are stable at room temperature.Substrate stocks were prepared Responseon a weekly basis.
The sentence that starts with "FBA was used to determine...".This is a confusing sentence.What is meant by this?
Flux-balance analysis answers the question: with the given genome-scale metabolic Responsemodel and the nutrients present in the environment, does the genome-scale metabolic model contain biochemical reactions that will intake the nutrients and create the necessary biomass components for cellular growth?Thus, FBA was used here to determine if the model is capable of growth in the same conditions as the MAPs.
The KBase workspace, Citrobacter_sedlakii_119, does not appear to exist in the current public release of KBase (as of March 25, 2015).I also am unable to find any FBA model objects searching for various forms of Citrobacter and sedlakii.There are not any public narratives that would match the series of commands that you describe as being run and as freely accessible.This F1000Research would match the series of commands that you describe as being run and as freely accessible.This needs to be corrected, likely by building a public narrative in the current system.
The SBML files for the draft model and gap-filled models have been provided as Responsesupplementary material.
Are the named commands for KBase still valid in the current production version of the system?It would be useful to include what apps and methods correspond to these commands.A public narrative in the current version of KBase would make this study replicable and easily transferred to other model systems.
At this time KBase does not use the IRIS system to perform genome-scale metabolic Responsemodelling.KBase now uses the Narrative graphical workflow to perform the same functions using the same data types.KBase has released publicly available narratives that describe these workflows (e.g., https://narrative.kbase.us/#appcatalog/app/fba_tools/build_metabolic_model/release).
In the description of the growth value, you end by stating that this is boiled down to a boolean growth/no growth status for each condition.I understand why this is done, given that the model reconciliation with growth phenotypes is occurring on a boolean level, but how much information is being lost by making this experimental design decision?The nature of the growth can be very important for understanding how the organism is behaving in an environment.The more immediate consequence of this decision is in the interpretation of False Negatives by the model (where the phenotype assay says "growth" and the model says "no growth").How many of the false negatives had growth values near the 0.5 cutoff?The allantoin example could be a case like this (growth = 0.529, from curve_logistic_parameters.csv).The growth curve asymptote appears to be near 0.25 (Fig. 2a).This is very similar to values that are considered "no growth" phenotypes.Does it make sense to have the model gap fill three reactions in this case?Table 1 might be made more complete by adding a column for the growth value for each condition rather than that sitting in the supplemental data files (alternatively, highlight in the text that these values are given in that file).Related to this, in Supplemental Figure 3, you could highlight the point that represents allantoin.It would also be useful to highlight the water, negative control in the Supp.Fig. 3.
Supplementary Figure 4 has been generated to show the different growth levels in Responseterms of the 0.5 growth level cutoff.
Would it be better to calculate growth for each replicate independently and then determine an average growth value with error around these?Perhaps there is a better statistical approach.In any case, this comes back to being able to state some confidence in these values to aid interpretation of potentially borderline cases.
Yes, this does indeed provide some statistical evidence of the growth level and each Responseof the other growth parameters.Although the updated PMAnalyzer pipeline now does this order of analysis, it did not affect the results to this experiment; thus, the results were not altered in terms of identifying growth and no growth conditions.
"SSE" refers to the sum-squared error calculated between the logistic fitted growth Responsemodel and the OD measurements.
In the "RAST annotations" section, last paragraph.How does this fit in with the gap filling process for the model?Is the context information in close genomes actually used in the gap fill process, or is it a post hoc attribution of higher confidence to the gap fills that are included in the model?
This refers to post hoc, manual efforts made after the gap-filling process.

Results
The statistics on the genome assembly are worse than I would expect to see.In particular, is the coverage based on alignment by blastn to C. koserii a reasonable number?I can't quickly evaluate if this is typical of different Citrobacter genomes.How does a low coverage (~70%) affect the outcome of presence/absence of genes in the annotation and subsequent modeling process.In reading the results, it appears that the majority of reactions in the network are identified, but it may be worth addressing this explicitly.
The genome alignment to were meant to paint a picture of the similarity of its Response -C.koseri DNA sequence to the , which might lend insight into why some of the genes were not C. sedlakii identified with a functional role.This affects the presence of functional roles as many of those putative genes are not assigned any role, thus leaving gaps in our metabolic model.
In the paragraph beginning with, "Using the base model, the 90 well simulation resulted in...", you have a sentence that starts with, "Note:".This is a confusing sentence and structure.What are you trying to point out here?What reactions are being referred to?
Here I am pointing out that through gap-filling for only the 13 false negative conditions Responselisted in Table 2, the other 35 false negative conditions were corrected, i.e., all 48 conditions where FBA asserted false negative results now assert true positive results.This has been clarified in the updated manuscript.

What percentage of gap filled reactions are transport reactions? Stating this clearly would improve clarity.
This 46% has been added to the manuscript.

Response -Discussion
What is the connection between KBase, RAST and SEED?How does updating in one affect the others?This question gets at an assumption in the text that the relationships among systems are known to the reader.The text could be clarified, or key references added.
The references for RAST and the SEED database explain their relationships.RAST Responseuses the SEED subsystems information to annotate genomic sequences.

F1000Research
Please expand to include more specific discussion of the 6 FP reactions identified at the end of the Results section.What are the growth values for these?Are any of them borderline?
The issue has been addressed in the recent changes.Clarifications and explanations Responsehave been included in the Results and Discussion sections.
"…available in a day of using RAST and KBase."This sentence implies that sequencing, annotation, and model reconstruction can happen in a single day.This should refer only to the use of sequence data.Also, there in no mention of the phenotype data here in this context.I think it would be better to highlight that the system allows the user to produce a reasonably robust metabolic model quickly, giving more opportunity for in depth analysis of discrepancies and manual curation of the model given the phenotypic data.
This clarification has been made in the manuscript.

What is the link to the web service for the PMAnalyzer?
The link (https://vdm.sdsu.edu/pmanalyzer)has been added to the manuscript.

Response-
No competing interests were disclosed.Competing Interests: Genome-scale metabolic modeling (GSM) has gained interest over the last 2 decades as an increasingly powerful methodology for interrogating cell physiology and function.One of the holy grails in this field is the development of high quality models fully automatically, which would enable analysis of organisms from sequencing data alone, and, for example, integration of modeling with the interpretation of metagenomic data.A large contributor towards this end is the group at Argonne National Labs, which has produced RAST (a genome sequencing and annotation database), SEED (an automated platform to build genome-scale metabolic models), and now Kbase (an integrated platform that allows users to utilize the power of RAST, SEED, and many other functionalities).
This article describes a new, automated pipeline for integrating phenotyping data (i.e., growth of an organism on many individual substrates in a microwell plate, such as in Biolog analysis) with genome-scale metabolic modeling in order to improve an automatically built genome-scale metabolic reconstruction produced through the automated RAST/Kbase system.This is an important problem and a technical issue that can greatly improve model reconstruction, automated and manual (the early stages of manual reconstruction are typically automated nowadays as well).The approach seems to provide satisfactory results.However, I have a few concerns about the approach, relating to its originality and some of its features.In general, it is not totally clear from the paper what its main contribution is, and how F1000Research 1.

4.
some of its features.In general, it is not totally clear from the paper what its main contribution is, and how it is differentiated from previous work.This should be explicitly explained in the introduction, etc.

My specific criticisms are as follows:
Major criticisms: As far as I can tell, the problem that PMAnalyzer solves has already been solved previously.For example, in the original paper describing SEED ( ), 22 models are automatically Henry , 2010 et al. optimized against existing Biolog data.The paper reads: "A modified version of the Growmatch algorithm was included in the Model SEED pipeline to identify and correct the possible errors in the models that cause the incorrect predictions [in Biolog and essentiality data]…" In order to claim that PMAnalyzer is novel, it must be compared to such previous methods and shown to be superior or different in some way (I might simply not understand the difference; this should be explained clearly in the text, or some comparison shown).
In general (and as stated before), please compare this work to previous works and explain what is the main scientific novelty of the paper (i.e.how does PMAnalyzer differ from previous works?Also if the sequencing / analysis of this organism is novel, that should also be explicitly stated as well).
It is not clear from the text what part of the model building pipeline PMAnalyzer actually does (i.e., does it only analyze the growth curves?Does it do that and also run the gap filling?Etc.).Please explain this explicitly in the paper.I suggest also providing a schematic similar to Figure 1, but that specifically shows what the inputs and outputs to PMAnalyzer are (and optionally shows some of the internal mechanics of PMAnalyzer).
Please remark on and justify whether there is an optimization step to reduce false positives (which is mentioned in the discussion, but not the results).The final paragraph of the results lists false positives, but doesn't go into detail or attempt to explain why these occurred; I suggest that the authors give some explanations here if possible on why these are tricky biological cases.Figure 3 would be greatly improved if the authors listed on each panel whether it was called 'growth' or 'non-growth' by PMAnalyzer.This is, after all, a way for the reader to visually validate the method.Minor criticisms: Please explain the genesis of equation 4, as it is unclear how/why it was formulated this way and it forms the critical cut-off criterion for the calls made in PMAnalyzer.
In the section 'RAST annotations', the authors state that 'following gap-filling, all missing reactions … were cross-checked with the SEED to find similarly named reactions' --similar to what?Is this a comparison between databases held in SEED vs. in RAST?Please clarify this.
It would be interesting/informative to see Figure 2 in context of other models, especially those of close neighbors (e.g., and ).I suggest that the authors provide histograms of e. coli C. koseri subsystems in one/both of those organism as well for comparison.
The authors mention in the Discussion that there is an automated web server for executing PMAnalyzer.However, I could not find the link.Can they please link to this or remove this sentence?I have read this submission.I believe that I have an appropriate level of expertise to confirm that I have read this submission.I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
No competing interests were disclosed.

Competing Interests:
Author Response 16 Sep 2016 , San Diego State University, USA Daniel Cuevas

Major criticisms:
As far as I can tell, the problem that PMAnalyzer solves has already been solved previously.For example, in the original paper describing SEED (Henry et al., 2010), 22 models are automatically optimized against existing Biolog data.The paper reads: "A modified version of the Growmatch algorithm was included in the Model SEED pipeline to identify and correct the possible errors in the models that cause the incorrect predictions [in Biolog and essentiality data]…" In order to claim that PMAnalyzer is novel, it must be compared to such previous methods and shown to be superior or different in some way (I might simply not understand the difference; this should be explained clearly in the text, or some comparison shown).
Biolog data has been more commonly used by others to describe phenotypic Responseresponse in various media sources.These data can help enhance the accuracy of genome-scale metabolic reconstructions, as described by Henry  , 2010.It is important to note that the et al Multi-phenotype Assay Plates (MAPs) described here are different from Biolog plates: MAPs are non-proprietary assays used to measure biomass accumulation whereas Biolog technology measure substrate utilization.However, the MAPs are serving a similar purpose for model reconciliation as Biolog plates do as described by the GrowMatch paper.The novelty claimed here is the method in combining the high-throughput MAPs technology, the PMAnalyzer pipeline, and the KBase modeling pipeline.PMAnalyzer quickly and automatically calculates growth profiles for a wide spectrum of sugars, amino acids, and other compounds which the KBase modeling environment can also simulate with FBA.In addition to the FBA, KBase is performing the gap-fill and gap-gen algorithms to reconcile the model.This process uses the modified version of the GrowMatch algorithm to identify those changes (e.g., adding reactions, making reactions reversible).
In general (and as stated before), please compare this work to previous works and explain what is the main scientific novelty of the paper (i.e.how does PMAnalyzer differ from previous works?Also if the sequencing / analysis of this organism is novel, that should also be explicitly stated as well).
This issue has been addressed in the recent changes to the manuscript where we Responseclarify the difference between the Multi-phenotype Assay Plates (MAPs) technology and the Biolog Phenotype MicroArray system, and have also been explained in the response to the previous comment.

It is not clear from the text what part of the model building pipeline
PMAnalyzer actually does (i.e., does it only analyze the growth curves?Does it do that and also run the gap filling?Etc.).Please explain this explicitly in the paper.I suggest also providing a schematic similar to Figure 1 Please remark on and justify whether there is an optimization step to reduce false positives (which is mentioned in the discussion, but not the results).The final paragraph of the results lists false positives, but doesn't go into detail or attempt to explain why these occurred; I suggest that the authors give some explanations here if possible on why these are tricky biological cases.
The issue has been addressed in the recent changes.Clarifications and explanations Responsehave been included in the Results and Discussion sections.
Figure 3 would be greatly improved if the authors listed on each panel whether it was called 'growth' or 'non-growth' by PMAnalyzer.This is, after all, a way for the reader to visually validate the method.

Minor criticisms:
Please explain the genesis of equation 4, as it is unclear how/why it was formulated this way and it forms the critical cut-off criterion for the calls made in PMAnalyzer.
Equation 4 is a type of arithmetic mean that is least prone to noise in the data.

Response -
Originally, the data input into this equation was the raw OD 600nm measurements; however, after fitting the logistic model, empirical data showed that using the fitted values and introducing the asymptotic value into the equation was able to further separate those growth curves displaying growth.
In the section 'RAST annotations', the authors state that 'following gap-filling, all missing reactions … were cross-checked with the SEED to find similarly named reactions' --similar to what?Is this a comparison between databases held in SEED vs. in RAST?Please clarify this.This issue has been addressed in the recent changes with clarifications and Responseexplanations.
It would be interesting/informative to see Figure 2 in context of other models, especially those of close neighbors (e.g., e. coli and C. koseri).I suggest that the authors provide histograms of subsystems in one/both of those organism as well for comparison.

Figure 1 .
Figure 1.Combined flowchart.Initial metabolic models are built and reconciled in KBase from RAST genome annotations.Phenotypic profiles from the MAPs technology and the PMAnalyzer are incorporated into the KBase FBA-reconciliation loop to optimize the model.

Figure 2 .
Figure 2. RAST subsystems.Number of subsystems annotated by RAST for each subsystem group.Membrane transport subsystems are highlighted in separate plot.

Figure 3 .
Figure 3. Citrobacter sedlakii growth curves.Growth curves generated from the multi-plate reader over 32 hr.The y-axis is displayed in a log 2 scale and substrate groups are distinguished by color: blue for carbon sources and red for nitrogen sources.(A) Standard error from the technical replicates are plotted as gray regions.(B) Logistic models of each growth condition.
School of Computer Sciences & Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel , but that specifically shows what the inputs and outputs to PMAnalyzer are (and optionally shows some of F1000Research specifically shows what the inputs and outputs to PMAnalyzer are (and optionally shows some of the internal mechanics of PMAnalyzer).This issue has been addressed in the recent changes.An additional flowchart has Responsebeen added to Figure1describing in further detail the workflow of PMAnalyzer.

Figure 3
Figure 3 has been updated to show Growth and No Growth curves.Response -

Figure 2
Figure 2 has been updated to provide this information.Response -

Table 1 .
MAP and FBA results.Citrobacter sedlakii growth (G) and no growth (NG) phenotypes.Carbon substrates are denoted with (C); nitrogen substrates are denoted with (N).FBA results before gap-filling are shown in the Initial Model column.MAP-FBA comparison results are: correct positive (CP), correct negative (CN), false positive (FP), and false negative (FN).

Table 2 .
Gap-filled reactions.Each gap-filled reaction is listed under their respective media condition, along with the primary source of the compound in parenthesis, i.e. (C) denotes carbon and (N) denotes nitrogen.E.C. numbers were supplied by KBase when viewing gap-filling results.Reactions listed with an asterisk denote the 13 transport reactions added to the model.Reactions listed with a yes in the Reversible column denote those that were already present in the model but made bi-directional through gap-filling.

Table 2
fulfilled the reaction set for other 35 FN conditions not shown in the table.
28ed in C. koseri and two reactions (D-Arabinose ketol-isomerase[EC 5.3.1.3];myo-inositol:oxygenoxidoreductase[EC1.13.99.1]) could not be identified in C. koseri nor in E. coli.Therefore, these six reactions were added with low confidence.From the six reactions, three reactions (including the transporter) were required for growth on allantoin.In fact, Citrobacter species have not been shown to be capable of utilizing allantoin as a sole source of nitrogen28.More in-depth experiments are needed to investigate C. sedlakii's ability to grow in the allantoin-based growth condition.The 90 simulations were executed again and resulted in growth on 54 conditions (6 FP) and no growth on 36 (0 FN), a final accuracy of 93.3%.The six false positive growth conditions were: 2 deoxy-Dribose (carbon), glycine (carbon), L-pyro-glutamic acid (carbon), L-threonine (carbon), malate (carbon), and putrescine (carbon) (Table1).

profiling data for elucidating genomic gaps 3 Data Files http://dx.doi.org/10.6084/m9.figshare.1150169 Discussion
RAST annotations and gap-filled reactionsBack referencing the gap-filled reactions to the prior RAST annotations provides insight into KBase's ability to build the initial reconstruction from subsystem annotations.Several reactions identified by RAST were not included in the metabolic model.This is either due to RAST not being able to determine the function of the gene using homology, or the model is not able to correctly incorporate the function from the annotation.Gap-fill on LB media identified four reactions that were later categorized as RAST mis-annotations, whereas gap-fill on the MAPs media identified only two reactions later categorized as RAST mis-annotations.By surveying neighbor- ing homologs of gap-filled reactions in closely related genomes, one reaction on LB gap-fill and four reactions on MAPs gap-fill were categorized as missing due to poor sequence quality.