Maize Phyllosphere Microbial Community Niche Development Across Stages of Host Leaf Growth

Background: The phyllosphere hosts a variety of microorganisms, including bacteria, which can play a positive role in the success of the host plant. Bacterial communities in the phylloplane are influenced by both biotic and abiotic factors, including host plant surface topography and chemistry, which change in concert with microbial communities as the plant leaves develop and age. Methods: We examined how the Zea mays L. leaf microbial community structure changed with plant age. Ribosomal spacer length and scanning electron microscopic imaging strategies were used to assess microbial community composition across maize plant ages, using a novel staggered experimental design. Results: Significant changes in community composition were observed for both molecular and imaging analyses, and the two analysis methods provided complementary information about bacterial community structure within each leaf developmental stage. Conclusions: Both taxonomic and cell-size trait patterns provided evidence for niche-based contributions to microbial community development on leaves.


Abstract
: The phyllosphere hosts a variety of microorganisms, including Background bacteria, which can play a positive role in the success of the host plant. Bacterial communities in the phylloplane are influenced by both biotic and abiotic factors, including host plant surface topography and chemistry, which change in concert with microbial communities as the plant leaves develop and age.
: We examined how the L. leaf microbial community Methods Zea mays structure changed with plant age. Ribosomal spacer length and scanning electron microscopic imaging strategies were used to assess microbial community composition across maize plant ages, using a novel staggered experimental design.
: Significant changes in community composition were observed for Results both molecular and imaging analyses, and the two analysis methods provided complementary information about bacterial community structure within each leaf developmental stage.
: Both taxonomic and cell-size trait patterns provided evidence for Conclusions niche-based contributions to microbial community development on leaves.

Introduction
The leaf microbial community 1 can play important positive roles for the plant partner. For example, disease resistance is increased when supportive bacteria are present 2,3 and increased host plant growth in the presence of specific community members has been documented 4-6 , along with a positive effect of overall increased microbial diversity 7 . Leaf habitats are also extremely important for the microbial community, as this habitat is quite large in surface area globally. In temperate regions, seasonal growth and annual agricultural plantings lead to repeated opportunities for colonization by adapted microbes 8,9 . There is thus potential for plant-beneficial interactions with microbes that are being used for development of new agricultural products 10 .
Determinants of community assembly/succession in microbial systems include biotic and abiotic factors, often classified into different community assembly rules 11 . Typically, random or neutral stochastic models are tested against abiotic effect models, which are called environmental, or, more recently, habitat filtering rules. Environmental filtering can produce different patterns depending on scale 12 . As recommended by Kraft et al. (2015) we will refer to environmental filtering as habitat filtering when there is no separate test for abiotic and biotic effects. A second class of assembly rules addresses the interactions of organisms, by either facilitation or competitive exclusion. These niche interactions are often modeled separately from abiotic effects, or abiotic effects are assumed to happen first.
Community assembly and succession processes are typically inferred from diversity patterns in community samples 13 . The number of different taxa (alpha-diversity), the difference in the composition of taxa between replicates (beta-diversity) and the proportion of co-occurring taxa all provide evidence for these different community-generating mechanisms. Niche construction is inferred from a pattern of decreasing beta-diversity, increasing co-occurrence, and increasing richness, while habitat filtering is characterized by decreasing beta-diversity with no trends in co-occurrence or richness 14,15 . In contrast, neutral community assembly would exhibit increased richness (provided immigration rates are maintained), with unchanged beta-diversity and co-occurrence levels. Once we have measurements of the various components of diversity, we can compare observed trends to these expectations. The combination of increasing/decreasing trends would then support or fail to support a particular mechanism of community succession. Both trait-based (morphological, in our study) and taxonomic measures of diversity can be compared to the expected trends 16,17 . It is especially valuable to compare trends across different traits, as assembly patterns can differ across scales 12 . For microbial systems, cell size is one of the few traits that is measurable at the microscopic scale; other functional traits, such as metabolic activity, are more often examined at the molecular scale, by whole-community DNA sequencing, proteome analysis, or metabolite profiling.
Previous analyses of controlled inoculation phyllosphere community development indicate that habitat filtering is important, with niche effects less often detected across time, though all previous studies have used seasonal or through-time designs that confound exposure time and plant development. Pre-colonization reduces immigrant success 8,18 , suggesting biotic competition is key, though high beta-diversity in replicate samples from both pre-colonized and newly inoculated communities suggested that there are many options for species that arrive first to succeed. Several crop species have been tested for seasonal patterns of phyllosphere community development. Microbial populations increased more rapidly on younger leaves of lettuce 19 , and alpha-diversity increased over the first few weeks then plateaued 20 , which provides evidence that habitat filtering was a contributor to community succession later in the season. However, additional environmental factors can influence microbial richness, as documented in leafy greens 21 . In apple flowers, species richness increased early in the season and beta-diversity dropped, then rose 22 . About half the apple flower microbial species had significant co-occurrence relationships and the numbers of co-occurrences increased and decreased over time. This pattern supports niche construction as a key mechanism for apple flower colonization. As in apple flowers, across-season dicot crop phyllosphere samples became less diverse (alpha-diversity went up then declined). In the crop samples, more shared and plant-specific taxa were found later in the season 23 , again supporting development of specific community structure. Wild tree leaf samples exhibited consistent patterns of change in community composition, clustering in to early, middle and late season groups, but with little evidence for species-time trends 24 . In contrast, Arabidopsis epiphyte beta-diversity increased with time 25 , providing stronger support for stochastic effects in this system.
The history of annual opportunities for new colonization is reflected in the processes for dispersal and succession on plant leaves. Colonization of new growth in the presence of older leaves may proceed differently than colonization from distant source materials. The contributions of abiotic changes and host leaf age are confounded when sampling occurs through time, as in seasonal sampling, thus necessitating alternative experimental designs and/or additional phylogenetic or functional trait information to examine the roles of these assembly processes. Sampling serially across time, as all prior studies have done, could confound short and long-term trends and does not enable differences in plant host to be distinguished from inoculation processes 20,22 . Therefore, for this study host plant seeds were planted at different times and sampled once.
How can leaf microbial communities be measured? Molecular methods for detection of species using ribosomal fragments or sequences have become increasingly common 22,23,26 . Microscopy can be used to detect bacteria (and fungi) on corn leaf surfaces 27 , and surface bacteria can be followed with reporter genes 28,29 as

Amendments from Version 2
We have edited the manuscript to incorporate all of the suggestions from the second review. Notable changes include improved explanation of the image analysis methods and additional text and citations in the discussion section. We updated the supplemental results file with additional caption text and definitions. In our reply to the review we also suggest technological improvements in microbial sampling equipment that would allow researchers to efficiently collect truly multivariate molecular and cell-level trait data. We strongly encourage future sampling instrument and data analysis development in order to advance this field of research.

REVISED
well as by direct enumeration. Microscopic or particle-based enumeration methods are especially useful for microbial communities where we have limited information on traits and horizontal gene transfer is likely 30 . For this study we used both a molecular approach to measure relatedness-based (taxonomic) microbial community composition, and direct enumeration of scanning electron micrograph (SEM) images to measure cell-size trait changes.
Our specific questions concerned the pattern of community development -how community succession varies with plant age, and whether the microbial community was primarily shaped by microbe-host interactions (environmental or more generally, habitat filtering) or by microbe-microbe interactions (niche construction). We found complex patterns in the maize leaf microbial community structure even across the short series we sampled, with niche construction and stage-specific increasingdecreasing diversity trends observed.

Methods
Experimental design and sample collection Zea maysL. B73 inbred seeds obtained from the Maize Genetics Cooperation Stock Center (http://maizecoop.cropsci.uiuc.edu/) were planted using a hand held planter between April 26th and June 15th 2009 at the Central Crops Research Station in Clayton, NC (http://www.ncagr.gov/research/ccrs.htm). All necessary permits were obtained for the described study, which complied with all relevant regulations; permits were managed by the research station staff. Approximately twenty seeds were planted each date, with plantings between four and eleven days apart in a single field. Stages of leaf development, V(1...n)), are approximately one week apart and this developmental pattern informed our sampling interval. All plants within a developmental stage were planted in the same row, bordered on all sides by older and younger stages, then by additional crops, grass borders, and tree-shrub hedgerows.
All ten replicate leaf samples were collected at once on July 15, 2009. Both scanning electron microscope (SEM) and ribotype (ARISA) samples were obtained from tips of separate central leaves from ten plants at each developmental stage. SEM samples were approximately five centimeters square and were cut along the leaf mid-rib near the tip (Figure 1). Sections were immediately fixed, by immersion in a solution of 2.5 percent glutaraldehyde in 0.1 M Sorenson's phosphate buffer at a pH of 7.1.
Leaf wash samples for ribosomal DNA analysis were then collected from the same ten plants, using adjacent leaf numbers and a similar portion of the leaf tip ( Figure 1). Approximately 10 cm of leaf was cut, submerged in buffer (0.2mM Tris pH 7.5, 0.02 mM EDTA, 0.00012X Triton X100), agitated vigorously and removed using the method described in 31. Leaf washes were stored on dry ice while in the field then placed in a -70 degrees C freezer for long-term storage. One mock control (where no leaves were placed in the tubes) and one wash control (SEM sample from washed leaf to check for washing effectiveness) was taken for each stage.

SEM sample processing
Excess glutaraldehyde was removed from leaf SEM samples by rinsing for 30 minutes in 0.1 M Sorenson's phosphate buffer followed by an additional 30 minutes in distilled water. Sections were then dehydrated in an acetone series (50, 70, 90, 100 percent), each lasting 30 minutes, followed by a 30-minute treatment in hexamethyldisilazane. Sections were then allowed to air dry before being attached by the lower leaf surface to stubs using double-sided sticky tape. One leaf section chosen arbitrarily from each plant was sputter coated with 10 nm of platinum/palladium and observed in the SEM at 5 kV with a spot size of 3. The SEM procedure was tested prior to sampling using young (approximately 25 cm tall), greenhouse grown maize plants and Pseudomonas aureginosa 32 .    Figure 2). All single microbial cells present in the image were outlined; cells were identified by size and smoothness of circumference, as originally described by Davis 27 . Microbial cell outline areas were measured using Image Pro Plus (Media Cybernetics, Inc., Rockville, MD, USA) and recorded for each cell within each quadrant picture. Image Pro Plus sized outlines that were touching other bacteria as one large organism, therefore in clusters of cells as many as possible were outlined without outlines intersecting. Each image was individually measured using calibration settings based on the magnification used to take the image. The area calculation used all non-zero pixels within the outline. Each image analyst completed analysis of a training image set and the coefficient of variation of their analyses was checked to ensure that it was less than 5 percent.
Microbial cell sizes acquired from all images were binned using methods described by Shimazaki et al. 33 . Bin sizes were assigned to the microbial samples as size-class units (SCU) and a microbial community profile was created for the samples using the RiboSort package in R 34 .

Ribosomal spacer amplification and separation (ARISA)
For the DNA analysis, leaf washes were syringe filtered using previously developed methods 35 (videos of the method are available upon request) and microbial DNA was extracted using an UltraClean Microbial kit according to the manufacturer's instructions (MoBio Laboratories, Carlsbad, CA). An equal volume of each DNA sample was amplified to ensure fair representation of the epiphytes collected.
Two separate, nested PCR reactions were performed on experimental samples as described by Cardinale et al. 36 , along with four positive controls from a mixture of DNAs from cultured bacteria (Carlson, 2015) and four negative (nuclease-free water) controls. For the initial PCR step, a master mix was made to include: Q5 master mix (New England Biolabs, Ipswich, MA), ITSF primer (5'-GTCGTAACAAGGTAGCCGTA-3'), LD primer (5'-CCGGGTTTCCCCATTCGG-3'), and nuclease-free ultrapure water. Ten microliters of master mix was added to sterile PCR tubes followed by experimental and sample controls. The samples and controls were then run on an iCycler (BioRad, Hercules, CA) PCR machine with the following settings: 1X 95C for 30 seconds, 35X (95C for 30 seconds; 61.5C for 30 seconds; 72C for 1 minute, 30 seconds), 1X 72C for 10 minutes. For the second PCR step, a second master mix was made with the following: Q5 master mix, ITSreub primer (5'-GCCAAGGCAT CCACC-3'), which contained a fluorescent HEX marker for fragment analysis in a Genetic Analyzer, SD primer (5'-TGC GGCTGGATCCCCTCCTT-3'), and nuclease-free water.
Samples were run on the Applied Biosystems Genetic Analyzer (ABI, Inc, Carlsbad, CA) and analysis of PCR product length was carried out using the instrument software. For each sample, two microliters of PCR product and 13 microliters of GS500-Rox plus HiDi (ABI, Inc, Carlsbad, CA) were combined. The samples were denatured at 94C for ten minutes, placed on ice for two minutes, then placed into the analyzer and the program started.
ARISA data processing Files from the Genetic Analyzer were opened in Peak Scanner 2 software (http://www.appliedbiosystems.com) using Analysis Method=Sizing Standard-PP, and Size standard=GS500. Only the green (hex-primer-labeled) peaks were used to determine level of bacterial abundance. The size and heights of the green peaks were calculated using the ROX GS500 standard. The individual output files were assembled into a single microbial profile file using the RiboSort package 34 using the script, RiboSo rt(data,dataformat="standard",dye="G", output="proportions",z erorows=FALSE,repeats=1, mergerepeats="presentinall"). This program automatically bins the peaks to generate abundances; we used the default threshold of 50 fluorescence units as recommended by the program authors.

Statistical analysis of presence-absence matrices
The OTU (operational taxonomic unit) abundance matrices output from Ribosort were analyzed by Permdisp and Permanova 37-39 using Primer-E software v6 40 . Overall square root transformation and Bray Curtis similarity resemblance were applied to the initial abundance data within Primer-E before Permanova and Permdisp tests were run. The P-values from the program were Bonferroni-corrected. Primer-E SIMPER analysis was used to compute the number of OTU contributing to differences between stages. Primer-E functions were used to compute diversity measures and taxa numbers for each sample (Supplementary Dataset). Supplemental Table 1 to Supplemental Table 4 include full output from the Primer-E analyses.
The evenness (Pielou's J') of OTU and SCU for each sample was computed using Primer-E v6 40 . The values of J' for OTU and SCU were compared across plant stages by rank-transformed analysis of means using the routines in JMP Pro v11 (SAS Inc, Cary, NC) with P-values adjusted for multiple comparisons. For the rank-transformed analysis of means the P-value threshold was set at the Bonferroni-corrected value of P less than 0.007. This rank-transformed analysis of means is not sensitive to heteroschedasticity and is more powerful than other nonparametric tests when the data distribution is heavy-tailed 41 . Nonparametric pairwise comparisons of evenness between stages were done using the Steel-Dwass multiple-comparison method. The distributions of SCU areas in each stage of plant growth were also computed with JMP Pro v11.
Co-occurrence data analysis Pairwise analysis was completed using the R package Co-Occur 42 ; this package allows determination of the number of OTU/SCU that occur together in the same sample day at a probability above random, along with the number of taxa that are not found together (negatively correlated taxa) above the random-pair threshold. The analysis code, input and output files are available at https://github.com/skosowsky/Cooccur. We created network visualizations of the co-occurring OTU and SCU with Cytoscape (v. 3.4) and compared networks using the Cytoscape plug-in app DyNet (v1.0) 43 . Intensity of node connection change across the seven stages is reflected in the intensity of the color of the DyNet-analyzed nodes.

Statistical analysis of presence-only data
The number of OTU and SCU present within each sample was computed using Primer-E v6 (abbreviated as S, also known as Chao1). The number of OTU and SCU compared to the overall mean across plant stages was analyzed using rank-transformed analysis of means via routines in JMP Pro v11 (SAS Inc, Cary, NC), with P-values adjusted for multiple comparisons. For the rank-transformed analysis of means the P-value threshold was set at the Bonferroni-corrected value of P=0.007. This rank-transformed analysis of means is not sensitive to heteroschedasticity and is more powerful than other nonparametric tests when the data distribution is heavy-tailed. Nonparametric pairwise comparisons between stages were carried out with the Steel-Dwass multiple-comparison method within JMP.

Results
Our experimental design allowed for a single microbial sampling that was not confounded by differential exposure to recent weather events, with sampling for molecular and microscopic analyses in each plot ( Figure 1). Molecular analysis of bacterial community ribotype composition (ARISA) and image analysis of microbial cell assemblage size classes (by scanning electron microscopy, SEM) provided two complementary views of community assembly processes.
Presence-absence matrix comparisons ARISA community composition and beta-diversity. Microbial community composition changed over time in bacterial ribosomal taxa (Permanova P-value of 0.026), as measured by differences in operational taxonomic units (OTU). The Permanova pairwise results for bacterial ribosomal OTU summarized in Table 1 and with additional details in Supplementary Table 1 and  Supplementary Table 2 indicate that, specifically, day 30 and day 80 were significantly different from each other in community composition. Thus we conclude there was a difference in either the richness or arrangement of the communities' ribotypes in the earliest and latest stages of community development. To determine which aspect of community structure contributed to the difference detected by Permanova, we examined the dispersion as a measure of beta-diversity, i.e. the amount of range in the set of species present across replicates within each time. There was no significant change in the dispersion in the ribosomal OTU datatype across stages (Permdisp P=0.445, details in Supplementary Results). To further analyze the ARISA community structure, we compared the evenness of species distributions across stages.
Evenness is a measure of the range of OTU abundances in a sample unit; we used the Pielou's J' measure of sample evenness for comparison of stage evenness values. Evenness values were not normally distributed and were heavy-tailed (with kurtosis less than 3), so nonparametric tests were used for analysis of evenness differences across stages. Evenness was significantly different (P=0.0354) between stages 30 and 80 in the ARISA datatype in nonparametric pairwise tests (see Supplementary Results section 1.b for details), with the evenness value lower in the latest stage (i.e. a few OTU are more abundant in the stage 80 samples, compared to a more similar set of abundances in the first samples), though overall the evenness averages are not significantly different from the overall mean across all the sample days (Supplementary Results section b). This pairwise-significant evenness difference in important OTU abundance ranges in the earliest and latest sample dates is also visible in a SIMPER analysis of ARISA OTU, with the top contributions to the day 30 OTU more evenly spread (for day 30: OTUs with 31, 19, 11, 11, 8, 6, 4 percent) as compared to day 80 (with OTUs of 50, 19, 10, 9, 5 percent, Supplementary Table 4).

SEM community composition and beta-diversity.
In contrast to the ARISA datatype, in the SEM-based size-class unit (SCU) datatype the beta-diversity significantly decreased in the day 41 samples relative to the day 48 samples (Permdisp P=0.004, Table 2, details in Supplementary Table 3). This beta-diversity difference confounds Permanova tests, so we cannot further compare SEM community compositions across stages with that method. The significant SEM beta-diversity community change tells us that there is an important successional process at this specific stage relative to the earlier and later samples. Comparison of the Pielou's J' evenness values in the SCU datatype using nonparametric tests indicated that day 72 had significantly lower evenness of size-classes relative to the overall mean (Supplementary Results section 2), with pairwise tests showing that day 72 was significantly lower in evenness than day 80 (Supplementary Results 2.b). Pairwise comparisons among all other days were not significantly different.

Presence-only comparisons
We compared the number of OTU present in each sample to the overall mean using nonparametric comparisons suitable for non-normal, heavy-tailed distributions ( Figure 3). The number of OTU -the richness -increases significantly in the ARISA  datatype at day 62 (this mean is above the confidence interval and is thus indicated with a blue tip dot on the drop-down bar in Figure 3a), with earlier and later samples showing no significant difference from the overall mean richness (means indicated with orange dots) and no pairwise differences in means compared to each other with nonparametric tests (Supplementary Results section 1a). In contrast to the ARISA datatype pattern, the number of different SCU present in the size-class datatype is significantly lower at day 48 (Figure 3b), while all other stages are within the confidence limit for the overall mean richness. Nonparametric pairwise comparisons of SCU indicated that day 48 was significantly lower than all later-stage samples and that days 72 and 80 had significantly higher SCU means present than day 30 (Supplementary Results section 2a).

Core community members
For the ARISA datatype, there were three OTU found in all stages of community development; these were OTU9, OTU22, and OTU24, which comprised 1.5 percent of the total number of OTU. For the SEM datatype, there were 13 SCU (5.7 percent of the total); a visual display of the proportions of all 13 across the seven sample dates is shown in Supplementary Figure 3.

Co-occurrence pattern
To further examine community assembly processes we looked at trends in the co-occurrence of pairs of species in both phyllosphere datatypes. We examined these patterns in our data to identify ways that community composition could differ. Positive co-occurrence is defined as species occurring together more often than random (and thus having potential mutualistic interactions), while negative co-occurrence values indicate that the species pair is found together less often than the random expectation (potential antagonistic interactions). Co-occurrence patterns changed across plant stages in both ARISA ( Figure 4) and SEM ( Figure 5). Participation of certain OTU and SCU in varying interactions across multiple stages is indicated by the red color   intensity and number of overall edge interactions in the DyNet networks ( Figure 4b and Figure 5b). In the ARISA datatype, highly variable co-occurring OTU (intense red color in Figure 4b) had both positive and negative interactions with other community members. In contrast, the SEM datatype exhibited more positive interactions connecting highly dynamic nodes (intense red nodes in Figure 5b). There were more positive than negative significant co-occurrences in both the ARISA and SEM datatypes ( Figure 4 and Figure 5, edge arrows with points). The number of co-occurrences increased with increasing plant growth time in each datatype, though not at the same stage; as seen in Figure 4, ARISA had the largest number of network nodes at day62 and for SEM ( Figure 5) the largest number of nodes was day72. The overall proportion of co-occurring pairs was similar in each datatype (Table 3), with a minority of microbial types found to have significant co-occurrence partners. Specific OTU and SCU identifiers are individually displayed in stacked plots that show details about which OTU occurred in more than one pair within Supplementary Figure 1  Are two datatypes better than one? We measured a morphological trait (microbe size) and a relatedness (ribosomal length difference) metric, and the two datatypes were not especially similar at specific time points, making them complementary rather than redundant; we infer that predictive power from one datatype to the other would be low. The differences between datatypes suggest that size-classes may include different taxa, in other words our size-classes may not have a phylogenetic signal. Some recent work in phytoplankton and soil microbial communities contrasts with our results, showing that traits can be conserved across taxonomic groupings 30,44 ; generally conservation of traits and evolutionary relatedness is likely to depend on the genetic architecture of the phenotypic trait 45 . The relationship between metabolic and functional traits and phylogenetic signal is as yet unclear for microbial communities that have substantial levels of gene transfer 46,47 . Complementary information provided by two (or more) datatypes is thus recommended for future phyllosphere experiments; metabolic small molecule profiles along with functional gene pathway analysis would be valuable complements to cell size and shape analyses.
Misclassification of ribosomal spacer lengths to different taxa when they are in fact due to population differences, and misclassification of large bacterial cells in a size-class separate from their smaller offspring, would have consequences for trends across samples if the misclassification probability were different at the different stages of community development. In our study, the underlying assumption is that our probability of misclassification is the same across stages. For size-classes, misclassification could increase with time due to growth of cells, resulting in a continuous increase in the number of SCU over time. We do not see this trend in our data, so from our study there is no evidence of confounding misclassification. We cannot rule out misclassification that occurs only at specific stages using our experimental design. We suggest that sample processing for future experiments include inline size separation so that size-classes, ribosomal data, and molecular/metabolic data are collected as true multivariate data in each sample. Image information such as our scanning electron micrographs or fluorescence-based microbial images could also be analyzed for patterns that could illuminate resource use traits and specific cell-shape-based mutualisms 48 .
Trends in richness, co-occurrence trends, and beta-diversity changes together define overall community assembly mechanistic contributions, with habitat filtering characterized by unchanged richness, unchanged co-occurrence frequency, and decreasing beta-diversity. Niche development would be observed as a combination of increasing richness, increasing co-occurrence, and lowered beta-diversity. Finally, a neutral pattern -with no change in immigration probability over time -would exhibit trends toward increasing richness, unchanged co-occurrence, and unchanged beta-diversity. Comparison of recent phyllosphere analyses and our work using this framework emphasizes the role of niche development as a mechanism of community assembly.
First we compare our results with recent work in apple flower microbial communities, where beta-diversity was initially at relatively high levels, decreased, then increased over the remainder of the season; richness (measured as phylogenetic diversity in the apple flower work) increased then leveled off, and finally co-occurrences showed early increase then decline and rebound toward higher levels over the season 22 . To summarize, early in the apple flowering season the diversity patterns support niche development mechanisms, and later in the season both habitat filtering and niche development patterns were seen. In our study of maize leaf microbial epiphytes we also found that there was no overall temporal trend. Our leaf ARISA pattern of diversity change was not especially similar to the apple flower changes; for example, we found no significant change in beta-diversity, while beta-diversity declined then rose over the later part of the season in apple flowers. Our ARISA richness and the apple flower phylogenetic diversity also differed, with apple flower phylogenetic diversity increasing then leveling off, while on maize leaves richness increased only in one later growth stage. The evenness in both our ARISA and apple flower phylogenetic diversity varied; in apple flower the 'mid' stage had a lower evenness while in our ARISA dataset the day 80 evenness was lower than in day 30. Co-occurrences in our study increased in mid-season, as did the apple flower communities in the early flower stage. Overall, the early season patterns in the across-season apple flower microbial community analysis were the most similar in community assembly to our maize leaf staggered-stage analysis.
In addition to the apple flower study, in a recent study of three different crop leaf phyllosphere seasonal patterns, Copeland et al. 23 found that alpha-diversity increased then decreased across the season, with more shared species later in the season. In all three crops, beta-diversity (as measured as Unifrac distance for leaf-specific microbes) decreased across season, though not evenly; the beta-diversity decrease was more prominent later in season and at one time-point was higher during the later phase. Co-occurrences were not analyzed in this study, so these crop study results are consistent with niche and/or habitat mechanisms.
In the non-crop model species Arabidopsis thaliana, a controlled community assembly experiment resulted in no ribotype-traitbased evidence for niche development; instead, beta-diversity increased (by one measure) or was unchanged (using a measure adjusted for phylogenetic similarity) 25 . Beta diversity was also high in temperate tree leaves, with inter-individual ribotype variation nearly as large as variation across tree species 49 . Across seasons, tree leaves varied more by site than across time 50 , though both young and mature leaves were included in this study and thus community assembly processes were not addressed.
Core microbial communities in crops are those that are either recruited early and retained, or continually input across the season. These are often found by presence 51 , as in our analysis of ARISA and SEM core microbial community members. We found a relatively small proportion of core OTU and SCU; other phyllosphere studies such as on apple flowers 22 found more (for apple flowers the generalist category was 14 percent, for example), and in forest trees approximately 4 times as many core species were found relative to host species-specific bacterial ribotypes 50 . Our core set should be considered as a first description, as we did not consider absences or compare our core set to air or soil samples. Instead, our core set focused on microbial community members that colonized leaves of all developmental stages.

Conclusions
Our ARISA taxonomic diversity patterns and microbial cell size trait datatypes exhibited different specific patterns of diversity, but there was similarity in trend overall, as both datatypes support niche process contributions to community assembly. Niche (deterministic) processes have recently been found to be important in soil microbial communities 52,53 . In lakes, microbial generalists found at multiple spatially separated sites had a better fit to neutral stochastic models, with specialists exhibiting a more deterministic pattern 54 . As leaf surface communities are quite different than air or soil communities 9 , and the number of core species in our inventory is low, we suggest that many of the maize leaf microbial community members that we detect are specialists. Our staggered experimental design highlighted the role of microbe-microbe interactions in leaf microbial community development. We thus suggest that future experiments use staggered experimental designs and multivariate metabolic, genic, and sizetrait data collection.

Competing interests
No competing interests were disclosed.

Grant information
The author(s) declared that no grants were involved in supporting this work. In the study described in this article, the authors use two complementary approachers to investigate the bacterial diversity on corn leaves over time. They aimed at untangling the bacterial community assembly processes on corn leaves. The authors were the first to my knowledge to investigate temporal community development on corn plants. The molecular work is thorough, discussed well and leaves little room for improvement based on the techniques employed. However, I have major objections about the use of scanning electron microscopy as a tool for community investigations. I keep but wondering why the authors tried to use it in the first place since a literature search would have revealed that microscopy without molecular tools is not suitable for community investigations. This is why I propose to stress early on in the manuscript that this is a failed attempt to combine DNA-based molecular techniques and morphological data.
The following paragraph reads somewhat abrupt but they entail my concerns about the microscopical analysis and how they may be improved in the future.
The analysis of electron microscopical pictures is highly problematic as debris will be virtually and practically indistinguishable from bacteria at the resolution the images were acquired. Even a well trained analyst will not be able to distinguish them. Additionally, yeast will be visible in those images that would not be picked up in the ARISA analysis. Furthermore, the preparation of the specimen will remove not be picked up in the ARISA analysis. Furthermore, the preparation of the specimen will remove bacterial biomass from leaves. The analysis of the SEM pictures that was performed manually instead of using computer-based thresholds or filters, even though using trained individuals, may also be highly problematic to determine the size and shape of individual bacteria since the ability of the analyst to surround the bacterial cell is limited by their ability to draw using a mouse/ touchpad etc.. This is well visible on the right hand side of Figure 2. The abundance of bacterial biomass seems extremely low -it would help the reader to provide a graphical representation of the biomass coverage and its development over time. I recommend the repeat the experiment using an independent marker such as a DNA intercalating fluorescent dye or fluorescence in situ hybridisation and fluorescence microscopy. Since spatial data was not important to the analysis, I also recommend to recover the bacterial biomass by washing instead of performing on leaf observations. In general however, I am very doubtful of the use of size markers to cluster bacterial cells into operational taxonomic unit equivalents, bacteria are notoriously flexible if it comes to size: imagine a cell that is growing up to soon divide versus a cell that freshly divided. Frank Dazzo's work, which is also cited by the authors takes the cell size analysis a step further and includes additional geometrical measures to determine "morphotypes" that he uses as a proxy for diversity, which is something that the authors should consider to improve the resolution of their SCU analysis. I understand however, that there is no suitable technique available thus far that would yield a single-cell result that would resemble an OTU-based analysis. A combination of FISH + morphological analysis may be the most advanced option thus far. In summary, I am missing a motivation in the introduction for the used technique and why the authors believed that is would offer merit over the complementary ARISA approach. From the current description it is unclear to me how different SCUs were assigned, how many classes were detected etc. Some corresponding results are well hidden in the supplemental data (S1 C), however in the corresponding graphs, the units are lacking. My guess is, that they refer to µmˆ2, then again, I am confused that the chosen bins are of different width each time. Again surprising to me is, that some identified bacteria projected an area as large as 20 µmˆ2, which seems unlikely. As a side note, the quality of the pictures is very low and they are full of compression artefacts, the authors should provide a higher quality screenshot to aid readability. Additional discussion why image analysis was not suitable and how it may be improved in the future would be useful.
Other general comments: I believe that the more recent works of Steve Kembel should be cited e.g. Tree phyllosphere bacterial communities: exploring the magnitude of intra-and inter-individual variation among host species .
. Host species identity, site and time drive temperate tree phyllosphere bacterial community structure Isabelle Laforest-Lapointe, Christian Messier and Steven W. Kembel . The first sentence of the introduction should be corrected as the phyllosphere is the habitat that is colonised by microbial communities, not the community itself.
Second last paragraph of the introduction: the authors cite one of my papers to support their statement that bacteria can be tracked by using (fluorescent) reporter genes, however, in the cited paper, bacteria were washed of plant leaves prior to their investigation, therefore other papers, such as Monier and Lindow 2005 are more suitable citations here. If the authors insist to cite me, Remus-Emsermann et al. 2011 is more appropriate.
Methods: Electron microscopy: Instead, or additionally, to referring the absolute magnification, acquired area should be mentioned to better understand how representative the acquired field of views were.
The "Ribosomal spacer amplification and separation (ARISA) paragraph" ends with "Bacterial sizes acquired from all images were binned using methods described by Shimazaki et al.. Bin sizes were assigned to the microbial samples as size-class units (SCU) and a microbial community profile was created for the samples using the RiboSort package in R." I assume this should be shifted to the previous paragraph, otherwise I don't understand the paragraph Paragraph "SEM image analysis": it is not clear to me how Image Pro Plus was used to measure "bacterial areas", did you employ a detection algorithm that only worked within the circled areas, or were the circled areas themselves measures?
To maintain a parallel structure throughout the manuscript, i.e. first SEM analysis and then ARISA analysis, Table 1 and Table 2 should be switched in sequence. The same is true for Figure 3 a and b and Figure 4 and 5. Alternatively, the method sections could be switched in sequence.
the term SEM OTU occurs in table 3 -this should be corrected to SCU. Supplement S1: the description of the figures is not self explanatory and should be improved. Suggestion: "Analysis of S…" and "Total species (S)…" should read "Taxa_S = Total OTU" or "SCU" to be consistent with the nomenclature used in the graph, or the label of the y-axis could be changed.

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.

Are the conclusions drawn adequately supported by the results? Partly
No competing interests were disclosed.

Competing Interests:
We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.
Author Response 10 Jan 2018 , University of North Carolina Wilmington, USA

Ann Stapleton
We thank the second referee group for their comments-we are especially pleased that this was used as a learning opportunity for the lab team. Our responses to the comments are in plain text below the italicized review text. Thank you, we appreciate your endorsement of our ARISA analysis.

Referee
However, I have major objections about the use of scanning electron microscopy as a tool for community investigations. I keep but wondering why the authors tried to use it in the first place since a literature search would have revealed that microscopy without molecular tools is not suitable for community investigations. This is why I propose to stress early on in the manuscript that this is a failed attempt to combine DNA-based molecular techniques and morphological data.
This study is a first step toward analysis of microbial morphological phenotypes, with the goal of comparing and contrasting what we see in SEM images to the ARISA results and recommending future technological and analytical improvements. For example, we would like future researchers develop automated size-sorting of leaf washes using a cell-sorter and in-line true multivariate analysis of metabolites and microbial genomes, in a format that would allow rapid sampling in the field without need for sample preservation. We have added additional text to the third paragraph of the introduction to state the value of non-duplicative multiple-trait measurements more clearly.
The following paragraph reads somewhat abrupt but they entail my concerns about the microscopical analysis and how they may be improved in the future. The analysis of electron microscopical pictures is highly problematic as debris will be virtually and practically indistinguishable from bacteria at the resolution the images were acquired. Even a well trained analyst will not be able to distinguish them. Our review of the older SEM literature and our initial analysis of mock and inoculated samples indicates that microbial cells have distinctive features that distinguish them from dust or soil particles in images (as Dazzo et al. exploited for their algorithm). People are generally quite effective at pattern recognition tasks of this type, so we used manual scoring of microbial cells in images --there are good recent examples of human pattern recognition being complementary to machine learning in radiographic images, e.g [we can supply references to this topic if needed]. We certainly agree that there will be false positives in any identification process; in our case we are comparing samples within our experiment, so as long as our false positive and false negative rates are consistent our samples can be compared to each other.
Additionally, yeast will be visible in those images that would not be picked up in the ARISA analysis. Furthermore, the preparation of the specimen will remove bacterial biomass from leaves. The analysis of the SEM pictures that was performed manually instead of using computer-based thresholds or filters, even though using trained individuals, may also be highly problematic to determine the size and shape of individual bacteria since the ability of the analyst to surround the bacterial cell is limited by their ability to draw using a mouse/ touchpad etc.. This is well visible on the right hand side of Figure 2.
The image shown in Figure 2 is a relatively low-resolution figure; high-resolution images are available in the bisque repository (along with our manual traces, for future comparisons). The analysis process was done on the same computer and monitor setup, so the analysts' resolution for tracing would be the same for each sample.

The abundance of bacterial biomass seems extremely low -it would help the reader to provide a
The abundance of bacterial biomass seems extremely low -it would help the reader to provide a graphical representation of the biomass coverage and its development over time. I recommend the repeat the experiment using an independent marker such as a DNA intercalating fluorescent dye or fluorescence in situ hybridisation and fluorescence microscopy. Since spatial data was not important to the analysis, I also recommend to recover the bacterial biomass by washing instead of performing on leaf observations. In general however, I am very doubtful of the use of size markers to cluster bacterial cells into operational taxonomic unit equivalents, bacteria are notoriously flexible if it comes to size: imagine a cell that is growing up to soon divide versus a cell that freshly divided. Frank Dazzo's work, which is also cited by the authors takes the cell size analysis a step further and includes additional geometrical measures to determine "morphotypes" that he uses as a proxy for diversity, which is something that the authors should consider to improve the resolution of their SCU analysis. I understand however, that there is no suitable technique available thus far that would yield a single-cell result that would resemble an OTU-based analysis. A combination of FISH + morphological analysis may be the most advanced option thus far. In summary, I am missing a motivation in the introduction for the used technique and why the authors believed that is would offer merit over the complementary ARISA approach.
We very much agree that method developments are urgently needed in this area. As you note, there are many interesting new image-based and sensor-based technologies that we think could be used for future studies. Since we conclude from our results that morphological phenotypes provide additional, new information beyond what we can learn from ribotype patterns, we thus provide the background for future researchers to justify further development of morphological phenotype measurement tools. We argue that morphology, metabolites, functional genes, proteins, and pathways are likely to be important for microbial community development and thus we encourage automated sampling device development that could measure all these traits in each sample. Future image analyses could also use spatial modeling theory and analysis tools to examine microbial distributions along the heterogenous and rugged leaf surface, possibly with a prior that adjacent cells are more likely to be related than distant cells. A micro-sampler that could select individual cells as it traversed the surface would be the optimal solution, though substantial technology development would be needed to create such a sampling device.
From the current description it is unclear to me how different SCUs were assigned, how many classes were detected etc. Some corresponding results are well hidden in the supplemental data (S1 C), however in the corresponding graphs, the units are lacking. My guess is, that they refer to µmˆ2, then again, I am confused that the chosen bins are of different width each time. Again surprising to me is, that some identified bacteria projected an area as large as 20 µmˆ2, which seems unlikely. As a side note, the quality of the pictures is very low and they are full of compression artefacts, the authors should provide a higher quality screenshot to aid readability. Additional discussion why image analysis was not suitable and how it may be improved in the future would be useful.
We revised our methods section description to make it clear that these SCU refer to microbial cells, not necessarily bacteria. We added the area units to supplemental results section c. The full list of SCU taxa with sizes and other details is available in the CyVerse data file "binnedSEMsizes.csv". All the images and metadata are available in the Bisque repository for future development of image analysis methods; each image has the keV, scale bar, magnification and quadrant size included within the image file. Maize leaves vary in lifespan over development, with the youngest leaves being shed after approximately ten days and upper leaves remaining until the end of the season (~90 days). We have added the citations you suggest to our discussion section.

Specific comments:
The first sentence of the introduction should be corrected as the phyllosphere is the habitat that is colonised by microbial communities, not the community itself. We have updated the citations as you suggest.

Methods:
Electron microscopy: Instead, or additionally, to referring the absolute magnification, acquired area should be mentioned to better understand how representative the acquired field of views were.
We added that information to the methods section.
The "Ribosomal spacer amplification and separation (ARISA) paragraph" ends with "Bacterial sizes acquired from all images were binned using methods described by Shimazaki et al.. Bin sizes were assigned to the microbial samples as size-class units (SCU) and a microbial community profile was created for the samples using the RiboSort package in R." I assume this should be shifted to the previous paragraph, otherwise I don't understand the paragraph Corrected, thank you for identifying this typographical error. Paragraph "SEM image analysis": it is not clear to me how Image Pro Plus was used to measure "bacterial areas", did you employ a detection algorithm that only worked within the circled areas, or were the circled areas themselves measures?
We added more explanation of the area calculation used by this software to the methods section.
To maintain a parallel structure throughout the manuscript, i.e. first SEM analysis and then ARISA analysis, Table 1 and Table 2 should be switched in sequence. The same is true for Figure 3 a and b and Figure 4 and 5. Alternatively, the method sections could be switched in sequence.
After we moved the misplaced paragraph that you identified above, all the SEM image information is now in one section, before the ARISA analysis section. The data analysis sections follow. If sub-subheadings are permitted by the publisher we would be happy to add those, to make the methods sub-sections (such as data analysis) clear.

the term SEM OTU occurs in table 3 -this should be corrected to SCU.
Corrected, thank you.
Supplement S1: the description of the figures is not self explanatory and should be improved. Suggestion: "Analysis of S…" and "Total species (S)…" should read "Taxa_S = Total OTU" or "SCU" to be consistent with the nomenclature used in the graph, or the label of the y-axis could be changed.

(1): e00682-13
Overall, the manuscript provides novel information about changes in phyllosphere microbial communities during development stages of an important agricultural crop. The idea that plant leaves can be used for basic studies of microbial community assembly is interesting and the use of complementary datasets (size and composition) was a good approach, because it gets at 2 aspects of the community. There were a few issues with the manuscript that I believe need to be addressed before it is acceptable.
Work is discussed appropriately in the context of the current literature No, unfortunately the literature review needs to be updated. There are several important publications in this field that should be included, and this will require a re-write of the introduction and discussion sections. -Why were the specific day-intervals and developmental stage-dates chosen? Do they relate to specific physiological processes in maize development or were they chosen at random? Explain this choice further in the methods/results.
-Planting all seeds for a particular treatment (i.e. development stage) in the same row sets the experiment up for statistical issues that are usually dealt with using a randomized block. In the current experimental design, one can interpret the results that (for example) day 30 and day 80 plant samples differ in OTU composition simply because the 30 day plants are subject to a different environmental variable than the 80 day plants (maybe soil nitrogen or light or water). Please re-do analyses using a randomized block and provide any information to suggest that there is no gradient that may account for significant differences across treatments.
-The samples were collected and analyzed in 2009. At that time ARISA was a suitable method for this study, but since then, several other methods have been developed that could address the hypotheses more fully. For example, using predicted metagenomics pathways might indicate that there is a connection between community structure and cell size that was not demonstrated with ARISA. The authors need to address that this could be the case in the discussion and that further studies using these techniques could yield different results.
I liked that the investigators included control DNA extractions. Were there any OTUs associated with the controls? How were those dealt with in ARISA processing? I thought that the network visualization approach was interesting, but it is really unclear what the authors are showing with Figure 5b. If it is just that the interconnections between SCUs gets really complicated, then I do not think a figure is necessary.
Sufficient information and source data have been provided to allow others to repeat every step of the work Yes, I think there is sufficient information to repeat the work.
The conclusions are supported by the findings. No, I think that there are a few things that need to be addressed in the discussion. First, please refer to the references I listed above and the articles that have cited them since, and re-think the discussion in that new framework. Second, the statistical issues of having all of 1 treatment in 1 row need to be addressed, otherwise all statistical results could be considered related to some unknown environmental gradient. These sampling intervals were chosen to capture maize leaf developmental stages (classically described as V(1 to n) for vegetative growth); each stage occurs at intervals of approximately one week and our sampling dates reflect this timing. We added a statement to the methods section explaining this.
-Planting all seeds for a particular treatment (i.e. development stage) in the same row sets the experiment up for statistical issues that are usually dealt with using a randomized block. In the current experimental design, one can interpret the results that (for example) day 30 and day 80 plant samples differ in OTU composition simply because the 30 day plants are subject to a different environmental variable than the 80 day plants (maybe soil nitrogen or light or water). Please re-do analyses using a randomized block and provide any information to suggest that there is no gradient that may account for significant differences across treatments.
The experimental design was chosen to fit agricultural practice, specifically maize field cultivation practice. This means that seeds are planted in rows with spacing dictated by planting equipment (30 in rows at this field station) and weed control (cultivation), fertilization, and pest control are carried out in those rows. Our experimental seeds were planted in adjacent rows, so nested spatial models that incorporate the physical distance between plants across the plots would be an ideal model to fit. However, spatial multivariate models with zero-inflated distributions for species counts are quite complex and we chose not to add this complexity to our model, as we have no evidence for spatial gradients within these short physical distances; for example, the soil type is the same (within the limits of our measurement instruments), as is the temperature and light regime.
Our experimental design controls for field-level effects of environmental history such as recent 1 2 Our experimental design controls for field-level effects of environmental history such as recent rainfall, but cannot and does not control for developmental history of exposure to environmental factors in a field setting. Controlled environment chambers would be required to control for past environment effects; however, chamber experiments are much less agronomically relevant, so we chose to carry out field experiments.
-The samples were collected and analyzed in 2009. At that time ARISA was a suitable method for this study, but since then, several other methods have been developed that could address the hypotheses more fully. For example, using predicted metagenomics pathways might indicate that there is a connection between community structure and cell size that was not demonstrated with ARISA. The authors need to address that this could be the case in the discussion and that further studies using these techniques could yield different results.
We agree that functional gene and metabolic pathway information would be valuable, and we suggest some specific hypotheses to test using such information in the discussion section. An automated cell-sorting assay would be required to collect suitable multivariate paired metabolic/pathway and cell size data, and we encourage development of instruments of this type.
I liked that the investigators included control DNA extractions. Were there any OTUs associated with the controls? How were those dealt with in ARISA processing?
After optimization of the nested PCR protocol there were no OTU associated with the controls. We did find that the PCR master mix was important for consistent amplification and low false positive rates, so we included details of the PCR reactions within our methods section. With assistance from the F1000 editorial staff we attempted to create interactive network graphics that allow readers to zoom and highlight, but it was not technically feasible at this time. Future analyses of succession might explore network modularity measures, as the networks we see in Fig. 5 appear to show modularity, especially in conjunction with future metabolic pathway analyses. Sufficient information and source data have been provided to allow others to repeat every step of the work Yes, I think there is sufficient information to repeat the work.
The conclusions are supported by the findings. No, I think that there are a few things that need to be addressed in the discussion. First, please refer to the references I listed above and the articles that have cited them since, and re-think the discussion in that new framework. Second, the statistical issues of having all of 1 treatment in 1 row need to be addressed, otherwise all statistical results could be considered related to some unknown environmental gradient.

Some minor comments:
The introduction is mostly about community assembly. However, to me, the really interesting thing about this study is that it is done using a major agricultural crop. Little is discussed about it why would be interesting to know about bacterial community assembly in plant. Perhaps there is a this context for plant productivity or agricultural disease that could be expanded to make the intro more specific to the study? specific to the study?
We added more text to the first paragraph of the introduction to incorporate agricultural context. We are pleased that you find this interesting! There is currently substantial investment in the business sector around plant microbiomes and probiotics (several hundred million dollars, and at least four new companies that we know of), but there is relatively little publicly available basic research on agricultural plant microbiomes. We added a citation from one review of this area (Parnell et al., 2016).
I found the organization of the manuscript difficult to follow. Please reorder so that all SEM-related sections are together and all ARISA sections are together in the methods. Then follow whatever order you have chosen for the results.
We moved the SEM image analysis section to the correct place after the SEM image acquisition section in the methods. Thank you for pointing this out!
The supplementary data tables are simply raw R output. Can you summarize the raw output in a supplementary table and then provide the raw output so that it is easier for the reader to interpret?
These are output from the Primer-E program with full details of the model fits and outcomes. We added more explanation to the methods for Tables 1 to 4 to make this clear. Tables 1 and 2 if the significances were just included in Figure 3 instead of in a separate table. (i.e. put the letters above each of the averages shown in Figure 3 to indicate significances)

It would be easier to interpret
The permdisp/permanova and non-parametric richness analyses have different assumptions, so we chose not to place them in the same figure. Figure 3 -the red/green dots are difficult to see (and I am not red/green color blind, so it would be even harder for 10% of our colleagues). Please adjust these so that they are more obviously different or have different shapes.
We changed these to different colors (manually). We agree that these are not correctly formatted and we have submitted a request to the company that developed the program asking them to adjust their defaults to color-blind-compatible colors and shapes.

Supplementary Figure 3 needs a legend that identifies what each of the SCU_IDs are and more information about the graph.
We have added the caption to this figure.
On page 9, the authors state that "In the ARISA datatype, highly variable co-occurring…". Please provide more of a definition of what you mean by "highly variable". Is this quantifiable?
We added more explanation to this sentence. Since network interaction patterns are a summary measure (and thus not replicated) we do not discuss them quantitatively.
Discussion Section -there are 3 different referencing styles used. Change for consistency.