The beauty of being (label)-free: sample preparation methods for SWATH-MS and next-generation targeted proteomics

The combination of qualitative analysis with label-free quantification has greatly facilitated the throughput and flexibility of novel proteomic techniques. However, such methods rely heavily on robust and reproducible sample preparation procedures. Here, we benchmark a selection of in gel, on filter, and in solution digestion workflows for their application in label-free proteomics. Each procedure was associated with differing advantages and disadvantages. The in gel methods interrogated were cost effective, but were limited in throughput and digest efficiency. Filter-aided sample preparations facilitated reasonable processing times and yielded a balanced representation of membrane proteins, but led to a high signal variation in quantification experiments. Two in solution digest protocols, however, gave optimal performance for label-free proteomics. A protocol based on the detergent RapiGest led to the highest number of detected proteins at second-best signal stability, while a protocol based on acetonitrile-digestion, RapidACN, scored best in throughput and signal stability but came second in protein identification. In addition, we compared label-free data dependent (DDA) and data independent (SWATH) acquisition on a TripleTOF 5600 instrument. While largely similar in protein detection, SWATH outperformed DDA in quantification, reducing signal variation and markedly increasing the number of precisely quantified peptides.

Mass spectrometry (MS)-based proteomics facilitates the identification of a large number of proteins in a single experiment [1][2][3] . As a result this technique has been established as a powerful complement to the classic tools of protein chemistry, such as western-blotting or enzyme-linked immunosorbent (ELISA) assays, which are of considerably lower throughput and specificity. Whereas traditional proteomic workflows mainly aimed to identify proteins, quantification has meanwhile become a major focus of technological development in this field 4,5 . On a quantitative liquid chromatography/mass spectrometry (LC-MS) platform the amount of analyte and the corresponding chromatographic peak area are in linear correlation, hence concentration values are obtained through comparison with reference standards 6 . A technically powerful approach for protein quantification involves the use of isotope-labelled standards that show a similar structure and chromatographic behaviour to the target molecule, but are distinguishable from the target by mass 7 . When added at an early stage of the quantification workflow, they allow for correction of analyte loss during sample preparation and analysis, hence rendering the quantification experiment robust. However, the requirement for isotope-labelled standards makes proteomic workflows expensive and reduces flexibility, as their production is laborious and applicable only to samples for which these standards can be obtained or generated (please see Discussion). Moreover, as both the analyte and standard need to be measured, they double the analyte load for the mass spectrometer. Consequently, recent developments that have enabled label-free peptide and protein quantification have attracted much attention [8][9][10][11][12] . 
In a label-free experiment, quantification is achieved through comparison of peak areas obtained for an analyte under two or more biological conditions, for instance to compare a wild-type versus a mutant, a compound-exposed versus a control condition, or a biological time series [13][14][15][16]. Upon normalisation, ideally to one or more unaffected internal standards, this approach yields a relative expression value for the target protein. This measure is then used to evaluate whether the expression of the target is altered between the conditions tested. In the case of high sequence coverage, absolute quantities may also be estimated, as peak intensities obtained for the best ionizing peptides correlate approximately with their absolute concentration 10,12 .
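The normalisation and comparison described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the peptide names, the peak areas, and the choice of scaling each run to the median area of its reference peptides are assumptions made purely for illustration.

```python
from statistics import median


def normalise(peak_areas, reference_peptides):
    """Scale one run so that the median area of its reference peptides is 1.

    peak_areas: dict mapping a peptide name to its integrated peak area.
    reference_peptides: names of peptides assumed to be unaffected by the
    biological condition (hypothetical internal references).
    """
    ref = median(peak_areas[p] for p in reference_peptides)
    return {pep: area / ref for pep, area in peak_areas.items()}


def relative_expression(run_a, run_b, target, reference_peptides):
    """Return the normalised abundance of `target` in run_b relative to run_a."""
    a = normalise(run_a, reference_peptides)
    b = normalise(run_b, reference_peptides)
    return b[target] / a[target]


# Invented example values (arbitrary peak-area units).
wild_type = {"TARGET": 2.0e6, "REF1": 1.0e6, "REF2": 3.0e6}
mutant = {"TARGET": 6.0e6, "REF1": 2.0e6, "REF2": 6.0e6}

ratio = relative_expression(wild_type, mutant, "TARGET", ["REF1", "REF2"])
```

In this invented example the raw target area triples, but the reference peptides double as well, so the normalised ratio is 1.5 rather than 3. This is the kind of correction that normalisation to unaffected references is meant to provide.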
The absence of an internal standard spiked in early during sample preparation means that label-free methods are sensitive to technical variance. Consequently, label-free proteomics requires high instrument performance and standardization of sample preparation methods. In terms of instrumentation, limitations arise from the linear range of the mass spectrometer and the sample capacity of the liquid chromatography. Moreover, in untargeted proteomics, the stochastic nature of data-dependent acquisition methods, where ions are selected for analysis based on their intensity, reduces the number of quantifiable peptides to only those fragmented in all samples 17,18 . This problem is a consequence of the high number of co-eluting peptides, which may considerably exceed the mass spectrometer's sampling speed when analysing full proteomes, and it is amplified by the high number of replicates used in a label-free study. By facilitating data-independent acquisition, where all ions are fragmented irrespective of their intensity, recent studies have demonstrated the possibility of circumventing the need to isolate individual peptides 11,17 . One such method, pioneered by the Waters Corporation, is termed MSE 11 . In this approach fragment ions are assumed to have the same elution profiles as their precursors; this similarity is then used to pair fragments and precursors when a number of parent ions are co-fragmented. In the typical workflow, fragment pairs and their corresponding precursor ions are retrospectively paired for database searching 11 . More recently, in a workflow termed SWATH, a mass range relevant for peptide-based proteomics (400-1200 m/z) is scanned in 25 m/z windows, and all ions that fall into a given window are fragmented simultaneously (MS/MSALL).
Quantification is then conducted based on the peak areas of extracted ion chromatograms (XIC), which are computationally reconstituted from the merged spectra on the basis of both experimental and in silico generated spectral information 17 .
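To make the windowing scheme and the peak-area readout concrete, the following Python sketch tiles the 400-1200 m/z range into 25 m/z windows, looks up the window into which a precursor falls, and integrates an extracted ion chromatogram with the trapezoidal rule. It is a simplified illustration only: the non-overlapping window layout, the function names and the toy chromatogram are assumptions, and real extraction software additionally scores fragment co-elution against a spectral library.

```python
def swath_windows(start=400.0, stop=1200.0, width=25.0):
    """Return consecutive (lower, upper) isolation windows covering start..stop."""
    windows = []
    lower = start
    while lower < stop:
        windows.append((lower, min(lower + width, stop)))
        lower += width
    return windows


def window_index(precursor_mz, windows):
    """Return the index of the window containing precursor_mz, or None."""
    for i, (lo, hi) in enumerate(windows):
        if lo <= precursor_mz < hi:
            return i
    return None


def xic_peak_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule."""
    area = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        area += dt * (intensities[i] + intensities[i + 1]) / 2.0
    return area


windows = swath_windows()                       # 25 m/z steps from 400 to 1200
idx = window_index(523.28, windows)             # hypothetical precursor m/z
area = xic_peak_area([0.0, 1.0, 2.0], [0.0, 10.0, 0.0])  # toy triangular peak
```

With these default parameters the range is covered by 32 windows, so every precursor between 400 and 1200 m/z is fragmented once per cycle regardless of its intensity.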
Sample preparation techniques are equally important for the performance of a label-free experiment, and easier to optimize on a daily basis than the mass spectrometer's properties. The main objective for a label-free sample preparation method is to obtain stable peak intensities between replicate sample preparations. Consequently, the ideal workflow avoids processing steps that are prone to stochastic analyte losses, and the LC-MS setup is operated in such a way that the instrument's dynamic range does not become exhausted. These objectives may differ from those of classic shotgun proteomics, where the number of identifiable peptides and proteins is the most important value, and a higher variation in signal intensities is acceptable. For this reason, a sample preparation method and LC-MS/MS configuration that is ideal for identifying a maximum number of proteins may be sub-optimal for label-free quantification, and vice versa. For instance, pre-fractionation of the sample prior to the LC-MS/MS analysis, a popular strategy to improve peptide identification, adds another level of complexity to the sample preparation and increases the signal variability; it is therefore avoided wherever possible.
The main objective of the study presented here is to benchmark proteomic sample preparation methods for their suitability in label-free proteomic studies. We compare popular sample protocols that are based on in gel 19 , filter-aided 20,21 and in solution 9,22 digestion procedures. Processing identical proteome samples obtained from budding yeast, and acquiring proteomic data without further pre-fractionation on two LC-MS/MS platforms, these methods were compared by their performance in sample preparation, their precision in label-free quantification experiments and their effectiveness in terms of time and reagents. Through the analysis of these samples on a 5600 QqTOF 23 instrument operating in either a data-dependent mode or SWATH 24 mode, this study concludes with an evaluation of data-dependent and data-independent acquisition, and suggestions for the optimal protocol selection.

Amendments from Version 1
We incorporated the suggestions of both reviewers regarding the use of instrument platform and retention time standards. To avoid confusion, we also refer to a manuscript that was published after the first version of this article and uses a similar nomenclature for a different protocol (Erde, J. et al., 2014). Other revisions concern minor suggestions to improve or correct the language in the manuscript.

Preparation of yeast cells
All experiments were conducted using a single culture derived from a single colony of the yeast strain BY4741 25 . The strain was transferred to yeast peptone dextrose (YPD) media prepared as described in 26

Protocol selection and overall assessment
For this comparative study we selected an in gel digest method adapted from 19 , conducted in combination with an SDS-based and a native protein extraction, two filter-aided procedures (FASP (Filter Aided Sample Preparation), adapted from 32 , and a recent enhancement termed eFASP, adapted from 33 ), and two in solution procedures (RapiGest, adapted from 22 , and RapidACN, adapted from 16 ). Their characteristics are summarized in Figure 1. All procedures are given in lab-protocol format as Supplementary protocol 1 to Supplementary protocol 6. Please note that after the first version of this manuscript was published, a sample preparation method by Erde et al. was also named eFASP; that name, however, describes a different protocol.

In gel digestions
In gel digestions are popular sample preparation methods as they are convenient, and offer a simple way of protein pre-fractionation through gel slicing and removal of low- or high-molecular-weight contaminants that could interfere with trypsin digestion. These approaches are compatible with multiple sample extraction buffers, can easily be combined with gel staining that does not interfere with protein digestion 34-36 , and thus provide a visual quality control over the protein sample. However, casting and running the gels render these protocols time consuming; hence the protocols are of relatively low throughput. In this study, we benchmarked in gel digestion in combination with both SDS-containing (Supplementary protocol 1) and SDS-free protein extraction (Supplementary protocol 2) (In gel/SDS and In gel/ABC, respectively 19 , Table 1). SDS-PAGE was however not used as a tool for pre-fractionation. In order to compare in gel methods with filter-aided and in solution digestion, the full mass range was processed and measured at once.

An accumulation time of 100 ms was set for each fragment ion resulting in a total duty cycle of 3.3 s.

For LC-MS/MS analysis using the QExactive platform, separation of peptides was performed at a flow rate of 300 nL min⁻¹ using a reverse-phase nano column (Easy-spray, Thermo Scientific PepMap C18, 2 µm particle size, 100 Å pore size, 75 µm i.d. × 50 cm length). Peptides were loaded onto a pre-column (Thermo Scientific PepMap 100 C18, 5 µm particle size, 100 Å pore size, 300 µm i.d. × 5 mm length) from the Ultimate 3000 autosampler (Dionex) with 0.1% formic acid for 3 minutes at a flow rate of 10 µL min⁻¹. Polar impurities were removed by running the system isocratically at 100% A at a flow rate of 5 µL min⁻¹ for 6 min. Finally, tryptic peptides were loaded onto the analytical column and separated using a linear acetonitrile gradient of 5-35% B for 155 min at a flow rate of 300 nL min⁻¹. The LC eluant was injected into the mass spectrometer by means of an Easy-spray source (Thermo Fisher Scientific). All m/z values of eluting ions were measured in an Orbitrap mass analyzer, set at a resolution of 70,000. Data dependent scans were employed to automatically isolate the 20 most abundant ions and generate fragment ions by higher energy collisional dissociation (HCD) in the quadrupole mass analyser. Only peptide ions with charge states of 2+ and above were selected for fragmentation. Finally, the measurement of the resulting fragment ions was performed in the Orbitrap analyser, set at a resolution of 17,500.

For the extraction of data acquired in SWATH mode, an ion library for yeast was generated from data acquired in data dependent mode. Spectral data were acquired in DDA mode and analysed using the Paragon search strategy as described above. Detected peptides were then corrected for retention time shifts, and the corresponding spectra were combined, leading to a library containing 2800 unique yeast proteins. For extraction of SWATH data and peptide quantification, Spectronaut 3 (Biognosys) and Skyline 31 were used. In parallel, Skyline was also used for quantification of peptides from data dependent acquisition experiments. Subsequent data analysis

Filter-aided sample preparation
The second set of assessed protocols involves digestion on filter units. These protocols are popular because they are flexible, simple to handle and require only a modest hands-on time (~3 hrs). The first protocol tested, FASP 32 , involves a dual protease digest (Lys-C and trypsin), while the second filter-aided procedure (here called eFASP) is a stepwise-optimized version of FASP by Shevchenko and colleagues 21 that involves protein precipitation.

In solution digestions
The final two protocols tested in this study perform protein digestion in solution. The first protocol is based on the proprietary, acid-degradable detergent RapiGest (Waters 37 ), included in a protocol derived from Von der Haar et al. 22 . This protocol involves protein precipitation, which renders the RapiGest procedure more laborious than the second in solution protocol, termed RapidACN. This rather simple method is based upon a tryptic digest in acetonitrile that is combined with a filter-based sample cleanup 9 . The RapidACN method requires the fewest handling steps and the lowest hands-on time (~2 hrs per sample), overall facilitating the highest throughput among the tested procedures.

Protein identification and compartment specificity
The six protocols, provided as detailed protocols in the Supplementary materials, were used to process an identical, full proteome sample of Saccharomyces cerevisiae. This single-celled eukaryote possesses a proteome of medium complexity (6,000-7,000 protein coding genes 38 ) and has served as a reference organism in many landmark proteome studies 29,39-41 . Here, the use of yeast facilitated sampling from a single culture, bypassing the possibility of biological variability occurring between the samples analyzed. However, once proteins are extracted, the protocols tested here are fully applicable to processing samples obtained from other species as well. To process the yeast pellets, the protocols were executed as closely as possible to their original recipes (with unavoidable minor deviations highlighted in the Protocol section), both in complete replicates (= protocol triplicates), and in injection replicates for comparing the acquisition methods (= injection triplicates). Samples were analysed on a hybrid quadrupole time of flight (TripleTOF5600, AB/Sciex) mass spectrometer for DDA and SWATH acquisition, or on a hybrid quadrupole orbitrap mass spectrometer (QExactive, Thermo Scientific) for DDA acquisition. DDA database searches were conducted using Mascot (for TripleTOF5600 and QExactive, Matrix Science, 42 ) or ProteinPilot 43 (for TripleTOF5600, AB/Sciex), whilst SWATH data was processed with Skyline 31 and Spectronaut 44 (Biognosys) software.
It is noteworthy that in this study the analytical setup was adapted for quantification and not to maximize the number of protein identifications. This involved the injection of low amounts of sample (equalling 1 µg digest per protocol) to prevent column overload and considerable overrun of the dynamic range. Moreover, to allow a direct comparison of the protocols, data was recorded in single injections and samples were not pre-fractionated. This strategy yielded highly reproducible quantification results, achieving coefficient of variation (CV) values below 5% in label-free replicate injections for some protocols, as shown in Figure 4.

Digest efficiencies
As an indicator of the quality of tryptic digests, we first assessed the relative occurrence of partially cleaved peptides in data obtained from triplicate injections on the TripleTOF platform. All filter-aided and in solution protocols yielded reasonable digestion efficiencies as revealed by an analysis with both Paragon (AB/Sciex, Figure 2a) and Mascot (Matrix Science, data not shown) search engines. Both in solution procedures and the eFASP procedure yielded arginine and lysine cleavages in a similar ratio as found in the yeast proteome, with the lowest number of spectra assignable to missed tryptic cleavage sites found in the RapiGest dataset (Figure 2a and Figure 2b). In the fourth protocol (FASP), however, we found that lysine cleavages were overrepresented compared to arginine cleavages (Figure 2b). This indicates that the presence of Lys-C in this protocol increased the overall digestion efficiency at lysine residues; however, this may introduce a bias in (absolute) quantification experiments by overrating lysine over arginine peptides. With the employed in gel protocols we obtained a significantly higher number of spectra that corresponded to uncleaved peptides. As a further indicator of incomplete digestion, these protocols also gave a similar number of arginine and lysine peptides (Figure 2a). Incomplete cleavage of peptides can render a sample preparation unsuitable for absolute quantification, but also for relative quantification, as the rate of cleavage may not be reproducible between replicates 9 . For this reason, we consider the in gel protocols as employed (without pre-fractionation on the whole-proteome sample) to be potentially erroneous in protein quantification and identification, and excluded the data from the assessment of protein quantification quality.
This result however does not exclude the possibility that on other samples, in combination with gel slicing (geLC-MS), or with modified in gel protocols, acceptable cleavage efficiencies are achieved, and thus, this result should not be interpreted as a critique of in gel methods in general.
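The two digest-quality indicators used in this section, the share of peptides without missed cleavage sites and the balance between lysine- and arginine-terminated peptides, can be sketched in a few lines of Python. The sketch assumes the common simplified trypsin rule (cleavage C-terminal to K or R unless followed by P), and the peptide sequences are invented; it does not reproduce the Paragon or Mascot analysis performed in this study.

```python
import re

# Simplified trypsin rule (assumption): cleave after K or R unless followed by P.
TRYPTIC_SITE = re.compile(r"[KR](?!P)")


def missed_cleavages(peptide):
    """Count internal tryptic sites, i.e. sites not at the peptide's C-terminus."""
    return sum(1 for m in TRYPTIC_SITE.finditer(peptide) if m.end() < len(peptide))


def digest_summary(peptides):
    """Summarise a list of identified peptide sequences.

    Returns the fraction of peptides without missed cleavages and the ratio
    of lysine-terminated to arginine-terminated peptides.
    """
    fully_cleaved = sum(1 for p in peptides if missed_cleavages(p) == 0)
    k_term = sum(1 for p in peptides if p.endswith("K"))
    r_term = sum(1 for p in peptides if p.endswith("R"))
    return {
        "fully_cleaved_fraction": fully_cleaved / len(peptides),
        "k_to_r_ratio": k_term / r_term if r_term else float("inf"),
    }


# Invented peptide list: "AKDER" carries one missed cleavage (internal K),
# while the K in "VVKPEK" precedes P and is therefore not a tryptic site.
summary = digest_summary(["AKDER", "LLSK", "GGR", "VVKPEK"])
```

A K-to-R ratio well above the one expected from the proteome would point to a lysine bias of the kind reported here for the Lys-C containing FASP protocol, whereas a low fully cleaved fraction would indicate incomplete digestion.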

Protein identification
The number of detected peptides correlated with the sum of recorded total peak area, confirming that the instrument was operating within its dynamic range (Figure 2c). The yield of detected peptides (Figure 2c) and proteins (Figure 2d) revealed different performance of the tested protocols. For both data dependent (DDA) and SWATH acquisition, the two in solution protocols (RapiGest and RapidACN) gave the highest number of detectable peptides and proteins. Filter-based FASP and eFASP protocols ranked in the middle range, whilst a significantly lower number of proteins were detected from the in gel digests. Of note, SDS-based compared to native protein extraction increased the number of membrane protein detections in the in gel procedure, but in total a higher number of peptides were obtained in the natively extracted samples.
To exclude the possibility that these results were platform specific, we injected the same samples on a QExactive mass spectrometer, operating with a different HPLC system and column (Dionex Ultimate 3000; 2 µm particle size C18, 75 µm i.d. × 50 cm column, see Methods section). The number of protein IDs obtained with the two platforms correlated linearly, indicating that the ID performance of the tested protocols is platform independent (Figure 2d, Inset). Additionally, we tested to what extent injecting higher amounts of sample or pre-fractionation would increase the number of identifiable proteins. A single injection of 10 times the RapidACN sample increased the number of identifiable proteins by 34% to 1550 (QExactive), while high-pH RP HPLC pre-fractionation of a RapidACN digest led to the identification of 2800 proteins (TripleTOF). Similar tendencies were observed with the other protocols as well, indicating that when combined with sample pre-fractionation, all protocols and both platforms are suitable for ID-optimized experiments, as addressed in other studies. To be able to compare data dependent (DDA) and data independent (DIA) acquisition in terms of protein detection, we then analysed the samples using SWATH mode. Overall, when setting the highest quality threshold on SWATH-detected peptides (Spectronaut Q value < 0.01), SWATH and DDA detected a comparable number of proteins for the in gel and FASP procedures. However, SWATH outperformed DDA in the samples with higher peptide content, RapiGest and RapidACN, leading to a modest but consistent increase in protein detection numbers (Figure 2d).
Performance of sample preparation methods in covering the variety of the proteome
Next we used the TripleTOF/DDA data to assess whether the protocols covered a similar set of proteins. Owing to the high occurrence of uncleaved peptides, which may affect protein identification, the in gel methods are omitted from this illustration. Indeed, > 95% of the proteins identified in the in gel samples were covered by the filter-aided and in solution methods as well (data not shown). All other protocols covered specific sets of proteins. RapiGest yielded the highest absolute number of unique IDs, while eFASP provided the highest percentage. Hence, in targeted proteome studies, sample preparation with different protocols might be considered in order to increase the probability of quantifying the desired target.

We next assessed whether these differences correlated to the coverage of cellular localisations. The tested protocols gave high coverage of the GO term cytosol, and performed equally on the mitochondrial proteome ([…] for a complete overview of GO terms). However, different results were obtained for membrane proteins. The lowest relative content of membrane proteins was obtained for those protocols that extract proteins under non-denaturing conditions, namely RapidACN and in gel/ABC. Conversely, most membrane proteins were detected in the detergent-rich protocols, eFASP and RapiGest. Overall, FASP and eFASP yielded the most balanced representation of both the membrane and cytosolic fraction, while RapidACN data exhibited the strongest bias towards cytosolic and against membrane proteins (Figure 2c).

Finally, we tested whether the protocols covered the proteomic mass range and charge state equally. The proteomic mass range was similarly represented by all protocols with a slight positive bias towards large proteins in all protocols (Supplementary Figure 1a). The procedures, however, differed in the representation of proteins with a certain isoelectric point (pI). The best representation of the proteome pI distribution was obtained with RapiGest (deviation coefficient (d) = 2.4), followed by FASP (d = 2.8) and RapidACN (d = 2.9) (Supplementary Figure 1b). In gel procedures scored least as they were negatively biased towards neutral proteins, with d values of 5.3 and 5.9 for in gel/ABC and in gel/SDS, respectively.

Label-free quantification
Next, we compared the protocols for their consistency in label-free quantification. As illustrated in Figure 2c, the number of identified peptides correlated with the sum of total peak area recorded, hence all procedures in principle lead to quantitative results. To be able to compare the protocols, we expressed the variation of signal intensities obtained from replicate sample preparations as coefficient of variation (CVs), and we plotted the frequency of CVs in two-dimensional distribution histograms ('violin plots', Figure 4a). DDA acquisition resulted in a CV maximal likelihood of 20% for eFASP, FASP and RapiGest. Although most peptides showed a variation of this magnitude, it is worth noting that there was a considerable spread of CVs in all three protocols, with some peptides showing as much as 140% variation. By far the highest signal reproducibility, with a CV maximal likelihood of 7%, was obtained with the RapidACN protocol (Figure 4a), indicating best suitability of this protocol in label-free quantification.

Next, we counted the number of precisely quantified peptides, defined as peptides with a CV < 15%. Also in this measure, the RapidACN procedure outperformed the other methods, while RapiGest and eFASP performed second and third best, respectively (Figure 4c). Not covered in this benchmark is the performance of the individual protocols in repeated sample preparation over longer periods, i.e. weeks to months. This might be required for particular sets of samples that cannot be stored without a protease digest, yet require sampling on different days to address a specific biological question.
Finally, we tested whether SWATH analysis improved label-free quantification. Comparing the CV distribution of peptides detected both in DDA and SWATH data using the RapiGest protocol (Figure 4b), we observed a much narrower CV distribution around a maximal likelihood of 5% in SWATH, compared to a maximal likelihood of 20% in DDA mode. When counting the number of precisely quantified peptides (CV < 15%) on the TripleTOF instrument, SWATH led to an increase of up to a factor of two and five for RapidACN and RapiGest, respectively (Figure 4c). Hence, SWATH acquisition greatly reduced signal variation in label-free acquisition, so that a substantially higher number of peptides were precisely quantified.
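The CV-based criterion used throughout this section can be written down compactly. The Python sketch below computes the coefficient of variation of replicate peak areas and selects peptides below the 15% threshold; the peptide names and peak areas are invented, and the use of the sample standard deviation is an assumption, as the estimator is not specified here.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def precisely_quantified(replicate_areas, threshold=15.0):
    """Return the peptides whose CV across replicates is below `threshold` percent."""
    return [
        peptide
        for peptide, areas in replicate_areas.items()
        if cv_percent(areas) < threshold
    ]


# Invented peak areas from three replicate preparations (arbitrary units).
areas = {
    "PEPTIDEA": [100.0, 105.0, 95.0],   # tight replicates
    "PEPTIDEB": [100.0, 160.0, 40.0],   # strongly scattered replicates
}

precise = precisely_quantified(areas)
```

In this toy data set only the first peptide passes the 15% threshold. Applying the same count to DDA-derived and SWATH-derived peak areas of the same samples gives the kind of comparison summarised in Figure 4c.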

Discussion
Stable isotope labelling is a popular and reliable strategy in quantitative proteomics, yet has limitations that arise from an increased analyte load in the precursor ion (MS1) space, and the way standards are produced or incorporated. For instance, targeted protein quantification using AQUA peptides 45 achieves absolute quantification through comparison between the peak areas of light and chemically synthesized heavy-isotope labelled peptides of known concentration. However, the cost of such peptides limits the number of proteins quantifiable 7,45 . An alternative strategy is the non-targeted chemical labelling of proteins and peptides with isobaric tags (e.g. iTRAQ, TMT), facilitating multiplexing of proteome samples and providing simultaneous relative quantification of labelled peptides 8,46 . However, frequent co-selection of the reporter ions reduces both the accuracy and precision of quantification 47,48 . Such a problem is circumvented when metabolic incorporation of isotope-labelled amino acid residues (e.g. SILAC 49 , or recent extensions such as NeuCODE, which is based on different nuclear masses dependent on the isotope combination integrated 50 ) is used to create isotope-labelled standards in vivo. However, this approach is limited to heterotrophic species that consume lysine and arginine from the culture medium, and is in practice limited to tissue culture, as introducing labelling in animal models becomes extremely expensive 51 .
Label-free experiments circumvent the use of isotope-labelled standards and are thus not affected by the above-mentioned limitations. As such, label-free experiments are ideal complements when isotope labelling becomes a limitation. However, label-free strategies lack the means to correct for selective sample loss, and hence are more sensitive to variations in sample preparation and instrument performance. The protocols employed consequently require more rigorous validation.
In gel digests
Our comparison starts with a classic in gel digestion method 19 , which is tested in combination with SDS-containing and SDS-free protein extractions (Supplementary protocol 1 and Supplementary protocol 2). These popular cost-effective procedures are based on the principle that a protein sample is denatured and separated on an SDS-PAGE gel prior to reduction, alkylation and protease digestion that are conducted within the gel matrix. The gel fulfils the function of sample clean up, as it removes positively charged contaminants as well as large macromolecules (e.g. nucleic acids) and small chemical compounds, and is very robustly applied to a large variety of sample types. Furthermore, the excision of individual bands or mass ranges makes in gel digestions attractive wherever a simple sample pre-fractionation is required. Proteome pre-fractionation in gel (geLC-MS) has resulted in a significant proteome depth and dynamic range in studies where > 5000 distinct proteins were confidently identified and quantified 52,53 . Moreover, in gel digests have proven ideal when gel bands resulting from individual proteins are to be identified (e.g. for studying protein complexes). In the present study however, we did not make use of sample pre-fractionation. In order to achieve comparability with the other protocols, the full mass range was processed for the digest (see Methods section, and Supplementary protocol 1 and Supplementary protocol 2). This treatment led to a full representation of the proteomic mass distribution (Supplementary Figure S1). Under these circumstances however, the classic in gel protocol applied proved the least suitable method for label-free quantification. The protocol was the most time consuming, yet yielded a significant number of miscleaved peptides, and we detected the lowest number of proteins and peptides in total.
Differences between SDS-free and SDS-containing sample extraction affected the relative content of membrane proteins identified, which was higher in the latter, whereas the native (SDS-free) extraction resulted in a higher number of proteins identified in total. This result should however not be interpreted as a general critique of in gel methods for other applications, as in combination with protein pre-fractionation (gel slicing), they have proven to be well-suited sample preparation methods in ID experiments 52,53 .

Filter-aided sample preparation
The dependence on filter units in the two tested filter-aided sample preparation procedures, FASP 32 , and one of its recent extensions (here called eFASP 21 ), increases the material costs, but has advantages for sample handling and throughput. Indeed, handling of the first protocol, FASP, was efficient and achieved a reasonable throughput with modest hands-on time (Supplementary protocol 1). In protein identification, FASP achieved the highest relative amount of detected membrane proteins. Hence, this protocol might be an ideal choice when membrane proteins are to be studied.
FASP was the only protocol in this study where digestion was carried out using a combination of proteases, Lys-C and trypsin. Similar to previous reports 54 , we observed that the addition of Lys-C increased the relative digestion efficiency. However, this resulted in an over-representation of lysine- over arginine-containing peptides, which may lead to bias in cases where this protocol is used in an absolute quantification experiment.
In label-free quantification, FASP performance compared to the other protocols was average, both in the number of precisely quantified peptides and in the CV values obtained for replicate sample preparations. It is important to mention in this context that the performance of FASP procedures depends on the filter units, which are available from different manufacturers; however, the exact filter unit used in the original FASP paper 32 is no longer available. In this study we chose Amicon Ultra-0.5 3k for both FASP-based protocols, as used in eFASP by Shevchenko et al. 21 , as their cut-off (3 kDa) is the closest to the addressable mass range of the SWATH acquisition (400-1200 m/z). Further work from Wisniewski et al. demonstrated that larger cut-offs of up to 50k are also suitable in combination with the FASP protocol, and can improve the identification rate of larger proteins and peptides 55 . Moreover, in contrast to the other protocols tested in this study, the tryptic digest in FASP is conducted in a very high concentration of urea. A simple protocol adaptation to influence the tryptic digest could thus be to change the buffer conditions, e.g. to a buffer as used in eFASP 21 (Supplementary protocol 4).
The second filter-aided protocol, eFASP, represents a stepwise optimisation of FASP, and contains several alterations compared to its predecessor 21 (Supplementary protocol 4). The protease digest is performed using trypsin only, and the protocol includes a lipid removal step and uses n-octyl-d-glucopyranoside (nOGP) as the detergent in sample preparation. The latter might be regarded as an undesirable addition to the sample, as nOGP can interfere with electrospray ionisation. Indeed, despite all washing steps, we could detect traces of nOGP in the MS/MS spectra, and the collection of MS data was reduced at the time a nOGP sodium adduct eluted (data not shown). Despite this, the modifications made for eFASP clearly improved the performance in protein and peptide identification. However, in our hands, they did not improve the precision in label-free quantification, so the performance of FASP and eFASP in this measure was comparable (Figure 4). Hence, the main advantage of eFASP over FASP lies in improvements in protein identification and proteome coverage. Please note that the protocol by Shevchenko et al. differs from a recently published protocol by Erde et al., also termed eFASP.

In solution digestion
The first method tested (Supplementary protocol 5) is based upon the commercial reagent RapiGest SF (3-[(2-methyl-2-undecyl-1,3-dioxolan-4-yl)methoxy]-1-propanesulfonate 37 (Waters)), an anionic detergent which is depleted from the sample through acidic cleavage. The established protocol 22 contains a step for lipid removal and a precipitation step that render this procedure more laborious compared to the FASP and RapidACN protocols. However, as it does not involve any filter unit, it was most economic in terms of material costs per sample if one disregards the in gel protocols. Moreover, it yielded the highest number of protein and peptide IDs, and it detected the highest absolute number of membrane proteins. In label-free quantification, it scored third best in the average CV for DDA, and second best in combination with SWATH acquisition. Expressed in absolute quantities, this method yielded the second-highest number of precisely quantified peptides. Thus, the RapiGest protocol is a versatile and economic method that may represent the optimal choice in many applications. The only inexplicable issue with this protocol was related to the inefficiency of RapiGest degradation and precipitation in a small subset of samples. Thus care must be taken to avoid detergent injection in the LC-MS/MS setup.
The second in solution protocol (termed RapidACN 9 , Supplementary protocol 6) is detergent-free and based on acetonitrile in sample processing and proteolytic cleavage, followed by clearing samples from high-molecular weight contaminants by a final filtration step. As this protocol is based on a native protein extraction, it identified, in relative terms, the lowest number of membrane proteins. Moreover, as it does not contain an intensive pre-digest sample treatment, this protocol may not be suitable for tissue where such an upfront clean-up is mandatory. Despite these limitations, RapidACN performed best in the metric most crucial for robust label-free quantification, a low CV value in replicate sample digests and injections.
Moreover, compared to the other tested methods, RapidACN was simplest in handling, required the fewest processing steps and only minimal hands-on time (~2 hrs), while yielding the second-highest number of detected proteins and peptides in both DDA and SWATH acquisition methods. Hence, RapidACN might be the most suitable solution for a label-free experiment when the focus is not to quantify membrane proteins, or to analyze tissue that requires extensive clean-up.

Data-dependent versus data-independent acquisition
We chose to perform major parts of this study on a TripleTOF5600 instrument (AB/Sciex), in order to compare data-dependent acquisition (DDA) with data-independent acquisition (DIA). DIA is believed to be advantageous for label-free quantification, as it is less affected by run-to-run variations, and as MS2 data is reconstructed in chromatograms that resemble selected reaction monitoring (SRM) 17 . Therefore, this technique appears a desirable choice for the label-free analysis of biological time series that require many samples (replicates over many time-points) to be compared 15 . The design of the TripleTOF5600 quadrupole allows precursor ion selection in a rectangular rather than a Gaussian mass selection window as in other instruments, reducing the co-selection of peptides falling in the adjacent mass windows 23 . In a workflow termed SWATH, the mass range from 400 to 1200 m/z (in this study: 400-1250 m/z) is scanned in 25 Da windows, and the merged data used to reconstruct spectral (MS2) m/z chromatograms 17 . Processing SWATH data with Spectronaut (V. 3.0.337, Biognosys), we compared the performance of DDA with SWATH in protein detection and label-free quantification. In samples with low peptide content, the number of detected proteins with DDA and SWATH was comparable. However, in the in solution protocols that led to highest IDs, SWATH acquisition gave a slight but significant advantage in terms of peptides detected. This indicates that this approach is advantageous in protein detection when coupled with complex matrices. In label-free quantification, however, SWATH was clearly advantageous. The strongest improvement for SWATH over DDA acquisition was observed when it was used in conjunction with the RapiGest protocol (Supplementary protocol 5), where the number of precisely quantified peptides increased by a factor of five, followed by the combination with RapidACN (Supplementary protocol 6), where this measure doubled (Figure 4c). Of note, SWATH employed in combination with RapidACN resulted in an average CV below 5%, a superior value for a label-free experiment. These improvements mainly resulted from a more precise quantification of peptides in the mid to high abundance range, whereas there was no comparable improvement in the quantification of low-abundance spectra. We assume that this difference could be further optimized by improving the SWATH peak selection algorithms, as noise in the low abundance window results from occasional misassignment of fragment ions to precursors.
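The two precision measures referred to above, the average CV across replicates and the number of precisely quantified peptides, can be illustrated with a minimal sketch. It assumes that the CV is computed per peptide as the sample standard deviation of the replicate peak areas divided by their mean, and that a peptide counts as precisely quantified when its CV lies at or below a cut-off. The 20% cut-off, the function names, the peptide names and the peak areas are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def coefficient_of_variation(areas):
    """Return the CV (in percent) of replicate peak areas for one peptide."""
    avg = mean(areas)
    if avg == 0:
        raise ValueError("mean peak area is zero; CV is undefined")
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    return 100.0 * stdev(areas) / avg


def summarise_precision(peptide_areas, cv_cutoff=20.0):
    """Summarise quantification precision over a set of peptides.

    peptide_areas maps a peptide identifier to its replicate peak areas.
    cv_cutoff is an assumed threshold (percent), not a value from the study.
    Returns (average CV in percent, number of peptides with CV <= cv_cutoff).
    """
    # Peptides with fewer than two replicate values have no defined CV
    cvs = {
        peptide: coefficient_of_variation(areas)
        for peptide, areas in peptide_areas.items()
        if len(areas) >= 2
    }
    n_precise = sum(1 for cv in cvs.values() if cv <= cv_cutoff)
    return mean(list(cvs.values())), n_precise


# Invented peak areas for two peptides measured in three replicates
replicates = {
    "PEPTIDE_A": [100.0, 102.0, 98.0],  # stable signal
    "PEPTIDE_B": [50.0, 80.0, 20.0],    # highly variable signal
}
avg_cv, n_precise = summarise_precision(replicates)
print(f"average CV: {avg_cv:.1f}%, precisely quantified peptides: {n_precise}")
```

Applying the same summary to the peptide tables of each protocol and acquisition mode would allow the kind of comparison described here, provided the same cut-off and replicate structure are used throughout.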

Conclusions
By facilitating label-free quantification, second-generation proteomics techniques enable flexible proteomic workflows. As the protocols cover different sets of proteins and cellular compartments, the main determinant for selecting the most suitable method and workflow remains the biological question and the set of proteins to be addressed. Despite this, sample preparation methods differ in precision, sensitivity and throughput. Under the conditions of this benchmark, and under the conditions in our laboratory, a combination of the in solution digestion protocols RapiGest or RapidACN with SWATH acquisition yielded optimal results for a label-free proteomics experiment. Achieving reliable quantification at reasonable numbers of detected proteins, label-free quantitative proteomics represents a suitable alternative to isotope labelling in addressing a series of biological problems.

Competing interests
The authors declare no competing interests.

Supplementary protocol 1: In-gel digestion in combination with SDS extraction
• Add SDS-PAGE loading buffer to a lysate aliquot containing 50 µg protein and incubate at 95°C for 5 min to denature proteins
• Perform electrophoresis for 20 min at 100 V to allow the proteins to enter the separating gel
• Excise the sample from the gel in a single slice

Excision and in-gel digestion of protein bands
• Cut the gel slice into small pieces (1 mm) and place into a new reaction tube
• Add 100 µL (or enough to cover) 25 mM ABC/50% (v/v) ACN and vortex for 10 min
• Centrifuge at 16,000 × g for 30 sec and remove and keep the supernatant using a gel-loading micropipette tip. Repeat this step 2 or 3 times
• Evaporate the solvents in a vacuum concentrator centrifuge (approximately 20 min)
• Add 50 µL (or enough to cover) 10 mM DTT in 25 mM ABC to the dried gel pieces
• Vortex and centrifuge at 16,000 × g for 30 sec
• Incubate with the reductive solution at 56°C for 1 h
• Remove the supernatant and add 50 µL (or enough to cover) 50 mM IAA to the gel pieces. Vortex and centrifuge at 16,000 × g for 30 sec
• Incubate with the alkylation solution in the dark for 30 min at room temperature, with occasional vortexing. Centrifuge at 16,000 × g for 30 sec
• Remove the supernatant. Add ~100 µL 25 mM ABC to the gel pieces. Vortex for 5 min and centrifuge at 16,000 × g for 30 sec
• Remove the supernatant and add ~100 µL (or enough to cover) 25 mM ABC/50% (v/v) ACN to dehydrate the gel pieces. Vortex for 5 min and centrifuge at 16,000 × g for 30 sec. Repeat this step
• Evaporate the solvents from the gel pieces in a vacuum concentrator centrifuge (approximately 20 min)
• Add 10 µL of trypsin (10 ng/µL) to the dried gel pieces and incubate for a few min to allow rehydration
• Add 25 µL 25 mM ABC (or sufficient volume to cover the gel pieces), vortex for 5 min, centrifuge at 16,000 × g for 30 sec and incubate at 37°C overnight in a wet chamber
• Centrifuge at 16,000 × g for 30 sec. Add 10 µL water, vortex for 10 min and centrifuge at 16,000 × g for 30 sec
• Transfer the tryptic peptides (aqueous extraction) into a new reaction tube
• Add 30 µL (or enough to cover) of 50% (v/v) ACN/5% (v/v) FA to the gel pieces, vortex for 10 min and centrifuge at 16,000 × g for 30 sec. Combine the supernatants of this and the previous step. Repeat this step once more
• Add 10 µL 100% (v/v) ACN to the gel pieces, vortex for 5 min and centrifuge at 16,000 × g for 30 sec. Combine with previous extractions
• Centrifuge the tryptic peptide mix at 16,000 × g for 30 sec and evaporate solvents in a vacuum concentrator centrifuge (approximately 2 hrs)
• Re-suspend the peptides in 50 µL 5% ACN/0.1% FA to obtain a final concentration of 1 µg/µL
• Aliquot and store tryptic peptides at -80°C

Compared to the original protocol the following changes were made:
• 4% SDS and 0.1 M DTT were added to the lysis buffer
• The lysis of yeast cells was performed using Fast Prep at 6.5 m/s, 20 sec. This step was repeated 3 times with a 5 min interval on ice in between runs
• An incubation at 95°C for 5 min was performed to achieve a complete lysis of cells

• Add SDS-PAGE loading buffer to a lysate aliquot containing 50 µg protein and incubate at 95°C for 5 min to denature proteins
• Perform electrophoresis for 20 min at 100 V to allow the proteins to enter the separating phase
• Excise the sample from the gel in a single slice

Excision and in-gel digestion of protein bands
• Cut the gel slice into small pieces (1 mm) and place into a new reaction tube
• Add 100 µL (or enough to cover) 25 mM ABC/50% (v/v) ACN and vortex for 10 min
• Centrifuge at 16,000 × g for 30 sec and remove the supernatant using a gel-loading micropipette tip. Repeat this step 2 or 3 times
• Evaporate the solvents in a vacuum concentrator centrifuge (approximately 20 min)
• Add 50 µL (or enough to cover) 10 mM DTT in 25 mM ABC to the dried gel pieces
• Re-suspend the peptides in 50 µL 5% ACN/0.1% FA to obtain a final concentration of 1 µg/µL
• Aliquot the flow-through and store tryptic peptides at -80°C

Compared to the original protocol the following changes were made:
• Cell lysis and protein extraction was performed in 50 mM ABC
• Apply 50 µL 0.5 M NaCl to the filter unit and centrifuge at 14,000 × g for 20 min

• The lysis of yeast cells was performed on a Fast Prep instrument
• Add TFA to reach a final concentration of 0.5% and remove salts from the filtrate

Desalting of peptides
• Place a 3 ml MILI-SPE Extraction disk cartridge (C18-SD) in a 15 ml conical tube
• Add 1 ml TFA and centrifuge at 1,500 × g for 1 min
• Add 0.5 ml of 0.1% TFA, 70% ACN in water and centrifuge at 1,500 × g for 1 min
• Add 0.5 ml of 0.1% TFA in water and centrifuge at 1,500 × g for 1 min
• Load the filtrate and centrifuge at 150 × g for 3 min
• Add 0.5 ml of 0.1% TFA in water and centrifuge at 150 × g for 3 min
• Transfer the cartridge to a new tube, add 0.5 ml of 70% ACN in water and centrifuge at 150 × g for 3 min
• Collect the flow-through that contains the desalted peptides
• Centrifuge at 20,000 × g for 3 min

Solutions and reagents
• Transfer supernatant to a new reaction tube
• Centrifuge at 20,000 × g for 10 min
• Adjust the volume with UPLC/MS water to obtain a final protein concentration of 1 µg/µL
• Aliquot and store tryptic peptides at -80°C

Compared to the original protocol the following changes were made:
• Protein pellets were re-suspended by sonication using RapiGest as surfactant
• Protein precipitation was performed using 20% TCA
• Protein pellets were washed twice using 80% acetone
• Trypsin was added in two sequential steps to reach a final enzyme to protein ratio of 1:20

• Re-suspend in 50 µL 5% ACN/0.1% formic acid
• Add 0.7 µL 10% FA, vortex and centrifuge at 16,000 × g for 1 min
• Incubate at room temperature for 5 min
• Recommended optional clean-up: Apply sample to a spin filter unit and centrifuge according to the manufacturer's instructions
• Aliquot the flow-through and store tryptic peptides at -80°C

Compared to the original protocol the following changes were made:
• Trypsin was added at an enzyme:protein ratio of 1:20
• Trypsin was added in two sequential steps

This is a well written manuscript and provides an excellent reference for sample preparation ahead of any label free proteomics experiment. The protocols in the supplementary section are very detailed, making them very easy for both novice and expert researchers to follow. The manuscript basically compares three different types of sample preparation and describes the positives and negatives of each protocol if applied to label free quantification of proteins. This manuscript will be a very useful reference for anyone who wishes to perform label free quantification of proteins, as the authors do highlight both the advantages and disadvantages of each approach.
The minor revisions I would like to see are in the description of creating a SWATH ion library using offline high pH RP-HPLC. The authors mention the addition of iRT peptides to each fraction, but do not describe why they are added. These retention time reference peptides are added so that any retention time drift from the ion library to the samples can be accounted for by the software used to analyse and quantify the SWATH data. I feel this should be described, especially for novices to the field. Secondly, there is a slight complication in using the term eFASP in this manuscript. A very recent publication by Erde et al. (2014) uses the term eFASP to describe a method that differs significantly from the one described here. I think this paper should be referred to in this manuscript just to highlight that the eFASP method used here was based on the method described by Shevchenko et al. (2012) and is not to be confused with this more recent publication. Apart from these minor corrections, I have no reservations in approving this article.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
No competing interests were disclosed.

We thank the reviewer for these suggestions, and for spotting the nomenclature overlap with Erde et al. (2014). As the Erde article was published ~three months after the first version of this article, we decided to keep the nomenclature, but added a caveat for clarification in two places in the paper. An explanation/rationale about the use of retention time normalisation standards is now included as well.

Competing Interests: None

The authors evaluated 6 sample preparation methods for label-free quantification, compared DDA and SWATH approaches on a TripleTOF5600, and drew two major conclusions. The organization of the manuscript is clear, it is well-written, and the coverage is effective. The first conclusion is supported by their results. However, the second conclusion should be clarified to state that SWATH outperformed DDA in quantification when a TripleTOF5600 was applied. Without the clarification, it misleads scientists to believe that SWATH outperforms all DDA approaches carried out using other mass spectrometers. Figure 2C shows that DDA of QExactive detected about 40% more proteins than SWATH of TripleTOF5600. It is necessary to compare DDA of QExactive with SWATH of TripleTOF5600.

Minor issues:
"Protein Discoverer" should be "Proteome Discoverer"

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
No competing interests were disclosed.

We thank the reviewer for the suggestion and have now clarified this. Indeed, the TripleTOF5600 and QExactive data can only be compared indirectly, in a correlative manner, in this study, but not in absolute terms, as different chromatography, HPLC, and ionisation settings were used. Therefore, the study cannot be used to compare the QExactive and TripleTOF instruments. However, in relative terms the protocols perform equally on both platforms, see Figure 2d, inset. As suggested, we have improved this part in the manuscript and abstract, and included the Reviewer's comments.
Competing Interests: No competing interests were disclosed.