Liquid chromatography coupled to mass spectrometry (LC–MS)-based bottom-up proteomics is a rapidly evolving field. Recent technological developments, including faster, more sensitive instruments, novel acquisition methods and advanced data processing, have pushed the limits of throughput and sensitivity. This progress has reduced costs and increased reproducibility, establishing proteomics as a powerful tool for basic biological discovery and translational research and as a potential foundation for personalized medicine1.
Data-independent acquisition (DIA)2,3,4 proteomics has grown in popularity. Advances in both instrumentation and analysis software have addressed its earlier limitations, while strengthening its core advantages: high proteomic depth, data completeness and improved quantitative performance1. Furthermore, a number of recent DIA technologies now offer the reliable fragment-to-precursor mass assignment that was previously the main strength of data-dependent acquisition (DDA), promising to expand the applicability of DIA even further5,6,7,8. Improvements in sensitivity9,10,11,12,13,14,15 and the rise of multiplexed DIA16,17,18,19,20,21,22 suggest that DIA will expand into applications that have traditionally relied on targeted methods.
Much of the progress in DIA has been driven by data processing software improvements23,24,25,26,27,28,29,30, with new algorithms enabling rapid gains in the number of proteins identified. Advanced machine learning now allows peptides to be confidently matched to recorded signals, even in the presence of noise and signal interferences from coeluting and cofragmenting peptide species. This capability, however, raises a critical question: how reliable are the quantitative values derived from these extra identifications and how much do they benefit the biological conclusions? This issue is especially important for modern, high-sensitivity, high-throughput workflows, such as single-cell proteomics or spatial tissue proteomics, that generate challenging data at scales of hundreds of samples per day5,9,11,21,31,32,33,34,35,36,37.
Substantial effort has been directed toward developing computational methods that can improve quantitative reproducibility, precision and accuracy of proteomic experiments. These include deconvolution of spectra38, selection of peptide fragment ions based on the signal quality21,26,39 and protein quantification through aggregation of multiple parallel sources of quantitative information, such as peptide-level MaxLFQ for DDA40 and fragment-level MaxLFQ for DIA41 or directLFQ42. While advanced methods for error control and missing data handling have also been developed for statistical analysis of proteomics data, as discussed and benchmarked recently43,44, a fundamental problem remains: while peptide identification error rates are reliably controlled (for example, using target–decoy competition), it is currently challenging to reliably estimate quantification errors.
In each tandem MS acquisition, the MS instrument records multiple intensity signals for each detected peptide precursor: the unfragmented precursor intensity (MS1) and the signals from its fragment ions (MS/MS). The quantification algorithm previously implemented in our DIA-NN software (‘legacy’ DIA-NN mode) quantifies each precursor by summing the three highest-quality fragments selected across runs using correlation-based scores26. This approach filters out signals that are strongly affected by interferences in multiple runs but remains susceptible to interferences observed in individual acquisitions. Moreover, much of the available information, including the MS1 signal, is discarded, although recent works showcase the relevance of MS1 information for more accurate quantification in multiplexed DIA21 and for downstream statistical analysis45. Integrating all available quantitative features (MS1 and MS/MS signals) should instead allow higher accuracy and precision to be achieved than is possible with a limited subset. However, measured signals are subject to errors caused by random noise46,47 and ‘interferences’ (ref. 4), and an integration algorithm must account for these.
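The legacy scheme described above can be illustrated with a minimal sketch. It assumes that fragment intensities and per-run quality scores are already available as nested dictionaries, and it ranks fragments by their score averaged across runs. The function name, the data layout and the averaging rule are illustrative assumptions, not the DIA-NN implementation.

```python
from statistics import mean

def legacy_top3_quantity(intensities, scores, n_top=3):
    """Sum the n_top best-scoring fragments, chosen once for all runs.

    intensities, scores: dict mapping run -> dict mapping fragment -> float.
    The scores stand in for correlation-based quality scores (assumed given).
    """
    fragments = sorted({f for run in scores.values() for f in run})
    # Average each fragment's quality score over the runs where it was scored.
    avg_score = {
        f: mean(run[f] for run in scores.values() if f in run)
        for f in fragments
    }
    # The same fragments are used in every run, so a fragment that is
    # interfered with in only one run can still bias that run's quantity.
    selected = sorted(fragments, key=lambda f: avg_score[f], reverse=True)[:n_top]
    quantities = {
        run: sum(frags.get(f, 0.0) for f in selected)
        for run, frags in intensities.items()
    }
    return selected, quantities

# Hypothetical example values, for illustration only.
example_intensities = {
    "run1": {"y4": 100.0, "y5": 200.0, "y6": 50.0, "b3": 400.0},
    "run2": {"y4": 110.0, "y5": 190.0, "y6": 55.0, "b3": 90.0},
}
example_scores = {
    "run1": {"y4": 0.90, "y5": 0.95, "y6": 0.80, "b3": 0.30},
    "run2": {"y4": 0.92, "y5": 0.97, "y6": 0.85, "b3": 0.20},
}
selected, quantities = legacy_top3_quantity(example_intensities, example_scores)
```

In this toy input the low-scoring fragment is excluded from both runs, while the MS1 signal plays no role at all, which is the information loss discussed above.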
To integrate all of the available MS1 and MS/MS information in a statistically justified manner, we devised QuantUMS (quantification using an uncertainty-minimizing solution; Fig. 1a). QuantUMS implements an algorithm that is capable of performing quantification of precursors using any subset of quantitative features, wherein the relative contributions of features are determined by their respective quality metrics and the hyperparameters of the algorithm, by modeling the statistical properties of LC–MS-derived errors. QuantUMS then optimizes its hyperparameters toward two goals. First, concordance of relative precursor quantities calculated using distinct quantitative features is maximized, improving precision. Here, QuantUMS builds on the idea that the ratios between quantitative features of the same precursor are expected to be consistent across acquisitions. Second, hyperparameters are also tuned to make the distributions of quantitative ratios obtained using high-quality and low-quality signals similar, tackling any ratio compression bias that affects noisy quantities and improving overall accuracy. Lastly, the optimized algorithm is used to quantify first precursors and then proteins, using all available quantitative features (Fig. 1a, Methods and Supplementary Information).
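The idea of weighting features by their estimated reliability can be sketched with textbook inverse-variance weighting of log-ratios. This is a simplified stand-in under the assumption of independent errors with known log variances; how QuantUMS derives variance estimates from quality scores and hyperparameters is described in the Methods and is not reproduced here. The function name is hypothetical.

```python
def combine_log_ratios(log_ratios, variances):
    """Inverse-variance weighted mean of per-feature log2 ratios.

    log_ratios: log2 ratio of one precursor between two runs, as measured
                by each quantitative feature (MS1, individual fragments).
    variances:  assumed log-variance estimate for each of those ratios.
    Returns the combined log2 ratio and its propagated variance.
    """
    if not log_ratios or len(log_ratios) != len(variances):
        raise ValueError("need one positive variance per log-ratio")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * r for w, r in zip(weights, log_ratios)) / total
    # For independent errors, the variance of the weighted mean is 1 / sum(w).
    return combined, 1.0 / total

# Hypothetical values: two concordant, low-variance features and one noisy one.
ratio, variance = combine_log_ratios([1.0, 1.5, -1.0], [0.25, 0.5, 4.0])
```

The noisy feature contributes little to the combined ratio, and the propagated variance is smaller than that of any single feature, which is the sense in which using all features can outperform a fixed subset.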

a, QuantUMS takes as input integrated raw signals of individual quantitative features (MS1 precursor and MS/MS fragment ions), as well as their quality scores. These allow QuantUMS to correct signal intensities to remove bias and to estimate their log variances, in a process controlled by hyperparameters. The signal log-variance estimates allow for a statistically justified weighted aggregation of signals to obtain precursor quantities and can also be propagated through weighted aggregation to obtain log-variance estimates for these precursor quantities, allowing QuantUMS to report a ‘quantity quality’ metric for each precursor. QuantUMS also establishes concordance metrics by comparing parallel channels of quantitative information, enabling it to optimize the hyperparameters using machine learning. Precursor quantities and their log-variance estimates are likewise aggregated to obtain protein quantities and the respective quantity quality metrics for proteins. b, Performance of QuantUMS on a mixed-species dataset. Three mixtures A, B and C of human (K562) and E. coli tryptic digests with proportions A:B:C being 1:1:1 (human) and 50:33:20 (E. coli) were recorded (ref. 37) using a 5-min analytical flow gradient on timsTOF Pro and analyzed with DIA-NN using legacy and QuantUMS quantification methods. Resulting protein ratios between mixtures A and C are visualized. c, The effect of quality filtering enabled by QuantUMS. Data are shown for the high-accuracy mode. On the box plots (b,c), boxes correspond to the interquartile range with the median indicated and whiskers extend to 5–95% quantiles. Horizontal lines indicate the expected ratios for each species. Bar plots indicate the MADs of A:C ratios.
QuantUMS thereby addresses two central challenges of LC–MS proteomics. First, measured signals may be subject to interferences, which bias their quantities upwards and thereby cause ratio compression—an effect that becomes more pronounced with decreasing precursor abundance. Without requiring any knowledge of the experiment design, QuantUMS can estimate and effectively eliminate such bias, at the cost of a decrease in precision. Balancing this trade-off, the QuantUMS module in DIA-NN implements two preconfigured modes termed high-precision and high-accuracy, which we benchmark in the present work. Second, controlling for quantification errors has so far been challenging in proteomics, mostly relying on precursor or protein coefficients of variation (CVs). However, CVs neither control for interference-caused systematic errors that might severely impact the accuracy of quantification while preserving precision nor reflect errors that only manifest in individual acquisitions. Therefore, ensuring confidence of observations pertaining to specific proteins necessitates laborious, potentially biased and sometimes technically impossible manual checks of extracted ion chromatograms for each acquisition and peptide of interest. With QuantUMS, we mitigate this problem by introducing a quantity-specific quality metric, which enables effective filtering for confident downstream analysis and statistical inference.
To evaluate the performance of QuantUMS, we carried out benchmarks on multiple DIA datasets covering synthetic as well as biological experiments, different instrument types and experimental scales.
First, we benchmarked QuantUMS on a mixed-species dataset recorded on a timsTOF Pro (ref. 37), where the quantitative ground truth is generated by mixing human (K562) and Escherichia coli tryptic digests in different predefined ratios. On such mixed-species datasets, one can evaluate the ability of the instrument and the data-processing software to correctly recover these ratios, examining the mean absolute log2 deviation (MAD) of the constant species (human) as a proxy for quantitative precision and the MAD of the differential species (in this case, E. coli) as a proxy for quantitative accuracy. With the accuracy of sample preparation confirmed by examining high-quality MS1 signals (Supplementary Fig. 1c), we compared QuantUMS high-precision and high-accuracy modes to the legacy DIA-NN quantification mode (Fig. 1b, Supplementary Fig. 2a and Methods). QuantUMS high-precision mode yielded the best precision (human protein MAD = 0.10) while reducing the MAD of E. coli ratios 1.7-fold (0.33 to 0.19). The high-accuracy mode of QuantUMS further reduced the MAD of E. coli ratios 2.2-fold compared to legacy mode (0.33 to 0.15), eliminating ratio compression while maintaining comparable precision. The observed improvements are robust to tuning QuantUMS hyperparameters on just a subset of samples (A and/or C; Methods), where results (Supplementary Fig. 2b) are comparable to those obtained by training on the entire dataset A + B + C (rightmost panels in Fig. 1b and Supplementary Fig. 2a). This indicates that the optimized hyperparameters of QuantUMS reflect the inherent properties of the LC–MS setup and the sample matrix but do not reflect the experiment design. Furthermore, the QuantUMS results are robust to varying the threshold that QuantUMS applies internally to select high-quality signals used as a reference for bias minimization (Supplementary Information and Supplementary Fig. 2c).
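As a small illustration of the metric used here, the mean absolute log2 deviation of observed ratios from the expected ratio can be computed as follows. The function name and the example ratios are hypothetical; the expected A:C ratios follow from the stated mixing proportions (1:1 for human, 50:20 for E. coli).

```python
from math import log2
from statistics import fmean

def mean_abs_log2_deviation(observed_ratios, expected_ratio):
    """Mean absolute deviation of log2(observed) from log2(expected)."""
    expected = log2(expected_ratio)
    return fmean([abs(log2(r) - expected) for r in observed_ratios])

# Expected A:C ratios from the mixing proportions stated in the text.
EXPECTED_HUMAN = 1.0          # 1:1, constant species (proxy for precision)
EXPECTED_ECOLI = 50.0 / 20.0  # 50:20, differential species (proxy for accuracy)

# Hypothetical observed protein ratios, for illustration only.
human_mad = mean_abs_log2_deviation([0.95, 1.02, 1.10], EXPECTED_HUMAN)
ecoli_mad = mean_abs_log2_deviation([2.1, 2.4, 2.6], EXPECTED_ECOLI)
```

Ratio compression shows up in this metric as E. coli ratios that sit systematically below the expected value, which inflates the E. coli MAD even when the human MAD is small.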
Filtering the dataset on the basis of acquisition-specific precursor-level and protein-level quantity and quality metrics introduced in QuantUMS (Fig. 1c and Supplementary Fig. 1b,d) shows that, as intended, only accurately quantified precursors and proteins are retained, with 0.75 quality quantile protein-level filtering resulting in almost perfect protein ratios (Fig. 1c, right). On mixed-species datasets acquired using other LC–MS platforms, involving TripleTOF 6600 (ref. 48) (Supplementary Fig. 3) or Orbitrap Astral (ref. 49) (Supplementary Fig. 4) MS instruments, QuantUMS likewise shows improvements in quantitative performance.
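Quantile-based quality filtering of this kind might look like the sketch below. It assumes that filtering at quantile q retains the entries whose quality metric is at or above the q-quantile of that metric; this reading, the record layout and the function names are assumptions for illustration, and the exact procedure is given in the Methods.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, 0 <= q <= 1."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def filter_by_quality(records, q=0.75, key="quality"):
    """Keep records whose quality metric reaches the q-quantile threshold."""
    threshold = quantile([r[key] for r in records], q)
    return [r for r in records if r[key] >= threshold]

# Hypothetical protein-level records with a per-acquisition quality metric.
proteins = [
    {"id": "P1", "quality": 0.1},
    {"id": "P2", "quality": 0.2},
    {"id": "P3", "quality": 0.3},
    {"id": "P4", "quality": 0.4},
    {"id": "P5", "quality": 0.5},
]
kept = filter_by_quality(proteins, q=0.75)
```

A stricter quantile trades the number of retained quantities for confidence in each of them, so the threshold is a choice to be made per analysis.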
To evaluate the robustness of QuantUMS with respect to variations in experiment size and inclusion of vastly different loading amounts, we used our recently recorded mixed-species dataset50, measured with DIA parallel accumulation–serial fragmentation (dia-PASEF) at amounts spanning a tenfold range. Compared to the legacy mode, QuantUMS high-accuracy mode alleviated or even eliminated ratio compression and substantially improved overall accuracy across all considered loading amounts, despite minor ratio expansion becoming apparent for highly abundant proteins at low loads (Supplementary Fig. 5a). These observed accuracy gains remained consistent, regardless of whether QuantUMS hyperparameters were optimized on the entire experiment (144 runs; Supplementary Fig. 5a) or just subsets thereof (8–96 runs; Supplementary Fig. 5b).
Next, we investigated what effect the enhanced performance of QuantUMS has on differential expression analyses as a common downstream application. On two out of three considered mixed-species datasets, QuantUMS yielded higher numbers of differentially expressed proteins (Welch’s t-test) than legacy mode at the same empirical FDR (Supplementary Fig. 6, left). We then speculated that the ability of QuantUMS to alleviate ratio compression and, thus, improve the linearity of the quantitative response may prove important in experiments where large normalization factors need to be applied, such as applications where sample loading amounts are not equalized and applications with inherently diverse samples, including single-cell or spatial proteomics. In a benchmark simulating this situation, we did indeed observe that both QuantUMS modes strongly reduced the false-positive numbers compared to the legacy mode after applying strong normalization, at the same adjusted P values (Supplementary Fig. 6, middle), with greater numbers of proteins correctly detected as differentially expressed at a given empirical FDR threshold (Supplementary Fig. 6, right). Therefore, we hypothesize that the enhanced accuracy (linearity) of QuantUMS may reduce false positives in calling differentially expressed proteins, although whether this conclusion also holds for actual experiments of interest, as opposed to the highly artificial benchmark used here, remains to be investigated with an appropriate experiment design.
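On a mixed-species benchmark the ground truth is known, so an empirical FDR can be computed directly from the species labels. The sketch below assumes that proteins of the constant species (human) called significant count as false positives and proteins of the differential species (E. coli) count as true positives; it is an illustrative reading of the benchmark, with a hypothetical function name, and ties between P values are ignored for simplicity.

```python
def hits_at_empirical_fdr(results, max_fdr=0.01):
    """Largest number of true positives reachable at empirical FDR <= max_fdr.

    results: iterable of (p_value, is_true_change) pairs, where
             is_true_change is True for the differential species.
    """
    ranked = sorted(results, key=lambda item: item[0])
    true_pos = false_pos = 0
    best_true_pos = 0
    for _, is_true_change in ranked:
        if is_true_change:
            true_pos += 1
        else:
            false_pos += 1
        # Empirical FDR if the P-value cutoff is placed right after this protein.
        if false_pos / (true_pos + false_pos) <= max_fdr:
            best_true_pos = true_pos
    return best_true_pos

# Hypothetical P values with ground-truth labels, for illustration only.
toy_results = [
    (0.001, True), (0.002, True), (0.003, False), (0.004, True), (0.005, True),
]
n_hits = hits_at_empirical_fdr(toy_results, max_fdr=0.25)
```

Comparing quantification modes at a matched empirical FDR, rather than at a matched nominal P-value cutoff, keeps the comparison fair when the modes differ in how well calibrated their P values are.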
To evaluate the performance of QuantUMS in the presence of biological and sample preparation variation, we carried out differential expression analysis on a human fibroblast perturbation dataset51. QuantUMS more than doubled the numbers of differentially expressed proteins at both 1% and 5% FDR compared to legacy quantification and more than tripled them compared to DIA-NN 1.8, which was used in the original publication (Fig. 2a). Filtering the protein lists on the basis of the averaged across-runs protein quantity quality metric, to reduce the multiple testing burden, further improved the numbers of significant proteins at a given FDR (Fig. 2a, bottom). Here, filtering by the quality metric marginally outperformed the ‘naive’ approach of filtering on the basis of the average estimated protein quantity (top 1 method). Testing differential expression on a dataset of a cohort of persons with chronic lymphocytic leukemia (CLL)52 against the supplied characteristics showed that QuantUMS increased the numbers of proteins identified as differentially expressed compared to the legacy mode in all tests (Fig. 2b). By design, QuantUMS reduces variation originating from LC–MS measurement errors. The benefit of QuantUMS in a given experiment, therefore, depends on the relative contribution of LC–MS errors compared to the variation of biological or sample preparation origin. Consistent with this, the greater advantage of QuantUMS in the fibroblast dataset likely relates to the lower biological variability of cultured cells compared to the CLL dataset, where individual samples can be expected to exhibit greater heterogeneity, including proteoform-level variation, which QuantUMS is not designed to address.
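The link between filtering and the multiple testing burden can be made concrete with the Benjamini–Hochberg adjustment. The sketch below is a plain-Python version of the standard procedure; it does not reproduce limma, and the function name and P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values: the same two small P values, tested alone
# or together with two additional, uninformative tests.
small_set = benjamini_hochberg([0.01, 0.02])
large_set = benjamini_hochberg([0.01, 0.02, 0.6, 0.9])
```

Because each raw P value is scaled by the number of tests divided by its rank, removing proteins that are unlikely to be quantified reliably before testing can lower the adjusted P values of those that remain.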

a, Comparison between legacy and QuantUMS modes, as well as the previous-generation DIA-NN 1.8, on a dataset of 20 human dermal fibroblast samples recorded using a 69-min nanoLC gradient on timsTOF Pro (ref. 51). Numbers of differentially expressed proteins (limma; Methods) when comparing ObHEx-treated and non-treated senescent fibroblasts (five biological replicates each) are shown. Bottom, the effect of filtering the protein lists using the QuantUMS protein quantity quality metric averaged across acquisitions compared to filtering on the basis of the estimated (top 1 method) protein amounts averaged across acquisitions. BH, Benjamini–Hochberg. b, Comparison between legacy and QuantUMS modes on a dataset (ref. 52) of 50 CLL samples recorded using a 100-min nanoLC gradient on timsTOF Pro. Numbers of differentially expressed proteins when testing against four phenotypic characteristics (limma; Methods) are shown.
With QuantUMS, we address a long-standing problem of untargeted proteomics, that is, the lack of quality control for peptide and protein quantities obtained in an experiment. We show that taking into account the quality information available for individual signals recorded by the MS instrument makes it possible not only to improve quantitative performance per se but also to produce effective quality metrics that ensure confidence in the data and further empower the subsequent statistical analyses. So far, we have benchmarked QuantUMS on DIA proteomics data, using DIA-NN’s quality scores, but the algorithm is open to incorporating novel quality scores and other acquisition approaches that likewise generate multiple channels of quantitative information, including selected and parallel reaction monitoring, as well as other experiments that involve recording multiplexed MS/MS spectra. We further envision great potential for future improvements in quantitative proteomics to be achieved by integrating QuantUMS with downstream statistical analysis approaches, such as MSStats43 or Triqler44, to enable biological inference that is fully aware of all kinds of uncertainty, missingness and normalization issues in raw proteomics data.