Abstract
Building and refining pharmacology models require “system” data derived from tissues and in vitro systems analyzed by quantitative proteomics. Label-free global proteomics offers a wide scope of analysis, allowing simultaneous quantification of thousands of proteins per sample. The data generated from such analysis offer comprehensive protein expression profiles that can address existing gaps in models. In this study, we assessed the performance of three widely used label-free proteomic methods, “high N” ion intensity approach (HiN), intensity-based absolute quantification (iBAQ) and total protein approach (TPA), in relation to the quantification of enzymes and transporters in 27 human liver microsomal samples. Global correlations between the three methods were highly significant (R2 > 0.70, P < 0.001, n = 2232 proteins). Absolute abundances of 57 pharmacokinetic targets measured by standard-based label-free methods (HiN and iBAQ) showed good agreement, whereas the TPA overestimated abundances by two- to threefold. Relative abundance distribution of enzymes was similar for the three methods, while differences were observed with TPA in the case of transporters. Variability (CV) was similar across methods, with consistent between-sample relative quantification. The back-calculated amount of protein in the samples based on each method was compared with the nominal protein amount analyzed in the proteomic workflow, revealing overall agreement with data from the HiN method with bovine serum albumin as standard. The findings herein present a critique of label-free proteomic data relevant to pharmacokinetics and evaluate the possibility of retrospective analysis of historic datasets.
SIGNIFICANCE STATEMENT This study provides useful insights for using label-free methods to generate abundance data applicable for populating pharmacokinetic models. The data demonstrated overall correlation between intensity-based label-free proteomic methods (HiN, iBAQ and TPA), whereas iBAQ and TPA overestimated the total amount of protein in the samples. The extent of overestimation can provide a means of normalization to support absolute quantification. Importantly, between-sample relative quantification was consistent (similar variability) across methods.
Introduction
Quantitative proteomics has become a standard method in molecular biology, with the central aim of measuring expression profiles at the protein level. Because of its broad scope of analysis (Wang et al., 2019), label-free (global) proteomics allows measurement of a wide range of proteins that govern drug pharmacokinetics (principally drug-metabolizing enzymes and drug transporters) and their changes in response to pathologic and environmental factors (El-Khateeb et al., 2019; Prasad et al., 2019). The methodology does not depend on the availability of isotopically labeled standards, which are expensive (Al Feteisi et al., 2015) and, when used to quantify low-abundance proteins, require additional care in data analysis (Achour et al., 2018). Mass spectrometry is not, however, an inherently quantitative technique; the relationship between the concentration of the analyte and the intensity of the corresponding signal is complex (Couto et al., 2011), and all label-free methods require assumptions that may not be fully justified (Arike et al., 2012).
Label-free measurement relies on signal intensity either of all native peptides [e.g., the total protein approach (TPA) (Wiśniewski and Rakus, 2014) and intensity-based absolute quantification (iBAQ) (Schwanhäusser et al., 2011)] or a set of unique/razor peptides [e.g., high N ion intensity approach (HiN) (Silva et al., 2006)] assigned to a certain protein. Alternatively, relative quantification can be achieved based on spectral counts [e.g., the exponentially modified protein abundance index (emPAI) (Ishihama et al., 2005)], as a semi-quantitative approach to derive an estimate of protein expression.
During the past 5 years, the pharmacology community has produced approximately 20 publications quantifying human tissue proteomes by global proteomics, and this number is set to increase in the next few years. There are ongoing debates about whether samples should be fractionated and about sample preparation methods (Prasad et al., 2019), although filter-aided sample preparation (FASP) methodology (Wiśniewski et al., 2009) is widely adopted when sample is plentiful. More surprisingly, there is no consensus about methods for analysis of the (typically gigabyte) RAW files obtained from mass spectrometry experiments. Different software packages for data analysis are available that make data processing more streamlined, but these do not always produce completely consistent results (Välikangas et al., 2018). Processing can be done using different reference datasets and following different assumptions, and, above all, using different quantification methodologies (El-Khateeb et al., 2021). These factors are particularly important when modeling the impact of different covariates, such as disease, is the focus of investigation, and therefore validating such data plays a major role in increasing trust in the outcome of predictive pharmacology models.
For the work described here, we used a well-characterized set of 27 human liver microsomal (HLM) samples. As previously described, sample preparation was done by standard FASP methodology and mass spectrometry was carried out on a Q Exactive HF Orbitrap instrument (Couto et al., 2019). The RAW files generated have been uploaded to the Proteomics Identification (PRIDE) database and are freely available (Al-Majdoub et al., 2020). The focus of this article is to evaluate the quality of label-free abundance data. In particular, we aimed to use these samples as controls for an investigation of experimentally demanding pediatric samples, analyzed with no standards, and we therefore required that standard-free methods should be as reliable as possible.
Materials and Methods
The proteomic dataset has been deposited to the Proteomics Identification (PRIDE) repository under the identifier PXD020910.
Samples and Proteomic Methodology
The preparation of HLM samples (Supplemental Table 1) and analysis by mass spectrometry are fully described elsewhere (Couto et al., 2019). Briefly, liver membrane fractions were prepared using differential centrifugation, first at low speed (10,000g) to separate cellular debris from the post-mitochondrial fraction, followed by high-speed centrifugation (100,000g) to isolate microsomes. Protein content was measured using the Bradford protein assay and sample preparation of 100 µg of each sample (n = 27) followed the FASP protocol with multienzyme digestion (lysyl endopeptidase and trypsin). Three exogenous protein standards were spiked in the samples: bovine serum albumin (BSA, 0.2 µg), bovine cytochrome c (0.15 µg) and equine myoglobin (0.3 µg). Peptides (1 µg) were analyzed by liquid chromatography-tandem mass spectrometry using an UltiMate 3000 rapid separation liquid chromatography system (Dionex, Surrey, UK) coupled to a Q Exactive HF Hybrid Quadrupole-Orbitrap mass spectrometer (ThermoFisher Scientific, Bremen, Germany).
RAW files were processed using Progenesis version 4.2 as a single batch, and the resulting mgf files were processed using Mascot version 2.7 for protein identification. The human database used was the UniProt reference proteome UP000005640, containing 77,027 protein entries. Files were processed four times with the following settings: enzyme trypsin/P, MS tolerance 5 ppm, MS/MS tolerance 0.02 Da. In the initial run, up to one missed cleavage was allowed, carbamidomethyl (cysteine) was set as a fixed modification, and oxidation (methionine) was the only variable modification. In subsequent runs, a second missed cleavage, deamidation (of asparagine and glutamine) and phosphorylation (of serine, threonine and tyrosine) were (separately) permitted. Results were compiled using Progenesis and exported as csv files.
Data Analysis and Label-Free Quantification
Peptides were assigned to proteins based on a bespoke razor as described previously (Al-Majdoub et al., 2020) using Microsoft Excel 365. Assignment prioritized full-length characterized sequences over truncated, uncharacterized, and cDNA sequences. A best-fit analysis was then run to minimize the number of proteins assigned to account for all of the peptides. Deamidated peptides with no corresponding native assignment and those that did not match any protein were deleted. The peptide MS intensities attributed to more than one protein were divided among those proteins based on the ratio of unique or razor (peptides with a single assignment within the current dataset) peptide intensities for each protein, as detailed previously (Al-Majdoub et al., 2020).
Three potential standards (BSA, bovine cytochrome c, and equine myoglobin) were assessed. The equations used in quantification by the HiN, iBAQ, and TPA methods are detailed below (eqs. 1–3). Rearrangement of these equations provides the means to compare the total amount of protein estimated from the total intensity with the total amount of protein analyzed. These comparisons act as sanity checks on data analysis.
High N (HiN) ion intensity method:
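Eq. 1 can be written out from the definitions that follow. This is a reconstruction; the symbol I for a peptide ion intensity and the subscript labels are assumed notation:

```latex
[\mathrm{Protein}] = [\mathrm{standard}] \times
\frac{\dfrac{1}{n}\sum_{i=1}^{n} I_{i,\mathrm{protein}}}
     {\dfrac{1}{m}\sum_{i=1}^{m} I_{i,\mathrm{standard}}}
\tag{1}
```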
Where [Protein] is the abundance of a target protein, [standard] is the abundance of the standard protein, both expressed in units of pmol mg−1 total protein, and the fraction refers to the ratio of the average intensity of the n highest ion peaks of the target protein to the average intensity of the m highest ion peaks of the standard (in this case, n = m = 2 or 3). Peptides used for quantification are unique to the target proteins; other selection criteria were as described by Achour et al. (2018).
Intensity-based absolute quantification (iBAQ):
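Eq. 2 can be written out from the definitions that follow. This is a reconstruction; the intensity symbol I is assumed notation, while i, j, k, and T are as defined in the text:

```latex
[\mathrm{Protein}_{j}] = [\mathrm{standard}_{k}] \times
\frac{\sum_{i} I_{i,j} \,/\, T_{j}}
     {\sum_{i} I_{i,k} \,/\, T_{k}}
\tag{2}
```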
Where the summed intensity of all peptides i from the protein of interest j or the standard k is normalized to T, the number of theoretically observable peptides from a tryptic digest of protein j or standard k.
The total protein approach (TPA):
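Eq. 3 can be written out from the definitions that follow. This is a reconstruction; the intensity symbol I and the mass symbol M_j (molecular mass of protein j in Da) are assumed notation. The factor of 10^9 expresses the intensity fraction in parts per billion, so that dividing by the mass in Da gives pmol mg−1:

```latex
[\mathrm{Protein}_{j}] =
\frac{\sum_{i} I_{i,j}}{\sum_{\mathrm{all\ proteins}} \sum_{i} I_{i}}
\times \frac{10^{9}}{M_{j}}
\tag{3}
```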
Where the ratio of the sum of intensity of all peptides i derived from a protein j of interest to the sum of intensity of all peptides (from all proteins) in a sample (expressed in parts per billion) is converted to an abundance value (pmol mg−1) by normalizing to the molecular mass of the protein in Daltons.
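The three calculations can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used in this study (which relied on Progenesis, Mascot, and Microsoft Excel); the function names and the toy intensities are hypothetical.

```python
from typing import Sequence


def hin(target: Sequence[float], standard: Sequence[float],
        standard_abundance: float, n: int = 3) -> float:
    """HiN: ratio of the mean of the n most intense peptide signals
    of the target to that of the standard (here n = m)."""
    if len(target) < n or len(standard) < n:
        raise ValueError("HiN needs at least n peptides for target and standard")
    top_target = sorted(target, reverse=True)[:n]
    top_standard = sorted(standard, reverse=True)[:n]
    return standard_abundance * (sum(top_target) / n) / (sum(top_standard) / n)


def ibaq(target: Sequence[float], t_target: int,
         standard: Sequence[float], t_standard: int,
         standard_abundance: float) -> float:
    """iBAQ: summed intensity divided by the number of theoretically
    observable peptides, scaled to the standard."""
    return standard_abundance * (sum(target) / t_target) / (sum(standard) / t_standard)


def tpa(target: Sequence[float], total_intensity: float,
        molecular_mass_da: float) -> float:
    """TPA: intensity fraction in parts per billion divided by the
    molecular mass in Da, giving pmol per mg total protein."""
    return sum(target) / total_intensity * 1e9 / molecular_mass_da


if __name__ == "__main__":
    # Hypothetical intensities; 28.86 pmol/mg is the BSA level used in this study.
    print(hin([4e6, 2e6, 1e6, 3e6], [2e6, 2e6, 2e6], 28.86))
    print(ibaq([4e6, 2e6], 10, [6e6], 20, 10.0))
    print(tpa([5e5, 5e5], 1e9, 50_000))
```

Note that `hin` raises an error for proteins with fewer than n peptides, which mirrors the limitation that HiN cannot quantify proteins represented by a single peptide, whereas `ibaq` and `tpa` accept any number of peptides.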
Statistical Data Analysis
Data were expressed as mean and standard deviation, and variability was assessed as CV and fold difference (maximum-to-minimum ratio). Abundance and activity correlations were tested by linear regression (R2). Relationships between abundance data and either age or body mass index (BMI) were tested using Pearson correlation (r) to show the direction of trends. Differences between abundance data generated using label-free and targeted methods and between male and female donors were assessed using a t test. Differences across genotypes were assessed using either one-way ANOVA or a t test. A probability cutoff of 0.05 was set for statistical significance.
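The descriptive statistics named above can be illustrated with the Python standard library. This is a sketch only: the use of the sample standard deviation for CV is an assumption, and the function names are hypothetical. For a simple linear regression with an intercept, R2 equals the square of the Pearson coefficient, which is what `r_squared` relies on.

```python
import math
import statistics
from typing import Sequence


def percent_cv(values: Sequence[float]) -> float:
    """CV = SD x 100 / mean, using the sample standard deviation (assumed)."""
    return statistics.stdev(values) * 100 / statistics.mean(values)


def fold_difference(values: Sequence[float]) -> float:
    """Maximum-to-minimum ratio across samples."""
    return max(values) / min(values)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; the sign shows the direction of the trend."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination for simple linear regression of y on x."""
    return pearson_r(x, y) ** 2
```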
Results
In this study, HLM samples were analyzed using global proteomic methods. Percent identical peptides reflected high integrity of analyses (86–99%) across replicates and across samples (Supplemental Tables 2 and 3). For the purpose of quantification, we chose to assess three MS intensity-based label-free methods on the basis that they provide more robust protein measurements than spectral counts (Arike et al., 2012). Two methods (HiN and iBAQ) rely on exogenous protein standards at known concentrations, whereas the TPA is applied without the use of a standard. The methods allowed quantification of 2232 proteins, and data describing expression of 23 cytochrome P450 enzymes, 11 glucuronosyltransferases (UGT), 17 ABC transporters, and 6 solute carriers are shown in Supplemental Tables 4-6.
Choice of Standard
For the iBAQ and HiN methods, three potential standards were included in the samples. The total amount of protein used in each experiment was 100 µg, and the amounts of standards were also known: BSA 0.2 µg (28.86 pmol mg−1 total protein), myoglobin 0.3 µg (175.61 pmol mg−1 total protein) and cytochrome c 0.15 µg (32.04 pmol mg−1 total protein). BSA was expected to give the best results because its molecular mass (69 kDa) is close to the average molecular mass of the detected proteins (60 kDa), and it yields a high number of unique peptides. Cytochrome c gave rise to a limited number of unique peptides and was therefore discarded. The HiN method was used to calculate the total analyte protein using BSA and myoglobin (Table 1 and Fig. 1A), with results of 53% for BSA and 207% for myoglobin compared with the nominal amount of protein analyzed (assuming an average molecular mass of 60 kDa for native proteins). In the HiN method, proteins represented by a single peptide as well as those falling below the limit of quantification are ignored, making 53% a reasonable value and 207% a substantial overestimate. Further calculations were therefore carried out using BSA as a standard.
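The unit conversions in this section can be checked with a short calculation. The sketch below assumes molecular masses of about 69.3 kDa for BSA and 17.1 kDa for equine myoglobin, which reproduce the stated molar levels; these masses, the function names, and the summed abundance in the last line are illustrative assumptions. The back-calculation of total protein multiplies the summed molar abundance by the 60 kDa average mass quoted above, which is one plausible reading of how the percentages of nominal were derived.

```python
from typing import Iterable


def spiked_abundance_pmol_per_mg(standard_ug: float, total_protein_ug: float,
                                 molecular_mass_da: float) -> float:
    """Convert a spiked mass of standard into pmol per mg total protein."""
    ug_per_mg = standard_ug / total_protein_ug * 1000.0
    # ug divided by g/mol gives umol; multiply by 1e6 to obtain pmol
    return ug_per_mg / molecular_mass_da * 1e6


def percent_of_nominal(abundances_pmol_per_mg: Iterable[float],
                       mean_mass_da: float = 60_000.0) -> float:
    """Back-calculate total protein as a percentage of the nominal 1000 ug/mg."""
    total_pmol = sum(abundances_pmol_per_mg)
    # pmol multiplied by g/mol gives pg; divide by 1e6 to obtain ug
    ug_per_mg = total_pmol * mean_mass_da / 1e6
    return ug_per_mg / 1000.0 * 100.0


if __name__ == "__main__":
    print(spiked_abundance_pmol_per_mg(0.2, 100.0, 69_293.0))  # BSA, assumed mass
    print(spiked_abundance_pmol_per_mg(0.3, 100.0, 17_083.0))  # myoglobin, assumed mass
    print(percent_of_nominal([10_000.0]))  # hypothetical summed abundance
```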
Fig. 1. Comparison of liver proteome measurements using three label-free methods. (A) Percentage of the measured total protein content relative to the nominal content (dashed line) analyzed by mass spectrometry. The amount was determined using the TPA, HiN [based on either myoglobin (MYG) or BSA as standards] and iBAQ (based on BSA). (B) Head-to-head comparison of average concentrations (of 2232 proteins) in 27 samples quantified by iBAQ (BSA) and TPA compared with HiN (BSA). (C) Correlation between mean concentrations of 57 key pharmacokinetic targets measured by TPA or iBAQ and mean abundances measured by HiN. The data show that TPA overestimates protein amounts compared with the HiN method, whereas HiN and iBAQ methods produce comparable results in most cases. The data also indicate that it is possible to estimate absolute abundances by iBAQ or TPA using conversion factors. Abundance is expressed in units of pmol mg−1 total protein.
Table 1. The total amount of protein in analyte estimated by the label-free quantification methods, averaged (± SD) over 27 samples.
Comparison Between Label-Free Quantification Methods
Although iBAQ and TPA clearly overestimate the total amount of protein (Fig. 1A), both enable estimation of the abundance of proteins that give rise to a single detectable peptide, whereas the HiN method does not. Our previous work (El-Khateeb et al., 2021) indicated that all three methods perform quite well in assessing relative change from healthy baseline. We therefore investigated the correlation between absolute quantification values obtained using the three methods. An overall picture of correlations between measurements (of 2232 proteins) using the different methods is presented in Fig. 1B. The overall correlation between the three methods is strong (R2 > 0.70, P < 0.001). The TPA overestimates abundance by two- to threefold relative to HiN, whereas the iBAQ and HiN measurements are broadly comparable. Correlations between mean abundances of enzymes and transporters (n = 57) are presented in Fig. 1C. Individual abundance data for these targets are presented in Supplemental Tables 4–6. Specific cases related to a number of proteins of interest to drug pharmacokinetics are shown in Fig. 2. Figure 2 shows that in all cases a straight line can be drawn connecting iBAQ and HiN quantifications (R2 = 0.60–0.97); the same holds for TPA and HiN, but generally with considerably more scatter (R2 = 0.25–0.97). Generally, the TPA gives the highest estimate of the concentration of a protein; for lower abundance proteins (e.g., low abundance transporters), however, the TPA gives the lowest estimate. Relative abundances are presented in Supplemental Fig. 1 for cytochrome P450 enzymes, UGTs, and ABC transporters, reflecting overall agreement, except in the case of relative abundance of ABC transporters determined using the TPA method.
Fig. 2. Correlation between protein concentrations of key pharmacokinetic targets measured by iBAQ (blue) or TPA (orange) relative to HiN method in 27 liver samples. The data show examples of drug-metabolizing enzymes and transporters. BSA was used as a standard for HiN and iBAQ, and abundance was measured in units of pmol mg−1 total protein.
Correlation of Label-Free Data with Functional Activity
Functional activity data were available for several cytochrome P450 and UGT enzymes (Achour et al., 2014, 2017). Correlations between abundance and activity of cytochrome P450 3A4, 2D6, 1A2, 2B6 and 2C19 were moderate to strong across the three methods (R2 = 0.56–0.88 with HiN, R2 = 0.57–0.91 with iBAQ, and R2 = 0.63–0.88 with TPA; Fig. 3A). Correlation with CYP2C9 activity was the exception, with different degrees of correlation across methods (R2 = 0.23 with HiN, R2 = 0.43 with iBAQ, and R2 = 0.71 with TPA). Similarly, weak correlation was previously reported for CYP2C9 activity against diclofenac with targeted data in the set of samples (Achour et al., 2014).
Fig. 3. Correlation of protein concentrations of (A) cytochrome P450 and (B) UGT enzymes measured by HiN (red), iBAQ (blue), and TPA (orange) against functional activity in 27 liver samples. Activity was measured with metabolite formation assays against the substrates: phenacetin (CYP1A2), mephenytoin (CYP2B6), diclofenac (CYP2C9), mephenytoin (CYP2C19), bufuralol (CYP2D6), testosterone (CYP3A), β-estradiol (UGT1A1), chenodeoxycholic acid (UGT1A3), 5-hydroxytryptophol (UGT1A6), propofol (UGT1A9), zidovudine (UGT2B7) and S-oxazepam (UGT2B15). Abundance was measured in units of pmol mg−1 total protein; catalytic activity was measured in units of nmol metabolite min−1 mg−1 total protein.
Correlations between abundance and activity of UGTs 1A1, 1A3, 1A6, and 1A9 were moderate to strong (R2 = 0.34–0.69 with HiN, R2 = 0.42–0.85 with iBAQ, and R2 = 0.52–0.77 with TPA; Fig. 3B). The exception was UGT2B7, with different degrees of correlation across methods (R2 = 0.24 with HiN, R2 = 0.44 with iBAQ, and R2 = 0.59 with TPA), whereas correlations of UGT2B15 were generally weaker (R2 = 0.15, 0.23, and 0.39 with HiN, iBAQ, and TPA, respectively). Moderate correlations were previously reported for UGTs 2B7 and 2B15 activity with targeted data for the set of samples (Achour et al., 2017).
Comparison of Label-Free Quantification with Targeted Data
Label-free measurements were compared with previously reported targeted data for cytochrome P450 and UGT enzymes in the same set of samples (Achour et al., 2014, 2017). For cytochrome P450 enzymes, overall agreement was observed with HiN and iBAQ data (Fig. 4A), reflecting 74% of measurements within twofold of targeted data (Fig. 4C). TPA, however, tended to overestimate measurements (only 53% of the data were within twofold). For UGT enzymes, TPA measurements were closer to targeted data (93% of measurements were within twofold), whereas iBAQ and HiN tended to underestimate, with 77% and 60% of measurements within twofold, respectively.
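The "within twofold" proportion reported above is the share of label-free-to-targeted ratios that fall between 0.5 and 2. A minimal sketch follows; the function name and the example values are hypothetical.

```python
from typing import Sequence


def percent_within_fold(label_free: Sequence[float], targeted: Sequence[float],
                        fold: float = 2.0) -> float:
    """Percentage of paired measurements whose ratio lies within the fold range."""
    if len(label_free) != len(targeted) or not label_free:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [lf / t for lf, t in zip(label_free, targeted)]
    within = sum(1 for ratio in ratios if 1.0 / fold <= ratio <= fold)
    return 100.0 * within / len(ratios)


if __name__ == "__main__":
    # Ratios are 1.0, 3.0, 0.4 and 1.25, so two of four fall within twofold.
    print(percent_within_fold([10.0, 30.0, 4.0, 50.0], [10.0, 10.0, 10.0, 40.0]))
```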
Fig. 4. Comparison of abundances of (A) cytochrome P450 and (B) UGT enzymes measured using label-free methods (HiN, iBAQ, and TPA) against targeted data. Ratios of label-free measurements relative to targeted data for (C) cytochrome P450 and (D) UGT enzymes. In (A) and (B), the whiskers represent the minimum-to-maximum range, the boxes represent the 25th and 75th percentiles, the lines represent the medians, and + signs represent the means. Comparisons based on a t test against targeted data are shown in black and against HiN measurements are shown in red. * P < 0.05, ** P < 0.01, and *** P < 0.001. In (C) and (D), the dashed lines denote the twofold range, and the percentages are the proportion of label-free measurements within twofold of targeted data. Abundance was measured in units of pmol mg−1 total protein.
Assessment of Variability with Label-Free Methods
Fold difference and CV, which describe between-sample variability, were very similar across the three methods for all measured proteins (Fig. 5), indicating robustness of relative quantification regardless of methodology. The calculated CV combines technical and biologic variability. Technical variability assessed using a pool of the same set of samples returned values <30% for the targets across all methods. The calculated variability related to technical error, expressed as fold difference between the 5th and 95th percentiles, was therefore within fourfold, whereas total variability reflected up to 50-fold differences in abundance across the three methods.
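The link between a technical CV below 30% and a 5th-to-95th percentile range within fourfold can be illustrated numerically. The distribution of technical error is not stated here, so the sketch below is an assumption-laden check rather than the calculation used in the study: it evaluates both a lognormal and a normal error model, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def fold_5th_to_95th(cv_percent: float, lognormal: bool = True) -> float:
    """Ratio of the 95th to the 5th percentile for a given percent CV."""
    z = NormalDist().inv_cdf(0.95)  # about 1.645
    cv = cv_percent / 100.0
    if lognormal:
        # For a lognormal variable, sigma of the log values follows from the CV
        sigma = math.sqrt(math.log(1.0 + cv ** 2))
        return math.exp(2.0 * z * sigma)
    if z * cv >= 1.0:
        raise ValueError("5th percentile is not positive under a normal model")
    return (1.0 + z * cv) / (1.0 - z * cv)


if __name__ == "__main__":
    print(fold_5th_to_95th(30.0))                   # lognormal error model
    print(fold_5th_to_95th(30.0, lognormal=False))  # normal error model
```

Under either assumed model, a CV of 30% corresponds to a percentile ratio between about 2.6 and 3, which is consistent with the fourfold bound stated above.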
Fig. 5. Comparison of between-sample variability across 27 liver samples in the abundance of enzymes and transporters measured by HiN, iBAQ, and TPA, expressed as (A) fold difference (maximum-to-minimum ratio) and (B) percent CV (CV = SD × 100/mean). BSA was used as a standard for HiN and iBAQ.
Covariates of Protein Expression Assessed by Label-Free Methods
Donors’ demographic and clinical information is summarized in Supplemental Table 1. In addition, CYP2B6, CYP2C9, CYP2C19, CYP2D6, and CYP3A5 genotype data were available for 25 (out of 27) samples. Abundance data were assessed against sex, age, and BMI (Supplemental Table 7). The number of confirmed smokers and alcohol users at the time of donation was small (four smokers and three drinkers), and therefore, the effect of these two factors was not probed. No differences in abundance between samples from male (n = 15) and female (n = 12) donors were observed with any of the three label-free methods (t test, P > 0.05). Weak, negative correlations with age were revealed for CYP2C18, UGT2B4, UGT2B10, and ABCA1 with borderline significance across the label-free methods (r = −0.42 to −0.39, P = 0.03 to 0.05). The effect of BMI was moderate in the cases of UGT1A3, UGT1A4, and OATP1B3, with lower abundance in overweight and obese donors (r = −0.59 to −0.42, P = 0.001 to 0.03). The effect of genotype was significant in the cases of CYP2D6 (ANOVA, P < 0.05), CYP2C19 (t test, P = 0.01) and CYP3A5 (t test, P < 0.05).
Discussion
This study aimed to assess measurements of hepatic enzymes and transporters by widely used label-free proteomic methods (HiN, iBAQ and TPA). We have previously outlined the use of the Disease Perturbation Factor (DPF) (El-Khateeb et al., 2021), which is essentially a ratio connecting the amount of any given protein in a diseased tissue with the amount of the same protein in healthy control tissue. The DPF has been shown to be independent of the quantification methodology (targeted versus global proteomics, HiN versus iBAQ versus TPA). Furthermore, differences in absolute abundance are explored herein, and a key piece of information—the total analyte protein—which is usually discarded in proteomic data analysis, now allows us to adjudicate between the different methods of label-free quantification and even to estimate conversion factors from one method to another.
Absolute abundance correlated across the three methods, with targeted data and with functional activity. Correlation with targeted proteomics was also previously reported (Vildhede et al., 2018; Wiśniewski et al., 2019; El-Khateeb et al., 2021). Although the HiN method (Silva et al., 2006) generally produces data that seem biologically sensible, especially when BSA is used as a standard for human samples, it has two drawbacks. First, the N in HiN is generally taken to mean two, three, or more, but not one. Thus, we are denied even an estimate of the abundances of proteins represented by a single peptide. Second, a high-quality standard is required, one that should have similar properties to the proteins under study. The choice of a suitable standard in this study was, however, empirical as highlighted by the discrepancy between abundances against BSA and myoglobin, with myoglobin clearly overestimating the total amount of protein. Prospectively, it is, of course, possible to include a protein standard at appropriate concentration in new samples; it is not, however, possible to do this retrospectively. Proteomic work in the drug metabolism and disposition arena involves the use of precious, small human samples. It is imperative, both scientifically and ethically, to derive maximum information from each sample, which means that historical samples, prepared by suboptimal protocols, are still of value (Prasad et al., 2019).
The TPA has proved an excellent method for dealing with such samples. No standards are necessary, and the approach provides broad coverage by allowing the quantification (albeit with low accuracy) of proteins represented by a single peptide. Nonunique peptides can be accommodated within the analysis. We have introduced small modifications to the data analysis so that these nonunique peptides are not overrepresented (Al-Majdoub et al., 2020), but the method is still inclined to overestimate protein concentration relative to other label-free methods. This is not surprising. The normalization in a TPA experiment is based on the total signal intensity, but we know that some signal (that due to proteins falling below the limit of quantification) is not measured. A ‘proteomic ruler’ incorporating MS signal of cellular histones was introduced to make TPA measurements more biologically sensible (Wiśniewski et al., 2014). The DPF (El-Khateeb et al., 2021), being a relative factor, allows for the use of the TPA without too much concern for the systematic overestimate.
Both the iBAQ and TPA methods overestimate the total amount of protein in the sample, and the extent of this overestimation can provide a means of normalization should absolute quantification be required. Robust quantification is biased toward higher abundance proteins, and therefore such a normalization approach may only work with enzymes and highly abundant transporters. Relative quantification, both between sample and within sample, is often more pertinent than absolute estimates. Similar variability (CV) across individual measurements recovered by all methods indicates that relative quantification is robust, regardless of the quantification approach, provided consistent proteomic workflows are used (El-Khateeb et al., 2021; Neuhoff et al., 2021). Relative data are particularly useful for assigning stoichiometry in protein expression (Fabre et al., 2014). The current set reveals a particular example, the stoichiometry of TAP1 (ABCB2) and TAP2 (ABCB3). The correlation between abundances generated by the methods for these two proteins was excellent (R2 = 0.93–0.97), indicating strong agreement, and the TAP2-to-TAP1 ratio (average TAP2:TAP1 > 10) was within threefold across methods. TAP1 and TAP2 form a functional heterodimer that transports peptides for antigen presentation, and one would therefore expect a 1:1 expression ratio. The clearly higher abundance of TAP2 may indicate divergence in regulatory mechanisms between TAP1 and TAP2, in support of previous observations (Bahram et al., 1991; Zeidler et al., 1997). Application of relative quantification can be useful to derive changes in abundance of enzymes in a disease population compared with healthy volunteers and assess the implications of such changes for drug–drug interactions.
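The normalization suggested at the start of the preceding paragraph can be sketched as a single scaling step: each abundance is multiplied by the ratio of the nominal to the back-calculated total protein. The function name, the protein value, and the 250% overestimate below are hypothetical, and this is one possible implementation rather than a procedure validated in this study.

```python
from typing import Dict


def normalize_to_nominal(abundances: Dict[str, float],
                         percent_of_nominal: float) -> Dict[str, float]:
    """Scale abundances so that the back-calculated total matches the nominal amount."""
    if percent_of_nominal <= 0:
        raise ValueError("percent_of_nominal must be positive")
    factor = 100.0 / percent_of_nominal
    return {protein: value * factor for protein, value in abundances.items()}


if __name__ == "__main__":
    # Hypothetical TPA abundance (pmol/mg) with a total protein estimate of 250% of nominal
    print(normalize_to_nominal({"CYP3A4": 150.0}, 250.0))
```

Because the factor is applied uniformly, between-sample and within-sample ratios are unchanged; only the absolute scale shifts.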
In conclusion, historical samples without appropriate standards can be subject to quantification using the TPA method, with the expectation that similar results would be achieved by standard-based methods. Normalization, or simply adjustment by a factor of two to three, leads to estimates of absolute abundance. Where standards are available, BSA is a good choice from readily available purified proteins. Importantly, relative quantification is robust across methods, which allows consistent assignment of between-subject variability.
Acknowledgments
The authors thank Pfizer (Groton, CT) for the provision of microsomal samples along with demographic, clinical, genotype, and activity data. The ChELSI Institute, the University of Sheffield, and the Manchester Institute of Biotechnology, the University of Manchester, provided access to mass spectrometry instrumentation.
Authorship Contributions
Participated in research design: Barber, Achour.
Performed data analysis: Barber, Al-Majdoub, Couto, Achour.
Wrote or contributed to the writing of the manuscript: Barber, Al-Majdoub, Couto, Vasilogianni, Tillmann, Alrubia, Rostami-Hodjegan, Achour.
Footnotes
- Received November 14, 2021.
- Accepted March 4, 2022.
Supported by the Centre for Applied Pharmacokinetic Research (CAPKR) consortium (Z.M.A.-M., B.A.), Cancer Research UK (A.-M.V.), the Saudi Ministry of Education (S.A.), and DrugTrain funded by the European Union’s Horizon 2020 research and innovation program (A.T.).
This article has supplemental material available at dmd.aspetjournals.org.
Abbreviations
- BSA: bovine serum albumin
- FASP: filter-aided sample preparation
- HiN: high N ion intensity method
- HLM: human liver microsomes
- iBAQ: intensity-based absolute quantification
- TPA: total protein approach
- Copyright © 2022 by The American Society for Pharmacology and Experimental Therapeutics