Abstract
The appropriateness of relying on the coefficient of determination (r^{2}) as a statistical metric for judging the predictability of human clearance (CL) from interspecies animal data was assessed. An explicit mathematical expression was derived for r^{2} as a function of species body weight and the corresponding measured value of CL. The derived function demonstrated that r^{2} is numerically large in most instances. Simulations using random CL values generated for a common combination of species (mouse, rat, and monkey) resulted in a minimum r^{2} of 0.75, and values of 0.95 and 0.98 at the 50th and 75th percentiles, respectively, given that total CL values increase with increasing species body weight. Analysis of literature data also indicated that the prediction accuracy of human CL was not correlated with values of r^{2}. Therefore, it is concluded that r^{2} is of limited value as a statistical measure when assessing allometric scaling for the purpose of predicting human CL.
Allometric scaling has been widely used in predicting human pharmacokinetic (PK) parameters, although the allometric approach is empirical and numerous examples of substantial prediction errors have been observed (Boxenbaum, 1982; Mahmood and Balian, 1996; Nagilla and Ward, 2004; Tang and Mayersohn, 2006). The allometric relationship for PK parameters across animal species and the confidence in extrapolation of this relationship to humans are often assessed with use of the coefficient of determination (r^{2}). The latter is obtained from linear regression of log-transformed animal body weights and the corresponding measured values of (log) PK parameters. High r^{2} values (ca. greater than 0.90) have been cited for most of the allometric relationships reported in the literature (Mahmood and Balian, 1996; Hu and Hayton, 2001). By definition, r^{2} is the fraction of the total variation (sum of squares about the mean) explained by the model. It is generally recognized that r^{2} is not a good statistical measure for nonlinear models. For example, overparameterized models could easily lead to high r^{2} values, whereas such models usually have little predictive value. It has also long been recognized that the log-log transformation of the allometric power function (P = a · W^{b}) would minimize deviations from the regression line (Smith, 1984). Therefore, it is reasonable to speculate that r^{2} may not offer a good measure for examining the predictive quality of the allometric relationship. We report here an explicit mathematical function of r^{2} derived to quantitatively assess the appropriateness of using r^{2} as a statistical measure in allometric scaling. Literature data were also evaluated to assess the relationship between r^{2} and the prediction performance of allometric scaling.
Materials and Methods
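Theory. Expression of predicted PK parameters among species. The function relating predicted PK parameters (P̂) in humans or animal species to animal body weights (W_{i}, i = 1 to n, where n is the number of animal species) and observed animal PK parameters (P_{i}) has been described previously (Tang and Mayersohn, 2005). The following highlights the major mathematical functions needed in the subsequent derivations. Least-squares regression of log P_{i} on log W_{i} gives the coefficient and exponent of the allometric power function as

$$a = \prod_{i=1}^{n} P_i^{A_i}, \qquad b = \sum_{i=1}^{n} B_i \cdot \log P_i$$

where

$$A_i = \frac{\sum_{j=1}^{n} (\log W_j)^2 - \log W_i \sum_{j=1}^{n} \log W_j}{n \sum_{j=1}^{n} (\log W_j)^2 - \left(\sum_{j=1}^{n} \log W_j\right)^2}, \qquad B_i = \frac{n \log W_i - \sum_{j=1}^{n} \log W_j}{n \sum_{j=1}^{n} (\log W_j)^2 - \left(\sum_{j=1}^{n} \log W_j\right)^2}$$

The predicted PK parameter value, P̂, in the species of interest (body weight W) is obtained from

$$\hat{P} = a \cdot W^{b}$$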
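The roles of A_{i} and B_{i} can be illustrated with a short Python sketch. It assumes that both weights take the ordinary least-squares form for a regression of log P on log W, so that they depend on body weights only. The function names are hypothetical, the 70-kg human body weight is an assumed illustrative value, and the CL values are those of the perfect-allometry example given under Results and Discussion.

```python
import math


def allometric_weights(body_weights):
    """Return (A, B), the per-species weights that depend only on body weight.

    Assumed ordinary least-squares form for regressing log P on log W:
    log a = sum(A_i * log P_i) and b = sum(B_i * log P_i).
    """
    x = [math.log10(w) for w in body_weights]
    n = len(x)
    sum_x = sum(x)
    sum_xx = sum(v * v for v in x)
    denom = n * sum_xx - sum_x * sum_x
    A = [(sum_xx - xi * sum_x) / denom for xi in x]
    B = [(n * xi - sum_x) / denom for xi in x]
    return A, B


def fit_power_function(body_weights, params):
    """Return (a, b) of P = a * W**b from observed parameters P_i."""
    A, B = allometric_weights(body_weights)
    log_p = [math.log10(p) for p in params]
    log_a = sum(ai * yi for ai, yi in zip(A, log_p))
    b = sum(bi * yi for bi, yi in zip(B, log_p))
    return 10.0 ** log_a, b


weights_kg = [0.03, 0.3, 3.0]    # mouse, rat, monkey
cl_l_per_h = [0.130, 0.730, 4.10]  # example CL values from the text
A, B = allometric_weights(weights_kg)
a, b = fit_power_function(weights_kg, cl_l_per_h)

human_weight_kg = 70.0  # assumed value, for illustration only
predicted_human_cl = a * human_weight_kg ** b
print(f"a = {a:.3f}, b = {b:.3f}, predicted human CL = {predicted_human_cl:.1f} l/h")
```

Because the weights are computed before any CL value is seen, the same A_{i} and B_{i} apply to every compound studied in that combination of species.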
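Expression for r^{2}. The log-log transformation of P = a · W^{b} gives

$$\log P = \log a + b \cdot \log W \tag{7}$$

Let Y = log P, X = log W, α = log a, and β = b. Then, eq. 7 can be simplified to

$$Y = \alpha + \beta \cdot X \tag{8}$$

r^{2}, by definition, is expressed as

$$r^{2} = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \tag{9}$$

where Ŷ_{i} is the value of Y_{i} predicted by the regression line and Ȳ is the mean of Y_{i} (eqs. 10 to 12). Although r^{2} can be explicitly expressed when eqs. 10 to 12 are placed into eq. 9, a visually clearer form is not readily available. Therefore, a common combination of animal species (mouse, rat, and monkey, with body weights assigned as 0.03, 0.3, and 3 kg, respectively) was used for illustration purposes. For this combination, the values of X_{i} (log 0.03, log 0.3, and log 3) are equally spaced one log unit apart. Substituting the corresponding expressions (eqs. 13 to 16) into eq. 9 results in

$$r^{2} = \frac{3\,(Y_{monkey} - Y_{mouse})^2}{4\left[(Y_{rat} - Y_{mouse})^2 + (Y_{rat} - Y_{mouse})(Y_{monkey} - Y_{rat}) + (Y_{monkey} - Y_{rat})^2\right]}$$

Note that, and this is true most of the time, values for total CL in mouse, rat, and monkey follow the corresponding order of body weight. Let CL_{monkey} = L · CL_{rat} and CL_{rat} = M · CL_{mouse}, where L, M > 1; r^{2} will then be equal to

$$r^{2} = \frac{3}{4}\left[1 + \frac{1}{1 + \dfrac{\log L}{\log M} + \dfrac{\log M}{\log L}}\right] \tag{19}$$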
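The dependence of r^{2} on L and M can be checked numerically. The Python sketch below computes r^{2} directly from a least-squares fit of log CL on log body weight and compares it with a closed form in L and M. The closed form is an assumption here: it is the expression that ordinary least squares yields for three species spaced one log unit apart in body weight. The function names and the arbitrary mouse CL of 1 are hypothetical.

```python
import math


def r_squared_loglog(body_weights, params):
    """r^2 from ordinary least-squares regression of log P on log W."""
    x = [math.log10(w) for w in body_weights]
    y = [math.log10(p) for p in params]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    ss_res = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def r_squared_closed_form(L, M):
    """Assumed closed form for mouse/rat/monkey at 0.03, 0.3, and 3 kg (L, M > 1)."""
    t = math.log10(L) / math.log10(M) + math.log10(M) / math.log10(L)
    return 0.75 * (1.0 + 1.0 / (1.0 + t))


# Extreme case discussed in the text: L = 10 and M = 1000.
L, M = 10.0, 1000.0
cl_mouse = 1.0  # arbitrary; r^2 depends only on the ratios L and M
cl = [cl_mouse, M * cl_mouse, L * M * cl_mouse]

r2_fit = r_squared_loglog([0.03, 0.3, 3.0], cl)
r2_formula = r_squared_closed_form(L, M)
print(f"r^2 from regression = {r2_fit:.3f}, from closed form = {r2_formula:.3f}")
```

Both routes give 12/13, or approximately 0.92, for this case.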
Simulating r^{2} values. Although a wide range of CL values for each species is considered here for simulation purposes, in reality CL values usually do not exceed certain limits in each species. For each species, the values ranged from 0.001 times liver blood flow (LBF) at the low end to 5 times LBF at the high end. In total, 10,000 random values of CL were generated for each species from a uniform distribution over [0.001 × LBF, 5 × LBF], where the LBF for mouse, rat, and monkey was 5.40, 3.31, and 2.62 l/h · kg, respectively (Davies and Morris, 1993). Thus, 10,000 r^{2} values were computed. In reality, the magnitude of CL follows the order of species body weight; therefore, the r^{2} values obtained were further constrained to the expected order of CL: CL_{monkey} > CL_{rat} > CL_{mouse}. All calculations and simulations were performed using MATLAB, version 6.5 (MathWorks Inc., Natick, MA).
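The simulation procedure can be sketched in Python as follows. This is not the original MATLAB code. It assumes that each sampled per-kilogram CL is multiplied by body weight to obtain total CL, and it uses the closed form of r^{2} that ordinary least squares gives for three species spaced one log unit apart in body weight. The function names and the random seed are hypothetical, so the percentiles it prints need not match the published values exactly.

```python
import math
import random

BODY_WEIGHT_KG = {"mouse": 0.03, "rat": 0.3, "monkey": 3.0}
LBF_L_PER_H_KG = {"mouse": 5.40, "rat": 3.31, "monkey": 2.62}  # Davies and Morris, 1993
SPECIES = ("mouse", "rat", "monkey")


def sample_total_cl(species, rng):
    """Draw one CL uniformly from 0.001x to 5x LBF.

    ASSUMPTION: the per-kg value is scaled by body weight to give total CL (l/h).
    """
    lbf = LBF_L_PER_H_KG[species]
    return rng.uniform(0.001 * lbf, 5.0 * lbf) * BODY_WEIGHT_KG[species]


def r_squared_equal_spacing(cl_mouse, cl_rat, cl_monkey):
    """r^2 of the log-log regression when log body weights are equally spaced."""
    u = math.log10(cl_rat / cl_mouse)
    v = math.log10(cl_monkey / cl_rat)
    return 3.0 * (u + v) ** 2 / (4.0 * (u * u + u * v + v * v))


def simulate(n_draws=10_000, seed=0):
    """Return r^2 for all draws and for draws with CL ordered by body weight."""
    rng = random.Random(seed)
    all_r2, ordered_r2 = [], []
    for _ in range(n_draws):
        cl = [sample_total_cl(s, rng) for s in SPECIES]
        r2 = r_squared_equal_spacing(*cl)
        all_r2.append(r2)
        if cl[0] < cl[1] < cl[2]:  # CL_mouse < CL_rat < CL_monkey
            ordered_r2.append(r2)
    return all_r2, ordered_r2


def percentile(values, q):
    """Element at the rounded fractional position q (0 to 1) of the sorted list."""
    ordered = sorted(values)
    return ordered[round(q * (len(ordered) - 1))]


all_r2, ordered_r2 = simulate()
print(f"draws with ordered CL: {len(ordered_r2)} of {len(all_r2)}")
print(f"minimum r^2: {min(ordered_r2):.3f}")
print(f"50th percentile: {percentile(ordered_r2, 0.50):.3f}")
print(f"75th percentile: {percentile(ordered_r2, 0.75):.3f}")
```

Under the ordering constraint, every simulated r^{2} stays above 0.75 regardless of the seed, because that bound follows from the algebra and not from the sampling.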
Literature Data Evaluation. A large set of allometric data including CL values in rat, monkey, dog, and human is available (Jolivette and Ward, 2006). The combination of animal species in this data set differs from the combination used in the above-mentioned example developed under Theory. Because monkey and dog have similar body weights, it is expected that some CL values in those species will not strictly follow the order of body weights. Therefore, lower r^{2} values are expected from that combination than from simulations based on the combination of mouse, rat, and monkey. Nevertheless, the correlation between the prediction performance and the r^{2} values can still be assessed.
Results and Discussion
Values for r^{2} obtained from the species combination mouse, rat, and monkey are derived from eq. 19. Notice that (log L/log M + log M/log L) ≥ 2 for L, M > 1; r^{2} is, therefore, always greater than 0.75 (the limit approached when log L/log M + log M/log L → ∞) and equal to 1 (when log L/log M + log M/log L = 2, or log L = log M). Furthermore, because an approximate allometric relationship usually exists, the value of L is usually close to that of M. Therefore, one expects that in most cases r^{2} is high and close to 1. Even in some extreme situations, for example, when L = 10 and M = 1000 (that is, the CL in rat is 1000-fold higher than that in mouse, whereas the CL in monkey is only 10-fold higher than that in rat, given the same 10-fold difference in body weights between rat and mouse and between monkey and rat), the resulting r^{2} is still as high as 0.92.
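The bounds and the extreme case can be verified with a short worked calculation. The sketch assumes that, for the equally spaced log body weights of mouse, rat, and monkey, eq. 19 reduces to the form on the first line; the shorthand t is introduced here for convenience and is not a symbol from the original derivation.

```latex
\begin{align*}
r^{2} &= \frac{3}{4}\left(1 + \frac{1}{1+t}\right), \qquad
  t = \frac{\log L}{\log M} + \frac{\log M}{\log L} \\
t - 2 &= \frac{(\log L - \log M)^{2}}{\log L \cdot \log M} \ge 0
  \quad (L, M > 1) \\
t = 2 &\;\Rightarrow\; r^{2} = \tfrac{3}{4}\left(1 + \tfrac{1}{3}\right) = 1,
  \qquad t \to \infty \;\Rightarrow\; r^{2} \to \tfrac{3}{4} \\
L = 10,\ M = 1000 &:\quad t = \tfrac{1}{3} + 3 = \tfrac{10}{3}, \quad
  r^{2} = \tfrac{3}{4}\left(1 + \tfrac{3}{13}\right) = \tfrac{12}{13} \approx 0.92
\end{align*}
```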
The results from the simulations also indicated that r^{2} values were high in the majority of cases based on the random values for CL in each species. The distribution of r^{2} values was strongly skewed, with most values concentrated near 1 (Fig. 1). The r^{2} values at the 50th and 75th percentiles were 0.95 and 0.98, respectively. These results clearly demonstrate that the common use of r^{2}, whose values are often considered to be “good” if they are greater than 0.90 or 0.95, is misleading.
The literature data indicated that there was no correlation between r^{2} and prediction performance (Fig. 2). This result suggests that r^{2} cannot serve as an indicator of how well human values will be predicted. This may be the case for several reasons. First, there exists great uncertainty associated with values in humans due to the complexity of biological systems; a good r^{2} or even a perfect r^{2} does not necessarily mean that the human value will be on the allometric line of regression. Second, we have shown that r^{2} is not an appropriate measure for gauging the quality of an allometric relationship; CL randomly sampled from animal species can result in good r^{2} values. Finally, the change in r^{2} is asymmetrical with respect to the values of CL. The procedure of log-log transformation followed by linear regression assumes a lognormal distribution of CL. Differentiating r^{2} (eq. 9) with regard to log CL in the monkey, for example, resulted in an asymmetrical function with respect to log CL (the resulting function, [∂(r^{2})]/[∂(log CL_{monkey})], is not shown here because of its complexity). Assume there is a perfect allometric relationship for mouse, rat, and monkey, with CL values of 0.130, 0.730, and 4.10 l/h (r^{2} = 1). Now, changing log CL in monkeys by −0.699 and +0.699 (that is, 5-fold lower and higher, respectively) results in r^{2} values of 0.797 and 0.967, respectively. It is apparent that deviations of −0.699 and +0.699 in log CL, which are equally probable, result in strikingly different r^{2} values.
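The asymmetry described above can be reproduced with a few lines of Python. The sketch uses the CL values and body weights stated in the text and computes r^{2} as the squared Pearson correlation between log body weight and log CL, which for a straight-line least-squares fit equals the coefficient of determination. The function name is hypothetical.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation, equal to r^2 for a straight-line fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy ** 2 / (s_xx * s_yy)


log_w = [math.log10(w) for w in (0.03, 0.3, 3.0)]         # mouse, rat, monkey (kg)
log_cl = [math.log10(cl) for cl in (0.130, 0.730, 4.10)]  # CL in l/h

baseline = r_squared(log_w, log_cl)

# Shift only the monkey value by the same amount in each direction on the log scale.
results = {}
for shift in (-0.699, 0.699):
    shifted = log_cl[:2] + [log_cl[2] + shift]
    results[shift] = r_squared(log_w, shifted)

print(f"baseline r^2 = {baseline:.3f}")
for shift, r2 in results.items():
    print(f"shift {shift:+.3f} in log CL (monkey): r^2 = {r2:.3f}")
```

The downward shift lowers r^{2} to about 0.797, whereas the upward shift of equal size leaves it at about 0.967, matching the values quoted in the text.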
In summary, r^{2} has been shown not to be an appropriate statistical measure for allometric scaling. We are not aware of another simple statistical measure that would serve in its place. The purpose of this communication is not to discourage reporting of r^{2} values for allometric relationships. However, we caution against using r^{2} as a measure of the quality of an allometric relationship. In particular, claiming a good allometric relationship on the basis of r^{2} values greater than 0.90 or 0.95 is not justified; of greater practical importance is the degree of confidence that one has in the predicted human value.
Appendix
Symbols Used in Derivations under Materials and Methods
a: The coefficient of the allometric power function, P = a · W^{b}
A_{i}: Equal to [Σ_{j}(log W_{j})^{2} − log W_{i} · Σ_{j} log W_{j}] / [n · Σ_{j}(log W_{j})^{2} − (Σ_{j} log W_{j})^{2}] and used in a = Π_{i} P_{i}^{A_{i}}, which calculates the coefficient (a) of the allometric power function. Note, the PK parameter, P_{i}, observed in one animal species (i) is raised to its specific exponent, A_{i}, which is only dependent on the body weights across animal species and bears no relation to observed P_{i}.
b: The exponent of the allometric power function, P = a · W^{b}
B_{i}: Equal to [n · log W_{i} − Σ_{j} log W_{j}] / [n · Σ_{j}(log W_{j})^{2} − (Σ_{j} log W_{j})^{2}] and used in b = Σ_{i} B_{i} · log P_{i}, which calculates the exponent (b) of the allometric power function. Note, the log P_{i} is multiplied by its specific scalar, B_{i}, which is only dependent on the body weights across animal species and bears no relation to observed P_{i}.
L: Equal to CL_{monkey}/CL_{rat}
M: Equal to CL_{rat}/CL_{mouse}
n: The number of animal species
P_{i}: The PK parameter observed in species i
P̂_{i}: The PK parameter predicted in species i
W_{i}: The body weight of species i
X_{i}: Equal to log W_{i}, and used to transform the allometric power function, P = a · W^{b}, to the linear function, Y = α + β · X
Y_{i}: Equal to log P_{i}, and used to transform the allometric power function, P = a · W^{b}, to the linear function, Y = α + β · X
Ȳ: Mean of Y_{i}
Ŷ: Predicted Y_{i}
α: Equal to log a, and used to transform the allometric power function, P = a · W^{b}, to the linear function, Y = α + β · X
β: Equal to b, and used to transform the allometric power function, P = a · W^{b}, to the linear function, Y = α + β · X
Footnotes

Article, publication date, and citation information can be found at http://dmd.aspetjournals.org.

doi:10.1124/dmd.107.016444.

ABBREVIATIONS: PK, pharmacokinetics; CL, clearance; LBF, liver blood flow.
 Received May 14, 2007.
 Accepted August 28, 2007.
 The American Society for Pharmacology and Experimental Therapeutics