A new statistical approach to predicting aromatic hydroxylation sites. Comparison with model-based approaches

J Chem Inf Comput Sci. 2004 Nov-Dec;44(6):1998-2009. doi: 10.1021/ci049834h.

Abstract

A new approach is described that predicts the most probable metabolic sites on the basis of a statistical analysis of various metabolic transformations reported in the literature. The approach is applied to the prediction of aromatic hydroxylation sites for diverse sets of substrates. Training is performed using the aromatic hydroxylation reactions from the Metabolism database (Accelrys). Validation is carried out on heterogeneous sets of aromatic compounds reported in the Metabolite database (MDL). The average accuracy of prediction of experimentally observed hydroxylation sites, estimated for 1552 substrates from Metabolite, is 84.5%. The proposed approach is compared with two electronic models for P450-mediated aromatic hydroxylation: the oxenoid model, which uses atomic oxygen, and the model that uses the methoxy radical as a surrogate for the heme active oxygen species. For benzene derivatives, the proposed method is inferior to the oxenoid model and as accurate as the methoxy-radical model. For hetero- and polycyclic compounds, the oxenoid model is not applicable, and the statistical method is the most accurate. Its broad applicability and high calculation speed provide the basis for using the proposed statistical approach for high-throughput metabolism prediction in the early stages of drug discovery.
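
The abstract does not give the details of the statistical analysis, so the following is only a minimal sketch of one way a frequency-based site predictor of this general kind could work. It assumes that each aromatic position is reduced to an environment label, that the training data records whether a position with that label was hydroxylated, and that a prediction counts as correct when the top-ranked site is among the observed ones. The environment labels, the function names (`train_site_frequencies`, `rank_sites`, `top1_accuracy`), the toy records, and the accuracy definition are all hypothetical and are not taken from the paper or from either database.

```python
from collections import defaultdict


def train_site_frequencies(records):
    """Estimate, per environment label, the fraction of training
    positions that were hydroxylated.

    records: iterable of (environment_label, was_hydroxylated) pairs.
    (Hypothetical data layout, assumed for this sketch.)
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [hydroxylated, total]
    for env, hydroxylated in records:
        counts[env][1] += 1
        if hydroxylated:
            counts[env][0] += 1
    return {env: hits / total for env, (hits, total) in counts.items()}


def rank_sites(site_envs, freqs, default=0.0):
    """Order candidate sites from most to least probable.

    site_envs: dict mapping a site name to its environment label.
    Labels never seen in training receive the `default` score.
    """
    return sorted(site_envs,
                  key=lambda site: freqs.get(site_envs[site], default),
                  reverse=True)


def top1_accuracy(substrates, freqs):
    """Fraction of substrates whose top-ranked site was observed.

    substrates: list of (site_envs, observed_sites) pairs.
    This accuracy definition is an assumption of the sketch.
    """
    hits = sum(1 for site_envs, observed in substrates
               if rank_sites(site_envs, freqs)[0] in observed)
    return hits / len(substrates)


# Toy training records with made-up environment labels.
toy_records = [
    ("para-to-OMe", True), ("para-to-OMe", True), ("para-to-OMe", False),
    ("ortho-to-NO2", False), ("ortho-to-NO2", False),
    ("meta-to-Cl", True), ("meta-to-Cl", False),
]
toy_freqs = train_site_frequencies(toy_records)

# Toy validation set: (candidate sites, experimentally observed sites).
toy_substrates = [
    ({"C4": "para-to-OMe", "C2": "ortho-to-NO2"}, {"C4"}),
    ({"C3": "meta-to-Cl", "C2": "ortho-to-NO2"}, {"C2"}),
]
toy_accuracy = top1_accuracy(toy_substrates, toy_freqs)
```

A lookup of this kind needs no quantum-chemical calculation per substrate, which is consistent with the speed argument made in the abstract. A realistic implementation would also need a systematic way to derive environment labels from chemical structures, which this sketch leaves out.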