GPS: a novel group-based phosphorylation predicting and scoring method

https://doi.org/10.1016/j.bbrc.2004.11.001

Abstract

Protein phosphorylation is an important reversible post-translational modification of proteins, and it orchestrates a variety of cellular processes. Experimental identification of phosphorylation sites is labor-intensive and often limited by the availability and optimization of enzymatic reactions. In silico prediction may make it easier to identify potential phosphorylation sites. Here we present a novel computational method named GPS: group-based phosphorylation site predicting and scoring platform. If two polypeptides differ by only two consecutive amino acids, in particular when the two differing amino acids are a conserved pair, e.g., isoleucine (I) and valine (V), or serine (S) and threonine (T), we regard the two polypeptides as having similar 3D structures and biochemical properties. Based on this rationale, we formulated GPS, which offers greater computational power and superior performance compared with two existing phosphorylation-site prediction systems, ScanSite 2.0 and PredPhospho. Drawing on databases in the public domain, GPS can predict substrate phosphorylation sites for 52 different protein kinase (PK) families, while ScanSite 2.0 and PredPhospho offer at most 30 PK families. Using PKA as a model enzyme, we first compared prediction profiles from the GPS method with those from ScanSite 2.0 and PredPhospho. In addition, we chose the essential mitotic kinase Aurora-B as a model enzyme, since ScanSite 2.0 and PredPhospho offer no prediction for it. GPS, by contrast, offers satisfactory sensitivity (94.44%) and specificity (97.14%). Finally, the accuracy of GPS predictions was validated experimentally: six out of seven potential phosphorylation sites predicted on MCAK (Q91636) were verified. Taken together, we have generated a novel method to predict phosphorylation sites, which offers greater precision and computing power than ScanSite 2.0 and PredPhospho.
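The similarity rationale in the abstract can be illustrated with a minimal Python sketch. It is not the GPS implementation: the scoring values (2 for identical residues, 1 for a conserved pair, 0 otherwise), the function names, and the example peptides are all assumptions made for illustration, and only the two conserved pairs named in the abstract (I/V and S/T) are included.

```python
# Illustrative sketch only: scoring values and peptides are hypothetical,
# not taken from the GPS paper.

# Conserved pairs explicitly named in the abstract: I/V and S/T.
CONSERVED_PAIRS = {frozenset("IV"), frozenset("ST")}


def residue_score(a: str, b: str) -> int:
    """Score one aligned residue pair.

    Assumed scheme: 2 = identical, 1 = conserved substitution, 0 = otherwise.
    """
    if a == b:
        return 2
    if frozenset((a, b)) in CONSERVED_PAIRS:
        return 1
    return 0


def peptide_similarity(p: str, q: str) -> int:
    """Sum position-wise scores for two equal-length peptides."""
    if len(p) != len(q):
        raise ValueError("peptides must have equal length")
    return sum(residue_score(a, b) for a, b in zip(p, q))


def best_match_score(candidate: str, known_sites: list[str]) -> int:
    """Score a candidate against a group of known sites; keep the best match."""
    return max(peptide_similarity(candidate, site) for site in known_sites)


if __name__ == "__main__":
    known = ["RRASVAG"]      # hypothetical known site for some kinase
    candidate = "RRATIAG"    # differs at two consecutive, conserved positions
    print(best_match_score(candidate, known))
```

Under this assumed scheme, a candidate that differs from a known site only by conserved substitutions at two consecutive positions still scores close to an identical peptide, which is the intuition the abstract describes.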


Methods

Data collection. We obtained the data set of phosphorylation sites from Phospho.ELM [24], which also includes the data of PhosphoBase [25]. After removing phosphorylation sites with ambiguous PK information, we retained 1404 items. We also manually checked recent publications and obtained 597 more items. After clustering homologous PKs with too few known phosphorylation sites into a single group, we obtained 52 PK families/PK groups, including ABL, ALK, AMPK, ATM, AURORA-B, BTK, CAK, CAM-II, CDK,

Performance on kinase PKA

We evaluate the performance of the GPS method against two popular phosphorylation prediction systems, ScanSite 2.0 [20] and PredPhospho [21]. ScanSite 2.0 provides phosphorylation site prediction for 26 kinases, while PredPhospho provides such functionality for only four groups and four families of kinases. Besides most of the kinases covered by these two systems, the GPS method also includes several kinases that have come into focus recently, e.g., Aurora-B. In the following, we will mainly evaluate
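For readers unfamiliar with the sensitivity and specificity figures quoted in the abstract, the following Python sketch shows how such values are computed, assuming the standard definitions Sn = TP / (TP + FN) and Sp = TN / (TN + FP). The confusion-matrix counts are hypothetical: this excerpt does not report the underlying counts, and the numbers below were chosen only because the arithmetic happens to reproduce the reported percentages.

```python
# Standard definitions of sensitivity and specificity.
# The counts used in the example are hypothetical, not taken from the paper.


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true phosphorylation sites that are predicted as sites."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-phosphorylated sites that are predicted as non-sites."""
    return tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts: 17 of 18 positives found, 34 of 35 negatives rejected.
    sn = sensitivity(tp=17, fn=1)
    sp = specificity(tn=34, fp=1)
    print(f"sensitivity = {sn:.2%}, specificity = {sp:.2%}")
```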

Acknowledgments

We thank Dr. T.J. Gibson and Dr. F. Diella for providing the data set of Phospho.ELM for this study. This work was supported by grants from Chinese Natural Science Foundation (39925018 and 30121001), Chinese Academy of Science (KSCX2-2-01), Chinese 973 project (2002CB713700), and American Cancer Society (RPG-99-173-01) to X. Yao. X. Yao is a GCC Distinguished Cancer Research Scholar.

References (38)

  • G.J. Gorbsky, Mitosis: MCAK under the aura of Aurora B, Curr. Biol. (2004)
  • W. Lan et al., Aurora B phosphorylates centromeric MCAK and regulates its localization and microtubule depolymerization activity, Curr. Biol. (2004)
  • G. Manning et al., The protein kinase complement of the human genome, Science (2002)
  • S. Caenepeel et al., The mouse kinome: discovery and comparative genomics of all mouse protein kinases, Proc. Natl. Acad. Sci. USA (2004)
  • S.B. Ficarro et al., Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae, Nat. Biotechnol. (2002)
  • B.A. Ballif, J. Villen, S.A. Beausoleil, D. Schwartz, S.P. Gygi, Phosphoproteomic analysis of the developing mouse...
  • S.A. Beausoleil et al., Large-scale characterization of HeLa cell nuclear phosphoproteins, Proc. Natl. Acad. Sci. USA (2004)
  • Y.P. Lim et al., Phosphoproteomic fingerprinting of epidermal growth factor signaling and anticancer drug action in human tumor cells, Mol. Cancer Ther. (2003)
  • T.S. Nuhse et al., Phosphoproteomics of the Arabidopsis plasma membrane and a new phosphorylation site database, Plant Cell (2004)
1 These authors contributed equally to this work.
