Biochimica et Biophysica Acta (BBA) - Protein Structure and Molecular Enzymology
Review
The lipocalin protein family: structural and sequence overview
Introduction
The special edition of Biochimica et Biophysica Acta, of which this paper forms a small part, describes in detail the biological function of many members of the lipocalin protein family. The lipocalin family is a large, and ever expanding, group of proteins exhibiting great structural and functional variation, both within and between species [1]. Lipocalins are typically small (160–180 residues in length), extracellular proteins sharing several common molecular recognition properties: the binding of small, principally hydrophobic molecules (such as retinol); binding to specific cell-surface receptors; and the formation of covalent and non-covalent complexes with other soluble macromolecules. Although they have been classified mainly as transport proteins, it is now clear that members of the lipocalin family fulfil a wide variety of different functions.
Despite many common characteristics and common functions, membership of the lipocalin family has been defined largely on the basis of sequence, or structural, similarity and it has grown to encompass a large corpus of proteins. Within this corpus, the lipocalins display unusually low levels of overall sequence conservation, with pairwise comparisons often falling below 20%, the nominal threshold for a reliable alignment. However, all lipocalins share sufficient similarity, in the form of short characteristic conserved sequence motifs, to form the basis of a useful definition of family membership [2], [3]. Most lipocalins, the kernel lipocalins, share three characteristic conserved sequence motifs, while other more divergent family members, the outlier lipocalins, typically share only one or two.
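The two criteria in this paragraph, pairwise identity relative to the nominal 20% threshold and the kernel/outlier split by number of conserved motifs, can be illustrated with a minimal Python sketch. The function names, the gap character, and the toy aligned strings are assumptions made for illustration only; identity is computed here over aligned columns in which neither sequence has a gap, which is just one of several conventions in use, and detecting the motifs themselves is not attempted.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes both strings come from the same pairwise alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def classify_lipocalin(motifs_found):
    """Kernel if all three conserved motifs are present, outlier if one or two.

    `motifs_found` is a hypothetical count (0-3) supplied by the caller.
    """
    if not 0 <= motifs_found <= 3:
        raise ValueError("motif count must be between 0 and 3")
    if motifs_found == 3:
        return "kernel"
    if motifs_found >= 1:
        return "outlier"
    return "unclassified"


if __name__ == "__main__":
    # Toy aligned fragments, not real lipocalin sequences.
    identity = percent_identity("ACDE-F", "ACGEHF")
    print(identity, "below nominal threshold" if identity < 20.0 else "above nominal threshold")
    print(classify_lipocalin(3), classify_lipocalin(2))
```

The sketch separates the two questions deliberately: a pair of lipocalins can fall below the identity threshold and still both be classified as kernel members, because the classification rests on the short motifs rather than on overall similarity.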
Lipocalins of known three-dimensional structure include a group of kernel lipocalins: retinol-binding protein (RBP) [4], β-lactoglobulin (Blg) [5], insecticyanin [6], bilin-binding protein (BBP) [7], major urinary protein [8], α2u-globulin [8], [9], epididymal retinoic acid-binding protein [10], and neutrophil lipocalin [11]. There are also a number of outlier lipocalin groups including odorant-binding protein [12], [13], Bos d2 allergen [14], nitrophorin [15], and a histamine binding protein from the tick Rhipicephalus appendiculatus [16].
The common structure of the lipocalin protein fold is now well-described [1], [3], [17]. The lipocalin fold is a highly symmetrical all-β structure dominated by a single eight-stranded antiparallel β-sheet closed back on itself to form a continuously hydrogen-bonded β-barrel. In cross-section, this has a flattened or elliptical shape. The β-barrel encloses a ligand-binding site composed of both an internal cavity and an external loop scaffold. It is this diversity of cavity and scaffold that gives rise to a variety of different binding modes each capable of accommodating ligands of different size, shape, and chemical character. The eight β-strands of the barrel, labelled A–H (see Fig. 1), are linked by a succession of +1 connections, giving it the simplest possible β-sheet topology. The seven loops, labelled L1–L7, are all typical of short β-hairpins, except loop L1: this is a large Ω loop. Loop L1 forms a lid folded back to close partially the internal ligand-binding site found at this end of the barrel. Between strand H and the short terminal strand I is an α-helix; this is an ever-present feature of the lipocalin fold but is not conserved in its position relative to the axis of the β-barrel nor in its length (see Fig. 1). Previous work has analysed the conservation of sequence and structure in the lipocalin protein family [1], [8]. These accounts show how the common core characteristic of the lipocalin fold is dominated by three large structurally conserved regions (SCRs): SCR1 (strand A and the 3₁₀-like helix preceding it), SCR2 (strands F and G, and loop L6 linking them), and SCR3 (strand H and adjoining residues). Other SCRs of the common core are small and can be neglected (see Fig. 2). The three principal SCRs each contain a sequence motif that is wholly, or partly, invariant.
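The strand and loop bookkeeping described above can be written out as a short Python sketch. It assumes only what the text states: eight strands A–H joined in sequence order by +1 connections, seven loops L1–L7, and a sheet that closes back on itself. The function names are hypothetical, and the assumption that closure pairs strand H with strand A follows from the all-+1 topology rather than from any coordinates.

```python
from typing import Dict, Tuple

STRANDS = "ABCDEFGH"  # the eight barrel strands, in sequence order


def hairpin_loops(strands: str = STRANDS) -> Dict[str, Tuple[str, str]]:
    """Map each loop label (L1, L2, ...) to the two consecutive strands it links."""
    return {
        "L%d" % (i + 1): (strands[i], strands[i + 1])
        for i in range(len(strands) - 1)
    }


def barrel_neighbours(strands: str = STRANDS) -> Dict[str, Tuple[str, str]]:
    """Hydrogen-bonded neighbours of each strand in the closed barrel.

    With only +1 connections every strand pairs with the strands before and
    after it in sequence; closure pairs the last strand with the first.
    """
    n = len(strands)
    return {
        s: (strands[(i - 1) % n], strands[(i + 1) % n])
        for i, s in enumerate(strands)
    }


if __name__ == "__main__":
    loops = hairpin_loops()
    print(len(loops))               # number of loops between eight strands
    print(loops["L1"])              # the large omega loop
    print(loops["L6"])              # the loop contained in SCR2
    print(barrel_neighbours()["A"])
```

Under these assumptions loop L6 comes out as the link between strands F and G, which agrees with the definition of SCR2 given in the text.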
Together with three other distinct protein families: the fatty-acid-binding proteins (FABPs), avidins, metalloproteinase inhibitors (and the presently enigmatic triabin), the lipocalin family forms part of a larger structural superfamily: the calycins [3], [18], [19]. This is an example of a ‘structural superfamily’: a set of proteins with closely related three-dimensional structures that show no significant overall similarity at the sequence level. In this paper, we shall review and update structure and sequence relations within the lipocalins, and also those between the lipocalins and other members of the calycin protein superfamily.
Section snippets
Structural relationships to the calycin protein superfamily
The lipocalins, the fatty acid-binding proteins (FABPs), and the avidins – three families of ligand-binding proteins – together form the calycin protein superfamily. The FABPs are a family of predominantly intracellular proteins involved in lipid metabolism [20]. The avidins are proteins with a remarkable affinity for the vitamin biotin, which have found important applications in biotechnology [21]. Recently, the size of the calycin protein superfamily has been enlarged to incorporate a group
Relations within the lipocalin family
Lipocalins have been found predominantly in eukaryotic organisms, mostly in vertebrates, although some have been identified in other phyla. Apart from vertebrates, which are over-represented for historical and pragmatic reasons, the best exemplified phylum is Arthropoda, where proteins include butterfly insecticyanin, grasshopper lazarillo [36], cockroach Bla g 4 protein [37], lobster crustacyanin [38], a lipocalin from Leucophaea maderae [39], nitrophorin from Rhodnius prolixus [15], and
Discussion
The lipocalin protein family has continued to grow, both in the number of sequences and structures determined and in its diversity and interest. In this brief review, we have focused our attention primarily on new relationships within the calycin superfamily and on a more careful appreciation of sequence relationships within the lipocalins.
The kernel–outlier division has proved a useful tool in the analysis of the lipocalins, accounting for most sequences of the lipocalin family. The
Conclusion
The classification of the lipocalins as kernel and outliers is a simple, yet powerful, definition. Nonetheless, despite its classificatory power there are examples, such as the late lactation proteins, that have proved exceptions, as they lack the Trp–Arg interaction characteristic of almost all calycins. As the number of calycins identified as members of the superfamily continues to increase, it seems likely that it will become even more diverse. Nonetheless, with this growth will come a new
Acknowledgements
We thank the two anonymous referees for their helpful and constructive comments, which have doubtless strengthened this paper. We should also like to thank Dr T.K. Attwood, Professor J.-P. Salier, Dr B. Akerstrom, and Dr R.E. Bishop for helpful discussions.
References (58)
- et al. Mouse oncogene protein-24p3 is a member of the lipocalin protein family. Biochem. Biophys. Res. Commun. (1991)
- et al. Bovine beta-lactoglobulin at 1.8 angstrom resolution – still an enigmatic lipocalin. Structure (1997)
- et al. Molecular structure of the bilin binding-protein (BBP) from Pieris brassicae after refinement at 2.0-Å resolution. J. Mol. Biol. (1987)
- Structure of the epididymal retinoic acid-binding protein at 2.1 angstrom resolution. Structure (1993)
- et al. The solution structure and dynamics of human neutrophil gelatinase-associated lipocalin. J. Mol. Biol. (1999)
- et al. Probing the molecular basis of allergy – three-dimensional structure of the bovine lipocalin allergen Bos d 2. J. Biol. Chem. (1999)
- et al. The crystal structure of nitrophorin 4 at 1.5 angstrom resolution: transport of nitric oxide by a lipocalin-based heme protein. Structure (1998)
- et al. Tick histamine-binding proteins: isolation, cloning, and three-dimensional structure. Mol. Cell (1999)
- Structural relationship of streptavidin to the calycin protein superfamily. FEBS Lett. (1993)
- et al. Lipid-binding proteins – a family of fatty-acid and retinoid transport proteins. Adv. Protein Chem. (1994)
- Crystal-structure of a complex between Serratia marcescens metalloprotease and an inhibitor from Erwinia chrysanthemi. J. Mol. Biol.
- How far divergent evolution goes in proteins. Curr. Opin. Struct. Biol.
- Phosphatidylinositol phosphate kinase: a link between protein kinase and glutathione synthase folds. J. Mol. Biol.
- A structural tree for proteins containing 3 beta-corners. FEBS Lett.
- Scop: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol.
- cDNA cloning of an adult male putative lipocalin specific to tergal gland aphrodisiac secretion in an insect (Leucophaea maderae). FEBS Lett.
- The first prokaryotic lipocalins. Trends Biochem. Sci.
- Stationary phase expression of a novel Escherichia coli outer membrane lipoprotein and its relationship with mammalian apolipoprotein D. Implications for the origin of lipocalins. J. Biol. Chem.
- Xanthophyll cycle enzymes are members of the lipocalin family, the first identified from plants. J. Biol. Chem.
- Evolution of the lipocalin family as inferred from a protein sequence phylogeny. Biochim. Biophys. Acta
- The bacterial lipocalins. Biochim. Biophys. Acta
- Eukaryotic signalling domain homologues in archaea and bacteria. Ancient ancestry and horizontal gene transfer. J. Mol. Biol.
- Many of the immunoglobulin superfamily domains in cell-adhesion molecules and surface-receptors belong to a new structural set which is close to that containing variable domains. J. Mol. Biol.
- Supersites within superfolds. Binding site similarity in the absence of homology. J. Mol. Biol.
- Modelling G protein coupled receptors for drug design. Rev. Biomembr.
- Protein structures sustain evolutionary drift. Fold. Design
- ALTER: eclectic management of molecular structure data. J. Mol. Graph. Mod.
- The lipocalin protein family: structure and function. Biochem. J.
- Structure and sequence relationships in the lipocalins and related proteins. Protein Sci.