Saturday, October 29, 2011

A mystery pathway in prokaryotes

Computational studies of proteins have greatly contributed to our understanding of the biology of a species or a system. In many instances, computational analyses have solved tricky biochemical problems (e.g. the biochemistry of pupylation), uncovered unexpected systems or pathways (e.g. the prokaryotic cognates of the eukaryotic ubiquitin pathway), solved long-standing mysteries (e.g. the principal transcription factors of apicomplexa), or clarified difficult evolutionary problems (e.g. the extent of lateral transfer between prokaryotes, the evolutionary origins of the AID/APOBEC deaminases). Yet there are instances when the biochemistry of most parts of a system is easily identifiable, but the biology remains an unsolved puzzle. Recently, we uncovered one such widespread system, present in most lineages of proteobacteria, actinobacteria, spirochaetes, cyanobacteria, chlamydiae and chloroflexi, and also in some crenarchaea. As the system is present in Mycobacterium tuberculosis, we shall use the mycobacterial gene names as representative identifiers. The basic system consists of:
  1. Rv2410c (DUF403 in Pfam 25): An alpha-helical protein, called Alpha-E, that contains an internal duplication, with each repeat possessing conserved ER motifs. Click here to access a multiple alignment.
  2. Rv2411c (split as DUF404+DUF407 in Pfam 25): A circularly permuted peptide ligase of the ATP-grasp fold.
  3. Rv2409c and Rv2569c: Transglutaminases that could serve either as peptidases or as classical transglutaminases.
  4. Rv2568c (DUF2248 in Pfam 25): A metallopeptidase-family peptidase.
  5. Rv2567: An inactive circularly permuted ATP-grasp fused to the Alpha-E domain.
  6. Rv2566 (Transglut+DUF2126 in Pfam 25): A transglutaminase fused to a circularly permuted peptide ligase of the COOH-NH2 ligase superfamily.
  7. Some species additionally contain an NTN hydrolase related to the proteasomal peptidase (called Anbu in one study) in the gene neighborhoods (not Mycobacterium) and amidotransferases of the GAT-I family. Click here to access all operons.
Thus, these systems together include two active peptide ligases, five distinct types of peptidase-like proteins (two transglutaminases, a Zincin-like metallopeptidase, the GAT-I domain and an NTN peptidase), the mystery Alpha-E protein and an inactive peptide ligase that may be fused to the mystery Alpha-E domain. In any case, all systems contain at least one peptide ligase, the Alpha-E protein and one peptidase-like domain. The only evidence for the system's biological context comes from experiments in Pseudomonas putida, where the transglutaminase is highly expressed upon nitrogen starvation. Several protein/peptide conjugation systems contain peptide ligases (e.g. the ubiquitin transferring enzymes, the Pup ligases) as well as deconjugating enzymes (e.g. JAB deubiquitinase and Dop depupylase) in the same gene context (for a comprehensive set of examples, read our paper on amidoligases).
However, assembling the pieces of the puzzle, we can be sure of a few things:
  1. This is not involved in amino acid or glutathione biosynthesis. The species containing this system typically have intact pathways for glutathione or amino acid biosynthesis. Also there are no other genes suggestive of metabolic function in the neighborhood.
  2. It is not involved in the biosynthesis of a distinctive secondary metabolite such as an antibiotic or siderophore, for it lacks characteristic associations seen in these systems (see examples in our study of such systems).
  3. There is no evidence of a small protein that is conjugated to a target as in ubiquitination or pupylation.
Gene neighborhoods of the novel system described in this post
Thus the system appears to be a novel peptide transfer/peptidase system with the Alpha-E protein playing a central role. We postulate that the ATP-grasp and COOH-NH2 ligase in this system catalyze two distinct peptide bond formations. It is tempting to speculate that the Alpha-E protein, with its highly conserved ER motifs, serves as a substrate for elongation of a peptide via the gamma-carboxylate of a glutamate side chain. This proposal is consistent with the use of glutamate side chains as substrates in eukaryotic proteins such as tubulin by peptide-tagging ATP-grasp enzymes. The presence of two peptidase genes in most of these operons suggests that two successive peptidase reactions are necessary for removal of the peptide product.
Alternatively, the transglutaminase superfamily protein might indeed function in cross-linking the peptide to lysine side chains or other amino groups. Thus, the weight of the contextual evidence supports a role for this widespread conserved gene neighborhood in peptide synthesis; the resulting peptide could be added as a tag to the unique Alpha-E protein in this system. Such a tag could regulate the assembly of complexes of the Alpha-E domain protein via cross-linking, regulate its interactions (e.g. as in tubulin), or serve as an amino acid storage mechanism. As you can see, certain details of this interesting pathway still need further investigation, but its widespread presence suggests that an important and exciting piece of biology awaits creative experimentalists...
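
For readers who want to screen gene neighborhoods in their own genomes, the minimal composition described above (at least one active peptide ligase, the Alpha-E protein and one peptidase-like domain) can be written as a simple check. This is only a rough sketch: the domain labels and function names are hypothetical, chosen for illustration, and it assumes each neighborhood has already been annotated as a list of domain labels by some upstream tool.

```python
# Hypothetical domain labels for illustration; they are not the output of any
# particular annotation tool. The groupings follow the post's description.
ALPHA_E = "Alpha-E"
ACTIVE_LIGASES = {"ATP-grasp", "COOH-NH2_ligase"}
PEPTIDASE_LIKE = {
    "Transglutaminase",
    "Zincin_metallopeptidase",
    "GAT-I",
    "NTN_hydrolase",
}


def summarize(domains):
    """Group the domains found in one gene neighborhood by their role."""
    found = set(domains)
    return {
        "alpha_e": ALPHA_E in found,
        "ligases": sorted(found & ACTIVE_LIGASES),
        "peptidase_like": sorted(found & PEPTIDASE_LIKE),
    }


def is_minimal_system(domains):
    """True if the neighborhood has Alpha-E, an active ligase and a peptidase-like domain."""
    summary = summarize(domains)
    return bool(
        summary["alpha_e"] and summary["ligases"] and summary["peptidase_like"]
    )


# Domain content of Rv2409c, Rv2410c and Rv2411c as listed in the post.
print(is_minimal_system(["Transglutaminase", "Alpha-E", "ATP-grasp"]))

# An inactive ATP-grasp is deliberately given its own label, so it does not
# count as an active ligase.
print(is_minimal_system(["Alpha-E", "Inactive_ATP-grasp", "Transglutaminase"]))
```

Giving the inactive ATP-grasp a separate label is a design choice of this sketch; it keeps the "at least one peptide ligase" criterion restricted to ligases predicted to be catalytically active.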