Friday, December 31, 2010

A 40-year-old mystery: The identity of the wybutosine hydroxylase and other questions

Modified bases are particularly prevalent at position 37 of tRNA. Being adjacent to the anticodon, these modifications stabilize mRNA–tRNA pairing and help maintain the reading frame during translation. One such complex modified base found at this position in eukaryotic phenylalanine tRNA (tRNAPhe) is wybutosine (also called Y-base or yW). Over the past 40 years, various studies identified the intermediates and enzymes involved in its biosynthesis. Additionally, it was shown that precursors of this modification pathway are present in archaea, suggesting an archaeal origin for this modification.

Although wybutosine is detected in diverse eukaryotes, this base position in tRNAPhe shows considerable variation. For example, tRNAPhe in yeast contains wyosine in the same position, whereas flies only have 1-methylguanosine. This type of variation can be attributed to gene loss, given that 1-methylguanosine and wyosine are precursors in the wybutosine biosynthesis pathway. This is also supported by the phyletic distribution of the enzymes of this biosynthesis pathway in these organisms. In contrast, mammalian liver extracts and Geotrichum were shown to contain a further modification, hydroxy/hydroperoxywybutosine, suggesting the presence of a distinct enzyme that catalyzes this step. Until recently, the identity of this hydroxylase was not known.

Using a combination of sequence and contextual analysis, we identified the enzymatic domain involved in the biosynthesis of hydroxy/hydroperoxywybutosine. The domain is often fused to enzymes involved in the biosynthesis of wybutosine precursors and also occurs as a stand-alone domain in metazoans (e.g. C2orf60 in humans). What is remarkable is that it turned out to be a member of the JOR (jumonji-related)/JmjC superfamily. Members of this superfamily are normally characterized as hydroxylases of proteins or histone demethylases. This is the first example of an RNA substrate for a member of this superfamily. A few months after our publication, this prediction was experimentally confirmed. The JOR/JmjC superfamily belongs to a lineage of proteins called the 2-oxoglutarate Fe(II)-dependent dioxygenases, or 2OGFeDO. In turn, the core of this lineage belongs to the double-stranded beta-helix (DSBH) fold.

This discovery actually raised more questions:

Q1. Are RNA substrates ancestral to the JOR/JmjC superfamily or were they derived only in the wybutosine hydroxylase family?
Answer: The use of an RNA substrate as in the wybutosine hydroxylase appears to be a derived condition in this superfamily of proteins.

Q2. What are the inter-relationships between the various JOR/JmjC families?
Answer: All studies until now used only eukaryotic members for phylogenetic reconstruction of the evolutionary relationships between the various JOR/JmjC families. Using a comprehensive sequence-, structure- and phylogeny-based approach that included bacterial sequences, we show that the eukaryotes contain 17 major lineages of JOR/JmjC proteins, which were in turn acquired on three distinct occasions from bacteria. Thus, the major groups of JOR/JmjC appear to have diversified in bacteria, followed by a transfer of at least one member from each of the three clades to the eukaryotes prior to the divergence of the heterolobosean-kinetoplastid clade and the remaining eukaryotes. The three major clades are named the histone demethylase-like, the FIH1/yW-hydroxylase-like and the MINA/No66-like clade, respectively.

In addressing this, we went a few steps further and were able to classify the entire double-stranded beta-helix fold. Here's an interactive site where you can play with our classification. This led to some interesting hypotheses about their evolution.
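For readers who think in data structures, the nesting described in this post can be written down as a small tree. This is only a rough sketch: the labels are taken from the text above, the 17 eukaryotic lineages are not enumerated here, and the `CLASSIFICATION` dictionary and `lineage` helper are hypothetical names made up for illustration, not part of the interactive site or the supplement.

```python
# Hypothetical sketch of the nesting described in this post.
# Labels come from the text; they are not database identifiers.
CLASSIFICATION = {
    "DSBH fold": {
        "2OGFeDO": {
            "JOR/JmjC": {
                "histone demethylase-like": {},
                "FIH1/yW-hydroxylase-like": {},
                "MINA/No66-like": {},
            }
        }
    }
}


def lineage(tree, target, path=()):
    """Return the path from the root to `target` as a tuple, or None if absent."""
    for name, children in tree.items():
        here = path + (name,)
        if name == target:
            return here
        found = lineage(children, target, here)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    # Print the chain of groups leading down to one of the three clades.
    print(" > ".join(lineage(CLASSIFICATION, "FIH1/yW-hydroxylase-like")))
```

Looking up any of the three clade names returns the chain of groups it sits in, from the fold down to the clade.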

Q3. What were the roles of the bacterial ancestors of the eukaryotic JOR/JmjC?
Answer: Quite consistently, we observe that bacterial representatives of this superfamily are encoded by gene clusters involved in the biosynthesis of secondary metabolites, such as pyoverdine-like siderophores and peptide antibiotics. These gene clusters often encode multiple functionally linked dioxygenases, tryptophan halogenase-like oxidoreductases and other enzymes involved in non-ribosomal peptide biosynthesis and modification. Some of these contexts are quite remarkable. For example, in the Synechococcus phage Syn9, one of these gene clusters encodes 10 tandem dioxygenases, including the MINA/No66 homolog Syn9-gp49. The remaining nine dioxygenases belong to the classical 2OGFeDO superfamily. Analysis of these nine dioxygenases suggests that they are not all closely related: they belong to at least five distinct families, including one distinguished by a fusion to tetratricopeptide repeats.

Q4. Are there any other distinct substrates predicted for the eukaryotic JOR/JmjC superfamily?
Answer: Did you know that there are members of the JOR/JmjC superfamily that are membrane-associated or secreted? In each of the three major clades in eukaryotes, we detected secreted or membrane-associated proteins with JOR/JmjC domains. Some members of the FIH1 clade are fused to sulfotransferases. A distinct lineage-specific expansion of MINA/No66-like JOR/JmjC proteins in Monosiga comprises receptor-like proteins with extracellular JOR/JmjC domains. All of these proteins combine a JOR/JmjC domain with one or more of several extracellular domains, such as cysteine-rich GCC2/3 repeats, immunoglobulin, disintegrin or SUSHI domains, and with intracellular SH2 or tyrosine kinase domains. These extracellular proteins appear to have been recruited for modifying cell surface proteins, probably as hydroxylases similar to leprecan and the prolyl hydroxylase. Further, the receptor-like proteins in Monosiga could also function as sensors of redox conditions that signal via intracellular tyrosine phosphorylation pathways.

Finally, a common misconception is that the N-terminal region of the JOR/JmjC domain, called JmjN in the literature, is a distinct domain. Structural analysis shows this region to be conserved in all members of the JOR/JmjC superfamily. Further, it has no independent existence and merely represents a structural extension of the DSBH fold.

Wait, there is much more... You can read about it here in detail. Feel free to browse the comprehensive supplement.