One of the more fundamental mysteries in the evolution of life concerns the emergence of the translation release factors. Release factors are the proteins that sever the terminal tRNA-aminoacyl bond at the ribosome, enabling the completed protein chain to diffuse away from the ribosome. Release factors accomplish this task by adopting a structural shape that mimics a tRNA, allowing them to access and interact with the stop codon in the ribosome. On gaining access to the ribosome, the release factor catalyzes hydrolysis of the tRNA-aminoacyl bond by coordinating a water molecule with an absolutely conserved glutamine residue.
Despite this crucial and universal function, no single release factor can be traced to the last universal common ancestor (LUCA) of life: in fact, the principal release factor proteins of the bacterial and archaeo-eukaryotic lineages belong to two entirely distinct protein folds. Two competing, similarly parsimonious evolutionary scenarios can account for this observation: 1) one of the two release factor folds was present in the LUCA and was later displaced in one of the lineages, or 2) the two versions emerged independently in the two lineages, each displacing the ancestral release factor. In the latter scenario, the ancestral release factor could have been a tRNA or tRNA-related ribozyme, consistent with other RNA-world hypotheses.
To shed light on the question of early release factor diversification, we specifically investigated the evolutionary history of the archaeo-eukaryotic release factors (aeRF1s) [see Verma et al.]. Through this analysis, we identified a pair of novel clades in the aeRF1 superfamily, both of which surprisingly had a substantial bacterial component. One of these clades contained four families with an unusually complicated evolutionary history: the earliest-branching family is found only in archaea and retains the core architectural features of the classical archaeal aeRF1s, suggesting that it arose from an ancient duplication of the classical versions. At some point, representatives of this family were transferred to a terminally differentiated bacterial lineage, eventually giving rise to two distinct families. One of these families, found primarily in Bacteroidetes, was then acquired early in the evolution of eukaryotes, giving rise to the final family in the clade. This eukaryotic family is sporadically distributed across several lineages, but was fixed early in the crown group eukaryotes (plants-fungi-amoebozoa-animals) as the central catalytic core of the Vms1/ANKZF1-like proteins.
Through a collaboration with the Deshaies laboratory, this family was characterized in a recent publication in the journal Nature as the key missing release factor of the ribosome quality control (RQC) pathway [Verma et al.], the pathway that rescues “jammed” ribosomes that are stalled on an mRNA with the growing peptide chain still attached. Because of shared sequence and domain architectural features, we suspect that the prokaryotic families of this clade (named the VLRF1 clade, for Vms1-like aeRF1) are also involved in the clearance of stalled ribosomes.
The second clade we identified contained a total of 14 previously unrecognized families found across a diverse assortment of bacterial lineages (named the baeRF1 clade, for bacterial aeRF1). Despite the monophyly of these families, they show a wide range of structural, domain architectural, and sequence diversity, suggesting that they are under considerable selective pressure. Perhaps most notably, the characteristic loop region of the aeRF1 superfamily, which typically houses the active site glutamine residue, varies tremendously in length and content both across and within the baeRF1 families; many families are even predicted to be catalytically inactive because they lack a strongly conserved glutamine residue. While these families remain functionally uncharacterized, one strongly conserved genomic contextual association was consistently observed across several families: a genomic association with an HPF-like ribosome hibernation factor. HPF-like domains are known to interact directly with the translational machinery and to induce a conformational change in the ribosome that promotes its inactivation. The association between baeRF1 and HPF-like domains could indicate that baeRF1 proteins play a complementary role in inducing ribosome inactivation, potentially by occupying the typical tRNA binding sites on the ribosome (consistent with the inactivation of the enzyme in most families). Alternatively, the association could act as a regulatory switch, with the baeRF1 displacing HPF and restoring ribosome function. Given the rapidly evolving features of the baeRF1 clade, it seems likely that this function is tied to bacterial conflict. As such, baeRF1 could be activated in response to the detection of invasive elements, potentially to prevent the element from hijacking the endogenous translation machinery.
While these findings reveal a previously underappreciated complexity in the evolution of the aeRF1 superfamily and resolve one of the last remaining mysteries of the RQC pathway, they ultimately reveal little about the state of the release factor in the LUCA. It may be tempting to suggest that the VLRF1 and baeRF1 clades represent the surviving remnants of an ancestral bacterial aeRF1 presence that was displaced early in the bacterial lineage by the bacterial-specific release factor fold. However, our analysis indicates that both clades likely emerged from later transfers from a classical archaeal aeRF1 progenitor.
The most striking evolutionary finding from this analysis is the clear acquisition of the eukaryotic Vms1/ANKZF1-like release factors from the Bacteroidetes lineage. This observation adds to a growing list of key eukaryotic factors that have their direct antecedents in the Bacteroidetes, suggesting that an important complement of genetic material in eukaryotes was likely inherited early in eukaryotic evolution from a Bacteroidetes symbiont, independent of the α-proteobacterial mitochondrial progenitor.