Now here’s an interesting thing: a completely unsuspected gene – as in, an open reading frame (ORF) that actually DOES something – in one of the best-studied families of plant viruses. From the International Service for the Acquisition of Agribiotech Applications (ISAAA)’s CropBiotech Update 30 May 2008:
Scientists Discover Hidden Gene in Major Plant Virus Family
The virus family Potyviridae includes more than 30 percent of known plant virus species, most of which are of great agricultural significance such as the potato virus Y, turnip mosaic virus and wheat streak mosaic virus. Scientists from the Iowa State University, working with colleagues from the University College Cork in Ireland, have discovered a tiny gene present in all members of this virus family. Without this gene, the viruses are harmless.
Using a gene-finding software, the team identified a stretch of nucleotide bases that overlaps with a much larger and well characterized gene in potyviruses. They called the new gene pipo (short for pretty interesting potyvirus ORF). Alterations in the sequence of the pipo gene, while leaving the polyprotein amino acid sequence unaltered, were found to be lethal for the viruses.
The team led by Allen Miller and John Atkins are now working to determine the function of the gene during infection as well as how the pipo protein is expressed from the viral genome. For this, the U.S. Department of Agriculture National Research Initiative (USDA-NRI) has awarded them with a $400,000 competitive grant.
For more information, visit http://www.public.iastate.edu/~nscentral/ Read the paper published by PNAS at http://www.pnas.org/cgi/reprint/105/15/5897
Nice one, guys…$400 000 should buy a few more ORFs…B-) Seriously, though, the dogma has been for years that potyviruses, like picornaviruses, have a single long (~10 kb) ORF, which is translated from the genomic RNA into a polyprotein that is cotranslationally processed into a number of different proteins – and that was all there was. This discovery is like finding a new and secret drawer in an old and familiar chest of drawers, or an extra pocket in your trousers. Or, as I did recently, that there were two interior lights in my car which I had not known of for six years…but I digress.
In the words of the authors:
“We report the discovery of a short ORF embedded within the P3 cistron of the polyprotein but translated in the +2 reading-frame. The ORF, termed pipo, is conserved and has a strong bioinformatic coding signature throughout the large and diverse Potyviridae family. Mutations that knock out expression of the PIPO protein in Turnip mosaic potyvirus but leave the polyprotein amino acid sequence unaltered are lethal to the virus. Immunoblotting with antisera raised against two nonoverlapping 14-aa antigens, derived from the PIPO amino acid sequence, reveals the expression of an ~25-kDa PIPO fusion product in planta. This is consistent with expression of PIPO as a P3-PIPO fusion product via ribosomal frameshifting or transcriptional slippage at a highly conserved G(1-2)A(6-7) motif [one or two Gs followed by six or seven As] at the 5′ end of pipo. This discovery suggests that other short overlapping genes may remain hidden even in well studied virus genomes (as well as cellular organisms)…”
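The knockout trick in that abstract, changing the overlapping ORF while leaving the polyprotein untouched, is easier to see on a toy example. What follows is only a minimal sketch: the 15-nucleotide sequence is invented for illustration and is not from any potyvirus, and I am assuming that "+2" simply means a frame starting two nucleotides downstream of the polyprotein frame. A single synonymous change in frame 0 (GGC to GGT, both glycine) puts a stop codon into the +2 frame.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate seq starting at offset `frame`; a trailing partial codon is dropped."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Hypothetical toy sequence (NOT a real viral sequence).
wild = "ATGGGCGAACTGAAA"
# Change the third base of the second codon: GGC -> GGT, still glycine.
mutant = wild[:5] + "T" + wild[6:]

print(translate(wild, 0), translate(mutant, 0))  # MGELK MGELK  (frame 0 unchanged)
print(translate(wild, 2), translate(mutant, 2))  # GRTE G*TE    (+2 frame now has a stop)
```

The point of the design is that the third position of a frame-0 codon is the first position of the overlapping +2 codon, so a "silent" change in one frame can be anything but silent in the other.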
They go on to tout the virtues of the “software package MLOGD”, which it turns out is from here (Firth AE, Brown CM (2006) Detecting overlapping coding sequences in virus genomes. BMC Bioinformatics 7:75), and is the Maximum Likelihood Overlapping Gene Detector. They say:
“Tests show that, from an alignment with just 20 mutations, MLOGD can discriminate non-overlapping CDSs from non-coding ORFs with a typical accuracy of up to 98%, and can detect CDSs overlapping known CDSs with a typical accuracy of 90%. In addition, the software produces a variety of statistics and graphics, useful for analysing an input multiple sequence alignment.”
And yes, it does make nice pictures: see this and this for examples.
All of which simply goes to reinforce my conviction that virus genomes may be generally quite small, but small does not necessarily mean simple. Small means having to compress information, reuse sequences – and overlap ORFs in unsuspected ways.
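For anyone who wants to poke at their own favourite genome, the crudest possible first pass is just to list stop-free stretches in each forward frame. To be clear, this is a naive sketch and is not what MLOGD does: MLOGD works on sequence alignments and mutation patterns, whereas the function below (the name `stop_free_runs` and the toy sequence are mine, purely for illustration) only looks for runs of codons without a stop in a single sequence.

```python
STOPS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq, frame, min_codons):
    """Return (start, end) nucleotide coordinates of stop-free runs in one frame.

    `end` is exclusive; runs shorter than `min_codons` codons are skipped.
    """
    runs = []
    run_start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOPS:
            if (pos - run_start) // 3 >= min_codons:
                runs.append((run_start, pos))
            run_start = pos + 3  # next run begins after the stop codon
        pos += 3
    if (pos - run_start) // 3 >= min_codons:
        runs.append((run_start, pos))
    return runs


# Hypothetical toy sequence (NOT a real viral sequence).
toy = "ATGAAATAGCCCGGG"
for frame in range(3):
    print(frame, stop_free_runs(toy, frame, min_codons=2))
# 0 [(0, 6), (9, 15)]
# 1 [(4, 13)]
# 2 [(2, 14)]
```

On a real genome most such stretches in the "wrong" frames are noise, which is exactly why a conservation-based signature of the kind the authors describe is needed to tell a gene like pipo from a chance stop-free run.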