I'm working on the revisions for the USS paper, writing up the new analysis of reading frames and constraints. I put the data in the table on the left; the "Relative to best codons" values estimate how easily the USS-encoded tripeptide would be translated. I'm puzzling over why only 49 USSs are in reading frame A.
The graph below shows the number of USSs plotted as a function of the postulated cause, the number of occurrences of the tripeptide in the proteome (the axes are swapped relative to the plot in my previous post on this analysis). This plot shows more clearly that one point (frame A) is an outlier; it has far fewer USSs than we would expect given how often the tripeptide it specifies appears in the proteome.
Could this be because of codon constraints, i.e. because USSs in frame A require the tripeptide KVR to be encoded by inconvenient codons? No, the codon score is quite high (0.77), meaning that the codons specified by the USS in this frame are commonly used in the proteome.
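For readers who want to see the kind of calculation behind a "relative to best codons" value, here is a minimal sketch. It assumes the score is the average, over the three codons, of each codon's usage divided by the usage of the most-used synonymous codon for the same amino acid; that definition, the function names, and the usage numbers are illustrative assumptions, not the exact procedure or data behind the table.

```python
from statistics import mean

def relative_to_best(amino_acid, codon, usage_by_aa):
    """Usage of `codon` divided by the usage of the most-used
    synonymous codon for the same amino acid (1.0 = best codon)."""
    synonyms = usage_by_aa[amino_acid]
    return synonyms[codon] / max(synonyms.values())

def tripeptide_codon_score(encoded, usage_by_aa):
    """Average relative-to-best value over (amino_acid, codon) pairs.
    Averaging is an assumption; a product would penalise rare codons more."""
    return mean(relative_to_best(aa, codon, usage_by_aa) for aa, codon in encoded)

# PLACEHOLDER usage frequencies (per thousand codons), invented for illustration.
usage_by_aa = {
    "K": {"AAA": 40.0, "AAG": 10.0},
    "V": {"GTG": 10.0, "GTT": 20.0},
}

score = tripeptide_codon_score([("K", "AAA"), ("V", "GTG")], usage_by_aa)
print(score)
```

A score near 1 means the USS forces only codons that the genome already prefers, so codon usage alone would not discourage USSs in that frame.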
If anything, there should be more USSs in this frame than predicted by the proteome abundance of its tripeptide, because the USS consensus is weak at the first and second positions of the first codon and the second position of the second codon, so those positions constrain the encoded amino acids less.