The question the data address is why uptake sequences in coding regions are preferentially located in particular reading frames, and specifically whether the codons they use affect how efficiently the mRNA can be translated. I had used the overall frequencies of different codons in the proteome (all the proteins the genome encodes) to estimate codon 'quality': I summed the frequencies of the three codons specified by the uptake sequence in a specific reading frame, and divided this by the sum of the frequencies of the best (most-used) codons for the same amino acids. That analysis showed no relationship between codon quality and the reading frame bias.
In retrospect I should have multiplied the frequencies rather than adding them. This both gives a more realistic approximation of how the codons' effects interact, and eliminates the differences caused by the differing frequencies of different amino acids in the proteome. I should also have excluded those uptake sequence positions that allowed more than one codon.
And I probably should have done the analysis for several species rather than only for H. influenzae (though, to be fair to myself, at the time I was working on an H. influenzae USS analysis).
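To make the difference between adding and multiplying concrete, here is a minimal Python sketch of the two scores. It is an illustration under stated assumptions, not the script used for the analysis: the codon frequencies are invented (they are not taken from any genome), the function and variable names are hypothetical, and the caller is assumed to pass only codons at positions where the uptake sequence allows a single codon.

```python
from math import prod

# Invented codon frequencies (per thousand codons), for illustration only.
# These are NOT real H. influenzae values. The codon-to-amino-acid
# assignments follow the standard genetic code.
CODON_FREQ = {
    "Lys": {"AAA": 50.0, "AAG": 10.0},
    "Cys": {"TGT": 6.0, "TGC": 4.0},
    "Gly": {"GGT": 30.0, "GGC": 20.0, "GGA": 10.0, "GGG": 10.0},
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, family in CODON_FREQ.items() for codon in family}


def codon_and_best(codons):
    """Pair each codon's frequency with that of the best codon for its amino acid."""
    pairs = []
    for codon in codons:
        family = CODON_FREQ[CODON_TO_AA[codon]]
        pairs.append((family[codon], max(family.values())))
    return pairs


def additive_quality(codons):
    """Original score: sum of codon frequencies over sum of best-codon frequencies."""
    pairs = codon_and_best(codons)
    return sum(freq for freq, _ in pairs) / sum(best for _, best in pairs)


def multiplicative_quality(codons):
    """Revised score: product of each codon's frequency relative to its best codon."""
    return prod(freq / best for freq, best in codon_and_best(codons))


if __name__ == "__main__":
    frame = ["AAG", "TGC", "GGT"]  # one hypothetical reading frame
    print(additive_quality(frame))
    print(multiplicative_quality(frame))
```

With these made-up numbers the additive score is dominated by the abundant glycine codon, so two poor codons barely lower it. In the multiplicative score each codon is first scaled to the best codon for its own amino acid, so a rare amino acid counts as much as a common one.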
So now I've redone the analysis for the three species that exemplify the different uptake sequence types (H. influenzae, A. pleuropneumoniae and N. meningitidis), and find a substantial effect of codon quality. That is, in each species, uptake sequences are mostly found in reading frames that allow use of the most common codons for those amino acids. As codon frequency is thought to reflect the abundance of the corresponding tRNAs, this means that differences in the efficiency of translation explain some of the differences in reading frame use.