The overall goal of the work the post-doc and I describe in our Defining the USS manuscript is to find out whether the sequence specificity of the H. influenzae DNA uptake machinery matches the consensus of the uptake signal sequence (USS) repeats in its genome. We've done a good job of characterizing both the uptake specificity and the USS consensus, but now we're discovering that comparing them is trickier than we had realized.
The figure shows both results, with the genome consensus shown as a SequenceLogo at the bottom, and the effect on uptake of changing each position shown by the bar chart above. Because all the positions in the core (the AAGTGCGGT at the left) have a very strong consensus in the genome, we expected that changing any of these positions would dramatically decrease uptake. But instead we see that changes to different core positions have very different effects. Changing the first position doesn't decrease uptake at all (within our limit of detection) but changing any of the central three knocks it way down.
One confounding factor comes from how the genome consensus is described: the Y-axis of the logo is not a linear scale of base frequency but the 'information content' of each position, measured in bits, which is a logarithmic function of the base frequencies. Another may come from how the uptake effects were measured; cells were deliberately given less DNA than needed to saturate the uptake machinery.
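To make the point about the logo's scale concrete, here is a minimal sketch of how the height of one logo position relates to the frequency of its consensus base. It assumes the standard definition of information content for DNA (2 bits minus the Shannon entropy), equal background frequencies for the four bases, and no small-sample correction; the function names and the example frequencies are made up for illustration and are not taken from our data.

```python
import math

def information_content(freqs):
    """Information content (bits) of one DNA position: 2 - Shannon entropy.

    Assumes equal background base frequencies and no small-sample correction.
    """
    entropy = -sum(p * math.log2(p) for p in freqs if p > 0)
    return math.log2(4) - entropy

def consensus_position(p_consensus):
    """Hypothetical position: consensus base at p_consensus, the rest split evenly."""
    rest = (1 - p_consensus) / 3
    return [p_consensus, rest, rest, rest]

# Illustrative frequencies only: the logo height is not proportional
# to the frequency of the consensus base.
for p in (0.25, 0.5, 0.7, 0.9, 0.98):
    bits = information_content(consensus_position(p))
    print(f"consensus frequency {p:.2f} -> {bits:.2f} bits")
```

Because the mapping from frequency to bits is nonlinear, the heights of the logo letters can't be compared directly, by eye, with the heights of the uptake bars.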
But the biggest complication is that, although we have a simple 'molecular drive' hypothesis describing how biased DNA uptake leads to accumulation of the preferred sequence in the genome, we are only beginning to develop ways to evaluate the different components of this model. This means that we can't predict exactly what sequences will accumulate in response to any specific uptake bias.
The post-doc describes the discrepancy between the two parts of the figure as maybe resulting from 'saturation' of the evolutionary process causing USS accumulation. If a small uptake bias acting over millions of generations is enough to drive the preferred base at a particular position to a very high frequency, then a larger bias may not make much difference to the outcome. For example, say USSs with a C or G or T at position 1 are taken up 95% as well as a USS with the consensus A at that position, but this bias provides sufficient long-term drive to cause 98% of the USS population to have As at that position. If so, increasing the bias might not increase the frequency of As by very much.
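The saturation argument can be sketched with a toy calculation. This is not our Perl simulation model, just a deliberately simple deterministic model in which every parameter is an assumption: each generation the base at one USS position mutates at a hypothetical rate `mutation_rate`, and a hypothetical fraction `replacement_rate` of USS copies is replaced by DNA from an uptake pool in which non-consensus versions are taken up at `relative_uptake` times the efficiency of the consensus version.

```python
def equilibrium_consensus_frequency(relative_uptake, mutation_rate=1e-5,
                                    replacement_rate=1e-2, generations=200_000):
    """Long-run frequency of the consensus base at one USS position.

    Toy model with made-up parameters: symmetric mutation among the four
    bases, followed by replacement from a pool biased by uptake efficiency.
    """
    freq = 0.25  # start with no preference for the consensus base
    for _ in range(generations):
        # Mutation: consensus is lost at mutation_rate; each non-consensus
        # base mutates to the consensus at one third of that rate.
        freq = freq * (1 - mutation_rate) + (1 - freq) * mutation_rate / 3
        # Uptake: consensus fraction among the fragments actually taken up.
        pool = freq / (freq + (1 - freq) * relative_uptake)
        # Drive: a fraction of genomic copies is replaced from that pool.
        freq = (1 - replacement_rate) * freq + replacement_rate * pool
    return freq

# Compare no bias, a weak bias (95% as good), and much stronger biases.
for w in (1.0, 0.95, 0.5, 0.05):
    f = equilibrium_consensus_frequency(w)
    print(f"relative uptake of non-consensus = {w:.2f} -> consensus frequency {f:.4f}")
```

With these invented parameters, a non-consensus base taken up 95% as well as the consensus is already enough to push the consensus to roughly 98% of copies, and making the bias far stronger can only move the frequency through the small remaining gap. The real values of the mutation and replacement rates are unknown, so the numbers show only the shape of the argument, not a prediction.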
So what will we say in our manuscript? We could start by discussing the implications for the role of the USS in DNA uptake. We'll point out that uptake is very sensitive to the central positions of the core. Even if the other positions in the USS (core and flanking AT-rich segments) are all perfectly matched to the genome consensus, changing any one of these central bases drastically reduces uptake. This suggests that these positions make very important contacts with the uptake machinery. Changes at any of five other core positions reduce uptake to 20-60% of the control (perfect USS), so these also probably make important contacts. Changing the initial A, or pairs of bases in the flanking segments, reduces uptake only modestly, suggesting that the functions of these in uptake are dispensable if the rest of the USS is perfect.
We could go on to discuss the implications for accumulation of USSs in the genome, as I do above, and end by pointing out that what's needed is a better understanding of the consequences of molecular drive (and maybe put in a plug for our Perl simulation model).