(I should be thinking about DNA uptake projects but instead I'm still trying to understand why so few uptake sequences accumulate in our simulations.)
I'm (at last) trying the approach I learned from Dick Lewontin when I was a beginning post-doc. When you're trying to understand a confusing situation where multiple factors are influencing the outcome, consider the extreme cases rather than trying to think realistically. Create imaginary super-simple situations where most of the variables have been pushed to their extremes (i.e. always in effect or never in effect). This lets you see the effect of each component in isolation much more clearly than if you just held the other variables constant at intermediate values. Then you can go back and apply the new insights to a more complex but more realistic situation.
So I've created versions of our model where the input DNA consists of just end-to-end uptake sequences, where the genome doesn't mutate (the fragments being recombined do have mutations), where the probability of recombination is effectively 1.0 for fragments with perfect uptake sequences and almost zero for fragments with less-than-perfect ones, and where this probability doesn't change during the run. The genome is small (2 kb) to speed up the run and the fragments are only 25 bp, so recombination doesn't bring in a lot of background mutation.
These runs reach equilibria with 45-50 uptake sequences in their 2 kb genomes. If they had one uptake sequence every 25 bp they would have 80 in the genome. Maybe this is as good as it can get - it's certainly way better than what we had previously.
Eliminating the background mutation in the genome makes a surprisingly large difference; a run with it present had only 16 perfect uptake sequences. I wonder if, with this eliminated, the amount of the genome recombined per generation now has much less effect on the final equilibrium? In the runs I've just done, fragments equivalent to 100% of the genome are recombined, replacing about 63% of the genome in each cycle. So I've now run one with half this recombination - it reaches the same equilibrium.
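The "about 63%" figure can be checked with a little arithmetic. This is only a sketch under assumptions that are mine, not necessarily what our model does: that the genome is circular, and that each 25 bp fragment lands at an independent, uniformly random position. The function names are made up for illustration.

```python
import random


def expected_fraction_replaced(genome_len, frag_len, n_frags):
    """Expected fraction of genome positions hit by at least one fragment.

    Assumes a circular genome and independent, uniformly random fragment
    positions (illustrative assumptions, not taken from the real model).
    """
    # Chance that one fragment misses a given position.
    p_miss_one = 1 - frag_len / genome_len
    # Chance that all fragments miss it, then take the complement.
    return 1 - p_miss_one ** n_frags


def simulated_fraction_replaced(genome_len, frag_len, n_frags, rng):
    """Drop fragments at random start positions and count covered positions."""
    covered = [False] * genome_len
    for _ in range(n_frags):
        start = rng.randrange(genome_len)
        for offset in range(frag_len):
            covered[(start + offset) % genome_len] = True
    return sum(covered) / genome_len


# 80 fragments of 25 bp add up to 100% of a 2 kb genome; 40 add up to 50%.
full = expected_fraction_replaced(2000, 25, 80)
half = expected_fraction_replaced(2000, 25, 40)

rng = random.Random(1)
trials = [simulated_fraction_replaced(2000, 25, 80, rng) for _ in range(200)]
simulated_mean = sum(trials) / len(trials)

print(f"full recombination: {full:.3f}, half: {half:.3f}, "
      f"simulated full: {simulated_mean:.3f}")
```

Under these assumptions the formula gives roughly 0.63 for the full amount of recombination and roughly 0.40 for half of it, so halving the input DNA does not halve the fraction of the genome replaced, because the fragments overlap less.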
I had previously thought that the reason our runs reached equilibria that were independent of mutation rate was some complex effect of the weird bias-reduction component of our simulations. But in these new 'extreme' runs I had the bias reduction turned off, and I got the same equilibria with mutation rates of 0.01 and 0.001. I also tested a more extreme version of our position-weight matrix, where a singly mismatched uptake sequence was 1000-fold less likely to recombine rather than the usual 100-fold less, and this gave the same equilibrium as the usual matrix. So I tried using a weaker matrix, with only a 10-fold bias for each position. This gave about 30 uptake sequences in the 2 kb, a much less dramatic reduction than I had expected. Not surprisingly, there were also more mismatched uptake sequences at the equilibrium.
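To make the "fold bias" idea concrete, here is a minimal sketch of how a per-position bias might translate into a relative recombination weight. It assumes the simplest possible scheme, where every mismatch to the consensus divides the weight by the same factor; a real position-weight matrix can give different penalties to different positions and bases. The consensus string and the function name are placeholders for illustration only.

```python
# Placeholder consensus used only for illustration; the real matrix may
# cover a different length and weight each position differently.
CONSENSUS = "AAGTGCGGT"


def relative_recombination_weight(site, consensus, bias):
    """Weight of a site relative to a perfect match (perfect match = 1.0).

    Assumes each mismatch independently reduces the weight by `bias`-fold,
    which is a simplification of a full position-weight matrix.
    """
    if len(site) != len(consensus):
        raise ValueError("site and consensus must be the same length")
    mismatches = sum(1 for a, b in zip(site, consensus) if a != b)
    return float(bias) ** (-mismatches)


perfect = relative_recombination_weight("AAGTGCGGT", CONSENSUS, 100)
one_off_usual = relative_recombination_weight("AAGTGCGGA", CONSENSUS, 100)
one_off_strong = relative_recombination_weight("AAGTGCGGA", CONSENSUS, 1000)
one_off_weak = relative_recombination_weight("AAGTGCGGA", CONSENSUS, 10)
two_off_weak = relative_recombination_weight("TAGTGCGGA", CONSENSUS, 10)

for label, weight in [
    ("perfect", perfect),
    ("1 mismatch, 100-fold", one_off_usual),
    ("1 mismatch, 1000-fold", one_off_strong),
    ("1 mismatch, 10-fold", one_off_weak),
    ("2 mismatches, 10-fold", two_off_weak),
]:
    print(f"{label}: {weight:g}")
```

In this simplified scheme, a doubly mismatched site under the 10-fold matrix gets the same weight as a singly mismatched site under the usual 100-fold matrix, which fits with seeing more mismatched uptake sequences persist when the matrix is weaker.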
This is excellent progress for two reasons. First, I now have a much better grasp of what factors determine the outcomes of our more realistic simulations. Second, I now know how to generate evolved genomes with a high density of uptake sequences, which we can analyze to determine how the uptake sequences are spaced around the chromosome.