I've made some more progress on the problem of how to score USSs in simulated DNA fragments, and written it up in our work-in-progress document describing how our model works. But here I want to get back to thinking about how mutagenesis can be set up to allow control of both the base composition and the ratio of transition mutations to transversion mutations. The problem was introduced in a previous post.
I have a printout of the analysis done for us by my biomathematician colleague. I'll try to summarize here what I think it says, and then I'll ask her to look at this post and tell me if I've got it wrong.
Before our program does any mutagenesis it will first need to calculate the values of three parameters, alpha, beta and gamma (spelled out because Blogger doesn't do Greek letters). These parameters specify the rates at which each base mutates to each of the other three bases, as indicated in the table ("Blaisdell's 1985 mutation matrix"). In our model we will normalize these rates by dividing by their sums, as indicated below the table, so we can use them as probabilities.
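The normalization step can be sketched roughly as follows. This is only an illustration in Python (not the Perl we plan to write), and it assumes that "dividing by their sums" means dividing each base's three rates by their own total. The function name `normalize_rates` and the numbers in `example_rates` are made-up placeholders, not values from Blaisdell's matrix.

```python
def normalize_rates(rates):
    """Convert per-base mutation rates into probabilities.

    `rates` maps each source base to a dict of {target base: rate}.
    Each source base's rates are divided by their sum, so the three
    probabilities for that base add up to 1.
    """
    probs = {}
    for src, targets in rates.items():
        total = sum(targets.values())
        probs[src] = {dst: rate / total for dst, rate in targets.items()}
    return probs


# Placeholder rates for illustration only (NOT taken from the real table).
example_rates = {
    "A": {"G": 2.0, "C": 1.0, "T": 1.0},
    "G": {"A": 2.0, "C": 1.0, "T": 1.0},
    "C": {"T": 2.0, "A": 1.0, "G": 1.0},
    "T": {"C": 2.0, "A": 1.0, "G": 1.0},
}

example_probs = normalize_rates(example_rates)
```

With these placeholder rates, each base's probabilities sum to 1, which is what lets the table be used directly for choosing a replacement base.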
The values of alpha, beta and gamma will depend on the three input parameters we give to the model; these are described in the blue box. The formulas in the green box were derived by our colleague; we will write Perl code that uses them to calculate alpha, beta and gamma from G, µ and R. The program will then put these values into the table. Then, at each mutation step, the program will determine whether the mutating base is an A, T, C or G, and look up the appropriate probabilities of the bases it can mutate to.
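The lookup at each mutation step might look something like the sketch below. It is written in Python for illustration rather than in Perl, and it does not attempt the green-box formulas: the probability table `EXAMPLE_PROBS` is filled with made-up placeholder numbers standing in for the values that would be calculated from alpha, beta and gamma. The function name `mutate_base` is likewise hypothetical.

```python
import random

# Placeholder probabilities (NOT derived from alpha, beta and gamma).
# Each row gives the chance that the source base mutates to each other base.
EXAMPLE_PROBS = {
    "A": {"G": 0.5, "C": 0.25, "T": 0.25},
    "G": {"A": 0.5, "C": 0.25, "T": 0.25},
    "C": {"T": 0.5, "A": 0.25, "G": 0.25},
    "T": {"C": 0.5, "A": 0.25, "G": 0.25},
}


def mutate_base(base, probs, rng=random):
    """Pick the base that `base` mutates to, using its row of the table."""
    targets = list(probs[base])
    weights = [probs[base][t] for t in targets]
    # random.choices draws one target in proportion to the weights.
    return rng.choices(targets, weights=weights, k=1)[0]


# Usage: mutate one position of a short sequence.
rng = random.Random(42)          # seeded so runs are repeatable
seq = list("ATGCATGC")
pos = rng.randrange(len(seq))    # position chosen to mutate
seq[pos] = mutate_base(seq[pos], EXAMPLE_PROBS, rng)
```

Because each row only lists the other three bases, a base chosen for mutation always changes; whether and how often a position is chosen for mutation would be handled elsewhere in the program.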
If we were to begin the simulation with a genome that did not have the desired base composition, it would also not initially have the desired ratio of transitions to transversions. If no opposing forces were acting, the base composition and ratio would equilibrate at the desired values. This should not be an issue for us because our genomes will be created with the desired base compositions.
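The equilibration idea can be illustrated with a small numerical sketch, again in Python and again with placeholder numbers. It tracks the expected base composition, rather than an actual genome, under repeated rounds of mutation. The assumptions here are mine: `mu` is treated as the chance that any given base mutates in one round, and `EXAMPLE_PROBS` is a made-up table, not one computed from our real parameters.

```python
# Placeholder table (NOT from the real model); rows favour mutation toward A/T.
EXAMPLE_PROBS = {
    "A": {"G": 0.5, "C": 0.25, "T": 0.25},
    "T": {"C": 0.5, "G": 0.25, "A": 0.25},
    "G": {"A": 0.6, "C": 0.2, "T": 0.2},
    "C": {"T": 0.6, "G": 0.2, "A": 0.2},
}


def step_composition(comp, probs, mu):
    """Expected base composition after one round of mutation."""
    # A fraction (1 - mu) of each base stays unchanged.
    new = {b: comp[b] * (1.0 - mu) for b in comp}
    # The mutating fraction is redistributed according to the table.
    for src, targets in probs.items():
        for dst, p in targets.items():
            new[dst] += comp[src] * mu * p
    return new


def equilibrate(comp, probs, mu, tol=1e-12, max_steps=1_000_000):
    """Repeat rounds of mutation until the composition stops changing."""
    for _ in range(max_steps):
        nxt = step_composition(comp, probs, mu)
        if max(abs(nxt[b] - comp[b]) for b in comp) < tol:
            return nxt
        comp = nxt
    return comp


# Two very different starting compositions drift to the same equilibrium.
from_even = equilibrate({"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25},
                        EXAMPLE_PROBS, mu=0.05)
from_skewed = equilibrate({"A": 0.7, "T": 0.1, "G": 0.1, "C": 0.1},
                          EXAMPLE_PROBS, mu=0.05)
```

In this toy version the final composition depends only on the table, not on the starting composition, which is the behaviour described above for a genome evolving with no opposing forces.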