On the Gibbs front, my new analysis of the coding-sequence DUSs did find them, in both replicates. The trick was to reduce the expected number of occurrences (not something I would have predicted). I may try that with other difficult searches. My analyses of the whole genome are still running. I hope they don't run over the 36 hours I specified when I put them in the Westgrid queue, because if they do they'll just be aborted and I'll have to start them over again.
On the Perl simulations front, I've got the program running and used it to do the control simulations. The first controls use random sequences of the same lengths and base compositions as the concatenated H. influenzae or N. meningitidis intergenic sequences, run with matrices specifying the corresponding USS or DUS core but with no recombination. These controls tell us what the baseline USS or DUS score is for a genome that hasn't experienced any accumulation. The second controls use the real H. influenzae or N. meningitidis intergenic sequences instead of random sequences, and run for a long time to see how long the sequences take to degenerate to the predetermined baselines (i.e. to become randomized with respect to USS or DUS). The score isn't a very sensitive indicator of this degeneration, as the genome may still contain an excess of imperfectly matched cores even after the score has reached baseline, but I'll be able to tell this from the final analysis done at the end of the run.
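To make the first control concrete, here is a minimal sketch in Python (not the Perl program itself) of generating a random sequence with a chosen base composition and scoring it against a position matrix. Everything named here is an assumption for illustration: the function names, the toy 4-bp matrix, and in particular the scoring rule (the product of per-position weights in each window, summed over all windows), which may differ from what the real program does.

```python
import random

BASES = "ACGT"

def random_genome(length, composition, seed=None):
    """Return a random sequence whose bases are drawn with the given frequencies."""
    rng = random.Random(seed)
    weights = [composition[b] for b in BASES]
    return "".join(rng.choices(BASES, weights=weights, k=length))

def window_score(window, matrix):
    """Multiply the matrix weight of the observed base at each position (assumed rule)."""
    score = 1.0
    for pos, base in enumerate(window):
        score *= matrix[pos][base]
    return score

def genome_score(seq, matrix):
    """Sum the window scores over every position where the matrix fits."""
    width = len(matrix)
    return sum(window_score(seq[i:i + width], matrix)
               for i in range(len(seq) - width + 1))

# Hypothetical toy matrix favouring the 4-bp motif AAGT; a real USS or DUS
# core matrix would be longer and have its own weights.
TOY_MATRIX = [
    {"A": 10.0, "C": 1.0, "G": 1.0, "T": 1.0},
    {"A": 10.0, "C": 1.0, "G": 1.0, "T": 1.0},
    {"A": 1.0, "C": 1.0, "G": 10.0, "T": 1.0},
    {"A": 1.0, "C": 1.0, "G": 1.0, "T": 10.0},
]

if __name__ == "__main__":
    # Placeholder composition and length, not the real intergenic values.
    composition = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
    control = random_genome(10_000, composition, seed=1)
    print("baseline score:", genome_score(control, TOY_MATRIX))
```

Scoring several such random sequences with no recombination step gives a spread of baseline values to compare the evolved genomes against.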
After screwing up the settings many times this afternoon (e.g. specifying the N. meningitidis sequence and matrix but forgetting to change to the corresponding base composition), I realized that I could save myself a lot of wasted time by making two versions of the program, each with its own matrix and sequence files and with a settings file that specifies the appropriate genome size, base composition, and matrix and sequence files. So I did. All of the analyses I've planned will be simulating the evolution of either USS in H. influenzae intergenic sequences or DUS in N. meningitidis intergenic sequences, so now I just need to open the right folder.
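The same idea, keeping each genome's settings bundled together so they can't be mixed, can be sketched in Python as one settings record per organism. This is only an illustration: the file names, genome sizes and base compositions below are made-up placeholders, and the actual setup uses separate folders with their own settings files rather than anything like this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulationSettings:
    """Everything that must change together when switching organisms."""
    organism: str
    motif: str              # "USS" or "DUS"
    genome_size: int        # length of the concatenated intergenic sequence
    base_composition: dict  # frequency of each base, must sum to 1
    matrix_file: str
    sequence_file: str

    def check(self):
        """Raise ValueError if the base frequencies don't sum to 1."""
        total = sum(self.base_composition.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"{self.organism}: base composition sums to {total}")

# All values and file names below are hypothetical placeholders.
SETTINGS = {
    "H_influenzae": SimulationSettings(
        organism="H. influenzae",
        motif="USS",
        genome_size=200_000,
        base_composition={"A": 0.31, "C": 0.19, "G": 0.19, "T": 0.31},
        matrix_file="uss_core_matrix.txt",
        sequence_file="hi_intergenic.txt",
    ),
    "N_meningitidis": SimulationSettings(
        organism="N. meningitidis",
        motif="DUS",
        genome_size=300_000,
        base_composition={"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
        matrix_file="dus_core_matrix.txt",
        sequence_file="nm_intergenic.txt",
    ),
}

def load_settings(name):
    """Fetch one organism's settings as a unit and validate them."""
    settings = SETTINGS[name]
    settings.check()
    return settings
```

Because the matrix, sequence, genome size and base composition travel together, picking `load_settings("N_meningitidis")` can't accidentally pair the N. meningitidis matrix with the H. influenzae base composition.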