The postdoc and I are back at it yet again, working on his paper about the sequence specificity of DNA uptake. I'm beginning to think there's something pathologically wrong, either with us or with this piece of research, because we never seem to get closer to finishing it. Instead, we just keep discovering more analyses that need to be done. (The part that's done gets better and better, but we seem to be no closer to submission.)
This time it's that we need a more rigorous comparison of the uptake-specificity motif his data has produced with the old 'genomic' motif we derived by analyzing the genome with the Gibbs Motif Sampler. Both motifs consist of numbers representing the probability of finding each of the four bases (A, G, C, T) at each position in a 32 bp segment. We've been saying and writing that, although these motifs have the same consensus, they differ greatly in how much importance they assign to each position. We have a list of four possible explanations for the differences, but before we discuss these we need to test whether the motifs actually pick out different subsets of the genome. Maybe all of the ~2500 sequences that would be found by searching for the genomic motif would also be found by searching for the less-constraining uptake motif. If so, we might then focus on what other sequences the uptake motif found, or, if it didn't find any, why not.
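To make the test concrete, here is a minimal sketch of one way to do the comparison: score every window of the genome against each motif and then compare the two sets of hit positions. Everything in it is an assumption for illustration, not what we have actually run: the function names, the log-odds scoring against a uniform background, the pseudocount, the cutoffs, and the toy 3 bp motifs standing in for the real 32 bp ones. It also scans only one strand of an uppercase sequence; a real search would need to cover the reverse complement too.

```python
import math

BASES = "ACGT"

def log_odds(pwm, background=0.25, pseudo=1e-3):
    """Turn per-position base probabilities into log2-odds scores.

    pwm is a list of dicts, one per motif position, mapping each base
    to its probability. The pseudocount avoids log(0).
    """
    return [
        {b: math.log2((col[b] + pseudo) / (1 + 4 * pseudo) / background)
         for b in BASES}
        for col in pwm
    ]

def find_hits(genome, pwm, cutoff):
    """Return the set of window start positions scoring >= cutoff."""
    scores = log_odds(pwm)
    width = len(scores)
    hits = set()
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        total = sum(scores[i][base] for i, base in enumerate(window))
        if total >= cutoff:
            hits.add(start)
    return hits

def compare_motifs(genome, genomic_pwm, uptake_pwm, genomic_cutoff, uptake_cutoff):
    """Partition hit positions into shared and motif-specific sets."""
    genomic = find_hits(genome, genomic_pwm, genomic_cutoff)
    uptake = find_hits(genome, uptake_pwm, uptake_cutoff)
    return {
        "both": genomic & uptake,
        "genomic_only": genomic - uptake,
        "uptake_only": uptake - genomic,
    }

# Toy data (hypothetical): the 'genomic' motif constrains all three
# positions, the 'uptake' motif leaves the last position unconstrained.
strict_a = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
strict_g = {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

toy_genome = "CCAAGCCAATCC"
result = compare_motifs(
    toy_genome,
    genomic_pwm=[strict_a, strict_a, strict_g],
    uptake_pwm=[strict_a, strict_a, uniform],
    genomic_cutoff=4.0,
    uptake_cutoff=3.0,
)
for label, positions in result.items():
    print(label, sorted(positions))
```

In this toy case the looser uptake motif recovers the single genomic hit plus one extra site, which is the pattern we would be looking for in the real genome. The awkward part of doing this for real is choosing cutoffs for the two motifs that are actually comparable, since the answer depends heavily on them.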