With more help from the motif-software expert, I now have the program running well. I can use fragmentation masks to control where the program puts gaps, and this also lets me set whether the search is for motifs in the 'forward' or 'reverse' orientation. I can search for motifs with up to 21 significant positions, spread over any length of sequence I care to specify.
I've run a couple of passes through the entire genome. The results are just as unsurprising as I had expected (is that a tautology?). Blogger won't upload the logo image files right now so I'll add them later (done). Basically, the forward and reverse motifs are fairly close reverse complements of each other. I was going to write that this means there are no strong effects of the direction of DNA replication, but that's not true. To see effects of the direction of DNA replication I'd need to split the genome sequence into two parts, one running clockwise from the origin of replication to the probable terminus and one running counterclockwise from the origin to the terminus, and then compare the forward and reverse motifs in each half. That way the forward/clockwise and reverse/counterclockwise motifs would be derived from the 'leading' strand, and the reverse/clockwise and forward/counterclockwise motifs would be derived from the 'lagging' strand. [Or vice versa, as I don't know which strand is which.] Hmm, I don't think anyone has ever done this. So I should.
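Here is a rough sketch of how the split might be done before handing each half to the motif program. It assumes the genome is a single circular sequence held as one string, and that I have 0-based positions for the origin and the probable terminus. The function names and the coordinates are placeholders of mine, not anything the motif software provides. Both halves are kept in the same top-strand orientation, so 'forward' and 'reverse' searches mean the same thing in each.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T, either case)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def split_replichores(genome, origin, terminus):
    """Split a circular genome into two halves at the origin and terminus.

    genome   -- the top strand of the circular chromosome, as a string
    origin   -- 0-based index of the replication origin (assumed known)
    terminus -- 0-based index of the probable terminus (assumed known)

    Returns (clockwise, counterclockwise): the stretch from origin to
    terminus in the direction of increasing index, and the remainder.
    Both are left in top-strand orientation.
    """
    n = len(genome)
    if not (0 <= origin < n and 0 <= terminus < n):
        raise ValueError("origin and terminus must lie within the genome")
    if origin == terminus:
        raise ValueError("origin and terminus must differ")

    # Rotate the circle so the origin sits at index 0.
    rotated = genome[origin:] + genome[:origin]
    # Distance from origin to terminus going clockwise (wraps around the circle).
    cut = (terminus - origin) % n
    return rotated[:cut], rotated[cut:]


if __name__ == "__main__":
    toy_genome = "AAAACCCCGGGGTTTT"  # toy sequence, not real data
    cw, ccw = split_replichores(toy_genome, origin=4, terminus=12)
    print(cw, ccw)
    print(reverse_complement("AAGC"))
```

Each half would then get its own forward and reverse motif searches, giving the four motif sets to compare.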