RRResearch: Neisseria repeats

I've spent parts of the last couple of days discovering that a dataset of uptake sequences (DUS) from the Neisseria meningitidis intergenic regions contained a large number of occurrences of a different motif. I could easily see that it was longer than the 12 bp DUS but couldn't figure out what the motif was.

I spent a long time suspecting that these repeats were Correia elements, very short but complex transposable elements common in Neisseria genomes. But I couldn't find a clear illustration of the Correia consensus, and I couldn't find a good match between the Correia sequences I could find and the sequences of the stray motif in my DUS dataset.

Finally I realized that I could try using the Gibbs motif sampler to characterize the motif. So I took my set of intergenic sequences, used Word to delete all the perfect DUS (both orientations), and asked Gibbs to find a long motif. I didn't know how long the stray motif actually was, so I tried guessing 20 bp, then 30, then 40. But this didn't seem to be working: instead of finding a couple of hundred long Correia-like motifs it would find a couple of thousand occurrences of something with what looked like a very poor consensus. So I seeded the sequence set with about 20 occurrences of the motif taken from the dataset where I'd first noticed it.
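The Word find-and-delete step could also be scripted. Here is a minimal Python sketch of removing every perfect copy of a motif in both orientations. The function names are made up for illustration, and the 12-mer ATGCCGTCTGAA is my assumption for the DUS; the post doesn't spell out the sequence, so swap in whatever your dataset uses.

```python
# Sketch only: delete perfect occurrences of a motif (both orientations)
# from a DNA sequence. Names and the example motif are assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def strip_motif(seq, motif):
    """Delete every perfect copy of motif, forward and reverse-complement."""
    seq = seq.upper()
    motif = motif.upper()
    for m in (motif, reverse_complement(motif)):
        seq = seq.replace(m, "")
    return seq


# Assumed 12 bp DUS; not taken from the post.
DUS = "ATGCCGTCTGAA"

forward_example = "AA" + DUS + "TT"
reverse_example = "CC" + reverse_complement(DUS) + "GG"
print(strip_motif(forward_example, DUS))
print(strip_motif(reverse_example, DUS))
```

One caveat with deleting rather than masking: joining the flanking sequence can occasionally create a new match across the junction, so replacing each hit with a run of N's may be safer if the downstream program tolerates them.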

Gibbs again returned about 1500 of what looked like poor-consensus occurrences, but this time I had a bit more confidence that this might be what I was looking for, so I trimmed away all the notation and pasted them into WebLogo. This gave me a palindromic repeat that I'll paste below later, and a bit of Google Scholar searching showed me that this isn't Correia at all, but a short repeat called RS3, known to be especially common in intergenic sequences of the N. meningitidis strain I'm using.
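For a quick sanity check before (or instead of) making a logo, the aligned occurrences can be reduced to a majority-base consensus and tested for being palindromic, meaning the sequence equals its own reverse complement. This is only a rough sketch: the function names are hypothetical, and the four toy sites below are invented for illustration, not RS3 or any real Gibbs output.

```python
# Sketch only: majority consensus of equal-length aligned sites, plus a
# reverse-complement palindrome check. Toy data, hypothetical names.
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def consensus(sites):
    """Most common base in each column of equal-length aligned sites."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sites must have the same length")
    columns = zip(*(s.upper() for s in sites))
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


def is_palindromic(seq):
    """True if the sequence reads the same on both strands."""
    seq = seq.upper()
    return seq == reverse_complement(seq)


# Invented example sites, not real data.
toy_sites = ["GAATTC", "GAATTC", "GTATTC", "GAATAC"]
toy_consensus = consensus(toy_sites)
print(toy_consensus, is_palindromic(toy_consensus))
```

A majority consensus throws away the per-position information content that a logo shows, so it is a complement to WebLogo rather than a replacement.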

So now I can write a sensible manuscript sentence explaining what these repeats are and why I'm justified in removing them from the dataset.

2 comments:

Anonymous, November 25, 2008 at 11:54 AM
Don't you love it when a piece of work just comes together? Good for you and good luck with writing the manuscript. I had earlier read through some of your posts outlining the difficulties you had with Elsevier. I would suggest that in the future, perhaps with this manuscript, you consider open access journals under the PLoS or BioMed Central publishers.

Rosie Redfield, November 29, 2008 at 9:22 AM
We've had bad experiences with BMC too. Not with the open-access issue, but with excessive delays and bad judgement. We like PLoS a lot.
