OK, now I guess I'd better go through these parts of the UV-variation manuscript, figuring out what's done and what still needs to be done:
- Analysis of the true consensus and variation in uptake sequence motifs in all the bacterial genomes that have uptake sequences (= the family Pasteurellaceae (USS) and the genus Neisseria (DUS)).
I've done whole-genome Gibbs analyses and logos for all the species with uptake sequences. (Hang on, better check that no new ones have appeared since I did this.) Good thing I did; they've finished the Haemophilus parasuis genome. A quick count of canonical USSs (MS-Word bioinformatics) finds only 99 Hin-type USS cores (AAGTGCGGT and reverse) and 450 Apl-type (ACAAGCGGT and reverse). The genome has only 12 of a predicted novel USS core (GAGTTCGGT), nicely confirming our prediction that another group's assignment of this as the H. parasuis uptake sequence was an error (Redfield et al. 2006). Now I need to remember how to do the Gibbs analysis and do it on this new genome.
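For the record, the MS-Word count could be reproduced with a few lines of Python. This is only a minimal sketch, not what I actually ran: it assumes that "reverse" means the reverse complement, that the genome has already been read into a single string, and that overlapping matches should be counted. The function names and the toy sequence are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_overlapping(text, motif):
    """Count occurrences of motif in text, allowing overlaps."""
    count = 0
    start = text.find(motif)
    while start != -1:
        count += 1
        start = text.find(motif, start + 1)
    return count


def count_core(genome, core):
    """Return (forward, reverse-strand) counts of a core motif.

    Assumption: a reverse-strand hit is a match to the reverse
    complement of the core on the forward strand.
    """
    genome = genome.upper()
    core = core.upper()
    rc = reverse_complement(core)
    forward = count_overlapping(genome, core)
    # A palindromic core would otherwise be counted twice.
    reverse = 0 if rc == core else count_overlapping(genome, rc)
    return forward, reverse


# Toy sequence (not real data): one Hin-type core on each strand.
toy_genome = "AAGTGCGGT" + "TTT" + "ACCGCACTT" + "CC"
hin_counts = count_core(toy_genome, "AAGTGCGGT")
apl_counts = count_core(toy_genome, "ACAAGCGGT")
```

Summing the two numbers for each core gives the kind of total quoted above; keeping them separate would also be a start on the orientation analysis.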
- Analysis of variation in DUS and USS motifs across different location categories (orientation with respect to replication, in coding sequences, in non-coding sequences, in terminator positions).
We're only describing this for N. meningitidis and H. influenzae. The H. influenzae work is all done, but I still need to do at least some of the N. meningitidis analysis. My notes say I haven't done the direction-of-replication analysis, but I think I have; maybe I didn't finish it up.
- Analysis of covariation between the different positions of the DUS and USS uptake sequence motifs (e.g. does having a particular base at one position correlate with having a particular base at another position).
I've done this for both N. meningitidis and H. influenzae, and prepared the figure.
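One common way to put a number on this kind of covariation is the mutual information between two columns of the aligned motif occurrences. The sketch below is an illustration of that idea only, not necessarily the statistic behind the figure; it assumes the occurrences are already aligned as equal-length strings, and the function name and toy alignments are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(seqs, i, j):
    """Mutual information (in bits) between positions i and j.

    seqs is a list of aligned, equal-length motif occurrences.
    0 means the bases at the two positions are independent; higher
    values mean knowing one base tells you more about the other.
    """
    n = len(seqs)
    counts_i = Counter(s[i] for s in seqs)
    counts_j = Counter(s[j] for s in seqs)
    counts_ij = Counter((s[i], s[j]) for s in seqs)
    mi = 0.0
    for (a, b), c in counts_ij.items():
        p_ab = c / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignments (not real data).
linked = ["AC", "AC", "GT", "GT"]        # position 1 fully determined by position 0
unlinked = ["AC", "AT", "GC", "GT"]      # every combination equally common
mi_linked = mutual_information(linked, 0, 1)
mi_unlinked = mutual_information(unlinked, 0, 1)
```

With small numbers of occurrences, mutual information is biased upward, so real values would need comparing against shuffled columns before reading anything into them.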
- Additional experimental data on how variation in uptake sequence affects uptake by H. influenzae. (This will just be a paragraph as it only modestly enriches a previously published dataset.)
Done, figure prepared.
- Development of a computer-simulation model of uptake sequence evolution, and use of it to investigate the roles of key factors in maintaining uptake sequences in the non-coding parts of genomes.
Now this is the biggie. The model is all developed, and we've done a lot of work with it. But I need to remind myself of what we'd found (my recollections are all muddled with the confusing interim results and changes we made to the model). Luckily, before the former post-doc left she put together a good summary of where things stood, so my first task is to use that to restore my brain to its previous understanding.