Field of Science

Simulations run successfully, but drowning in contamination angst

After some futzing around I rediscovered how to run my computer program that simulates the evolution of DNA uptake sequences in genomes.  So now I've done 4 runs, using uptake-bias matrices derived either from our previous genome analysis or from the postdoc's new uptake data.  There are 4 runs because I paired each of the two matrices with each of two starting sequences: a random 20 kb DNA sequence, or the same random 20 kb sequence pre-seeded with 100 uptake sequences.
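For readers curious what "pre-seeded" means in practice, here is a minimal sketch of how the two starting sequences might be built. This is purely illustrative, not the actual simulation code: the real program scores uptake with position-weight matrices, whereas this sketch just plants a fixed motif (I've used the canonical 9-bp Haemophilus influenzae USS core, AAGTGCGGT, as a stand-in).

```python
import random

random.seed(42)  # reproducible illustration
BASES = "ACGT"
USS_CORE = "AAGTGCGGT"  # canonical H. influenzae uptake-signal core (stand-in for a full matrix)

def random_sequence(length=20_000):
    """A uniformly random DNA sequence of the given length."""
    return "".join(random.choice(BASES) for _ in range(length))

def seed_uptake_sequences(seq, motif=USS_CORE, copies=100):
    """Overwrite `copies` non-overlapping positions in seq with the motif."""
    seq = list(seq)
    # Candidate start positions spaced a motif-length apart, so seeded copies never overlap.
    candidates = range(0, len(seq) - len(motif), len(motif))
    for pos in random.sample(candidates, copies):
        seq[pos:pos + len(motif)] = motif
    return "".join(seq)

plain = random_sequence()           # starting sequence 1: random 20 kb
seeded = seed_uptake_sequences(plain)  # starting sequence 2: same, pre-seeded with 100 USS
```

Each matrix then gets run against both starting sequences, giving the 2 × 2 = 4 runs.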

Now I've just sent the input and evolved sequences to the postdoc - he will analyze them with his new uptake-sequence-prediction model and with the old-genome-derived model.  We hope this will help us understand why our previous view of uptake specificity was wrong.

He and I have spent months (and months and months) working on the manuscript that describes his DNA uptake results.  Lately I've been griping that he's too much of a perfectionist, always trying to make sure the data and analysis are absolutely perfect rather than just getting the damn manuscript written and submitted.  But I've now fallen into the same trap, spending days trying to understand exactly how contamination of one DNA pool with another might be compromising his analysis.  (The Excel screenshot below is just for illustration - there's lots more data where that came from.)  And it's not the first time I've been the one to insist on perfection - last month I spent weeks making sure we really understood the impact of sequencing error.

But we also have a reason to celebrate, as his paper on recombination tracts just appeared in PLoS Pathogens: Mell, Shumilina, Hall and Redfield (2011), Transformation of natural genetic variation into Haemophilus influenzae genomes.  Open access, of course.
