Among other things I've been working with one of the post-docs on her manuscript about the amount of variation in competence and transformability between different isolates of Haemophilus influenzae. Today we progressed to thinking about what we should say in the Discussion section, specifically about how selection on competence genes might have changed since the common ancestor of these strains.
(I'll use 'strains' interchangeably with 'isolates' in this post. In so doing I'm implicitly (here explicitly) assuming that the properties of the human-dwelling H. influenzae cell that gave rise to the original lab colony have not been changed by whatever laboratory propagation its descendants might have experienced.)
To discuss this variation we have to change how we've been thinking about variation. The data the post-doc has generated tell us about the ability of 34 present-day strains to take up DNA and recombine it into their chromosomes. To discuss the data's evolutionary implications we need to integrate them into the (unknown) history of these strains and of the species.
We don't know anything directly about the common ancestor of these strains, or of all the bacteria we call H. influenzae. But maybe we can start by making some inferences from a large published survey of the genetic variation in H. influenzae strains, and from the published genome sequences of some strains.
The large survey was an 'MLST' (multilocus sequence typing) study, in which the same 7 genes were sequenced in each of more than 700 strains (Meats et al. 2003). I don't remember whether the authors were able to draw any specific conclusions about evolutionary history, but if they did we should certainly consider whether those conclusions can be applied to our analysis.
About 12 H. influenzae genomes have been sequenced (and the sequences are 'available'), but only a few of them have been analyzed in any detail. Much of the sequencing work is being done in the context of an explicit evolutionary hypothesis - that H. influenzae and other bacterial pathogens are best described as having a 'distributed genome'. This is Garth Ehrlich's idea; here's how one of his papers explains it:
The distributed genome hypothesis (DGH) states that pathogenic bacteria possess a supragenome that is much larger than the genome of any single bacterium, and that these pathogens utilize genetic recombination and a large, non-core set of genes as a means of diversity generation.
Well, that's certainly very relevant to our analysis of the distribution of transformability! Now we just need to clarify, first for ourselves and then for potential readers of our manuscript, how having a diversity of competence and transformation phenotypes fits into this.
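To keep the hypothesis's terms straight, here is a minimal sketch of how 'supragenome', 'core' and 'non-core' relate to each other when each strain is treated as a set of genes. The strain names, gene names and the function summarize_supragenome are all made up for illustration; this is not real H. influenzae data and not anyone's published method.

```python
# Hypothetical gene-content sets for three made-up strains (NOT real data).
strain_genes = {
    "strain_A": {"gene1", "gene2", "gene3", "gene4"},
    "strain_B": {"gene1", "gene2", "gene5"},
    "strain_C": {"gene1", "gene2", "gene3", "gene6"},
}


def summarize_supragenome(genes_by_strain):
    """Return (supragenome, core, non_core) gene sets for the given strains.

    supragenome: every gene found in at least one strain
    core:        genes found in every strain
    non_core:    genes found in some strains but not all
    """
    gene_sets = list(genes_by_strain.values())
    if not gene_sets:
        return set(), set(), set()
    supragenome = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    non_core = supragenome - core
    return supragenome, core, non_core


supragenome, core, non_core = summarize_supragenome(strain_genes)
largest_single_genome = max(len(genes) for genes in strain_genes.values())

print("supragenome size:", len(supragenome))
print("core genes:", sorted(core))
print("non-core genes:", sorted(non_core))
print("largest single genome:", largest_single_genome)
```

In this toy case the supragenome (6 genes) is larger than any single strain's genome (at most 4 genes), which is the pattern the hypothesis describes. If competence genes were to fall in the non-core set, the same bookkeeping would apply to them.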