Field of Science

Thinking about Gene Transfer Agent

I'm at Dartmouth for three months, working with Olga Zhaxybayeva's group to improve our evolutionary understanding of Gene Transfer Agent.  I'm writing an R-script simulation of the genetic exchange it causes (finally learning R), but my control runs with epistasis don't give the expected results.  So I'm writing this post and creating a PowerPoint deck to clarify my thinking.

First, what's Gene Transfer Agent?  A number of different kinds of bacteria produce 'transducing particles' called Gene Transfer Agents.  These look like small phage capsids but they don't usually contain phage DNA; instead they contain random fragments of chromosomal DNA.  In the best-characterized GTA ('RcGTA'), these are all 4.4 kb in length, which appears to be the DNA capacity of the tiny GTA heads.  Like phage, GTA particles inject their DNA into recipient cells (usually of the same species), where it often recombines with the chromosome and can change the cell's genotype.

GTA particles aren't infectious like phages are, both because they don't preferentially package the DNA that encodes them and because their heads are too small to contain this DNA.  The RcGTA head and tail proteins are encoded by a 14 kb gene cluster.  The sequences and organization of these genes strongly resemble those of homologous phage genes, so the known GTA systems are generally thought to have descended from what were integrated prophages.

In lab cultures of cells with the RcGTA genes (Rhodobacter capsulatus cells), GTA is produced mainly after exponential growth has ceased, and only produced by a small subset of cells. Like release of phage particles from infected cells, release of GTA requires lysis of the cell, and the genes for the holin and endolysin proteins are encoded separately from the main RcGTA cluster.

There are good reasons to think that GTAs are not simply defective prophages that still can package small DNA fragments:
  1. The main RcGTA gene cluster has been somewhat stably inherited over a very long time, maybe more than a billion years.  Some descendants have lost all the genes, but about 25% of the 225 alpha-proteobacterial genomes examined have retained versions of a single large cluster, typically containing 14-17 co-transcribed genes, most of which encode capsid head and tail proteins.
  2. Expression of this gene cluster is at least partly controlled by cellular regulatory mechanisms.  
  3. Other genes, at other chromosomal locations, are also needed for efficient RcGTA production.
I just crunched some numbers from a detailed phylogenetic tree for the alpha-proteobacteria showing which taxa have GTA.  The large GTA cluster is only found in a subclade (148 taxa, 109 distinct species names); the authors estimate that this subclade is 1.0 - 1.4 billion years old.  57% of the taxa in this subclade have the large GTA gene cluster.

My goal for these three months is to generate models of GTA evolution (probably computer simulations) that evaluate the following candidate explanations for its persistence:

  1. Infectious spread of GTA by rare large-head particles that package the 14 kb gene cluster.
  2. Restoration of mutated GTA genes by unidirectional recombination with functional alleles from GTA-producing cells.
  3. Beneficial recombination of chromosomal genes.
Flawed model for Explanation 1:  Nobody has seen the large heads postulated by Explanation 1, but nobody has explicitly looked for them.  The Zhaxybayeva lab already has an unpublished mathematical model that addresses this explanation, created by a mathematically inclined former post-doc.  It asks how frequent such heads would need to be in order to maintain GTA-producing cells in a mixed population of GTA+ cells and GTA- cells lacking the gene cluster.  The model assumes that large heads are produced at frequency µ, and that these inject the GTA gene cluster into GTA- cells, converting them into GTA+ cells.  Only a small fraction of GTA+ cells are activated to produce GTA in any one generation, and these lyse after GTA production.

The conclusion from this model is that GTA+ cells can persist at high frequency even if they only make one large particle for every 10^5 normal small particles.  Because the model assumed a reasonable 'burst size' of 100 GTA particles per producer cell, this means that GTA+ can persist if only one producer cell in a thousand makes a single large particle.

But I didn't think this result could be correct.  Since each cell lysis destroys a GTA+ cell and only one in a thousand creates a new GTA+ cell from a GTA- cell, the GTA+ population should be continually decreasing.  Production of new GTA+ cells only compensates for 0.1% of the loss of GTA+ cells.  
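
The arithmetic behind this objection can be sketched in a few lines. This is a minimal illustration in Python (not the R script mentioned above), and the function names are hypothetical. It assumes the best case for GTA+: every large particle successfully converts one GTA- cell.

```python
# Back-of-envelope check of the lab model's conclusion.
# Assumption (best case): each large particle converts exactly one GTA- cell.

def producers_per_large_particle(small_per_large, burst_size):
    """Number of lysed producer cells needed, on average, to yield one large particle."""
    return small_per_large / burst_size

def replaced_fraction(small_per_large, burst_size):
    """Upper bound on new GTA+ cells created per GTA+ cell lost to lysis."""
    return burst_size / small_per_large

cells = producers_per_large_particle(10**5, 100)
print(f"one large particle per {cells:.0f} producer cells")
print(f"fraction of lysed GTA+ cells replaced: {replaced_fraction(10**5, 100):.1%}")
```

With the numbers from the post (one large particle per 10^5 small ones, burst size 100), the gain is at most 0.1% of the loss.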

I initially had a hard time fully understanding the mathematics of this model.  It included expressions for logistic growth, which complicated the math without adding anything to its utility.  So I created my own version of this model, which gave a very different answer.

New model for Explanation 1:  I'm going to put the description of this model into another post, because here I want to get on to my beneficial recombination model.  Bottom line: the model's result is that transduction of the GTA gene cluster by large-head GTA particles can't come close to maintaining GTA+ cells in a mixed population even if every cell produces a large-head particle.  This is because:

  1. All cells that produce GTA die; 
  2. Only a small fraction of large-head particles will contain a complete gene cluster (maybe 0.1 to 1%); 
  3. Except when GTA+ cells are rare, many particles will attach to GTA+ cells rather than to GTA- cells; 
  4. In a natural environment many GTA particles will fail to find recipients.  (This issue isn't part of the model.)
  5. To overcome these obstacles each GTA-producing cell would need to produce more than 1000 (10,000? 100,000?) large-head particles.
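
The obstacles in this list can be combined into a rough break-even estimate. The sketch below is not the model itself (that is left for another post); it is a simplified illustration with hypothetical names. It assumes that each lysed producer must create at least one new GTA+ cell to break even, and that the obstacles act as independent multiplicative probabilities.

```python
# Rough break-even estimate for Explanation 1 (illustrative assumptions only).

def new_gta_plus_per_lysis(n_large, p_complete, freq_minus, p_found=1.0):
    """Expected GTA- cells converted to GTA+ per lysed producer cell.

    n_large    : large-head particles released per producer cell
    p_complete : fraction of large heads carrying a complete gene cluster
    freq_minus : frequency of GTA- cells among potential recipients
    p_found    : fraction of particles that find any recipient (assumed 1.0 here)
    """
    return n_large * p_complete * freq_minus * p_found

def min_large_particles(p_complete, freq_minus, p_found=1.0):
    """Large-head particles per producer needed to replace the lysed cell."""
    return 1.0 / (p_complete * freq_minus * p_found)

for p_complete in (0.001, 0.01):
    needed = min_large_particles(p_complete, freq_minus=0.5)
    print(f"p_complete={p_complete}: about {needed:.0f} large particles per producer")
```

Under these assumptions, with half the population GTA-, a producer needs on the order of hundreds to thousands of large-head particles just to break even, and more if many particles never find a recipient.
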
Finding the flaw in the lab's model:  Assuming that I understand the lab's model correctly, the main error is that it 'corrects' for the probability that a GTA particle will attach to a GTA+ cell rather than a GTA- cell by multiplying by the number of GTA- cells rather than by their frequency.  Since the model assumes populations of 10^7 to 10^9 cells, this overestimates the amount of transduction by orders of magnitude, leading to a comparable underestimate of the frequency of large heads needed to maintain GTA+.
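
To see the size of this error, compare the two ways of weighting the particles. This is only a sketch of the error as described above, not the lab's actual equations (which aren't shown here), and the function names are hypothetical.

```python
# Count-versus-frequency weighting of particles that reach GTA- cells.

def transductions_by_frequency(particles, n_plus, n_minus):
    """Particles reaching GTA- cells when weighted by the GTA- frequency."""
    return particles * n_minus / (n_plus + n_minus)

def transductions_by_count(particles, n_minus):
    """The same quantity when (wrongly) weighted by the GTA- cell count."""
    return particles * n_minus

for total in (10**7, 10**9):
    n_plus = n_minus = total // 2
    right = transductions_by_frequency(1000, n_plus, n_minus)
    wrong = transductions_by_count(1000, n_minus)
    print(f"population {total:.0e}: overestimate factor {wrong / right:.0e}")
```

The overestimate factor is simply the total population size, so for populations of 10^7 to 10^9 cells the transduction term is inflated by seven to nine orders of magnitude.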

Model for Explanation 2:  I modified the basic structure of my Explanation 1 model to consider a related hypothesis.  Defective alleles of GTA genes are expected to arise by random mutation.  At least some of these will also prevent the cell from lysing when GTA production is induced.  These cells can still receive functional alleles of their defective genes from GTA particles produced by 'wildtype' cells, but they can't transmit their defective alleles to the wildtype cells because they can't produce GTA.  This asymmetry favours spread of functional alleles, and might be able to maintain GTA, although it wouldn't allow GTA+ to spread to cells that completely lack the GTA genes.

Like the model for Explanation 1, the result is a strong NO.  Because the models are very similar, it's not surprising (in retrospect) that spread of functional alleles faces the same obstacles:

  1. All cells that produce GTA die; 
  2. Only a small fraction (about 0.1%) of particles will contain whatever GTA gene is mutated in a recipient cell; 
  3. Except when GTA+ cells are rare, many particles will attach to cells with the functional allele rather than to those with the mutated allele; 
  4. In a natural environment many GTA particles will fail to find recipients.  (This issue isn't part of the model.)
  5. To overcome these obstacles each GTA-producing cell would need to produce more than 1000 (10,000? 100,000?) GTA particles.
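
A toy recurrence shows why the asymmetry alone is not enough. This is not the model described above, just a minimal sketch under loudly simplified assumptions: no new mutation, equal growth of both cell types, every particle attaches to some cell, and every particle that carries the relevant gene and lands in a mutant cell repairs it. All names and default values are hypothetical, apart from the burst size of 100 and the 0.1% figure taken from the text.

```python
# Toy recurrence for the frequency p of cells carrying the functional allele.

def next_freq(p, induced=0.01, burst=100, carry=0.001):
    """Advance p by one generation.

    induced : fraction of functional cells that produce GTA and lyse (assumed)
    burst   : GTA particles released per producer cell
    carry   : fraction of particles carrying the gene a recipient needs
    """
    q = 1.0 - p                        # frequency of mutant cells
    lysed = induced * p                # functional cells lost to lysis
    useful = lysed * burst * carry     # particles carrying the relevant gene
    repaired = min(q, useful * q)      # those that land in mutant cells
    return (p - lysed + repaired) / (1.0 - lysed)

def simulate(p0, generations, **params):
    """Iterate next_freq and return the final frequency."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, **params)
    return p

print("burst 100:   ", simulate(0.5, 1000))
print("burst 10000: ", simulate(0.5, 1000, burst=10000))
```

In this toy version the functional allele only gains ground when burst × carry exceeds 1. With a burst of 100 and 0.1% of particles carrying the right gene, each lysis repairs far fewer than one cell, so the functional allele declines.
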
Models for Explanation 3:  Most microbiologists assume that GTAs are maintained in their genomes by selection for presumed benefits of chromosomal recombination.   They implicitly assume that randomizing the combinations of chromosomal alleles in a population creates a benefit strong enough to overcome the cost of the cell death associated with GTA production.  They don't explicitly assume this, because they're not used to thinking rigorously about evolutionary processes.  Instead their explanation usually relies on GTA-mediated recombination creating some specific beneficial new combination, and ignores the selective costs associated with other combinations.

In fact, many very smart people have spent many years looking for conditions where random chromosomal recombination creates benefits strong enough to maintain the genes that cause it.  These 'evolution of sex' models have identified some conditions, but usually these benefits are small and occur only under special circumstances.  Most of the time recombination appears to be a waste of time at best.

Recombination Model 1:  Way back when I was a new post-doc spending a year in Dick Lewontin's lab, I developed a computer-simulation model of recombination by natural transformation (Redfield 1988, Evolution of bacterial transformation: Is sex with dead cells ever better than no sex at all?).  In this model I applied a relatively simple model of the evolution of sex to a population of naturally competent bacteria.  My first goal for addressing Explanation 3 is to adapt this model so it applies to recombination caused by GTA rather than by natural transformation.  I'll describe my progress (and current deadlock) in the next post.

Recombination Model 2:  Model 1 is 'deterministic'; it ignores random ('stochastic') events, effectively assuming that the population is infinitely large.  But the strongest benefits of recombination are now thought to arise from precisely the stochastic effects Model 1 ignores.  So I also want to make a stochastic model that tracks individual cells, or at least a model that takes stochastic processes into account.  I haven't started writing this model yet, but I might pattern it on the transformation model described by Takeuchi et al., 2014.
