Not your typical science blog, but an 'open science' research blog. Watch me fumbling my way towards understanding how and why bacteria take up DNA, and getting distracted by other cool questions.
Does local nicking ease uptake?
We're hypothesizing that USSs are sequences that are more easily taken up because they are easily deformed to pass through the narrow secretin pore. If so, a single-stranded nick might take the place of part of the USS, because it should be easy to bend the DNA sharply at the site of a nick.
I remembered old work showing that restriction enzymes, which normally cut both DNA strands at their recognition sites, will cut only one strand in the presence of high concentrations of the DNA-binding dye ethidium bromide. So if we replace part of a USS with a restriction site, we can test whether a nick at that site increases uptake by competent cells. This should be quite easy, as our standard USS for such experiments is cloned into the middle of the restriction sites of plasmid pGEM, giving us lots of built-in sites to work with.
A snapshot of uptake specificity?
We'd start with the plasmid that has a perfect USS insert, and subject it to high-efficiency random mutagenesis in the USS segment (say about 40-50bp). This would be done using a mutagenesis kit and a batch of degenerate oligos for this segment. Each degenerate oligo would have a small probability (say 9%) of having a 'wrong' base at each position (3% of each 'wrong' base), so that on average each oligo in the batch would have about 4 differences from the consensus (3.6 for a 40bp segment, 4.5 for 50bp). But the distribution would be broad, so a small fraction of the oligos would have one or no changes, and some would have 5 or more.
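To get a feel for how broad that distribution is, here's a minimal sketch that treats each position as an independent 9% chance of a wrong base, so the number of changes per oligo is binomial. The 45bp length is my assumption (the midpoint of the 40-50bp range), and the function name is just for illustration.

```python
from math import comb

def mismatch_pmf(length, p_wrong):
    """Return [P(0 wrong bases), P(1), ..., P(length)] for one oligo,
    assuming each position is mutated independently."""
    return [
        comb(length, k) * p_wrong**k * (1 - p_wrong) ** (length - k)
        for k in range(length + 1)
    ]

length = 45     # assumed: midpoint of the 40-50bp segment
p_wrong = 0.09  # 9% chance of a wrong base at each position (from the text)

pmf = mismatch_pmf(length, p_wrong)
mean_changes = sum(k * p for k, p in enumerate(pmf))
frac_0_or_1 = pmf[0] + pmf[1]   # oligos with one or no changes
frac_5_plus = sum(pmf[5:])      # oligos with 5 or more changes

print(f"mean changes per oligo: {mean_changes:.2f}")
print(f"fraction with 0 or 1 changes: {frac_0_or_1:.3f}")
print(f"fraction with 5 or more changes: {frac_5_plus:.3f}")
```

Changing `length` or `p_wrong` shows how the pool shifts if the oligos are synthesized with a different error rate.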
We'd then use competent cells to select, from the pool of mutagenized plasmid inserts, ones that can be bound or taken up (depending on whether or not we add DNase I to destroy DNA on the outside of the cells).
Then we'd use PCR to amplify the USS inserts of the taken-up sequences, and analyze their genetic diversity. I'm not up on the technology that would be most appropriate - I'll need to ask and search for genome analysis tools. In principle this could be done in two ways - we'd probably want to do both. The first way would be to use some sort of chip or array (?) to determine the proportion of each base at each position in the 40-50bp we've mutagenized. Because this wouldn't tell us anything about the correlations between differences at different positions, we'd also want to sequence some (say 1000?) of the mutagenized segments.
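Whatever technology produces the data, the first analysis boils down to tallying the proportion of each base at each position. A minimal sketch, assuming we already have the recovered inserts as aligned, equal-length strings (the four toy reads below are made up for illustration, not real data):

```python
from collections import Counter

def base_proportions(seqs):
    """Proportion of A, C, G and T at each position of aligned,
    equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all be the same length")
    proportions = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        proportions.append({b: counts.get(b, 0) / len(seqs) for b in "ACGT"})
    return proportions

# Hypothetical toy reads, for illustration only.
reads = ["AAGTGCGGT", "AAGTGCGGT", "AAGTGCGCT", "ATGTGCGGT"]
props = base_proportions(reads)

for position, p in enumerate(props, start=1):
    print(position, p)
```

Running this on the input pool and on the taken-up pool, and comparing the two tables position by position, would show where uptake selects against changes. It says nothing about correlations between positions, which is why we'd also want the individual sequences.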
In principle this is just a high-tech version of an analysis Sol Goodgal did about 15 years ago.
Sxy and CRP
The first thing would be that Sxy helps CRP bind to and bend DNA right at the CRP-S site. At CRP-N sites, which have an easily-bendable sequence, Sxy isn't needed for CRP binding.
The second thing would be that Sxy also interacts with another attribute of CRP-S promoters, and together with CRP helps RNA polymerase to begin transcription. This 'other attribute' is probably outside of the core CRP-S site; it could be a part of the nearby sequence that we haven't examined yet, or something about the separation of the CRP and RNA polymerase binding sites. If CRP binds to such a site without Sxy, it can't initiate transcription.
The other component of our model is that there is much more CRP than Sxy in the cell, so most of the CRP isn't associated with Sxy, but most of the Sxy is associated with CRP.
This model explains why changing the core of a CRP-S site into a CRP-N type core doesn't allow the promoter to work nearly as effectively as an authentic CRP-N site. And why doing the reverse (changing a CRP-N core into a CRP-S core) also creates a lousy promoter.
In our sketches we always had Sxy associate with CRP in solution, before either made contact with the DNA. But I'm wondering if this is necessary, or if one or the other protein might bind DNA and then its partner. Probably it is necessary. E. coli CRP alone binds very poorly to CRP-S sites in vitro; H. influenzae CRP won't bind them at all. So either Sxy must bind the DNA at CRP-S sites before the CRP gets there, or they must meet up in solution. Sxy has none of the features of typical DNA-binding proteins.
Of course, it could be that Sxy doesn't bind CRP at all, but instead binds to RNA polymerase....
New analysis of sxy RNase data
The tricky part is how we combine the RNase data with our genetic evidence and with the secondary structure predicted by Mfold. The Mfold predictions look good, and it's easy to give more credibility to a hypothesis expressed as a drawing of a structure than to one expressed only in words or numbers (our brains love pictures). But I would feel more comfortable with the predictions if we were able to use the software at a more sophisticated level, rather than just pasting in various sequences and leaving all the settings at their defaults.
Hypotheses about loss of competence
So here's one possibility:
Maybe cells take up DNA only because the genetic changes this sometimes causes are occasionally beneficial. (Most people think this is true, though I think the DNA=food consequence is much more important.) These benefits will be rare. So most competent cells will go for long periods taking up DNA but getting no benefit. When mutations that reduce or eliminate gene function arise in genes needed only for DNA uptake, there may be no selection against them for very long times. Depending on the particular mutations, these cells will have an advantage because they won't waste resources taking up DNA that's doing them no immediate good. So the frequency of competent cells in the population will be gradually decreasing.
(I write 'may be' because mutations that mess up one component of a complex machine may cause harm in ways that eliminating the whole machine wouldn't (like the difference between a car with no brakes and no car at all). For example, knocking out the secretin pore but keeping the rest of the DNA uptake machinery messes up the membranes of competent cells in ways that knocking out the ability to turn on competence doesn't. But let's not worry about this right now.)
But once in a while a cell that takes up DNA gets a good genetic change, one that lets it outcompete its relatives. This cell and its immediate descendants all have fully functional competence genes, so the frequency of competent cells in the population has increased.
The long-term outcome depends on how often DNA uptake produces good changes and how often deleterious mutations arise in competence genes and how harmful or beneficial these mutations are in the short term. If the good changes happen often enough, this could give populations that always contained lots of competent cells and some recently arisen non-competent ones. But if the good changes are less frequent, the cells with mutations causing loss of competence could completely take over. And once this happens there's no going back.
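Here's a toy simulation of that tug-of-war, just to make the logic concrete. Everything in it is an assumption for illustration: the parameter names and values are invented, loss mutations and the cost of competence act deterministically on the frequency of competent cells, and a beneficial change is treated as an instant sweep that resets the population to fully competent.

```python
import random

def generations_until_loss(generations, loss_rate, cost, sweep_rate,
                           pop_size, seed=0):
    """Return the generation at which competence is lost for good,
    or None if competent cells are still present at the end.

    loss_rate:  fraction of competent cells mutating to non-competent
                each generation (hypothetical)
    cost:       fitness cost of staying competent (hypothetical)
    sweep_rate: chance per generation, scaled by the competent fraction,
                that uptake gives a change good enough to sweep
    """
    rng = random.Random(seed)
    competent = 1.0  # frequency of competent cells
    for gen in range(generations):
        # Mutation: some competent cells lose competence.
        competent *= 1 - loss_rate
        # Selection: non-competent cells don't pay the cost of uptake.
        fit = competent * (1 - cost)
        competent = fit / (fit + (1 - competent))
        # Fewer than one competent cell left: no going back.
        if competent < 1 / pop_size:
            return gen
        # Rare benefit: a competent cell sweeps, taking its working
        # competence genes with it.
        if rng.random() < sweep_rate * competent:
            competent = 1.0
    return None

no_benefit = generations_until_loss(5000, 0.001, 0.01, 0.0, 10**6)
frequent_benefit = generations_until_loss(200, 0.001, 0.01, 1.0, 10**6)
print("lost at generation:", no_benefit)
print("with frequent sweeps:", frequent_benefit)
```

With no beneficial changes at all, competence is always lost eventually; the interesting question is how large `sweep_rate` has to be, relative to `loss_rate` and `cost`, to keep it.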
Mass spectrometry for the masses (=us)?
The plan is to incubate competent cells with DNA, and then add formaldehyde, which will create crosslinks between DNA and the proteins it's in contact with. We'll then dissolve the cells and pull out the DNA with its attached proteins. Then we'll get rid of the DNA and undo the crosslinks (by boiling the mix), leaving us with a little tube containing a mixture of DNA-contacting proteins of unknown identities. Then we'll digest the proteins with the protease trypsin, which will cut them into predictable pieces.
We'll use a combination of HPLC (high-performance liquid chromatography) and mass spectrometry to find out the amino acid sequences of all the peptides in the mix. By comparing these with the known sequences of all the proteins specified by the genome, we'll know what proteins the peptides came from. By comparing results with DNAs that either do or don't have a USS (or have a variant USS) and with cells carrying different mutations, we can infer a lot about the specific interactions (I hope).
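The matching step can be sketched in a few lines. This assumes the usual trypsin rule (cut after K or R, but not when the next residue is P) and exact matching of peptide sequences; the two-protein 'proteome' and its names are made up, and real mass spec software works from peptide masses and fragment spectra rather than clean sequences.

```python
import re

def trypsin_digest(protein):
    """Cut after K or R, except when the next residue is P."""
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", protein) if pep]

def match_peptides(observed, proteome):
    """Map each observed peptide to the proteins whose predicted
    tryptic peptides include it."""
    index = {}
    for name, seq in proteome.items():
        for pep in trypsin_digest(seq):
            index.setdefault(pep, set()).add(name)
    return {pep: sorted(index.get(pep, [])) for pep in observed}

# Hypothetical proteins, for illustration only.
proteome = {
    "protA": "MKTAYIAKQRPLLSK",
    "protB": "MSTNPKAAGHR",
}
observed = ["TAYIAK", "AAGHR", "WWWWK"]

digest_a = trypsin_digest(proteome["protA"])
hits = match_peptides(observed, proteome)
print(digest_a)
print(hits)
```

A peptide that matches nothing (like the last one here) would be a contaminant, a modified peptide, or a sequencing error.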
We don't need to invest in equipment for or learn how to do the HPLC and mass spec; we can pay local experts to do it for us.
One issue we'll need to grapple with is the small amounts of protein our 'fishing' technique is likely to produce. I think we can scale up, but that's always a source of problems. Another issue is stopping the cells from quickly sucking the DNA all the way inside - we may be able to control this by initially using cells with uptake mutations, or by sticking the DNA onto beads that are too big for the cells to take up.
Grantspersonship
Right now my proposal addresses two distinct aspects of H. influenzae competence. One is the regulation of competence, specifically the signals that induce the master regulator Sxy and the process by which Sxy induces the competence regulon genes. The other is the mechanism of DNA uptake, especially the role of the uptake signal sequence.
I could easily split this into two smaller proposals, each asking for less money, and each targeted to a particular peer-review committee. Because each proposal would be more focused, there would be more room in the allowed 11 pages to explain the issues specifically important for it. This would make each proposal easier for the peer-reviewers to read and understand. If only one of them got funded, we'd have plenty of money to go on with and I'd resubmit the other one for the next competition six months later.
I'm going to wait till I've gotten the opinions of a couple of other colleagues before making a final decision.
Got your pencils and rubber bands?
First the retraction issue: I had been thinking that we knew that H. influenzae and other Pasteurellaceae could do 'twitching motility', which meant that they must be able to actively retract their pili. But I rechecked and found that nobody has directly shown that they move. Strains of H. influenzae that have pili do form the same 'rafts' at the edges of their colonies that cells with known twitching motility do, but nobody's watched them doing it. I don't know how difficult this would be, but in any case we can't do it until we have cells that we know are piliated.
Second, the DNA-kinking issue: I had been thinking only at an intuitive level about this (hand-waving and doodles), and her questions pushed me into coming up with a physical model.
We can reasonably model the DNA and pilus with a long rubber band (= circular DNA) and a pencil (= the pilus); the relative diameters are appropriate. If a loop of DNA is going to be pulled through a pore that's only slightly wider than the pencil, then all of the loop has to become closely appressed to the pencil. For this to happen the DNA must, at some point, make a 180 degree turn in a distance no longer than the diameter of the pencil. If we were to let the DNA detach from the pencil the turn could open up to a larger diameter, but then the pencil plus DNA wouldn't fit through the pore.
The 180 degree turn is a problem because the persistence length of DNA is about 50nm (150bp), whereas the pilus diameter is only about 6nm. Nucleosomes manage to bend DNA smoothly through two full turns over distances of 165bp; this is a diameter of about 9nm (radius of curvature of 4.3nm). Perhaps not coincidentally, part of the uptake signal sequence resembles a nucleosome-positioning sequence.
So the DNA loop needs to bend on the pilus a bit more sharply than it does in a nucleosome - maybe not too big a problem. And if the DNA is lying in the positively charged grooves of the pilus, it may not stick out too far to fit through the pore (our model needs a pencil with grooves). But where the DNA makes its turn, it will have to leave its snug grooves and cross over at least one of the raised parts of the pilus, making an awkward bulge on the surface that's going to have a hard time fitting through the pore.
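The arithmetic behind these comparisons is easy to check. This sketch assumes the standard B-DNA values of 0.34nm rise per base pair and about 2nm helix width (neither is stated above), and it measures the bend radius to the centre of the DNA helix.

```python
import math

RISE_NM_PER_BP = 0.34  # assumed: standard B-DNA rise per base pair
DNA_WIDTH_NM = 2.0     # assumed: approximate width of the B-DNA helix

def wrap_radius_nm(bp, turns):
    """Radius of curvature if `bp` of DNA makes `turns` full circles."""
    contour_nm = bp * RISE_NM_PER_BP
    return contour_nm / (2 * math.pi * turns)

# Persistence length of about 50nm, expressed in base pairs.
persistence_bp = 50 / RISE_NM_PER_BP

# Nucleosome: two full turns over 165bp.
nucleosome_radius = wrap_radius_nm(165, 2)

# DNA lying directly on a 6nm pilus: pilus radius plus half the DNA width.
pilus_bend_radius = 6 / 2 + DNA_WIDTH_NM / 2

print(f"persistence length: {persistence_bp:.0f} bp")
print(f"nucleosome bend radius: {nucleosome_radius:.2f} nm")
print(f"bend radius on the pilus: {pilus_bend_radius:.2f} nm")
```

Under these assumptions the nucleosome radius comes out a little above the 4.3nm quoted above (the exact figure depends on the rise per base pair and on where in the helix the radius is measured), and the turn around the pilus is somewhat tighter than either.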
So uptake of a loop raises two problems. One is forcing the DNA to bend sharply, and the other is fitting the pilus back through the secretin pore once it has a loop of DNA attached to it. I don't think this is impossible, but it probably takes some specialized interactions. I think the role of the uptake signal sequence is to interact with the pilus and secretin to facilitate this.
I'm going to see if I can buy some tubing of different sizes to use for a demo of the problem.