Field of Science

Why do bacteria take up only one DNA strand?

People who disagree with us about the nutritional importance of DNA for bacteria very often cite the degradation of one DNA strand at the cell surface. Like almost all (maybe all) other competent bacteria, H. influenzae initially binds double-stranded DNA, but brings only one of the two strands across the inner membrane into the cytoplasm. The other strand is degraded to nucleotides in the periplasm. In Gram-positive bacteria the degradation also occurs at the surface, before the remaining DNA strand is transported into the cytoplasm, and the released nucleotides are found in the culture medium.

(Hmm, aside, I must go back and check the old literature, to see if it's the nucleotides or only the phosphates that are found in the medium. This is significant because we know that the phosphate must be removed from each nucleotide before it can be taken up. No, they used tritiated thymidine, so the label was in the bases.)

If bacteria were taking up DNA for its nucleotides, they argue, the bacteria would certainly not discard all of one strand, thus getting only half the nucleotides they would get if they took up both strands. Therefore DNA must be taken up for its genetic information. My usual response to this (usually ignored) is to point out that the bacteria have efficient uptake systems for nucleotides (actually nucleosides = nucleotides with their phosphates removed), and any nucleotides produced at the cell surface are likely to be taken up by these systems rather than 'discarded'. The same people happily accept that bacteria secrete nucleases to obtain nucleotides from environmental DNA, a process much more strongly limited by diffusion, so they should find surface degradation and re-uptake at least as plausible.

But today I've thought of another response, which is to bounce the 'discard = waste' argument right back. If bacteria are taking DNA up for its genetic information, then surely it is wasteful to take up only one strand. Although only a single DNA strand usually participates in a single recombination event, many strands are degraded by cytoplasmic nucleases before they can recombine at all. Others recombine, but the new genetic information they contain is lost through mismatch correction. Surely it would be better to take up both strands and give each a chance to recombine, effectively doubling the probability that the cell will get the new genetic information.

Super-ultra-high-throughput sequencing? done cheap?

A potential collaborator/coworker has suggested what I think would be a great experiment, if we can find a way to do it economically. But it would require sequencing to very high coverage, and I know almost nothing about the current economics of that.

Basically, we would want to sequence a pool of Haemophilus influenzae DNA from a transformation experiment between a donor and a recipient whose (fully sequenced) genomes differ at about 3% of positions, as well as by larger insertions, deletions and rearrangements. The genomes are about 2 Mb in length. The DNA fragments could have any desired length distribution, and the amount of DNA is not limiting.

Ideally we would want to determine the frequencies of all the donor-specific sequences in this DNA.  For now I'll limit the problem to detecting the single-nucleotide differences (SNPs).  And although we would eventually want to have information about length/structure of recombination tracts, for now we can consider the frequency of each SNP in the DNA pool as an independent piece of information.

At any position, about 1-5% of the fragments in the pool would have a base derived from the donor strain.  This means that simply sequencing to, say, 100-fold coverage of the 2 Mb genome (2 x 10^8 bases of sequence) would be sufficient to detect most of the SNPs, but with only about 1-5 donor-derived reads per position it could not establish their relative frequencies.  Increasing the coverage 10-fold would give us lots of useful information, but even higher coverage would be much better.
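To make the coverage intuition concrete, here's a rough back-of-the-envelope sketch (my own simplification, not part of the proposed experiment): treat the number of donor-derived reads covering a SNP as a binomial draw with n = coverage and p = donor allele frequency, and ask how precisely p could be estimated at each coverage level.

```python
import math

def expected_donor_reads(coverage, donor_freq):
    """Mean number of reads carrying the donor base at one SNP position."""
    return coverage * donor_freq

def relative_error(coverage, donor_freq):
    """Coefficient of variation of the estimated donor frequency:
    sqrt(p*(1-p)/n) / p for a binomial proportion."""
    return math.sqrt((1 - donor_freq) / (coverage * donor_freq))

# Compare coverages for donor frequencies at the low and high ends (1% and 5%)
for cov in (100, 1000, 10000):
    for f in (0.01, 0.05):
        print(f"{cov:>6}x, f={f}: ~{expected_donor_reads(cov, f):.0f} donor reads, "
              f"rel. error ~{relative_error(cov, f):.0%}")
```

By this crude estimate, 100x coverage gives only ~1 donor read for a 1% SNP (a ~100% relative error, so no usable frequency), while ~10,000x coverage brings the error down to roughly 10%, which is consistent with the feeling above that 1000x would be useful and higher coverage much better.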

The collaborator and the metagenomics guy in the next office both think Solexa sequencing is probably the best choice among the currently available technologies.  Readers, do you agree?  Are there web pages that compare and contrast the different technologies?  What should I be reading?

One possibility I need to check out is skipping sequencing entirely and instead using hybridization of the DNA pool to arrays to directly measure the relative frequencies of the SNPs.  Because the pool will contain mainly recipient sequences, we'd want to use an array optimized to detect donor sequences against this background.  How short do oligos need to be to detect SNPs?  How specific can they be for a rare donor base against a high background of recipient sequences?  Is this the kind of thing Nimblegen does?

More from Elsevier (names removed...)

Email from an Elsevier Publisher:

Dear Rosie,

I'm so sorry this process has been difficult. As you say below, you can pay by credit card (personal or institutional). We don't have an agreement with CIHR, but they may indeed refund the cost of the sponsored access. That's something CIHR will have to advise you on.

The journal manager for JMB; (name removed) is working with her colleagues to make your article available should you wish. If you haven't completed the paperwork for payment, the link is here:

I see on your blog, however, that you've given up due to frustration with the process and the inability of the service group to respond to your questions. I know they did the best they could. Please do let the journal manager know if you do indeed want to pursue the sponsored access option.

While I recognize your frustration, and I agree, I would feel the same, it's frustrating to those of us working on the journal that your blog has been wholly negative about JMB and has never acknowledged our correspondence or those of us working to resolve the difficulties you've had with your proofs.

Please let the journal manager and I know if you are indeed resolved not to move forward with the sponsored access.

Best wishes,

Friendly Elsevier Publisher (name removed)

Dear Elsevier Publisher,

I have indeed abandoned my attempt to pay for open access to our article, entirely because of my frustration with the inability of the service person handling the transaction (name removed) to give a straight answer to any of my questions.

As I now think I understand the situation, most of the things she told me were not true. First, she said my purchase order could not be processed because my granting agency was not on the list. Then she said that my granting agency would not refund me the open access charge. I took care to make my questions very clear and simple, and each time she'd respond with boiler-plate statements that did not answer the questions and raised new confusion, and with referrals to largely irrelevant documents.

I don't think the fault can be all hers. The Elsevier sponsored-access system is confusing, the policy is not clearly explained, and the necessary information is hard to find.

The Journal of Molecular Biology is an excellent journal, and we're proud to have our article appear there. The submission and review process went very smoothly, the copy editing was very professionally done, and the 50 free offprints are a nice treat. But I feel strongly that taxpayer-supported research should be published where the taxpayers can see it, so I won't be submitting to any Elsevier journals in the future.



p.s. CIHR has never quibbled in the past about paying open access charges, and no concerns were raised when I included them as a line item in my grant budget.

p.p.s. I'll post this correspondence on my blog, with the names removed.