Field of Science

Planning the DNA sequencing part of the PhD student's project

The former post-doc (I'll call him the FPD) visited yesterday afternoon, and we had intense discussions of how to proceed with both the RNAseq work (summarized here on our Sense Strand blog) and with the PhD student's planned DNA uptake experiments.

His planned experiments take advantage of the phenotype of a rec2 knockout mutation.  These cells take up DNA normally across the outer membrane, into the periplasmic space, but they cannot transport it across the inner cell membrane.  This allows him to recover intact DNA that has been taken up, and to use DNA sequencing to compare it to the input DNA the ∆rec2 cells were given.

Some of the experiments will use genomic DNA of the species being tested, fragmented to appropriate length distributions, and some will use synthetic DNA fragments (~200 bp) containing a 30-50 bp stretch of random sequence (see figure).

The FPD, who developed the synthetic fragment protocol, pointed out that his experiments had used full lanes of Illumina sequencing only because it was not then possible for us to 'barcode' our different DNA samples and mix them for sequencing as a single lane.  The sequencing depth he obtained was useful, but it will be extreme overkill for the experiments the PhD student plans.  So we need to design barcoding into our analyses, so we can mix up to 24 samples in one lane for sequencing, and then separate the resulting sets of sequence reads by their different barcodes.  We'll still need to use two lanes, because each 'recovered' sample will need to have a corresponding identical 'input' sample.  Because these samples will have the same barcode they could not be distinguished if they were sequenced in the same lane.

So rather than doing one very-deeply sequenced experiment, he'll be able to do multiple replicates, each sequenced at a moderate but entirely adequate depth.  If he uses a HiSeq machine for the sequencing, he'll be able to get 1.6 x 10^8 reads for each of 12 samples; with a NextSeq this would give 4 x 10^8 reads per sample. (Is that right, per sample, not per lane?).

One issue to keep in mind is that it would be foolish to save all the sequencing for one big batch at the end of the thesis work.  Instead the work needs to be designed with an initial set of samples to be sequenced, so he can (1) tell whether everything is working as it should, and (2) begin analyzing sequence data from one part of the project while generating additional samples for other parts.  For a preliminary batch of sequencing, it might be better to use a MiSeq machine, whose smaller capacity would let us sequence a few samples more economically.

We also talked about how long the random-sequence segments should be in the 200 bp fragments, and about where to locate the barcode segments.  These consist of an independent sequencing primer followed by 8 bp that identify the source experiment.  Putting these to the right of the random segment will let him efficiently create the double-stranded 200 bp fragments, using the same long left-side oligo (containing the random segment) with many different right-side oligos, each containing a different barcode.
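
Separating the pooled reads by barcode after sequencing could be sketched roughly as follows. This is only an illustration, not the pipeline we will actually use: it assumes each read has already been reduced to a (barcode read, insert read) pair, that the 8 bp barcode is the first 8 bases of the barcode read, and that only exact matches are accepted. The barcode sequences, sample names and reads are made up.

```python
from collections import defaultdict

BARCODE_LENGTH = 8  # from the design above: 8 bp identify the source experiment


def demultiplex(read_pairs, barcode_to_sample):
    """Group insert reads by sample, using an exact match on the barcode.

    read_pairs: iterable of (barcode_read, insert_read) strings (assumed layout).
    barcode_to_sample: dict mapping 8-bp barcode -> sample name (hypothetical).
    Reads whose barcode is not recognized go into the 'unassigned' bin.
    """
    bins = defaultdict(list)
    for barcode_read, insert_read in read_pairs:
        barcode = barcode_read[:BARCODE_LENGTH].upper()
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert_read)
    return dict(bins)


# Made-up barcodes and reads, purely for illustration.
barcodes = {"ACGTACGT": "replicate_1", "TTGCAAGC": "replicate_2"}
reads = [
    ("ACGTACGT", "AAGTGCGGTCA"),
    ("TTGCAAGC", "CCGTGCGGTTT"),
    ("ACGTACGT", "GGATGCGGTAC"),
    ("NNNNNNNN", "TTTTTTTTTTT"),
]
by_sample = demultiplex(reads, barcodes)
for sample, inserts in sorted(by_sample.items()):
    print(sample, len(inserts))
```

Because the 'input' and 'recovered' versions of a sample carry the same barcode, a function like this cannot tell them apart, which is why they have to go in separate lanes.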

Sensitivity of the PhD student's planned analysis

The PhD student is proposing to use Illumina sequencing of input and recovered-after-uptake DNAs to detect possible biases in uptake of DNA by bacteria other than H. influenzae.  (This is a simplified version of the analysis proposed in our funded NSERC proposal.) We're discussing the factors that will affect the sensitivity of this analysis, so he can say how strong a bias would have to be in order for his experiment to detect it.

The factors we've thought of are:

A. Nature of the preferred sequence pattern: 
  1. How long is it (3 bp? 10 bp?)?  How specific is it (e.g. is each base specified, or just 'purine' or 'pyrimidine')?  Together these determine how often this pattern will occur in the input DNA (by chance or due to uptake bias-drive).
  2. How strong is the bias favouring uptake of fragments containing this pattern?  How strict is the preference (are variants of the specified pattern also taken up, but less strongly)?  Are fragments with more than one occurrence of the pattern more likely to be taken up?
B. Properties of the input DNA:
  1. If this is genomic DNA, what is the size range of the fragments?   The sensitivity of the experiment will be low if the fragments are so large that each has at least one occurrence of the preferred pattern.
  2. If this is a synthetic fragment containing a fully degenerate segment, how long is the degenerate segment?
C. Sequencing coverage:
  1. How high is the sequencing coverage?  Is it the same for the control input DNA and for the recovered DNA?  This will determine the noise due to random factors.  
  2. Does the error rate of the sequencing matter?
  3. For genomic input DNA, are there position-specific differences in coverage across the genome?
  4. For degenerate-fragment DNA, are there non-random factors in the input DNA or in its sequence-ability?
He's going to start by working through the values for a very-strong-bias case, detecting the H. influenzae uptake sequence in genomic DNA (figure below), and then relaxing the inputs.
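
To get a feel for factors A.1 and B.1, here is a rough sketch of how often a pattern would turn up in a fragment by chance alone. It makes strong simplifying assumptions that are not from our data: all four bases are equally frequent, positions are independent, both strands are searched, and overlaps between windows are ignored. The function names are just for illustration.

```python
def per_site_probability(allowed_bases_per_position):
    """Chance that one window matches the pattern.

    Each entry is the number of bases allowed at that position:
    1 = fully specified, 2 = e.g. 'purine' or 'pyrimidine', 4 = any base.
    Assumes equal base frequencies (an approximation for any real genome).
    """
    p = 1.0
    for allowed in allowed_bases_per_position:
        p *= allowed / 4.0
    return p


def fraction_with_pattern(fragment_length, allowed_bases_per_position,
                          both_strands=True):
    """Approximate fraction of random fragments with at least one match."""
    k = len(allowed_bases_per_position)
    windows = max(fragment_length - k + 1, 0)
    if both_strands:
        windows *= 2
    p = per_site_probability(allowed_bases_per_position)
    return 1.0 - (1.0 - p) ** windows


patterns = {
    "3 bp, fully specified": [1] * 3,
    "10 bp, fully specified": [1] * 10,
    "10 bp, purine/pyrimidine only": [2] * 10,
}
for name, pattern in patterns.items():
    for length in (200, 1000, 10000):
        frac = fraction_with_pattern(length, pattern)
        print(f"{name:32s} {length:6d} bp fragments: {frac:.4f}")
```

When this fraction approaches 1 for the input fragments, nearly every fragment carries the pattern and uptake cannot enrich for it, which is the low-sensitivity situation described in B.1.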

Mutagenesis plans

(I'll add some explanations later.)

1.  Mutagenize more RR805 DNA, using a range of high EMS doses (10, 15, 20, 25, 30, 40 min in 50 mM).  Transform this DNA directly into competent KW20 (without EMS inactivation or DNA purification) and select for CmR and maybe for NovR.

2.  Mutagenize RR805 cells, using a range of high EMS doses (from expt. #180, 80 mM for 1 hr gives ~10^-2 survival).  The cells don't need to survive, because I'll just grow the culture for a couple of hours and then extract all the DNA and use that DNA to transform KW20 to CmR.

For both 1 and 2, then pool CmR transformants and transform at low cell density to StrR with RR514 DNA.  Test individual StrR colonies for hypercompetence by colony transformation with MAP7 DNA.

3. Mutagenize NovR and NovS PCR fragments (made by the sabbatical visitor), using the same EMS concentrations as in experiment 1.  Then test the effects of the EMS mutagenesis by transforming each DNA into KW20, looking for gain of NovR in cells transformed with the NovS DNA, and loss of transforming ability of the NovR DNA.

I can do experiments 1 and 3 today (if I first pour lots of plates).  I can then do Experiment 2 tomorrow or on the weekend, once the cells have grown up.


1.  I must have put too little chloramphenicol in the Cm plates for this experiment, because all the cells grew on the Cm plates.  I need to repeat this experiment.

3.  Increasing exposure to EMS caused decreased transformation by the NovR fragment, as it should, but the corresponding exposures of the NovS fragment gave no NovR transformants, indicating no detectable mutagenesis.  So the decrease seen with the NovR fragment may just be due to damage, not mutation.

2.  My streak of RR805 cells has grown nice little colonies.


I've inoculated one of the RR805 colonies for an overnight culture, so I will be able to do the experiment 2 cell mutagenesis tomorrow.  And tomorrow I'll make lots and lots of Cm plates, with the right amount of chloramphenicol, so I can also repeat experiment 1.

No new candidate mutants (sigh...)

As I planned here, I pooled the CmR colonies resulting from transformation with EMS-mutagenized CmR murE+ DNA, and grew them to log phase (OD600 ~ 0.1).  The murE+ cells in the pool should have been non-competent under these conditions, but any murE* hypercompetence mutants should have been competent.  To select for these mutants I transformed the cells in each pool with DNA carrying a streptomycin-resistance mutation, and plated on Str plates.  One pool gave several hundred StrR colonies (many more than I would have expected as transformants), but the other pools had very few or none (4 total). I then screened individual StrR colonies by mixing them with dilute NovR DNA and plating on Nov plates.

Unfortunately none of the StrR colonies transformed to NovR at the high frequency seen for the positive control (murE749) colonies.  In fact, none transformed any better than the murE+ negative control colonies.

This is a bit surprising, given that the 2-fold higher level of EMS mutagenesis reduced by 100-fold the ability of the CmR cassette to transform cells, and the 4-fold higher level eliminated it entirely.  I had assumed that this reduction/elimination was due to too-heavy mutagenesis, but perhaps it was a direct consequence of the DNA damage.  One possible explanation I'm considering is that damaged DNA is almost always repaired or destroyed, and rarely gives rise to recombinants.  Another possibility is that, when cells are mutagenized, the mutations arise mainly only when levels of damage are so high as to overwhelm the repair systems, allowing the damaged bases to be used as templates for DNA replication.  Maybe this also requires induction of the error-prone DNA polymerase.

So now the sabbatical visitor and I are designing a control experiment, to test whether this direct DNA mutagenesis is working as we think it should.  We're going to mutagenize two versions of a DNA fragment containing the gyrB locus.  One is wildtype, and the other has the novR allele we usually use in our transformation assays.  We expect the transformation efficiency of the novR allele to decline with high doses of EMS, and we hope that new novR mutations will arise in the wildtype allele at high doses.

* Here's some wishful thinking: Ideally we should be selecting for a G->A transition mutation because those are what EMS induces best.  But we're using novR (G->T) because we have the primers handy and know they work.  The mutation spectrum of EMS is reported to be much broader with the in vitro mutagenesis we're using, so we hope this will work.  But I just checked the numbers and they didn't see ANY of the kind of change we'd need.

Really we should use selection for streptomycin resistance, since its T->C mutation is a type that arose at high frequency with the in vitro EMS treatment.  I wonder if we have the primers for this - I think the post-doc might have gotten them for us.

Mutagenesis results

I don't have any novobiocin-resistant transformant-mutants after 24 hr (though slow-growing colonies might appear later), so I can't use that to tell how effective the mutagenesis was.  But I have tons of chloramphenicol-resistant ones at the low exposures to EMS (2, 5 and 13 minutes), 100-fold less at 30 minutes exposure and none at 60 minutes exposure (the highest dose).  This tells me that the EMS was doing its job, and that the DNA damage caused many potential transformants to have lethal mutations either in the CAT cassette or in nearby genes in the recombination tract.

So I think I'll go ahead and make pools of colonies from the 5-min and 13-min treatments and enrich them for hypercompetent mutants by selecting for StrR transformants in log-phase cultures. Then tomorrow I can screen these for hypercompetence by our crude colony-transformation assay.

Why not also the 2-min treatment?  OK, I'll include one pool of those too.

I'll have five pools (10^4 and 10^5 transformants from each of the 5-min and 13-min treatments, plus the one 2-min pool), which will be easy to handle.  What control cultures should I include?  RR805 (murE+) will give negative control colonies, and RR797 (murE749) will give positive control colonies.

* One reason to not use the ~1000 CmR colonies from the 30-min dose is that these are less likely to have recombination tracts extending all the way from the CAT cassette to murE.  That's because this segment contains two essential genes (ftsI & ftsL), and recombination tracts that cover the CAT-murE distance are much more likely to have had a lethal mutation in one of these genes than are tracts that don't reach to murE.

Mutagenesis planning in progress

By midday today I'll have checked my strains and made my DNA. 

The strains are RR805, which has a CAT cassette linked to the murE+ gene (normal competence), and RR797, which has the same cassette linked to the murE749 hypercompetence allele and a StrR point mutation elsewhere in the chromosome. I've checked their antibiotic resistances, done platings that will confirm their competence phenotypes (will count colonies this morning), and made crude DNA preps (I'll complete purification this morning).

Next I should do the mutagenesis dose-response curve, and I've now realized that this experiment can also be used for the first hunt for more hypercompetence mutants. 

Mutagenesis (today?):

Set up one tube containing 12 µg RR805 chromosomal DNA in 120 µl water or TE, at 37 °C.

Take a 20 µl time = 0 sample (see below).

Add EMS to the remaining DNA, to a final concentration of 50 mM. 

Take samples at time = 2, 5, 12, 30 and 60 minutes.  Immediately add each sample (including t = 0) to 100 µl of 5% sodium thiosulfate, which will inactivate the EMS and stop the mutagenesis.

The t = 2  sample will have had about 6-fold less exposure to EMS than used by the Lai et al. paper, and the final sample will have had 5-fold more.
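
As a quick check of those fold-differences, here is a small sketch that treats EMS exposure as simply concentration x time (a simplification; the real dose-response may not be linear). The Lai et al. conditions are taken as 10 mM for 1 hr, as quoted elsewhere in these posts.

```python
LAI_CONC_MM = 10.0    # Lai et al. conditions as quoted in these posts
LAI_MINUTES = 60.0
MY_CONC_MM = 50.0     # final EMS concentration planned here
SAMPLE_TIMES_MIN = (2, 5, 12, 30, 60)


def exposure(conc_mm, minutes):
    """Crude exposure measure: concentration (mM) x time (min)."""
    return conc_mm * minutes


lai_exposure = exposure(LAI_CONC_MM, LAI_MINUTES)  # 600 mM*min
relative = {t: exposure(MY_CONC_MM, t) / lai_exposure for t in SAMPLE_TIMES_MIN}

for t, ratio in relative.items():
    print(f"t = {t:2d} min: {exposure(MY_CONC_MM, t):6.0f} mM*min, "
          f"{ratio:.2f} x the Lai et al. exposure")
```

On this measure the 12-minute sample matches the Lai et al. exposure, with the 2-minute sample at one-sixth and the 60-minute sample at 5 times that exposure.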

Add NaCl to each sample to 0.15 M and add 2 volumes of ethanol to precipitate the DNA.  Rinse the pellets (probably invisible) with 70% ethanol and air dry.  Resuspend each in 50 µl TE.  (If the invisibility of the pellets is a problem I could add some E. coli DNA as carrier, since this won't interfere with the subsequent transformations.)

Transformations (today):

Thaw out lots of vials of frozen competent KW20 cells (wildtype).  I need one tube for each of the 6 DNA samples, and also one for RR797 DNA (chloramphenicol resistance control) and one for MAP7 DNA (transformation control).

Add 2.5 µl (= 100 ng) of each DNA to a tube containing 1 ml of cells.  Incubate for 15 min at 37 °C.
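
The '2.5 µl = 100 ng' figure follows from the volumes in the mutagenesis steps, as this small sketch shows. It assumes that all of the DNA in each 20 µl sample is recovered after the ethanol precipitation and that adding the EMS does not change the volume noticeably; neither will be exactly true.

```python
START_DNA_UG = 12.0        # chromosomal DNA in the starting tube
START_VOLUME_UL = 120.0
SAMPLE_VOLUME_UL = 20.0    # volume taken at each time point
RESUSPEND_VOLUME_UL = 50.0 # TE added to each dried pellet
TRANSFORM_VOLUME_UL = 2.5  # volume added to 1 ml of competent cells

# Concentration in the starting tube, in ng per µl.
start_conc_ng_per_ul = START_DNA_UG * 1000.0 / START_VOLUME_UL

# DNA in one 20 µl sample, assuming no loss during precipitation.
dna_per_sample_ng = start_conc_ng_per_ul * SAMPLE_VOLUME_UL

# Concentration after resuspending the pellet in 50 µl TE.
final_conc_ng_per_ul = dna_per_sample_ng / RESUSPEND_VOLUME_UL

# Amount of DNA that goes into each transformation.
dna_per_transformation_ng = final_conc_ng_per_ul * TRANSFORM_VOLUME_UL

print(f"starting concentration: {start_conc_ng_per_ul:.0f} ng/µl")
print(f"DNA per sample:         {dna_per_sample_ng:.0f} ng")
print(f"after resuspension:     {final_conc_ng_per_ul:.0f} ng/µl")
print(f"per transformation:     {dna_per_transformation_ng:.0f} ng")
```

Any loss in the precipitation step would lower the last number proportionally.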

Add 3 ml sBHI and incubate for 90 min longer, to allow expression of the chloramphenicol resistance.

Dilute and plate on plain plates (10^-6, 10^-5), Nov1 plates (for low-level novobiocin resistance, plate undiluted and 10^-1) and Cm1 plates (plate 10^-3, 10^-2, 10^-1 and undiluted).

Freeze the remaining transformed cells in case I want to do more with them later.

Analysis and next steps (Friday):

Use the colony counts to assess the extent of mutagenesis and gene inactivation.  For doses that gave high NovR mutagenesis without reducing the CmR transformation rate, make pools of the CmR colonies from plates that have >1000 colonies (one pool per plate). 

Then I can grow each pool to early log in sBHI and transform it with StrR DNA to enrich for hypercompetent mutants.

Then I'll screen individual StrR colonies for hypercompetence by mixing them with MAP7 DNA and plating on Nov.

What if I don't get any NovR mutants? 

My previous use of EMS, mutagenizing cells, not DNA, gave NovR mutants at about 10^-6 of the survivors.  If this was the level of NovR mutations in my mutagenized DNA, the transformation assay probably wouldn't detect their presence because only about one cell in 1000 will have recombined the nov-containing DNA fragments, giving a transformation rate of 10^-9, below the detection limit.  But I expect the mutation rate to be much higher for the pure DNA, so I'm hoping that I'll see significant increases in resistant colonies.
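
The detection-limit arithmetic can be laid out explicitly. This is only a sketch of the multiplication above; the number of cells plated is a hypothetical value chosen for illustration, not a measurement.

```python
def novr_transformant_frequency(mutation_frequency, recombination_frequency):
    """Expected fraction of cells that become NovR by picking up a new mutation.

    mutation_frequency: fraction of DNA fragments carrying a NovR mutation.
    recombination_frequency: fraction of cells that recombine the nov region.
    """
    return mutation_frequency * recombination_frequency


def expected_colonies(cells_plated, frequency):
    """Expected number of resistant colonies on a plate."""
    return cells_plated * frequency


# Values from the paragraph above.
freq = novr_transformant_frequency(1e-6, 1e-3)

# Hypothetical number of cells spread on one undiluted plate.
CELLS_PLATED = 1e8

print(f"expected NovR frequency: {freq:.1e}")
print(f"expected colonies from {CELLS_PLATED:.0e} cells: "
      f"{expected_colonies(CELLS_PLATED, freq):.2f}")
```

With these numbers fewer than one colony is expected per plate, so the mutation frequency in the DNA would have to be well above the 10^-6 seen with cell mutagenesis for the assay to register it.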

If I don't?  I could just go ahead and screen a couple of the high-dose CmR pools for hypercompetent mutants anyway, since if I find some then I can just forget about the Nov test.  If I don't find any hypercompetent mutants I should repeat the mutagenesis using a NovS DNA fragment as control.

What mutation rate do I want for my experiment?

I need to decide on a desirable mutation rate for my murE mutagenesis experiment (described here).  To do this I need to think about (at least) how big the gene is, how large a region of the gene I want to investigate, what fraction of mutations will interfere with or eliminate gene function, and what fraction of mutations might cause hypercompetence.

How big is the gene?  1467 bp (489 aa).

Are hypercompetence mutations  equally likely to occur anywhere in the gene?  The mutations we have are in domain 3, at amino acids 361 and 435, so maybe other mutations would be nearby.  But maybe not.  Let's first consider the whole gene, and then decide * if focusing on the last third of it would make any difference.

What are the expected frequencies of mutations with different effects?  About 50% of random base changes change an amino acid (surely someone has done this calculation...).  Since all three of our known mutations change an amino acid, let's assume that silent mutations don't affect competence. About 34% of random amino acid changes interfere seriously with protein function (Guo et al. 2004).  Our known mutants appear to have normal MurE catalytic function, and defective mutants will not show up in our screen because murE is an essential gene.  So that leaves about 1/3 of all the mutations as causing well-tolerated amino acid substitutions.

What fraction of well tolerated amino acid substitutions cause hypercompetence?  We know of three that do.  How many different amino acids can each codon mutate to?  Probably about 9 or 10 on average.  So let's say we have 500 codons of interest, that's about 5000 different possible amino acid substitutions.  About 2/3 of these will be well-tolerated.  So we know that 3 out of 3,300 amino acid changes cause hypercompetence.  Other mutations may cause hypercompetence too, but since half the mutations will be silent, this lower-bound means that at least 1/2000 colonies with a single murE mutation can be expected to be hypercompetent.  That's pretty good odds, given that our transformation-selection step can enrich 1000-fold for hypercompetence mutations.
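
Here is the same back-of-envelope estimate written out step by step, following the rough reasoning above. All the inputs are the round numbers from this post (10 substitutions per codon is the upper end of '9 or 10'), so the output is an order-of-magnitude lower bound and nothing more.

```python
CODONS = 500                      # roughly the whole gene (489 aa)
SUBSTITUTIONS_PER_CODON = 10      # 'about 9 or 10 on average'
TOLERATED_FRACTION = 2.0 / 3.0    # ~34% of changes seriously impair function
KNOWN_HYPERCOMPETENCE_CHANGES = 3
NON_SILENT_FRACTION = 0.5         # about half of base changes alter an amino acid
ENRICHMENT = 1000.0               # fold-enrichment from the transformation selection

possible_substitutions = CODONS * SUBSTITUTIONS_PER_CODON
tolerated_substitutions = possible_substitutions * TOLERATED_FRACTION

# Lower bound: only the three known changes are counted as hypercompetent.
fraction_of_tolerated = KNOWN_HYPERCOMPETENCE_CHANGES / tolerated_substitutions

# Half of the single mutations are silent, so halve the frequency.
fraction_of_single_mutants = fraction_of_tolerated * NON_SILENT_FRACTION

# The enriched frequency cannot exceed 1.
after_enrichment = min(1.0, fraction_of_single_mutants * ENRICHMENT)

print(f"possible substitutions:  {possible_substitutions}")
print(f"tolerated substitutions: {tolerated_substitutions:.0f}")
print(f"hypercompetent, per single-mutation colony: "
      f"1 in {1 / fraction_of_single_mutants:.0f}")
print(f"after 1000-fold enrichment: {after_enrichment:.2f}")
```

The result is about 1 in 2200, which rounds to the 1/2000 used above; after a 1000-fold enrichment a large fraction of the selected colonies would be expected to be hypercompetent.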

So an average of 1-2 mutations per kb should give us easy-to-find hypercompetence mutations. Will higher mutagenesis give us more? Issues to consider:
  1. More mutations means more non-tolerated mutations, which means that some hypercompetence mutations won't be seen because their cells died.  I don't think this is a big deal, unless we made the mutation rate very high.
  2. More  mutations means more irrelevant mutations in each gene we sequence.  This is important.  Inference will be greatly simplified if genes from hypercompetent cells have only one mutation.  So it's probably best to  use the lowest level of mutagenesis that will give us easily-detected mutants.
The Lai et al paper had 5-6 mutations per kb. This is probably too high for us.

Another concern is mutations in the genes between the CAT cassette and murE.  Some of these are essential, and mutations in them will reduce the frequency of recovering viable transformants that contain both the CAT cassette and murE.  This is another reason to go for a low mutation rate.

* Back to a previous point.  Does it matter whether we want to screen only the last third of the gene?  No, because we don't have any way to isolate this from the rest.

murE mutagenesis planning

 (Edited on March 1, after more thinking and planning.)

I want to create a pool of cells with random point mutations in the H. influenzae murE gene, and to select and screen this pool of cells for hypercompetent mutants.  I'm going to do this by mutagenizing the DNA with the chemical mutagen ethyl methanesulfonate (EMS) in vitro and then transforming it into cells, rather than mutagenizing cells.

One unanticipated benefit of the in vitro method is that the mutation spectrum is better.  With in vivo mutagenesis, EMS produces mainly GC-to-AT transition mutations by alkylating guanines in DNA, creating O-6 ethylguanine which mispairs with T instead of C during DNA replication (info from Wikipedia). But the in vitro work found a much less biased distribution, with 42% GC-to-AT transitions, 34% AT-to-GC transitions, and 24% GC-to-CG transversions.


Step 1. Cut chromosomal DNA of strain RR805 with the restriction enzymes KpnI and BglII. This strain contains the wild type allele of murE (oops, I originally wrote RR797 here, but that strain has the murE749 hypercompetence allele; the strain I want is RR805), and has a chloramphenicol resistance cassette inserted about 5 kb away from the site of the known murE mutations. This digest creates an 8 kb fragment that contains both the CAT cassette and the wild type murE allele.

Here's the map:

This pre-digestion step could probably be omitted if necessary, because random fragmentation of the DNA will accomplish almost as much. But it shouldn't hurt, and it might double the frequency of cotransformation.  But I just looked at some old cotransformation data, and I see 60-70% linkage (selecting for CmR gives the linked murE allele), which is very good.

Step 2. Soak this DNA in an EMS solution for 1 hr.

Step 3. Wash the DNA and transform it into competent wildtype cells.  Use about 100 ng DNA per ml, so that each cell is likely to recombine only a single DNA fragment.  As a control, transform the same cells with DNA from the chloramphenicol-resistant murE749 strain.

Step 4. Select for chloramphenicol resistance, to enrich for cells that have recombined in murE.  This will also confirm that the level of DNA damage was not so high as to limit transformation.  I should be able to get many thousands of independent transformants.

Step 5. Pool chloramphenicol resistant colonies, creating separate pools from independent sets of transformants.  Aim for about 5 pools.  Freeze some of the cells of each pool.  Make a pool for the control transformants too.

How many colonies should be in each pool? I want enough colonies per pool that each is likely to contain at least one hypercompetent mutant - how many colonies will this be?  I know of three mutations that produce hypercompetence, which would let me predict the minimum expected frequency of hypercompetent colonies if I knew the frequency of mutations in the DNA and the degree of linkage in the transformation.  I can measure linkage by doing colony assays on the control transformation.  The enrichment can increase the frequency of hypercompetence by 1000-fold, if all the mutants are as hypercompetent as the ones we have.  So if the frequency of hypercompetence in the chloramphenicol-resistant transformants is 1/1000, I should put at least 1000 colonies in each pool.  If it's less, I should put more. 
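
One way to put numbers on 'likely to contain at least one' is sketched below. It assumes the colonies in a pool are independent draws, each hypercompetent with the same frequency f, so the chance of at least one mutant among N colonies is 1 - (1 - f)^N. The 95% target is an arbitrary choice for illustration.

```python
import math


def prob_at_least_one(frequency, n_colonies):
    """Chance that a pool of n_colonies contains at least one mutant."""
    return 1.0 - (1.0 - frequency) ** n_colonies


def colonies_needed(frequency, target_probability):
    """Smallest pool size giving at least target_probability of one mutant."""
    if not 0.0 < frequency < 1.0 or not 0.0 < target_probability < 1.0:
        raise ValueError("frequency and target_probability must be between 0 and 1")
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - frequency))


f = 1.0 / 1000.0  # the example frequency used above
print(f"1000 colonies: P(at least one mutant) = {prob_at_least_one(f, 1000):.2f}")
print(f"colonies needed for 95%: {colonies_needed(f, 0.95)}")
```

Under these assumptions a pool of 1000 colonies at a frequency of 1/1000 has only about a 63% chance of containing a mutant, and roughly 3000 colonies would be needed for 95%, so 1000 really is a minimum.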

Step 6. Grow the pooled cells in sBHI at low density for a few hours, then transform with cloned or PCR'd NovR DNA (or a different marker?).  Plate on nov plates.  Do this with the control murE749 transformation too.

Step 7. Screen individual NovR colonies for hypercompetence by touching them to nov plates and then resuspending the rest of the cells in sBHI containing MAP7 DNA and plating on Kan (or Nov?) plates.  Do only 10 colonies per pool, or 1/1000 as many colonies as went into the pool?  I expect most of the control colonies to be hypercompetent.

Step 8. For each pool, pick one or two high-transformation colonies from their toothpicked plate, and retest their competence with a simple time course.

Step 9.  PCR and sequence the murE genes from 5 or 10 of the confirmed hypercompetent mutants (depending on how many I get, of course).  Are the known mutations present?  New mutations?

First we should test different levels of mutagenesis:  

The protocol we have (Lai et al.) says to use 1 µg DNA in 20 µl 10 mM EMS for 1 hr; this gave 5-6 mutations per kb in the clones they sequenced.  It also reduced the transformation efficiency of the plasmid insert they mutagenized to about 60%.  If they carefully standardized the amounts of DNA, this reduction should have been a direct consequence of DNA damage and repair processes, since they were not selecting for function of their mutagenized insert.

5-6 mutations per kb sounds pretty good for us (but see next post, which suggests we want fewer), since about half of them will be silent, but I think we should first try a wide range of concentrations.  For the cell mutagenesis (many years ago) I used 50 mM for 45 min and 80 mM for 30 min (RR expt # 181), but we want much heavier mutagenesis here.  So here let's try 0, 2, 5, 10, 20, 50, and 100 mM - that's 7 DNA samples to do transformations with.

Two assays for the extent of mutagenesis:  

1. (To identify an optimal concentration) Mutations creating low-level resistance to novobiocin: Mutagenize any novS DNA (e.g. RR805) and transform into KW20 and select for low-level novobiocin resistance (1 µg/ml rather than 2.5), to check the efficacy of the mutagenesis.  There should be an optimal dose of EMS, above which the frequency of nov resistance drops because the DNA is too damaged to recombine or contains too many mutations that block gene function.

2. (To identify concentrations that are too high) Mutations that inactivate the CAT cassette:  Mutagenize RR805 DNA and transform KW20 to chloramphenicol resistance. At some EMS dose the transformation frequency will decrease because the DNA is too damaged to recombine or contains too many mutations that block gene function.  (This test could also be done with any point mutation creating antibiotic resistance.)

What we know about the competence-regulon gene comM

The grad student of an upstairs colleague has been doing a lot of excellent work on the Rhodobacter capsulatus homologs of some H. influenzae competence genes, because he has discovered that they are also needed for gene transfer by GTA, the phage-related 'gene transfer agent'.

One of the genes he's looking at is comM.  ComM is predicted to be a cytoplasmic protein, a member of the YifB subfamily of AAA-ATPase proteins.  Here's a review about the AAA+ superfamily.  These proteins have a very diverse range of activities, so it's hard to make any prediction about a likely function for ComM from looking at its relatives.

ComM was originally studied in H. influenzae, by Michelle Gwinn and Jean-Francois Tomb in Ham Smith's lab.  They reported that their comM mutant had normal DNA uptake but reduced transformation (down about 300-fold).  It had normal expression of a lacZ fusion to another competence gene, indicating that it didn't affect regulation of competence.  (It also had reduced phage recombination, but we still don't know what this assay means.)

To find out why transformation was reduced, they followed the fate of end-labelled DNA fragments. The kinetics were like those of both wildtype cells and a rec1 mutant (rec1 is the H. influenzae homolog of E. coli's recA; it's absolutely needed for homologous recombination). So the authors concluded that the comM knockout does not affect the transport of DNA into the cytoplasm.  But their data doesn't distinguish between an effect on DNA degradation (indirectly preventing recombination) and a direct effect on recombination.

We've independently created a comM knockout; its DNA uptake is also normal, and its transformation is also down, but only about 20-fold.  We haven't done anything more to evaluate its phenotype.

*Interestingly, Gwinn et al. commented that "In addition, HI1117 has homology to a magnesium chelatase gene of Rhodobacter capsulatus bchI, involved in bacteriochlorophyll biosynthesis (1) and to related genes from other photosynthetic organisms."

Grant proposal's done! What experiment shall I do?

I clicked 'Submit' on my grant proposal last night; my immediate teaching responsibilities are light, and there's nothing else big on my plate, so now I get to start doing experiments again!

I think the most fun thing to do will be to join the sabbatical visitor and the co-op tech in doing mutant hunts for hypercompetent strains.  They're mutagenizing the rpoD gene and screening for new mutations that cause hypercompetence, and I can use the same methods on the murE gene.

This old post describes what we know about the relationship between murE and competence.  Well, what we used to know, because now we have new RNA-seq data that will tell us how transcription changes.  Basically, we have four independent mutants that all cause very similar extreme-hypercompetence phenotypes.  murE749 is the main one we've studied.  Some lacZ-fusion analyses indicate that it acts by causing overexpression of genes in the competence regulon (we looked at two genes) and one low-quality microarray appeared to confirm this and (maybe?) show some overexpression of sxy (the regulatory protein that controls expression of the competence regulon).

We assume that the other murE mutations act the same way.  But we have absolutely no idea how the mutations cause the phenotype.  MurE is an essential cytoplasmic enzyme in the pathway that synthesizes the cell wall.  The mutants all grow normally (though we haven't done a BioScreen run), and are not unusually sensitive to any simple test of cell-wall function.
One big part of the puzzle is how the mutations change the protein. The diagram above shows that three of the mutations change a poorly conserved amino acid (at position 435); these changes wouldn't be expected to have any serious impact on the enzyme's catalytic function.  So how do they have such a big impact on competence?

On the other hand, the mutation in murE751 changes the strongly conserved leucine at position 361 to a very different amino acid (serine).  Leucine is hydrophobic but serine is polar, so they make very different interactions with their surroundings.  Because this leucine is highly conserved we think it must play an important role in the enzyme's catalytic function.  This would explain how the mutation can have a big effect on competence, but leaves us instead wondering why it doesn't have a big effect on cell growth.

I need to do several things:
  1. Update my reading to find out what's been learned about MurE function since we published our paper way back in 2000.
  2. Dig into the new RNA-seq data to see what it tells us about RNA changes in the murE749 mutant.  This will require finally learning some R and/or getting help from other lab members.
  3. Isolate new murE mutations that also cause hypercompetence.
Lots of fun!

Yes, I'm still here

The last-chance-for-everyone CIHR grant proposal deadline is Friday at 8:30 am!  After that's in, I promise to get back to the bench and back to proper research blogging.

Is there DNA in oreos?

I have to weigh in on this.

I spend a lot of time discussing the idea that bacteria can use DNA as a source of nutrients, and audiences are always surprised when I show this graphic and point out that DNA is ubiquitous in our foods.  And these are scientifically sophisticated molecular biologists and microbiologists.

So it's not at all surprising to me that 80% of the general public would check 'Yes' when asked, in a set of survey questions about food labelling and regulation, whether foods containing DNA should be labelled as such.  Instead of laughing at their ignorance we should think about how much expert knowledge is needed to evaluate this issue.

Many people, if guided by a series of prompting questions, could figure out that there's probably some DNA in at least some natural foods. But would you expect someone who hadn't taken high-school biology, or took it a long time ago, to know the answers to any of these questions?
  • Is there DNA in meat?
  • Is there DNA in leaves?
  • Is there DNA in potatoes?  In rice?  In noodles?
  • Is there DNA in fruit?  What if you don't eat the seeds?  In fruit juice?
  • Is there DNA in beer?  In wine?  In scotch? 
  • Is there DNA in flour?  In butter?  In olive oil?  In oreos?
  • Is DNA destroyed by being cooked?
  • Does DNA break down (like some vitamins) when food is stored?
  • Does DNA dissolve in water?

RNAseq success!

The sequences are back for our big RNAseq project, and the big good news is that the RNA preps were of good enough quality to give useful sequences for all the samples!  This was very much in doubt, because the Bioanalyzer characterization of the RNA samples showed almost no detectable mRNA-sized molecules in the samples we tested.

Now that we have all this data, we need to decide how to analyze it.  It's not just a big dataset but a very rich one, since the samples differed in what can be considered to be three independent directions, all of which are underlain by rich sets of biological information about phenotypes and molecular events.

  1. Each sample was in one of two different culture media, either rich growth medium (supplemented brain-heart infusion, sBHI) or the competence-inducing starvation medium M-IV.
  2. Each sample is part of a time course, either three different cell densities in sBHI or four time points of a culture transferred to M-IV.
  3. Each sample has a specific genotype: wildtype, knockout mutations in well characterized competence-regulating genes (sxy or crp or cya), mutations that cause hypercompetence (sxy-1, murE747 or rpoD753), and other mutations that affect competence by unknown mechanisms (knockouts of hfq and of one or both members of the mysterious toxin/antitoxin system).
  4. Most samples have one or two replicates, from independent cultures usually on different days.
Ideally we would first characterize the data quality of each sample, and decide if we need to apply any constraints to its use.  Then we'd do a very meticulous analysis of the wildtype cultures to identify the genes that change when cells become competent, followed by analysis of the sxy and crp/cya knockouts to identify the genes that are specifically responding to these regulators.  This would let us identify all the genes we know we should pay attention to when looking at the effects of the other mutations.
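One way to keep this design straight while several of us work on the data is to write it down as a small data structure.  The Python sketch below is only an illustration: the field names, the time-point labels and the three-tier ordering function are hypothetical, chosen to mirror the list and the analysis order described above, and the real samples may not cover every combination.

```python
from dataclasses import dataclass

# Conditions described above: three cell densities in sBHI, four time points in M-IV.
# The labels ("density_1", "t1", ...) are hypothetical placeholders.
CONDITIONS = (
    [("sBHI", f"density_{i}") for i in range(1, 4)]
    + [("M-IV", f"t{i}") for i in range(1, 5)]
)

# Knockouts of the well characterized competence regulators named above.
REGULATOR_KNOCKOUTS = {"sxy", "crp", "cya"}


@dataclass(frozen=True)
class Sample:
    genotype: str   # e.g. "wildtype", "sxy", "murE747"
    medium: str     # "sBHI" or "M-IV"
    timepoint: str  # one of the placeholder labels above
    replicate: int  # 1, 2, ...


def analysis_tier(genotype: str) -> int:
    """Analysis order: wildtype first, then regulator knockouts, then everything else."""
    if genotype == "wildtype":
        return 1
    if genotype in REGULATOR_KNOCKOUTS:
        return 2
    return 3


# A few made-up samples, sorted into the order in which they would be analyzed.
samples = [
    Sample("murE747", "sBHI", "density_2", 1),
    Sample("crp", "M-IV", "t3", 1),
    Sample("wildtype", "M-IV", "t1", 2),
]
ordered = sorted(samples, key=lambda s: analysis_tier(s.genotype))
for s in ordered:
    print(analysis_tier(s.genotype), s)
```

A table like this could just as easily live in the shared spreadsheet; the point is only that every sample carries its medium, time point, genotype and replicate number.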

But I don't have a regimented team of minions to do exactly what I tell them.  There will be several of us working on this data set, with different skill levels and research goals.  So here's my tentative plan to keep us at least informed about what each other is doing.

Group blog: I've set up a group blog on Blogger, called The Sense Strand (great name, right?).  Each of us needs to post there to tell the others what we've done and what we've learned.  These posts should be in plain English; this is not a place for data files or code.

Gene-info: We have a big table of information about the known competence genes, and I'm going to convert this into a Google Docs spreadsheet that we all can edit, adding new genes and new information as we develop it.

Shared data files: We've created a shared folder on Google Drive, where we all will post copies of the useful data files we generate.

Code repository:  Finally, I've just learned how to create a shared code repository on GitHub (I'm doing the short Coursera course A Data Scientist's Toolbox, in preparation for their short R Programming course.)  We'll all use this to archive copies of the code we use to do our analyses.

Transformations with dirty DNA

The previous post considers ways to test the effects on transformation of chromosomal proteins bound to donor DNA, by gently lysing donor cells and transforming recipient cells with the crude lysate.  We've now done one experiment trying out some methods.

We used donor cells resistant to novobiocin.  We tried freezing the cells without adding the usual 16% glycerol as cryoprotectant, vortexing them with a drop of chloroform or in 0.0001% SDS to disrupt the membranes, with or without pretreatment with EDTA and lysozyme (0.1 mg/ml, 1/10 the normal concentration) to break down the cell walls.

One question was how efficiently different methods would kill all the donor cells.  If some donor cells remain alive they will form colonies on the selective plates used to isolate transformants, confounding measurements of transformation.  Freezing was surprisingly effective.  Many aliquots of a single 'late-log' culture (OD600 = 1.0) were originally frozen at -80 °C in their normal growth medium.  After thawing, 20-40% of the cells were still viable.  All the treatments (listed below) decreased viability, but most left 10,000 or more viable cells per ml.  But after the various treatments we froze the cells again, this time at -20 °C, and when these were thawed there were no viable cells.  This is good, because freezing/thawing is very easy and we don't expect it to disrupt the associations of chromosomal proteins with DNA.

We had 5 treatments:

  1. just chloroform
  2. just SDS
  3. lysozyme then chloroform
  4. lysozyme then SDS
  5. just lysozyme

After the various treatments we pelleted the remaining cells and/or debris, and used 10 µl and 100 µl of each sample as 'donor DNA' in transformations of sensitive wildtype cells.  We got LOTS of transformants from all the treated samples, sometimes as many as or more than from an equivalent amount of purified DNA.  There were differences between samples but the causes are unclear.

Next steps:

Freezing/thawing:  try just -20 °C with no other treatment.  Does this kill all the cells?  Does it liberate transforming DNA?

Try less SDS plus lysozyme, and try a milder detergent.

To characterize the DNA, try running the treated samples in an agarose gel, say 0.5% agarose so that large DNA enters the gel.  Run samples ± lots of SDS, to see what difference bound proteins might be making.

Do chromosomal proteins on 'donor' DNA affect transformation?

I've started polishing my not-quite-good-enough CIHR proposal for what's called the 'Transitional Open Operating Grant Program' competition.  This is the last-chance-under-the-old-system competition; any future proposals will be evaluated under the new system, which doesn't have much use for pure science or small labs. Proposals are due March 2 2015.

Before then I'd like to do some preliminary experimental work on one of the studies I'm proposing. I expect that the H. influenzae DNA available to H. influenzae cells in the host environment will still have bound to it many of the normal chromosomal proteins (HU, H-NS, Fis), and might even retain aspects of the normal nucleoid structure.  I want to find out how this affects the ability of the DNA to be taken up by competent cells, particularly whether specific sequences or segments are affected more than others.

My poorly thought-out plan is to lyse donor cells carrying one or more antibiotic resistance alleles in a way that doesn't disrupt bindings of proteins to DNA (so not with 1% SDS), and then mix this lysate with competent sensitive cells.

Big problems I foresee:

1.  How to lyse the donor cells without disrupting the nucleoid proteins?

My original plan was to use the H. influenzae phage HP1.

I could probably instead lyse the cells with lysozyme and a small amount of a surfactant that disrupts membranes but not proteins.

2.  How to lyse the donor cells without killing the recipient cells?

If the recipient cells are lysogenic for this phage (I have such a strain in the freezer), then they will be resistant to the free phage in the lysate.  An undiluted lysate has an enormous number of phage (>10^10 per ml), but I'll dilute the lysate to a chromosomal DNA concentration of about 1 µg/ml, which would be saturating for uptake if the DNA had been purified.
(Hmm, what will be the chromosomal DNA concentration of a lysate?  Say 3x10^9 cells/ml, most of them lyse, 1,830 kb of DNA/cell, 10^-12 µg of DNA/kb...  That's only about 4 µg of chromosomal DNA per ml.)
I can dilute the DNA way more than this, because transformation has such a wide sensitivity range.  I could also make it difficult for the phage to infect by eliminating Mg, but that might also hinder DNA uptake (the normal competence medium is 10 mM Mg).  If I had antibodies to the phage I could block infection that way (we used to do this with lambda), but I don't.
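The back-of-the-envelope estimate above can be checked with a few lines of Python.  This is a minimal sketch using the numbers in the text; the 75% lysis fraction is my assumption standing in for 'most of them lyse', not a measured value.

```python
# Rough estimate of chromosomal DNA concentration in a crude lysate.
CELLS_PER_ML = 3e9   # cell density before lysis (from the text)
GENOME_KB = 1830     # kb of DNA per cell (from the text)
UG_PER_KB = 1e-12    # approximate mass of 1 kb of double-stranded DNA, in µg


def lysate_dna_ug_per_ml(cells_per_ml, genome_kb, ug_per_kb, fraction_lysed=1.0):
    """Chromosomal DNA released per ml of lysate, in µg."""
    return cells_per_ml * fraction_lysed * genome_kb * ug_per_kb


# Upper bound: every cell lyses.
full = lysate_dna_ug_per_ml(CELLS_PER_ML, GENOME_KB, UG_PER_KB)

# Assumed case: 75% of cells lyse (hypothetical value for "most of them").
partial = lysate_dna_ug_per_ml(CELLS_PER_ML, GENOME_KB, UG_PER_KB, fraction_lysed=0.75)

# Fold dilution needed to bring the lysate down to 1 µg/ml of DNA.
TARGET_UG_PER_ML = 1.0
dilution = partial / TARGET_UG_PER_ML

print(f"complete lysis: {full:.1f} µg/ml")
print(f"75% lysis:      {partial:.1f} µg/ml")
print(f"dilute about {dilution:.0f}-fold to reach {TARGET_UG_PER_ML} µg/ml")
```

With complete lysis the estimate comes out somewhat above 5 µg/ml, so 'about 4 µg' corresponds to roughly three quarters of the cells lysing.  Either way only a few-fold dilution is needed to get to 1 µg/ml.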

If I lysed the cells with surfactant, I could then easily dilute the lysate to reduce the concentration of the surfactant to a concentration that wouldn't harm the recipient cells (10-fold or 100-fold wouldn't be a problem).

3.  How to kill off or remove all the donor cells that aren't lysed, without disrupting the nucleoid proteins?

I don't know if it would be possible to pellet the cells without pelleting the nucleoids, especially since the nucleoids and DNA may be still attached to the cell wall.  Or to filter out the cells without removing or disrupting the nucleoids.

I don't really need intact nucleoids, but I'd like to still have the proteins bound to much of the DNA.

How can I check the state of the DNA?  I hope I wouldn't need to use electron microscopy.  Maybe I could use a simple very-low concentration agarose gel? Like an Eckhardt gel, but interested in the DNA+gunk, not the megaplasmid?

4.  How to kill off or remove all the donor cells that aren't lysed, without killing the recipient cells?

If the recipient cells are already resistant to an antibiotic that the donor cells are sensitive to, I can include this antibiotic in my selective plates.

Or maybe I could kill them by adding a bit of chloroform to the lysate, and then diluting or evaporating the chloroform before I add the lysate to the cells.  Chloroform is normally used to sterilize phage lysates, but I don't know if it would affect the nucleoid proteins.

Interaction of a ∆hfq mutation with the rpoD mutation

Here's the last part of the summary of what our senior co-op technician has been doing.  The last set of experiments tested the interaction between a new ∆hfq mutation we've been studying (which reduces competence) and the rpoD mutation (which increases competence).

The Hfq protein binds small regulatory RNAs, helping them to form base-pairs with the mRNAs that they regulate.  In other bacteria we know that this base pairing can either reduce the mRNA expression (usually mediated by RNase E degradation) or increase it (by reducing the effect of otherwise-inhibitory mRNA secondary structure).  Our ∆hfq mutation reduces competence in MIV-induced cells by about 10-fold, suggesting that Hfq normally increases the translatability of sxy mRNA.

The technician tested whether this effect is still seen in cells with the rpoD mutation.  She first had to construct the double-mutant strain.  This was relatively easy because the ∆hfq mutation is 'marked' with a SpcR cassette, and the honours student who's studying this mutation had already made chromosomal DNA.  So she used his chromosomal DNA to transform strain RR753 (rpoD mutant) to spectinomycin resistance.

She then did a competence time course, following development of competence in rich medium in four strains.  In the graph below we see that the ∆hfq mutant (green line) develops competence later than the wildtype cells (KW20, blue line) and to a lower final level.  This mutation also reduces the competence of the rpoD mutant (compare red and purple lines), although not as severely.

So we conclude that, whatever Hfq is doing to promote competence, it's still at least partly needed by the rpoD mutant.

The honours student has been analyzing the interactions of the ∆hfq mutant with other factors and other hypercompetence mutations.  I'll do a separate post pulling this together, unless he does it on his blog first.

Effects of cAMP and AMP on competence development by the rpoD mutant

This is a continuation of yesterday's post on the phenotype of our hypercompetent rpoD mutant strain RR753.  Yesterday we wrote about its behaviour under 'normal' growth conditions, and now we're going to consider two new factors: cyclic AMP (cAMP), which induces competence under what are otherwise non-inducing conditions, and AMP, which inhibits competence development under what are normally inducing conditions.

First the effect of adding cAMP: We tested this by adding 1 mM cAMP to cells growing exponentially at an OD600 of 0.1, and measuring transformation 60 min. later.  At this growth stage, normal cells do not transform detectably, but addition of cAMP turns on sxy transcription.  Some of the resulting sxy mRNA is translated and the Sxy protein acts with CRP to stimulate transcription of genes encoding the DNA uptake machinery.  In the tech's experiment, cAMP addition raised the transformation frequency about 500-fold, from 1-3 x 10^-8 (just at the detection limit) to 6.5-8 x 10^-6.  The rpoD mutant is somewhat transformable even without cAMP (1-3 x 10^-6), and cAMP addition raised this about 100-fold, to ~2 x 10^-4.
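Because the frequencies are reported as ranges, the 'about 500-fold' and 'about 100-fold' figures are really midpoints of fairly wide intervals.  Here is a minimal Python sketch of that arithmetic, taking the rpoD baseline without cAMP as 1-3 x 10^-6 (my reading of the value in the text); the function name is just for illustration.

```python
def fold_change_range(before, after):
    """Smallest and largest fold increase consistent with two (low, high) ranges."""
    before_low, before_high = before
    after_low, after_high = after
    smallest = after_low / before_high   # least induced vs. highest baseline
    largest = after_high / before_low    # most induced vs. lowest baseline
    return smallest, largest


# Transformation frequencies without and with 1 mM cAMP, from the text.
wildtype = fold_change_range(before=(1e-8, 3e-8), after=(6.5e-6, 8e-6))
rpoD = fold_change_range(before=(1e-6, 3e-6), after=(2e-4, 2e-4))

print(f"wildtype: {wildtype[0]:.0f}- to {wildtype[1]:.0f}-fold")
print(f"rpoD:     {rpoD[0]:.0f}- to {rpoD[1]:.0f}-fold")
```

Both strains show an induction of at least tens of fold even at the conservative end of the range, which is all the conclusion below needs.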

So we conclude that the rpoD mutant does not bypass the need for cAMP in competence induction. This rules out the boring hypothesis that changing Sigma 70 activity perturbs cellular metabolism, causing an elevation of baseline cAMP levels that in turn causes the mutant's increased competence. Instead it's consistent with our interesting hypothesis that the rpoD mutation likely changes one or more events after the sxy transcription is stimulated by cAMP and CRP.

Next, the effect of adding AMP:  Maximal competence is normally induced by transferring exponentially growing cells from rich medium to a 'starvation' medium called 'MIV' and incubating them for 100 min.   Previous work has shown that adding purine nucleotides or nucleosides (usually 1 mM AMP) to the MIV prevents normal competence development by reducing the translation of sxy mRNA (Sinha et al. 2013).  The next experiments tested whether AMP has the same effect in the rpoD mutant.

Both wildtype and rpoD mutant cells have high transformation frequencies after incubation in MIV. In these experiments the rpoD cells had slightly higher transformation frequencies (about 4 x 10^-3) than the wildtype cells (about 1 x 10^-3).  Adding AMP to the MIV used to induce competence reduced the transformation of wildtype cells more severely than seen in previous work: about 5000-fold (from 1.7 x 10^-3 to 3.4 x 10^-7) in one replicate and at least 10,000-fold (from 3 x 10^-4 to 3 x 10^-8) in the other.  (Both replicates gave no transformants at all with added AMP, so the with-AMP frequencies are upper limits.)  The AMP also reduced the viability of the cells by several fold.

Added AMP also reduced the transformability of the rpoD mutant.  The first replicate gave no transformants (the plated cells were too dilute), indicating that transformability was reduced at least 1000-fold, and the second replicate showed a ~6700-fold reduction.
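When no transformants are recovered, the frequency with AMP is only an upper limit, so the fold reduction calculated from it is a lower bound.  A minimal Python sketch of this, using the wildtype frequencies quoted above (the function name is hypothetical):

```python
def min_fold_reduction(untreated_freq, treated_upper_limit):
    """Lower bound on the fold reduction when the treated value is an upper limit."""
    return untreated_freq / treated_upper_limit


# Wildtype transformation frequencies (without AMP, with AMP) for the two replicates.
wildtype_replicates = [
    (1.7e-3, 3.4e-7),
    (3e-4, 3e-8),
]

reductions = [min_fold_reduction(no_amp, amp) for no_amp, amp in wildtype_replicates]
for fold in reductions:
    print(f"reduced at least {fold:.0f}-fold")
```

The same reasoning applies to the rpoD replicate that gave no transformants: plating more cells would tighten the limit, but could only make the estimated reduction larger.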

These numbers are all lower than previous results, but the conclusion is clear that adding AMP to the MIV medium strongly inhibits the development of competence in the rpoD mutant.  We don't know how the added AMP caused the reduction in competence, but, based on other evidence from analysis of purine-biosynthesis mutants, I've hypothesized that the key factor is a decrease in the concentration of another metabolite (PRPP), which may interact with sxy mRNA.  Production of Sxy by the rpoD mutant is still sensitive to this effect, so... (I don't know what).

Phenotype of the rpoD mutant

This mutation causes H. influenzae cells to become competent prematurely, and to reach levels of competence in rich medium that are about 100 times higher than those of normal cells.  The mutation causes a single amino acid substitution in domain 3 of the 'housekeeping' transcription factor called 'sigma 70'.  rpoD is an essential gene, needed for transcription of most housekeeping genes.  Since the mutant strain (named RR753) shows only a very slight decrease in exponential growth rate we think the mutation causes only a very minor change in the protein's function.

My earlier post this morning said I had never explained my hypothesis about how this mutation causes increased competence, but actually I did, briefly here.  Here are the key sentences from that post:
My hypothesis is that the mutation's effect on transcription of sxy mRNA increases competence by increasing sxy translation.   I've long hypothesized that slowing elongation or increasing pausing in the 100 nt segment of sxy mRNA that forms its regulatory secondary structure will promote sxy translation by increasing the ribosome's access to the sxy ribosome-binding site and start codon. 
Then I listed some low-tech phenotypic analyses that the senior of our two co-op technicians could do:
  • Is RR753 sensitive to the inhibition of competence by added purines?
  • What's the effect of an hfq deletion in this background?
  • How does this strain respond to added cAMP?
  • How does it respond to the standard competence-inducing MIV treatment?
  • Does the mutation increase competence of a sxy mutant (sxy6) that has an extra-stable secondary structure?
  • Does it further increase log-phase competence of the sxy hypercompetence mutants, which have weakened sxy mRNA secondary structures?
She's now done all but the last of these, and we're considering what she should do next.  So here we're going to summarize what she's found and what we think it means.

Her first experiments gave a better characterization of RR753's growth rate.  She's done both Bioscreen growth curves (high-precision analysis of exponential growth) and manual ones (lower precision but better for cultures at low and high densities).

First the Bioscreen results:  Here growth of RR753 (red line) is compared to wildtype cells (KW20) and two other hypercompetence mutants, RR563 (sxy-1) and RR749 (murE).  This clearly shows the slightly slower exponential growth of the rpoD mutant.

The Bioscreen analysis measures OD600, so it doesn't tell us about the actual numbers of viable cells. The tech has also done a number of manual growth curves, mostly as control parts of experiments examining other variables.  These agree with the Bioscreen results in showing usually a slightly slower exponential growth rate, and no obvious differences in later survival.  

Her next time courses replicated my earlier measures of transformation frequency.  It looks like the rpoD mutant differs from the other hypercompetence mutants (in sxy and murE) in having very low competence at very low cell density, perhaps as low as that of wildtype cells.  The other mutants are 100-1000-fold more competent than wildtype cells at very low cell density. The rpoD mutant may also become highly competent at lower cell densities than wildtype cells, but may not be any different from the other hypercompetent mutants - these data are hard to interpret.

What could be the significance of having very low competence at very low cell densities?  I'd been assuming that the moderate competence of the sxy-1 hypercompetent mutants at low cell density reflected a baseline level of sxy transcription and an increased efficiency of translation.  If the rpoD mutation acts as I've hypothesized, it should have the same effect.  Provided the cells are in real exponential growth, the cell density shouldn't matter.  Might the rpoD mutant have two different 'exponential' growth phases, one at very low cell density and another at moderately low cell density?

Is this an important issue?  It would be quite a bit of work to investigate carefully, so let's set it aside for now.

The next experiments analyzed the effects of adding cAMP (known to stimulate transcription of sxy) and AMP (known to reduce translation of sxy mRNA).  I'll leave these for the next post.

What should our 'senior' co-op tech do next?

We have two co-op (undergraduate) technicians at present (paid from the last of our leftover CIHR funds).  Each is with us for 8 months; one started in May and the other in September, so there's a 4 month overlap.

The senior one has done almost all of the work preparing the samples for the RNA-seq project, and lately she's also been doing competence time courses to characterize the phenotype of our hypercompetent rpoD mutant.  She's looked at growth conditions, at the effects of added cAMP (competence up) and added AMP (competence down), and the effects of knocking out the small-RNA regulator Hfq (down).  Writing this post makes me realize that we haven't summarized her results anywhere, so I'll sit down with her and pull it all together this morning (I think that will be another post).

Now we need to decide whether there are still more rpoD phenotype assays she should do, or whether she should move on to another project for her last two months.  Since I hypothesize that the rpoD mutation causes competence by slowing sxy transcription and increasing its mRNA translatability, she could assay the effects of the rpoD mutation in combination with our various sxy mutations that affect its mRNA translatability.  But this would be very much a fishing expedition, lots of work but probably no new insights (because it's not testing any specific hypotheses).

Hmmm, looking back, I discover that I've never written a blog post clearly explaining the general hypothesis about how the rpoD mutation causes hypercompetence.  I think it's time I did that.