Field of Science

Wildtype strain weirdness

While looking over some of the RNA-seq analyses done by our summer student (former undergrad, someday grad student somewhere), I noticed something unexpected.

We know that cultures in the rich medium sBHI become moderately competent when they reach high density.  Their transformation frequencies reach about 10^-5 - 10^-4, which is 100-fold lower than fully induced cultures but 1000-fold higher than log phase cultures.  Consistent with this, in the microarray analyses we did about 12 years ago we saw modest induction of all the competence genes except ssb in this condition (we didn't publish the data but described it as 4-20-fold induction).

So I expected to see similar induction in rich medium in the RNA-seq data.  But there's no consistent induction; most genes don't show any change at all, and a few go up or down a bit.  You can look at the complete data here, but unfortunately the individual graphs are very low resolution.  Here's a blowup of one pair of graphs, for comE, which encodes the secretin pore:

On the left are cells in the competence-inducing medium MIV.  We see very strong induction of comE in wildtype (KW20) cells (brown dots and line), and no induction when the Sxy regulator is knocked out (blue dots and line).  On the right are cells in rich medium.  Here we see strong induction in the presence of the sxy1 hypercompetence mutation (blue dots and line) but no induction at all in the wildtype KW20 cells.

So the summer student did more analyses.  First he did more work with the RNA-seq data:

1.  The baseline (log phase) levels of expression of each competence gene are the same in cultures grown for the rich-medium experiments and cultures grown for the MIV-induction experiments.  This suggests that there was nothing very wrong with the rich medium cultures.  (The rich medium experiments took their log phase samples from very dilute cultures (OD600 = 0.02), and the MIV-induction ones took theirs at OD600 = 0.2.  Later we plan to use this to test whether both densities are genuine log phase, by seeing if this density difference changes expression of any genes at all.)

2.  The sxy gene is slightly induced as culture density increases, though we don't really know how much it should be induced.

3. The genes required for induction of the competence regulon are intact in the reads of the rich-medium cultures: sxy, crp, cya.  That means we didn't accidentally use a strain carrying a knockout of one of these genes instead of wildtype cells.

4.  To check whether the supposedly wildtype strain might have a knockout of another gene, he looked for reads derived from antibiotic-resistance cassettes.  He found a few reads of a spectinomycin cassette (like the one we have used for many of our knockouts), but the numbers were so small they're probably just contaminants.
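For anyone wanting to repeat the cassette check in point 4, here is a minimal sketch of one way to do it: count the reads that share an exact k-mer with the cassette sequence on either strand. This is not necessarily how the summer student did it. The function names, the k-mer length of 20, and the toy sequences are all assumptions made for illustration; a real check would load the actual spectinomycin-cassette sequence and the real reads.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def kmers(seq, k):
    """Set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def count_cassette_reads(reads, cassette, k=20):
    """Count reads sharing at least one exact k-mer with the cassette (either strand)."""
    index = kmers(cassette, k) | kmers(revcomp(cassette), k)
    return sum(1 for read in reads if kmers(read, k) & index)

# Toy placeholder sequences, NOT a real cassette or real reads.
toy_cassette = "ATGCGTACCGGATTACGATCGATCGGCTAAGCTTAGGCTA"
toy_reads = [
    toy_cassette[5:30],            # read from the cassette, forward strand
    revcomp(toy_cassette[10:35]),  # read from the cassette, reverse strand
    "T" * 25,                      # unrelated read
]
n_hits = count_cassette_reads(toy_reads, toy_cassette)
print(f"{n_hits} of {len(toy_reads)} reads match the cassette")
```

Dividing the hit count by the total number of reads gives a rough sense of whether the hits look like a real insertion or just contamination.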

Then he managed to do comparisons between the RNA-seq data and the old microarray data:

The blue-line graph shows the microarray data.  The Y-axis is fold-change in expression of each competence gene, and the X-axis is time relative to when a separate sample was removed for competence induction.  The red-line graph shows the RNA-seq data, squished to make the spans of its axes roughly consistent with those of the other graph.  The Y-axis is again fold-change in expression, but now the X-axis is the density of the culture, measured as OD600.  

In the microarray data, some genes aren't induced at all but others are induced as much as 11-fold.  In the RNA-seq data, only one gene is induced even 2-fold, and many are down-regulated.  So we definitely do have a problem.

So then it became my turn to do some experiments to figure out what's going on.  Fortunately I had saved frozen samples of the cells used for every RNA sample we analyzed by RNA-seq.  

First I thawed out and transformed the three samples of supposedly wildtype cells at OD600 = 1.0. Their transformation frequencies were all a lot lower than they should have been ('FK': 3.6 x 10^-7; 'GK': 2.5 x 10^-7; 'HK': 7.6 x 10^-7, rather than about 10^-5).  That's consistent with a genuine lack of induction of their competence genes (and not with my alternate hypothesis, that there was just some error in the analysis of this RNA-seq data).
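To put those numbers in perspective, here is a quick sketch of the arithmetic. The three frequencies are the ones reported above; treating 'about 10^-5' as exactly 1e-5 is an assumption made only for the comparison.

```python
# Transformation frequencies reported above for the three thawed samples.
observed = {"FK": 3.6e-7, "GK": 2.5e-7, "HK": 7.6e-7}

# Assumption: the expected frequency "about 10^-5" is taken as exactly 1e-5.
EXPECTED = 1e-5

def fold_below(expected, measured):
    """How many-fold lower the measured frequency is than the expected one."""
    return expected / measured

shortfall = {name: fold_below(EXPECTED, tf) for name, tf in observed.items()}

for name, fold in shortfall.items():
    print(f"{name}: {observed[name]:.1e} is about {fold:.0f}-fold below {EXPECTED:.0e}")
```

On that assumption the three samples fall roughly 13- to 40-fold short of the expected frequency.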

I streaked the cells on spectinomycin plates, to check if they had an unexpected spcR cassette.  None grew, and they all grew on the control plates.  So we didn't use any of our spc-cassette knockout mutants by mistake.

Finally I inoculated two of the supposedly wildtype strains ('GK', from a plain-plate colony of the above transformation test, and 'FK', from its OD600 = 0.02 frozen sample) into rich medium, along with a wildtype control strain, and tested competence development under three conditions.  The left columns show the transformation frequency seen 60 min after adding 1 mM cAMP to a log-phase sBHI culture - I did these in case we didn't get normal transformation in the other tests, since this would tell us if the strains were somehow unable to produce cAMP (e.g. if they had a phosphotransferase mutation).  The middle columns are cells at high density in sBHI; both of the suspect strains have near-normal transformation frequencies.  The right columns are cells transferred to the competence-induction medium MIV; one of the suspect strains has normal transformation and the other is down 10-fold.

(I must admit that I don't have high confidence in the numbers from this experiment, for several reasons.  First, the colony sizes and counts were very erratic (inconsistent from one dilution or plating volume to another), perhaps because I used mostly old novobiocin plates left by our now-gone Co-op tech. Second, the 'latelog' samples were allowed to grow longer than I intended, so their competence may have been decreasing, and this problem was slightly worse for the GK and FK cultures.  Third, I was testing a new competence protocol (see p.s. below).)

So what have we learned?  Mainly that there's nothing obviously wrong with the 'wildtype' cells used for the sBHI RNA-seq samples.

So what should we do next?  I don't know.  Maybe I should repeat the competence-induction tests with fresh plates, to get better numbers.

p.s.  The KW20 MIV transformation frequency is slightly lower than I usually see, probably because I was testing a new scaled-down protocol that doesn't use an expensive disposable filter funnel to collect and wash the cells.  Our usual protocol is to collect and wash 10 ml of cells using a Nalgene disposable filter funnel (0.2 µm pore size, designed for water sampling), and then resuspend them in 10 ml of MIV in a flask shaking in the waterbath for 100 min. But my new attention to economy has revealed that each new funnel costs nearly $7.  So this time I just pelleted 2 ml of cells in a microfuge tube, resuspended the cells in 1 ml MIV, pelleted them again, and resuspended them in 2 ml of MIV in a large glass culture tube, which I incubated on the roller wheel in our air incubator.  The transformation frequencies for KW20 and GK are plenty high.  The lower transformation of the green 'FK' sample may be because I ran out of MIV and skipped its washing step.

Finally the final RNA-seq steps

Yesterday I did the Ribo-Zero treatment of the 24 samples for our final RNA-seq analysis.  It was a bit complicated to plan because I was using a kit designed for only 6 samples (because the kits are so expensive).  Luckily we had leftover reagents from the 72 samples we treated last summer, which let me set up reactions that were 3/4 of the standard volume.

 The kit removes the abundant ribosomal RNA from the sample, leaving only the desired mRNA and some small RNAs for sequencing.  It works by first annealing tagged oligos to the rRNA, and then removing the rRNAs with magnetic beads that bind to the oligo tags (probably biotin?).  We had quite a lot of the buffer and oligo solutions left from last time, but no beads.  A mixup had led to the beads in the new 6-sample kit being ruined by being stored at -80 °C rather than 4 °C, but Illumina very kindly provided us with replacement beads. 

The replacement beads came with a new tube of the solution used to resuspend the beads after their preliminary washing, so I was able to resuspend the beads in 3/4 of the volume needed for 24 samples rather than 1/4.  Using the larger volume was important, because the separation step depends on the right relationship between the liquid in the tube and its position in a special rack with magnets on one side.

The final step is to ethanol precipitate the depleted RNAs and resuspend them in a special buffer for the first library-prep step.  The former RA is going to do the library prep and sequencing in her new lab (she also provided the Ribo-Zero kit), and she's given us the buffer for the resuspension.  Right now the 24 ethanol precipitations are in the -20 °C freezer; this morning I'll spin them down (30 min, because the concentrations are so low), wash the pellets carefully with 70% ethanol, dry them carefully, resuspend them, and deliver them to her.

Then I'll give a big sigh of relief and move on to other projects.

Today's work on the RNA-seq samples

The Co-op tech has pressed half the samples through the RNeasy kit spin columns, and will probably get the rest done today.

But she also is still working on the PCR checks (problems getting amplification from the colony-DNA material) and redoing some of the transformation assays (using the frozen cells I'd saved for each sample) because two of her original tests gave surprisingly low transformation frequencies.

I'm hoping we can also get started on the next steps today - checking the RNA concentrations using the Nanodrop and running aliquots in a gel to check that the rRNA bands are intact.  First step for this is to clean up a gel box and comb and see if we have any RNA-loading dye made up.

Other tasks for the soon-to-be-departing Co-op tech

The Co-op tech will be leaving us at the end of the month.  Last week she gave an excellent lab meeting presentation, and this revealed a couple of loose ends and interesting possibilities that should be cleared up before she leaves.

First, she and the sabbatical visitor (now back home in Regina) isolated a new mutation in rpoD that causes hypercompetence.  But this strain hasn't yet been added to our formal collection of frozen strains and the associated 'Strain List' database.  This is essential and urgent.

Second, the mutant hunt turned up a couple of other 'possibly hypercompetent' mutants whose phenotypes haven't been checked out.  These need to be checked in competence time courses, and added to the Strain List collection if they turn out to be genuinely hypercompetent.

A weird result that I'd forgotten about is the finding that cells transformed with a mixture of NovR and KanR DNA fragments (generated by PCR) showed a much LOWER cotransformation frequency than expected.  We suspect this reflects some chromosome-level interaction between these linked segments, but we'd first need to reproduce the result.

RNA-seq progress

I've collected and frozen all 24 samples for our make-up RNA-seq run (not the Trizol-prep ones - they've been deep-sixed).  And this morning the co-op tech learned to prep RNA from each sample. She's done the first 4, and will do the rest over the next couple of days.

The next steps are:

  1. Complete the PCR tests of strain genotypes and the analysis of transformation frequency data. She's still working on the PCR tests, but so far everything looks OK.
  2. Check the RNA concentration using the Nanodrop
  3. Run aliquots of the samples in a gel to check integrity of the rRNA bands (surrogate for integrity of the mRNA).
  4. Treat 5 µg of each sample with DNA-free.  We found our stock from last year, and there's still enough to treat all our samples.
  5. The former RA says we can take the DNA-free-treated samples directly to the RiboZero step; we don't need to first do a 'clean-up' step with the RNeasy MinElute kit spin-columns.
  6. Treat an aliquot of each sample with RiboZero.  We will use only half as much RNA as recommended, and only 1/4 as much of the other reagents (in 1/4 of the recommended volume, of course). This will let us treat 24 samples with a 6-treatment RiboZero kit.
  7. Give the samples to the former RA in her new lab for library preparation and sequencing.
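The reagent scaling in step 6 is simple, but it's worth checking before opening an expensive kit. A minimal sketch using only the numbers in the plan (a 6-reaction kit, 24 samples); the function name is just for illustration.

```python
from fractions import Fraction

KIT_REACTIONS = 6   # full-size treatments the RiboZero kit provides
SAMPLES = 24        # samples in the make-up run

def reagent_fraction(kit_reactions, samples):
    """Fraction of a standard reaction's reagents available per sample."""
    return Fraction(kit_reactions, samples)

frac = reagent_fraction(KIT_REACTIONS, SAMPLES)
kit_equivalents_used = frac * SAMPLES

print(f"Each sample gets {frac} of the standard reagent amounts")
print(f"Total used: {kit_equivalents_used} full-size reactions")
```

So 24 quarter-size reactions use exactly the 6 full-size reactions in the kit, with nothing left over for a repeat.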

What can I recover from an old failed experiment

About 18 months ago I did a big mutagenesis experiment, intending to isolate new hypercompetent mutations.  I made several mistakes and the experiment was a failure, but I did freeze stocks of intermediate cultures.   At the time I thought that some of these could be used in a future attempt, because they came from stages before the mistakes were made.

I still want to repeat this experiment, and I just found the stocks in the -80 °C freezer.  Now I need to decide which are potentially useful, and throw out the rest.  Here's photos of what I found:


The letters A-G refer to different strains, each with a wildtype version of a gene known to give rise to hypercompetence-causing mutations, and to different levels of mutagenesis:
  • A, B & C: wildtype cells, incubated in 0 (A), 0.05 (B) and 0.08 (C) M solutions of the mutagen EMS.
  • D & E: strain RR514, which has a Streptomycin-resistance mutation (StrR) close to the wildtype sxy gene, incubated in 0.05 (D) and 0.08 (E) M solutions of EMS.
  • F & G: strain RR805, which has a chloramphenicol cassette (CmR) inserted within a few kb of (= closely linked to) the wildtype murE gene, incubated in 0.05 (F) and 0.08 (G) M solutions of EMS.

The big tubes turn out to be useless, since they contain cells that were incubated with the wrong DNA after the mutagenesis.  Most of the small tubes also are from stages that have been incubated with DNA, (e.g. label 'F DNA'), but others (the ones labeled '90') were frozen after 90 min of post-mutagenesis growth, before the DNA addition step.  These ones I can use.

The first step now is to do a test I didn't do in the original experiment, to check that the EMS mutagenesis did indeed cause mutations, by plating some of the cells on low-concentration novobiocin. I'll do this test on the wildtype cells (B & C), so as not to unnecessarily use up the more valuable cells in the marked strains (D-G).  I don't have tubes of the control A culture, so I'll just use normal wildtype cells.

If this test shows that the mutagenesis worked, I have two alternatives.  1.  I could isolate DNA from the mutagenized marked cultures and use it to transform wildtype cells to StrR (D & E, to enrich for cells with sxy mutations) or CmR (F & G, to enrich for cells with murE mutations).  Then I'd enrich these transformants for hypercompetent mutants by transforming them with the PCR'd NovR fragment (after first testing that this works well).  2.  I could do the hypercompetence-selection transformation first, and then isolate DNA and transform wildtype cells with selection for the linked marker.  The advantage of 1 is that I can pool many thousands (millions?) of transformants, maintaining whatever genetic diversity my mutagenesis has created in the gene of interest.

New RNA-seq work

(OK, I just checked, RNA-seq should be hyphenated.)

I've made a couple of posts about plans for new RNA-seq work on the Sense Strand blog.

Now it's time to get down to work.

Here's the planned samples.  We have 26 on the list, but a standard run will only be 24, so two need to be dropped.  Conveniently, the two KW20-in-Trizol samples might not be needed, depending on the available small-RNA data for H. influenzae, so I won't consider those right now.

For the rest, we have 6 mutant strains.  Given the mixups that have occurred so far, it would be prudent to check these every way we can.

We are checking antibiotic resistances and will transform each strain when we grow its culture and collect the samples for the RNA preps. One of the Honours Zoology undergrads has already checked the toxin and antitoxin strains by PCR (she's the one who suffered most from the mixup), and we'll use the same primers to check the cultures we're sampling from.  We need the former RA's help to find the primers for the ∆hfq mutant, but luckily its phenotype is quite distinctive - in our whole collection I can only think of one other mutant that has its competence down 10-fold.  RR753's phenotype is also distinctive: hypercompetent, but not very.  The crp and sxy mutants have the same phenotypes (same drug resistance, same complete lack of competence).  We have lots of sxy PCR primers, but we'll have to check to see which ones will work with this insertion mutant.  And, for both crp and sxy there's the additional problem that miniTn10 insertions do not amplify well because of their end repeats.  I wonder if we have an internal primer for miniTn10kan.

The co-op tech has checked antibiotic resistances and frozen fresh stocks of all the strains.  She's inoculated the 4 strains for the MIV-competence preps, and tomorrow we'll try to collect all those samples.  (No, I haven't done anything in preparation yet.)

Planning the DNA sequencing part of the PhD student's project

The former post-doc (I'll call him the FPD) visited yesterday afternoon, and we had intense discussions of how to proceed with both the RNAseq work (summarized here on our Sense Strand blog) and with the PhD student's planned DNA uptake experiments.

His planned experiments take advantage of the phenotype of a rec2 knockout mutation.  These cells take up DNA normally across the outer membrane, into the periplasmic space, but they cannot transport it across the inner cell membrane.  This allows him to recover intact DNA that has been taken up, and to use DNA sequencing to compare it to the input DNA the ∆rec2 cells were given.

Some of the experiments will use genomic DNA of the species being tested, fragmented to appropriate length distributions, and some will use synthetic DNA fragments (~200 bp) containing a 30-50 bp stretch of random sequence (see figure).

The FPD, who developed the synthetic fragment protocol, pointed out that his experiments had used full lanes of Illumina sequencing only because it was not then possible for us to 'barcode' our different DNA samples and mix them for sequencing as a single lane.  The sequencing depth he obtained was useful, but it will be extreme overkill for the experiments the PhD student plans.  So we need to design barcoding into our analyses, so we can mix up to 24 samples in one lane for sequencing, and then separate the resulting sets of sequence reads by their different barcodes.  We'll still need to use two lanes, because each 'recovered' sample will need to have a corresponding identical 'input' sample.  Because these samples will have the same barcode they could not be distinguished if they were sequenced in the same lane.

So rather than doing one very-deeply sequenced experiment, he'll be able to do multiple replicates, each sequenced at a moderate but entirely adequate depth.  If he uses a HiSeq machine for the sequencing, he'll be able to get 1.6 x 10^8 reads for each of 12 samples; with a NextSeq this would give 4 x 10^8 reads per sample. (Is that right, per sample, not per lane?)
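The per-sample versus per-lane question changes the answer by a factor of 12, so here is a minimal sketch of what the other reading would give. It uses only the read counts quoted above and does not settle which reading is correct; the function and variable names are just for illustration.

```python
def reads_per_sample(reads_per_lane, samples_per_lane):
    """Average reads per sample if a lane's reads are shared equally."""
    return reads_per_lane / samples_per_lane

# Read counts quoted above; whether they are per sample or per lane is the open question.
QUOTED = {"HiSeq": 1.6e8, "NextSeq": 4e8}
SAMPLES_PER_LANE = 12

per_sample_if_quoted_is_per_lane = {
    machine: reads_per_sample(quoted, SAMPLES_PER_LANE)
    for machine, quoted in QUOTED.items()
}

for machine, quoted in QUOTED.items():
    alt = per_sample_if_quoted_is_per_lane[machine]
    print(f"{machine}: {quoted:.1e} if per sample, {alt:.1e} per sample if that figure is per lane")
```

Equal sharing is itself an idealization, since barcoded samples rarely pool perfectly evenly.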

One issue to keep in mind is that it would be foolish to save all the sequencing for one big batch at the end of the thesis work.  Instead the work needs to be designed with an initial set of samples to be sequenced, so he can (1) tell whether everything is working as it should, and (2) begin analyzing sequence data from one part of the project while generating additional samples for other parts.  For a preliminary batch of sequencing, it might be better to use a MiSeq machine, whose smaller capacity would let us sequence a few samples more economically.

We also talked about how long the random-sequence segments should be in the 200 bp fragments, and about where to locate the barcode segments.  These consist of an independent sequencing primer followed by 8 bp that identify the source experiment.  Putting these to the right of the random segment will let him efficiently create the double-stranded 200 bp fragments, using the same long left-side oligo (containing the random segment) with many different right-side oligos, each containing a different barcode.
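Separating the pooled reads afterwards is conceptually simple. Below is a minimal sketch, assuming each fragment gives a read of the random segment plus a separate read of its 8 bp barcode (from the independent sequencing primer), and that barcodes are matched exactly. The barcode sequences, sample names, and toy reads are made up for illustration; a real pipeline would use the sequencing facility's demultiplexing software and would probably tolerate a mismatch.

```python
from collections import defaultdict

BARCODE_LEN = 8  # from the text: 8 bp identify the source experiment

def demultiplex(read_pairs, barcode_to_sample):
    """Group insert reads by the sample that their barcode read identifies.

    read_pairs: iterable of (insert_seq, barcode_read) tuples.
    Reads whose barcode is not in the table go to 'unassigned'.
    """
    bins = defaultdict(list)
    for insert_seq, barcode_read in read_pairs:
        barcode = barcode_read[:BARCODE_LEN]
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert_seq)
    return dict(bins)

# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTACGT": "replicate_1", "TGCATGCA": "replicate_2"}
reads = [
    ("AAGTGCGGT", "ACGTACGT"),
    ("TTTTCCCC", "TGCATGCA"),
    ("GGGGAAAA", "ACGTACGT"),
    ("CCCCCCCC", "NNNNNNNN"),  # unreadable barcode
]
result = demultiplex(reads, barcodes)
for sample, seqs in result.items():
    print(sample, len(seqs))
```

Because an 'input' sample and its 'recovered' partner share a barcode, this table can only tell them apart if they were run in different lanes, which is why two lanes are still needed.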

Sensitivity of the PhD student's planned analysis

The PhD student is proposing to use Illumina sequencing of input and recovered-after-uptake DNAs to detect possible biases in uptake of DNA by bacteria other than H. influenzae.  (This is a simplified version of the analysis proposed in our funded NSERC proposal.) We're discussing the factors that will affect the sensitivity of this analysis, so he can say how strong a bias would have to be in order for his experiment to detect it.

The factors we've thought of are:

A. Nature of the preferred sequence pattern: 
  1. How long is it (3 bp? 10 bp?)?  How specific is it (e.g. is each base specified, or just 'purine' or 'pyrimidine')?  Together these determine how often this pattern will occur in the input DNA (by chance or due to uptake-bias drive).
  2. How strong is the bias favouring uptake of fragments containing this pattern?  How strict is the preference (are variants of the specified pattern also taken up, but less strongly)?  Are fragments with more than one occurrence of the pattern more likely to be taken up?
B. Properties of the input DNA:
  1. If this is genomic DNA, what is the size range of the fragments?   The sensitivity of the experiment will be low if the fragments are so large that each has at least one occurrence of the preferred pattern.
  2. If this is a synthetic fragment containing a fully degenerate segment, how long is the degenerate segment?
C. Sequencing coverage:
  1. How high is the sequencing coverage?  Is it the same for the control input DNA and for the recovered DNA?  This will determine the noise due to random factors.  
  2. Does the error rate of the sequencing matter?
  3. For genomic input DNA, are there position-specific differences in coverage across the genome?
  4. For degenerate-fragment DNA, are there non-random factors in the input DNA or in its sequence-ability?

He's going to start by working through the values for a very-strong-bias case, detecting the H. influenzae uptake sequence in genomic DNA (figure below), and then relaxing the inputs.
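Factors A1, B1 and B2 can be roughed out with a few lines of arithmetic. This is a minimal sketch under strong simplifying assumptions that are not part of the student's plan: random DNA with equal base frequencies, a fully specified non-palindromic pattern that counts on either strand, and independent positions. Real genomes, especially ones where uptake bias has enriched the pattern, will not behave like this. The 3 bp and 10 bp pattern lengths, the 200 bp fragment and the 30 bp degenerate segment come from the text; the 10 kb fragment is a made-up example.

```python
def site_probability(pattern_len):
    """Chance that one position matches a fully specified pattern in random DNA."""
    return 0.25 ** pattern_len

def p_at_least_one(fragment_len, pattern_len, strands=2):
    """Approximate chance that a fragment contains the pattern at least once.

    Treats every start position on each strand as an independent trial.
    """
    positions = max(fragment_len - pattern_len + 1, 0) * strands
    return 1.0 - (1.0 - site_probability(pattern_len)) ** positions

def degenerate_space(segment_len):
    """Number of distinct sequences a fully degenerate segment can have."""
    return 4 ** segment_len

for pattern_len in (3, 10):
    for fragment_len in (200, 10_000):  # 10 kb is a hypothetical genomic fragment size
        p = p_at_least_one(fragment_len, pattern_len)
        print(f"{pattern_len} bp pattern, {fragment_len} bp fragment: P(>=1 occurrence) = {p:.3g}")

print(f"Distinct 30 bp degenerate sequences: {degenerate_space(30):.3g}")
```

Under these assumptions a short pattern turns up by chance in nearly every 200 bp fragment, which is the low-sensitivity situation described in B1, while a 10 bp pattern is rare enough that enrichment in the recovered DNA should stand out.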

Mutagenesis plans

(I'll add some explanations later.)

1.  Mutagenize more RR805 DNA, using a range of high EMS doses (10, 15, 20, 25, 30, 40 min in 50 mM).  Transform this DNA directly into competent KW20 (without EMS inactivation or DNA purification) and select for CmR and maybe for NovR.

2.  Mutagenize RR805 cells, using a range of high EMS doses (from expt. #180, 80 mM for 1 hr gives ~10^-2 survival).  The cells don't need to survive, because I'll just grow the culture for a couple of hours and then extract all the DNA and use that DNA to transform KW20 to CmR.

For both 1 and 2, then pool CmR transformants and transform at low cell density to StrR with RR514 DNA.  Test individual StrR colonies for hypercompetence by colony transformation with MAP7 DNA.

3. Mutagenize NovR and NovS PCR fragments (made by the sabbatical visitor), using the same EMS concentrations as in experiment 1.  Then test the effects of the EMS mutagenesis by transforming each DNA into KW20, looking for gain of NovR in cells transformed with the NovS DNA, and loss of transforming ability of the NovR DNA.

I can do experiments 1 and 3 today (if I first pour lots of plates).  I can then do Experiment 2 tomorrow or on the weekend, once the cells have grown up.


1.  I must have put too little chloramphenicol in the Cm plates for this experiment, because all the cells grew on the Cm plates.  I need to repeat this experiment.

3.  Increasing exposure to EMS caused decreased transformation by the NovR fragment, as it should, but the corresponding exposures of the NovS fragment gave no NovR transformants, indicating no detectable mutagenesis.  So the decrease seen with the NovR fragment may just be due to damage, not mutation.

2.  My streak of RR805 cells has grown nice little colonies.


I've inoculated one of the RR805 colonies for an overnight culture, so I will be able to do the experiment 2 cell mutagenesis tomorrow.  And tomorrow I'll make lots and lots of Cm plates, with the right amount of chloramphenicol, so I can also repeat experiment 1.

No new candidate mutants (sigh...)

As I planned here, I pooled the CmR colonies resulting from transformation with EMS-mutagenized CmR murE+ DNA, and grew them to log phase (OD600 ~ 0.1).  The murE+ cells in the pool should have been non-competent under these conditions, but any murE* hypercompetence mutants should have been competent.  To select for these mutants I transformed the cells in each pool with DNA carrying a streptomycin-resistance mutation, and plated on Str plates.  One pool gave several hundred StrR colonies (many more than I would have expected as transformants), but the other pools had very few or none (4 total). I then screened individual StrR colonies by mixing them with dilute NovR DNA and plating on Nov plates.

Unfortunately none of the StrR colonies transformed to NovR at the high frequency seen for the positive control (murE749) colonies.  In fact, none transformed any better than the murE+ negative control colonies.

This is a bit surprising, given that the 2-fold higher level of EMS mutagenesis reduced by 100-fold the ability of the CmR cassette to transform cells, and the 4-fold higher level eliminated it entirely.  I had assumed that this reduction/elimination was due to too-heavy mutagenesis, but perhaps it was a direct consequence of the DNA damage.  One possible explanation I'm considering is that damaged DNA is almost always repaired or destroyed, and rarely gives rise to recombinants.  Another possibility is that, when cells are mutagenized, the mutations arise mainly when levels of damage are so high as to overwhelm the repair systems, allowing the damaged bases to be used as templates for DNA replication.  Maybe this also requires induction of the error-prone DNA polymerase.

So now the sabbatical visitor and I are designing a control experiment, to test whether this direct DNA mutagenesis is working as we think it should.  We're going to mutagenize two versions of a DNA fragment containing the gyrB locus.  One is wildtype, and the other has the novR allele we usually use in our transformation assays.  We expect the transformation efficiency of the novR allele to decline with high doses of EMS, and we hope that new novR mutations will arise from high doses to the wildtype allele.

* Here's some wishful thinking: Ideally we should be selecting for a G->A transition mutation because those are what EMS induces best.  But we're using novR (G->T) because we have the primers handy and know they work.  The mutation spectrum of EMS is reported to be much broader with the in vitro mutagenesis we're using, so we hope this will work.  But I just checked the numbers and they didn't see ANY of the kind of change we'd need.

Really we should use selection for streptomycin resistance, since its T->C mutation is a type that arose at high frequency with the in vitro EMS treatment.  I wonder if we have the primers for this - I think the post-doc might have gotten them for us.