Field of Science

Does bicyclomycin induce competence? (What was I thinking???)

Last summer I started the blog post below.
 Does bicyclomycin induce competence?
Yesterday the summer student pulled out the public data files for E. coli microarray experiments that had included measurements of sxy mRNA.  We don't know how sxy expression is controlled in E. coli - nobody has found a way to induce expression of the chromosomal gene (we used an inducible plasmid clone to study its effects on other genes).  So it's good to see that some treatments did induce it.
In the diagram below, each coloured vertical bar represents a single microarray comparison of sxy mRNA under two different conditions.  Mousing over a bar brings up a box describing the comparison and results.  Most of the bars are black or blackish; these are comparisons where sxy mRNA levels are the same.  Yellow bars are ones where it is down (bright yellow is ≥8-fold down), and blue bars are ones where it is up (bright blue is ≥8-fold up); the scale is 'log 2 expression ratio'.
It's hard for me to tell which (if any) patterns are biologically significant.  The one I'm excited about

And that's the end of the draft post!

Subsequently I found a colleague who kindly gave me some bicyclomycin (it's an antibiotic), and roughed out a simple experiment.  Now I'm planning to train up our new summer undergrad so she can do the experiment.

But I can't remember why I thought that bicyclomycin might induce competence! 

Bicyclomycin is an antibiotic.  I'd never heard of it until last summer, but it's of general interest because it's the only antibiotic that inhibits the Rho transcription termination protein.  Given that competence development is limited by folding of the 5' end of sxy mRNA, it could be that Rho-mediated termination plays a role in determining whether sxy mRNA is translated.

Searching my blog posts for 'bicyclomycin' found the unpublished post above, which tantalizingly breaks off in mid-sentence just at the point where I was about to explain my interest.  The figure is a screenshot from a microarray database, and I would expect that one of the bright-blue bars (sxy induction) would be from an array analysis involving bicyclomycin.  But that doesn't seem to be the case.  Of the five analyses with bright blue bars, one is UV irradiation, two are  biofilms, one is heat shock, and one is a glucose-lactose shift.  No mention of transcriptional termination.  Searching the microarray database for 'bicyclomycin' brings up the expression of the bcl gene, whose mutations confer resistance, and a study of transcription termination in which sxy expression is unchanged!

This microarray study of transcription used bicyclomycin to inhibit termination.  So I dug further into it to see if there were any changes in expression of the competence-gene homologs that sxy induces.  Some of them are tantalizingly up (the major T4P pilin and the comABCDE homologs that specify the secretin pore and components of the T4P motor responsible for DNA uptake), but others are unchanged.

Subsequent searching also found an email I'd sent to the summer student, with a link to this termination paper (Cardinale et al. 2008), asking 'Is this the one?'.  So I think this study is indeed what got me interested in bicyclomycin.

So let's see what the new summer student can find out!



More thinking/planning about the new uptake-sequencing data

Some housekeeping issues:

The sequence data:  The PhD student has found that some segments of the genome have very low coverage in the input data - some positions have coverage of zero.  This means that the calculated uptake ratios for these positions are either unreliable (low coverage) or missing (coverage = 0).  He's going to plot segments of the genome with the low coverage points in a different colour, so we can see how bad the problem is.

Part of the problem may be due to how the reads were originally mapped onto the donor genome. The mapping used a concatenated donor-recipient double genome to remove the contaminating recipient reads from the data.  Because the donor and recipient sequences used were those of NCBI reference genomes rather than of the exact cultures used for the experiment, sequencing errors in the reference genomes may have caused donor sequences to mis-align onto the recipient genome.


This can easily be checked by examining the full alignment of the input DNA.  This should not contain any contaminating recipient sequences, so any reads that align to the recipient are alignment errors.  The ideal solution would be to realign the reads using better reference sequences, but we could instead just add this misaligned coverage into the donor-aligned input dataset we're analyzing.

Any remaining positions with near-zero coverage in the input dataset should probably be flagged and removed from the analyses.
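The flagging step could be sketched like this.  It's a minimal Python sketch with made-up coverage numbers and an assumed cutoff of 10 reads; the real analysis would run over the full genome-length coverage vectors:

```python
# Hypothetical per-position coverages (illustrative values only)
input_coverage  = [120, 85, 3, 0, 0, 2, 97, 110]
uptake_coverage = [240, 160, 1, 0, 5, 1, 300, 220]

MIN_COVERAGE = 10  # assumed cutoff for 'near-zero' input coverage

uptake_ratio = []   # None marks a flagged (unreliable or missing) position
for inp, up in zip(input_coverage, uptake_coverage):
    if inp >= MIN_COVERAGE:
        uptake_ratio.append(up / inp)
    else:
        uptake_ratio.append(None)

print(uptake_ratio)
```

Keeping the flags as None (rather than zero) makes it easy to exclude these positions from downstream averages and plots.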

The USS-scoring matrices:  A careful reader might have noticed in yesterday's post that the two scoring matrices are not the same length.  The uptake-based matrix is 32 nt long, but the genome-based matrix is 37 nt long.  They are also not exactly aligned to each other; position 1 of the uptake-based matrix is position 3 of the genome-based matrix.  Rather than dealing with these discrepancies later (or forgetting to deal with them), we should create concordant matrices now to use for the scoring.


This requires deleting the first two positions and the last three positions of the genome-based matrix. Since the remaining last few positions have no 'information' in either matrix, we might as well delete a couple more, to give concordant matrices that are both 30 bp long.
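The trimming is just list slicing.  Here's a sketch with placeholder matrices - the weight values are dummies, and exactly how many trailing uninformative positions to drop is the judgment call described above:

```python
# Illustrative stand-ins: each matrix is a list of per-position weight dicts
genome_matrix = [{"A": 0.25} for _ in range(37)]   # 37 positions
uptake_matrix = [{"A": 0.25} for _ in range(32)]   # 32 positions

# Position 1 of the uptake matrix is position 3 of the genome matrix,
# so drop the first two and last three genome positions to align them...
genome_trimmed = genome_matrix[2:-3]               # now 32 positions

# ...then drop the last two (uninformative) positions from both,
# giving concordant 30-position matrices
genome_concordant = genome_trimmed[:30]
uptake_concordant = uptake_matrix[:30]

print(len(genome_concordant), len(uptake_concordant))
```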

Forward-strand and reverse-strand USSs:  Since the USS motif is not symmetric (not a palindrome in DNA language), we need to identify and specify the locations of the USSs in the two strands.  The top panel below illustrates the problem.  To keep the position references consistent, the two strands are initially scored in the same left-to-right direction, with the reverse-strand scoring done using a matrix with complementary bases in the reverse orientation.  For both strands the left end of each USS initially specifies its position in the genome, but this is a bit misleading since it's not the centre or most important position of the USS.  Worse, since the crucial 'core' of the USS motif isn't at its centre, the initial positions of the forward USSs are skewed differently than the reverse USSs.


The lower panels indicate the two possible solutions.  Both are technically easy - we just create new USS positions by adding numbers to the original positions.  In the solution shown in the middle panel, we'd add 13 (I think) to both the forward and reverse positions (sorry, the figure shows the trimmed 30 bp USS but the numbers haven't been corrected for the removal of two positions at the start).  In the solution shown in the lower panel we'd add 7 to the forward strand positions and 21 to the reverse-strand positions. (I'm not certain these are the correct numbers...)

I think either solution would be fine, but we need to pick one.
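Either fix is a one-line offset per strand.  A sketch with made-up hit positions, using the offsets quoted above (which, as noted, still need to be double-checked):

```python
# Raw left-end positions from left-to-right scoring (illustrative numbers)
forward_hits = [1005, 4230]   # forward-strand USS left ends
reverse_hits = [2760]         # reverse-strand USS left ends

# Solution 1 (middle panel): shift both strands to a common central point
centred_fwd = [p + 13 for p in forward_hits]
centred_rev = [p + 13 for p in reverse_hits]

# Solution 2 (lower panel): shift each strand to the off-centre USS core,
# hence the different offsets for the two strands
core_fwd = [p + 7 for p in forward_hits]
core_rev = [p + 21 for p in reverse_hits]

print(centred_fwd, centred_rev, core_fwd, core_rev)
```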



Uptake dataset progress

The PhD student has been making lots of progress in analyzing the data from the chromosomal DNA uptake experiment.

The big progress came because we realized that we needed to stop looking at the data for the whole genome and instead examine a representative 5 kb segment.  This has allowed us to relate the results of each analysis to the specific sequence features and uptake data for each position in the segment. So now we have a pretty good understanding of what the various analyses can show us, and what they can't.

Rather than detailing what we learned, here I want to consider what our goals are, and what steps we should take.

Goals:  For the analysis of transformation frequencies (the bigger project this work is part of), we want to know how much of the variation in transformation frequencies across the genome is due to differences in DNA uptake.  In principle this could just be a number, e.g. 37%.

I guess one (mindless) way to do this would be just to subtract the differences in uptake from the differences in transformation.  I don't know whether the former post-doc has done this - I'm pretty sure we haven't discussed it.

A second approach would be to determine the extent to which the already-characterized effect of USS (uptake signal sequences) on DNA uptake explains differences in transformation across the genome. Doing this doesn't require any of the new DNA-uptake sequencing data, just the sequence of the genome of the DNA source.  The former post-doc has done simple versions of this, and he has a rotation student working on a more sophisticated version.

We (the PhD student and I) are instead using the new sequence data to improve our understanding of how DNA sequences determine how efficiently a fragment will be taken up by a competent cell.  This better understanding can then be used to predict the contribution of uptake to the transformation differences (as above), but its main value is more direct - understanding how DNA sequence differences affect uptake will help us understand the evolution of uptake biases and uptake signal sequences, in H. influenzae and other organisms.

So what have we learned so far:

Size distribution of the input DNA:  We don't yet have the direct DNA-analyzer data on length distribution.  But we can indirectly estimate this by looking at the graphs of uptake ratio as a function of genome position.  Positions that are more than 500 bp from the center of an uptake peak (the location of a USS) have a very small uptake ratio (~0.01, often not distinguishable from zero).  This means that almost all of the fragments in the short DNA sample were shorter than 500 bp.  The mid-height widths of the (well-separated) peaks are about 400-500 bp, indicating that the average fragment was about 200-250 bp.  I haven't taken the time to get the best image for this analysis, so we can't be more precise than this.

Importance of USS:  It's abundantly clear that most of the variation in uptake seen in our 'short' DNA sample is due to the locations of 'USS', sequences with strong matches to the USS motif.  Most fragments containing a strong match (score > 20 with the 'genomic' scoring matrix) are taken up several hundred times more efficiently than fragments without a good match.

We've only examined 5 kb in detail, but so far all the uptake peaks we've examined are centred on positions with strong USS scores.  The height of the peak correlates with the score.

Importance of the USS scoring matrix: We have two types of position-weight matrices for scoring how well a sequence matches an uptake-promoting motif.

The first is the 'genomic' matrix that the PhD student has been using so far, shown in the figure below.  It's based on analysis of abundant USS elements in the H. influenzae Rd genome, identified using the Gibbs Motif Sampler (Maughan et al. 2010).  In the figure each bar represents a position in the motif, and its height represents the 'information content' at that position (the sum of the weighted values of each base at that position in the table).


The genomic analysis means that this matrix doesn't directly represent the preferences of the uptake machinery, but rather some combination of these preferences with other factors affecting how sequences accumulate in the genome over evolutionary time.

The second type of matrix comes from the former post-doc's direct analysis of uptake biases, done using a synthetic DNA fragment containing a degenerate USS (Mell et al. 2012).  This 'uptake' matrix gives a motif with a strong consensus only for a much smaller region, with only four very important bases.


We haven't yet analyzed any genome uptake data using this matrix, but it's high on our priority list. We expect similar results with both matrices, but the uptake matrix may be better because it's directly based on uptake data.

How will we decide if it's 'better'?  Here, 'better' means that positions' USS scores better predict the uptake ratios of nearby sequences.  We're still working our way to deciding the best way to do this.  In addition to the USS score from the matrix, the prediction will need to consider how far the position is from the nearest 'USS' (on a list using a good score cutoff), whether fragments containing it are likely to contain more than one 'USS', and the size distribution of the DNA fragments in the prep.  Maybe some of this could be incorporated in a matrix of USS scores and distances...

Ideally (i.e. if computational time and resources were unlimited), for each focal position whose uptake we want to predict, the uptake prediction would incorporate:

  1. the USS scores at each distance from it (two scores for each distance), weighted by our observed correlation between USS score and height of uptake ratio peak
  2. For each distance, a weighting factor that reflects the probability that the focal position is in the same DNA fragment as the sequence being scored (based on the measured size distribution of the input DNA prep)
  3. A factor reflecting the interactions between USS scores at different positions, weighted by the probability that both USS would be in the same fragment.
In practice, our job is to characterize these effects and then distill the important ones into a computationally simple prediction algorithm. 
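As a toy illustration of how factors 1 and 2 might combine (factor 3's interaction terms are omitted), here's a sketch in which both helper functions and all the numbers are invented placeholders, not measured values:

```python
def frag_containment_prob(distance, mean_frag_len=250):
    """Assumed probability that a sequence `distance` bp away sits on the
    same fragment as the focal position (crude linear fall-off)."""
    return max(0.0, 1.0 - distance / (2 * mean_frag_len))

def score_to_peak_height(uss_score):
    """Assumed mapping from USS score to uptake-ratio peak height."""
    return max(0.0, (uss_score - 15) * 0.5)

# (distance from focal position, USS score) pairs for nearby positions
nearby_uss = [(50, 22.0), (400, 19.0), (900, 25.0)]

# Sum each USS's contribution, weighted by the chance it shares a fragment
predicted_uptake = sum(
    score_to_peak_height(score) * frag_containment_prob(dist)
    for dist, score in nearby_uss
)
print(predicted_uptake)
```

Note how the high-scoring USS at 900 bp contributes nothing here, because at that distance no fragment in a 250 bp prep would contain both it and the focal position.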

Understanding the results of the first analysis

The grad student did the analysis I had described in this post.  Here's what I had said I expected:


 And here's what he found:

His data extends over a larger scale, and there is no empty space on the left below the main peak of points, perhaps just because the dots are too big to resolve.  A few uptake ratios are as high as 10, which is  also expected.  Some of the distances to the nearest 'USS' (position on the USS list) were surprisingly large - outside of the common fragment sizes in the 'short' DNA prep, but these might represent the several places in the genome where USS are widely separated. 

The most surprising aspect was the appearance of well-defined lines of points forming peaks at distances longer than the fragment sizes, and the absence of the clusters of points I'd originally hypothesized.

These long-distance peaks made sense once the grad student identified the positions responsible for them  and checked their assigned USS scores.  At the site of the peak he found a position with a USS score only slightly lower than the cutoff he'd used when generating his list.  When he checked the USS scores for the positions of the other long-distance peaks he again found scores that were locally high but below the list cutoff. 

The figure below illustrates what we think is going on.  First consider the top graph, which is a simpler schematic version of the uptake-ratio graph in the earlier post.  It shows two local peaks in uptake, one at the site of a USS on the list, and one at the site of another uptake promoting sequence. In principle this sequence could be a lower-scoring USS, or it could be an unrelated sequence that also promotes uptake.


The lower graph shows what we expect when this data is replotted with the distance to the nearest 'USS' on the X axis.  As I originally expected, points close to the recognized USS give two lines heading down and away from position 0 (the position of that USS).  But because the other uptake-promoting position isn't recognized as a 'USS', its points show up farther along the x axis, according to their distance from the position-zero USS.

Are USS that fell below the list cutoff responsible for all of the long-distance peaks?  One simple test is to reduce the cutoff for the USS list, and see if the peaks go away.  Sure enough, when the grad student reduced his USS-score cutoff from 19.04 to 18, all but one of the peaks disappeared.  I'm a bit surprised that the long-distance low-uptake points disappeared too; I guess this means that they weren't just due to gaps in the genomic distribution of USSs.

Does this result mean that the genome doesn't contain any non-USS sequences that promote DNA uptake?  No.  There's still that one remaining peak at about 800 bp, whose USS scores need to be checked.  And there are all the points in the black part of the graph, where non-USS peaks may be obscured by all the other points.

More about analysis of the DNA-uptake sequencing data

The graph below shows the efficiency of DNA uptake (relative to the 'input' DNA sample) across a 13 kb segment of the H. influenzae Rd genome.  The red dots are for a 'short' sample with average fragment size about 0.25 kb, and the blue dots are for a 'long' sample with an average fragment size of about 6 kb.  (The average lengths come from crude examination of agarose gels, which might underestimate the abundance of short fragments, so the actual length distributions will be measured with a DNA Analyzer.)

The previous post considered why the red data are so spiky - each spike corresponds to the location in the DNA of a short sequence matching the uptake-signal-sequence (USS) motif.  Fragments containing a USS sequence are taken up much better (maybe 25-50 times better?) than fragments lacking a USS.


But the blue data are also spiky, and I don't know why.  Ignoring the two big spikes for a minute, the spikes and dips have much smaller amplitude than the big red spikes (they don't go up as high or down as low), but they're also more frequent on the distance scale.    

The gradual rise and fall of the blue dots over distances of several kb is expected from the length distribution of the fragments, but this jaggedness is entirely unexpected, especially given the apparent smoothness of the red points between the USS spikes.  Is this just noise in the data?  Is it an artefact of how the uptake data were normalized to the input data?

The two high spikes might be a different puzzle, or they might be extreme cases of whatever is causing the low-amplitude spikiness.  How could variation in uptake of DNA fragments that are mostly at least several kb long give a spike that's only about 11 bp wide?  Could this be an alignment artefact that somehow affects 'uptake' DNA very differently than 'input' DNA?

Here's a different graph of the uptake ratios (over about 100 kb), made by the former post-doc; again we see much more spikiness in the long-fragment DNA than in the short-fragment DNA.
To investigate the cause(s), I think the first thing to do is to go back one step from the uptake ratio data and look separately at the coverage for the input DNA and the recovered 'uptake' DNA.  Luckily, the first thing the post-doc did when he got the sequencing results is to send us a screen shot of a 20 kb Integrated Genome Viewer view of the 4 sample types (long input and uptake, short input and uptake).


I'm surprised by how variable the input coverage is.  The very fine scale variation is perhaps noise, but the larger peaks and valleys (500-2000 bp) are quite consistent between the long and short input DNA samples.

Unfortunately I don't have the uptake ratio graph for the same region that I have this IGV analysis, and I don't have the R skills to generate it.  But I can ask the grad student to do it for me, and to send me his code so I can figure out how it's done.



How to analyze next-gen DNA uptake data

We want to understand why competent Haemophilus influenzae cells take up some parts of H. influenzae chromosomes more efficiently than others.

To this end, before Christmas the grad student reisolated preparations of DNA fragments of chromosomal DNA from strain 86-028NP (hence 'NP') that had been taken up by competent cells of the standard lab strain Rd.  He sent these DNA samples to the former post-doc for sequencing (with the original 'input' DNAs as controls).  The post-doc has now sent us the sequencing data, and the grad student is going to analyze this, with two main goals:
  1. Determine how a DNA fragment's probability of uptake is affected by the presence of sequences matching the uptake signal sequence ('USS') motif.
  2. Identify other sequence factors that influence uptake.
The grad student has written up an overview of his plan for accomplishing these goals, and that has stimulated me to also think about how it could be done.

He (or the former post-doc?) has already done the first step, scoring the degree of preferential uptake for every position in the genome.  I think this was done by comparing each genome position's coverage in the recovered-DNA dataset to its coverage in the control 'input' dataset.  This gives a score they call the 'uptake ratio'.

Here's a graph made by the grad student, showing the uptake ratios for two different preps of chromosomal DNA, over a 13 kb segment of the 1830 kb H. influenzae Rd chromosome.  The dark blue points are for a DNA prep whose average fragment size was about 6 kb, and the red points for a DNA prep whose average fragment size was about 250 bp.  Because the actual distributions of fragment sizes in these preps have not yet been carefully measured, I'll refer to them as the large-fragment and small-fragment DNA preps respectively.


The first thing you notice is that the uptake ratios for the large-fragment prep are much less variable than those for the small-fragment prep.  We are very gratified to see this, because it's what we expected from the known contribution of the uptake motif.  Sequences with strong matches to this motif occur all around the chromosome, with an average spacing of about 1 kb.  Thus most fragments in the large-fragment prep will have contained at least one USS, but many fragments in the small-fragment prep will not have contained any USS.

The large-fragment prep does show two strong spikes of high uptake (at about 8000 and 18500 bp).  These are certainly very interesting, especially since they don't correspond to high uptake in the short-fragment prep.  But for now I'm just going to consider how we might analyze the short-fragment prep, since this provides much better resolution of what we think are the effects of individual USSs.

Here's a strategy I came up with:

Step 1:  Score each position of the NP genome for its match to either orientation of the 'genomic' USS motif.  This motif was identified by Gibbs Motif Sampler analysis of the Rd genome (see this paper).  Each position will have a '+' score and a '-' score; we need to make sure the positions are aligned at the most important position of the USS motif.  Because the score depends on correct alignment, the result will be punctate, with about one high-scoring position and about 999 low-scoring positions in each kb.  Here's a figure of what the analysis might look like for the 13 kb segment shown above.


Step 2:  Using a reasonable score cutoff, create a list of positions that qualify as 'USS' for the initial analysis of the uptake data.  In the case above we'd include all positions scoring higher than 15.  
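Steps 1 and 2 amount to sliding a position-weight matrix along both strands and thresholding the scores.  Here's a minimal sketch with a toy 3-position matrix, toy sequence, and toy cutoff (a real USS matrix is ~30 positions long, and the cutoff above is 15):

```python
pwm = [  # toy log-odds-style weights, one dict per motif position
    {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
]

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Reverse-orientation matrix: complementary bases in reverse order
pwm_rev = [{COMPLEMENT[b]: w for b, w in col.items()} for col in reversed(pwm)]

def score_at(seq, pos, matrix):
    window = seq[pos:pos + len(matrix)]
    return sum(col[base] for col, base in zip(matrix, window))

genome = "CCATGATGCC"   # toy sequence
plus  = [score_at(genome, i, pwm)     for i in range(len(genome) - len(pwm) + 1)]
minus = [score_at(genome, i, pwm_rev) for i in range(len(genome) - len(pwm) + 1)]

# Step 2: positions scoring above the cutoff on either strand become 'USS'
CUTOFF = 5
uss_list = sorted({i for i, s in enumerate(plus) if s > CUTOFF} |
                  {i for i, s in enumerate(minus) if s > CUTOFF})
print(uss_list)
```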

Step 3: For each position in the genome, calculate its distance from the nearest 'USS' on the above list.  For now don't distinguish between 'USS' in + or - orientations.  (I'm keeping 'USS' in quotes to remind us that we used only one of many possible criteria to define our list.)
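Step 3 is a nearest-neighbour distance calculation.  A brute-force sketch with a made-up 'USS' list (fine even for a 1.8 Mb genome, though sorting the list and scanning along it would be faster):

```python
# Hypothetical 'USS' list from the score cutoff, and a toy genome length
uss_positions = [100, 1200, 1350]
genome_length = 1500

# For every position, distance to the nearest 'USS', ignoring orientation
dist_to_uss = [
    min(abs(pos - u) for u in uss_positions)
    for pos in range(genome_length)
]

print(dist_to_uss[100], dist_to_uss[650], dist_to_uss[1275])
```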

Step 4: For each genome position plot its uptake ratio as a function of its distance from a 'USS'.  Because most of the red peaks in the grad student's graph have uptake-ratio scores of about 4 and bases about 1 kb wide, I expect the graph to look something like this: 



There are a lot more points on this graph than on the previous one because there's a point for every position in the 1.8 Mb genome.  Most of the points fall on a rough band that drops from uptake ratios of 4 (peaks, for a very close 'USS') to uptake ratios that are about 0.1 (troughs, for positions that are more than 500 bp from a 'USS').

If we see a broad band with lots of scatter, this will mean that our distance-to-the-nearest-'USS' score doesn't capture other aspects of the USS that influence uptake.  These factors might include:
  1. whether the USS's orientation on the chromosome affects uptake (USS motifs are asymmetric)
  2. how well the USS's sequence matches the several different ways we can score sequences as possible USS (genome-based, uptake-based, and with or without internal interaction effects between positions)
  3. how much the presence and relative locations of additional USSs adds to uptake
We will come back to the above analysis and develop more nuanced measures of the effects of nearby USSs, judging success by how much each nuance reduces the scatter of the points.

For now I'm more interested in identifying any non-USS sequence factors that influence uptake. These factors should appear in the above graph as outliers, positions whose uptake ratio is not correlated with their USS-distance score.  Our previous analysis suggests that these outliers should be common.  If they are common, they might be clustered as shown above, but they're probably more likely to be scattered all over the place and perhaps not easily distinguishable from the overall background scatter. 

The best way to see if these positions are not noise is to see if their scores correlate with genomic positions.  Below is one way I've thought of to do this.

Step 5. 
 Use the uptake vs USS-distance graph to develop an equation that best predicts uptake ratio (U) as a function of distance to nearest 'USS' (D, in bp).  For the above example, a very crude equation might be 

U = 0.1 or (4 - D/100), whichever is greater.

Step 6:  For each position in the genome, use this equation and the 'USS' list to predict an uptake ratio, and then calculate the difference between its predicted and observed uptake ratios.
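Steps 5 and 6 together, as a sketch using the crude equation above and some made-up observed uptake ratios:

```python
# Crude prediction from the post: U = max(0.1, 4 - D/100),
# where D is the distance (bp) to the nearest 'USS'
def predicted_uptake(distance_bp):
    return max(0.1, 4 - distance_bp / 100)

# (distance to nearest 'USS', observed uptake ratio) - illustrative numbers
observed = [(0, 4.2), (200, 1.8), (600, 0.9)]

# Step 6: the 'anomaly' is observed minus predicted at each position
anomalies = [obs - predicted_uptake(d) for d, obs in observed]
print(anomalies)
```

Positions where non-USS sequences promote uptake should stand out as large positive anomalies when plotted against genome position (Step 7).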

Step 7:  Now plot this 'anomaly' as a function of genome position.  If we're lucky it will look something like this:


If some of the apparent scatter is due to positions where non-USS sequences influence uptake, these will show up as peaks and troughs above and below the main bands, and we can go on to analyze these sequences bioinformatically for shared features and experimentally for direct effects on uptake.  If the scatter really is due to noise, then it will be scattered over the genome and not fall into discrete peaks and troughs. 

Ready for sequencing?

I think I finally have the appropriate PCR fragments from my A. pleuropneumoniae mutants, to be sent for sequencing:


I have 3 knockout mutants, removing the toxin, antitoxin and toxin+antitoxin segments (∆T, ∆A, and ∆TA respectively).  I designed new 'S-up' and 'S-dn' primers to use with the original 'F' and 'R' primers to amplify the segments on either side of the spectinomycin-resistance cassette that's inserted at the site of each deletion.  I need to check the sequences of these to be sure that the appropriate segments have been removed, and that the remaining gene is intact.

I've successfully used these primers (black arrows above) to amplify the ∆T and ∆A segments shown above (light blue and lilac bars).  Now I just need to clean up the PCR products, check their concentrations, and send them with the appropriate S-up and S-dn primers (red arrows above) for sequencing.  I don't need to sequence the far ends of the fragments.

I also tried to use these primers for the ∆TA double knockout but for some reason I can't get any amplification.  This may mean that there's something wrong with the mutant, but I've decided I don't really need to discuss this mutant at all in our paper, since both the ∆T and ∆A mutants have normal growth and competence phenotypes.  (Well, I think I do need to do at least one more check of the transformation frequencies, since there's been a lot of variation in my colony counts.)

[Ooh, idea!  Maybe the ∆TA mutant won't amplify because its Spec cassette is inserted in the opposite orientation to the others!  The Honours student created each mutant by blunt-end ligation, so either orientation is possible.  I'll go set up one more pair of PCR reactions with the alternate combinations of primers right now...

And YES!  Reversing the primers gave the expected amplification!]

What do the toxin and antitoxin gene products do?

Now that I'm finally close to finishing my benchwork task for the Honours student's manuscript, I've gone back to thinking about the results and implications of our RNA-seq analysis.

When the Honours student wrote the manuscript (actually her Honours thesis, but in excellent manuscript format), we had only incomplete RNA-seq results - specifically we had only one replicate of the critical antitoxin mutant.  The other two replicates were in the pipeline at the time, and the full dataset was analyzed subsequently by the other Honours student when he stayed on for the summer.

I'm going to just summarize the results now, and come back to them later.

Basic points:

  1. The antitoxin knockout mutant has normal RNA levels for all the genes that regulate the competence regulon (crp and sxy, which encode the transcription activators CRP and Sxy, and cya, which encodes the adenylate cyclase that synthesizes the essential cyclic AMP cofactor for CRP).
  2. Consistent with this, the expression levels of the competence regulon genes are not very different than in wildtype cells.  A few genes are down by 40-50%, but most are near-normal, with error bars that overlap the range of wildtype expression (see his complicated green figure below - compare the heights of the bright-green bars with the spans of the grey shaded areas, which represent the normal expression levels at the bright-green timepoint).  
  3. The double knockout (∆toxin+antitoxin) transforms normally, so the competence defect of the antitoxin mutant is due to competence-blocking activity of the toxin.
  4. The transformation defect of the antitoxin knockout is much more extreme than these expression levels would predict.  We see few or no transformants (transformation frequencies less than 10^-8), whereas wildtype cells give transformation frequencies higher than 10^-3.  
  5. The antitoxin mutant also has an extreme DNA uptake defect, so the transformation defect is not caused by defective recombination machinery. 
  6. The summer student also did an RNA-seq analysis of the hfq knockout mutant he had worked on for his Honours project. This mutant has a more severe reduction in expression of all the competence-induced genes, but a much less severe defect in transformation (only about ten-fold lower than wildtype cells).  Thus the antitoxin mutant's competence defect is unlikely to be due to modestly lower expression of one or more key competence genes.
  7. In the antitoxin mutant the toxin mRNA is overexpressed during exponential growth.  This is consistent with the roles of related antitoxins in other systems, where they act as repressors of transcription of the toxin-antitoxin operon.
  8. The antitoxin knockout cells have a normal doubling time in exponential growth, and survive competence induction and stationary phase just as well, so the toxin protein must not be toxic for growth or survival.



Where does all this leave us?  One possibility is that the toxin directly blocks DNA uptake, by some mechanism we are completely ignorant of.  But related toxins are known to act by cutting mRNAs on the ribosome, so it's possible that the RNA-seq results are misleading in that they detect all RNAs, including ones that have been cut.

Luckily the summer student wrote an R script to compare coverage patterns between wildtype and mutant cells, and generated lovely graphics showing the effect of the antitoxin knockout on coverage of segments containing competence-induced genes.  Just as an example, here's his comparison of expression of the pilABCD operon in wildtype (purple) and hypercompetent (green) cells.


He's generated data for all the competence-induced genes in the antitoxin knockout, so I'll check these to see if there are any alterations in transcript profiles that might indicate the action of a mRNA-cleaving toxin.


Toxin/antitoxin knockout updates, and bonus DNA uptake results

My last post was all about failure, so it's high time I updated things with some successes.

Constructing an Actinobacillus pleuropneumoniae antitoxin gene knockout:  At the last report, I had what I thought were four independent knockout mutants, but my attempts to PCR-amplify the genomic segment containing the knockout were not working.

I eventually switched to using a different thermostable polymerase (NEB's standard OneTaq) rather than the fancier Q5 polymerase I had been using.  Eureka - the PCRs all worked perfectly, giving strong bands of approximately the expected sizes.

...then I let everything sit around for a month while I dealt with other things...

Now I'm finally following up.  The first step is to digest these PCR products with a few other enzymes that should cut in either the genomic segments or the inserted SpecR cassette.  I've made rough predictions of the expected fragment sizes, which are all different for the ∆A mutant, wildtype cells, and the two mutants made by the Honours student (∆T and ∆TA).
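The fragment-size predictions themselves are simple arithmetic on cut-site coordinates.  A sketch in Python, using made-up positions rather than the real ones:

```python
# Predicting restriction-fragment sizes for a linear PCR product from its
# cut-site coordinates.  Positions here are hypothetical, for illustration.

def fragment_sizes(product_length, cut_positions):
    """Sizes of the fragments from a linear molecule cut at the given positions."""
    edges = [0] + sorted(cut_positions) + [product_length]
    return [b - a for a, b in zip(edges, edges[1:])]

# e.g. a 2.6 kb wildtype amplicon with two hypothetical sites:
print(fragment_sizes(2600, [700, 1800]))   # -> [700, 1100, 800]
```

Doing this for each strain (∆A, ∆T, ∆TA, wildtype) gives the distinct fragment patterns to look for on the gel.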

The next step will be to do more PCR amplifications.  My original amplifications used the F and R primers that amplify a 2.6 kb segment containing the toxin and antitoxin genes (~300 bp each).  Now I'll use the F primer with the S-R reverse primer for the SpecR cassette, and the R primer with the S-F forward primer for the cassette.

If these both give the expected fragments then I'll (probably) send the PCR amplicons for each mutant to be sequenced.

If the sequencing confirms that the knocked-out genes are gone but the remaining gene is intact, then I'll give a sigh of relief.

Determining the competence phenotype of the Actinobacillus pleuropneumoniae antitoxin gene knockout:  My first test of the transformability of my first two ∆antitoxin mutants showed transformation defects, but in later tests they transformed within the range of the wildtype control.  But there was a lot of experiment-to-experiment variation in transformation levels (see graph below), so I'd like to do it one more time, to get clean publishable data.


Bonus DNA uptake results:  Just before Christmas the grad student finished his DNA preps of H. influenzae chromosomal DNA fragments that had been recovered after being taken up into the periplasm of competent H. influenzae.  He sent these to the former post-doc for sequencing, and the post-doc has now sent us some lovely preliminary results.  

The grad student had used DNA preps that had been sheared to two different size ranges.  We expected the genome coverage of the long fragments (mean length ~6 kb) to be fairly uniform, since almost all of them should contain at least one instance of the preferred uptake sequence motif.  These 'USS' motifs are distributed fairly evenly around the chromosome, with a mean spacing of about 1 kb.  We do see this, but with enough anomalies to keep things interesting.  And we expected coverage by the short fragments (mean length ~0.25 kb) to be much more strongly dependent on chromosomal position, since many such fragments would not include a USS.  And we do see this, again with interesting anomalies.
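A quick way to see why fragment length matters so much: if USS sites average ~1 kb apart and are treated as roughly randomly placed, a Poisson model gives the chance that a random fragment carries at least one.  This back-of-envelope model is my own, not the lab's actual analysis:

```python
# P(fragment contains >= 1 USS) under a simple Poisson model, with USS
# sites at a mean spacing of ~1 kb.  Back-of-envelope only.
from math import exp

def p_has_uss(fragment_kb, mean_spacing_kb=1.0):
    """Probability a random fragment contains at least one USS."""
    return 1 - exp(-fragment_kb / mean_spacing_kb)

print(f"6 kb fragments:    {p_has_uss(6.0):.3f}")    # ~0.998
print(f"0.25 kb fragments: {p_has_uss(0.25):.3f}")   # ~0.221
```

(With perfectly even 1 kb spacing the short-fragment value would instead be about 0.25; either way, most short fragments lack a USS, so their uptake should depend strongly on position.)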
  

DAMN! Complete PCR failure!

Yesterday I ran a PCR amplification using DNAs from single colonies of 7 different A. pleuropneumoniae isolates, and got absolutely no DNA fragments from any of them.

This amplification worked fine last time.  Can I figure out what went wrong?

  • I checked the run record of the PCR machine - it looks fine.
  • I checked the freezer box with the tubes of dNTP stock, 5X buffer, and Q5 polymerase, to be sure I hadn't picked up a wrong tube.
  • I checked my notes, to be sure I hadn't left out any component of the reaction mix.  I'd checked off each reagent as I added it, and the final volume was as expected.
  • I checked the 'F' and 'R' primer tubes (in another freezer box) to make sure I'd used the correct ones.  I'd made up more of the 10 µM dilution stock, so I also checked that I'd used the right tubes of the more-concentrated 100 µM stock to do this.  I even checked the remaining volumes in the two primer tubes - if I'd added one primer twice and not the other these volumes should differ by about 17 µl, but they're within a few µl.
  • I prepped the colony DNAs slightly differently.  Last time (prep 1) I put a whole colony into 100 µl of medium, then diluted 5 µl of that into 45 µl water and heated to 98 °C for 10 min to lyse the cells and free their DNA.  This time (prep 2) I put part of a colony into 100 µl water, heated that, and then pelleted out any cell debris.  Both times I used 1 µl of the heated sample.
What could I try now?
  • Use leftover Prep 1 colony DNA as template
  • Vortex the Prep 2 colony DNA tubes
  • Use as template purified DNA from lab stocks
  • Use a different pair of primers (the Spec-cassette ones worked well last time)
  • Repeat with the same reagents and template I used this time
  • Make fresh colony DNA preps
  • Make proper DNA stocks to use as templates
Plan:  
  • Prep 2 14-1 colony DNA, Spec primers
  • Vortexed Prep 2 14-1 colony DNA, F & R primers
  • Prep 1 14-1 colony DNA, F & R primers
  • Prep 1 14-1 colony DNA, Spec primers
  • 1/100 dilution of lab-stock DNA, F & R primers




    Success

    When I last posted, nearly 3 weeks ago, my first attempt to generate the desired full-length knockout construct had given a mixture of fragments rather than just the desired full-length one.  But this mixture did include a relatively faint fragment of the desired size (3.6 kb).

    I did try to get a better PCR product, but increasing the annealing temperature made things worse, and I couldn't find a PCR app that would let me diagnose which incorrect-priming reactions were producing the unwanted fragments.  So I went ahead and transformed competent Actinobacillus pleuropneumoniae cells with the mixture, selecting for spectinomycin resistance.

    My logic was that only the desired fragment is likely to efficiently transform cells to SpecR, because other fragments were unlikely to have the correct homologous DNAs flanking the SpecR cassette.  If the 3.6 kb fragment was what I hoped it was, I should get thousands of transformants even though it was only about 10% of the total DNA in the mixture.  If it wasn't what I wanted, then it would probably transform very inefficiently if at all and I would get very few transformants.

    I got thousands of transformants in my first try.  Since the real goal of this project is to find out whether knocking out the antitoxin gene prevents transformation in A. pleuropneumoniae as it does in H. influenzae, I did a quick-and-dirty competence assay, using 7 pooled SpecR colonies and some kanamycin-resistant A. pleuropneumoniae chromosomal DNA.  This gave lots of KanR transformants, but luckily I didn't take this as a final result.

    Instead I went back and redid the transformation of A. pleuropneumoniae with the PCR mixture, this time using a lot less  DNA.  I did this because the high DNA concentration used in the original transformation meant that many cells could have taken up multiple DNA fragments.  In H. influenzae such fragments are known to undergo ligation in the periplasm, allowing formation of chimeric recombinants that give very confusing results.  Using 100-fold less DNA still gave plenty of SpecR transformants, and I streaked 4 of these to get clean single colonies.  (Two of the picked colonies were large, and two were smaller, but all gave large colonies on their streak plates.)

    I tested 2 of these colonies by PCR.  Only one (14-1) gave the expected 3.6 kb full-length and 1.1 kb Spec cassette products.  The other (14-2) gave no product with the full-length primers and what looked to be a slightly smaller product with the Spec primers.  The control wildtype cells gave the expected 2.6 kb full-length fragment and no Spec fragment.



    At the same time I tested both colonies for the ability to be transformed.  Both were defective, with transformation frequencies 100-fold lower than the wildtype cells.  This is the most interesting result - it suggests that the toxin-antitoxin system in A. pleuropneumoniae plays the same role in competence as its homologue does in H. influenzae.

    Next steps: More comprehensive characterization of all the A. pleuropneumoniae mutants.  First do the full-length PCR on colonies 14-1, -2, -3 and -4, on the ∆toxin and ∆toxin/antitoxin mutants made by the honours student, and on the wildtype control, this time running the gel more slowly to better characterize the fragment lengths.  Then do additional PCRs using other primers, to confirm the mutant structures, and repeat the competence assays on all these strains.

    I'll also need to get all the final mutants sequenced, to confirm that they have only the expected deletions. I'll email the former RA to ask her the best way to do this (do I send genomic DNA or PCR products, what primers are best...).

    Semi-success

    Here are the results of the first attempt at getting full-length PCR products:





    In the left lane (high template) I see a faint full-length band, and a stronger band the expected size of one of the expected intermediates.  In the right lane (low template) I see only the intermediate band.

    This is a fine result.  I'm now using 0.5 µl of the high-template reaction as the template in a new reaction with the same F and R primers and an annealing temperature optimized for them.  I hope this will give me lots of the full-length product.

    In anticipation of having the desired full-length DNA fragment, I've just streaked out the recipient Actinobacillus pleuropneumoniae cells I will transform this fragment into.  There are several steps I need to do before the final transformation:

    • Streak out the honours undergrad's A. pleuropneumoniae SpcR mutants (she made three different ones with the same cassette I'm using).
    • Check the sensitivity of A. pleuropneumoniae to spectinomycin, since this is the selection I will be using for transformation by my fragment.  The honours undergrad did this but her notes are not very good here.  I need to identify a concentration that will prevent colony formation by the sensitive cells but allow it by the resistant cells.
    • Make a competent stock of the recipient (SpcS) by growing the cells in MIV starvation medium.  
    • Check the competence of these cells by transforming them with genetically marked DNA.  I know I have some old DNA for this purpose (NalR?), but it would be good to select for SpcR using DNA from one of the undergrad's SpcR strains, if I can find this.
    Before doing the final transformation I should also digest my transforming fragment with a couple of diagnostic restriction enzymes, just to be sure it is what I want.

    Progress! The big fragment-assembling PCR is running!

    The PCR amplification of the SpcR fragment worked fine yesterday (after I spent an hour staring at (forward and reverse and complement and reverse-complement) primer sequences to reassure myself that I'd ordered the correct ones).  And today I did a column cleanup and a gel purification.  And now I'm running the PCR reaction that I hope will assemble the complete DNA fragment.

    The part I'm doing now is shown below.  You can see the whole plan in this post.


    When I was getting ready to do the PCR I realized that first I needed to think in more detail about the events.  So here they are, in point form:
    1. Most of the green strands will just reanneal to their complements.  Probably many of the red and blue strands do too.  This is an unwanted process that reduces the availability of strands for the desired annealings and elongations. In normal PCR this is hindered by the very high concentrations of the primers, but here we only have the F and R primers.
    2. Primer F anneals to the lower red strand and primes synthesis of the upper red strand.  At the same time, primer R anneals to the upper blue strand and primes synthesis of the lower blue strand.  Both of these new strands accumulate linearly, not exponentially.  This provides more of the appropriate strands red and blue for STEP 4.
    3. STEP 4 above: Sometimes the upper red strand anneals to the lower green strand (by their 17 bp of complementarity), and elongation in both directions produces hybrid upper and lower (red + green) strands.  Similarly, sometimes the upper green strand anneals to the lower blue strand (by their 16 bp of complementarity), and elongation in both directions produces hybrid upper and lower (green + blue) strands.  All these strands also accumulate linearly, not exponentially.  They provide the effective strands for STEP 5.
    4. STEP 5 above: Sometimes the hybrid strands from STEP 4 anneal to each other instead of to their complements (by their 1260 bp of complementarity). Elongation in both directions produces the full-length strands.
    5. STEP 6 above: Now primers F and R can anneal to the full-length strands, leading to exponential amplification of this desired product. 
    Miscellaneous thoughts:
    • I don't know how much template is appropriate, so I'm trying two concentrations, one 100-fold lower than the other.
    • The rate-limiting step is STEP 4, the annealing of the red and green strands by their 17 bp overlap, and of the green and blue strands by their 16 bp overlap.  I'm using a fairly low annealing temperature to facilitate this.  If I don't get any product I'll try lowering it further.
    • If I hadn't mixed the red and blue fragments for purification I might have set up partial reactions (red + green and blue + green), to make it easier to monitor progress.  If the all-in-one reaction doesn't work I'll probably make some more of each fragment and purify them separately so I can troubleshoot in separate reactions.




    Successful fragment cleanup

    Last night I designed and ordered the two complex primers I'll need to amplify and insert the SpcR gene.  They won't get here until Monday, so today I did column and agarose-gel cleanups on the two other PCR fragments I'll be using.

    First I pooled what was left of the two PCR reactions (A and B).  The volume of each was about equal to the amount I had run in my gel yesterday (left panel below).  Then I did a column cleanup to get rid of the bulk of the PCR primers, using an old EconoSpin column that I'd revived by passing 0.2 N NaOH through it.


    Then I ran all of the eluate from that column in one lane of a 1.2% agarose gel.  I used only about 1/5 of the usual concentration of Ethidium Bromide, and I viewed the gel only with our hand-held long-wave UV lamp to avoid UV damage to the DNA.  The bands were faint but clear and I cut them out  together in one gel slice.

    Then I used our new Zymoclean gel-recovery kit to dissolve the agarose and recover the DNA, and ran 3 µl of the resulting 24 µl in unused lanes of the same gel I used yesterday, to see how much DNA I had recovered.

    The results look great.  The bands are sharp and bright and the right sizes, and the intensities suggest that I've recovered most and perhaps all of the DNA I began with.  So now this prep is in the fridge, waiting for the other piece of the construct.

    In the lab! Doing PCR! Successfully!

    I have primers for the A-R and A-F sites shown in the previous post, but they weren't designed to work with the F and R primers I also have (F with A-R, and R with A-F).  But I decided to go ahead and test them anyway before I order the new S-F and S-R primers I will need for amplifying the SpcR cassette, and for linking the other amplicons to it.


    The New England Biolabs primer evaluation software recommends against PCR using a pair of primers whose Tms differ by more than 5 °C, but I don't have much to lose, so I'm running them anyway.  I'm also testing another version of the A-F primer, designed by the honours student.
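    For a rough sense of how mismatched the Tms are, the old Wallace rule (2 °C per A/T, 4 °C per G/C) is enough.  NEB's tool uses a more sophisticated model, so these numbers are only approximate, and the primer sequences below are invented placeholders, not my real primers:

```python
# Rough Tm check for a primer pair using the Wallace rule:
# Tm = 2 °C per A/T + 4 °C per G/C.  Sequences are hypothetical.

def wallace_tm(primer):
    """Approximate Tm (°C) of a short oligo by the Wallace rule."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

f_primer  = "ATGCGTACCTGATCGAAGCT"   # hypothetical F primer (50% GC)
ar_primer = "TTATATCGATATTAAGCTAA"   # hypothetical AT-rich A-R primer

diff = abs(wallace_tm(f_primer) - wallace_tm(ar_primer))
print(f"Tm difference ~ {diff} °C")
if diff > 5:
    print("pair flagged: Tms differ by more than 5 °C")
```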

    I'm using our high-fidelity Q5 polymerase instead of Taq, because I need the product to be error-free.  Unfortunately each pair of primers requires a different annealing temperature, so I'm doing three PCR runs, each with a single tube.

    And tomorrow morning I can run a gel to see if any of them worked.

    It's tomorrow, and OMG!!!, all three PCRs gave excellent products!

    So I just need to design and order SpcR primers with tails that will base-pair to the inner ends of products A and B.


    New no-cloning plan for the antitoxin knockout

    The Methods section of a manuscript I was reviewing reminded me that it's possible to use PCR to create a gene knockout, and then transform the PCR product directly into naturally competent cells, without an intermediate cloning step.

    We actually tried out this method about 20 years ago, when the H. influenzae genome sequence first became available, but after some experiments we decided that cloning was usually more reliable.  But now I'm in a situation where cloning is being very unreliable indeed, enough so that I can't bring myself to try again.  So instead I'm going to try the direct-transformation method.

    Basic plan:
    1. In two independent PCR reactions, amplify the left and right genome segments flanking the gene to be deleted (the antitoxin gene).  These segments need to be long enough to later allow efficient homologous recombination of the final construct into the chromosome. 
    2. Separately amplify a Spectinomycin-resistance cassette (SpcR), using primers designed with tails complementary to the inner ends of the two genome fragments.  
    3. Remove all primers from these PCR products and mix them together in a PCR reaction mix with no added primers.
    4. Do one cycle of strand melting, strand annealing (of the primer tails), and strand extension by Taq. This produces two mid-length fragments that both contain the SpcR segment.
    5. Do another cycle of strand melting, strand annealing (this time of the full SpcR segment), and strand extension by Taq. This produces one full-length fragment containing both genome segments and the SpcR segment.
    6. Add the outermost genome primers (left primer of the left segment and right primer of the right segment) and carry out a normal PCR amplification.
    7. Transform the resulting fragment into competent A. pleuropneumoniae cells.
    8. Select for SpcR and confirm the new genotype by PCR.
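    The tailed primers in step 2 are the crux: each carries, 5' of its cassette-annealing core, the sequence of the inner end of a flanking genome fragment, so the pieces overlap and can anneal in steps 4-5.  A sketch with invented placeholder sequences (not the real primers):

```python
# Building a tailed cassette primer and checking the overlap it creates.
# All sequences here are invented placeholders.

COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMP)[::-1]

left_inner_end = "GATTACAGATTACAGAT"   # hypothetical 3' end of left fragment (top strand)
spc_f_core     = "ATGAGCAAAGTTAGCG"    # hypothetical 5' end of the SpcR cassette

# Tailed S-F primer: genome tail + cassette-annealing core.
s_f_primer = left_inner_end + spc_f_core
print(s_f_primer)

# The bottom strand of the resulting cassette amplicon ends in the reverse
# complement of the left fragment's inner end - the overlap used in step 4.
print(revcomp(left_inner_end))
```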



    Now I need to dig into the sequence file and primer files to figure out whether I can reuse primers we already have, and to design the new S-F and S-R primers with the appropriate tails.

    What else are my experiments telling me?

    The previous post discussed what I can call Problem area #1: the evidence that my plasmid prep results have been unreliable - that the absence of plasmid in the prep didn't mean that the cells I started with didn't contain plasmid. So now I need to go back through the other experiments, to check if my conclusions are still solid.

    Problem area #2:  Can the specR PCR fragment be ligated into a blunt-cut plasmid, when phosphorylation isn't needed?  Answer: NO.

    My first experiment said 'No', but it was flawed by using too high a ratio of plasmid to insert.  So I repeated it using much more specR fragment than plasmid.  This time the results were a cleaner 'No'. Religation of the cut plasmid gave 437 AmpR colonies from 510 µl of transformation mix.  Ligation of the same amount of plasmid in the presence of the specR fragment gave only 29 colonies from 150 µl of the transformation mix, and no SpecR colonies (from 150 µl) or AmpR SpecR colonies (from 150 µl).  A positive control transformation using a plasmid that carries both AmpR and SpecR gave several hundred colonies of each from the same volumes.
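    Since these counts came from different plated volumes, the fair comparison is colonies per ml of transformation mix.  Normalizing the numbers above:

```python
# Normalizing colony counts from different plated volumes to colonies/ml,
# using the counts given in the text.

def per_ml(colonies, plated_ul):
    """Colonies counted -> colonies per ml of transformation mix."""
    return colonies / (plated_ul / 1000.0)

religation = per_ml(437, 510)   # plasmid religated alone
plus_spec  = per_ml(29, 150)    # plasmid ligated with specR fragment present

print(f"religation:       {religation:.0f} AmpR colonies/ml")
print(f"+ specR fragment: {plus_spec:.0f} AmpR colonies/ml")
print(f"fold reduction:   {religation / plus_spec:.1f}x")
```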

    So I conclude that the specR PCR fragment cannot be ligated, and that it also interferes with ligation of the blunt-end plasmid.  One way to interpret this is that one end of the specR fragment is OK and the other is unligatable.  I'm discouraged but not surprised by this result, because the specR PCR product always behaves oddly in agarose gels: a bit blurry before cleanup and worse after cleanup.

    Solutions: (1) I could design new specR primers and try again.  We might even have other specR primers for this cassette. (2) I could try amplifying with the original primers from a different template.  The undergrad used a plasmid, I think.  I should have a plasmid with specR in a long-enough segment to encompass my primer sites.  (3) I could cut my existing specR PCR product with NheI to generate sticky ends.  This is what the undergrad had originally planned.  I would need to design new primers for the inverse-PCR step that generates the rest of the desired plasmid.  Or blunt-cut the specR fragment out of a plasmid and ligate this to the inverse-PCR product.


    Problem area #3:  Are the kinase (phosphorylation) reactions working?  Answer: YES.

    I originally tried to test this by phosphorylating the inverse-PCR product and self-ligating it, transforming E. coli and selecting for AmpR.  I got no transformants, so either the kinase reaction failed or there was another problem.  The ligation control, transformation control and plate-selection controls all worked fine.  This was when I discovered I'd been using very old kinase, but repeating the experiment with new kinase gave the same result.  Was the ATP stock bad?  No, repeating the kinase reactions using ligase buffer (which contains its own ATP) gave the same result.

    A better test of the kinase function comes from the grad student, who has been using it to label chromosomal DNA with 32P from 32P-ATP.  He's getting modestly successful incorporation, suggesting that this reaction is working OK.


    Problem area #4:  Is the inverse-PCR product's intact toxin gene toxic to E. coli?  Answer: Weak No (no evidence that it is).

    I tested this by making a different inverse-PCR product using the undergrad's old primers.  These cut off the last 5 amino acids of the toxin gene.  She was able to get successful ligation of this product to her specR PCR product, after kinasing a mixture of both fragments.  I ligated this with my kinased SpecR fragment.  This produced one AmpR SpecS colony and one AmpS SpecR colony.  The negative control self-ligation of the inverse-PCR fragment alone (not kinased) gave a few AmpR colonies.  These may result from self-ligation, but I didn't do a plasmid prep on them to check.

    I had only kinased the specR fragment, because I didn't want the inverse-PCR fragment to be able to self-ligate.  In retrospect, especially now that I know that the specR fragment is unligatable, I should have also kinased the inverse-PCR product as a better control to show that the kinase was working.  


    THE BIG QUESTION: SHOULD I KEEP TRYING?

    This project shouldn't have been such a big deal.  But it's the bottleneck in getting the toxin-antitoxin work finished and published.

    One possible plan:  Buy some NheI, cut the (blurry) specR PCR fragment and run it in a gel.  Do I get a nice sharp fragment of the right size?  If yes, design and order new inverse-PCR primers with NheI sites.  (Why did the undergrad choose NheI?  I have no idea.)  Cut the new inverse-PCR product with NheI and ligate it with the specR fragment.  Primers are cheap, so I could instead just design new specR and inverse-PCR primers with matching sites.

    Another plan:  I recently found a note I made at last summer's Gordon Conference, reminding me that a favourite colleague had offered to make this damned mutation for me.  But I don't like to ask this of him...


    What next in the antitoxin knockout endeavour?

    OK, time to end my two (three? four?) weeks of sulking because my experiments won't work...

    Where was I?  (... must consult my notebook)

    I was trying to resolve two issues.  The first was why ligation of my phosphorylated PCR fragments did not produce a plasmid that could transform E. coli to AmpR SpcR.  The second was why my plasmid preps were not producing any plasmid, from cells that I was quite sure contained a high-copy-number plasmid.

    This second issue was compromising my ability to investigate the first issue, so let's deal with it first.

    I was doing my plasmid preps using Econo-spin spin columns (from Epoch Life Sciences) and column reagents we had made up ourselves.  This is much cheaper than using spin-column kits from Qiagen or Sigma.  But the columns and reagents were quite old, and I was not even sure that I was using the right volumes of the reagents.

    Here's what I was doing:

    The basic procedure is to start with a version of the standard alkaline-lysis procedure:  
    1. Pellet cells and resuspend in a neutral buffer containing EDTA.
    2. Add 0.2 M NaOH + 1% SDS.  The SDS will lyse the cells and the high pH will cause the base-paired DNA strands to separate, 'denaturing' both chromosomal DNA and the plasmids.  Each plasmid's two DNA strands will stay looped together because they are interlocked circular molecules.
    3. Neutralize the NaOH and make the SDS insoluble by adding a low-pH potassium acetate solution.  The two plasmid strands will regain their base pairing, returning the plasmid to its normal configuration.
    4. Centrifuge the tube to pellet the cell debris, including the chromosomal DNA and most of the SDS.  The plasmid DNA and various soluble components remain in the supernatant.

    In the old-fashioned procedure the supernatant is extracted with phenol and then chloroform, the DNA (and RNA) is precipitated with ethanol, and the pellet is resuspended in TE buffer (often with RNase A added to degrade all the RNA).

    With a spin column the supernatant is instead placed in the top part of the column (see figure above, from Perkin-Elmer), and spun through it.  The DNA sticks to the filter membrane in the base of the column, and all the unwanted soluble material washes through and is discarded. The DNA stuck on the membrane is further cleaned with one or two washes of a special solution containing ethanol, and then the DNA is eluted from the membrane into a clean tube using a small volume of TE buffer or water (usually 50 µl).

    The kits typically are expensive (Qiagen: $1.35 per column). They include all the reagents, but the reagent recipes are kept secret.  The same columns work for a number of different DNA-purification procedures, but you need to buy a different kit for each to get the specific reagents used in each procedure.

    Epoch and other budget suppliers will sell you just the columns (Epoch, about $0.40 each) and provide recipes so you can make your own solutions for all the procedures.  But as I said, our Epoch columns were old (several years?) and I wasn't confident that my hand-written notes specified the correct volumes of reagents to use.  We had the sheet with the reagent recipes but I couldn't find one with the protocols.

    Anyway, I did a test.  With replicate samples I used our homemade reagents to do the alkaline lysis steps 1-4 above, and then finished some samples with a spin-column cleanup and some with phenol extraction and ethanol precipitation.  The column samples had some chromosomal DNA contamination but no plasmid and no RNA.  The phenol-etc samples had no chromosomal DNA and no plasmid but a lot of RNA.  Bummer.

    So I did another test, this time adding to the two treatments above samples where I did the alkaline lysis steps using reagents specified by an old non-column protocol, followed by the old-fashioned phenol extraction and ethanol precipitation steps.  This time I did get some plasmid from the old-style prep, but again none from either prep using the column reagents.

    So this suggests that the column reagents or volumes are at fault.  I checked online, but found a slightly different set of protocols, with some reagent names we didn't have the recipes for.  I contacted Epoch and they kindly sent me a new recipe sheet.

    Epoch also provided this advice about column storage and regeneration:
    If you have access to cold room you can store the column in a cold room. Or if you still have room in your refrigerator which does not go through defrost automatically you can also store your column there. We can also ship you some "moisture pack", most conveniently with your next order, which can be sealed in the plastic bag. Finally you can apply 20 ul of 0.2N NaOH to the old column 30 minutes before you do miniprep. This will regenerate the membrane to the maximum binding capacity.  
    So...  I suspect my preps didn't give plasmid because I was using the wrong volumes of the column alkaline-lysis reagents, not because the cells didn't contain plasmids.

    How does this affect my thinking about the rest of the work?  Might some of the colonies I discarded have contained a desired plasmid?  I'll put that in the next post.

    Spot 42 RNA doesn't explain our ∆hfq competence phenotype

    Gisela Storz from NIH gave a great seminar here yesterday about the many roles that small RNAs and very small proteins play in E. coli gene regulation.  It gave me an idea about the role the small-RNA-accessory protein Hfq might be playing in competence regulation.

    One example she discussed is the abundant Spot 42 RNA, so named because it was discovered as an RNA chromatography spot long before any function was discovered.  This small (109 nt) RNA acts in a feed-forward regulatory loop with the cAMP-dependent transcriptional activator CRP.  This regulation fine-tunes the control of the CRP-activated genes that provide the cell with alternative carbon sources when its preferred sugars are depleted.

    When the preferred sugars are available, Spot 42 RNA is expressed and limits the translation of genes for using the alternative carbon sources.  Expression of these genes is already low because there is no active CRP to stimulate their transcription, and Spot 42 RNA prevents translation of any transcripts that might inadvertently get made.  The contribution of Hfq (the yellow donut) is to help Spot 42 RNA interact with its target mRNAs and prevent their translation.

    But when the preferred sugars run out, active CRP represses transcription of the spf gene, so Spot 42 RNA isn't made. (Yes, I know I said that CRP is a transcriptional activator, but it can also be a repressor when its binding site is on top of or downstream of the promoter.)  Now the transcription that CRP stimulates produces mRNAs that are efficiently translated.
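    The circuit described above can be caricatured as a boolean feed-forward loop.  This is my own illustrative abstraction, not a published model:

```python
# Boolean caricature of the CRP / Spot 42 feed-forward loop:
# sugar depletion -> cAMP -> active CRP, which both activates the target
# promoters and represses spf, removing the Spot 42 translation block.

def alt_carbon_genes_on(preferred_sugar_present):
    crp_active  = not preferred_sugar_present   # sugar depletion activates CRP
    spot42      = not crp_active                # active CRP represses spf
    transcribed = crp_active                    # CRP activates target promoters
    translated  = not spot42                    # Spot 42 (with Hfq) blocks translation
    return transcribed and translated

print(alt_carbon_genes_on(preferred_sugar_present=True))    # False: doubly off
print(alt_carbon_genes_on(preferred_sugar_present=False))   # True: on
```

The point of the loop is the doubly-off state: when sugars are present, the genes are both untranscribed and untranslatable.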


    So in E. coli Spot 42 RNA improves the stringency of the regulation that we have been giving CRP full credit for.

    Where does H. influenzae competence come in?  Our Hfq knockout lowers competence, and the other honours student showed that this is because it makes competence more dependent on high cAMP than it usually is.  So I got excited by the idea that maybe Hfq's role in competence had to do with its role in mediating Spot 42 activity on CRP-dependent transcripts.  This could either be activity on CRP-dependent transcripts of the competence-specific transcriptional activator Sxy, or on the various CRP-dependent transcripts of the DNA uptake machinery.

    But this falls apart in two independent ways.

    First, my logic is backwards.  Now that I've created these diagrams it's obvious that Hfq's contribution to competence regulation acts in the opposite direction.  Knocking out hfq would be expected to increase the expression of competence genes under non-inducing conditions, not decrease it under inducing conditions.

    Second, it looks like H. influenzae doesn't even have a Spot 42 homolog.  Homologs are ubiquitous in the Enterobacteriaceae, and have recently been shown to also be common in the Vibrionaceae, with about 85% sequence identity.  But BLAST searches of the Pasteurellaceae with either type of sequence didn't find a single homolog.  Given the relationships shown in the tree below, I think the ancestral Pasteurellacean must have lost its Spot 42 homolog.