Field of Science

How to analyze next-gen DNA uptake data

We want to understand why competent Haemophilus influenzae cells take up some parts of H. influenzae chromosomes more efficiently than others.

To this end, before Christmas the grad student reisolated preparations of chromosomal DNA fragments from strain 26-028NP (hence 'NP') that had been taken up by competent cells of the standard lab strain Rd.  He sent these DNA samples to the former post-doc for sequencing (with the original 'input' DNAs as controls).  The post-doc has now sent us the sequencing data, and the grad student is going to analyze it, with two main goals:
  1. Determine how a DNA fragment's probability of uptake is affected by the presence of sequences matching the uptake signal sequence ('USS') motif.
  2. Identify other sequence factors that influence uptake.
The grad student has written up an overview of his plan for accomplishing these goals, and that has stimulated me to also think about how it could be done.

He (or the former post-doc?) has already done the first step, scoring the degree of preferential uptake for every position in the genome.  I think this was done by comparing each genome position's coverage in the recovered-DNA dataset to its coverage in the control 'input' dataset.  This gives a score they call the 'uptake ratio'.
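A minimal sketch of how such a ratio could be computed, assuming per-position read depths for the recovered and input samples are available as two equal-length lists.  This is not necessarily what was actually done: the function name, the normalization by total depth, and the `min_input` cutoff are all illustrative assumptions.

```python
def uptake_ratios(recovered, input_cov, min_input=1):
    """Per-position uptake ratio: depth-normalized recovered coverage
    divided by depth-normalized input coverage.

    Positions whose input coverage is below min_input (an assumed cutoff)
    get None, because the ratio there would be undefined or very noisy.
    """
    if len(recovered) != len(input_cov):
        raise ValueError("coverage lists must have the same length")
    total_rec = sum(recovered)
    total_in = sum(input_cov)
    ratios = []
    for rec, inp in zip(recovered, input_cov):
        if inp < min_input:
            ratios.append(None)
        else:
            ratios.append((rec / total_rec) / (inp / total_in))
    return ratios


# Toy example with made-up coverage values for five positions.
recovered = [40, 40, 10, 5, 5]
input_cov = [20, 20, 20, 20, 20]
print(uptake_ratios(recovered, input_cov))
```

Normalizing each dataset by its total depth means a ratio of 1 corresponds to a position that is neither enriched nor depleted in the recovered DNA.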

Here's a graph made by the grad student, showing the uptake ratios for two different preps of chromosomal DNA, over a 13 kb segment of the 1830 kb H. influenzae Rd chromosome. The dark blue points are for a DNA prep whose average fragment size was about 6 kb, and the red points for a DNA prep whose average fragment size was about 250 bp.  Because the actual distributions of fragment sizes in these preps have not yet been carefully measured, I'll refer to them as the large-fragment and small-fragment DNA preps respectively.


The first thing you notice is that the uptake ratios for the large-fragment prep are much less variable than those for the small-fragment prep.  We are very gratified to see this, because it's what we expected from the known contribution of the uptake motif.  Sequences with strong matches to this motif occur all around the chromosome, with an average spacing of about 1 kb.  Thus most fragments in the large-fragment prep will have contained at least one USS, but many fragments in the small-fragment prep will not have contained any USS.

The large-fragment prep does show two strong spikes of high uptake (at about 8000 and 18500 bp).  These are certainly very interesting, especially since they don't correspond to high uptake in the small-fragment prep.  But for now I'm just going to consider how we might analyze the small-fragment prep, since this provides much better resolution of what we think are the effects of individual USSs.

Here's a strategy I came up with:

Step 1:  Score each position of the NP genome for its match to either orientation of the 'genomic' USS motif.  This motif was identified by Gibbs Motif Sampler analysis of the Rd genome (see this paper).  Each position will have a '+' score and a '-' score; we need to make sure the positions are aligned at the most important part of the USS motif.  Because the score depends on correct alignment, the result will be punctate, with about one high-scoring position and about 999 low-scoring positions in each kb.  Here's a figure of what the analysis might look like for the 13 kb segment shown above.
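As a rough sketch of Step 1, the code below scores every window on both strands and assigns each score to the genome coordinate of a chosen anchor base within the motif, so that '+' and '-' scores for the same site line up.  Everything specific here is an assumption for illustration: the toy motif is just the commonly cited 9 bp USS core, the +2/-1 match/mismatch scoring stands in for a real position weight matrix from the motif-sampler output, and the `anchor` index is arbitrary.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical stand-in for the real motif model: a consensus string scored
# +2 per matching base and -1 per mismatch.
TOY_MOTIF = "AAGTGCGGT"


def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def window_score(window, motif=TOY_MOTIF):
    """Toy match score of one window against the motif."""
    return sum(2 if a == b else -1 for a, b in zip(window, motif))


def score_both_strands(genome, motif=TOY_MOTIF, anchor=4):
    """Return (plus, minus) score lists indexed by genome position.

    Each score is stored at the genome coordinate of motif base `anchor`,
    whichever strand the motif lies on. Positions too close to the ends
    of the (linear) sequence to have a full window keep the value None.
    """
    n, m = len(genome), len(motif)
    plus = [None] * n
    minus = [None] * n
    for i in range(n - m + 1):
        window = genome[i:i + m]
        plus[i + anchor] = window_score(window, motif)
        # On the minus strand, motif base k sits at genome coordinate i + m - 1 - k.
        minus[i + m - 1 - anchor] = window_score(reverse_complement(window), motif)
    return plus, minus


plus, minus = score_both_strands("TTTT" + TOY_MOTIF + "TTTT")
print(plus.index(max(s for s in plus if s is not None)))
```

This treats the sequence as linear; a real run on a circular chromosome would also need to score windows that span the origin.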


Step 2:  Using a reasonable score cutoff, create a list of positions that qualify as 'USS' for the initial analysis of the uptake data.  In the case above we'd include all positions scoring higher than 15.  

Step 3: For each position in the genome, calculate its distance from the nearest 'USS' on the above list.  For now don't distinguish between 'USS' in + or - orientations.  (I'm keeping 'USS' in quotes to remind us that we used only one of many possible criteria to define our list.)
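Steps 2 and 3 could be sketched as follows, assuming the '+' and '-' scores are held in two lists indexed by genome position (with None where no score exists).  The function names are hypothetical, and the `circular` option is my assumption, added because distances should wrap around the origin of a circular chromosome.

```python
from bisect import bisect_left


def uss_positions(plus, minus, cutoff=15):
    """Positions whose score on either strand exceeds the cutoff."""
    hits = []
    for pos, (p, m) in enumerate(zip(plus, minus)):
        best = max(s for s in (p, m, float("-inf")) if s is not None)
        if best > cutoff:
            hits.append(pos)
    return hits


def distance_to_nearest(genome_length, uss, circular=True):
    """Distance in bp from every genome position to the nearest listed 'USS'."""
    if not uss:
        raise ValueError("USS list is empty")
    uss = sorted(uss)
    distances = []
    for pos in range(genome_length):
        k = bisect_left(uss, pos)
        right = uss[k % len(uss)]   # first 'USS' at or after pos (wraps to the first one)
        left = uss[k - 1]           # last 'USS' before pos (wraps to the last one)
        candidates = [abs(pos - left), abs(right - pos)]
        if circular:
            # Going the other way around the circle may be shorter.
            candidates += [genome_length - c for c in candidates]
        distances.append(min(candidates))
    return distances


print(distance_to_nearest(10, [2, 7], circular=False))
```

Using a sorted list with `bisect` keeps this fast enough for a 1.8 Mb genome, since each position needs only one binary search.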

Step 4: For each genome position plot its uptake ratio as a function of its distance from a 'USS'.  Because most of the red peaks in the grad student's graph have uptake-ratio scores of about 4 and bases about 1 kb wide, I expect the graph to look something like this: 



There are a lot more points on this graph than on the previous one because there's a point for every position in the 1.8 Mb genome.  Most of the points fall on a rough band that drops from uptake ratios of 4 (peaks, for a very close 'USS') to uptake ratios that are about 0.1 (troughs, for positions that are more than 500 bp from a 'USS').

If we see a broad band with lots of scatter, this will mean that our distance-to-the-nearest-'USS' score doesn't capture other aspects of the USS that influence uptake.  These factors might include:
  1. whether the USS's orientation on the chromosome affects uptake (USS motifs are asymmetric)
  2. how well the USS's sequence matches the several different ways we can score sequences as possible USS (genome-based, uptake-based, and with or without internal interaction effects between positions)
  3. how much the presence and relative locations of additional USSs add to uptake
We will come back to the above analysis and develop more nuanced measures of the effects of nearby USS, judging success by how much each nuance reduces the scatter of the points.
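One way to put a number on 'scatter', so that different USS measures can be compared, is to summarize the uptake ratios within bins of distance.  This is only a sketch under my own assumptions: the bin width, the use of the median absolute deviation as the spread measure, and the function name are illustrative choices, not part of the plan above.

```python
from statistics import median


def scatter_by_distance(distances, ratios, bin_width=50):
    """Median uptake ratio, median absolute deviation, and point count
    for each distance bin. Positions with a ratio of None are skipped.

    Returns a dict keyed by the lower edge (bp) of each bin.
    """
    bins = {}
    for d, u in zip(distances, ratios):
        if u is None:
            continue
        bins.setdefault(d // bin_width, []).append(u)
    summary = {}
    for b in sorted(bins):
        values = bins[b]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        summary[b * bin_width] = (med, mad, len(values))
    return summary


# Toy example with made-up values.
print(scatter_by_distance([0, 10, 60, 70], [4.0, 3.0, 1.0, None]))
```

If a refined USS measure really captures more of what drives uptake, the per-bin spread should shrink when positions are binned by that measure instead of by raw distance.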

For now I'm more interested in identifying any non-USS sequence factors that influence uptake. These factors should appear in the above graph as outliers, positions whose uptake ratio is not correlated with their USS-distance score.  Our previous analysis suggests that these outliers should be common.  If they are common, they might be clustered as shown above, but they're probably more likely to be scattered all over the place and perhaps not easily distinguishable from the overall background scatter. 

The best way to see if these positions are not noise is to see if their scores correlate with genomic positions.  Below is one way I've thought of to do this.

Step 5:  Use the uptake vs USS-distance graph to develop an equation that best predicts uptake ratio (U) as a function of distance to nearest 'USS' (D, in bp).  For the above example, a very crude equation might be 

U = 0.1 or (4 - D/100), whichever is greater.

Step 6:  For each position in the genome, use this equation and the 'USS' list to predict an uptake ratio, and then calculate the difference between its predicted and observed uptake ratios.
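The crude equation and the Step 6 difference are easy to express in code.  In this sketch the sign convention (observed minus predicted, so positive values mean more uptake than the 'USS' distance predicts) is my assumption, as are the function and parameter names.

```python
def predicted_uptake(distance_bp, peak=4.0, floor=0.1, slope_bp=100.0):
    """Crude model from the text: U = 0.1 or (4 - D/100), whichever is greater."""
    return max(floor, peak - distance_bp / slope_bp)


def anomalies(distances, observed):
    """Observed minus predicted uptake ratio at each position.

    Positions with no observed ratio (None) stay None.
    """
    return [
        None if obs is None else obs - predicted_uptake(d)
        for d, obs in zip(distances, observed)
    ]


# Toy example with made-up observations at distances of 0 and 500 bp.
print(anomalies([0, 500], [5.0, 0.1]))
```

A better-fitting curve could be swapped in later by replacing `predicted_uptake` without touching the rest.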

Step 7:  Now plot this 'anomaly' as a function of genome position.  If we're lucky it will look something like this:


If some of the apparent scatter is due to positions where non-USS sequences influence uptake, these will show up as peaks and troughs above and below the main bands, and we can go on to analyze these sequences bioinformatically for shared features and experimentally for direct effects on uptake.  If the scatter really is due to noise, then it will be scattered over the genome and not fall into discrete peaks and troughs. 
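A simple way to tell discrete peaks and troughs from scattered noise is to look for runs of adjacent positions whose anomalies are all large and of the same sign.  The sketch below does that; the threshold and minimum run length are placeholder values that would need tuning against the real data, and the function name is hypothetical.

```python
def anomaly_runs(anomaly, threshold=1.0, min_length=50):
    """Find runs of consecutive positions whose anomaly exceeds the
    threshold in absolute value and keeps the same sign.

    Returns a list of (start, end, mean_anomaly) with end exclusive.
    None values break a run.
    """
    values = list(anomaly)
    runs = []
    start = 0
    sign = 0
    # A trailing None acts as a sentinel that closes any run still open.
    for pos, a in enumerate(values + [None]):
        if a is None or abs(a) <= threshold:
            s = 0
        else:
            s = 1 if a > 0 else -1
        if s != sign:
            if sign != 0 and pos - start >= min_length:
                segment = values[start:pos]
                runs.append((start, pos, sum(segment) / len(segment)))
            start, sign = pos, s
    return runs


# Toy example: one peak of length 3 and one trough of length 2.
print(anomaly_runs([0, 0, 2, 2, 2, 0, -3, -3], threshold=1.0, min_length=2))
```

Isolated noisy positions rarely form long same-sign runs, so the runs that survive a reasonable `min_length` are the candidates worth examining for shared sequence features.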

Ready for sequencing?

I think I finally have the appropriate PCR fragments from my A. pleuropneumoniae mutants, to be sent for sequencing:


I have 3 knockout mutants, removing the toxin, antitoxin and toxin+antitoxin segments (∆T, ∆A, and ∆TA respectively).  I designed new 'S-up' and 'S-dn' primers to use with the original 'F' and 'R' primers to amplify the segments on either side of the Spectinomycin-resistance cassette that's inserted at the sites of deletion.  I need to check the sequences of these to be sure that the appropriate segments have been removed, and that the remaining gene is intact.

I've successfully used these primers (black arrows above) to amplify the ∆T and ∆A segments shown above (light blue and lilac bars).  Now I just need to clean up the PCR products, check their concentrations, and send them with the appropriate S-up and S-dn primers (red arrows above) for sequencing.  I don't need to sequence the far ends of the fragments.

I also tried to use these primers for the ∆TA double knockout but for some reason I can't get any amplification.  This may mean that there's something wrong with the mutant, but I've decided I don't really need to discuss this mutant at all in our paper, since both the ∆T and ∆A mutants have normal growth and competence phenotypes.  (Well, I think I do need to do at least one more check of the transformation frequencies, since there's been a lot of variation in my colony counts.)

[Ooh, idea!  Maybe the ∆TA mutant won't amplify because its Spec cassette is inserted in the opposite orientation to the others!  The Honours student created each mutant by blunt-end ligation, so either orientation is possible.  I'll go set up one more pair of PCR reactions with the alternate combinations of primers right now...]

And YES!  Reversing the primers gave the expected amplification!

What do the toxin and antitoxin gene products do?

Now that I'm finally close to finishing my benchwork task for the Honours student's manuscript, I've gone back to thinking about the results and implications of our RNA-seq analysis.

When the Honours student wrote the manuscript (actually her Honours thesis, but in excellent manuscript format), we had only incomplete RNA-seq results - specifically we had only one replicate of the critical antitoxin mutant.  The other two replicates were in the pipeline at the time, and the full dataset was analyzed subsequently by the other Honours student when he stayed on for the summer.

I'm going to just summarize the results now, and come back to them later.

Basic points:

  1. The antitoxin knockout mutant has normal RNA levels for all the genes that regulate the competence regulon (crp and sxy, which encode the transcription enhancers CRP and Sxy, and cya, which encodes the adenylate cyclase that synthesizes the essential cyclic AMP cofactor for CRP).
  2. Consistent with this, the expression levels of the competence regulon genes are not very different than in wildtype cells.  A few genes are down by 40-50%, but most are near-normal, with error bars that overlap the range of wildtype expression (see his complicated green figure below - compare the heights of the bright-green bars with the spans of the grey shaded areas, which represent the normal expression levels at the bright-green timepoint).  
  3. The double knockout (∆toxin/antitoxin) transforms normally, so the competence defect of the antitoxin mutant is due to competence-blocking activity of the toxin.
  4. The transformation defect of the antitoxin knockout is much more extreme than these expression levels would predict.  We see few or no transformants (transformation frequencies less than 10^-8), whereas wildtype cells give transformation frequencies higher than 10^-3.  
  5. The antitoxin mutant also has an extreme DNA uptake defect, so the transformation defect is not caused by defective recombination machinery. 
  6. The summer student also did an RNA-seq analysis of the hfq knockout mutant he had worked on for his Honours project. This mutant has a more severe reduction in expression of all the competence-induced genes, but a much less severe defect in transformation (only about ten-fold lower than wildtype cells).  Thus the antitoxin mutant's competence defect is unlikely to be due to modestly lower expression of one or more key competence genes.
  7. In the antitoxin mutant the toxin mRNA is overexpressed during exponential growth.  This is consistent with the roles of related antitoxins in other systems, where they act as repressors of transcription of the toxin-antitoxin operon.
  8. The antitoxin knockout cells have a normal doubling time in exponential growth, and survive competence induction and stationary phase just as well as wildtype cells, so the toxin protein must not be toxic for growth or survival.



Where does all this leave us?  One possibility is that the toxin directly blocks DNA uptake, by some mechanism we are completely ignorant of.  But related toxins are known to act by cutting mRNAs on the ribosome, so it's possible that the RNA-seq results are misleading in that they detect all RNAs, including ones that have been cut.

Luckily the summer student wrote an R script to compare coverage patterns between wildtype and mutant cells, and generated lovely graphics showing the effect of the antitoxin knockout on coverage of segments containing competence-induced genes.  Just as an example, here's his comparison of expression of the pilABCD operon in wildtype (purple) and hypercompetent (green) cells.


He's generated data for all the competence-induced genes in the antitoxin knockout, so I'll check these to see if there are any alterations in transcript profiles that might indicate the action of an mRNA-cleaving toxin.
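One hypothetical way to make this check quantitative (this is not a description of the summer student's R script) is to compare the shape of the coverage profile along a gene, independent of overall expression level.  The sketch assumes per-position coverage lists for the same segment in wildtype and mutant cells; the pseudocount and the function name are my own illustrative choices.

```python
from math import log2


def shape_log_ratio(wildtype_cov, mutant_cov, pseudocount=1.0):
    """Per-position log2 ratio of mutant to wildtype coverage after each
    profile has been scaled by its own mean.

    Values near 0 everywhere mean the two profiles have the same shape.
    A sustained shift partway along a transcript could hint at cleavage.
    """
    if len(wildtype_cov) != len(mutant_cov):
        raise ValueError("coverage lists must have the same length")
    wt_mean = sum(wildtype_cov) / len(wildtype_cov)
    mu_mean = sum(mutant_cov) / len(mutant_cov)
    return [
        log2(((m + pseudocount) / (mu_mean + pseudocount))
             / ((w + pseudocount) / (wt_mean + pseudocount)))
        for w, m in zip(wildtype_cov, mutant_cov)
    ]


# Toy example: mutant coverage drops to zero over the second half.
print(shape_log_ratio([10, 10, 10, 10], [10, 10, 0, 0]))
```

Scaling each profile by its own mean separates a change in transcript shape from a simple change in expression level, which the RNA-seq summaries above already cover.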


Toxin/antitoxin knockout updates, and bonus DNA uptake results

My last post was all about failure, so it's high time I updated things with some successes.

Constructing an Actinobacillus pleuropneumoniae antitoxin gene knockout:  At the last report, I had what I thought were four independent knockout mutants, but my attempts to PCR-amplify the genomic segment containing the knockout were not working.

I eventually switched to using a different thermostable polymerase (NEB's standard OneTaq) rather than the fancier Q5 polymerase I had been using.  Eureka - the PCRs all worked perfectly, giving strong bands of approximately the expected sizes.

...then I let everything sit around for a month while I dealt with other things...

Now I'm finally following up.  The first step is to digest these PCR products with a few other enzymes that should cut in either the genomic segments or the inserted SpecR cassette.  I've made rough predictions of the expected fragment sizes, which are all different for the ∆A mutant, wildtype cells, and the two mutants made by the Honours student (∆T and ∆TA).

The next step will be to do more PCR amplifications.  My original amplifications used the F and R primers that amplify a 2.6 kb segment containing the toxin and antitoxin genes (~300 bp each).  Now I'll use the F primer with the S-R reverse primer for the SpecR cassette, and the R primer with the S-F forward primer for the cassette.

If these both give the expected fragments then I'll (probably) send the PCR amplicons for each mutant to be sequenced.

If the sequencing confirms that the knocked-out genes are gone but the remaining gene is intact, then I'll give a sigh of relief.

Determining the competence phenotype of the Actinobacillus pleuropneumoniae antitoxin gene knockout:  My first test of the transformability of my first two ∆antitoxin mutants showed transformation defects, but in later tests they transformed within the range of the wildtype control.  But there was a lot of experiment-to-experiment variation in transformation levels (see graph below), so I'd like to do it one more time, to get clean publishable data.


Bonus DNA uptake results:  Just before Christmas the grad student finished his DNA preps of H. influenzae chromosomal DNA fragments that had been recovered after being taken up into the periplasm of competent H. influenzae.  He sent these to the former post-doc for sequencing, and the post-doc has now sent us some lovely preliminary results.  

The grad student had used DNA preps that had been sheared to two different size ranges.  We expected the genome coverage of the long fragments (mean length ~6 kb) to be fairly uniform, since almost all of them should contain at least one instance of the preferred uptake sequence motif.  These 'USS' motifs are distributed fairly evenly around the chromosome, with a mean spacing of about 1 kb.  We do see this, but with enough anomalies to keep things interesting.  And we expected coverage by the short fragments (mean length ~0.25 kb) to be much more strongly dependent on chromosomal position, since many such fragments would not include a USS.  And we do see this, again with interesting anomalies.
  

DAMN! Complete PCR failure!

Yesterday I ran a PCR amplification using DNAs from single colonies of 7 different A. pleuropneumoniae isolates, and got absolutely no DNA fragments from any of them.

This amplification worked fine last time.  Can I figure out what went wrong?

  • I checked the run record of the PCR machine - it looks fine.
  • I checked the freezer box with the tubes of dNTP stock, 5X buffer, and Q5 polymerase, to be sure I hadn't picked up a wrong tube.
  • I checked my notes, to be sure I hadn't left out any component of the reaction mix.  I'd checked off each reagent as I added it, and the final volume was as expected.
  • I checked the 'F' and 'R' primer tubes (in another freezer box) to make sure I'd used the correct ones.  I'd made up more of the 10 mM dilution stock, so I also checked that I'd used the right tubes of the more-concentrated 100 mM stock to do this.  I even checked the remaining volumes in the two primer tubes - if I'd added one primer twice and not the other these volumes should differ by about 17 µl, but they're within a few µl.
  • I prepped the colony DNAs slightly differently.  Last time (prep 1) I put a whole colony into 100 µl of medium, then diluted 5 µl of that into 45 µl water and heated to 98 °C for 10 min to lyse the cells and free their DNA.  This time (prep 2) I put part of a colony into 100 µl water, heated that, and then pelleted out any cell debris.  Both times I used 1 µl of the heated sample.
What could I try now?
  • Use leftover Prep 1 colony DNA as template
  • Vortex the Prep 2 colony DNA tubes
  • Use as template purified DNA from lab stocks
  • Use a different pair of primers (the Spec-cassette ones worked well last time)
  • Repeat with the same reagents and template I used this time
  • Make fresh colony DNA preps
  • Make proper DNA stocks to use as templates
Plan:  
  • Prep 2 14-1 colony DNA, Spec primers
  • Vortexed Prep 2 14-1 colony DNA, F & R primers
  • Prep 1 14-1 colony DNA, F & R primers
  • Prep 1 14-1 colony DNA, Spec primers
  • 1/100 dilution of lab-stock DNA, F & R primers




    Success

    When I last posted, nearly 3 weeks ago, my first attempt to generate the desired full-length knockout construct had given a mixture of fragments rather than just the desired full-length one.  But this mixture did include a relatively faint fragment of the desired size (3.6 kb).

    I did try to get a better PCR product, but increasing the annealing temperature made things worse, and I couldn't find a PCR app that would let me diagnose which incorrect-priming reactions were producing the unwanted fragments.  So I went ahead and transformed competent Actinobacillus pleuropneumoniae cells with the mixture, selecting for spectinomycin resistance.

    My logic was that only the desired fragment is likely to efficiently transform cells to SpecR, because other fragments were unlikely to have the correct homologous DNAs flanking the SpecR cassette.  If the 3.6 kb fragment was what I hoped it was, I should get thousands of transformants even though it was only about 10% of the total DNA in the mixture.  If it wasn't what I wanted, then it would probably transform very inefficiently if at all and I would get very few transformants.

    I got thousands of transformants in my first try.  Since the real goal of this project is to find out whether knocking out the antitoxin gene prevents transformation in A. pleuropneumoniae as it does in H. influenzae, I did a quick-and-dirty competence assay, using 7 pooled SpecR colonies and some kanamycin-resistant A. pleuropneumoniae chromosomal DNA.  This gave lots of KanR transformants, but luckily I didn't take this as a final result.

    Instead I went back and redid the transformation of A. pleuropneumoniae with the PCR mixture, this time using a lot less  DNA.  I did this because the high DNA concentration used in the original transformation meant that many cells could have taken up multiple DNA fragments.  In H. influenzae such fragments are known to undergo ligation in the periplasm, allowing formation of chimeric recombinants that give very confusing results.  Using 100-fold less DNA still gave plenty of SpecR transformants, and I streaked 4 of these to get clean single colonies.  (Two of the picked colonies were large, and two were smaller, but all gave large colonies on their streak plates.)

    I tested 2 of these colonies by PCR.  Only one (14-1) gave the expected 3.6 kb full-length product and 1.1 kb Spec cassette products.  The other (14-2) gave no product with the full-length primers and what looked to be a slightly small product with the Spec primers.  The control wildtype cells gave the expected 2.6 kb full-length fragment and no Spec fragment.
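A quick arithmetic check on the full-length sizes, as a sketch only.  It assumes the knockout replaces roughly the ~300 bp antitoxin gene with a cassette segment of about 1260 bp (the overlap length mentioned in the fragment-assembly post further down); the exact deletion and insertion boundaries aren't stated here, so the result is approximate.

```python
def expected_amplicon_bp(wildtype_bp, deleted_bp, inserted_bp):
    """Expected full-length PCR product after a deletion/insertion."""
    return wildtype_bp - deleted_bp + inserted_bp


# Assumed values: 2.6 kb wildtype product, ~300 bp deleted, ~1260 bp inserted.
print(expected_amplicon_bp(2600, 300, 1260))  # 3560
```

That comes out close to the 3.6 kb product seen for colony 14-1, which is about as much agreement as a standard gel can show.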



    At the same time I tested both colonies for the ability to be transformed.  Both were defective, with transformation frequencies 100-fold lower than the wildtype cells.  This is the most interesting result - it suggests that the toxin-antitoxin system in A. pleuropneumoniae plays the same role in competence as its homologue does in H. influenzae.

    Next steps: More comprehensive characterization of all the A. pleuropneumoniae mutants.  First do the full-length PCR on colonies 14-1, -2, -3 and -4, on the ∆toxin and ∆toxin/antitoxin mutants made by the honours student, and on the wildtype control, this time running the gel more slowly to better characterize the fragment lengths.  Then do additional PCRs using other primers, to confirm the mutant structures, and repeat the competence assays on all these strains.

    I'll also need to get all the final mutants sequenced, to confirm that they have only the expected deletions. I'll email the former RA to ask her the best way to do this (do I send genomic DNA or PCR products, what primers are best...).

    Semi-success

    Here are the results of the first attempt at getting full-length PCR products:





    In the left lane (high template) I see a faint full-length band, and a stronger band of the size expected for one of the intermediates.  In the right lane (low template) I see only the intermediate band.

    This is a fine result.  I'm now using 0.5 µl of the high-template reaction as the template in a new reaction with the same F and R primers and an annealing temperature optimized for them.  I hope this will give me lots of the full-length product.

    In anticipation of having the desired full-length DNA fragment, I've just streaked out the recipient Actinobacillus pleuropneumoniae cells I will transform this fragment into.  There are several steps I need to do before the final transformation:

    • Streak out the honours undergrad's A. pleuropneumoniae SpcR mutants (she made three different ones with the same cassette I'm using).
    • Check the sensitivity of A. pleuropneumoniae to spectinomycin, since this is the selection I will be using for transformation by my fragment.  The honours undergrad did this but her notes are not very good here.  I need to identify a concentration that will prevent colony formation by the sensitive cells but allow it by the resistant cells.
    • Make a competent stock of the recipient (SpcS) by growing the cells in MIV starvation medium.  
    • Check the competence of these cells by transforming them with genetically marked DNA.  I know I have some old DNA for this purpose (NalR?), but it would be good to select for SpcR using DNA from one of the undergrad's SpcR strains, if I can find this.
    Before doing the final transformation I should also digest my transforming fragment with a couple of diagnostic restriction enzymes, just to be sure it is what I want.

    Progress! The big fragment-assembling PCR is running!

    The PCR amplification of the SpcR fragment worked fine yesterday (after I spent an hour staring at (forward and reverse and complement and reverse-complement) primer sequences to reassure myself that I'd ordered the correct ones).  And today I did a column cleanup and a gel purification.  And now I'm running the PCR reaction that I hope will assemble the complete DNA fragment.

    The part I'm doing now is shown below.  You can see the whole plan in this post.


    When I was getting ready to do the PCR I realized that first I needed to think in more detail about the events.  So here they are, in point form:
    1. Most of the green strands will just reanneal to their complements.  Probably many of the red and blue strands do too.  This is an unwanted process that reduces the availability of strands for the desired annealings and elongations. In normal PCR this is hindered by the very high concentrations of the primers, but here we only have the F and R primers.
    2. Primer F anneals to the lower red strand and primes synthesis of the upper red strand.  At the same time, primer R anneals to the upper blue strand and primes synthesis of the lower blue strand.  Both of these new strands accumulate linearly, not exponentially.  This provides more of the appropriate red and blue strands for STEP 4.
    3. STEP 4 above: Sometimes the upper red strand anneals to the lower green strand (by their 17 bp of complementarity), and elongation in both directions produces hybrid upper and lower (red + green) strands.  Similarly, sometimes the upper green strand anneals to the lower blue strand (by their 16 bp of complementarity), and elongation in both directions produces hybrid upper and lower (green + blue) strands.  All these strands also accumulate linearly, not exponentially.  They provide the effective strands for STEP 5.
    4. STEP 5 above: Sometimes the hybrid strands from STEP 4 anneal to each other instead of to their complements (by their 1260 bp of complementarity). Elongation in both directions produces the full-length strands.
    5. STEP 6 above: Now primers F and R can anneal to the full-length strands, leading to exponential amplification of this desired product. 
    Miscellaneous thoughts:
    • I don't know how much template is appropriate, so I'm trying two concentrations, one 100-fold lower than the other.
    • The rate-limiting step is STEP 4, the annealing of the red and green strands by their 17 bp overlap, and of the green and blue strands by their 16 bp overlap.  I'm using a fairly low annealing temperature to facilitate this.  If I don't get any product I'll try lowering it further.
    • If I hadn't mixed the red and blue fragments for purification I might have set up partial reactions (red + green and blue + green), to make it easier to monitor progress.  If the all-in-one reaction doesn't work I'll probably make some more of each fragment and purify them separately so I can troubleshoot in separate reactions.




    Successful fragment cleanup

    Last night I designed and ordered the two complex primers I'll need to amplify and insert the SpcR gene.  They won't get here until Monday, so today I did column and agarose-gel cleanups on the two other PCR fragments I'll be using.

    First I pooled what was left of the two PCR reactions (A and B).  The volume of each was about equal to the amount I had run in my gel yesterday (left panel below).  Then I did a column cleanup to get rid of the bulk of the PCR primers, using an old EconoSpin column that I'd revived by passing 0.2 N NaOH through it.


    Then I ran all of the eluate from that column in one lane of a 1.2% agarose gel.  I used only about 1/5 of the usual concentration of Ethidium Bromide, and I viewed the gel only with our hand-held long-wave UV lamp to avoid UV damage to the DNA.  The bands were faint but clear and I cut them out  together in one gel slice.

    Then I used our new Zymoclean gel-recovery kit to dissolve the agarose and recover the DNA, and ran 3 µl of the resulting 24 µl in unused lanes of the same gel I used yesterday, to see how much DNA I had recovered.

    The results look great.  The bands are sharp and bright and the right sizes, and the intensities suggest that I've recovered most and perhaps all of the DNA I began with.  So now this prep is in the fridge, waiting for the other piece of the construct.

    In the lab! Doing PCR! Successfully!

    I have primers for the A-R and A-F sites shown in the previous post, but they weren't designed to work with the F and R primers I also have (F with A-R, and R with A-F).  But I decided to go ahead and test them anyway before I order the new S-F and S-R primers I will need for amplifying the SpcR cassette, and for linking the other amplicons to it.


    The New England Biolabs primer evaluation software recommends against PCR using a pair of primers whose Tms differ by more than 5 °C, but I don't have much to lose, so I'm running them anyway.  I'm also testing another version of the A-F primer, designed by the honours student.
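For a back-of-envelope version of this check, here is a sketch that estimates Tm with the simple Wallace rule (2 °C per A or T, 4 °C per G or C) and flags pairs that differ by more than 5 °C.  This is not the algorithm the NEB software uses, and it is only a rough guide for short oligos; the example sequences are made up.

```python
def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def tm_mismatch(primer_a, primer_b, max_diff=5):
    """Return (Tm difference, True if it exceeds max_diff)."""
    diff = abs(wallace_tm(primer_a) - wallace_tm(primer_b))
    return diff, diff > max_diff


# Made-up primer sequences, for illustration only.
print(tm_mismatch("ATGCATGCATGCATGCAT", "GGCGCCGGATGCGGCC"))
```

For a real decision the supplier's own calculator, which accounts for buffer and polymerase, is the better guide.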

    I'm using our high-fidelity Q5 polymerase instead of Taq, because I need the product to be error-free.  Unfortunately each pair of primers requires a different annealing temperature, so I'm doing three PCR runs, each with a single tube.

    And tomorrow morning I can run a gel to see if any of them worked.

    It's tomorrow, and OMG!!!, all three PCRs gave excellent products!

    So I just need to design and order SpcR primers with tails that will base-pair to the inner ends of products A and B.


    New no-cloning plan for the antitoxin knockout

    The Methods section of a manuscript I was reviewing reminded me that it's possible to use PCR to create a gene knockout, and then transform the PCR product directly into naturally competent cells, without an intermediate cloning step.

    We actually had tried out this method about 20 years ago, when the H. influenzae genome sequence first became available, but after some experiments we decided that cloning was usually more reliable.  But now I'm in a situation where cloning is being very unreliable indeed, enough so that I can't bring myself to try again.  So instead I'm going to try the direct-transformation method.

    Basic plan:
    1. In two independent PCR reactions, amplify the left and right genome segments flanking the gene to be deleted (the antitoxin gene).  These segments need to be long enough to later allow efficient homologous recombination of the final construct into the chromosome. 
    2. Separately amplify a Spectinomycin-resistance cassette (SpcR), using primers designed with tails complementary to the inner ends of the two genome fragments.  
    3. Remove all primers from these PCR products and mix them together in a PCR reaction mix with no added primers.
    4. Do one cycle of strand melting, strand annealing (of the primer tails), and strand extension by Taq. This produces two mid-length fragments that both contain the SpcR segment.
    5. Do another cycle of strand melting, strand annealing (this time of the full SpcR segment), and strand extension by Taq. This produces one full-length fragment containing both genome segments and the SpcR segment.
    6. Add the outermost genome primers (left primer of the left segment and right primer of the right segment) and carry out a normal PCR amplification.
    7. Transform the resulting fragment into competent A. pleuropneumoniae cells.
    8. Select for SpcR and confirm the new genotype by PCR.
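The tailed primers in step 2 of the plan above can be sketched in code.  This is an illustration under my own assumptions, not the actual primer design: the sequences are toy strings, the tail and annealing lengths are arbitrary defaults, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tailed_primers(left_flank, cassette, right_flank, tail_len=17, anneal_len=20):
    """Design cassette primers whose 5' tails match the inner ends of the flanks.

    All three sequences are given 5'->3' on the top strand.
    Forward primer: end of left flank (tail) + start of cassette.
    Reverse primer: reverse complement of end of cassette + start of right flank.
    """
    forward = left_flank[-tail_len:] + cassette[:anneal_len]
    reverse = reverse_complement(cassette[-anneal_len:] + right_flank[:tail_len])
    return forward, reverse


def tailed_cassette(left_flank, cassette, right_flank, tail_len=17):
    """Top strand of the cassette PCR product, including both tails."""
    return left_flank[-tail_len:] + cassette + right_flank[:tail_len]


# Toy sequences, far shorter than real fragments.
left, spc, right = "AAAACCCC", "GGGGTTTTGGGG", "ATATCGCG"
fwd, rev = tailed_primers(left, spc, right, tail_len=4, anneal_len=4)
print(fwd, rev)
print(left + spc + right)  # sequence of the intended full-length construct
```

The tails are what let the cassette product anneal to the inner ends of the two genome fragments in the primer-free cycles, so their length sets how easily that annealing happens.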



    Now I need to dig into the sequence file and primer files to figure out whether I can reuse primers we already have, and to design the new SF and SR primers with the appropriate tails.

    What else are my experiments telling me?

    The previous post discussed what I can call Problem area #1: the evidence that my plasmid prep results have been unreliable - that the absence of plasmid in the prep didn't mean that the cells I started with didn't contain plasmid. So now I need to go back through the other experiments, to check if my conclusions are still solid.

    Problem area #2:  Can the specR PCR fragment be ligated into a blunt-cut plasmid, when phosphorylation isn't needed?  Answer: NO.

    My first experiment said 'No', but it was flawed by using too high a ratio of plasmid to insert.  So I repeated it using much more specR fragment than plasmid.  This time the results were a cleaner 'No'. Religation of the cut plasmid gave 437 AmpR colonies for 510 µl transformation mix.  Ligation of the same amount of plasmid in the presence of the specR fragment gave only 29 colonies from 150 µl of the transformation mix, and no SpecR colonies (from 150 µl) or AmpR SpecR colonies (from 150 µl).  A positive control transformation using a plasmid that carries both AmpR and SpecR gave several hundred colonies of each from the same volumes.
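Because the plated volumes differ, the colony counts are easier to compare once they are put on a common scale.  A small sketch, using only the counts and volumes given above (the per-100 µl scale and the function name are my choices):

```python
def colonies_per_100ul(colonies, volume_ul):
    """Scale a colony count to a common 100 µl of transformation mix."""
    return 100.0 * colonies / volume_ul


religation = colonies_per_100ul(437, 510)   # cut plasmid religated alone
with_insert = colonies_per_100ul(29, 150)   # same plasmid plus specR fragment
print(round(religation, 1), round(with_insert, 1), round(religation / with_insert, 1))
# 85.7 19.3 4.4
```

On this scale the specR fragment reduces the yield of AmpR colonies about 4-fold.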

    So I conclude that the specR PCR fragment cannot be ligated, and that it also interferes with ligation of the blunt-end plasmid.  One way to interpret this is that one end of the specR fragment is OK and the other is unligatable.  I'm discouraged but not surprised by this result, because the specR PCR product always behaves oddly in agarose gels: a bit blurry before cleanup and worse after cleanup.

    Solutions: (1) I could design new specR primers and try again.  We might even have other specR primers for this cassette. (2) I could try amplifying with the original primers from a different template.  The undergrad used a plasmid, I think.  I should have a plasmid with specR in a long-enough segment to encompass my primer sites.  (3) I could cut my existing specR PCR product with NheI to generate sticky ends.  This is what the undergrad had originally planned.  I would need to design new primers for the inverse-PCR step that generates the rest of the desired plasmid.  Or I could blunt-cut the specR fragment out of a plasmid and ligate this to the inverse-PCR product.


    Problem area #3:  Are the kinase (phosphorylation) reactions working?  Answer: YES.

    I originally tried to test this by phosphorylating the inverse-PCR product and self-ligating it, transforming E. coli and selecting for AmpR.  I got no transformants, so either the kinase reaction failed or there was another problem.  The ligation control, transformation control and plate-selection controls all worked fine.  This was when I discovered I'd been using very old kinase, but repeating the experiment with new kinase gave the same result.  Was the ATP stock bad?  No, repeating the kinase reactions using ligase buffer (which contains its own ATP) gave the same result.

    A better test of the kinase function comes from the grad student, who has been using it to label chromosomal DNA with 32P from 32P-ATP.  He's getting modestly successful incorporation, suggesting that this reaction is working OK.


    Problem area #4:  Is the inverse-PCR product's intact toxin gene toxic to E. coli?  Answer: Weak No (no evidence that it is).

    I tested this by making a different inverse-PCR product using the undergrad's old primers.  These cut off the last 5 amino acids of the toxin gene.  She was able to get successful ligation of this product to her specR PCR product, after kinasing a mixture of both fragments.  I ligated this with my kinased SpecR fragment.  This produced one AmpR SpecS colony and one AmpS SpecR colony.  The negative control self-ligation of the inverse-PCR fragment alone (not kinased) gave a few AmpR colonies.  If these do result from self-ligation (I didn't do a plasmid prep on them).

    I had only kinased the specR fragment, because I didn't want the inverse-PCR fragment to be able to self-ligate.  In retrospect, especially now that I know that the specR fragment is unligatable, I should have also kinased the inverse-PCR product as a better control to show that the kinase was working.  


    THE BIG QUESTION: SHOULD I KEEP TRYING?

    This project shouldn't have been such a big deal.  But it's the bottleneck in getting the toxin-antitoxin work finished and published.

    One possible plan:  Buy some NheI, cut the (blurry) specR PCR fragment and run it in a gel.  Do I get a nice sharp fragment of the right size?  If yes, design and order new inverse-PCR primers with NheI sites.  (Why did the undergrad choose NheI?  I have no idea.)  Cut the new inverse-PCR product with NheI and ligate it with the specR fragment.  Primers are cheap, so I could instead just design new specR and inverse-PCR primers with matching sites.

    Another plan:  I recently found a note I made at last summer's Gordon Conference, reminding me that a favourite colleague had offered to make this damned mutation for me.  But I don't like to ask this of him...


    What next in the antitoxin knockout endeavour?

    OK, time to end my two? three? four? weeks of sulking because my experiments won't work...

    Where was I?  (... must consult my notebook)

    I was trying to resolve two issues.  The first was why ligation of my phosphorylated PCR fragments did not produce a plasmid that could transform E. coli to AmpR SpcR.  The second was why my plasmid preps were not producing any plasmid, from cells that I was quite sure contained a high-copy-number plasmid.

    This second issue was compromising my ability to investigate the first issue, so let's deal with it first.

    I was doing my plasmid preps using Econo-spin spin columns (from Epoch Life Sciences) and column reagents we had made up ourselves.  This is much cheaper than using spin-column kits from Qiagen or Sigma.  But the columns and reagents were quite old, and I was not even sure that I was using the right volumes of the reagents.

    Here's what I was doing:

    The basic procedure is to start with a version of the standard alkaline-lysis procedure:  
    1. Pellet cells and resuspend in a neutral buffer containing EDTA.
    2. Add 0.2 M NaOH + 1% SDS.  The SDS will lyse the cells and the high pH will cause the base-paired DNA strands to separate, 'denaturing' both chromosomal DNA and the plasmids.  Each plasmid's two DNA strands will be looped together because they are interlocked circular molecules.
    3. Neutralize the NaOH and make the SDS insoluble by adding a low-pH potassium acetate solution.  The two plasmid strands will regain their base pairing, returning the plasmid to its normal configuration.
    4. Centrifuge the tube to pellet the cell debris, including the chromosomal DNA and most of the SDS.  The plasmid DNA and various soluble components remain in the supernatant.

    In the old-fashioned procedure the supernatant is extracted with phenol and then chloroform, the DNA (and RNA) is precipitated with ethanol, and the pellet is resuspended in TE buffer (often with RNase A added to degrade all the RNA).

    With a spin column the supernatant is instead placed in the top part of the column (see figure above, from Perkin-Elmer), and spun through it.  The DNA sticks to the filter membrane in the base of the column, and all the unwanted soluble material washes through and is discarded. The DNA stuck on the membrane is further cleaned with one or two washes of a special solution containing ethanol, and then the DNA is eluted from the membrane into a clean tube using a small volume of TE buffer or water (usually 50 µl).

    The kits typically are expensive (Qiagen: $1.35 per column). They include all the reagents, but the reagent recipes are kept secret.  The same columns work for a number of different DNA-purification procedures, but you need to buy a different kit for each to get the specific reagents used in each procedure.

    Epoch and other budget suppliers will sell you just the columns (Epoch, about $0.40 each) and provide recipes so you can make your own solutions for all the procedures.  But as I said, our Epoch columns were old (several years?) and I wasn't confident that my hand-written notes specified the correct volumes of reagents to use.  We had the sheet with the reagent recipes but I couldn't find one with the protocols.

    Anyway, I did a test.  With replicate samples I used our homemade reagents to do the alkaline lysis steps 1-4 above, and then finished some samples with a spin-column cleanup and some with phenol extraction and ethanol precipitation.  The column samples had some chromosomal DNA contamination but no plasmid and no RNA.  The phenol-etc samples had no chromosomal DNA and no plasmid but a lot of RNA.  Bummer.

    So I did another test, this time adding to the two treatments above samples where I did the alkaline lysis steps using reagents specified by an old non-column protocol, followed by the old-fashioned phenol extraction and ethanol precipitation steps.  This time I did get some plasmid from the old-style prep, but again none from either prep using the column reagents.

    So this suggests that the column reagents or volumes are at fault.  I checked online, but found a slightly different set of protocols, with some reagent names we didn't have the recipes for.  I contacted Epoch and they kindly sent me a new recipe sheet.

    Epoch also provided this advice about column storage and regeneration:
    If you have access to cold room you can store the column in a cold room. Or if you still have room in your refrigerator which does not go through defrost automatically you can also store your column there. We can also ship you some "moisture pack", most conveniently with your next order, which can be sealed in the plastic bag. Finally you can apply 20 ul of 0.2N NaOH to the old column 30 minutes before you do miniprep. This will regenerate the membrane to the maximum binding capacity.
    So...  I suspect my preps didn't give plasmid because I was using the wrong volumes of the column alkaline-lysis reagents, not because the cells didn't contain plasmids.

    How does this affect my thinking about the rest of the work?  Might some of the colonies I discarded have contained a desired plasmid?  I'll put that in the next post.

    Spot 42 RNA doesn't explain our ∆hfq competence phenotype

    Gisela Storz from NIH gave a great seminar here yesterday about the many roles that small RNAs and very small proteins play in E. coli gene regulation.  It gave me an idea about the role the small-RNA-accessory protein Hfq might be playing in competence regulation.

    One example she discussed is the abundant Spot 42 RNA, so named because it was discovered as an RNA chromatography spot long before any function was discovered.  This small (109 nt) RNA acts in a feed-forward regulatory loop with the cAMP-dependent transcriptional activator CRP.  This regulation fine-tunes the control of the CRP-activated genes that provide the cell with alternative carbon sources when its preferred sugars are depleted.

    When the preferred sugars are available, Spot 42 RNA is expressed and limits the translation of genes for using the alternative carbon sources.  Expression of these genes is already low because there is no active CRP to stimulate their transcription, and Spot 42 RNA prevents translation of any transcripts that might inadvertently get made.  The contribution of Hfq (the yellow donut) is to help Spot 42 RNA interact with its target mRNAs and prevent their translation.

    But when the preferred sugars run out, active CRP represses transcription of the spf gene, so Spot 42 RNA isn't made. (Yes, I know I said that CRP is a transcriptional activator, but it can also be a repressor when its binding site is on top of or downstream of the promoter.)  Now the transcription that CRP stimulates produces mRNAs that are efficiently translated.


    So in E. coli Spot 42 RNA improves the stringency of the regulation that we have been giving CRP full credit for.

    Where does H. influenzae competence come in?  Our Hfq knockout lowers competence, and the other honours student showed that this is because it makes competence more dependent on high cAMP than it usually is.  So I got excited by the idea that maybe Hfq's role in competence had to do with its role in mediating Spot 42 activity on CRP-dependent transcripts.  This could either be activity on CRP-dependent transcripts of the competence-specific transcriptional activator Sxy, or on the various CRP-dependent transcripts of the DNA uptake machinery.

    But this falls apart in two independent ways.

    First, my logic is backwards.  Now that I've created these diagrams it's obvious that Hfq's contribution to competence regulation acts in the opposite direction.  Knocking out hfq would be expected to increase the expression of competence genes under non-inducing conditions, not decrease it under inducing conditions.

    Second, it looks like H. influenzae doesn't even have a Spot 42 homolog.  Homologs are ubiquitous in the Enterobacteriaceae, and have recently been shown to also be common in the Vibrionaceae, with about 85% sequence identity.  But BLAST searches of the Pasteurellaceae with either type of sequence didn't find a single homolog.  Given the relationships shown in the tree below, I think the ancestral Pasteurellacean must have lost its Spot 42 homolog.




    Still no cloning success

    Here's what I've done and what I've learned:

    At the end of my last post I was about to test the toxicity to E. coli of the 'toxin' gene adjacent to the gene I'm trying to knock out.

    I did this by instead trying to create the same plasmid the undergrad had created, one that (unintendedly) deletes the end of the toxin orf as well as the adjacent gene.  So I repeated the inverse-PCR reaction using her old primers instead of my new ones, and used this fragment in my kinase-ligate-transform experiment.  I tested both the ability of this fragment to self-ligate after being treated with kinase, and its ability to ligate to a SpecR PCR fragment that had been treated with kinase.  The former produced only four AmpR colonies, and the latter no AmpR SpecR colonies.  I didn't do plasmid preps on the AmpR colonies to see if they contained the expected plasmid, though maybe I should have.

    Although failure of this experiment does not disprove the hypothesis that the toxin gene is toxic to E. coli, it does disprove the hypothesis that the hypothesized toxicity is the reason that my previous experiments failed.  Sorry, that's a nasty sentence - try again: Since this experiment worked for the undergrad last spring but not for me now, toxin-toxicity is unlikely to be why my experiments are failing.

    Next I made a list of the various approaches I could try - I'll get back to these below.

    The next test I did was to see whether I could ligate the SpecR PCR fragment to a plasmid that didn't need phosphorylation, and whether simple blunt-end ligation was working at all.  If this worked I'd know that the problem is the kinase reaction.  So I cut a simple plasmid (pUC18) with EcoRI (makes sticky ends) and separately with SmaI (makes blunt ends).  Both these cuts leave ends with 5' phosphates that should be good substrates for ligase.  I heat-inactivated the enzymes and set up three ligation reactions:

    1.  EcoRI-cut plasmid with ligase
    2. SmaI-cut plasmid with ligase
    3. SmaI-cut plasmid with SpecR fragment and ligase
    Results: 

    1. ~3500 AmpR colonies (all of the ligation reaction)
    2. 639 AmpR colonies (all of the ligation reaction)
    3. 198 AmpR colonies (half of the ligation reaction) and 1 SpecR AmpR colony (other half of the reaction).  But my plasmid miniprep of cells from this colony didn't give any plasmid.
    As a positive control, the same amount of uncut plasmid gave about 3000 colonies.
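
    To put these three results on the same footing, here's a rough sketch that scales each count to a whole ligation reaction and expresses it relative to the uncut-plasmid control.  It assumes colony number scales linearly with the fraction of the reaction plated, and the names are only illustrative.

```python
UNCUT_CONTROL = 3000  # ~colonies from the same amount of uncut plasmid

# name -> (AmpR colonies counted, fraction of the ligation reaction plated)
ligations = {
    "EcoRI-cut + ligase": (3500, 1.0),
    "SmaI-cut + ligase": (639, 1.0),
    "SmaI-cut + SpecR fragment + ligase": (198, 0.5),
}

def per_reaction(colonies, fraction_plated):
    """Scale a colony count up to the whole ligation reaction."""
    return colonies / fraction_plated

# Yield of each ligation relative to the uncut-plasmid control
relative = {
    name: per_reaction(colonies, fraction) / UNCUT_CONTROL
    for name, (colonies, fraction) in ligations.items()
}

for name, value in relative.items():
    print(f"{name}: {value:.2f} of uncut control")
```

    Scaled this way, the blunt-end religation recovers about a fifth of the uncut-plasmid yield, and adding the SpecR fragment lowers that further.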

    So I concluded that my blunt-end ligation reaction conditions were fine.  Can I then conclude that the problem is the kinase reactions?  Unfortunately this wasn't a very stringent test of the ability of the SpecR fragment to be blunt-end-ligated into a vector, because I didn't pay attention to the relative proportions of the vector and insert.  I should have used a limiting amount of the vector and lots of insert, but I actually used about equal amounts.  Since self-ligation of the vector is a unimolecular reaction it is expected to be much more efficient than insertion of the Spec fragment. 
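
    For planning a repeat with limiting vector and excess insert, the usual conversion from a molar insert:vector ratio to nanograms can be sketched like this.  The pUC18 length (2686 bp) is the standard value; the 1200 bp length for the SpecR fragment and the 50 ng of vector are placeholders I've made up for illustration, not measurements from this experiment.

```python
PUC18_BP = 2686          # length of pUC18
SPEC_FRAGMENT_BP = 1200  # hypothetical insert length, for illustration only
VECTOR_NG = 50           # hypothetical amount of cut vector per ligation

def insert_ng_needed(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio):
    """Mass of insert (ng) giving the requested molar insert:vector ratio.

    Equal masses of DNA contain molecules in inverse proportion to their
    lengths, so the mass is scaled by insert_bp / vector_bp.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio

for ratio in (1, 3, 10):
    ng = insert_ng_needed(VECTOR_NG, PUC18_BP, SPEC_FRAGMENT_BP, ratio)
    print(f"{ratio}:1 insert:vector -> {ng:.0f} ng insert per {VECTOR_NG} ng vector")
```

    The point of the calculation is that equal masses are not equal molar amounts: with these placeholder sizes, equal nanograms of vector and insert would already be a bit over 2:1 insert:vector in molar terms.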

    This experiment used up the last of my control SpecR AmpR plasmid stock so I grew up the cells and did a miniprep.  But this didn't give any plasmid at all either, just some chromosomal DNA!  Can something also be wrong with the plasmid prep solutions, or my procedure?

    The grad student is trying to use the same kinase to label his DNA with 32P-ATP.  He's not having much success either so he's going to test my fragments, which are much simpler substrates than the sheared chromosomal DNA he's been using.


    Maybe the toxin is toxic!

    Recap of the last few posts:  Starting with a plasmid containing the toxin-antitoxin operon of Actinobacillus pleuropneumoniae, I've been trying to create a derivative plasmid whose toxin gene is functional but whose antitoxin gene has been replaced by a specR cassette.  This involves several steps: PCR of the specR cassette and inverse-PCR of the plasmid (to produce a fragment lacking the antitoxin gene), phosphorylation of the specR fragment with T4 kinase, ligation of the two fragments, transformation into E. coli, and selection for SpecR and AmpR cells.  The PCR steps work but the rest fails to produce any resistant colonies.



    After the first attempt failed I introduced several controls: EcoRI-cut pUC18 as a ligation control, another SpecR AmpR plasmid as a transformation control, and the kinase-treated inverse-PCR fragment as a kinase control.  The ligation and transformation controls worked, so I decided the kinase was at fault.  This hypothesis was supported by finding that I had been using a long-expired stock of kinase rather than the one bought earlier this year.

    But two more attempts using the new kinase have also failed.  The first used the supplied kinase buffer and my stock of ATP, and the second used new ligation buffer (recommended by the supplier) which contains its own ATP.  The second time I also preheated and rapid-chilled the substrates, which is recommended to help expose the blunt ends to the kinase.  Both times I got no transformants from either the test or the kinase control, but got lots with my transformation control plasmid. (I didn't bother repeating the ligation control.)

    I've been trying to think of what else could be going wrong with the kinase reaction, but it just occurred to me that maybe there's a completely different problem - maybe this toxin is toxic to E. coli.

    Although the honours student had mentioned this concern when she handed this project over to me, I had discounted it because the H. influenzae homolog is not toxic in either H. influenzae or E. coli. But both my desired construct and the recircularized inverse-PCR fragment I'm using as a control are expected to express the toxin, possibly at high levels.  So maybe my reactions are all working, but the plasmid they produce is not tolerated by E. coli.

    How to test this?  Directly testing for lethality is tricky.  But I can do a different kinase control using a different inverse-PCR fragment, one that won't express the toxin. If the problem is the toxin, this should give AmpR transformants.  I can also use the same fragment with the spec cassette, and I should now get SpecR AmpR transformants.  I have all the honours student's primers and her PCR conditions so this should be straightforward.

    Think Check Submit - can't we do better than this?


    The Scholarly Kitchen blog has a post about a new initiative from a large consortium of scholarly publishing societies and individual publishers, intended to help inexperienced researchers avoid journals from 'predatory publishers'.  This is a very worthwhile goal, but the actual advice provided so far isn't going to exclude most of the bad guys.   

    The first step just explains why researchers need to be careful where we publish:


    The third step is just reassurance:



    The second step is the one that matters; it tells researchers what they should look for:




    There's nothing wrong with this advice, but it's certainly not treading on any publishers' toes.

    Most importantly, there's no mention of the most valuable resource we have, Beall's List.  This is a frighteningly long list of open-access scholarly publishers whose tactics are potentially, possibly or probably predatory.  It's maintained by Jeffrey Beall, a librarian at the University of Colorado, at his Scholarly Open Access blog.  The last time I checked, a couple of years ago, there were about 300 publishers on this list, but today there are 882!  And this is just the publishers - most of these have multiple journals.

    Beall's List isn't just a list.  Beall also provides explicit sets of criteria for evaluating individual publishers and journals.  The absence of Beall's List from the THINK CHECK SUBMIT campaign isn't really surprising, but it reinforces my concern that we can't rely on the publishers to look out for researchers' interests.


    New kinase stock found


    Along with new stocks of other enzymes.

    Somebody apparently thought it was a good idea to stop putting enzyme stocks in their usual place (the 'Special Enzymes' box) in the -20 °C freezer, and instead put them in this new 'coloning' box.

    They then put the new box in a bin in the freezer where its label couldn't be seen.

    It's the kinase!



    Yesterday's experiment worked very well, in that the thorough controls clearly tell me where the problem is.  But the actual experiment produced only three candidate colonies.

    Control E: No DNA. No SpecR or AmpR colonies.  GOOD-selective plates kill non-resistant cells

    Control F: AmpR SpecR Plasmid.  p∆TA::Spec: ~350 AmpR and ~350 SpecR transformants   GOOD-selective plates select for cells carrying the resistance genes on a plasmid, and the competent cells transformed efficiently.

    Controls D and G: EcoRI-cut pUC18 ± ligase.  ~12,500 AmpR colonies after ligation, only about 250 without ligase.  GOOD- ligation worked.

    Control C: No-ligation control. Kinased spec PCR product plus not-kinased inverse-PCR product, ligation reaction with no ligase.  No SpecR or AmpR colonies.  GOOD-The fragments do not spontaneously circularize and transform cells, and the fragment mixtures do not contain any unwanted intact plasmid.

    Control B: Kinase control.  Ligation of kinased inverse-PCR fragment.  Should have given AmpR colonies, but none.  BAD- Kinase failure.

    Experiment: Ligation of kinased spec PCR product plus not-kinased inverse-PCR product.  Gave only 1 AmpR colony and two SpecR colonies.
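
    The logic of reading these controls can be written down as a small check: each control has an expected outcome (colonies or no colonies), and any mismatch points at the step it was testing.  This is only a sketch of my reasoning, using the counts above (the ~350 for control F is approximate), with made-up names.

```python
# (label, what was transformed, colonies expected?, colonies observed)
controls = [
    ("E", "no DNA", False, 0),
    ("F", "AmpR SpecR plasmid", True, 350),
    ("C", "kinased spec + inverse-PCR fragments, no ligase", False, 0),
    ("B", "kinased inverse-PCR fragment + ligase", True, 0),
]

def failed_controls(controls):
    """Return labels of controls whose outcome did not match expectation."""
    return [
        label
        for label, _description, expect_colonies, observed in controls
        if (observed > 0) != expect_colonies
    ]

# Controls D and G: EcoRI-cut pUC18 with and without ligase
ligase_fold = 12500 / 250

print("failed controls:", failed_controls(controls))
print(f"ligase increased AmpR colonies {ligase_fold:.0f}-fold")
```

    Only control B fails, which is why the kinase step gets the blame.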

    Next steps:  

    I've streaked the three candidate colonies on both Spec and Amp plates.  The desired plasmid should confer resistance to both.

    And I looked at the expiration date on the tube of BioLabs T4 polynucleotide kinase I've been using. 03/09!!!!  AAARRRGGGHH!!!!

    Have I been using the wrong tube of kinase?  Is this not the kinase that the former undergrad and sabbatical visitor used successfully last year?  Searching the 'Special Enzymes' freezer box turned up another tube of T4 polynucleotide kinase, but this one looks even older.  So I've just emailed the undergrad and sabbatical visitor to ask what they used.








    Positive control problem solved

    I did the test experiment described in the previous post, and then spent the past few days figuring out why my positive control transformation didn't work any more.

    The test experiment was to kinase, ligate and transform into DH5alpha the product of the inverse-PCR reaction.  If the T4 polynucleotide kinase reaction worked, its blunt ends would acquire 5' phosphates that would allow it to be circularized by T4 DNA ligase, and to then transform DH5alpha to ampicillin resistance.  The negative control was no DNA and the positive control was the same p∆TA:spec plasmid that had given thousands of AmpR and SpecR transformants in the previous experiment.

    Sounds great, but this time the positive control didn't give any transformants at all!  Background small colonies were frequent, possibly because the plates were a bit old and the ampicillin had lost its potency, so I didn't trust the few larger colonies on the inverse-PCR reaction plates.  I streaked a few of the large colonies to check if they were genuinely AmpR - one was.

    I repeated the control transformation and negative control with new Amp plates; the no-DNA control plates were clean but so were the p∆TA:spec plates.

    I thought the problem might be the plasmid, but I wasn't sure I had another reliable positive control. So I did a miniprep from the one genuine AmpR colony I had streaked and transformed the cells with that DNA.  I also had the usual no-DNA control, the undergrad's p∆TA:spec, another plasmid made by the undergrad (used successfully by me as the inverse-PCR template), and some pUC18 left by a sabbatical visitor.

    Success all around.  The miniprep DNA, the other undergrad plasmid and the pUC18 all gave lots of transformants (the photo shows part of a pUC18 plate), and the undergrad's p∆TA:spec and the no-DNA control gave none.  I don't know why the p∆TA:spec plasmid worked well in my first experiment - maybe I had grabbed a 'wrong' (i.e. good DNA) tube.



    Next step, repeating my original experiment (the one in the previous post), this time with better controls.

    1. DNA clean-up: I did a new inverse-PCR reaction because the old one got used up in the tests.  I need to start by doing a spin-column cleanup of it.
    2. Two kinase reactions: (i) the 'blurry' spec PCR product and (ii) the not-blurry inverse-PCR product.  Heat-inactivate the kinase before step 3 (65°C 20 min).  This time I'll use a newer stock of ATP, and the official kinase buffer.
    3. Four ligase reactions: A. The kinased spec fragment plus not-kinased inverse-PCR fragment. B. (kinase control) The kinased inverse-PCR fragment. C. (negative control) The not-kinased spec fragment plus the not-kinased inverse-PCR fragment. D. (positive control) pUC18 cut with EcoRI and heat-inactivated (65°C 20 min).
    4. Six transformations: Ligations A, B, C and D, plus 1 µl pUC18 as positive control and no DNA as negative control.
    Preparations:  We have enough kinase, and I've just sent the grad student to buy more ligase.  Luckily I have lots of frozen competent cells for the transformations.  I'll need to digest the pUC18 and check it in a gel, and pour lots of Amp plates and some Spec plates.