The former post-doc (I'll call him the FPD) visited yesterday afternoon, and we had intense discussions of how to proceed with both the RNAseq work (summarized here on our Sense Strand blog) and with the PhD student's planned DNA uptake experiments.
His planned experiments take advantage of the phenotype of a rec2 knockout mutation. These cells take up DNA normally across the outer membrane, into the periplasmic space, but they cannot transport it across the inner cell membrane. This allows him to recover intact DNA that has been taken up, and to use DNA sequencing to compare it to the input DNA the ∆rec2 cells were given.
Some of the experiments will use genomic DNA of the species being tested, fragmented to appropriate length distributions, and some will use synthetic DNA fragments (~200 bp) containing a 30-50 bp stretch of random sequence (see figure).
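One way the recovered-versus-input comparison for the random segments might be set up is sketched below. This is only an illustration, not the analysis the lab actually uses: it assumes the random segments have already been extracted from the reads as equal-length strings, and the function names and the pseudocount are hypothetical choices made for the example.

```python
from collections import Counter

BASES = "ACGT"

def base_frequencies(segments, pseudocount=1.0):
    """Per-position base frequencies for equal-length random segments."""
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("all segments must have the same length")
    total = len(segments) + pseudocount * len(BASES)
    freqs = []
    for pos in range(length):
        counts = Counter(s[pos] for s in segments)
        # The pseudocount (an assumption) avoids dividing by zero later.
        freqs.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return freqs

def enrichment(input_segments, recovered_segments, pseudocount=1.0):
    """Ratio of recovered to input base frequency at each position.

    Values above 1 mean the base is over-represented in the DNA that was
    taken up, relative to the DNA the cells were given.
    """
    f_in = base_frequencies(input_segments, pseudocount)
    f_rec = base_frequencies(recovered_segments, pseudocount)
    if len(f_in) != len(f_rec):
        raise ValueError("input and recovered segments differ in length")
    return [{b: f_rec[i][b] / f_in[i][b] for b in BASES}
            for i in range(len(f_in))]
```

With real data each list would hold millions of segments, one per read, and the ratios would be computed separately for each barcoded replicate.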
The FPD, who developed the synthetic fragment protocol, pointed out that his experiments had used full lanes of Illumina sequencing only because it was not then possible for us to 'barcode' our different DNA samples and mix them for sequencing in a single lane. The sequencing depth he obtained was useful, but it would be extreme overkill for the experiments the PhD student plans. So we need to design barcoding into our experiments, so we can mix up to 24 samples in one lane for sequencing, and then separate the resulting sets of sequence reads by their different barcodes. We'll still need to use two lanes, because each 'recovered' sample will need a corresponding 'input' sample carrying the identical barcode. Because these two samples will have the same barcode, they could not be distinguished if they were sequenced in the same lane.
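Separating pooled reads by barcode could look roughly like the sketch below. It assumes, purely for illustration, that each read is a plain string with the barcode in its first bases; on a real run the barcode may come from a separate index read, and the sequencing provider's own demultiplexing software would normally do this step. The function names and example barcodes are hypothetical.

```python
from collections import defaultdict

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(reads, barcodes, max_mismatches=0):
    """Sort reads into per-sample bins by their leading barcode.

    reads    -- iterable of read sequences (barcode assumed at the 5' end)
    barcodes -- dict mapping barcode sequence -> sample name
    Reads matching no barcode, or more than one, go to 'unassigned'.
    """
    blen = len(next(iter(barcodes)))
    bins = defaultdict(list)
    for read in reads:
        tag = read[:blen]
        matches = [name for bc, name in barcodes.items()
                   if len(tag) == blen and hamming(tag, bc) <= max_mismatches]
        if len(matches) == 1:
            bins[matches[0]].append(read[blen:])  # strip the barcode
        else:
            bins["unassigned"].append(read)
    return dict(bins)
```

Allowing one mismatch rescues reads with a single sequencing error in the barcode, but only safely if every pair of barcodes differs at three or more positions.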
So rather than doing one very deeply sequenced experiment, he'll be able to do multiple replicates, each sequenced at a moderate but entirely adequate depth. If he uses a HiSeq machine for the sequencing, he'll be able to get 1.6 x 10^8 reads for each of 12 samples; with a NextSeq this would give 4 x 10^8 reads per sample. (Is that right: per sample, not per lane?)
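If those figures turn out to be totals per lane (or per run) rather than per sample, the per-sample depth is just the total divided by the number of pooled samples. The short calculation below makes that explicit. Treating the numbers as per-lane totals is an assumption, and equal sharing of the lane among samples is an idealization that real pooled libraries only approximate.

```python
def reads_per_sample(reads_per_lane, samples_per_lane):
    """Expected reads per sample if a lane is shared equally among samples."""
    if samples_per_lane < 1:
        raise ValueError("need at least one sample per lane")
    return reads_per_lane / samples_per_lane

# Figures quoted in the post, treated here as per-lane totals (an assumption).
LANE_YIELDS = {"HiSeq": 1.6e8, "NextSeq": 4e8}

for machine, lane_reads in LANE_YIELDS.items():
    for n_samples in (12, 24):
        per_sample = reads_per_sample(lane_reads, n_samples)
        print(f"{machine}: {n_samples} samples -> {per_sample / 1e6:.1f} M reads each")
```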
One issue to keep in mind is that it would be foolish to save all the sequencing for one big batch at the end of the thesis work. Instead the work needs to be designed with an initial set of samples to be sequenced, so he can (1) tell whether everything is working as it should, and (2) begin analyzing sequence data from one part of the project while generating additional samples for other parts. For a preliminary batch of sequencing, it might be better to use a MiSeq machine, whose smaller capacity would let us sequence a few samples more economically.
We also talked about how long the random-sequence segments should be in the 200 bp fragments, and about where to locate the barcode segments. These consist of an independent sequencing primer followed by 8 bp that identify the source experiment. Putting these to the right of the random segment will let him efficiently create the double-stranded 200 bp fragments, using the same long left-side oligo (containing the random segment) with many different right-side oligos, each containing a different barcode.
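Choosing the 8 bp barcodes themselves is a small design problem. The sketch below picks a set in which every pair differs at three or more positions, so a single sequencing error cannot turn one barcode into another. The distance threshold, the GC-content window and the ban on runs of three identical bases are all illustrative assumptions, not requirements from our protocol, and the function names are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def has_long_run(seq, max_run=2):
    """True if any base is repeated more than max_run times in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False

def pick_barcodes(n=24, length=8, min_distance=3):
    """Greedily choose n barcodes that are mutually min_distance apart."""
    chosen = []
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        gc = seq.count("G") + seq.count("C")
        # Assumed filters: roughly balanced GC content, no homopolymer runs.
        if abs(gc - length / 2) > 1 or has_long_run(seq):
            continue
        if all(hamming(seq, c) >= min_distance for c in chosen):
            chosen.append(seq)
            if len(chosen) == n:
                return chosen
    raise ValueError("could not find enough barcodes with these constraints")

if __name__ == "__main__":
    for i, barcode in enumerate(pick_barcodes(), start=1):
        print(f"barcode {i:2d}: {barcode}")
```

Each barcode chosen this way would go into its own right-side oligo, paired with the single shared left-side oligo that carries the random segment.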
2 comments:
So, of the 200 bp DNA you want to give to the bacteria, 150 bp will always be the same?
I'd fear getting a strong bias there. What if one barcode happens to be a stretch of DNA the bacteria preferentially take up (or don't like to take up)? This would skew all experiments with different barcodes.
Or the bacteria like the flowcell priming sequence, maybe so much that the random sequence becomes completely irrelevant!
I'd use only the DNA you are interested in as an input and add all the DNA needed for the sequencing afterwards.

Rosie, you should get over 160M reads from a HiSeq running V3 chemistry and up to 250M from V4; talk to your provider about the version they are using. It is unclear to me why the two samples (random and gDNA) need to get the same barcodes. Pooling 12 samples per lane is likely to return 13-20M reads per sample on a HiSeq, but this will only be the case if you get the balance of samples spot-on. I'd recommend qPCR of the individual libraries if this is important (KAPA). Sounds like a fun PhD project!