Here's one way:
- We have already made samples of pure donor DNA reads (from strain NP or GG) that were deliberately contaminated with reads from the recipient Rd (10% Rd, 90% NP or GG). These REMIX samples have already been mapped to the donor genomes.
- Make a second set of these samples, using the same pure donor samples but this time contaminating them to 10% with an independent set of Rd reads - pretend this is a 'contaminated uptake' sample (a read-mixing sketch follows this list).
- Map the new 'contaminated uptake' samples onto the donor reference genome.
- Divide the coverage at each position in the contaminated uptake samples by the coverage in the corresponding contaminated input samples.
- Examine plots to see how coverage differs across the genome in the two contaminated samples and in the pure donor samples.
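Here is a minimal sketch of the read-mixing step, assuming plain uncompressed FASTQ files; the file names (donor_pure.fastq, rd_independent.fastq, contaminated_uptake.fastq) and the fixed random seed are hypothetical placeholders, not the actual REMIX pipeline.

```python
import random

def read_fastq(path):
    """Yield one read (4 consecutive lines) at a time from a FASTQ file."""
    with open(path) as fh:
        while True:
            record = [fh.readline() for _ in range(4)]
            if not record[0]:
                break
            yield record

def count_reads(path):
    """Count reads by counting 4-line records."""
    return sum(1 for _ in read_fastq(path))

random.seed(42)  # fixed seed so the mix is reproducible

donor_path = "donor_pure.fastq"       # pure NP or GG reads (placeholder name)
rd_path = "rd_independent.fastq"      # independent Rd read set (placeholder name)
out_path = "contaminated_uptake.fastq"

n_donor = count_reads(donor_path)
# For Rd to make up 10% of the final mix, add Rd reads equal to 1/9 of the donor count.
n_rd_needed = round(n_donor / 9)

rd_total = count_reads(rd_path)
keep_probability = n_rd_needed / rd_total

with open(out_path, "w") as out:
    # write all donor reads unchanged
    for record in read_fastq(donor_path):
        out.writelines(record)
    # append a random subset of Rd reads, ~10% of the final total
    for record in read_fastq(rd_path):
        if random.random() < keep_probability:
            out.writelines(record)
```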
For comparison, we can also calculate and plot the ratios obtained when the 'contaminated uptake' coverage is divided by the corresponding pure input coverage (a ratio-and-plotting sketch follows below).
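Here is a minimal sketch of the ratio calculation and plotting, assuming per-position coverage has already been extracted from the mapped BAM files (e.g. with `samtools depth -a`) into three-column text files (reference, position, depth); the .depth file names are hypothetical placeholders.

```python
import matplotlib.pyplot as plt

def load_depths(path):
    """Return per-position depths, in genome order, from samtools-depth-style output."""
    depths = []
    with open(path) as fh:
        for line in fh:
            _ref, _pos, depth = line.rstrip("\n").split("\t")
            depths.append(int(depth))
    return depths

uptake = load_depths("contaminated_uptake.depth")      # contaminated 'uptake' sample
cont_input = load_depths("contaminated_input.depth")   # contaminated input (REMIX) sample
pure_input = load_depths("pure_input.depth")           # pure donor input sample

# Position-wise ratios; positions with zero input coverage become gaps (nan)
nan = float("nan")
ratio_cont = [u / c if c > 0 else nan for u, c in zip(uptake, cont_input)]
ratio_pure = [u / p if p > 0 else nan for u, p in zip(uptake, pure_input)]

positions = range(1, len(ratio_cont) + 1)
plt.figure(figsize=(12, 4))
plt.plot(positions, ratio_cont, lw=0.5, label="uptake / contaminated input")
plt.plot(positions, ratio_pure, lw=0.5, label="uptake / pure input")
plt.xlabel("Position in donor genome")
plt.ylabel("Coverage ratio")
plt.legend()
plt.tight_layout()
plt.savefig("coverage_ratios.png", dpi=150)
```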
Aside: Way back we spent a lot of time wondering why the GG-short experiment had different peak uptake ratios than the NP-short experiment. Most uptake-ratio peaks in the NP-short experiment had heights between 3 and 4 (very few above 5). But most of the peak heights in the GG-short experiment were between 2 and 3, with many others having much higher peaks.
Now that I've been thinking about the consequences of contamination, I wonder whether this difference could have been caused by the much lower contamination levels in the GG samples (NP-short: 13.9, 17.4, 21.5; GG-short: 2.7, 4.8, 7.7).