Our Perl model of USS evolution in a genome runs well, but USS-like sequences accumulate only slightly. I've been playing around with various factors that might be responsible but haven't really gotten anywhere. I need to get these factors clear in my mind, so it's time for a blog post about them.
What the model does (before I started fiddling with it on Friday): I should clarify that this version (USSv5.pl) doesn't yet have all the formatting bells and whistles that would make it graceful to use, but it does have the basic functionality we think we want. It first creates a random 'genome' sequence of specified length and base composition, whose evolution the model is going to simulate. In each evolutionary cycle it takes random segments of this genome, mutates them according to a specified mutation rate, and scores them for sequences similar to a sequence motif specified in the form of a matrix (see figure in this post for examples). This version of the model uses a new multiplicative scoring procedure rather than our original additive procedure. Each mutated segment has a chance to replace its original sequence, with probability proportional to its score expressed as a fraction of some preset maximum score. The modified genome is scored by the same procedure used for the segments, and then becomes the starting sequence for the next cycle. (We had intended that the genome would also undergo mutation in each cycle, but this step has been bypassed because it was causing the USS-like motifs to degenerate faster than they accumulated.)
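To make the cycle described above concrete, here is a minimal sketch in Python. It is not the USSv5.pl code. The function names, the rule that a mutated base always changes to a different base, the sliding-window form of the score, and the choice to apply accepted segments one after another within a cycle are all assumptions made for illustration.

```python
import random
from math import prod

BASES = "ACGT"

def random_genome(length, weights, rng):
    """Random sequence with the given base composition (weights for A, C, G, T)."""
    return "".join(rng.choices(BASES, weights=weights, k=length))

def mutate(seq, rate, rng):
    """Replace each base, with probability `rate`, by one of the other three bases."""
    out = []
    for base in seq:
        if rng.random() < rate:
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)

def score(seq, matrix):
    """Assumed multiplicative score: product of matrix entries within each
    motif-length window, summed over all windows in the sequence."""
    n = len(matrix)
    return sum(prod(matrix[j][seq[i + j]] for j in range(n))
               for i in range(len(seq) - n + 1))

def one_cycle(genome, matrix, n_segments, seg_len, mut_rate, max_score, rng):
    """One evolutionary cycle: mutate random segments and let each replace its
    original with probability score / max_score (capped at 1)."""
    for _ in range(n_segments):
        start = rng.randrange(len(genome) - seg_len + 1)
        segment = mutate(genome[start:start + seg_len], mut_rate, rng)
        p_accept = min(1.0, score(segment, matrix) / max_score)
        if rng.random() < p_accept:
            genome = genome[:start] + segment + genome[start + seg_len:]
    # The modified genome is scored by the same procedure as the segments.
    return genome, score(genome, matrix)
```

Here `matrix` is a list with one dictionary per motif position, mapping each base to its score. Calling `one_cycle` repeatedly, feeding each returned genome into the next call, gives the cycle-to-cycle loop described above.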
We first tested the additive and multiplicative scoring procedures to see how much difference a perfect USS sequence made to the score of otherwise random-sequence genomes. As we already knew, with the additive procedure sequences containing a USS scored at best only about 1% higher than sequences without a USS - the precise difference depends on the length of the sequence (we tested 100, 1000 and 10000 bp) and on the numbers in the scoring matrix.
The scores obtained with the multiplicative procedure were far more sensitive to the presence of a USS. For the 100bp test sequences, scores with USS were from 2-fold to 700-fold higher than for the same sequence without a USS, depending on how much higher USS-consensus bases scored than non-consensus bases. The lowest sensitivities were seen when this ratio was 2 for all positions, with higher sensitivities when the ratios were from 5-fold to 50-fold.
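The contrast between the two procedures can be sketched as follows. This is a Python illustration under assumptions, not the model's own code: the 9-bp `CONSENSUS` is a stand-in motif, the matrix gives the consensus base a score of `ratio` and every other base a score of 1, and a sequence's score is taken to be the sum of its window scores. Under those assumptions a perfect match contributes `ratio ** n` to a multiplicative score but only `n * ratio` to an additive one, which is why the former reacts so much more strongly.

```python
import random
from math import prod

CONSENSUS = "AAGTGCGGT"  # stand-in motif for illustration (an assumption)

def make_matrix(consensus, ratio):
    """One column per motif position: consensus base scores `ratio`, others 1."""
    return [{b: (float(ratio) if b == c else 1.0) for b in "ACGT"}
            for c in consensus]

def sequence_score(seq, matrix, multiplicative):
    """Sum of window scores; each window is scored by product or by sum."""
    n = len(matrix)
    total = 0.0
    for i in range(len(seq) - n + 1):
        vals = [matrix[j][seq[i + j]] for j in range(n)]
        total += prod(vals) if multiplicative else sum(vals)
    return total

def fold_increase(seq, matrix, multiplicative, pos=45):
    """Score with a perfect motif written in at `pos`, divided by the original score."""
    with_uss = seq[:pos] + CONSENSUS + seq[pos + len(CONSENSUS):]
    return (sequence_score(with_uss, matrix, multiplicative)
            / sequence_score(seq, matrix, multiplicative))

if __name__ == "__main__":
    rng = random.Random(0)
    seq = "".join(rng.choice("ACGT") for _ in range(100))
    for ratio in (2, 5, 50):
        m = make_matrix(CONSENSUS, ratio)
        print(ratio, fold_increase(seq, m, False), fold_increase(seq, m, True))
```

The printed fold increases depend on the random test sequence and on the matrix values, so they should not be expected to match the numbers quoted above.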
So this looked very promising - with such a sensitive scoring system I expected USS-like sequences to accumulate rapidly and to a high frequency. But this didn't happen. The genome scores did go up dramatically, but this turned out to be due to the much more sensitive scoring system acting on only a few base changes.
I played around with different genome sizes, fragment sizes and numbers, mutation rates and matrix values, but nothing seemed to make much difference. "Seemed" is the right word here, as I didn't keep careful records. I created or reinstated various reporting steps, so I could get a better idea of what was going on. I also replaced the preset maximum segment score with a variable score (= 5% of the previous cycle's genome score), so that the strength of the simulated bias would increase as USS-like sequences accumulated in the genome.
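The variable maximum score can be written out as a small Python sketch. The function names are hypothetical, and capping the acceptance probability at 1 is an assumption; only the 5% figure comes from the description above.

```python
def variable_max_score(prev_genome_score, fraction=0.05):
    """Maximum segment score for this cycle: a fixed fraction (5% here)
    of the genome score from the previous cycle."""
    return fraction * prev_genome_score

def acceptance_probability(segment_score, max_score):
    """Segment score as a fraction of the maximum, capped at 1 (assumed cap)."""
    return min(1.0, segment_score / max_score)

if __name__ == "__main__":
    segment_score = 50.0  # hypothetical score for one mutated segment
    for genome_score in (1000.0, 2000.0, 4000.0):
        cap = variable_max_score(genome_score)
        print(genome_score, cap, acceptance_probability(segment_score, cap))
```

In this sketch the same segment score maps to a lower acceptance probability as the genome score rises, which is one way to read "the strength of the simulated bias would increase".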
But USS-like sequences didn't accumulate much at all, and I don't know why. There could be a problem with the code, but none of the informal checks I've done has set off any alarm bells. There could instead be a fundamental problem with the design of the simulation, so that what we are telling the simulation to do cannot lead to the outcome we expect. Or perhaps only a very small portion of 'parameter-space' allows USS-like sequences to accumulate.
The post-doc and I came up with two approaches. One is to meticulously investigate the effects of the various factors we can manipulate, keeping everything else as simple and constant as possible. The other is to use our brains to think through what should be happening. While we're doing this our programming assistant will be adding a few more sophisticated tricks to the program, to make our analysis easier.
I'll end with a list of factors we need to method