Testing the new scoring system

The undergrad is creating a new version of the USS model program with the scoring done multiplicatively rather than additively (see previous post). I was originally thinking we'd start by just playing around with it, to see what happens. But after a conversation with the post-doc I realize that we should start out being more systematic.

What we need to do first is just find out what scores are produced by different sequences using this system:
  1. What score does a random sequence of a specified length (and base composition) produce? We'll test 100, 1,000 and 10,000 bp.
  2. What score does the same sequence produce when it differs only in containing one USS that perfectly matches the matrix consensus?
And we'll do the same tests with differently weighted matrices, using both additive and multiplicative scoring:
  1. Additively scored matrix with all consensus bases worth 1 and all non-consensus bases worth 0 (like the yellow one in the previous post).
  2. Additively scored matrix with consensus bases at different positions weighted differently, according to our measures of their contribution to uptake. For example, some consensus bases might be worth 1 and some 3 or 5 or 10.
  3. Multiplicatively scored matrix with all consensus bases worth the same value (say 2 or 5), and all non-consensus bases worth 1.
  4. Multiplicatively scored matrix with consensus bases at different positions weighted differently, but all non-consensus bases still weighted 1.
  5. Multiplicatively scored matrix with the different non-consensus bases also weighted differently, perhaps with some values smaller than 1.
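To make the difference between the two scoring schemes concrete, here is a minimal Python sketch. It is not the undergrad's program: the consensus string, the function names and the sliding-window approach are all assumptions made for illustration, and the matrix values are just the example values from the list above.

```python
from math import prod

# Illustrative consensus only (assumption); the real matrix consensus may differ.
CONSENSUS = "AAGTGCGGT"


def make_matrix(consensus, consensus_values, other_value):
    """Build one {base: value} column per position of the consensus.

    consensus_values gives the value of the consensus base at each position;
    every non-consensus base at that position gets other_value.
    """
    return [
        {base: (value if base == cons else other_value) for base in "ACGT"}
        for cons, value in zip(consensus, consensus_values)
    ]


def score_window(window, matrix, multiplicative):
    """Score one window: sum the per-position values, or multiply them."""
    values = [column[base] for column, base in zip(matrix, window)]
    return prod(values) if multiplicative else sum(values)


def window_scores(sequence, matrix, multiplicative):
    """Score every window of matrix width along the sequence (assumed approach)."""
    width = len(matrix)
    return [
        score_window(sequence[i:i + width], matrix, multiplicative)
        for i in range(len(sequence) - width + 1)
    ]


# Matrix type 1: additive, consensus bases worth 1, all others worth 0.
additive_flat = make_matrix(CONSENSUS, [1] * len(CONSENSUS), 0)
# Matrix type 3: multiplicative, consensus bases worth 2, all others worth 1.
multiplicative_flat = make_matrix(CONSENSUS, [2] * len(CONSENSUS), 1)

perfect = CONSENSUS
one_mismatch = "C" + CONSENSUS[1:]

additive_perfect = score_window(perfect, additive_flat, False)
additive_mismatch = score_window(one_mismatch, additive_flat, False)
multiplicative_perfect = score_window(perfect, multiplicative_flat, True)
multiplicative_mismatch = score_window(one_mismatch, multiplicative_flat, True)
```

With these flat matrices a perfect 9-base match scores 9 additively and 2^9 = 512 multiplicatively, while a single mismatch drops the scores to 8 and 256. Position-specific weights (matrix types 2 and 4) only need a different list passed as consensus_values; type 5 would need the individual non-consensus entries in each column edited as well.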
Only after we've done these basic tests of the different scoring systems will we decide whether to test their effects on USS accumulation. These preliminary tests shouldn't take very long. It's just a matter of generating the 6 test sequences and the 5-10 test matrices in a text file (needing maybe half an hour), and then pasting the different sequence-matrix combinations into the two test versions of the program (additive and multiplicative). The actual scoring runs will only take a few seconds each.
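The six test sequences (three lengths, each with and without one USS) could be generated along these lines. This is only a sketch: the base composition, the USS string, the seed and the choice to put the USS in the middle of the sequence are all placeholder assumptions, not values we've settled on.

```python
import random


def random_sequence(length, composition, rng):
    """Draw a random DNA sequence with the given base frequencies."""
    bases = list(composition)
    weights = [composition[base] for base in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


def with_one_uss(sequence, uss, position):
    """Overwrite len(uss) bases at position, so the length stays unchanged."""
    return sequence[:position] + uss + sequence[position + len(uss):]


# Placeholder values (assumptions): an AT-rich composition and an
# illustrative USS string standing in for the matrix consensus.
COMPOSITION = {"A": 0.31, "C": 0.19, "G": 0.19, "T": 0.31}
USS = "AAGTGCGGT"

rng = random.Random(0)  # fixed seed so the test sequences are reproducible
test_sequences = {}
for length in (100, 1000, 10000):
    background = random_sequence(length, COMPOSITION, rng)
    test_sequences[(length, "random")] = background
    test_sequences[(length, "one_uss")] = with_one_uss(background, USS, length // 2)
```

Overwriting bases rather than inserting them keeps each pair of sequences the same length, so the two differ only in the USS window. A random background can still contain near-matches to the USS by chance, which is part of what the first test is meant to measure.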
