The factors we've thought of are:
A. Nature of the preferred sequence pattern:
- How long is it (3 bp? 10 bp?)? How specific is it (e.g. is each base specified, or just 'purine' or 'pyrimidine')? Together these determine how often this pattern will occur in the input DNA (by chance or due to uptake-bias drive).
- How strong is the bias favouring uptake of fragments containing this pattern? How strict is the preference (are variants of the specified pattern also taken up, but less strongly)? Are fragments with more than one occurrence of the pattern more likely to be taken up?
- If this is genomic DNA, what is the size range of the fragments? The sensitivity of the experiment will be low if the fragments are so large that each has at least one occurrence of the preferred pattern.
- If this is a synthetic fragment containing a fully degenerate segment, how long is the degenerate segment?
- How high is the sequencing coverage? Is it the same for the control input DNA and for the recovered DNA? Coverage will determine how much noise there is due to random sampling.
- Does the error rate of the sequencing matter?
- For genomic input DNA, are there position-specific differences in coverage across the genome?
- For degenerate-fragment DNA, are there non-random factors in the input DNA or in its sequence-ability?
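
To get a feel for how pattern length, pattern specificity, fragment size and coverage interact, here is a minimal back-of-the-envelope sketch. It is not part of any analysis pipeline, and it rests on several assumptions: all four bases are equally frequent, positions are treated as independent, the pattern is written in IUPAC codes (R = purine, Y = pyrimidine, N = any base), and read counts are treated as Poisson. The function names are made up for this illustration.

```python
import math

# Number of bases (out of A, C, G, T) matched by each IUPAC symbol.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def match_probability(pattern):
    """Chance that one position matches the pattern on one strand.

    Assumes equal base frequencies (25% each), which real genomes
    rarely have exactly.
    """
    p = 1.0
    for symbol in pattern.upper():
        p *= IUPAC_DEGENERACY[symbol] / 4
    return p


def prob_fragment_has_pattern(pattern, fragment_length, both_strands=True):
    """Approximate chance that a fragment has at least one chance match.

    Treats every start position as an independent trial. Counting both
    strands doubles the trials, which overcounts for palindromic patterns.
    """
    positions = fragment_length - len(pattern) + 1
    if positions <= 0:
        return 0.0
    trials = positions * (2 if both_strands else 1)
    return 1.0 - (1.0 - match_probability(pattern)) ** trials


def relative_sampling_noise(coverage):
    """Relative standard deviation of a read count, assuming Poisson sampling."""
    return 1.0 / math.sqrt(coverage)


if __name__ == "__main__":
    # Hypothetical 9-bp fully specified pattern, tested on several fragment sizes.
    for length in (100, 1000, 10000, 100000):
        print(length, prob_fragment_has_pattern("AAGTGCGGT", length))
    # Noise expected at a few coverage levels.
    for coverage in (10, 100, 1000):
        print(coverage, relative_sampling_noise(coverage))
```

As the fragment length grows, the chance that a fragment contains the pattern approaches 1, which is the loss of sensitivity described above for large genomic fragments. Shorter or less specific patterns reach that point at much smaller fragment sizes.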