Field of Science
Not your typical science blog, but an 'open science' research blog. Watch me fumbling my way towards understanding how and why bacteria take up DNA, and getting distracted by other cool questions.
BLAST problem solved
BLAST was finding no variation at the central positions of the 39 nt sequences because I had set its 'word' length to 20. This told it to begin each search by looking for a perfect match to a 20 nt stretch of the query. It could therefore never find a mismatch at the central position, because such a mismatch would be flanked by matched segments that were, at best, 19 nt long.
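The word-length effect can be illustrated with a small Python sketch. This is a simplified model of the seeding step only, not BLAST itself: the 39 nt sequence is made up for illustration, and the helper names `seed_words` and `with_central_mismatch` are hypothetical.

```python
def seed_words(query: str, subject: str, word_size: int) -> list[int]:
    """Return query offsets whose exact word of length word_size occurs in subject.

    Simplified model of seeding: a hit can only start from a perfect word match.
    """
    subject_words = {
        subject[i:i + word_size] for i in range(len(subject) - word_size + 1)
    }
    return [
        i
        for i in range(len(query) - word_size + 1)
        if query[i:i + word_size] in subject_words
    ]


def with_central_mismatch(seq: str) -> str:
    """Return a copy of seq with its central base changed to a different base."""
    mid = len(seq) // 2
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return seq[:mid] + swap[seq[mid]] + seq[mid + 1:]


# Made-up 39 nt sequence (an assumption for illustration, not real data).
QUERY = "ACGTTGCAAGCTTAGGCATCGATCCGTAAGCTTGGACTA"
SUBJECT = with_central_mismatch(QUERY)

for word_size in (20, 10, 8):
    seeds = seed_words(QUERY, SUBJECT, word_size)
    print(f"word size {word_size}: {len(seeds)} exact seed words")
```

Every 20 nt window of a 39 nt sequence covers the central position, so a central mismatch leaves no perfect 20 nt word to seed from. With shorter words, the 19 nt flanks on either side supply plenty of seeds.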
So I tested word lengths of 10 and of 8. Both eliminated the central mismatch problem, at the trivial expense of increasing search time to at worst 10 seconds for the whole genome.
I'm gradually getting a better understanding of how BLAST searches work (it takes me a while to absorb the complexities), so I've also improved the searches in other ways - allowing higher "E-values" so I get sequences with more than one mismatch, and setting the maximum number of results to a value appropriate for my database size. I've also improved my Word/Excel shuffle methods, so I get a cleaner dataset. And I now carefully note the numbers of sequences at the various steps.
The graph above is the control analysis for only one orientation of only one of the genome sequences. So now I'm ready to search all three genomes against the control dataset and, if this looks good, against the USS dataset.