In a previous post I showed logos from four replicate Gibbs analyses of Goodgal and Mitchell's set of plasmid insert sequences. Because the replicates didn't produce identical motif sites, the logos differed. I've now done 10 replicate runs (each testing 100 seeds) and pooled the results to create a single logo.
Not only did the different replicates find different numbers of sites (between 21 and 28), they settled on motifs with slightly different centers and (once) a different length. Pooling runs that differ in these ways gives me more confidence that this consensus logo represents most of the variation in sites.
One remaining concern is that these searches were not unbiased, in that I started them all with a 'fragmentation mask' that specified the positions of gaps in the motif. See this post for an explanation. What would a completely unbiased search find? I suspect it will close the gaps, and ignore the consensus at positions that aren't close to the core, as this is what happens in unbiased searches of the genome. But I should try (now).
...Short break while I do this (28 short sequences means it runs very fast)...
Well I'll be hornswoggled! (Sorry, Language Log influence.) It found the reverse version of the full USS motif!
...Another slightly longer break while I do ten replicate runs...
And here are two versions of the logo from the unbiased searches. The top one is what the searches find: the USS motif in reverse orientation. The lower logo is the reverse complement of this (same data, but in back-to-front order and with all the As changed to Ts, Ts to As, Gs to Cs and Cs to Gs).
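For anyone who wants to do the same flip themselves, here is a minimal Python sketch of the reverse-complement operation described above. It is not part of the Gibbs sampler or the logo software. The function names, the example sequence and the per-position count format (one dict of base counts per motif column) are all assumptions made for illustration.

```python
# Translation table that swaps A<->T and G<->C (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the sequence read back-to-front with each base complemented."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_complement_counts(columns):
    """Flip a motif stored as per-position base counts.

    `columns` is a hypothetical format: a list with one dict per motif
    position, mapping each base to how many sites carry it there.
    The positions are reversed and each base's count is reassigned to
    its complement, which is what turns the top logo into the lower one.
    """
    swap = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return [
        {swap[base]: count for base, count in column.items()}
        for column in reversed(columns)
    ]


if __name__ == "__main__":
    # Arbitrary example sequence, not taken from the insert data set.
    print(reverse_complement("AAGTGCGGT"))
```

Applying either function twice gives back the original input, which is a quick check that nothing was lost in the flip.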
This 'unbiased' motif is actually a better match to the genome consensus than is the motif I got using the fragmentation mask. How nice!
I've been (sometimes) putting 'unbiased' in quotes, because I don't think any pattern search can be truly unbiased. The Gibbs motif sampler program I'm using for these searches has a bias towards compact motifs - there's a built-in penalty for introducing gaps.