In the paper I'm working on, we'll be comparing the USS motifs of various species. But of course there is no one true Gibbs motif, as the results depend on both random factors and ones I control. I don't see the randomness as a concern. It arises from both the random events in the history of the sequenced genome and the random-number seed that each Gibbs run starts with. The effectiveness of the searches, and the high numbers of USSs in the genomes, mean that the randomness isn't a big issue.
But factors I control can have a big effect on the results of a search. Probably the most important is the 'expected' number of occurrences of the motif that I specify. If I set this low, the search will be very stringent, reporting only occurrences that are very well matched to the motif it has found. If I set it high, many poorer matches will be reported. There's no 'right' setting, because there's no 'right' USS.
To compare USS motifs between genomes I need to have done the searches with comparable stringencies. The simple method I'll try is to use 'expected numbers' that are 1.5 times the number of perfect matches to the standard 'core' consensus. The identification of the 'core' is somewhat arbitrary and historically contingent, but using it lets me treat all the genomes thought to have the same consensus in the same way. So for H. influenzae, H. somnus, Pasteurella multocida, Actinobacillus actinomycetemcomitans and Mannheimia succiniciproducens I'll use 1.5 times the number of occurrences of AAGTGCGTT; for H. ducreyi, A. pleuropneumoniae and M. haemolytica I'll use 1.5 times the number of occurrences of ACAAGCGGT; and for the Neisserias I'll use 1.5 times the number of occurrences of ATGCCGTCTGAA.
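Here is a rough sketch of how that 'expected number' could be computed for one genome. It is only an illustration, not the script I actually use: the function names are made up for this example, and it assumes that perfect matches are counted on both strands (by also searching the forward sequence for the reverse complement of the core) and that the genome is already in memory as a plain string.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


def count_overlapping(seq, pattern):
    """Count occurrences of pattern in seq, allowing overlaps."""
    count = 0
    start = seq.find(pattern)
    while start != -1:
        count += 1
        start = seq.find(pattern, start + 1)
    return count


def expected_occurrences(genome, core, factor=1.5):
    """Return (perfect_matches, factor * perfect_matches).

    Assumption: matches are counted on both strands. A core that equals
    its own reverse complement is counted once, not twice.
    """
    genome = genome.upper()
    core = core.upper()
    forward = count_overlapping(genome, core)
    rc_core = reverse_complement(core)
    reverse = 0 if rc_core == core else count_overlapping(genome, rc_core)
    perfect = forward + reverse
    return perfect, factor * perfect


if __name__ == "__main__":
    # Toy sequence, not real genome data: two forward cores and one
    # reverse-complement core separated by short spacers.
    toy = "AAGTGCGTT" + "CC" + "AACGCACTT" + "GG" + "AAGTGCGTT"
    print(expected_occurrences(toy, "AAGTGCGTT"))
```

The result is left as an unrounded number, since whether to round up or down before handing it to the motif search is a separate choice.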
The Gibbs searches I queued two days ago were terminated that night because I'd forgotten to set the memory allocation high enough. I requeued them yesterday with more memory requested. The A. pleuropneumoniae search was terminated again last night, I think because the long genome and long motif put too big a demand on the program, so I've separated the 'forward' and 'reverse complement' sequences and requeued them as two separate jobs. The Neisseria meningitidis search is still running; I hope it doesn't run out of allocated time before finishing.