The second of my three questions about the USS motif was whether, for the large subset of USSs that are in genes, the orientation of a USS with respect to the protein it helps code for affects its motif consensus. So my plan was to assemble all the coding sequences of the genome, each oriented in the direction its protein is specified (not in the direction its DNA is replicated), and then to compare the motifs of the USSs in the two possible USS orientations.
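For anyone who has to build such a file from a genome sequence plus gene coordinates rather than downloading it ready-made, the orienting step might look something like the sketch below. This is only an illustration, not what I actually ran: the function names, the 0-based coordinates, and the '+'/'-' strand labels are all assumptions.

```python
# Minimal sketch: return a gene's sequence as read on its coding strand.
# Function names and the coordinate convention are hypothetical.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T and N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def coding_strand(genome, start, end, strand):
    """Slice genome[start:end] (0-based, end exclusive) and orient it.

    Genes on the '+' strand are returned as-is; genes on the '-' strand
    are reverse-complemented so they read in the direction of the protein.
    """
    segment = genome[start:end].upper()
    if strand == "+":
        return segment
    if strand == "-":
        return reverse_complement(segment)
    raise ValueError("strand must be '+' or '-'")
```

With every gene passed through something like `coding_strand`, a USS found as written and a USS found as its reverse complement correspond to the two orientations I want to compare.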
Assembling the sequences seemed straightforward (download in one file from TIGR, remove unwanted characters). But the motif-search program couldn't find the USSs I knew were there (see last week's post). I spent a week or more running more tests and variations to figure out what was going wrong, because I didn't feel that I understood the problem well enough to explain it clearly in an email to the helpful expert. Was the number of sequences over the limit? Were the 'N's I'd had to insert causing problems? Were the sequences too short? Was the problem dominant or recessive to a well-behaved sequence?
Yesterday the same problem appeared in some new sequence files, and then was corrected (see previous post). I wasn't entirely sure what I'd done that made the difference, but this did give me confidence that the problem with my gene sequence files was in the formatting, and my prime suspect was the hated carriage returns. These are a nightmare for Unix beginners like me - they're often invisible, they come in several incompatible flavours (Mac vs PC vs Unix), and Unix/Linux is very fussy about them. I can't remember exactly what I did, but I think it involved global search-and-destroy missions against carriage returns in both Word and Unix, then global restoration of the important returns in Word, then a passage through the text editor Mi to convert any Mac-style returns to Unix ones. And presto, the problem seems to be solved!
So while I've been sleeping the program has been busy searching the gene sequence file for USS motifs, and later this morning I hope to be able to compare the forward- and reverse-direction motifs. We know that protein coding constraints do affect the reading frame that USSs are found in - for each USS orientation there's a preferred reading frame that USSs are best tolerated in. So it's reasonable to suspect that the USS consensuses might also differ between the orientations. If they do, we'll understand a bit more about how natural selection acts on USSs.
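The reading-frame observation can be checked with a much cruder tool than the motif-search program. Here is a rough sketch that looks for exact matches to a USS core in coding-strand sequences and tallies the frame (start position modulo 3) separately for each orientation. The default core AAGTGCGGT is my assumption about which core to search for, the function names are invented, and exact matching will miss the imperfect USSs that a real motif search is designed to find.

```python
# Minimal sketch: tally reading frames of exact USS-core matches per orientation.
# The default core sequence and all names here are assumptions for illustration.
from collections import Counter

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T and N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tally_uss_frames(cds_list, core="AAGTGCGGT"):
    """Return {'forward': Counter, 'reverse': Counter} of frame -> hit count.

    Each sequence in cds_list is assumed to start at the first base of its
    start codon, so a hit's frame is its 0-based start position modulo 3.
    """
    targets = {"forward": core.upper(), "reverse": reverse_complement(core)}
    counts = {name: Counter() for name in targets}
    for cds in cds_list:
        cds = cds.upper()
        for name, pattern in targets.items():
            pos = cds.find(pattern)
            while pos != -1:
                counts[name][pos % 3] += 1
                pos = cds.find(pattern, pos + 1)  # allow overlapping hits
    return counts
```

If one frame dominates within each orientation, that is the preferred-frame pattern; whether the flanking consensus also differs between orientations is the part that still needs the motif search.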
Later: No, I was overly optimistic. The program is able to find a short version of the USS motif (10 bp) but it can't find anything when asked to search for the full-length motif (22 bp). I suspect it needs to be given a stronger prior expectation than just the spacing I'm giving it. Maybe I'll try suggesting the consensus sequence.