This afternoon the post-doc and I were discussing the uptake-sequence analysis we're proposing for the CIHR grant. We wanted a way to compare the actual uptake of genome fragments with the uptake predicted by their scores against the matrix I'd derived from the genome sequence using the Gibbs motif sampler. I realized that we might be able to use my existing Perl program (the one that evolves uptake sequences in simulated genomes) to score the fragments.
So, after I burned out on rewriting paragraphs in the proposal, I opened up the giant Perl program US.pl. In the 965 lines of code I found just the spot I wanted, and told the program to print the score it had just calculated. I put the first 10 kb of the H. influenzae genome sequence into a text file, and changed the program's settings file so it would score this sequence but wouldn't try to evolve it. And I created a scoring matrix using the output from one of my old Gibbs runs on the H. influenzae genome.
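For readers who want a picture of what "scoring a sequence with a matrix" can look like, here is a minimal Python sketch. It is not the code in US.pl. It assumes the score of a window is simply the product of the matrix entries for the base at each position, which may differ from what the Perl program actually calculates. The function name `score_windows` and the three-position `toy_matrix` are made up for illustration; the numbers are not from the Gibbs output.

```python
from math import prod

def score_windows(sequence, matrix):
    """Score every window of len(matrix) bases along the sequence.

    ASSUMPTION: the score of a window is the product of the per-position
    matrix entries for the bases in that window.
    """
    width = len(matrix)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append(prod(column[base] for column, base in zip(matrix, window)))
    return scores

# Toy matrix with invented values (hypothetical, not the real USS matrix).
toy_matrix = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

toy_scores = score_windows("CAGTA", toy_matrix)

# Index of the best-scoring window (0-based start position).
best_start = max(range(len(toy_scores)), key=toy_scores.__getitem__)
print(toy_scores, best_start)
```

Multiplying many small probabilities together is what would produce very small scores, which is consistent with the tiny values the program printed, but that is a guess about the program rather than something checked against its code.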
It ran! I pasted the output into a Word file and replaced the '-' sign in each score's exponent (the exponents ranged from -32 to -8) with a tab mark, then reopened the file in Excel so the size of each exponent sat in its own column. I then subtracted these values from 32, so now the lowest pseudoscore value was zero and the highest was 24. Then I plotted this 'score' against the row numbers, which correspond to positions in the sequence.
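The Word-and-Excel conversion above can also be done in a few lines of Python. This is only a sketch: it assumes the program prints each score in scientific notation such as `4.5e-08`, and the function name `pseudoscore_from_text` and the example strings are hypothetical, not real output.

```python
def pseudoscore_from_text(score_text, offset=32):
    """Turn a score printed in scientific notation into a pseudoscore.

    The exponent (for example -15) is added to the offset, which is the
    same as subtracting the exponent's magnitude from 32.
    ASSUMPTION: scores look like '4.5e-08'; the real output format may differ.
    """
    _mantissa, exponent = score_text.lower().split("e")
    return offset + int(exponent)

# Invented example values spanning the range described in the post.
example_scores = ["1.2e-32", "3.3e-15", "4.5e-08"]
pseudoscores = [pseudoscore_from_text(text) for text in example_scores]
print(pseudoscores)
```

Like the spreadsheet version, this keeps only the exponent and throws away the mantissa, so the pseudoscore is a coarse, order-of-magnitude measure.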
And here's the result (black bars). To check that I hadn't screwed up, I looked up the position of the highest score in the Excel file (score = 24, at position 5319) and checked the sequence I'd started with. Sure enough, at position 5319 I found AAAGTGCGGTCAATTTTTCTGGTATTTTTT, a near-perfect match to the USS consensus.
What this graph doesn't show very well is the large number of positions with very low scores. So here's another graph (blue bars) to the same scale, but showing only the first 300 positions. Now we see that most positions have scores less than 10.
Versions of these figures will go in the Appendix of the proposal, as preliminary data.