RRResearch: Reanalysis of old uptake data

I've started reanalyzing the old DNA-uptake data (see New bottles for old wine). Yesterday I succeeded in using the Gibbs motif-search software (thank you RSA Tools!) to analyze the sequences from the 1990 paper, and was encouraged when it did find a USS motif in 15 of the 28 sequences. These 15 were fragments that cells had strongly preferred to take up, and the USS motif looks very much like the one derived previously from the whole-genome consensus. This result is very preliminary (I haven't yet kept any notes or done it meticulously), but it suggests that the bias of the uptake machinery does correspond well to the consensus of the genome repeats.

Today I did the preliminary analysis (this time keeping notes) of the phage-derived sequences from one of the earlier papers (1984). These sequences had not been put into GenBank as a neat set, so I had to download the phage sequence and use a nice shareware DNA-analysis program (Sequence Analysis; thank you Will Gilbert!) to identify the sequences of the five short fragments I will analyze.

I still need to deal with an annoying format problem. The motif-search programs accept DNA sequences only in particular formats, of which the simplest is "FASTA". FASTA identifies comment (header) lines by starting them with a ">", but for some reason these programs treat the text after my ">"s as sequence. Of course they choke, because the text contains non-sequence characters (i.e. anything other than A, G, C, T and N). If I paste FASTA-format sequence in directly from GenBank there's no problem, so I think Word is doing something weird with the ">" character or with the line that follows it. I need to find a better text editor for Macs (maybe Mi). Unfortunately TextEdit has been 'improved' to the point where it can no longer handle plain text - it insists on saving all files as RTF or HTML.
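One way to check what a file really contains before handing it to a motif-search program is to parse it with a few lines of Python. The sketch below is only an illustration: the function name `clean_fasta` is made up, and it assumes (without my having confirmed it) that the trouble comes from old-style Mac carriage-return line endings, which would make a header and its sequence look like one long line. It converts all line endings to plain newlines, splits the text into header/sequence records, and complains about any character that is not A, C, G, T or N.

```python
VALID_BASES = set("ACGTN")


def clean_fasta(text):
    """Parse FASTA text into (header, sequence) pairs.

    Hypothetical helper: normalizes line endings first, on the
    assumption that stray carriage returns are what confuses the
    downstream program.
    """
    # Convert Windows (\r\n) and classic Mac (\r) endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    records = []
    header = None
    parts = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(parts)))
            header = line[1:].strip()
            parts = []
        else:
            if header is None:
                raise ValueError("sequence found before any '>' header")
            seq = line.upper()
            bad = set(seq) - VALID_BASES
            if bad:
                raise ValueError(
                    "non-sequence characters %r under header %r"
                    % (sorted(bad), header)
                )
            parts.append(seq)
    if header is not None:
        records.append((header, "".join(parts)))
    return records
```

If the parse succeeds, the records can be written back out with plain "\n" line endings (a ">" plus the header on one line, the sequence on the next) and pasted into the web form. If it raises an error, the message at least shows which characters the editor slipped in.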

2 comments:

Anonymous, August 23, 2006 at 7:15 PM
I second Greg's comments. It's really important to use the right software tool for the right job. Word is a word processor, so use it for word processing. Editing plain ASCII text requires a plain ASCII text editor. What you see in Word is not plain text, even if it looks like it is - as a friend of mine says, it's a graphical representation of plain text. You'll have all kinds of trouble if you try to use Word as an editor.
Rosie Redfield, August 24, 2006 at 6:31 AM
Thanks guys. I did suspect that I shouldn't trust Word as a text editor, but I only now discovered that changing TextEdit's Preferences would allow it to create text files. And thanks for pointing me to nodalpoint.

Greg, I'm amazed that you even try to keep up with the science I'm describing. I only really expect members of my lab to do that - I put it on the blog just to let the general public see what "doing science" looks like.
