The third question I asked about the USS motif was whether there is evidence for interactions between its positions. My query to the EvolDir list produced three applicable programs. One looked difficult, so I left it as a last resort. A second had been written by a colleague (in Fortran! He's an old-fashioned guy; we were post-docs together). He kindly offered to try running our preliminary sequence set for us, and sent a monster Excel file full of the statistical results, with the 24 significant ones highlighted. There's a strong risk of spurious correlations in this kind of analysis, but the ones he found seem likely to be genuine, as they are almost all between adjacent positions.
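To make "interactions between positions" concrete, here is a minimal sketch of one common way to score covariation between two columns of aligned sequences: mutual information. This is an assumption for illustration only; it is not necessarily the statistic used by my colleague's Fortran program or by the web program, and the function names and toy sequences are made up.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(seqs, i, j):
    """Mutual information (in bits) between columns i and j of aligned sequences."""
    n = len(seqs)
    count_i = Counter(s[i] for s in seqs)            # base counts at position i
    count_j = Counter(s[j] for s in seqs)            # base counts at position j
    count_ij = Counter((s[i], s[j]) for s in seqs)   # joint counts for the pair
    # Sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    return sum(
        (c / n) * log2(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )


def covariation_matrix(seqs):
    """Score every pair of positions; returns {(i, j): MI} for i < j."""
    length = len(seqs[0])
    return {
        (i, j): mutual_information(seqs, i, j)
        for i, j in combinations(range(length), 2)
    }


# Hypothetical toy alignment (not real USS data): positions 0 and 1 always
# change together, and position 2 never varies.
toy = ["AAGT", "AAGT", "CCGT", "CCGA"]

if __name__ == "__main__":
    for pair, score in sorted(covariation_matrix(toy).items()):
        print(pair, round(score, 3))
```

With few sequences, scores like this are inflated by chance alone, which is exactly the spurious-correlation risk; a real analysis would need a significance test or a shuffled-column control.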
In the meantime I'd also been trying out a program that had a lovely simple web interface. But it found only two covarying positions, and these seemed very weak (i.e. their squares on the matrix were only a tiny bit darker than the background). I was attracted to this web program because its matrix display of the results seemed so intuitive, but I quickly realized that this simplicity was failing to tell me what I needed to know. After a lot of back and forth with a helpful expert (= person who let his email address be linked to the web page) I now have a folder full of the software and associated files (ReadMe, Help), and can begin working out how to run it for myself.
Aaarrgghhhh! It's written in a programming language called GAWK/NAWK. Wikipedia says AWK was a precursor to Perl, and runs in Unix; GAWK is GNU-AWK. Thanks, that's a big help. Mac OS 10.4 doesn't have GAWK, just AWK. I hope Westgrid has GAWK.