Field of Science


Not your typical science blog, but an 'open science' research blog. Watch me fumbling my way towards understanding how and why bacteria take up DNA, and getting distracted by other cool questions.
I should have paid more attention in stats class
One of the reviewers of the manuscript I'm revising for Genome Biology and Evolution asked if we could do some statistical analysis of the data we present in a graph. On the left I've put the graphs and the data. The lower graph panel and lower block of data are the controls; we can ignore them for now. I think we can also safely ignore what the data represent.
I'll describe the significance questions with respect to the top-panel graph (A):
We want to know the following:
In the left group (4 blocks of 4 bars, labels SAV, TAL, KEG, PHF/L), are the four blue bars significantly higher than the red, yellow and green bars beside them?
In the middle group (4 blocks of 4 bars, labels QAV, TAC, TSG, PLV), are the four red bars significantly higher than the blue, yellow and green bars beside them?
In the right group (5 blocks of 4 bars, labels PSE, SDG, FRR, QTA, RLN/K), are the five yellow bars significantly higher than the blue, red and green bars beside them?
The actual numbers are in the upper part of the table, in the correspondingly coloured cells, and below I'll restate the above questions in terms of these numbers.
In the top four rows of the table (blue), are the numbers in the bright-blue cells significantly higher than the numbers in the light-blue cells in the same rows?
In the next four rows of the table (pink), are the numbers in the bright-pink cells significantly higher than the numbers in the light-pink cells in the same rows?
In the next five rows of the table (yellow), are the numbers in the bright-yellow cells significantly higher than the numbers in the light-yellow cells in the same rows?
I suspect this is an ANOVA (analysis of variance) type of problem. But I'm pretty sure it would require a more complicated analysis than the simple ANOVA described in the new statistics textbook my author-colleague kindly gave me (probably to get me off his back with dumb statistics questions). Hmmm, maybe it would be possible to do a separate ANOVA on each group, i.e. one for the blue data, one for the red data, and one for the yellow data.
UPDATE:
My basic version of Excel doesn't have the statistics add-in needed for ANOVAs, and I can't even remember the name of the statistics/graphing package the lab owns (it's not installed on my computer). But I found an online applet to do two-way ANOVAs here (I need two-way because I have two variables, the rows and the columns). So I pasted the data from the blue cells into the applet, with the following results.
"Conclusion on Treatments Effects: Very strong evidence against the null hypothesis." The null hypothesis is that all treatments (columns) gave the same results, so there are very significant differences between the data in the different columns (p=0.00058).
"Conclusion on Blocks Effects: Moderate evidence against the null hypothesis." The null hypothesis is that all blocks (rows) gave the same results, so there are moderately significant differences between the data in the different rows (p=0.011).
This is definitely the kind of information I want, so I guess I should find the lab's statistical/graphing package and find someone to show me how to use it to do ANOVAs properly.
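For readers who want to see what such an applet is probably computing, here is a minimal sketch of a two-way ANOVA without replication (one value per row/column cell), written in plain Python. The function name and the 4x4 table are made up for illustration; they are not the data from the post, and the applet's actual method is not documented here, so treat this as an assumption about the standard textbook calculation.

```python
def two_way_anova_no_replication(table):
    """Textbook two-way ANOVA with one observation per cell.

    table[i][j] is the value for block (row) i and treatment (column) j.
    Returns the F statistics for treatments and blocks plus their
    degrees of freedom.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]

    # Variation explained by columns (treatments) and by rows (blocks).
    ss_treat = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_block = n_cols * sum((m - grand) ** 2 for m in row_means)
    # Whatever is left after removing row and column effects is the error.
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )

    df_treat, df_block = n_cols - 1, n_rows - 1
    df_error = df_treat * df_block
    ms_error = ss_error / df_error
    return {
        "F_treatments": (ss_treat / df_treat) / ms_error,
        "F_blocks": (ss_block / df_block) / ms_error,
        "df": (df_treat, df_block, df_error),
    }


# HYPOTHETICAL numbers, invented for this sketch (not the post's data).
# Column 0 plays the role of the "bright" column.
made_up_table = [
    [10, 4, 5, 3],
    [12, 5, 6, 5],
    [9, 3, 4, 4],
    [13, 6, 5, 4],
]

result = two_way_anova_no_replication(made_up_table)
print(result)
```

Each F statistic is turned into a p-value by looking up the upper tail of the F distribution with the matching degrees of freedom; if SciPy is installed, `scipy.stats.f.sf(F, df_effect, df_error)` does that lookup.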
But this analysis doesn't let me see whether it's only the bright-blue column that's significantly different from the others. I guess I could repeat the analysis, leaving out the bright-blue data, and see if the others are not significantly different, but I'm sure there's a better way to do this. After I play around with our statistical/graphing package for a bit, I might be knowledgeable enough to go ask my colleague for help without embarrassing myself too badly.
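One candidate for that "better way" is a planned contrast: split the between-column variation into one piece for "bright column versus the average of the other columns" and a remainder for "differences among the other columns". The sketch below shows the idea in plain Python. It is an illustration under assumptions, not necessarily what a statistician would recommend for these data: the function name is hypothetical, the table is made up, and the approach presumes the bright column was singled out before looking at the numbers.

```python
def bright_vs_rest(table, bright=0):
    """Split the treatment effect of a one-value-per-cell two-way ANOVA.

    Returns the contrast estimate (bright column compared with the other
    columns), its F statistic on (1, df_error) degrees of freedom, and an
    F statistic on (n_cols - 2, df_error) for differences among the
    remaining columns.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]

    ss_treat = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    df_error = (n_rows - 1) * (n_cols - 1)
    ms_error = ss_error / df_error

    # Contrast weights: +(k-1) on the bright column, -1 on each other column.
    weights = [-1.0] * n_cols
    weights[bright] = float(n_cols - 1)
    estimate = sum(w * m for w, m in zip(weights, col_means))
    ss_contrast = n_rows * estimate ** 2 / sum(w * w for w in weights)

    # The rest of the treatment variation is among the non-bright columns.
    ss_rest = ss_treat - ss_contrast
    return {
        "estimate": estimate,
        "F_bright_vs_rest": ss_contrast / ms_error,
        "F_among_rest": (ss_rest / (n_cols - 2)) / ms_error,
        "df_error": df_error,
    }


# HYPOTHETICAL numbers, invented for this sketch (not the post's data).
made_up_table = [
    [10, 4, 5, 3],
    [12, 5, 6, 5],
    [9, 3, 4, 4],
    [13, 6, 5, 4],
]

out = bright_vs_rest(made_up_table, bright=0)
print(out)
```

A large `F_bright_vs_rest` together with a small `F_among_rest` is the pattern that would support "only the bright column differs". The F test is two-sided, so the sign of `estimate` still has to be checked to confirm the bright column is higher rather than lower.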
2 comments:
I really like using Graphpad Prism for that kind of stuff. All of the statistical analysis tools are linked to the extensive help files, which almost make up a statistics textbook themselves. Very clear, very understandable. Even for someone like me, who also didn't pay that much attention in statistics class ;)
I think Graphpad Prism is the name of the package we already have (the one I need to learn to use)!