but only in Statistics. (I promise, this is the last Bayesian post, at least for a while.)

I've always thought that 'probability' and 'likelihood' were synonyms, but yesterday I learned that in Statistics they have distinct and complementary meanings. Unfortunately it's hard to define either of them without using the other, so I'll use the word 'chance' and clarify with examples.

Consider that you have some data and that you have a hypothesis about the reality that produced this data. For example, the data could be that plating the same volume of bacteria on a novobiocin-agar plate and a plain-agar plate gave 43 and 321 colonies respectively, and your hypothesis about reality is that 15.0% of the cells in the culture are able to grow in the presence of novobiocin (are NovR).

Likelihood (as defined for statistical work) is the chance that a real culture with 15.0% NovR cells would have given these numbers of colonies when that volume was plated. More generally, it's the chance that the reality you've hypothesized (often your 'null hypothesis') could have produced the particular data you got. This is what classic 'frequentist' statistical methods deal with. Phylogenetic methods using 'maximum likelihood' presumably take this approach.
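To put a number on this, here is a minimal Python sketch. It assumes the plating result can be treated as a simple binomial draw (43 NovR colonies among 321 total), which is a simplification introduced here and not something claimed in the post. The function name `binomial_likelihood` is just an illustrative choice.

```python
import math

def binomial_likelihood(k: int, n: int, p: float) -> float:
    """Chance of exactly k NovR colonies among n, if a fraction p of cells is NovR."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Data from the plating example: 43 colonies on novobiocin, 321 on plain agar.
k_novr, n_total = 43, 321

# Chance that a culture with 15.0% NovR cells would give exactly these counts.
lik_hypothesis = binomial_likelihood(k_novr, n_total, 0.15)

# For comparison: under a binomial model the data are most likely when p = k/n.
p_best = k_novr / n_total
lik_best = binomial_likelihood(k_novr, n_total, p_best)

print(f"L(p=0.150) = {lik_hypothesis:.4f}")
print(f"L(p={p_best:.3f}) = {lik_best:.4f}")
```

Note that the data stay fixed and only the hypothesized fraction changes between the two calls.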

Probability (as defined for statistical work) reasons the other way around. It's the chance that the culture really has 15.0% NovR cells, given that your plating experiment produced 43 NovR colonies out of 321 total colonies. More generally it's the chance that the reality you're considering is true, given the data you have. This is what Bayesian methods deal with. The phylogenetic software 'MrBayes' presumably takes this approach.
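A rough sketch of this reverse reasoning, under two assumptions that are not in the post: a binomial model for the colony counts, and a flat prior (every NovR fraction equally plausible before the experiment). Because the fraction is a continuous quantity, the chance that it is exactly 15.0% is zero, so the sketch asks about a small interval around 15% instead. The grid approach is one simple way to do this, not the only one.

```python
import math

k_novr, n_total = 43, 321   # counts from the plating example
n_grid = 2000               # number of candidate values for the NovR fraction

# Candidate fractions: midpoints of n_grid equal-width bins on (0, 1).
grid = [(i + 0.5) / n_grid for i in range(n_grid)]

# Flat prior (an assumption): each candidate is equally plausible beforehand.
prior = [1.0 / n_grid] * n_grid

# Chance of the observed data at each candidate fraction (binomial model).
coeff = math.comb(n_total, k_novr)
lik = [coeff * p**k_novr * (1 - p)**(n_total - k_novr) for p in grid]

# Bayes' rule: posterior is proportional to likelihood times prior.
unnorm = [l * q for l, q in zip(lik, prior)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# Chance that the true fraction lies between 14.5% and 15.5%, given the data.
prob_near_15 = sum(w for p, w in zip(grid, posterior) if 0.145 <= p < 0.155)
print(f"P(0.145 <= p < 0.155 | data) = {prob_near_15:.3f}")
```

Here the posterior weights add up to 1 across all candidate fractions, which is what lets them be read as chances about the culture itself.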

For now I'm not going to worry about why this might matter.

"For now I'm not going to worry about why this might matter."

It may matter even less than you think, at least in phylogenetics. MrBayes does use Bayesian statistics to infer phylogenetic relationships among taxa. However, most people give the program a "flat" prior, which basically makes it a Maximum Likelihood analysis.
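The flat-prior point can be illustrated with the colony counts from the post instead of trees. This is only a toy sketch: the binomial model and the `skeptical_prior` (a hypothetical Beta(2, 50)-shaped prior that expects NovR cells to be rare) are assumptions made up for the illustration, and nothing here reflects how MrBayes itself works internally.

```python
k_novr, n_total = 43, 321
grid = [i / 1000 for i in range(1, 1000)]   # candidate NovR fractions 0.001 .. 0.999

def likelihood(p: float) -> float:
    # Binomial likelihood, dropping the constant factor that does not depend on p.
    return p**k_novr * (1 - p)**(n_total - k_novr)

def flat_prior(p: float) -> float:
    return 1.0

def skeptical_prior(p: float) -> float:
    # Hypothetical informative prior, Beta(2, 50)-shaped (unnormalized).
    return p * (1 - p)**49

# Maximum likelihood: the candidate that makes the data most likely.
ml_estimate = max(grid, key=likelihood)

# Posterior mode under each prior: maximize likelihood times prior.
map_flat = max(grid, key=lambda p: likelihood(p) * flat_prior(p))
map_skeptical = max(grid, key=lambda p: likelihood(p) * skeptical_prior(p))

print(ml_estimate, map_flat, map_skeptical)
```

With the flat prior the best-supported value is the same as the maximum likelihood estimate; only the informative prior pulls the answer away from it.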

So why even use MrBayes? Why not just use an ML method? ML analyses are notoriously computationally intensive, and that is just to produce one tree. It is best to resample the data and produce a distribution of trees so that you can calculate statistical support for the branches (bootstrapping).

However, bootstrapping is not needed for MrBayes because the program outputs the probability of each node. So basically, MrBayes saves time and produces similar results to ML. And you can use a lot of sophisticated evolutionary models in your analysis.

Sorry if this bores you......but now you have me thinking of Bayesian versus Maximum Likelihood and since I am publishing a paper using these methods, I keep asking.........what is the point?

I think you could put your general definition even more simply: probability is the chance of an event/data given certain values of parameters, and likelihood is the chance of certain values of parameters given the event/data.

Just to add, a likelihood function need not define a probability measure, i.e. its integral over the parameter values need not equal 1.
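A quick numerical check of this, again assuming a binomial model for the post's colony counts (an assumption for illustration). Summing over all possible data for a fixed parameter gives 1, but integrating over the parameter for fixed data does not; for a binomial the integral works out to 1/(n + 1).

```python
import math

k_novr, n_total = 43, 321

def binomial_pmf(k: int, n: int, p: float) -> float:
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Fix the parameter (p = 0.15) and sum over every possible outcome k.
# This is a genuine probability distribution, so the total is 1.
sum_over_data = sum(binomial_pmf(k, n_total, 0.15) for k in range(n_total + 1))

# Fix the data (k = 43) and integrate over p with a simple midpoint rule.
# This is the likelihood function, and its area is not 1.
steps = 10_000
integral_over_p = sum(
    binomial_pmf(k_novr, n_total, (i + 0.5) / steps) for i in range(steps)
) / steps

print(f"sum over data      = {sum_over_data:.6f}")
print(f"integral over p    = {integral_over_p:.6f}")
print(f"1 / (n_total + 1)  = {1 / (n_total + 1):.6f}")
```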

I think your definitions are the wrong way around. From Wikipedia:

"In non-technical parlance, "likelihood" is usually a synonym for "probability" but in statistical usage, a clear technical distinction is made. One may ask "If I were to flip a fair coin 100 times, what is the probability of it landing heads-up every time?" or "Given that I have flipped a coin 100 times and it has landed heads-up 100 times, what is the likelihood that the coin is fair?" but it would be improper to switch "likelihood" and "probability" in the two sentences."

Probability talks about the chances of the observation, likelihood refers to the chances of the parameters being correct given an observation.
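The coin example from the quote can be worked through in a few lines, which may help reconcile the two ways of putting it in this thread. In the usual statistical convention the likelihood of a parameter value is computed with the same formula as the probability of the data; what differs is which quantity is held fixed and which is allowed to vary. The sketch below is only an illustration, and the list of candidate coins is made up for the example.

```python
def prob_all_heads(p_heads: float, n_flips: int = 100) -> float:
    """Chance of n_flips heads in a row from a coin that lands heads with chance p_heads."""
    return p_heads ** n_flips

# Question 1 (coin known, data not yet seen):
# chance that a fair coin lands heads-up 100 times out of 100, about 7.9e-31.
chance_of_data = prob_all_heads(0.5)

# Question 2 (data known, coin unknown): hold the data fixed at 100 heads
# and evaluate the same formula for several hypothetical coins.
candidates = [0.5, 0.9, 0.99, 1.0]
likelihoods = {p: prob_all_heads(p) for p in candidates}

for p, lik in likelihoods.items():
    print(f"coin with P(heads)={p}: likelihood {lik:.3g}")
```

The likelihood of the fair coin is the same tiny number as in question 1, and it is only meaningful relative to the likelihoods of the other candidate coins.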