When you use Bayesian inference to estimate the probability of heads for a coin toss, you get a beta distribution. It is a distribution for the probability of heads, i.e. it is a probability distribution over a probability.

If you know nothing about the coin, then it starts out as a flat uniform distribution. With no information there is no way to know if the coin has 2 heads, 2 tails, or some kind of bias in between. As you toss the coin, the parameters of the distribution are updated: each head increments one parameter and each tail the other, so after h heads and t tails the posterior is Beta(1 + h, 1 + t). The distribution becomes narrower and more peaked as tosses accumulate. If you keep tossing the coin it will eventually converge (in the limit of an infinite number of tosses) to a delta function located at the true probability of heads.
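This update rule can be sketched in a few lines of Python. The code below simulates tosses of a coin whose true probability of heads is 0.6 (the same value as in the animation), starting from a flat Beta(1, 1) prior; the seed and toss count are arbitrary choices for the illustration.

```python
import random

def update_posterior(tosses, alpha=1.0, beta=1.0):
    """Start from a flat Beta(1, 1) prior and update toss by toss.

    Each head adds 1 to alpha and each tail adds 1 to beta, so
    after h heads and t tails the posterior is Beta(1 + h, 1 + t).
    """
    for is_head in tosses:
        if is_head:
            alpha += 1
        else:
            beta += 1
    return alpha, beta

random.seed(42)
p_true = 0.6  # true probability of heads for the simulated coin
tosses = [random.random() < p_true for _ in range(300)]

a, b = update_posterior(tosses)
mean = a / (a + b)  # posterior mean estimate of the probability of heads
print(f"Beta({a:.0f}, {b:.0f}), posterior mean = {mean:.3f}")
```

With 300 tosses the posterior mean lands close to 0.6, and adding more tosses tightens it further, which is exactly the narrowing of the peak the animation shows.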

The video below is an animation that shows how this happens. The true probability for the coin in the video is 0.6 and you can see the peak of the distribution homing in on that value as the number of tosses increases.

The simulation stops after 300 tosses. At that point the peak is close to 0.6 but still a small distance away. This is why it is nice to have a distribution for the probability instead of just a point estimate: the distribution shows the range of values within which the true probability is likely to lie.
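That range can be read off the posterior as a credible interval. The sketch below assumes a hypothetical count of 181 heads in 300 tosses (a plausible outcome for a 0.6 coin; the actual count in the video is not stated) and estimates an equal-tailed 95% interval by drawing samples from the posterior with the standard library's random.betavariate.

```python
import random

random.seed(0)

# Hypothetical outcome: 181 heads and 119 tails in 300 tosses,
# giving the posterior Beta(1 + 181, 1 + 119) from a flat prior.
alpha, beta = 1 + 181, 1 + 119

# Sample the posterior and take the middle 95% of the draws as an
# equal-tailed credible interval for the probability of heads.
draws = sorted(random.betavariate(alpha, beta) for _ in range(100_000))
lo = draws[int(0.025 * len(draws))]
hi = draws[int(0.975 * len(draws))]
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```

The interval comes out a few hundredths wide on either side of 0.6, making concrete the "small distance away" that a point estimate alone would hide.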

If you're interested in more information about the beta distribution and how it relates to the coin toss, see The Coin Toss: The Hydrogen Atom of Probability.

© 2010-2012 Stefan Hollos and Richard Hollos
