More than 250 years ago, the challenge of making predictions from small data weighed heavily on the Presbyterian minister Reverend Thomas Bayes of Tunbridge Wells, England.

Looking at the commonplace raffles of 18th-century England, he wondered what one's chances of winning them were. If five out of ten tickets bought were winners, then the chance of a win could be put quite simply at 50%. But what if one bought a single ticket and it came out the winner? Was the chance of winning the raffle really 100%? That sounded far too simplistic to our dear Reverend, who balanced scholarly and theological interests almost all his life. Ordained like his father and a man of keen intellect, he was elected to the Royal Society in 1742.

For somebody who had such an immense impact on the field of reasoning under uncertainty, little can be said about Bayes's history with certainty. He is known to have published two works in his lifetime, one theological and one mathematical:

● Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures (1731)
● An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of The Analyst (published anonymously in 1736), in which he defended the logical foundation of Isaac Newton's calculus ("fluxions") against the criticism of George Berkeley, author of The Analyst.

We have no record of any other work published in his lifetime. Be that as it may, we know that his work and findings on probability theory passed in manuscript form to his friend Richard Price after his death. It was through Price's perusal of these papers that the critical insight underlying the elegance of Bayes's theorem came to light.

Bayes noted that 'Reasoning forward from hypothetical pasts lays the foundation for us to then work backward to the most probable past' [Algorithms to Live By: The Computer Science of Human Decisions]. Suppose we buy three tickets and all three turn out to be winners. If every ticket in the raffle were a winner, this outcome would be certain, a 100% chance. If half of the tickets sold in the raffle were winners, the chance of three wins out of three would be 1/2 × 1/2 × 1/2, i.e. 1/8. If the raffle awarded a prize to only 1 in a thousand tickets, the chance would be 1/1000 × 1/1000 × 1/1000, or 1 in a billion. Bayes's logic quantifies our intuition: having seen three winning tickets, it is exactly 8 times likelier that all tickets are winners than that half of them are. Furthermore, it is exactly 125 million times likelier that half of them are winners than that only 1 in a thousand is.
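The arithmetic of this forward reasoning can be sketched in a few lines of Python. This is only an illustration: the function name `likelihood_all_win` and the hypothesis labels are made up for this example, and exact fractions are used so that the ratios come out exactly.

```python
from fractions import Fraction

def likelihood_all_win(win_rate, tickets_bought=3):
    """Chance that every ticket bought wins, if a share `win_rate` of all tickets are winners."""
    return win_rate ** tickets_bought

# Three hypothetical raffles (illustrative labels, not from the original text).
hypotheses = {
    "all tickets win": Fraction(1),
    "half of the tickets win": Fraction(1, 2),
    "1 in 1000 tickets win": Fraction(1, 1000),
}

# Reason forward: how likely are three wins out of three under each hypothesis?
likelihoods = {name: likelihood_all_win(rate) for name, rate in hypotheses.items()}

# Reason backward: compare hypotheses by the ratio of their likelihoods.
ratio_all_vs_half = likelihoods["all tickets win"] / likelihoods["half of the tickets win"]
ratio_half_vs_rare = likelihoods["half of the tickets win"] / likelihoods["1 in 1000 tickets win"]

for name, value in likelihoods.items():
    print(f"{name}: {value}")
print("all vs half:", ratio_all_vs_half)
print("half vs 1 in 1000:", ratio_half_vs_rare)
```

The two ratios are the "8 times" and "125 million times" figures quoted above.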

He probably arrived at this critical insight through an experiment in which he sat with his back to a perfectly level, square table and asked his assistant to throw a ball onto the table and describe its position using the words "left", "right", "front" and "back". He noted each report down on his pad and steadily improved his hypothesis about the original position of the ball. But the insight never saw print in his lifetime, as Bayes probably set it aside, thinking of it as rather ordinary, common knowledge.

Thinking back to our example, what, then, are the odds of winning a raffle? In 1774, Laplace solved the problem of making inferences backwards from observed effects to their probable causes. Pierre-Simon Laplace was born in Normandy in 1749, and his father sent him to the University of Caen to study theology. Unlike Bayes, Laplace abandoned his theological pursuits entirely for mathematics. Using calculus, Laplace could distill the vast number of hypotheses (as in the case of a raffle) into a single estimate with a superbly succinct formula, (w+1)/(n+2), where w is the number of winning tickets in n attempts. If you try once and it works out, Laplace's Law estimates 2/3, or around 67%, as opposed to the simplistic 100% that had troubled Bayes.
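Laplace's Law is simple enough to try out directly. The following is a minimal sketch, with `laplace_estimate` being a name chosen here for illustration; it simply evaluates (w+1)/(n+2) as an exact fraction.

```python
from fractions import Fraction

def laplace_estimate(wins, attempts):
    """Laplace's Law: estimate the chance of a win as (w + 1) / (n + 2)."""
    if not 0 <= wins <= attempts:
        raise ValueError("wins must be between 0 and attempts")
    return Fraction(wins + 1, attempts + 2)

one_for_one = laplace_estimate(1, 1)       # a single ticket that won
three_for_three = laplace_estimate(3, 3)   # three tickets, all winners
five_of_ten = laplace_estimate(5, 10)      # five winners out of ten

print(one_for_one, float(one_for_one))
print(three_for_three, float(three_for_three))
print(five_of_ten, float(five_of_ten))
```

Note how the estimate never reaches 0% or 100% on finite evidence: one win out of one gives 2/3, three out of three gives 4/5, and with no tickets bought at all the formula falls back to an even 1/2.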

Laplace also considered a crucial modification to Bayes's argument. He added a mechanism that assigns more weight to hypotheses that are simply more probable than others to begin with. Suppose a friend shows you nine fair coins and one biased (two-headed) coin, puts all of them in a bag, draws one at random and flips it: heads. Is the coin a fair one or the biased one? A fair coin is half as likely to come up heads as the biased coin. But it is also nine times as likely to have been drawn in the first place. We simply multiply these two considerations and find that it is 4.5 times more likely that the heads came from a fair coin.
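The multiplication of prior and likelihood in the coin example can be written out as a short Python sketch. It assumes the biased coin always lands heads (which is what "half as likely" implies for a fair coin); the variable names are illustrative only.

```python
from fractions import Fraction

# Prior: chance of drawing each kind of coin from a bag of nine fair and one biased.
prior_fair = Fraction(9, 10)
prior_biased = Fraction(1, 10)

# Likelihood of heads for each kind of coin.
# Assumption: the biased coin is two-headed, so it always shows heads.
heads_given_fair = Fraction(1, 2)
heads_given_biased = Fraction(1)

# Multiply prior by likelihood for each hypothesis (unnormalised posteriors).
weight_fair = prior_fair * heads_given_fair
weight_biased = prior_biased * heads_given_biased

# Odds in favour of the fair coin after seeing heads.
odds_fair_vs_biased = weight_fair / weight_biased

# Normalise to get the posterior probability that the coin is fair.
posterior_fair = weight_fair / (weight_fair + weight_biased)

print("odds fair vs biased:", float(odds_fair_vs_biased))
print("P(fair | heads):", posterior_fair)
```

The odds come out at 4.5 to 1 in favour of the fair coin, which corresponds to a posterior probability of 9/11 that the coin is fair.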

The mathematical formula that describes this relation, tying together our prior notions and the data before our eyes, has come to be known as Bayes's Rule; ironically so, as the heavy lifting was done by Laplace. If it seems like Laplace went without credit, you need not worry, for he went on to help found the field we now know as mathematical physics. He also formulated Laplace's equation and pioneered the Laplace transform, which appears in many branches of mathematical physics. The Laplacian differential operator, widely used in mathematics, is also named after him. He restated and developed the nebular hypothesis of the origin of the Solar System and was one of the very first scientists to postulate the existence of black holes and the idea of gravitational collapse. Reverend Bayes, too, went down in history as the father of modern reasoning. So it was a happy ending after all.

Author of this piece: Aishwarya Mali, Associate Data Scientist @AlgoAnalytics
