## Bayesian probability, sans maths (mostly)

Okay, time to get back to sciency stuff, to try to get my head around things I should know more about. Bayesian statistics and probability have been brought to the periphery of my attention many times over the years, but my current slow reading of Daniel Kahneman’s *Thinking, Fast and Slow* has challenged me to master it once and for all (and then doubtless to forget about it forevermore).

I’ve started a couple of pieces on this topic in the past week or so, and abandoned them along with all hope of making sense of what is no doubt a doddle for the cognoscenti, so I clearly need to keep it simple for my own sake. I’m interested because critics and analysts of both scientific research and political policy-making often complain that Bayesian reasoning is insufficiently utilised, to the detriment of such activities. I can’t pretend that I’ll be able to help out though!

So Thomas Bayes was an 18th-century English statistician who left a theorem behind in his unpublished papers, apparently underestimating its significance. The person most responsible for utilising and popularising Bayes’ work was the French polymath Pierre-Simon Laplace. The theorem, or rule, is captured mathematically thusly:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

where *A* and *B* are events, and *P(B)*, that is, the probability of event *B*, is not equal to zero. In statistics, the probability of an event’s occurrence ranges from 0 to 1 – meaning zero probability to total certainty.
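For those who find code easier to read than symbols, here is a minimal sketch of the rule as a Python function: the probability of *A* given *B* is the probability of *B* given *A*, times the probability of *A*, divided by the probability of *B*. The function name `bayes_posterior` and the numbers fed into it are my own illustrative assumptions, not anything from Bayes or Kahneman.

```python
def bayes_posterior(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        # The rule is undefined when B cannot happen.
        raise ValueError("P(B) must not be zero")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs, chosen only to show the arithmetic:
# P(A) = 0.2, P(B|A) = 0.5, P(B) = 0.25
posterior = bayes_posterior(p_b_given_a=0.5, p_a=0.2, p_b=0.25)
print(round(posterior, 3))  # 0.4
```

With these made-up inputs, learning that *B* has occurred doubles the probability of *A*, from 0.2 to 0.4.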

I do, at least, understand the above equation, which, in words, means that the probability of *A* occurring, given that *B* has occurred, is equal to the probability of *B* occurring, given that *A* has occurred, multiplied by the probability of *A*’s occurrence, all divided by the probability of *B*’s occurrence. However, after tackling a few video mini-lectures on the topic I’ve decided to give up and focus on Kahneman’s largely non-mathematical treatment with regard to decision-making. The theorem, or rule, presents, as Kahneman puts it, ‘the logic of how people should change their mind in the light of evidence’. Here’s how Kahneman first describes it:

Bayes’ rule specifies how prior beliefs… should be combined with the diagnosticity of the evidence, the degree to which it favours the hypothesis over the alternative.

D Kahneman, *Thinking, Fast and Slow*, p154

In the simplest example – if you believe that there’s a 65% chance of rain tomorrow, you really need to believe that there’s a 35% chance of no rain tomorrow, rather than any alternative figure. That seems logical enough, but take this example re US Presidential elections:

… if you believe there’s a 30% chance that candidate x will be elected President, and an 80% chance that he’ll be re-elected if he wins first time, then you must believe that the chances that he will be elected twice in a row are 24%.

This is also logical, but not obvious to a surprisingly large percentage of people. What appears to ‘throw’ people is a story, a causal narrative. They imagine a candidate winning, somewhat against the odds, then proving her worth in office and winning easily next time round – this story deceives them into defying logic and imagining that the chance of her winning twice in a row is greater than that of winning first time around – which is a logical impossibility. Kahneman places this kind of irrationalism within the frame of system 1 v system 2 thinking – roughly equivalent to intuition v concentrated reasoning. His solution to the problem of this kind of suasion-by-story is to step back and take greater stock of the ‘diagnosticity’ of what you already know, or what you have predicted, and how it affects any further related predictions. We’re apparently very bad at this.
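The arithmetic behind the rain and election examples is small enough to check in a few lines. This is just a sketch using the figures quoted above; the variable names are mine.

```python
# Complement rule: the chances of rain and of no rain must sum to 1.
p_rain = 0.65
p_no_rain = 1 - p_rain

# Conjunction rule: P(win twice) = P(win first) * P(re-elected | won first).
p_first_win = 0.30
p_reelected_given_win = 0.80
p_two_wins = p_first_win * p_reelected_given_win

# Winning twice can never be more likely than winning the first time.
assert p_two_wins <= p_first_win

print(f"{p_no_rain:.2f} {p_two_wins:.2f}")  # 0.35 0.24
```

Because the second probability is at most 1, multiplying by it can only shrink the first, which is why the 'story' version of events has to be wrong.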

There are many examples throughout the book of failure to reason effectively from information about *base rates*, often described as ‘base-rate neglect’. A base rate is a statistical fact which should be taken into account when considering a further probability. For example, when given information about the character of a fictional person T, information that was deliberately designed to suggest he was stereotypical of a librarian, research participants gave the person a much higher probability of being a librarian rather than a farmer, even though they knew, or should have known, that the number of persons employed as farmers was higher by a large factor than the number employed as librarians (that is, the base rate of librarians in the workforce is much lower). Of course the degree to which the base rate was made salient to participants affected their predictions.
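To see how much a base rate can matter, here is a rough sketch with numbers I have invented for the purpose: 20 farmers for every librarian, and a character description that is four times as likely to fit a librarian as a farmer. Neither figure comes from the research described above; they are assumptions to make the point.

```python
def posterior_probability(prior_odds: float, likelihood_ratio: float) -> float:
    """Combine prior odds with the strength of the evidence, return a probability."""
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


farmers_per_librarian = 20  # ASSUMED base rate, for illustration only
description_fit_ratio = 4   # ASSUMED: description fits a librarian 4x as well

p_librarian = posterior_probability(
    prior_odds=1 / farmers_per_librarian,
    likelihood_ratio=description_fit_ratio,
)
print(round(p_librarian, 3))  # 0.167
```

Even with a description that strongly suggests a librarian, under these assumed numbers T is still about five times more likely to be a farmer.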

Here’s a delicious example of the application, or failure to apply, Bayes’ rule:

A cab was involved in a hit-and-run at night. Two cab companies, Green Cabs and Blue Cabs, operate in the city. You’re given the following data:

– 85% of the cabs in the city are Green, 15% are Blue.

– A witness identified the cab as Blue. The court tested the reliability of the witness under the circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colours 80% of the time and failed 20% of the time.

What is the probability that the car involved in the accident was Blue rather than Green?

D Kahneman, *Thinking, Fast and Slow*, p166

It’s an artificial scenario, granted, but if we accept the accuracy of those probabilities, we can say this: given that the base rate of Blue cabs is 15%, and the probability of the witness identifying the cab accurately is 80%, the odds in favour of Blue are the prior odds multiplied by the likelihood ratio – (.15/.85) x (.8/.2) = .706. Converting those odds to a probability, by dividing them by one plus the odds (1.706), gives approximately 41%.
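As a check on that figure, here is a short sketch that works the cab problem two ways: through the odds, and through Bayes’ rule directly. The inputs are the ones given in the scenario; the variable names are my own.

```python
p_blue = 0.15     # base rate of Blue cabs
p_green = 0.85    # base rate of Green cabs
p_correct = 0.80  # witness names the right colour
p_wrong = 0.20    # witness names the wrong colour

# Route 1: prior odds times likelihood ratio, then odds -> probability.
prior_odds = p_blue / p_green
likelihood_ratio = p_correct / p_wrong
posterior_odds = prior_odds * likelihood_ratio
p_from_odds = posterior_odds / (1 + posterior_odds)

# Route 2: Bayes' rule, with P(witness says Blue) as the denominator.
p_says_blue = p_correct * p_blue + p_wrong * p_green
p_blue_given_says_blue = p_correct * p_blue / p_says_blue

print(round(posterior_odds, 3))          # 0.706
print(round(p_blue_given_says_blue, 3))  # 0.414
```

Both routes land on the same answer, a little over 41%, which is well short of the 80% that the witness’s reliability alone would suggest.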

So how close were the research participants to this figure? Most participants ignored the statistical data – the base rates – and gave the figure of 80%. They were more convinced by the witness. However, when the problem was framed differently, by providing causal rather than statistical data, participants’ guesses were more accurate. Here’s the alternative presentation of the scenario:

You’re given the following data:

– the two companies operate the same number of cabs, but Green cabs are involved in 85% of accidents

– the information about the witness is the same as previously presented

The mathematical result is the same, but this time the guesses were much closer to the correct figure. The difference lay in the framing. Green cabs *cause* accidents. That was the fact that jumped out, whereas in the first scenario, the fact that most clearly jumped out was that the witness identified the offending cab as Blue. The statistical data in scenario 1 was largely ignored. In the second scenario, the witness’s identification of the Blue cab moderated the tendency to blame the Green cabs, whereas in scenario 1 there was no ‘story’ about Green cabs causing accidents and the blame shifted almost entirely to the Blue cabs, based on the witness’s story. Kahneman named his chapter about this tendency ‘Causes trump statistics’.

So there are *causal* and *statistical* base rates, and the lesson is that in much of our intuitive understanding of probability, we simply pay far more attention to causal base rates, largely to our detriment. Also, our causal inferences tend to be stereotyped, so that only if we are faced with surprising causal rates, in particular cases and not presented statistically, are we liable to adjust our probabilistic assessments. Kahneman presents some striking illustrations of this in the research literature. Causal information creates bias in other areas of behaviour assessment too, of course, as in the phenomenon of regression to the mean, but that’s for another day, perhaps.
