## what is Bayesian inference?

Canto: So as a dumb non-scientific science aficionado, I’ve come across Bayesian inference and probability a few times before, and even might have come to an understanding of it before losing it again, but I’m wanting to get my head around it, especially in terms of consciousness and how we make sense of the external world via the complex interpreting and understanding systems in our heads. My vague sense of it is that it’s a kind of open-ended system of inferring what’s happening by continually updating the ‘understanding system’ with new data. Is that anything like it?

Jacinta: Okay, we’ve been reading Anil Seth’s *Being You*, subtitled ‘a new science of consciousness’, which argues for consciousness, or at least perception, as ‘controlled hallucination’. Bayesian reasoning is concisely described as ‘inference to the best explanation’, so yes, we take percepts that strike us as surprising or out of the ordinary, and do work on them through memory or the widening of perspective to make them fit with previous experience – the best explanation we can make of the meaning of that percept. I think by ‘controlled hallucination’, Seth is suggesting that the impressionistic blast of data that impinges on our senses at any moment gets its ‘control’, and loses its hallucinatory impact, as a result of what we call experience: the connections between this blast and previous blasts.

Canto: So that due to familiarity we stop thinking of them as blasts, though they might’ve seemed that way as new-borns. And might seem again under the influence of drugs.

Jacinta: Yes, which can scramble the regular controls. But returning to Thomas Bayes and his reasoning, Seth describes it as *abductive*, as opposed to the deductive reasoning of classical logic, or the inductive reasoning derived from experience (extrapolation from an apparently unending series of observations, such as the regular waxing and waning of the moon). Here’s what Seth says about abduction:

Abductive reasoning – the sort formalised by Bayesian inference – is all about finding the best explanation for a set of observations, when these observations are incomplete, uncertain or otherwise ambiguous. Like inductive reasoning, abductive reasoning can also get things wrong. In seeking the ‘best explanation’, abductive reasoning can be thought of as reasoning backward, from observed effects to their most likely causes, rather than forward, from causes to their effects – as is the case for deduction and induction.

Anil Seth, *Being You*, p. 98

Canto: Ah right, so what we experience first are effects – stuff in our heads, and we have to make the best guess about their causes – stuff in the world. Or what we believe to be in the world. So, as new-borns we see – in our heads – the faces and bodies of these people making a fuss over us, though we apparently don’t even know what faces and bodies are, let alone parents. But over time and much repetition we come to see these faces and bodies aren’t there to harm us (if we’re lucky) and, with further information over vast swathes of time, that they’re our parents, and that we’re one of the species called *Homo sapiens*, etc.

Jacinta: Well it’s good that you’ve gone back to earliest childhood, because it makes a mockery, in a way, of inferring ‘the most likely cause for the observed data’, to quote Seth, as obviously infants don’t ‘think’ that way.

Canto: And neither do adults – it’s more automatic than ‘thinking’, it’s a way of understanding and surviving in their world…

Jacinta: We need to think of inference as something more basic, far more basic than an intellectual process, of course. Anyway, here’s how Seth describes it. We go from what we already know, which is termed the *prior*, to what we might know in the future (the *posterior*) by means of what we’re now learning (the *likelihood*). The uniting concept here is ‘knowledge’, in its different stages. The *prior* isn’t necessarily stable; it can be modified or overturned by new learning. You could also describe the prior as a belief. You may believe that, say, Ukraine will win the current war – whatever winning means in this context – but further learning may alter that belief one way or another. We’re looking for the best posterior probability, and so, in the Ukrainian example, we’re thoroughly examining future likelihoods – media sources and expert opinions as to the current state of events and what they might lead to – as well as battling with particular tendencies to be optimistic or pessimistic.

Canto: But doesn’t Bayesian inference, or probability, have a mathematical aspect? It doesn’t seem, from what you’ve said, that there’s anything remotely quantifiable here. How can you quantify beliefs or knowledge?

Jacinta: Well, Seth is looking at quantities here only in terms of some percept, say, as being more or less likely to be of a particular thing-in-the-world, say a particular species of bird, based on experience, the likelihood of that species being spotted in that place, at that time, and so on. I know that mathematics is involved in Bayesian probability – just look it up online – but the concept of inferring to the most likely conclusion from best current and past data seems to be mathematical only in that broadest sense. And I must admit I’m more interested in Seth’s concept of consciousness than in the mathematics of probability, Bayesian or otherwise.

Canto: Ah, but I’m wondering if, since all the physicists are telling me the universe is, if not mathematical, inexplicable without mathematics, maybe the full comprehension of consciousness requires maths too?

Jacinta: Okay, since our topic is Bayesian inference we might need to wade into the mathematical shallows here. So Thomas Bayes presented an alternative to what is now, and maybe then, called *frequentist* statistical analysis. Here’s a rough example taken from a video referenced below. A ‘frequentist’ GP would use basic statistics derived from a model, say ‘a certain number/percentage of my male patients above a particular age have heart problems’, to infer that the symptoms of the patient before her are quite likely the result of a heart condition. A Bayesian GP would have a similar model but would also take into account her prior knowledge of this particular patient, which would make the diagnosis more likely or unlikely depending on the content of this prior knowledge.

Canto: Yeah that’s the mathematical shallows all right.

Jacinta: Well, it might surprise you how mathematical even examples like this can be made. But put another way, the Bayesian approach is experiential rather than simple statistical number-crunching. ‘Frequentist’ is given away by the name, which points to how frequently things occur in the data, so maybe it strives to be objective.
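To show how the GP example might be put into numbers, here is a minimal Python sketch. Every figure in it is invented for illustration (the 30% rate among older male patients, the symptom probabilities, the two patient-specific priors), and the function name `posterior` is just a label chosen here, not anything from Seth or the videos. It applies the same Bayesian update from three different starting beliefs.

```python
def posterior(prior, p_symptoms_if_heart, p_symptoms_if_not):
    """Probability of a heart condition after seeing the symptoms.

    prior: belief in a heart condition before the symptoms are considered.
    p_symptoms_if_heart: chance of these symptoms if the condition is present.
    p_symptoms_if_not: chance of these symptoms if it is absent.
    """
    # Overall chance of seeing the symptoms, whatever the cause.
    evidence = p_symptoms_if_heart * prior + p_symptoms_if_not * (1 - prior)
    return p_symptoms_if_heart * prior / evidence


# Hypothetical numbers, chosen only to make the arithmetic visible.
P_SYMPTOMS_IF_HEART = 0.80
P_SYMPTOMS_IF_NOT = 0.20

population_prior = 0.30  # assumed rate among older male patients in general
fit_patient_prior = 0.10  # assumed prior for a patient known to be very fit
risky_patient_prior = 0.60  # assumed prior for a patient with a known history

baseline = posterior(population_prior, P_SYMPTOMS_IF_HEART, P_SYMPTOMS_IF_NOT)
fit = posterior(fit_patient_prior, P_SYMPTOMS_IF_HEART, P_SYMPTOMS_IF_NOT)
risky = posterior(risky_patient_prior, P_SYMPTOMS_IF_HEART, P_SYMPTOMS_IF_NOT)

print(f"population figures only: {baseline:.2f}")
print(f"known to be very fit:    {fit:.2f}")
print(f"known risky history:     {risky:.2f}")
```

The symptoms are identical in all three cases. Only the prior changes, and with it the conclusion, which is the sense in which the doctor’s knowledge of this particular patient makes the diagnosis more or less likely.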

Canto: Quantitative vs qualitative?

Jacinta: Well, yes that’s part of it, but there is a Bayesian theorem, which I may as well stick in here for completeness’ sake.
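In its most familiar textbook form (the letters A and B are just the conventional placeholders for two events, and this is the standard statement rather than anything specific to Seth’s book), the theorem reads:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Read aloud: the probability of A given B equals the probability of B given A, times the probability of A, divided by the probability of B.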

There are different descriptions of the theorem – this one doesn’t give much indication of the importance of prior knowledge/experience. Anyway, returning to Seth and consciousness, these Bayesian inferences would be constantly updated in the case of infants as you say, as new knowledge is being produced at a rapid clip, that this animal is a dog, say, and is mostly harmless but not always, and this item isn’t food though it’s nice to suck on, but that item tastes horrible – though they wouldn’t know what *taste *is…

Canto: Which really explains why all these neural connections are laid down so quickly in early childhood – they’re really essential for survival.

Jacinta: And, as Seth points out, the best scientific methods involve Bayesian inference – theories upgraded or discarded by experimental evidence or new discoveries that don’t fit. But our thinking (that, when we’re infants, these people constantly around us are more significant, for us, than the people who pass by or occasionally visit) doesn’t have to rise to the level of theory. These are just understandings, more or less accurate, and constantly updated. We might learn, for example, that these adults or pets aren’t always on our side, say when we try to eat the dog, or whatever. Anyway, we could go into a little bit of detail about the probabilities, from zero to one, of priors, likelihoods and posteriors, and about probability distributions, of the Gaussian kind, which shift as more information comes to mind, but maybe we’ll come back to it in a future post. My head hurts already.
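For anyone who wants a taste of that shifting Gaussian before then, here is a minimal sketch. It assumes the simplest textbook case: a Gaussian prior belief about some quantity, updated by noisy observations whose noise level is known. The function name `update_gaussian` and all the numbers are made up for illustration.

```python
def update_gaussian(prior_mean, prior_var, observation, obs_var):
    """Combine a Gaussian prior with one noisy observation.

    Returns the mean and variance of the Gaussian posterior, assuming the
    observation noise is itself Gaussian with known variance obs_var.
    """
    # Precision (1 / variance) measures confidence; precisions add up.
    precision = 1.0 / prior_var + 1.0 / obs_var
    post_var = 1.0 / precision
    # The new mean is a confidence-weighted average of belief and data.
    post_mean = post_var * (prior_mean / prior_var + observation / obs_var)
    return post_mean, post_var


# Hypothetical starting belief: centred on 0, and quite uncertain.
mean, var = 0.0, 4.0
history = [(mean, var)]

# Three invented observations, each reading 2.0 with noise variance 1.0.
for obs in [2.0, 2.0, 2.0]:
    mean, var = update_gaussian(mean, var, obs, obs_var=1.0)
    history.append((mean, var))

for step, (m, v) in enumerate(history):
    print(f"after {step} observations: mean={m:.3f}, variance={v:.3f}")
```

With each observation the centre of the belief moves toward the data and the spread narrows, so yesterday’s posterior becomes today’s prior.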

**References**

Anil Seth, *Being You: A New Science of Consciousness*, 2021

Bayesian vs frequentist statistics (video), Ox Educ

Frequentism and Bayesianism: What’s the Big Deal? | SciPy 2014 | Jake VanderPlas (video)
