The paper [1] offers a comprehensive review of progress on predictive coding and Bayesian inference. In terms of Marr’s three levels [2], Bayesian inference sits at the computational level and predictive coding at the representational level. As discussed in the paper, the two concepts can be developed independently or combined with each other. Predictive coding can be a useful neural representational motif for multiple computational goals (e.g. efficient coding, optimization of membrane potential dynamics, and reinforcement learning). Conversely, the brain could perform Bayesian inference with other neural representations (e.g. probability coding, log-probability coding, and direct variable coding). The authors observe that more experiments need to be designed to provide direct evidence about such representations.
The term "predictive coding" is used to describe very different approaches in the neuroscience literature [3]. The paper takes the term to mean that neural responses represent prediction errors. Alternatively, it has been defined in terms of neurons that preferentially encode stimuli carrying information about the future [4]. The two definitions are not necessarily consistent with each other, since neurons that represent errors need not be predictive of the future.
Bayesian inference uses the sensory input to compute the posterior over latent variables. Based on this posterior, the brain might pick a single best point estimate of the latent variables (for example, the posterior mode). This estimate could then be used to compute a prediction of the input.
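The three steps described above (posterior, point estimate, prediction) can be sketched in a toy setting. Everything in this example is an assumption made for illustration and is not taken from [1]: a discrete latent with three values, a Gaussian likelihood with a fixed noise level, the maximum a posteriori (MAP) value as the point estimate, and the names `posterior_over_latent`, `means`, `prior` and `sigma`.

```python
import math

def posterior_over_latent(x, means, prior, sigma):
    """Posterior p(z | x) for a discrete latent z with Gaussian likelihood
    p(x | z) = N(x; means[z], sigma^2). Toy model, assumed for illustration."""
    unnormalized = [
        p * math.exp(-(x - m) ** 2 / (2 * sigma ** 2))
        for p, m in zip(prior, means)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical numbers: three latent states, each predicting a different input.
means = [0.0, 1.0, 2.0]   # input predicted by each latent state
prior = [0.5, 0.3, 0.2]   # prior probability of each latent state
sigma = 0.5               # sensory noise standard deviation
x = 1.2                   # observed sensory input

# Step 1: posterior over the latent given the input.
posterior = posterior_over_latent(x, means, prior, sigma)

# Step 2: point estimate (here the MAP state, one possible choice).
z_map = max(range(len(posterior)), key=posterior.__getitem__)

# Step 3: prediction of the input from the point estimate, and the
# residual that a predictive coding scheme would represent.
prediction = means[z_map]
prediction_error = x - prediction

print("posterior:", [round(p, 3) for p in posterior])
print("MAP state:", z_map, "prediction:", prediction)
```

Note that the prediction error is only computed at the last step; the posterior itself is obtained without it, which is one way to see that Bayesian inference and predictive coding are separable ideas.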
Predictive coding doesn’t specify how to generate predictions, whereas Bayesian inference offers a natural way to compute them. It therefore seems natural to combine the two into Bayesian predictive coding. In [5], a hierarchical model of the visual system is built in which neurons carry the prediction error between bottom-up inputs and top-down predictions. Several experimental results (e.g. surround suppression) can be explained by this hierarchical Bayesian predictive coding model. But this model also uses neurons to represent the predictions and the latent variables. Thus it is more like a hybrid of Bayesian predictive coding and direct variable coding. We then debated in our journal club whether the pure predictive coding idea is enough to represent the brain’s inference process. Mathematically, this can be understood as follows. If the prior and the likelihood are both Gaussian, maximizing the posterior with respect to the weights is equivalent to minimizing a (regularized) sum-of-squares error function. If the brain uses gradient descent to perform learning, the gradient of the sum-of-squares error function depends on the prediction errors. This means the brain needs to represent the prediction errors if it is doing optimal Bayesian inference under Gaussian assumptions. However, the brain might do suboptimal Bayesian inference, and the distribution of the input can also be non-Gaussian. Thus, in general, representing only the prediction error is not enough for the brain to perform Bayesian inference.
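The Gaussian argument above can be checked numerically. The sketch below assumes a linear generative model that is not taken from [1] or [5]: a scalar latent `r`, an input predicted as `u[i] * r`, Gaussian input noise with variance `sigma2`, and a Gaussian prior on `r`. All function and variable names are hypothetical. Under these assumptions the negative log posterior is a sum of squared errors, and its gradients with respect to both the latent and the weights can be written using only prediction errors.

```python
def neg_log_posterior(r, x, u, sigma2, prior_mean, prior_var):
    """Negative log posterior (up to a constant) for the assumed model
    x[i] ~ N(u[i] * r, sigma2), r ~ N(prior_mean, prior_var)."""
    data_term = sum((xi - ui * r) ** 2 for xi, ui in zip(x, u)) / (2 * sigma2)
    prior_term = (r - prior_mean) ** 2 / (2 * prior_var)
    return data_term + prior_term

def latent_grad(r, x, u, sigma2, prior_mean, prior_var):
    """Gradient w.r.t. the latent r, expressed through prediction errors."""
    input_errors = [xi - ui * r for xi, ui in zip(x, u)]  # bottom-up errors
    prior_error = r - prior_mean                          # deviation from prior
    bottom_up = sum(ui * ei for ui, ei in zip(u, input_errors)) / sigma2
    return -bottom_up + prior_error / prior_var

def weight_grad(r, x, u, sigma2):
    """Gradient w.r.t. each weight u[i]: prediction error times the latent."""
    return [-(xi - ui * r) * r / sigma2 for xi, ui in zip(x, u)]

def map_by_gradient_descent(x, u, sigma2, prior_mean, prior_var,
                            lr=0.05, steps=200):
    """Find the MAP latent by descending the error-driven gradient."""
    r = prior_mean
    for _ in range(steps):
        r -= lr * latent_grad(r, x, u, sigma2, prior_mean, prior_var)
    return r

def map_closed_form(x, u, sigma2, prior_mean, prior_var):
    """Closed-form MAP estimate for the same linear Gaussian model."""
    precision = sum(ui * ui for ui in u) / sigma2 + 1.0 / prior_var
    weighted = sum(ui * xi for ui, xi in zip(u, x)) / sigma2
    return (weighted + prior_mean / prior_var) / precision

# Hypothetical data and parameters.
x = [1.0, 2.1, 2.9]
u = [1.0, 2.0, 3.0]
sigma2, prior_mean, prior_var = 1.0, 0.0, 4.0

r_gd = map_by_gradient_descent(x, u, sigma2, prior_mean, prior_var)
r_cf = map_closed_form(x, u, sigma2, prior_mean, prior_var)
print("gradient descent:", r_gd, "closed form:", r_cf)
```

In this sketch the prediction errors are sufficient to drive both inference and learning, but only because the model is linear and Gaussian. With a non-Gaussian likelihood the gradient would in general no longer be a simple function of `x[i] - u[i] * r`, which matches the caveat in the text.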
One advantage of predictive coding is that the brain can save neural representational resources by representing only the prediction error. However, this idea doesn’t account for the representation of the prediction itself: the brain must still spend neural resources to represent the prediction. The efficient coding framework [6] claims that neural representations maximize the mutual information between responses and stimuli under some constraints. At high signal-to-noise ratios, the efficient representation coincides with the representation learned from the predictive coding principle [7,8]. However, when the signal-to-noise ratio is low, efficient coding predicts that preserving the full stimulus input is more beneficial than preserving only the prediction error [7,8]. Thus the predictive coding motif is not always optimal under the efficient coding framework.
In summary, we agreed with the conclusions of this paper. More experiments need to be designed to offer conclusive answers about how the brain performs Bayesian inference. We also can’t rule out the possibility that the brain uses multiple representations. Unifying such possible representations in a more general theoretical framework would be a worthwhile goal for computational neuroscientists.
[1]. Aitchison L, Lengyel M. With or without you: predictive coding and Bayesian inference in the brain. Current Opinion in Neurobiology, 2017, 46: 219-227.
[2]. Marr D. Vision: a computational investigation into the human representation and processing of visual information. San Francisco: W.H. Freeman and Company, 1982.
[3]. Chalk M, Marre O, Tkacik G. Towards a unified theory of efficient, predictive and sparse coding. bioRxiv, 2017: 152660.
[4]. Bialek W, Van Steveninck R R D R, Tishby N. Efficient representation as a design principle for neural coding and computation. 2006 IEEE International Symposium on Information Theory. IEEE, 2006: 659-663.
[5]. Rao R P, Ballard D H. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 1999, 2: 79-87.
[6]. Barlow H. Possible principles underlying the transformations of sensory messages. In: Rosenblith W, editor. Sensory Communication. MIT Press, 1961: 217-234.
[7]. Atick J J, Redlich A N. Towards a theory of early visual processing. Neural Computation, 1990, 2(3): 308-320.
[8]. Srinivasan M V, Laughlin S B, Dubs A. Predictive coding: a fresh view of inhibition in the retina. Proceedings of the Royal Society of London B: Biological Sciences, 1982, 216(1205): 427-459.