Review of “Optimal policy for multi-alternative decisions”

Paper by Satohiro Tajima, Jan Drugowitsch, Nisheet Patel, and Alexandre Pouget. Nature Neuroscience, August 5th, 2019

Review by Nick Barendregt (CU, Boulder)

Summary

Organisms must develop robust and accurate strategies to make decisions in order to survive in complex environments. Recent studies have largely focused on value-based or perceptual decisions where observers must choose between two alternatives. However, many real-world situations require choosing between multiple options, and it is not clear if the strategies that are optimal for two-alternative tasks can be directly translated to multi-alternative tasks. To address this question, the authors use dynamic programming to find the optimal strategy for an n-alternative task. Using Bellman’s equation, the authors find that the belief thresholds at which the decision process is terminated are time-varying, non-linear functions. To understand how observers could implement such a complex stopping rule, the authors then develop a neural network model that approximates the optimal strategy. Using this network model, they show that several experimental observations, such as violations of the independence of irrelevant alternatives (IIA) principle, that had been thought to reflect suboptimal behavior can in fact be explained by the non-linearity of the network. The authors conclude by using their network model to generate testable hypotheses about observer behavior in multi-alternative decision tasks.

Optimal Decision Strategy

To find the optimal strategy for an n-alternative decision task, the authors assume the observer accumulates evidence to update their belief, and that the observer commits to a choice whenever their belief becomes strong enough; mathematically, this corresponds to the belief crossing a threshold. To find these thresholds, the authors assume that the observer sets them to maximize their reward rate: the average reward (less the average cost of accumulating evidence) per average trial length. These assumptions allow them to construct a utility, or value, function for the observer. At each timestep, the observer collects a new piece of evidence and uses it to update their belief. With this new belief, the observer calculates the utility associated with two classes of actions. The first class, which comprises n actions, is committing to a choice; this has utility equal to the reward for a correct choice times the probability their choice is correct (i.e., their belief in the choice being correct). The second class, which has a single action, is waiting and drawing a new observation; this has utility equal to the average future utility less some cost of obtaining a new observation. The utility function selects the maximum of these n+1 actions for the observer. The decision thresholds are then given by the belief values at which the maximal-utility action changes.
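
To make the recursion concrete, here is a minimal sketch of this backward-induction computation for a simplified two-alternative version of the task. It is my own illustration, not the authors' implementation: it uses a fixed horizon and a fixed cost per unit time rather than full reward-rate optimization, and a crude Gaussian kernel stands in for the belief-transition density; the parameter values (R, c, sigma) are arbitrary.

    import numpy as np

    # Backward induction on a discretized belief grid for a two-alternative stopping
    # problem (illustrative only). R: reward for a correct choice, c: cost per unit
    # time of accumulating evidence, sigma: width of the assumed transition kernel.
    R, c, dt, T = 1.0, 0.05, 0.1, 3.0
    sigma = 0.15
    g = np.linspace(0.01, 0.99, 199)            # belief that option 1 is correct

    # Transition kernel: probability of moving from belief g_i to belief g_j in one step.
    kernel = np.exp(-0.5 * ((g[:, None] - g[None, :]) / sigma) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)

    V = np.maximum(R * g, R * (1 - g))          # at the horizon the observer must choose
    thresholds = []
    for step in reversed(range(int(T / dt))):
        V_wait = kernel @ V - c * dt            # expected future utility, less sampling cost
        V_choose = np.maximum(R * g, R * (1 - g))
        V = np.maximum(V_choose, V_wait)        # best of the available actions
        # upper decision threshold: smallest belief > 0.5 at which choosing beats waiting
        decide = g[(g > 0.5) & (V_choose >= V_wait)]
        thresholds.append((round(step * dt, 2), decide.min() if decide.size else None))

    print(thresholds[::-1][:5])                 # thresholds early in the trial

Plotting the recovered threshold against time shows it collapsing toward a belief of 0.5 as the horizon approaches, the qualitative behavior discussed next.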

Using Bellman’s equations for the utility function, the authors find the decision thresholds are non-linear functions that evolve in time. From the form of these thresholds, the authors surmise that the belief-update process can be projected onto a lower-dimensional space, and that the thresholds collapse as time increases, reflecting the urgency the observer faces to make a choice and proceed to the next trial.

Neural Network Model

To see how observers might approximate this non-linear stopping rule, the authors construct a recurrent neural network that implements a sub-optimal version of the decision strategy. This network model has n neurons, one for each option, which track the belief associated with each option. The network also includes divisive normalization, which is widespread in the cortex, and an urgency signal, which increases the gain of the network as time passes. These two features allow the model to closely approximate the optimal stopping rule, and result in a model with a similar lower-dimensional projection and collapsing thresholds. When comparing their network model to a standard race model, the authors find that adding normalization and urgency improves performance in both value-based and perceptual tasks, with normalization having the larger impact.
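
As a rough illustration of these ingredients (and not a reproduction of the authors' network), the sketch below simulates a race model in which rectified accumulator activities are divisively normalized across options and scaled by a gain that grows linearly with time; the function name, parameter values, and the specific form of the urgency signal are assumptions made for illustration.

    import numpy as np

    def simulate_trial(drifts, dt=0.01, t_max=3.0, noise=1.0, bound=0.6,
                       sigma_h=0.5, urgency_slope=0.5, seed=0):
        """One trial of a race model with divisive normalization and an urgency signal."""
        rng = np.random.default_rng(seed)
        drifts = np.asarray(drifts, dtype=float)
        x = np.zeros_like(drifts)                   # accumulated evidence, one unit per option
        for step in range(1, int(t_max / dt) + 1):
            t = step * dt
            x += drifts * dt + noise * np.sqrt(dt) * rng.standard_normal(x.size)
            a = np.maximum(x, 0.0)                  # rectified accumulator activity
            r = (1.0 + urgency_slope * t) * a / (sigma_h + a.sum())  # normalization + urgency
            if r.max() >= bound:                    # first unit to reach the bound wins
                return int(np.argmax(r)), t
        return int(np.argmax(r)), t_max             # forced choice at the deadline

    # Example: a value-based trial with three options of decreasing mean value
    choice, rt = simulate_trial([1.0, 0.6, 0.2])
    print(choice, round(rt, 2))

Because the denominator contains the summed activity of all options, the effective evidence for any one option depends on the values of the others; this coupling is what produces the context effects described below.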

Results and Predictions

Using their neural network model, the authors are able to reproduce several well-established results, such as Hick’s law for response times, and to explain several behavioral and physiological findings in humans that have long been thought to be sub-optimal. First, because of the normalization, the model replicates violations of IIA, the principle that, in a choice between two high-value options, adding a third option of lower value should not influence the decision process. The normalization also replicates the similarity effect: when choosing between option 1 and option 2, adding a third option similar to option 1 decreases the probability of choosing option 1. The authors conclude that divisive normalization is the key explanation of these behaviors.

After validating their model by reproducing these previously observed results, the authors make predictions about observer behavior in multi-alternative tasks. The main prediction concerns the two types of strategies used for multi-alternative tasks: the “max-vs.-average” strategy and the “max-vs.-next” strategy. The model predicts that the reward distribution across choices should cause observers to smoothly transition between these two strategies, a prediction that could be tested in psychophysics experiments.

What is the dynamical regime of the cortex?

A review of a preprint by Y. Ahmadian and K. D. Miller

What is the dynamical regime of cortical networks? This question has been debated for as long as we have been able to measure cortical activity. The question itself can be interpreted in multiple ways, with the answer depending on the spatial and temporal scales of the activity, behavioral states, and other factors. Moreover, we can characterize dynamics in terms of dimensionality, correlations, oscillatory structure, or other features of neural activity.

In this review/comment, Y. Ahmadian and K. Miller consider the dynamics of single cells in cortical circuits, as characterized by multi-electrode and intracellular recording techniques. Numerous experiments of this type indicate that individual cells receive excitation and inhibition that are approximately balanced. As a result, activity is driven by fluctuations that cause the cell’s membrane potential to occasionally, and irregularly, cross a firing threshold. Moreover, this balance is not a result of fine tuning of excitatory and inhibitory weights, but is achieved dynamically.

There have been several theoretical approaches to explain the emergence of such balance. Perhaps the most influential of these theories was developed by C. van Vreeswijk and H. Sompolinsky in 1996. This theory of dynamic balance relies on the assumption that the number of excitatory and inhibitory inputs to a cell, K, is large, and that the strength of individual synaptic inputs scales like 1/\sqrt{K}. If external inputs to the network are strong, then under fairly general conditions activity is irregular and in a balanced regime: the average difference between the excitatory and inhibitory input to a cell is much smaller than either the excitatory or the inhibitory input itself. Ahmadian and Miller refer to this as the tightly balanced regime.
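
The consequences of this scaling are easy to see in a toy calculation (my own, not from the paper): with K excitatory and K inhibitory inputs whose strengths scale like 1/\sqrt{K}, the total excitatory and inhibitory drives each grow like \sqrt{K}, while their difference stays of order one when the presynaptic rates are matched, which is the cancellation a balanced network achieves dynamically.

    import numpy as np

    # Toy demonstration of the 1/sqrt(K) scaling (illustration only). Each of K
    # excitatory and K inhibitory presynaptic cells fires at roughly 5 Hz, and each
    # synapse has strength 1/sqrt(K). The total E and I drives grow like sqrt(K),
    # while the net input remains of order one when the rates are matched.
    rng = np.random.default_rng(1)
    for K in (100, 1_000, 10_000):
        J = 1.0 / np.sqrt(K)
        e_drive = J * rng.uniform(4, 6, K).sum()    # total excitatory input
        i_drive = J * rng.uniform(4, 6, K).sum()    # total inhibitory input
        print(f"K={K:>6}  E={e_drive:7.1f}  I={i_drive:7.1f}  net={e_drive - i_drive:+.2f}")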

In contrast, excitation and inhibition still cancel approximately in loosely balanced networks. However, in such networks the residual input is comparable to the excitatory input, and cancellation is thus not as tight. This definition is too broad, however, and the authors also assume that the net input (excitation minus inhibition) driving each neuron grows sublinearly as a function of the external input. As shown previously by the authors and others, such a state emerges when the number of inputs to each cell is not too large, and each cell’s firing rate grows superlinearly with input strength. Under these conditions a sufficiently strong input to the network evokes fast inhibition that loosely balances excitation to prevent runaway activity.

Loose and tight balance can occur in the same model network, but loose balance occurs at intermediate external inputs, while tight balance emerges at high external input levels. While the transition between the two regimes is not sharp, the network behaves very differently in the two regimes: A tightly balanced network responds linearly to its inputs, while the response of a loosely balanced network can be nonlinear. Moreover, external input can be of the same order as the total input for loosely balanced networks, but must be much larger than the total input (of the same order as excitation and inhibition on their own) for tightly balanced networks.

Which of these two regimes describes the state of the cortex? Tightness of balance is difficult to measure directly, as one cannot isolate excitatory and inhibitory inputs to the same cell simultaneously. However, the authors present a number of strong, indirect arguments in favor of loose balance, based on several experimental findings: 1) Recordings suggest that the ratio of the mean to the standard deviation of the excitatory input is not large enough to necessitate precise cancellation by inhibition. This would put the network in the loosely balanced regime. Moreover, excitatory currents alone are not too strong, being comparable to the voltage difference between the mean membrane potential and threshold. 2) Cooling and silencing studies suggest that external input, e.g. from thalamus, to local cortical networks is comparable to the net input. This is consistent with loose balance, as tight balance is characterized by strong external inputs. 3) Perhaps most importantly, cortical computations are nonlinear. Sublinear response summation and surround suppression, for instance, can be implemented by loosely balanced networks. However, classical tightly balanced networks exhibit linear responses, and thus cannot implement these computations. 4) Tightly balanced networks are uncorrelated, and do not exhibit the stimulus-modulated correlations observed in cortical networks.

These observations deserve a few comments: 1) The transition from tight to loose balance is gradual. It is therefore not entirely clear when, for instance, the mean excitatory input is strong enough to require tight cancellation. As the authors suggest, some cortical areas may therefore lean more towards tight balance, while others lean more towards loose balance. 2) It is unclear whether cooling reduces inputs to the cortical areas in question. 3 and 4) Classical tightly balanced networks are unstructured and are driven by uncorrelated inputs. Changes to these assumptions can result in networks that exhibit a richer dynamical repertoire, including spatiotemporally structured and correlated activity, as well as nonlinear computations.

Why does this debate matter? The dynamical regime of the cortex determines how a population of neurons transforms its inputs, and thus the computations that a network can perform. The questions of which computations the cortex performs, and how it does so, are therefore closely related to questions about its dynamics. However, at present our answers are somewhat limited. Most current theories ignore several features of cortical networks that may impact their dynamics: there is a great diversity of cell types, particularly inhibitory cells, each with its own dynamical and connectivity properties. It is likely that this diversity of cells shapes the dynamical state of the network in ways that we do not yet fully understand. Moreover, the distribution of synaptic weights, and the spatial arrangement of synaptic interactions across dendritic trees, are not accurately captured in most models. It is possible that these, and other, details are irrelevant, and that current theories of balance are robust. However, this is not yet completely clear. Thus, while the authors make a strong case that the cortex is loosely balanced, a definitive answer to this question lies in the future.

Thanks go to Robert Rosenbaum for his input on this post.

Inferring structural connectivity using Ising couplings in models of neuronal networks

Uncovering the structure of cortical networks is a fundamental goal of neuroscience. Knowing how neuronal circuits are organized could tell us, for instance, whether certain cell types connect preferentially to others, and the pattern of connections could help explain the observed patterns of activity [1]. However, probing synaptic connectivity directly using electrophysiological methods is difficult and expensive [2,3]. With the advent of new experimental techniques we can now record the concurrent activity of hundreds and even thousands of neurons in the cortex. It would be much simpler if we could infer the structure of cortical circuits directly from such recordings.

Inferring connectivity from activity is not a new idea: cross-correlation functions measure the average impact of one cell’s spike on the activity of another directly from their concurrently measured spike trains. The idea that cross-correlations can be used to infer synaptic interactions between cells goes back to at least 1970 [4]. However, this early work also recognized several difficulties with this approach: cross-correlations reflect common inputs to the cells and global patterns of activity in the observed population. Disentangling synaptic interactions from these other effects is difficult, especially if only a fraction of a population is observed.
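
For concreteness, here is a minimal sketch (my own, with illustrative names and parameters) of the basic computation: a histogram of the lags between the spikes of one cell and those of another, in which a narrow peak at a short positive lag is the classic signature of a putative synaptic interaction.

    import numpy as np

    def cross_correlogram(spikes_a, spikes_b, window=0.05, bin_size=0.001):
        """Histogram of spike-time lags of cell B relative to cell A (times in seconds)."""
        spikes_b = np.asarray(spikes_b)
        lags = []
        for t in np.asarray(spikes_a):
            nearby = spikes_b[(spikes_b >= t - window) & (spikes_b <= t + window)]
            lags.extend(nearby - t)
        bins = np.arange(-window, window + bin_size, bin_size)
        counts, _ = np.histogram(lags, bins=bins)
        return counts, bins

    # Example: Poisson-like spike trains in which cell B tends to fire ~2 ms after cell A
    rng = np.random.default_rng(0)
    spikes_a = np.sort(rng.uniform(0, 100, 500))                   # ~5 Hz over 100 s
    spikes_b = np.sort(np.concatenate([rng.uniform(0, 100, 400),
                                       spikes_a[:100] + 0.002]))   # some driven spikes
    counts, bins = cross_correlogram(spikes_a, spikes_b)
    print(bins[np.argmax(counts)])                                  # peak near +0.002 s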

Assuming that the activity of an entire population P has been observed, one approach to disentangling direct from indirect interactions between cells is to use partial correlations: the correlations between the residuals of the activities of two cells, A and B, that remain after regressing out the activities of all other cells, P - \{A,B\}. In other words, partial correlations are the correlations that remain between two neurons once their correlations with all other cells are removed. Again, this is an old idea [5], and other approaches have been proposed to tackle the problem: fitting Ising models, generalized linear models, and other types of models, and taking the inferred couplings as estimates of the synaptic interactions.
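
As an illustration (my own sketch, not from the paper), partial correlations for all pairs can be computed at once from the inverse of the covariance matrix, the precision matrix P: the partial correlation between cells i and j is -P_ij / sqrt(P_ii P_jj). As noted below, in practice this inversion typically requires regularization.

    import numpy as np

    def partial_correlations(activity):
        """Partial correlations from binned activity (rows: time bins, columns: neurons)."""
        cov = np.cov(activity, rowvar=False)
        prec = np.linalg.pinv(cov)                  # precision matrix (pseudo-inverse)
        d = np.sqrt(np.diag(prec))
        pcorr = -prec / np.outer(d, d)              # -P_ij / sqrt(P_ii * P_jj)
        np.fill_diagonal(pcorr, 1.0)
        return pcorr

    # Fake example: neuron 0 drives neuron 1; neuron 2 drives both 3 and 4, so that
    # neurons 3 and 4 are correlated through common input but not directly connected.
    rng = np.random.default_rng(0)
    activity = rng.standard_normal((2000, 5))
    activity[:, 1] += 0.8 * activity[:, 0]
    activity[:, 3] += 0.8 * activity[:, 2]
    activity[:, 4] += 0.8 * activity[:, 2]
    print(np.round(partial_correlations(activity), 2))

In this toy example the raw correlation between neurons 3 and 4 is substantial, but their partial correlation is close to zero once neuron 2 is accounted for.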

In our recent journal club we discussed a recent addition by Kadirvelu et al. [6] to this fairly extensive body of literature. Here the authors asked how well thresholded partial correlations and thresholded weights obtained from fitting an Ising model can represent synaptic connectivity. The authors first simulated networks of 11 to 120 Izhikevich neurons under varying conditions, changing the firing rates, connectivity structure, etc., of the network. They then tried to recover the connectivity using the two methods, and compared the results to the ground truth used in the simulations. Synaptic weights were deemed unimportant; instead, binary matrices with 0s and 1s signifying the absence or presence of an interaction, respectively, were compared. As partial correlations do not reveal the direction of an interaction, the ground-truth matrices were symmetrized before the comparison. The performance of each method was quantified by the area under the ROC curve obtained by varying the threshold. Low thresholds give more false positives, and high thresholds more false negatives. Thus, as the threshold is lowered, both the fraction of falsely identified synaptic connections (false positives, FP) and the fraction of correctly identified connections (true positives, TP) increase. The curve traced out by the false and true positive rates in FP-TP space is the ROC curve.
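
The evaluation itself is simple to sketch (again my own illustration of the procedure, not the authors' code): sweep a threshold over the absolute inferred coupling strengths, compare the resulting binary matrix to the symmetrized ground-truth adjacency matrix, and integrate the resulting ROC curve.

    import numpy as np

    def roc_auc(inferred, ground_truth):
        """Area under the ROC curve for thresholded couplings vs. a binary adjacency matrix."""
        mask = ~np.eye(ground_truth.shape[0], dtype=bool)     # off-diagonal entries only
        scores = np.abs(inferred[mask])
        labels = ground_truth[mask].astype(bool)
        tpr, fpr = [0.0], [0.0]
        for th in np.unique(scores)[::-1]:                    # threshold from high to low
            detected = scores >= th
            tpr.append(detected[labels].mean())               # true positive rate
            fpr.append(detected[~labels].mean())              # false positive rate
        tpr.append(1.0); fpr.append(1.0)
        return np.trapz(tpr, fpr)

    # Example: noisy "inferred" couplings for a random symmetric ground truth
    rng = np.random.default_rng(0)
    gt = np.triu((rng.random((20, 20)) < 0.2).astype(float), 1)
    gt = gt + gt.T
    inferred = gt + 0.8 * rng.standard_normal(gt.shape)
    inferred = (inferred + inferred.T) / 2
    print(round(roc_auc(inferred, gt), 2))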

The main conclusion of the paper is that the performance of the methods depends on the level of correlations: at low correlations, fitting an Ising model works better, while at high correlations the partial correlation method works better. Other observations were not unexpected: increasing the firing rates improved inference (as the number of “interactions”, i.e., spikes, increased), inference became harder as the number of neurons increased, etc.

One has to ask what testing these models using simulations can tell us: These settings are highly idealized, and miss many of the features one would encounter with real data. One of the main issues is that latent inputs are not accounted for. In this particular case, all correlations were due to synaptic interactions between model cells, and all cells were observed. Global fluctuations can also induce strong correlations [7], completely overshadowing the effects of direct interactions [1]. There are many other subtleties: the direct inversion of the correlation matrix to obtain partial correlations is problematic, and typically some regularization is required [8]. Moreover, thresholding of inferred interaction weights to try to distinguish real interactions from fluctuations is known to give inconsistent estimators of interactions.

So is the inference of interactions a futile exercise? With the present data, inferring synaptic interactions is likely to be unsuccessful in all but the simplest settings. However, robustly inferring the strength of interactions is still worthwhile, even if these only measure statistical dependencies, rather than structural connections. Changes in such effective connectivity may reflect computations or mental states, and are hypothesized to change under working memory load [9]. Moreover, the effective connectivity could be modulated much more quickly than synaptic connectivity. However, as to which method is best at robustly uncovering such effective connectivity, the article we discussed is silent.

1. Rosenbaum, R., Smith, M. A., Kohn, A., Rubin, J. E., & Doiron, B. (2016). The spatial structure of correlated neuronal variability. Nature Neuroscience, 20(1), 107–114.

2. Jiang, X., Shen, S., Cadwell, C. R., Berens, P., Sinz, F., Ecker, A. S., et al. (2015). Principles of connectivity among morphologically defined cell types in adult neocortex. Science, 350(6264).

3. Oswald, A.-M. M., & Reyes, A. D. (2008). Maturation of intrinsic and synaptic properties of layer 2/3 pyramidal neurons in mouse auditory cortex. Journal of Neurophysiology, 99(6), 2998–3008.

4. Moore, G. P., Segundo, J. P., Perkel, D. H., & Levitan, H. (1970). Statistical signs of synaptic interaction in neurons. Biophysical Journal, 10(9), 876–900.

5. Brillinger, D. R., Bryant, H. L., & Segundo, J. P. (1976). Identification of synaptic interactions. Biological Cybernetics, 22(4), 213–228.

6. Kadirvelu, B., Hayashi, Y., & Nasuto, S. J. (2017). Inferring structural connectivity using Ising couplings in models of neuronal networks. Scientific Reports, 7(1), 8156.

7. Ecker, A. S., Denfield, G. H., Bethge, M., & Tolias, A. S. (2015). On the structure of population activity under fluctuations in attentional state.

8. Yatsenko, D., Josić, K., Ecker, A. S., Froudarakis, E., Cotton, R. J., & Tolias, A. S. (2015). Improved estimation and interpretation of correlations in neural circuits. PLoS Computational Biology, 11(3), e1004083.

9. Pinotsis, D. A., Buschman, T. J., & Miller, E. K. (n.d.). Working Memory Load Modulates Neuronal Coupling.