Cortical Areas Interact through a Communication Subspace

JD Semedo, A Zandvakili, CK Machens, BM Yu, and A Kohn. Neuron, 2019.

Summary

How do populations of neurons in interconnected brain areas communicate? This work proposes the idea that different cortical areas interact through a communication subspace: a low-dimensional subspace of the source population activity fluctuations that is most predictive of the target population fluctuations. Further, this communication subspace is not aligned with the largest activity patterns in a source area. Importantly, the computational advantage of such a subspace is that it allows for selective and flexible routing of signals among cortical areas.

Approach used to study inter-areal interactions

Previous studies of inter-areal interactions in the brain have related the spiking activity of pairs of neurons in different brain areas, or local field potentials (LFPs) recorded in different areas. Such studies have provided insight into how interaction strength changes with stimulus drive, attentional state, or task demands. However, these methods do not explain how the population spiking activity in different areas is related on a trial-by-trial basis.

This work leverages trial-to-trial co-fluctuations of V1 and V2 neuronal population responses, recorded simultaneously in macaque monkeys, to understand population-level interactions between cortical areas.

Experiment details

The activity of neuronal populations in the output layers (2/3-4B) of V1 and in their primary downstream target, the middle layers of V2, was recorded in three anesthetized monkeys (Fig 1A in the paper). The recorded populations had retinotopically aligned receptive fields. The stimulus comprised drifting sinusoidal gratings at 8 different orientations, and the trial-to-trial fluctuations to repeated presentations of each grating were analyzed.

Source and target populations

The recorded V1 neurons were divided into source and target populations (Fig 1B). For each dataset, the target V1 population was drawn randomly from the full set of V1 neurons such that it matched the neuron count and firing rate distribution of the target V2 population. This matching procedure was repeated 25 times, each time using a different random subset of neurons, and results for each stimulus condition were averaged across these repeats.
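
To make the matching concrete, here is a minimal sketch of one plausible rate-matching draw (hypothetical; the paper's exact matching procedure may differ in its details):

    import numpy as np

    def match_target_population(v1_rates, v2_rates, rng):
        """Draw a target V1 population matching the V2 neuron count and
        firing rate distribution. Greedy sketch: for each V2 neuron (in
        random order), pick the unused V1 neuron with the closest rate."""
        available = list(range(len(v1_rates)))
        chosen = []
        for r in rng.permutation(v2_rates):
            i = min(available, key=lambda j: abs(v1_rates[j] - r))
            available.remove(i)
            chosen.append(i)
        return np.array(chosen)   # indices of the target V1 neurons

    rng = np.random.default_rng(0)
    v1_rates = rng.uniform(1, 30, size=100)   # hypothetical mean rates (spikes/s)
    v2_rates = rng.uniform(1, 30, size=25)
    target_v1 = match_target_population(v1_rates, v2_rates, rng)
    source_v1 = np.setdiff1d(np.arange(100), target_v1)
    # Repeat the random draw 25 times and average results across repeats.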

Results

Strength of population interactions

The V1-V2 interactions were first characterized by (i) measuring noise correlations and (ii) fitting a multivariate linear regression to assess how well the fluctuations of the source V1 population explained the variability of the target populations (Fig 2). Both analyses indicated that interactions between areas (source V1 – target V2) are of similar strength to those within a cortical area (source V1 – target V1).
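
A minimal sketch of these two measures on synthetic residuals (trial-to-trial fluctuations around the mean response; all data here are hypothetical placeholders):

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    # Hypothetical residual spike counts for one stimulus condition:
    # trials x neurons, with the mean response to the grating subtracted.
    resid_v1_source = rng.standard_normal((400, 60))
    resid_v2_target = rng.standard_normal((400, 20))

    # (i) Noise correlations: pairwise correlations of residuals across areas.
    n_src = resid_v1_source.shape[1]
    corr = np.corrcoef(resid_v1_source.T, resid_v2_target.T)[:n_src, n_src:]
    print("mean V1-V2 noise correlation:", corr.mean())

    # (ii) Multivariate linear regression: cross-validated R^2 of predicting
    # the target residuals from the source population.
    r2 = cross_val_score(LinearRegression(), resid_v1_source, resid_v2_target,
                         cv=10, scoring="r2")
    print("cross-validated R^2:", r2.mean())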

What about the structure of these interactions?

Consider predicting the activity of a V2 neuron from a population of three V1 neurons using linear regression. The regression weights define a direction in the three-dimensional V1 activity space: a regression dimension. In a basic multivariate regression model, each V2 neuron has its own regression dimension, and these dimensions could, in principle, fully span the V1 activity space. But what if they span only a subspace (Fig 3)?

If only a few dimensions of V1 activity are predictive of V2, then using a low dimensional subspace should achieve the same prediction performance as the full regression model.

Testing the existence of these subspaces

To test this hypothesis, the authors used linear regression with a rank constraint. Reduced rank regression achieved nearly the same performance as the full regression model. Further, the number of dimensions needed to account for V1-V2 interactions was smaller than the number involved in V1-V1 interactions (Fig 4).
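
Reduced rank regression has a simple closed form: fit ordinary least squares, then project its predictions onto their top principal components. A minimal sketch, assuming mean-centered trial-by-neuron matrices (all data synthetic):

    import numpy as np

    def reduced_rank_regression(X, Y, rank):
        """Least squares with a rank constraint on the weight matrix.
        X, Y: trials x neurons matrices, assumed mean-centered."""
        B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)    # full-rank fit
        _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
        V = Vt[:rank].T                                  # top target-space dims
        return B_ols @ V @ V.T                           # rank-constrained weights

    # Sweep over ranks: performance saturating at a small rank indicates a
    # low-dimensional interaction (few predictive dimensions).
    rng = np.random.default_rng(1)
    X = rng.standard_normal((400, 60))
    Y = X @ rng.standard_normal((60, 3)) @ rng.standard_normal((3, 20))  # rank 3
    for rank in (1, 2, 3, 5, 10):
        B = reduced_rank_regression(X, Y, rank)
        print(rank, 1 - ((Y - X @ B) ** 2).sum() / (Y ** 2).sum())  # R^2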

Are the V1-V2 interactions low-dimensional because the V2 population activity itself is lower-dimensional than that of the target V1? Factor analysis revealed that the dimensionality of the V2 activity was actually higher than that of the target V1 (Fig 5A).

To assess how the complexity of the target population influenced the dimensionality of the interactions, they also compared the number of predictive dimensions to the dimensionality of the target population activity (Fig 5B). For V1-V1 interactions, the number of predictive dimensions matched the target V1 dimensionality. In contrast, for V1-V2 interactions, the number of predictive dimensions was consistently lower than the target V2 dimensionality.
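
Dimensionality here is estimated with factor analysis, selecting the number of latent factors by cross-validated log-likelihood. A sketch of that model-selection step (the factor range and data are placeholders):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.model_selection import cross_val_score

    def fa_dimensionality(Y, max_dim=15, cv=5):
        """Dimensionality = factor count maximizing the cross-validated
        log-likelihood of a factor analysis model."""
        dims = range(1, max_dim + 1)
        ll = [cross_val_score(FactorAnalysis(n_components=d), Y, cv=cv).mean()
              for d in dims]
        return list(dims)[int(np.argmax(ll))]

    rng = np.random.default_rng(2)
    Z = rng.standard_normal((300, 4))                       # 4 latent factors
    Y = Z @ rng.standard_normal((4, 25)) + 0.5 * rng.standard_normal((300, 25))
    print(fa_dimensionality(Y))                             # ~4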

Based on these observations, the authors conclude that

  • V1-V1 interactions use as many predictive dimensions as the dimensionality of the target population allows,
  • whereas V1-V2 interactions are confined to a small subspace of the source V1 population activity.

The authors term this subspace the communication subspace. This low-dimensional interaction structure was also observed in simultaneous population recordings in V1 and V4 of awake monkeys (Fig S2).

Relationship to source population activity

By removing source population activity along the identified predictive dimensions and measuring the resulting loss in prediction performance, they showed that the V2 predictive dimensions are not aligned with the V1 predictive dimensions (Fig 6).
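
The removal step amounts to projecting the source activity onto the orthogonal complement of the predictive dimensions and asking how much prediction performance survives. A minimal sketch:

    import numpy as np

    def remove_dimensions(X, dims):
        """Remove source activity along the given source-space dimensions.

        X    : trials x source neurons, mean-centered
        dims : source neurons x k, columns spanning the dimensions to remove
        """
        Q, _ = np.linalg.qr(dims)        # orthonormal basis for the dimensions
        return X - X @ Q @ Q.T           # project onto the orthogonal complement

    # Example: after removal, no activity remains along the removed dimension.
    X = np.random.default_rng(4).standard_normal((100, 10))
    d = np.ones((10, 1))
    print(np.allclose(remove_dimensions(X, d) @ d, 0))   # True

    # If prediction of the V1 target survives removal of the V2 predictive
    # dimensions (and vice versa), the two sets of dimensions are not aligned.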

Next, using factor analysis, they identified the dimensions of largest shared fluctuations within the source V1 population. However, these dominant dimensions of the source V1 population are worse than the V2 predictive dimensions at predicting V2 fluctuations (Fig 7A).
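
A sketch of this comparison: find the dominant source dimensions with factor analysis, predict the target from them alone, and compare against reduced rank regression at the same dimensionality (for simplicity, the factor model here is fit on all trials; data are placeholders):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    def r2_from_dominant(X_src, Y_tgt, n_dims):
        """Cross-validated R^2 when the target is predicted only from the
        source's dominant (largest shared-variance) dimensions."""
        Z = FactorAnalysis(n_components=n_dims).fit_transform(X_src)
        return cross_val_score(LinearRegression(), Z, Y_tgt,
                               cv=10, scoring="r2").mean()

    rng = np.random.default_rng(5)
    X_src = rng.standard_normal((300, 40))
    Y_tgt = rng.standard_normal((300, 15))
    print(r2_from_dominant(X_src, Y_tgt, n_dims=5))
    # If reduced rank regression with 5 dimensions predicts better, the
    # predictive dimensions are not the dominant ones.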

Summary of results

The V1 predictive dimensions are aligned with the largest source V1 fluctuations (dominant dimensions, Fig 7B). In contrast, the V2 predictive dimensions are distinct and:

  • they are less numerous
  • they are not well aligned with the V1 predictive dimensions
  • nor are they well aligned with the V1 dominant dimensions 

The authors conclude by suggesting that the communication subspace is an advantageous design principle of inter-area communication in the brain. The ability of a source area to communicate only certain patterns while keeping others private could be a means of selective routing of signals between areas. This selective routing allows moment-to-moment modulation of interactions between cortical areas.

Comments from the journal club

  • An alternative to reduced rank linear regression would be to use canonical correlation analysis (CCA); see the sketch after this list.
  • What information is encoded in the different dimensions, both predictive and dominant? This should be easy to check.
  • The analyses here are entirely linear, but V2 neurons most likely perform nonlinear operations on inputs received from V1. The approach used here was to study local fluctuations around set points. The justification is that trial-to-trial variability around the mean response effectively acts as a local linear perturbation of the nonlinear transformation between V1 and V2.
  • All the analyses reveal subspaces of relatively low dimensionality. Might this be a consequence of the low-dimensional stimulus? Nonetheless, why would the (“noise”) fluctuations be low-dimensional even for a low-dimensional stimulus?
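
For the CCA suggestion in the first comment, a minimal sketch using scikit-learn (synthetic placeholder data); unlike reduced rank regression, CCA treats the two areas symmetrically and normalizes away within-area variance:

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(3)
    resid_v1 = rng.standard_normal((400, 60))   # trials x neurons, residuals
    resid_v2 = rng.standard_normal((400, 20))

    cca = CCA(n_components=5).fit(resid_v1, resid_v2)
    U, V = cca.transform(resid_v1, resid_v2)    # paired canonical variates
    # Correlation of each canonical pair: the strength of each shared dimension.
    print([np.corrcoef(U[:, k], V[:, k])[0, 1] for k in range(5)])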

Distilling Free-Form Natural Laws from Experimental Data

Schmidt, M., & Lipson, H. (2009). Science, 324(5923): 81-85.

Automating the Search for Natural Laws

Scientists have always been concerned with identifying the laws that govern natural phenomena. We now live in an age where we can collect massive amounts of experimental data from a wide range of systems – subatomic to astronomical. We are also blessed with constantly growing computational power. Can we use these resources to automate the search for the governing laws of any physical system? In this paper, Schmidt and Lipson present an approach based on symbolic regression to automate the search for natural laws from experimental data. Mathematical symmetries and invariants underlie nearly all physical phenomena, so the search for natural laws is invariably a search for conserved quantities or invariant equations. However, the most prohibitive obstacle to automating this process is finding meaningful invariants. This paper proposes a principle for identifying non-trivial invariants.

Symbolic regression

Several methods exist for modeling scientific data:

  • Fixed-form parametric models based on expert knowledge
  • Numerical models aimed at prediction, e.g. neural networks
  • Restricted model spaces using greedy search

The goal here, however, is to find unconstrained, free-form analytical expressions that capture symbolically precise conserved relations. This requires searching the space of both functions and parameters.

In this paper, symbolic regression, an evolutionary computation method, is used to search the space of mathematical functions. Initial mathematical expressions are constructed from basic building blocks: algebraic operators (+, −, ×), a basic set of analytical functions (e.g. sine and cosine), constants, and state variables. The search algorithm, based on genetic programming, forms new equations by recombining previous equations and probabilistically varying sub-expressions. Each candidate model is assigned a fitness score; models with high fitness are retained and unpromising ones are discarded. The algorithm terminates once the retained equations reach a desired level of fitness.
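
As an illustration of the ingredients such a search needs, here is a hypothetical minimal sketch of random expression trees, their evaluation, and subtree mutation (crossover and the derivative-based fitness, described next, are omitted):

    import math
    import random

    # Building blocks: binary operators, analytic functions, state variables.
    BIN_OPS = {'+': lambda a, b: a + b,
               '-': lambda a, b: a - b,
               '*': lambda a, b: a * b}
    UN_OPS = {'sin': math.sin, 'cos': math.cos}
    VARS = ['x', 'v']   # hypothetical state variables

    def random_expr(depth=3):
        """Grow a random expression tree, represented as nested tuples."""
        if depth == 0 or random.random() < 0.3:
            return random.choice(VARS + [round(random.uniform(-2, 2), 2)])
        if random.random() < 0.7:
            op = random.choice(list(BIN_OPS))
            return (op, random_expr(depth - 1), random_expr(depth - 1))
        op = random.choice(list(UN_OPS))
        return (op, random_expr(depth - 1))

    def evaluate(expr, env):
        """Evaluate a tree given a dict of state-variable values."""
        if isinstance(expr, tuple):
            op, *args = expr
            fn = BIN_OPS.get(op) or UN_OPS.get(op)
            return fn(*(evaluate(a, env) for a in args))
        return env.get(expr, expr)   # variable lookup, or a numeric constant

    def mutate(expr, depth=2):
        """Probabilistically vary a sub-expression: replace a random subtree."""
        if not isinstance(expr, tuple) or random.random() < 0.3:
            return random_expr(depth)
        op, *args = expr
        i = random.randrange(len(args))
        args[i] = mutate(args[i], depth)
        return (op, *args)

    expr = random_expr()
    print(expr, '->', evaluate(expr, {'x': 0.5, 'v': 1.0}))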

Identification of nontrivial relations

Rather than trying to model any specific signal, the goal here is to find any underlying physical law that the system obeys. The candidate equations should predict relationships between the dynamics of the components of the system. Specifically, the paper proposes a predictive ability criterion: candidate equations should predict relationships among derivatives of groups of variables over time. The fitness score used by the symbolic regression algorithm is thus a measure of the difference between partial derivatives obtained

  • symbolically from the candidate equations
  • numerically from the experimental data (a sketch of this fitness measure follows).
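
A sketch of this fitness measure for a two-variable system, using sympy for the symbolic side. Note that the sign of the symbolic derivative ratio is written here as derived in the Problems section below, which is the opposite of the paper's Eq. S2:

    import numpy as np
    import sympy as sp

    x, v = sp.symbols('x v')

    def predictive_fitness(expr, x_data, v_data, t):
        """Fitness of a candidate conserved quantity `expr` (a sympy
        expression in x and v): agreement between derivative pairings
        computed (i) numerically from data and (ii) symbolically."""
        # (i) numerically: dv/dx = (dv/dt) / (dx/dt) along the trajectory
        dvdx_num = np.gradient(v_data, t) / np.gradient(x_data, t)
        # (ii) symbolically: dv/dx = -(df/dx) / (df/dv) on the level set f = c
        ratio = -sp.diff(expr, x) / sp.diff(expr, v)
        dvdx_sym = sp.lambdify((x, v), ratio, 'numpy')(x_data, v_data)
        # Mean log-error, so occasional large mismatches do not dominate
        return -np.mean(np.log1p(np.abs(dvdx_num - dvdx_sym)))

    # Example: the exact oscillator energy scores better than a wrong guess.
    t = np.linspace(0.3, 2.8, 500)
    x_data, v_data = np.cos(t), -np.sin(t)    # unit-frequency oscillator
    print(predictive_fitness(x**2/2 + v**2/2, x_data, v_data, t))  # near 0
    print(predictive_fitness(x + v, x_data, v_data, t))            # much lower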

Further, instead of producing just one candidate, the algorithm outputs a short list of final candidate analytical expressions on the accuracy-complexity Pareto frontier.
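
Extracting that short list is a standard two-objective Pareto filter; a minimal sketch:

    def pareto_front(models):
        """models: iterable of (error, complexity, expression) triples.
        Return the models not dominated on both accuracy and parsimony."""
        front, best_err = [], float('inf')
        # Scan in order of increasing complexity; keep a model only if it
        # is strictly more accurate than every simpler model already kept.
        for err, cx, expr in sorted(models, key=lambda m: (m[1], m[0])):
            if err < best_err:
                front.append((err, cx, expr))
                best_err = err
        return front

    print(pareto_front([(0.9, 1, 'x'), (0.2, 5, 'x**2 + v**2'),
                        (0.5, 5, 'x*v'), (0.2, 9, 'sin(x) + v**2')]))
    # -> [(0.9, 1, 'x'), (0.2, 5, 'x**2 + v**2')]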

Results

The search algorithm with the partial-derivative-pairs criterion was applied to measurements from a few simple physical systems – air-track oscillators and double pendulums. One useful feature is that the type of law the algorithm finds can be steered by the choice of input data. For example, for a pendulum, if only position coordinates are provided as input, the algorithm converges to the equation of a circle; given position and velocity data, it converges on energy laws. The algorithm could extract Hamiltonians of the air-track oscillators and conservation of angular momentum laws for the double pendulum.

Caveats

Though the algorithm can present equations corresponding to physical laws in their mathematical form, the bulk constants in the expressions are not characterized. The authors propose a systematic approach to parsing these coefficients by analyzing multiple data sets from the same system, albeit with different configurations and parameters. They demonstrate this approach by using measurements from simulated air-track oscillators and pendulums.

The time to converge on the law equations grows exponentially with the complexity of the law, the dimensionality of the system, and the measurement noise. A key challenge is therefore scaling to higher complexity. One proposed solution is to use candidate equations obtained from simpler systems as initial seeds for more complex systems. This seeding does not constrain the equation search; it merely biases the search toward reusing terms from previously found laws.

Problems

In the course of our discussion of this paper, a couple of questions were raised. The primary doubt stems from the predictive ability criterion. The text describes a "candidate law equation f", whereas f is just an expression, not an equation. If one does turn it into a conservation equation, basic calculus gives the opposite answer to the reported predictive ability criterion. In particular, for a conservation law f(x,y) = c, any change in x must be accompanied by a change in y, and the result is exactly the negative of Eq. S2 in the supplement:

  • f(x,y) = c   # conservation law
  • \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0   # differentiate both sides w.r.t. x, chain rule
  • \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}   # solve for dy/dx

Equation S2 in the supplement states instead that 

  • \frac{dy}{dx} = +\frac{\partial f/\partial x}{\partial f/\partial y}   # note the opposite sign!
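
A quick numerical check of the sign, using the circle f(x, y) = x^2 + y^2 as a toy conserved quantity:

    import numpy as np

    # Points on the level set f = 1, away from dy/dx singularities.
    t = np.linspace(0.4, 1.2, 400)
    x, y = np.cos(t), np.sin(t)
    dydx_num = np.gradient(y, x)              # dy/dx along the curve
    dfdx, dfdy = 2 * x, 2 * y                 # partial derivatives of f
    print(np.abs(dydx_num - (-dfdx / dfdy)).max())  # small: minus sign matches
    print(np.abs(dydx_num - (+dfdx / dfdy)).max())  # large: plus sign does not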

The proposed algorithm is designed to identify nontrivial conservation laws. However, the authors also claim that this algorithm can be used to identify equations such as Lagrangian equations, which summarize the system dynamics but are not invariant. It is not quite clear how these Lagrangians could be obtained.

When we contacted the corresponding author about these problems, he replied that we should read the supplement, and mentioned that absolute values on the implicit derivatives produce Lagrangians. However, these suggestions did not resolve our questions. Any further clarifications from the authors or scientific community would be most welcome. There is room for commenting below.

Applications to Neuroscience?

This paper presents an approach to identify analytical, human-interpretable governing laws from experimental data without any prior knowledge of the system, and demonstrates its success on simple, low-dimensional physical systems. For us, the pertinent question is: can this method be scaled to neural data? We can now simultaneously record the activity of tens of thousands of neurons in the brain. Neural data have many sources of noise and vary across experimental subjects and systems. Further, the governing principles of the brain (if indeed the brain has any) probably describe the dynamics of what the neurons represent. Perhaps with the right kind of fitness score (one that also accounts for identifying the appropriate neural representations) and sufficient computational power, the symbolic regression approach could extract some underlying computational principles of the brain.