
The mathematics behind ANCCR

The ANCCR model (Jeong et al., 2022), short for adjusted net contingency for causal relations, proposes that mesolimbic dopamine release conveys an adjusted net contingency for causal relations (to biologically meaningful targets). The mathematics (and logic) behind the model go well beyond what I can cover here, but for now it suffices to say that the model:

  • Uses a “Hebbian” mechanism to learn retrospective associations after experiencing a meaningful causal target.
  • Derives prospective associations using Bayes’s rule.
  • Combines those associations into contingency terms that represent dopaminergic activity.
  • Uses the sign of dopaminergic activity to strengthen or weaken causal weights.
  • Responds as a function of prospective associations and causal links.

1 - Maintaining stimulus representations

The degree to which a stimulus $i$ at time $t$ is “active” in memory is denoted by:

$$E_i(t) = \sum_{t_i \leq t} e^{-\left(\frac{t-t_i}{t\_constant}\right)} \tag{Eq.1}$$

where $t_i$ are the times, up to time $t$, at which stimulus $i$ occurred, and $t\_constant$ is a time constant (usually meant to be the inter-reward rate).¹
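As a rough illustration of Eq. 1, the sketch below sums one decaying exponential per past occurrence of a stimulus. It assumes that the $t_i$ are the occurrence times of stimulus $i$; the function name and the example values are hypothetical and are not taken from the calmr or MATLAB code.

```python
import math


def eligibility_trace(onset_times, t, t_constant):
    """Eq. 1: sum of decaying traces for every onset at or before time t."""
    return sum(
        math.exp(-(t - t_i) / t_constant)
        for t_i in onset_times
        if t_i <= t  # onsets after t do not contribute
    )


# Hypothetical example: the stimulus occurred at t = 0 and t = 5; query at t = 10.
trace = eligibility_trace([0.0, 5.0], t=10.0, t_constant=5.0)
# trace equals exp(-2) + exp(-1): older onsets contribute less.
```

A larger `t_constant` makes each occurrence linger longer in memory, so the trace stays high across longer gaps.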

2 - Learning stimulus associations

The model learns retrospective associations after meaningful causal targets occur. Whether event $j$ is a meaningful causal target is given by:

$$\Phi_j = \begin{cases} 1, & \text{if } \Phi_j(t) = 1\\ 1, & \text{if } DA_j + \beta_j > \theta\\ 0, & \text{otherwise} \end{cases} \tag{Eq.2}$$

where $\Phi$ plays the role of an indicator function, $DA_j$ is the total dopamine activity at the time of event $j$, $\beta_j$ is the unconditioned value of event $j$, and $\theta$ is a global threshold parameter.² Note that the indicator function is self-preserving: once a stimulus becomes a meaningful causal target, it does not stop being so.
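The self-preserving behaviour of Eq. 2 can be sketched as a small function that takes the previous indicator value as input. The function name, argument names, and example values are hypothetical.

```python
def is_meaningful_target(was_target, da, beta, theta):
    """Eq. 2: return 1 if event j is, or already was, a meaningful causal target."""
    if was_target == 1:
        return 1  # self-preserving: once a target, always a target
    return 1 if da + beta > theta else 0


# Hypothetical dopamine values across three occurrences of event j.
phi = 0
history = []
for da in [0.1, 0.9, 0.0]:
    phi = is_meaningful_target(phi, da, beta=0.0, theta=0.5)
    history.append(phi)
# The indicator stays at 1 on the third occurrence even though da is below theta.
```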

After stimulus $j$ is observed, the predecessor representation contingency, or PRC, for each stimulus $i$ is updated via:

$$PRC_{i \leftarrow j} = M_{i \leftarrow j} - M_{i} \tag{Eq.3}$$

where $M_{i \leftarrow j}$ is the predecessor representation of $i$ given that $j$ has occurred, and $M_{i}$ is the base rate with which $i$ occurs. Both of these quantities are given by:

$$M_{i \leftarrow j} = M_{i \leftarrow j}' + \Phi_j\alpha(E_{i \leftarrow j} - M_{i \leftarrow j}') \tag{Eq.4a}$$

and

$$M_{i} = M_{i}' + k\alpha(E_{i} - M_{i}') \tag{Eq.4b}$$

where $M_{i \leftarrow j}'$ and $M_{i}'$ are the quantities before $j$ was observed, $k$ and $\alpha$ are learning rate parameters, and $E_{i \leftarrow j}$ is the eligibility trace of stimulus $i$ at the time $j$ occurs (see Eq. 1).
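A minimal sketch of one update step for Eqs. 3, 4a and 4b follows. It treats all quantities as scalars for a single pair of stimuli, and it assumes that $E_i$ in Eq. 4b is the current eligibility trace of $i$; the function name and the example values are hypothetical.

```python
def update_prc(m_ij_prev, m_i_prev, e_ij, e_i, phi_j, alpha, k):
    """One update of the predecessor representation and its contingency."""
    # Eq. 4a: only updated when j is a meaningful causal target (phi_j = 1).
    m_ij = m_ij_prev + phi_j * alpha * (e_ij - m_ij_prev)
    # Eq. 4b: base rate of i, learned at rate k * alpha.
    m_i = m_i_prev + k * alpha * (e_i - m_i_prev)
    # Eq. 3: contingency is the predecessor representation minus the base rate.
    prc = m_ij - m_i
    return m_ij, m_i, prc


# Hypothetical values: first update from zero-initialised associations.
m_ij, m_i, prc = update_prc(
    m_ij_prev=0.0, m_i_prev=0.0, e_ij=1.0, e_i=0.5, phi_j=1, alpha=0.1, k=0.5
)
```

A positive PRC indicates that $i$ is more active right before $j$ than it is on average.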

Then, the PRC can be used to derive the prospective association, aptly named the successor representation contingency, or SRC, via Bayes's rule:

$$SRC_{i \rightarrow j} = PRC_{i \leftarrow j} \frac{M_j}{M_i} \tag{Eq.5}$$

The base rate for $j$, $M_j$, is calculated via Eq. 4b.
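Eq. 5 is a single rescaling of the PRC by the ratio of the two base rates, which the following sketch makes explicit. The guard against a zero base rate is an assumption added for safety and is not part of the equation; the names and values are hypothetical.

```python
def src_from_prc(prc_ij, m_i, m_j):
    """Eq. 5: convert a retrospective contingency into a prospective one."""
    if m_i == 0:
        return 0.0  # assumption: no estimate while the base rate of i is zero
    return prc_ij * m_j / m_i


# Hypothetical values: i is four times as frequent as j.
src = src_from_prc(prc_ij=0.2, m_i=0.4, m_j=0.1)
```

Because $i$ is common relative to $j$ in this example, the prospective contingency is smaller than the retrospective one.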

3 - Releasing Dopamine

The model postulates that dopaminergic signaling encodes the adjusted net contingencies for causal relations between stimuli, or ANCCRs. The total dopaminergic activity at the time of event $i$ is equal to:

$$DA_i = \sum_j (ANCCR_{i \rightarrow j}\Phi_j) \tag{Eq.6}$$

And the ANCCR from stimulus $i$ to stimulus $j$ is given by:

$$ANCCR_{i \rightarrow j} = NC_{i \leftrightarrow j} CW_{i \rightarrow j} - \sum_{k \neq i}(ANCCR_{k \leftrightarrow j}\Delta_{k \leftarrow i}\Phi_{k \leftrightarrow i}) \tag{Eq.7}$$

where $NC_{i \leftrightarrow j}$ is the net contingency between stimuli $i$ and $j$, $CW_{i \rightarrow j}$ is the causal weight that $i$ has with $j$, $\Delta_{k \leftarrow i}$ is the recency of stimulus $k$ with respect to stimulus $i$, and $\Phi_{k \leftrightarrow i}$ is an indicator function denoting whether $k$ and $i$ have a putative causal relationship with each other.
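The sketch below evaluates Eq. 7 for a single pair and then Eq. 6 for a single event. It assumes that the ANCCR values of the other stimuli $k$ are already available as plain numbers; the function names, the tuple layout, and the example values are hypothetical.

```python
def anccr(nc_ij, cw_ij, competitors):
    """Eq. 7 for one pair (i, j).

    competitors holds one (anccr_kj, recency_ki, linked_ki) tuple
    for every other stimulus k != i.
    """
    adjustment = sum(a * d * p for a, d, p in competitors)
    return nc_ij * cw_ij - adjustment


def dopamine(anccr_row, phi):
    """Eq. 6: sum ANCCR from i to each j, gated by the indicator for j."""
    return sum(a * p for a, p in zip(anccr_row, phi))


# Hypothetical values: one causally linked competitor and one unlinked one.
anccr_ij = anccr(nc_ij=0.5, cw_ij=1.0, competitors=[(0.2, 0.5, 1), (0.4, 0.9, 0)])
# Only the first target counts as a meaningful causal target here.
da_i = dopamine([anccr_ij, 0.3], phi=[1, 0])
```

Only the linked competitor reduces the ANCCR; the unlinked one is zeroed out by its indicator.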

The net contingency between stimuli $i$ and $j$, $NC_{i \leftrightarrow j}$, is given by:

$$NC_{i \leftrightarrow j} = wSRC_{i \rightarrow j} + (1-w)PRC_{i \leftarrow j} \tag{Eq.8}$$

or a weighted sum of successor and predecessor representation contingencies.
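A short sketch of Eq. 8 shows how $w$ trades off the two contingencies. The function name and the example values are hypothetical.

```python
def net_contingency(src_ij, prc_ij, w):
    """Eq. 8: weighted sum of prospective (SRC) and retrospective (PRC) terms."""
    return w * src_ij + (1.0 - w) * prc_ij


# Hypothetical contingencies for one pair of stimuli.
nc_balanced = net_contingency(src_ij=0.05, prc_ij=0.2, w=0.5)
nc_prospective = net_contingency(src_ij=0.05, prc_ij=0.2, w=1.0)   # SRC only
nc_retrospective = net_contingency(src_ij=0.05, prc_ij=0.2, w=0.0)  # PRC only
```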

The net contingency is used to calculate the indicator function above, as:

$$\Phi_{k \leftrightarrow i} = \begin{cases} 1, & \text{if } NC_{k \leftrightarrow i} > \theta\\ 0, & \text{otherwise} \end{cases} \tag{Eq.9}$$

where $\theta$ is the same threshold parameter used in Eq. 2,³ and the indicator function for a stimulus and itself, $\Phi_{i \leftrightarrow i}$, is 0.
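The thresholding in Eq. 9 can be sketched over a small matrix of net contingencies. The sketch assumes the condition is evaluated on the net contingency between the same pair $k$ and $i$ that the indicator refers to; the function name and the example matrix are hypothetical.

```python
def causal_link_indicator(nc, theta):
    """Eq. 9: nc[k][i] is the net contingency between stimuli k and i."""
    n = len(nc)
    return [
        # A stimulus never has a putative causal link with itself.
        [1 if k != i and nc[k][i] > theta else 0 for i in range(n)]
        for k in range(n)
    ]


# Hypothetical net contingencies for two stimuli.
links = causal_link_indicator([[0.9, 0.6], [0.1, 0.9]], theta=0.5)
```

The diagonal entries are 0 regardless of their net contingency, matching the rule for a stimulus and itself.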

The recency term, $\Delta_{k \leftarrow i}$, depends on the time elapsed between stimulus $k$ (at $t_k$) and stimulus $i$ (at $t_i$), and is given by:

$$\Delta_{k \leftarrow i} = e^{-\left(\frac{t_i-t_k}{t\_constant}\right)} \tag{Eq.10}$$

where $t\_constant$ is the same parameter used in Eq. 1. Note, however, that Eq. 10 does not include the sum term in Eq. 1. Finally, the causal weight from stimulus $i$ to stimulus $j$ is given by:

$$CW_{i \rightarrow j} = CW_{i \rightarrow j}' + \alpha_{reward}\delta_{i \rightarrow j} \tag{Eq.11}$$

where $CW_{i \rightarrow j}'$ is the previous causal weight, $\alpha_{reward}$ is a learning rate parameter exclusive for causal weights, and $\delta_{i \rightarrow j}$ is a delta term depending on the sign of the total dopaminergic activity, given by:

$$\delta_{i \rightarrow j} = \begin{cases} CW_{j \rightarrow j} - CW_{i \rightarrow j}, & \text{if } DA_j \ge 0\\ (0-CW_{i \rightarrow j})\frac{n_i^{-1}\Delta_{i \leftarrow j} \Phi_{i \leftrightarrow j}}{\sum_{k \neq j}(n_k^{-1}\Delta_{k \leftarrow j} \Phi_{k \leftrightarrow j})}, & \text{otherwise} \end{cases} \tag{Eq.12}$$

where $CW_{j \rightarrow j}$ above is the reward magnitude of stimulus $j$. In plain words, when dopaminergic activity is positive, causal weights (from all present and absent stimuli) strengthen. Conversely, when dopaminergic activity is negative, causal weights (from all present and absent stimuli) weaken, proportional to their normalized frequency and recency (as long as they have putative causal relations with $j$).
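The two branches of Eq. 12 and the update in Eq. 11 can be sketched as follows, with one list entry per candidate stimulus other than $j$. The text does not define $n_i$ explicitly, so it is taken here to be a per-stimulus occurrence count; that reading, the zero-denominator guard, and all names and values are assumptions.

```python
def causal_weight_deltas(cw_to_j, reward_magnitude, da_j, n, recency, linked):
    """Eq. 12: one delta per candidate stimulus i != j."""
    if da_j >= 0:
        # Positive dopamine: every weight moves toward the reward magnitude.
        return [reward_magnitude - cw for cw in cw_to_j]
    # Negative dopamine: share the decrement by inverse count, recency and link.
    shares = [d * p / n_i for n_i, d, p in zip(n, recency, linked)]
    total = sum(shares)
    if total == 0:
        return [0.0] * len(cw_to_j)  # assumption: no linked stimuli, no change
    return [(0.0 - cw) * s / total for cw, s in zip(cw_to_j, shares)]


def update_causal_weights(cw_to_j, deltas, alpha_reward):
    """Eq. 11: move each causal weight along its delta."""
    return [cw + alpha_reward * d for cw, d in zip(cw_to_j, deltas)]


# Hypothetical example with two candidate stimuli.
up = causal_weight_deltas([0.2, 0.0], 1.0, da_j=0.3, n=[1, 2], recency=[1.0, 0.5], linked=[1, 1])
cw = update_causal_weights([0.2, 0.0], up, alpha_reward=0.5)
down = causal_weight_deltas(cw, 1.0, da_j=-0.2, n=[1, 2], recency=[1.0, 0.5], linked=[1, 1])
cw_after = update_causal_weights(cw, down, alpha_reward=0.5)
```

In the negative branch, the first stimulus carries most of the decrement because it is both rarer and more recent than the second.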

4 - Generating responses

Responding in ANCCR is lightly specified. The value of responding upon presentation of stimulus $i$ is given by:

$$Q_i = \sum_k (SRC_{i \rightarrow k} CW_{i \rightarrow k}) \tag{Eq.13}$$

which can then be mapped onto probabilities via a softmax function.⁴
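As an illustration of Eq. 13 followed by a softmax, the sketch below computes response values for two stimuli and converts them into probabilities. The `temperature` argument is an assumption, since the text does not specify the exact form of the softmax; the names and values are hypothetical.

```python
import math


def response_value(src_row, cw_row):
    """Eq. 13: sum over targets k of SRC from i to k times CW from i to k."""
    return sum(s * c for s, c in zip(src_row, cw_row))


def softmax(values, temperature=1.0):
    """Map values onto probabilities; subtracting the max avoids overflow."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: stimulus A predicts a rewarded target, stimulus B does not.
q_a = response_value([0.5, 0.1], [1.0, 0.0])
q_b = response_value([0.0, 0.0], [1.0, 0.0])
probs = softmax([q_a, q_b])
```

A lower temperature would make responding more deterministic in favour of the stimulus with the higher value.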

A diagram

The diagram below shows the dependencies in the model. I am excluding the indicator functions and parameters for simplicity.⁵

Note

The implementation of this model is a port from the MATLAB code that Jeong et al. shared in the GitHub repository associated with their paper. The output of the R model was checked against the outputs of the MATLAB model, using training routines (“eventlogs” in their parlance) generated using their MATLAB code. The training routines generated in calmr differ somewhat, to accommodate generality. For example, as of version 0.6.1, it is not possible to specify probabilistic relations between cues and rewards. Instead, it is left to the user to specify an exact probability via trial numbers (e.g., an 80% reward probability can be specified as “80A>(US)/20A”). The naming of parameters also differs between codebases.

References
Jeong, H., Taylor, A., Floeder, J. R., Lohmann, M., Mihalas, S., Wu, B., Zhou, M., Burke, D. A., & Namboodiri, V. M. K. (2022). Mesolimbic dopamine release conveys causal associations. Science (New York, N.Y.), 378, eabq6740. https://doi.org/10.1126/science.abq6740