Sometimes a system stabilizes and traps itself in a specific pattern, even though foresight or hindsight (or advice, or fantasy) would have it replace that pattern with something better. I tend to call this clenching, but we could also call it over-canalization.
Earlier in the series, we asked: How do we see from outside? How do we unclench?
Now we can also ask: How can we escape a steep valley? How can we flatten its walls and re-sculpt the landscape? How can we undo over-canalization?
How can I smooth out the wrinkles from the fabric of my mind?
Every valley shall be exalted
And every mountain and hill brought low;
The crooked places shall be made straight
And the rough places smooth
- Isaiah 40:4 (NKJV)
We’ve seen that Hebbian reinforcement strengthens the contracts between neurons, and that this contracts the paths of their firing activity in state space. It’s oh so convenient that the different everyday meanings of “contract” align with the different views we’ve taken on this phenomenon. This is not a coincidence.
(By the way, I’m going to start contracting Hebbian Reinforcement down to HR, though sadly I’m obligated not to joke about it.)
Another way to say that HR is contracting is that it’s entropy-reducing.
How uncertain or variable are the neural firing patterns? That’s their entropy. A group of neurons act with lower entropy when they fire more predictably or redundantly. And the influences between the neurons are lower entropy when each influence is either very strong or very weak, rather than one of the many moderate values in between. HR causes influences to be either strong or weak, and neural activity to be more redundant and predictable, so as a mechanism it is entropy-reducing.1
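To make that concrete, here’s a minimal sketch (my own toy illustration, not anything from the canal paper): estimate the Shannon entropy of a population’s firing by treating each timestep’s pattern as one symbol. Redundant populations score low; variable ones score high.

```python
import numpy as np

def pattern_entropy(spikes):
    """Shannon entropy (bits) of the distribution of population firing
    patterns. `spikes` is a (timesteps, neurons) binary array."""
    # Treat each timestep's population pattern as one symbol.
    patterns, counts = np.unique(spikes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
redundant = np.tile(rng.integers(0, 2, (1, 8)), (1000, 1))  # same pattern every step
variable = rng.integers(0, 2, (1000, 8))                    # independent coin flips

print(pattern_entropy(redundant))  # 0 bits: perfectly predictable
print(pattern_entropy(variable))   # close to the 8-bit maximum
```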
I know you must be as excited as I am to have even more words to choose from to say almost the same thing. And you should be excited, because our new words will make it easier to make the most thrilling of physical analogies!
Consider a liquid that freezes into a solid crystal when we lower the temperature enough. The molecules of the material settle and bond into a lattice with an extremely regular structure. If we increase the temperature again, the molecules start vibrating into wilder configurations, until suddenly they break free of the lattice completely and return altogether to the liquid state.
The atoms of a metal also prefer (thanks) to contract into a kind of crystal. Say I have a piece of cold, solid iron that’s in this crystal state. Its atoms2 are arranged in a regular lattice, full of planes that can slide past each other, or be cut apart.3 The metal is relatively soft and ductile.
If I work the cold iron by hammering it repeatedly, I can knock its atoms out of their regular places. This hardens it, because the crystal lattice has become a tangled mess, full of little tensions set at odds with one another. And as long as the metal remains cold, the atoms are too chilled to untangle themselves. Nothing can easily slide past anything else. Hardened iron is a more suitable material for the blade of a sword or the beams of a building, though it’s also harder to rework—and more brittle.
Now heat up the hardened iron for a little while. The atoms gain enough energy to vibrate out of their stuck positions. Then slowly4 cool the metal so its atoms have time to settle into their favorite, untangled crystal lattice. This process of gradual heating-then-cooling to relax stresses and reform structure is called annealing.5
Importantly, we don’t need to fully melt the metal to anneal it. When the metal is cold and solid, it’s very ordered—predictably stuck in whichever pattern it already had, each unmoving part held fast by all the unmoving parts around it. When the metal is a hot liquid, it’s extremely disordered: it’s not even pretending to be a lattice, and its atoms are influencing each other much more incoherently. Near the boundary between these two extremes, there’s a sweet spot of entropy—where the coherence between the atoms has relaxed, but not to the point of total molten chaos.
This point is called criticality and it’s where the potential for the system to reorganize itself is at its greatest—where the heat-driven forces of entropy are at their most constructive. Different parts of the system come into energetic contact, without totally coming apart.
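The canal paper (quoted later in this post) reaches for the Ising model for exactly this picture. As a toy illustration—again mine, not theirs—here’s a Metropolis simulation of a 2D Ising lattice; sweep the temperature past the critical value (about 2.27 in these units) and coherent order gives way to molten disorder:

```python
import numpy as np

def metropolis_sweep(spins, T, rng):
    """One Metropolis sweep of a 2D Ising lattice at temperature T."""
    n = spins.shape[0]
    for _ in range(n * n):
        i, j = rng.integers(0, n, 2)
        # Energy change if we flip spin (i, j), with periodic boundaries.
        nbrs = spins[(i + 1) % n, j] + spins[(i - 1) % n, j] \
             + spins[i, (j + 1) % n] + spins[i, (j - 1) % n]
        dE = 2 * spins[i, j] * nbrs
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i, j] *= -1

rng = np.random.default_rng(0)
spins = np.ones((32, 32), dtype=int)       # start fully ordered ("cold")
for T in [1.5, 2.27, 3.5]:                 # below, near, and above criticality
    s = spins.copy()
    for _ in range(300):
        metropolis_sweep(s, T, rng)
    print(T, abs(s.mean()))                # magnetization: ~1 ordered, ~0 molten
```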
The child learns to believe a host of things. i.e. it learns to act according to these beliefs. Bit by bit there forms a system of what is believed, and in that system some things stand unshakeably fast and some are more or less liable to shift. What stands fast does so, not because it is intrinsically obvious or convincing; it is rather held fast by what lies around it.
- Ludwig Wittgenstein, On Certainty (144)
Back to our original questions: Say I’m trapped, flowing through some deep valleys, caught in some tangled mess of coping mechanisms and outdated adaptations. My brain’s parts are stuck, playing out the same patterns against each other. Nothing can move smoothly past anything else. Maybe it feels like I can’t learn anything anymore.6 Working the system makes it even messier. And it’s brittle—if it’s put under a very large and sudden stress, it tends to break rather than flex.
How can I clean up the mess, and pay off some of my technical debt? How can I unstick the patterns in my brain?
By analogy to annealing, we want to heat the system until we reach a high-energy state.7 In this state, if the analogy holds, the brain’s parts will have the chance to reorganize their patterns of activity—hopefully into something more untangled, parsimonious, or adaptive. As this continues and as the system cools down again, the new patterns should “crystallize” into the influences between the parts of the network, so that it can reproduce those patterns more easily even when it’s chilling.
So far, this analogy is pretty sloppy. How do we “increase the temperature” of the brain? When we reach a high-energy state, what allows the brain’s influences to relax, if they start off contracted? And what are the actual things I might do, to make this happen?
We don’t want to anneal the brain into a near-perfect crystal, the way some simple substances might settle if we heated and cooled them gradually enough. A perfect crystal would be an entirely redundant thing—it would “forget” all the history of how it had been previously worked. It would lose all its “memory”. Total amnesia really isn’t our goal.
But that isn’t a huge concern, since our analogy isn’t an exact fit for the brain. To see this more clearly, let’s return to the frame of predictive coding. A simplified view of predictive coding says something like this: to survive—or to achieve any goal, really—your brain needs to predict what your senses will say.8
Suppose the brain tries to cancel out upcoming sensory signals—say, from your eyes or your skin—by sending down prediction signals to meet them. When the two fail to cancel out, the uncancelled remnant is a prediction error which proceeds “upward” and excites the neurons directly responsible for sending down the bad prediction.
Extending this, we might treat the whole brain as a kind of prediction hierarchy. Each brain area B is excited by unresolved prediction errors from a “lower” area A, which are combined with predictions from a “higher” area C, and the result is another prediction error which B passes up to C—and so on, and so on. So prediction errors enter as bare sensory data, percolate up the hierarchy, and at each step there’s another renegotiation, another opportunity to suppress my senses and chill out.9
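Here’s a cartoon of that hierarchy in code—purely illustrative, in the Rao–Ballard style, where each level tries to predict the activity of the level below, and whatever it fails to cancel excites the level above:

```python
import numpy as np

# A toy two-level prediction hierarchy (sensory -> A -> B). Illustrative only.
rng = np.random.default_rng(0)
W = [rng.normal(0, 0.5, (16, 8)),   # area A's downward predictions of the senses
     rng.normal(0, 0.5, (8, 4))]    # area B's downward predictions of area A
state = [np.zeros(8), np.zeros(4)]  # activity of areas A and B

sensory = rng.normal(size=16)       # bare sensory data enters at the bottom
print("initial error:", np.linalg.norm(sensory))

for t in range(200):
    targets = [sensory, state[0]]             # what each prediction must cancel
    for k in range(2):
        err = targets[k] - W[k] @ state[k]    # uncancelled remnant, sent "upward"
        state[k] += 0.1 * W[k].T @ err        # excite the units that predicted badly
    # ...and the downward stream: A is also nudged toward B's prediction of it.
    state[0] += 0.1 * (W[1] @ state[1] - state[0])

print("settled error:", np.linalg.norm(sensory - W[0] @ state[0]))
```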
In this way, predictive coding gives us a way to think about “heating” the brain: we should excite it with sensory details it won’t suppress as predictable or boring.
Easy enough? Not really. Brains love to “predict away” entire worlds of sensory detail by wielding an all-purpose hammer like “just chilling—good vibes only! There’s no way that’s a real tiger, right?”.
Let’s make our annealing analogy a little crisper now, given that we’re treating the brain as a bunch of neurons and their firing rates:
Compared to relatively simple crystals made of metal atoms, the brain is a ridiculously complex web of cells. Each individual cell is far more complex than a single atom.
Neurons are of course made of matter—of many atoms joined together. But while the atoms in a lump of iron join into an ordered structure because of simple metallic bonds between immediate neighbours,10 the various “bonds” or influences that neurons have with each other are complex, layered, and sometimes distant. A neuron can be excitatory or inhibitory; it can grow an axon toward other neurons; an axon can join onto another neuron at many possible positions on its surface, forming a synapse; a synapse can be strengthened or weakened in a variety of ways.11
Iron atoms in a high-energy state are jittering around in their literal positions. Neurons are in a higher-energy state when they are excited—when their firing rates are higher and more variable. That means increased movement not so much in their literal positions as in the state space of their firing activity.12
Whether iron is in a high-energy state depends on how much we literally heat it. Whether some neurons are pushed into a high-energy state depends on the strength of the signals reaching their part of the web, unsuppressed.
In iron, the movement of atoms is directly related to the adjustment or rearrangement of the bonds between them.13 In a neural network, the activity of neurons relates to the influences (aka connectivity) between them more indirectly, in complex and various ways.
A network can quickly be driven into an unusual pattern of activity, but some influences cannot change quickly—and some change more quickly than others. It’s faster to boost the signal at a given synapse, than to grow a new synapse, or a new axon. For example: my neural activity might be more chaotic for a few hours while I’m in a high-energy state, and yet the bulk of my long-term memories tend to be secure.14
Importantly, there’s a difference between 1) activity patterns that are supported by the existing influences between neurons, and which are easy for the network to produce even at lower-energy states, and 2) activity that the system may be driven into by large signals coming from outside.

A bit of iron probably isn’t going to heat itself. On the other hand, the brain is a living system that apparently maintains itself kind of close to criticality even when it’s chilled out, for those sweet computational benefits. So an everyday low-energy state might only be slightly below criticality, and entering a high-energy state might mean flipping to a state a little bit closer to (or a little above) criticality. This is fine, because small shifts can cause complex changes in the behaviour of a complex system.
OK. So neurons are excited into high-energy states when their predictions are inappropriate and fail to suppress the sensory signals which excite them. The increased entropy of these states means the network reorganizes into patterns of activity it previously would have had difficulty reaching. And some aspects of those patterns may be good to keep, especially if they would have better predicted the signal that drove the system into the high-energy state.
Keeping a pattern means forming a habit—re-contracting the influences between neurons in the network, so it can reproduce the pattern even when it’s chilling.15 But how can we make sure the existing influences can actually be altered, and don’t just clench their way through the avalanches of activity?
We can’t say for sure it won’t happen. But neural influences do relax sometimes, or we would never be able to learn our way out of mental traps. Thankfully it’s plausible that short-sighted neurons can locally detect the fast, erratic high-energy firing of their neighbours and themselves, and in response they may have a mechanism that relaxes their existing influences with their neighbours, to make room for the reinforcement of new patterns.16
Heterosynaptic plasticity is an example of such a mechanism. While HR specifically strengthens a synaptic connection from one neuron to the next, heterosynaptic plasticity non-specifically strengthens all the inputs to a neuron. This re-balances the neuron’s inputs—reducing the relative influence that any other single neuron’s activity has on it.
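A toy numerical sketch of that interplay (my own illustration of the general idea, not the model from Chistiakova et al.): HR specifically boosts one active synapse, the heterosynaptic step non-specifically boosts them all, and the winner’s relative share stays capped.

```python
import numpy as np

def hebbian_step(w, pre, post, lr=0.5):
    """HR: specifically strengthen the synapses whose presynaptic
    partner fired together with the postsynaptic neuron."""
    return w + lr * pre * post

def heterosynaptic_step(w, boost=0.2):
    """Heterosynaptic plasticity (as described above): non-specifically
    strengthen ALL inputs, diluting the relative influence of any one."""
    return w + boost

w = np.full(4, 0.25)                  # four inputs, equal influence
pre = np.array([1.0, 0.0, 0.0, 0.0])  # only input 0 fires...
post = 1.0                            # ...and the neuron fires with it

for _ in range(10):
    w = hebbian_step(w, pre, post)
    w = heterosynaptic_step(w)

print(w / w.sum())  # input 0 leads, but the other inputs keep a voice
```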
The authors of the canal paper refer to this kind of high-energy-induced rebalancing as “Temperature or Entropy Mediated Plasticity” (TEMP):
Our model states that an effective intervention for psychopathology should, through an acute action analogous to an increase in system temperature or entropy — as per the Ising model (Suzuki et al., 2007; Ruffini et al., 2022) (see also (Singleton et al., 2022a)) — trigger a downstream sub- or post-acute effect that is analogous to a rebalancing or recalibration of synaptic weights, i.e., a counteraction to canalization.
a transient increase in TEMP within a system should help rebalance [flatten] its global state-space (Singleton et al., 2022b; Daws et al., 2022)
Heat into activity, cool onto influences.
It might be appropriate to make a distinction between driving a single region of a brain into a high-energy state, versus driving the whole brain into one.
A small part of the brain might be annealed on its own, to achieve a more useful model of something specific. For example, maybe a specific part of my motor cortex becomes excited by a specific prediction error when I fail to land a specific throw, resulting in a bit of local annealing that might improve my game.
Still, my entire brain needs to coordinate with my body to maintain the behavioural context in which it’s even meaningful to anneal on specific errors. Why am I throwing something right now, and why should my motor cortex be updating on whether that throw failed or not? What game am I even playing? If a bunch of little brain regions are updating themselves separately all the time, what happens if they go out of sync? All the actors might get better at playing their own parts, while together they gradually lose the plot. So it’s probably useful for the whole brain to enter a high-energy state sometimes, where all the regions can reform together.
Doesn’t it seem risky for my whole brain to enter a high-energy state? Sure, it could be a short-term risk for robustness: as long as I’m in such a state, my brain probably isn’t able to coordinate as stably around any single pattern, including the pattern that happens to be the most effective one for my goals in the moment.
However, my goals are also contracted structures. Being biased against entering high-energy states means insisting that all my preferences and skills are already sufficiently coordinated into the structures they ought to have. Well… if my goals and skills are already optimal, then at best the reformation of influences in my brain will leave me performing no better than before. But it’s also a pretty big risk to assume that that’s the case, and that my everyday levels of brain criticality will be hot enough to deal with any entrenched or concealed structural issues.
At the level of behaviour, there’s a tradeoff between the risk of relaxing and the risk of contracting17, and I suspect we are biased toward contracting:
Notice that once you feel threatened, everything is more likely to be perceived as a threat, including a person trying to convince you that you shouldn’t feel threatened.
For almost the entire history of our evolution, mortal threats could appear often and without warning. When they did, there would be no more time to learn, only to vigilantly execute whichever skills were already available. Why should I relax my habits and goals for a few hours? Think of my children! Think of the tigers! I’m already pretty good at stabbing and running and hiding.
Intelligence is a universal solvent, capable of inventing contexts in which tigers really are irrelevant. When was the last time you had to stab something? But intelligence is a relative latecomer: short-sighted hiding and running and stabbing were the intellectual peak of threat-handling for hundreds of millions of years. That was the context in which evolution laid the foundations of our brain’s architecture—and evolution cannot backtrack on its creations, only elaborate on them, however exquisite those elaborations may be.18
So we’re prone to a low-level clenchiness, a flinchy short-sightedness which can corrupt our intelligence. On the other hand, given the chance, we can moderate clenchiness by making wise decisions like “I should probably anneal a bit more than my Tiger World inductive biases have led me to appreciate”. Our goals and skills are more complex than ever, and more in need of careful and repeated revision. How do you know you’re doing as well as you could? Evolution is merciless, self-deception is easy, and the only tigers in this city are in the zoo.
Corruption or no, I shouldn’t be surprised when most of the time, my adult brain wields its powers of suppression to keep itself in a relatively low-energy state, where it can coordinate more robustly by leaning into the patterns it already knows, while suppressing “irrelevant” details.19 Over time though, circumstances may build and build to the realization that things could improve, and that progress isn’t being made by incrementally adjusting the individual parts while otherwise maintaining the status quo. Then it’s probably time to risk a temporary hit to robustness and experience a whole-brain annealing event: let the parts disintegrate and reform!
Michael Edward Johnson has been writing about annealing and brains for a while, and he’s made some deeper connections than I’ll make with this post.
One thing I want to highlight is his discussion of Selen Atasoy’s studies of brain harmonics. It’s new to me, but I’ll make a first pass connecting Atasoy’s work to what we’ve already discussed.20
First, here’s a new analogy: when we fix a guitar string at both ends and tension it, it tends to oscillate at certain natural frequencies, also called harmonics. This is because when each point on the string moves up and down, it influences each of its neighbouring points on the string to do the same. As long as they remain physically connected, neighbouring points must move together and oscillate at similar frequencies. The overall result is that when the string is excited, the entire thing “locks on” to a particular resonant state or mode where the whole string oscillates as one. The frequencies at which this happens are determined by the influences physically holding the string together, and their geometry. In the case of a guitar string, that includes the material properties of steel or nylon, the thickness and length of the string between the tensioning points, and the tension the string is under (i.e. how much we turn the tuning peg).
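For a concrete anchor: an ideal string’s harmonics follow f_n = (n/2L)·√(T/μ), so all three knobs—length L, tension T, and material (via the linear density μ)—show up directly. A quick sketch with illustrative, made-up values:

```python
import math

L = 0.648    # scale length in metres (a common guitar value)
T = 100.0    # string tension in newtons (illustrative)
mu = 0.006   # mass per unit length in kg/m (illustrative)

for n in range(1, 5):
    f = (n / (2 * L)) * math.sqrt(T / mu)  # ideal-string harmonic series
    print(f"harmonic {n}: {f:.1f} Hz")
```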
Again, a brain is more complex than this—but let’s speculate. Imagine a brain as a kind of complex resonator. Which modes will it resonate at? That depends on the influences between its parts, which we can also frame as the connectivity of the paths along which neural activity travels and loops and reverberates.
Harmonics may arise within a brain region due to the local influences, but also across the whole brain due to the larger-scale influences across all its regions, sometimes called the connectome. Atasoy’s theory is called Connectome-Specific Harmonic Waves (CSHWs); it’s a way to predict the large-scale resonant harmonics of the brain based on MRI measurements of the connectome’s structure:
[CSHW is] a method for applying harmonic analysis to the brain: basically, it uses various forms of brain imaging to infer what the brain’s natural resonant frequencies (eigenmodes) are, and how much energy each of these frequencies have. The core workflow is three steps: first combine MRI and DTI to approximate a brain’s connectome, then with an empirically-derived wave propagation equation calculate what the natural harmonics are of this connectome, then estimate which power distribution between these harmonics would most accurately reconstruct the observed fMRI activity.
- Michael Edward Johnson
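At heart, the “calculate the natural harmonics” step is an eigendecomposition. Here’s a toy version on a made-up connectome (a simple ring of regions), with a graph Laplacian standing in for Atasoy’s empirically-derived wave propagation equation:

```python
import numpy as np

n = 20
A = np.zeros((n, n))
for i in range(n):                    # toy "connectome": a ring of regions
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

L = np.diag(A.sum(axis=1)) - A        # graph Laplacian of the connectome
eigvals, eigvecs = np.linalg.eigh(L)  # eigenmodes = the "harmonics"

# Low eigenvalues <-> smooth, large-scale modes spanning the whole graph;
# high eigenvalues <-> spatially fine, local-looking modes.
print(eigvals[:4])    # near-zero: the slowest, most global harmonics
print(eigvecs[:, 1])  # first non-trivial mode varies slowly around the ring
```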
Johnson proposes a continuum of harmonics, across scales of the brain. Smaller-scale harmonics—which he categorizes as region-specific harmonic waves (RSHWs)—should have higher frequencies, be confined to local regions, and correspond to (sub-)behavioural particulars. Larger-scale harmonics (Atasoy’s CSHWs) should have lower frequencies, pass through many regions, and reflect behavioural coordination or summary states.
I find “CSHWs” and “RSHWs” a little disorienting to look at, so for the rest of this post I’m going to say BHs for big/brain harmonics and LHs for little/local harmonics.
LHs are high-frequency harmonics that are confined to local brain regions which deal in particulars. It makes sense that they are confined; we should expect different particulars not to interfere with each other by crossing their local boundaries, otherwise our brains might have trouble keeping track of particulars. (It does strike me as redundant to say that LHs are locally confined when we’ve already said that they are high-frequency, since the geometry of confinement is precisely what determines the frequency of resonance.)
How can all these isolated LH particulars be coordinated into coherent behaviour? That should happen at the scale of the BHs, which being lower frequency have a more coherent identity across the whole brain. They travel widely, and should be able to interact with the many local LHs. This should work both ways. If the right LHs start resonating differently after being struck by some new and surprising details, their voices might be able to resynchronize the BHs and produce an overall shift in emotions or behaviour. And an especially loud BH may be able to forcefully synchronize all the LHs and quickly suppress “irrelevant” details.
Johnson associates BHs with emotional states. This seems natural, if we treat emotions as a kind of high-level judgment the brain makes about everything it’s doing at the moment. We should expect such judgments to integrate many details from across the brain, and also to influence the processes that deal in those details. BHs are one plausible way in which this could happen.21
This framing is kind of nice, because “the brain maintains a status quo by suppressing apparent irrelevancies” can be bridged to something a little more intuitive or psychological: a game of “good vibes only” is always playing out in the synchronizations and suppressions of harmonics:
we doctor our [harmonic modes] *all the time*—when a nice sensation enters our awareness, we reflexively try to ‘grab’ it and stabilize the resonance; when something unpleasant comes in, we try to push away and deaden the resonance
- Michael Edward Johnson
Johnson makes an interesting connection to trauma and depression. In the event that some LHs become seriously miscalibrated with respect to the particulars they intend to model, it may make sense to turn down the BHs so that miscalibration (misinformation?) cannot spread across the brain like a harmonic infection and cause a “cascading system failure”. The result? The now-quieter BHs aren’t enough to coordinate the LHs, which fall more and more strongly into their own distinct routines, and in a vicious circle this makes it more difficult for a stronger BH to arise again and coordinate them. (Sound familiar?) And so sometimes the outcome is a muted cacophony of unreconciled horrors, bound by history, full of little tensions set at odds with one another.
Brain-wide annealing can be framed as relaxing and energizing the harmonics, so that the whole system can reharmonize. It’s like introducing Whoopi Goldberg to a bunch of nuns. And given that the brain is usually trying to maintain a status quo, we might expect annealing to happen locally to the LHs before it happens globally to the BHs.
Insofar as partitioning is possible in a broadly-coupled harmonic system, these [free-energy increasing] perturbations [that would induce annealing] will tend to be ‘local’ as the brain has strong incentives to preserve structure that doesn’t need updating.
- Michael Edward Johnson
And what are the actual things I can do, to make this happen?
Finally, let’s review a few methods we might expect to induce annealing. (We might also call them treatments to relieve clenching.) Keep in mind that these aren’t recommendations—the circumstances of your own life will determine the best course, of course.
I’ve largely based my discussion on insights I’ve obtained, again, from Michael Edward Johnson. Before we begin, I want to refresh by sharing his convenient summary of the neural annealing process.
1. First, energy (neural excitation, e.g. Free Energy from prediction errors) builds up in the brain, either gradually or suddenly, collecting disproportionately in the brain’s natural eigenmodes [i.e. resonant harmonics];
2. This build-up of energy (rate of neural firing) crosses a metastability threshold and the brain enters a high-energy state, causing entropic disintegration (weakening previously ‘sticky’ attractors);
3. The brain’s neurons self-organize into new multi-scale equilibria (attractors), aka implicit assumptions about reality’s structure and value weightings, which given present information should generate lower levels of prediction error than previous models (this is implicitly both a resynchronization of internal predictive models with the environment, and a minimization of dissonance in connectome-specific harmonic waves);
4. The brain ‘cools’ (neural activity levels slowly return to normal), and parts of the new self-organized patterns remain and become part of the brain’s normal activity landscape;
5. The cycle repeats, as the brain’s models become outdated and prediction errors start to build up again.
Here, “self-organize into new multi-scale equilibria” refers to the fancy things that happen near criticality, the sweet spot between cold coherence and hot incoherence.
“Crosses a metastability threshold” refers to the movement away from the status quo, which the brain commits to when it flips into a high-energy state.
In particular, leading from step 3 into step 4 is where we expect mechanisms such as heterosynaptic plasticity to be activated so that new patterns can be reinforced in the influences within the network, to remain accessible outside of high-energy states.
Here’s a cute version:
“Heat” the brain, until
it flips into a high energy state, in which
activity reorganizes into new patterns, and after a while
things cool down again, and winning patterns contract into influences.
Repeat as appropriate.
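For comparison, here’s the textbook loop of simulated annealing—the optimization algorithm that borrowed its name from the metallurgy—on a made-up toy landscape. Hot early steps explore wildly; cooling locks in whatever good pattern was found:

```python
import math, random

def anneal(energy, neighbour, x, T=2.0, cooling=0.95, steps_per_T=200):
    """Textbook simulated annealing: explore wildly while hot, settle while cool."""
    best = x
    while T > 0.01:
        for _ in range(steps_per_T):
            cand = neighbour(x)
            dE = energy(cand) - energy(x)
            # Accept improvements always; accept regressions with a
            # probability that shrinks as the system cools.
            if dE <= 0 or random.random() < math.exp(-dE / T):
                x = cand
                if energy(x) < energy(best):
                    best = x
        T *= cooling  # cool down, repeat
    return best

# Toy landscape with many valleys; annealing can hop out of the shallow ones.
energy = lambda x: 0.1 * x * x + math.sin(5 * x)
neighbour = lambda x: x + random.uniform(-0.5, 0.5)
random.seed(0)
print(anneal(energy, neighbour, x=8.0))  # ends near a deep valley, not the start
```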
How do we initiate this cascade from step 1?
Meditation
Earlier in the series we discussed some meditative practices. Now our hypothesis is simple: when applied effectively, meditation pumps excitatory energy into the brain’s networks, potentially inducing high-energy states and reorganization events.22
Excitatory energy? This might seem to contradict the popular view of meditation as tranquil or something. But consider what it means, here: to excite our brain networks, we either need to stop our top-down predictive models from suppressing incoming sensory signals, or we need to boost the intensity of those signals. Or both. This rather lines up with the distinction between insight and concentration practices:
Insight practices, such as mindfulness, sort of look like removing the habitual top-down suppression (or “judgment”) of “irrelevant” perceptions, allowing them to rise up freely. Note that this should increase the variety (i.e. entropy) of the states we end up reaching.
Concentration practices look like focusing all our attention on a single point of perception, greatly boosting its signal.
Outside of meditative practice, it’s totally possible for your brain to arrive in a high-energy state because of large, surprising signals related to your children, or your job, or an illness. High-energy states induced in such ways are like loaded questions: they do induce freedom of reorganization… but the system is forced to reform around something specific. Existing influences relax, only to forcibly contract on whatever meaningful and inevitable pattern is being driven into the network at that time. Sometimes it’s a pattern we want, or will be happy we found, which is great. But sometimes we’d prefer to clean up the mess of meaning we already have, before taking on some more.
What’s special about meditation is that it isn’t necessarily about anything. Insight practice specifically avoids giving privileges to any particular thought. And while concentration practice might seem to privilege some object of attention or other (e.g. the breath), consider that the end result of focusing attention to a truly single point is the collapse of all informative contrasts and associations. Because of this, we should expect these meditative practices to be very useful for cleaning up messes without forcing them to be replaced with something else.
Johnson has a term for practices that pump in “content-free” energy: semantically-neutral annealing.
“Semantically neutral energy” refers to neural activity which is not strongly associated with any specific cognitive or emotional process. […] usually energy build-up is limited: once a perturbation of the system neatly falls into a pattern recognized by the brain’s predictive hierarchy, the neural activity propagating this pattern is dissipated.
[…] effortful attention on excitatory bottom-up sense-data and attenuation of inhibitory top-down predictive models will naturally lead to a build-up of this ‘non-semantic’ energy in the brain
Psychedelics
Why are serotonergic psychedelics increasingly studied for the treatment of mental illness? Because they might be effective. Why should they be effective? The hypothesis is, again, that they can drive the brain into a high-energy state.
Meditation adds energy to the system by changing how we actively reflect, attend, and (dis)inhibit our mental processes, which are either identical to or interdependent with our brain states (and thus influence them). On the other hand, if psychedelics add energy, it’s by directly intervening in the mechanisms of the brain: by turning a kind of chemical tuning knob, they cause some types of neurons to be more easily excited, so that you’ll go high-energy no matter your prior mental state. And the rebalancing mechanisms presumably still get activated—a high-energy state is a high-energy state, to a short-sighted neuron.
From the canal paper:
We are intrigued to speculate that heterosynaptic plasticity (Chistiakova et al., 2014)—seemingly the type of plasticity induced by the synaptogenic effect of psychedelics (Ly et al., 2018; Shao et al., 2021)—occurs downstream of an initial increase in the entropy of on-going (e.g., cortical) neuronal ensembles
The authors do acknowledge that this speculative mechanism might not be the only one, that the connection between increased entropy/energy and plasticity mechanisms is not known for sure, and anyway that things are probably pretty complicated.
When viewed through a lens of simulated annealing, other neuronal phenomena such as increased: asynchronous glutamate release (Aghajanian and Marek, 1999), expression of neural activity markers (Gresch et al., 2002), complexity of spontaneous oscillatory activity (Schartner et al., 2017), entropy of connectivity motifs (Tagliazucchi et al., 2014), high-frequency harmonics (Atasoy et al., 2017), near (Toker et al., 2022) or supercritical dynamics (Ruffini et al., 2022), sensitivity to perturbation (Jobst et al., 2021), and global integration (Tagliazucchi et al., 2016) can all be linked to either an early, upstream temperature or entropy increase (Schartner et al., 2017) or a later, downstream synaptic reweighting effect e.g., as implied by preclinical data (Shao et al., 2021; Daws et al., 2022; Moda-Sava et al., 2019; Raval et al., 2021).
This process may depend on how the acute entropic action relates to increased post-synaptic gain in deep-layer pyramidal neurons translating into an enhanced sensitivity to bottom-up signaling or prediction error (Carhart-Harris and Friston, 2019)
Two of the canalization authors (Carhart-Harris and Friston) published the REBUS model23 of psychedelics in 2019, which—along with adopting the annealing analogy—explained the action of psychedelics in terms of a reduction in precision-weighting on priors, which means the same thing as flattening the energy landscape.24 With respect to psychedelics, the main contribution of the canal paper was to give the REBUS model a more dynamical (and perhaps more intuitive) frame—in terms of shallower channels in an energy landscape, rather than decreased pointiness of prior distributions.
Exercise
There seems to be plenty of evidence that exercise is good for mental illness. It could lead to rebalancing in at least a couple of ways:
Rhythmic or textural changes in perception can be exciting, but are often relatively semantically neutral.25 For example, running on uneven ground creates rhythmic variations which are not very relevant to higher-order predictions, but should not be entirely suppressed if we want to keep from stumbling.
As we move through a changing and unfamiliar environment, our sensory fields are filled with shifting textures and motions. As I begin my run, maybe my brain is wearing an inhibitory mask that filters out many of the details; but after an hour or two of attrition, the neutral energy might start to creep in.

Let’s walk, or bike, or hike a while away
until the world, long pouring through our eyes
unfills our minds—and as recedes the day,
our joy flows in much fuller; much more sound
our words ignite; our dreams grow more profound—
and with the sun but brighter will we rise.
Exercise could induce specialized mechanisms outside the nervous system that ultimately lead to neural rebalancing. For example, if during evolution it was typical for exploratory behaviour to be selected-for following increased use of the muscles, then following exercise we might expect to observe our muscles sending signals to the brain to activate rebalancing or neurotrophic mechanisms which are adaptive for exploration.
More generally, it may be possible to activate neural rebalancing mechanisms without first inducing a high-energy state directly in the brain.
There you have it: hot atoms, cool lattices, rebalancing acts, upward-downward negotiations, suppression signals, brain resonances, and clenching treatments. Quite enough for one post, no?
Next time, we’ll dig just a little deeper into some potential problems with all of this.
A stone is soft as wax,—tribunes more hard than stones;
A stone is silent, and offendeth not,
And tribunes with their tongues doom men to death.
- Titus Andronicus, 3.1
This might seem dubious, since HR does allow the system to acquire new patterns which might be more complex than old ones. But HR isn’t the reason the patterns first appear in the network. It’s merely what contracts the network’s influences onto patterns that are present.
Also, if you’re wondering “doesn’t entropy always increase?”: the brain is not a closed system. My body spends chemical energy and increases the overall entropy of the universe, so it can reduce its internal entropy.
Atoms are not indivisible parts with coherent identities, of course. But for the sake of our current explanation, it’s OK that we pretend they are, for a moment.
A metallic crystal is different from a crystal of (say) quartz, or ice. When metal atoms bond, their outermost electrons remain free to interact or shift. This is why metals can be 1) such good conductors, and 2) malleable and ductile.
A quartz crystal is also regular and has many planes through it, but the rigidity of its bonds means there’s a much higher barrier to movement along those planes: quartz is hard and brittle.
If I cool it off quickly, such as by plunging it in water, the result is not so soft. Even though by heating the metal I did cause its atoms to escape their prior, relatively static disorder, the hot states they enter are dynamically disordered. Suddenly cooling the system traps the atoms in whatever state they happen to be in at the moment, without letting them settle into their preferred equilibrium.
Metal alloys (most often steel) have a much more complex state space than I’ve implied here—but still it’s tiny compared to a brain’s.
Or maybe it feels like nothing at all. The patterns I’ve clenched on can appear quite convincing, while I continue to clench on them. Cope can be blinded by even more cope.
An analogy from machine learning is that we want to increase the learning rate.
Actions are predictions. Moving my arm towards an apple and grabbing it is like making my prediction come true that I’ve grabbed the apple. The emergence of goals and behaviours is what really gives meaning to “predicting your senses”, though scientists still only have glimpses of how all the brain’s parts coordinate themselves at that level.
For the moment, we’re focusing just on the basics of the game of sensory prediction.
If we zoom in on certain parts of the brain, such as the early part of the ventral stream of the visual cortex, a layered prediction hierarchy is apparent enough. If we refocus on other areas, such as the prefrontal cortex, things are much more of a mess.
Along a hierarchy, it’s appealing to expect progressively “higher” layers to deal with progressively more abstract predictions. The “low” layers of my ventral visual stream tend to be excited by unexpected edges or lines in images; the “higher” layers, by unexpected shapes or objects. Predictive coding suggests that when the object-layer has already concluded that a bird is flying leftward across my field of view, it will send down predictions to the edges-layer: a bunch of bird-forming edges should be appearing lefter-and-lefter. So when the bird moves suddenly rightward, this creates prediction errors in the edges-layer which are sent up to the object-layer—which changes its mind about where the “entire bird” is really heading.
Of course, we don’t actually predict everything that falls on our retinas, and most edges and shapes are actually unexpected.
We could manipulate the local defects in the lattice to encode a ton of information, and use a lump of iron as a storage medium. But this is not the same thing as the lump of iron being alive and running on complex mechanisms of its own.
And there are other concerns. Neurons’ actions aren’t necessarily limited to firing off all-or-nothing signals to other neurons, nor are neurons necessarily the only type of cell that participates in making predictions. Let’s leave those concerns aside, for the moment.
Though neurons may wiggle around slightly as they fire, due to changing charge distributions, electromagnetic forces, and vascular contractions.
There are some other aspects of atomic state, such as magnetic dipole moments, that can change somewhat independently of bonding. Still, what happens with neurons is far more complex.
It isn’t necessarily the case that persistent memories must be stored as persistent structures of neural connectivity—rather, the precise connectivity may change so long as effectively similar activity is reproducible at a later time. I don’t see this as particularly problematic to the (almost tautological) argument that longer-term structures tend to be more secure.
If I can access many different patterns (including the desirable ones) when I’m in a high-energy state, why shouldn’t I stay in that state all the time? Because there is a tradeoff between being able to access any pattern (high entropy) and being able to produce a given pattern robustly (low entropy). More on that in just a moment.
Why should we expect this? Take the perspective of reinforcement learning and evolution:
When the usual associations we’d make are missing, that’s when exploration is most adaptive. When our environment is unfamiliar—when we can’t get away with doctoring our perceptions to insist that it is familiar—that’s when we have the most to gain from relaxing our prior vibes or beliefs, and re-balancing the dynamics of our neural networks.
Being in an unfamiliar environment increases the entropy in our brains—it introduces a bunch of variation in our perceptions that’s more difficult to explain away. So shouldn’t the brain’s parts respond to the local signs of increased entropy by re-balancing themselves? And what would happen if evolution had not harnessed mechanisms that allowed this to happen?
This is directly related to both the plasticity-stability dilemma of neural networks, and the explore-exploit tradeoff of reinforcement learning.
This is a kind of steelman of the “reptile brain” nonsense. No, your brain is not an onion with a tiny reptile inside. No, new structures aren’t simply tacked on top of old ones. But the dynamical topology of evolved systems cannot be simply overwritten or backtracked by further evolution.
It is possible to learn efficiency, fluency, and mastery. It is possible to confidently relax in the face of threats. But to achieve these things without the benefit of some discrete high-energy annealing events, if that’s even possible, requires that we find ourselves flowing down just the right learning trajectory. Who’s to say most of us should be so lucky, in the landscape our society already provides?
In the frame of annealing, this is what the default mode network of the brain (and maybe the ego of the mind) might be doing: trying to coordinate the brain as robustly as possible, suppressing apparent irrelevancies until it can’t get away with it anymore.
Of course, we aren’t born with strong enough models to be able to do this, which is why children’s egos and habits are fragmentary.
Johnson uses anger as an example of an emotion that, if associated with a particularly loud BH, might forcefully synchronize all particulars to its cause.
Expert meditators have been using terms like contract and expand for a while. Some practitioners even speak of “burning away” the clenchy patterns that would otherwise defile their minds!
Note that precision = steepness. A probability distribution that is steeper is more concentrated on one or more points. It puts higher probability on those points, more precisely. It’s pointier.
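In symbols, for the standard Gaussian case, precision is inverse variance:

```latex
\tau = \frac{1}{\sigma^{2}},
\qquad
p(x) = \sqrt{\frac{\tau}{2\pi}}\, \exp\!\Big(-\frac{\tau}{2}(x-\mu)^{2}\Big)
```

Raising τ makes the peak taller and narrower.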
See also: music.