You can't write your signature backwards

2026-06-26 · clear win · experiment V

The question

A talk from the Thousand Brains Project makes a striking claim about behavior: a movement is a sequence of changes, learned apart from any one object and shared across all of them. You learn to write, then you can do it on paper you have never touched. The location is unique; the movement is shared.

We wanted to know whether text has the same structure. Is there a memory of change that transfers to words it has never seen, runs in only one direction, and lets a cue prime what comes next? Three claims, three measured tests.

What we tried

For the transfer test, we split the vocabulary into two halves with no word in common. We trained two models on the first half only, then tested both on the second half, which neither had seen. One model memorized spellings: an ordinary character n-gram. The other memorized moves: it threw away the letters and kept only the transitions between vowel, consonant, and space.
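A minimal sketch of the two models in that setup, not the experiment's actual code. It assumes a bigram order, add-one smoothing, a five-letter English vowel set, and toy word lists; the names `to_moves`, `train_bigram`, and `bits_per_symbol` are hypothetical.

```python
import math
from collections import Counter

VOWELS = set("aeiou")  # assumption: plain English vowels, no 'y'

def to_moves(text):
    """Throw away the letters; keep only V (vowel), C (consonant), _ (space)."""
    out = []
    for ch in text.lower():
        if ch == " ":
            out.append("_")
        elif ch in VOWELS:
            out.append("V")
        elif ch.isalpha():
            out.append("C")
        # anything else (digits, punctuation) is dropped
    return "".join(out)

def train_bigram(seq):
    """Count (previous, next) symbol pairs and how often each context occurs."""
    pair_counts = Counter(zip(seq, seq[1:]))
    ctx_counts = Counter(seq[:-1])
    return pair_counts, ctx_counts

def bits_per_symbol(model, seq, alphabet_size):
    """Average cross-entropy in bits, with add-one smoothing."""
    pair_counts, ctx_counts = model
    total, n = 0.0, 0
    for prev, nxt in zip(seq, seq[1:]):
        p = (pair_counts[(prev, nxt)] + 1) / (ctx_counts[prev] + alphabet_size)
        total -= math.log2(p)
        n += 1
    return total / n

# Toy stand-ins for the two vocabulary halves; they share no word.
seen_words = ["banana", "lemon", "tomato", "potato", "melon"]
unseen_words = ["papaya", "radish", "salami", "tulip", "cabin"]
assert not set(seen_words) & set(unseen_words)

seen_text = " ".join(seen_words)
unseen_text = " ".join(unseen_words)

spelling = train_bigram(seen_text)            # content: the letters themselves
movement = train_bigram(to_moves(seen_text))  # change: V/C/_ transitions only

# Gap between unseen and seen loss for each model (27 = a-z plus space).
spelling_gap = (bits_per_symbol(spelling, unseen_text, 27)
                - bits_per_symbol(spelling, seen_text, 27))
movement_gap = (bits_per_symbol(movement, to_moves(unseen_text), 3)
                - bits_per_symbol(movement, to_moves(seen_text), 3))
print(f"spelling gap: {spelling_gap:.2f} bits, movement gap: {movement_gap:.2f} bits")
```

The two models predict over different alphabets (27 symbols versus 3), so their absolute losses are not comparable. Only the gap between seen and unseen words is.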

The directionality test ran one forward model, then abused it to predict backward, and compared that to a model trained backward from scratch. The priming test asked whether a feature at a word boundary primes the opening of the next word.
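One way to read the directionality test, sketched under assumptions: "abusing" the forward model is taken to mean scoring reversed text with a table trained on forward text, and the models are add-one-smoothed bigrams. The names `train` and `evaluate` and the sample sentence are hypothetical.

```python
import math
from collections import Counter

def train(seq):
    """For each symbol, count which symbols follow it."""
    table = {}
    for prev, nxt in zip(seq, seq[1:]):
        table.setdefault(prev, Counter())[nxt] += 1
    return table

def evaluate(table, seq, alphabet):
    """Return (bits per symbol, top-1 accuracy) with add-one smoothing."""
    bits, hits, n = 0.0, 0, 0
    for prev, nxt in zip(seq, seq[1:]):
        counts = table.get(prev, Counter())
        total = sum(counts.values()) + len(alphabet)
        bits -= math.log2((counts[nxt] + 1) / total)
        # Most frequent successor; ties go to the earliest symbol in `alphabet`.
        guess = max(alphabet, key=lambda s: counts[s])
        hits += guess == nxt
        n += 1
    return bits / n, hits / n

text = "the cat sat on the mat and the rat ran"  # toy data, not the real corpus
alphabet = sorted(set(text))
backward_text = text[::-1]

forward_model = train(text)
reverse_model = train(backward_text)  # trained backward from scratch

print("forward model, forward text :", evaluate(forward_model, text, alphabet))
print("forward model, backward text:", evaluate(forward_model, backward_text, alphabet))
print("reverse model, backward text:", evaluate(reverse_model, backward_text, alphabet))
```

The comparison of interest is the second line against the third: the same backward text, scored by a memory built in the wrong direction and by one built in the right direction.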

What happened

Change transfers, content does not.

model	loss, seen words	loss, unseen words	how much worse
spelling (content)	1.73	3.71	+114%
movement (change)	1.08	1.35	+25%

The spelling model more than doubled its loss on new words. It had overfit to specific spellings. The movement model degraded four and a half times less, because vowel-and-consonant moves are genuinely shared across words it never saw.
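The percentages and the "four and a half times" figure follow directly from the table. A short check, using only the four losses reported above (the helper name `relative_increase` is ours):

```python
def relative_increase(seen, unseen):
    """Fractional rise in loss when moving from seen to unseen words."""
    return (unseen - seen) / seen

spelling = relative_increase(1.73, 3.71)  # about 1.14, i.e. +114%
movement = relative_increase(1.08, 1.35)  # 0.25, i.e. +25%
ratio = spelling / movement               # about 4.6

print(f"spelling +{spelling:.0%}, movement +{movement:.0%}, ratio {ratio:.1f}x")
```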

The trajectory runs one way. Running the forward model in reverse cost it 3.24 bits and collapsed its accuracy from 0.60 to 0.11, barely above chance. A separate reverse model recovered all of it. You replay a trajectory forward. Reversing it is not free; it needs its own memory. You cannot speak backwards.

A cue primes the next word. A learned feature of the previous word, used as a cue, lifted the chance of guessing the next word's first letter from 0.15 to 0.21, a 37% relative gain. The effect was real for the first one or two letters and ran into a data wall at three.
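The shape of the priming test can be sketched as a comparison between guessing the next word's first letter with and without a cue. The experiment used a learned feature of the previous word. The sketch below swaps in a much cruder, hypothetical stand-in (the previous word's last letter) purely to show the measurement; `cue_of`, `fit`, `accuracy`, and the toy sentences are all assumptions.

```python
from collections import Counter, defaultdict

def cue_of(word):
    """Hypothetical stand-in cue: the previous word's last letter."""
    return word[-1]

def fit(words):
    """Count next-word first letters overall and per cue of the previous word."""
    baseline = Counter()
    primed = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        baseline[nxt[0]] += 1
        primed[cue_of(prev)][nxt[0]] += 1
    return baseline, primed

def accuracy(words, baseline, primed):
    """Top-1 accuracy on the next word's first letter: (without cue, with cue)."""
    base_guess = baseline.most_common(1)[0][0]
    base_hits = cue_hits = n = 0
    for prev, nxt in zip(words, words[1:]):
        counts = primed.get(cue_of(prev))
        # Fall back to the unprimed guess when the cue was never seen.
        cue_guess = counts.most_common(1)[0][0] if counts else base_guess
        base_hits += nxt[0] == base_guess
        cue_hits += nxt[0] == cue_guess
        n += 1
    return base_hits / n, cue_hits / n

train_words = "the cat sat on the mat and the dog sat on the log".split()
test_words = "the rat sat on the rug and the hen sat on the peg".split()

baseline, primed = fit(train_words)
unprimed_acc, primed_acc = accuracy(test_words, baseline, primed)
print(f"unprimed {unprimed_acc:.2f}, primed {primed_acc:.2f}")
```

Extending the target from the first letter to the first two or three multiplies the number of outcomes to count, which is one plausible reading of the data wall mentioned above.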

The lesson

A representation of change transfers across words, runs in one direction, and lets a cue prime what comes next. All three claims reproduce on text.

None of these wins is about beating a bigram on raw cost. The movement model has a harder job and a worse absolute score. The point is that its loss barely moves when the content is new. That is exactly the value the talk argues for, and it is the kind of generalization a frozen lookup table does not have.

Lineage

Grew from the first boundary signal, whose surprise marks the changes this memory tracks, and from the Trajectory Memory talk.

Led to change models as a transferable layer, directional and primable, for the cortex to carry.

Thread: surprise, here as a memory of change rather than content.
