Meaning is a map, not a road

2026-06-26 · negative result · experiment P

The question

Here is a seductive idea. Place every word in a space, so that words near each other mean similar things. Then a sentence is a path through that space, and you predict the next word by extending the path, like casting a ray. Reasoning becomes movement. We had to test it, because if it worked it would be a different and lovely engine.

What we tried

We built the space. We measured how words co-occur, turned that into coordinates, and placed each word column in it. Then we tried to predict the next word three ways: gather from nearby words, spread activation through the graph of neighbors, and extrapolate the recent trajectory like a ray.
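A minimal sketch of the first step, under loud assumptions: the post does not say how co-occurrence was turned into coordinates, so this version simply uses raw neighbor counts within a small window as each word's vector and compares words by cosine similarity. The function names, the window size and the toy sentence are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(tokens, window=2):
    """Map each word to a sparse vector of neighbor counts within +/- window."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    # Return a plain dict so later lookups never create entries by accident.
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(word, vectors, k=3):
    """Return the k words whose vectors are most similar to `word`."""
    target = vectors[word]
    scored = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != word]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


# Toy corpus (an assumption, not the experiment's data).
tokens = "i ate three apples . i ate four apples . i saw the king".split()
vectors = cooccurrence_vectors(tokens, window=2)
print(nearest("three", vectors, k=1))
```

Words that appear in the same surroundings end up with similar vectors, which is all "nearness is similarity" needs. Nothing in this construction encodes which word comes next.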

What happened

The space is real and meaningful. Nearness is similarity, cleanly:

three sits beside four, five, six
king sits beside prince, son, daughter
france sits beside spain, italy

But the space does not predict the next word.

method	next-word accuracy
plain bigram	20.9%
gather from nearby	17.1%
spread through the graph	20.7%
extrapolate the trajectory (the ray)	1.8%

The ray idea was not a little worse. It was catastrophically worse: 1.8% against the bigram's 20.9%, less than a tenth of the baseline. A sentence is not a straight line through meaning-space. Graph-spreading came within 0.2 points of the bigram and gathering trailed it by 3.8, so neither beat it and both bought nothing.
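To make the comparison concrete, here is a hedged sketch of a bigram predictor next to a ray predictor, scored the same way. It is not the experiment's code. The 2D coordinates are invented by hand so the failure mode is visible, the ray is assumed to be "last vector plus the last step", and the toy text is scored on the same tokens it was counted from. It does not reproduce the numbers in the table.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return dict(counts)


def predict_bigram(counts, prev):
    """Most frequent follower of `prev`, or None if `prev` was never seen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return max(followers, key=followers.get)


def predict_ray(coords, prev2, prev1):
    """Extend the step prev2 -> prev1 once more and snap to the nearest word."""
    target = tuple(b + (b - a) for a, b in zip(coords[prev2], coords[prev1]))
    return min(coords, key=lambda w: math.dist(coords[w], target))


def accuracy(predict, tokens):
    """Fraction of positions (from the third token on) predicted correctly."""
    hits = sum(predict(tokens[i - 2], tokens[i - 1]) == tokens[i]
               for i in range(2, len(tokens)))
    total = len(tokens) - 2
    return hits / total if total > 0 else 0.0


# Hypothetical coordinates: similar words sit close together.
coords = {
    "the": (0.0, 0.0), "king": (1.0, 0.0), "prince": (2.0, 0.1),
    "queen": (1.1, 0.2), "rules": (0.2, 1.0), "land": (0.5, 2.0),
}
tokens = "the king rules the land the king rules".split()
counts = train_bigram(tokens)

bigram_acc = accuracy(lambda p2, p1: predict_bigram(counts, p1), tokens)
ray_acc = accuracy(lambda p2, p1: predict_ray(coords, p2, p1), tokens)
print(f"bigram {bigram_acc:.2f}  ray {ray_acc:.2f}")
```

In this toy, extending "the, king" lands beside "prince": a word that is like "king", not a word that follows it. That is the same confusion of proximity with sequence, in miniature.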

This matched a warning we had read independently in the research forums: a metric, Euclidean reference frame for language is a dead end, because symbolic categories have no natural origin or unit.

The lesson

The embedding is a map of similarity, not a road for prediction. Proximity is meaning. Proximity is not sequence.

So we did not kill the idea; we re-aimed it. The space stays as a similarity tool: for inspection, and as a backoff prior for rare contexts the sequence model has never seen. The fair test we have not run is exactly that: whether spreading beats a bigram when scored only on the rare contexts where the bigram has no counts at all. That is where a similarity map should help, and it is the one place we have not yet looked.
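One way the backoff could look, as a sketch only: trust the bigram whenever it has counts, and when it has none, pool the followers of the context word's neighbors, weighted by similarity. This is a one-hop stand-in for spreading, not the graph method from the experiment. The function name, the neighbor lists and the similarity values are all hypothetical.

```python
from collections import Counter


def predict_with_backoff(prev, bigram_counts, neighbors):
    """Predict the next word, falling back to the similarity map.

    bigram_counts: dict word -> Counter of next-word counts.
    neighbors:     dict word -> list of (similar_word, similarity) pairs.
    Returns (prediction, source) where source names which path was used.
    """
    seen = bigram_counts.get(prev)
    if seen:
        # The sequence model has evidence for this context; use it as-is.
        return max(seen, key=seen.get), "bigram"

    # No counts for this context: borrow followers from similar words.
    pooled = Counter()
    for other, similarity in neighbors.get(prev, []):
        for nxt, count in bigram_counts.get(other, {}).items():
            pooled[nxt] += similarity * count
    if not pooled:
        return None, "none"
    return max(pooled, key=pooled.get), "backoff"


# Made-up counts and similarities, for illustration only.
bigram_counts = {
    "spain": Counter({"is": 3, "has": 1}),
    "italy": Counter({"is": 2}),
}
neighbors = {"france": [("spain", 0.9), ("italy", 0.8)]}

print(predict_with_backoff("spain", bigram_counts, neighbors))
print(predict_with_backoff("france", bigram_counts, neighbors))
```

Scoring only the predictions whose source is "backoff" would isolate the slice where the bigram is blind, which is the axis on which the map can actually win or lose.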

Lineage

Grew from the seductive idea that meaning is a geometry, tested on its own.

Led to the fair rematch, where we gave the map its best shot (the graph form, inside the best stack, on the rare-context slice) and then parked it deeper, with a reason. The loss also helped write the fragile-ideas rule: judge each idea on the axis where it can win.

Thread: a clarifying negative, parked rather than killed.
