We gave the map its best shot
2026-06-26 · negative result · experiment W
The question
We have a rule for a losing idea: park it, do not kill it, and give it a fair rematch on the axis it can win. The meaning-map lost once. We placed words in a space where nearness meant similar meaning, and it could not predict the next word. We parked it.
This is the rematch, and we set it up to win. We gave proximity its best form, a graph of word associations the sources endorse, instead of raw distance in a space. We put it inside the best stack we have. And we judged it on the one slice it was always meant for: rare contexts, where the sequence model has run out of counts and a similarity map should finally help.
What we tried
We built one integrated predictor over words. Its core is offset-keyed count-attention, the workhorse from earlier. Around it we pooled three experts by a weighted geometric mean: accumulated leaky evidence for robustness, an online-clustered topic prior, and the proximity gather. The gather seeds the recent context words on the association graph, spreads one hop, and pools what lights up. A smooth backoff weight turns proximity and topic up only when the direct context is starved.
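The pooling step above can be sketched as follows. This is a minimal illustration, not the experiment's code: the function names, the logistic shape of the backoff gate, its `midpoint` and `sharpness`, and the default expert weights are all assumptions. Each distribution is a plain dict from word to probability.

```python
import math

def backoff_weight(context_count, midpoint=5.0, sharpness=1.0):
    """Smooth gate in (0, 1): near 1 for a starved context, near 0 for a well-seen one.

    The logistic shape, midpoint and sharpness are assumptions for illustration.
    """
    x = sharpness * (context_count - midpoint)
    if x > 0:
        e = math.exp(-x)  # numerically stable branch for large counts
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def pool_geometric(dists, weights, floor=1e-12):
    """Weighted geometric mean of word distributions, renormalised to sum to 1."""
    total_w = sum(weights)
    vocab = set().union(*dists)
    # Work in log space: a weighted geometric mean is a weighted average of logs.
    log_score = {
        word: sum(w / total_w * math.log(max(d.get(word, 0.0), floor))
                  for d, w in zip(dists, weights))
        for word in vocab
    }
    peak = max(log_score.values())
    unnorm = {word: math.exp(s - peak) for word, s in log_score.items()}
    z = sum(unnorm.values())
    return {word: v / z for word, v in unnorm.items()}

def predict(core, evidence, topic, proximity, context_count,
            w_evidence=0.5, w_topic=0.5, w_proximity=0.5):
    """Pool the core with three experts; only topic and proximity are gated."""
    gate = backoff_weight(context_count)
    weights = [1.0, w_evidence, gate * w_topic, gate * w_proximity]
    return pool_geometric([core, evidence, topic, proximity], weights)
```

With a well-seen context the gate is close to zero, so topic and proximity drop out of the product and the prediction rests on the core and the evidence expert. With a starved context the gate opens and all four contribute.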
Then we sliced the held-out text by how often the immediate context had been seen, swept the weights, and bootstrapped confidence intervals for the perplexity gaps.
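The slice-and-bootstrap step can be sketched like this. It is a hedged illustration under stated assumptions: a paired percentile bootstrap over held-out tokens, a count threshold for what counts as "rare", and a gap defined as baseline perplexity minus model perplexity so that positive means the added expert helps. The names `rare_slice`, `bootstrap_gap`, `threshold`, `n_boot` and `alpha` are hypothetical, and the experiment may have resampled differently.

```python
import math
import random

def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))

def rare_slice(context_counts, threshold=1):
    """Indices of held-out tokens whose immediate context was seen at most `threshold` times."""
    return [i for i, c in enumerate(context_counts) if c <= threshold]

def bootstrap_gap(base_lp, model_lp, n_boot=2000, alpha=0.05, seed=0):
    """Paired bootstrap of gap = ppl(base) - ppl(model); positive means the model helps.

    Returns (point_estimate, ci_low, ci_high) using percentile bounds.
    """
    rng = random.Random(seed)
    n = len(base_lp)
    gaps = []
    for _ in range(n_boot):
        # Resample token positions with replacement, same positions for both models.
        idx = [rng.randrange(n) for _ in range(n)]
        gap = (perplexity([base_lp[i] for i in idx])
               - perplexity([model_lp[i] for i in idx]))
        gaps.append(gap)
    gaps.sort()
    lo = gaps[int((alpha / 2) * n_boot)]
    hi = gaps[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    point = perplexity(base_lp) - perplexity(model_lp)
    return point, lo, hi
```

Under this convention a gap counts as significant when its interval excludes zero.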
What happened
The offset core does the heavy lifting. In perplexity, where lower is better:
| model | perplexity |
|---|---|
| bigram | 694.5 |
| offset-attention core | 322.7 |
On the slice that decides it, the rare contexts, the experts split cleanly:
| added to the core | rare-context perplexity | gap (positive = helps) | significant? |
|---|---|---|---|
| evidence | 225.3 | +12.5, CI [6.7, 20.6] | yes |
| proximity | 242.2 | −4.8, CI [−10.8, 3.0] | no |
Evidence earns its keep. It costs about 7% on common contexts and so loses overall, but on rare and unseen contexts it wins significantly, standalone and inside the combination. That is the same pattern accumulated evidence showed before, repeating one level up. Proximity, given a real supported shot, has no rare-context win. Its gap is negative and not significant, and removing it from the full stack never significantly hurts.
There is a clear structural reason. A thin context means the preceding word is rare, and rare words are absent from the top of the association graph. So on the rare slice the gather is seeded only by the common words that remain, and those spread to a generic distribution that the topic prior already covers, and covers better.
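The mechanism is easy to see in a toy version of the gather. This is a sketch under assumptions, not the experiment's code: the graph is a dict from word to weighted neighbours, a rare word is modelled as simply missing from that dict, and only neighbours (not the seeds themselves) receive activation. The graph contents and the function name are invented for illustration.

```python
from collections import defaultdict

def proximity_gather(context_words, graph):
    """Seed the context words on the graph, spread one hop, pool and normalise.

    graph maps word -> {neighbour: association strength}. Words missing from
    the graph contribute nothing, which is the failure mode described above.
    """
    activation = defaultdict(float)
    for word in context_words:
        neighbours = graph.get(word)
        if not neighbours:
            continue  # rare word absent from the graph: no seed, no spread
        for neighbour, strength in neighbours.items():
            activation[neighbour] += strength
    total = sum(activation.values())
    if total == 0:
        return {}
    return {w: a / total for w, a in activation.items()}

# Toy graph: only common words have entries.
toy_graph = {
    "the": {"cat": 2.0, "dog": 1.0, "house": 1.0},
    "of": {"house": 1.0},
}

with_rare = proximity_gather(["the", "zyzzyva"], toy_graph)
without_rare = proximity_gather(["the"], toy_graph)
```

Here `with_rare` and `without_rare` are identical: the rare word, the one that made the context thin in the first place, adds no information, and what remains is whatever the common word spreads to.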
The lesson
Accumulated evidence earns its keep on rare contexts, significantly, standalone and in combination. Proximity, given a real supported shot inside the best stack on the right slice, still has no prediction niche. We followed the rule and earned the right to set it down.
So proximity is parked deeper, with a mechanism named. Its live use is inspection and similarity, not prediction. This is what the fragile-ideas rule is for: not to keep every idea forever, but to give each one a fair, supported test on the axis it claims, and to set it down only once it has had that test and lost with a reason.
Lineage
Grew from the meaning-map that did not predict, the idea this gave a fair rematch; a vote that remembers, the accumulated evidence that finally earns its keep here; topic ignition, the topic prior; and counted attention, the core.
Led to the graph kept for inspection and similarity, not prediction; the proximity-as-predictor line is now parked deeper, with its reason recorded.
Thread: the right combiner; and fragile ideas, a loss earned the honest way rather than declared on the headline.