One part, repeated, wired bigger

2026-06-25 · win · experiments I and J

The question

By this point we had a small zoo of mechanisms that each worked: boundaries, word concepts, phrase levels, a product-of-experts vote. The worry with a zoo is that it is held together by tape. So we set ourselves a discipline. Build the whole system from one part, repeated. Then show that simply wiring it bigger makes it better, and that we can say exactly why each step helped.

What we tried

We defined one part, a Column: an online counting predictor over a stream of tokens. A Level is several Columns voting. A Cortex is Levels stacked. Character columns predict characters; a word level predicts the current word from the previous ones and hands a hint back down.
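The definitions above can be sketched in a few lines. This is a minimal illustration, not the code behind the experiments: the class and method names, the context length `order`, the add-`alpha` smoothing, and the use of a plain geometric mean as the vote are all assumptions made for the example. The Cortex (stacked Levels passing a hint back down) is left out.

```python
import math
from collections import Counter, defaultdict


class Column:
    """Online counting predictor: counts which token follows each context."""

    def __init__(self, order, vocab, alpha=1.0):
        self.order = order            # assumed: number of previous tokens used as context
        self.vocab = sorted(vocab)
        self.alpha = alpha            # assumed: add-alpha smoothing for unseen tokens
        self.counts = defaultdict(Counter)

    def _context(self, history):
        return tuple(history[-self.order:]) if self.order else ()

    def predict(self, history):
        seen = self.counts.get(self._context(history), Counter())
        total = sum(seen.values()) + self.alpha * len(self.vocab)
        return {t: (seen[t] + self.alpha) / total for t in self.vocab}

    def update(self, history, token):
        self.counts[self._context(history)][token] += 1


class Level:
    """Several Columns voting on the same next token."""

    def __init__(self, columns):
        self.columns = columns

    def predict(self, history):
        votes = [col.predict(history) for col in self.columns]
        # Assumed vote: geometric mean of the columns' opinions, renormalized.
        pooled = {
            t: math.exp(sum(math.log(v[t]) for v in votes) / len(votes))
            for t in votes[0]
        }
        z = sum(pooled.values())
        return {t: p / z for t, p in pooled.items()}

    def update(self, history, token):
        for col in self.columns:
            col.update(history, token)


def run_online(level, stream):
    """Predict each token before seeing it, then learn from it. Returns mean cost in bits."""
    history, cost = [], 0.0
    for token in stream:
        cost -= math.log2(level.predict(history)[token])
        level.update(history, token)
        history.append(token)
    return cost / len(stream)


text = "the cat sat on the mat " * 20
vocab = set(text)
narrow = Level([Column(1, vocab)])
wide = Level([Column(k, vocab) for k in (1, 2, 3)])
print(run_online(narrow, text), run_online(wide, text))
```

Growing wider is then just a longer list of Columns passed to the same Level. The toy string is far too small to reproduce the scorecard; it only shows the wiring.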

Then we grew it two ways. Wider, by adding voting columns. Deeper, by stacking a word level and then a phrase level. We measured each step on the full scorecard, not just one number.

What happened

First, the combiner is the hinge. Multiplying the experts' opinions raw made the model fluent but wildly overconfident, blowing up its honest cost. A calibrated geometric-mean pool fixed the cost but tempered the fluency. The recipe that gets both: pool gently for scoring, then sharpen only at the moment of generating. One knob, named and understood.
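A small sketch makes the trade-off concrete. It assumes a plain geometric mean as the pool (the post's pool is calibrated, and how is not shown here) and a temperature exponent as the sharpening knob; the function names and the toy votes are hypothetical.

```python
import math


def normalize(weights):
    z = sum(weights.values())
    return {t: w / z for t, w in weights.items()}


def product_pool(votes):
    """Raw product of experts: multiply the opinions, then renormalize."""
    return normalize({t: math.prod(v[t] for v in votes) for t in votes[0]})


def geometric_pool(votes):
    """Geometric-mean pool: the same product raised to 1/n, then renormalized."""
    n = len(votes)
    return normalize(
        {t: math.prod(v[t] for v in votes) ** (1.0 / n) for t in votes[0]}
    )


def sharpen(dist, temperature):
    """Raise to 1/temperature; a temperature below 1 concentrates mass on the favourite."""
    return normalize({t: p ** (1.0 / temperature) for t, p in dist.items()})


# Three experts that happen to agree exactly (toy numbers).
votes = [{"a": 0.6, "b": 0.3, "c": 0.1}] * 3

raw = product_pool(votes)        # agreement gets counted three times over
pooled = geometric_pool(votes)   # agreement gets counted once

# Honest cost, in bits, if the true next token turns out to be "b".
print(-math.log2(raw["b"]), -math.log2(pooled["b"]))

# Score with `pooled`; sharpen only when sampling text.
print(sharpen(pooled, temperature=0.5))
```

With n experts, the raw product is exactly the geometric-mean pool sharpened at temperature 1/n, which is one way to read the claim that the combiner is a single knob: scoring and generating can sit at different settings of it.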

Second, the same part wired bigger got better, attributably.

| configuration | cost | overfit gap | phrase coherence |
| --- | --- | --- | --- |
| 1 character column | 2.40 | +1.03 | 81% |
| 3 character columns (wider) | 2.12 | +0.57 | 73% |
| + a word level (deeper) | 2.10 | | 92% |
| + a phrase band | 2.11 | | 94% |

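Width bought calibration and generalization: three columns instead of one cut the cost from 2.40 to 2.12 and nearly halved the overfit gap, from +1.03 to +0.57, because the voting columns are an ensemble that regularizes. Width alone did not buy coherence, which dipped from 81% to 73%. Depth bought coherence: the word level lifted phrase coherence from 73% to 92%, the big jump, while the cost barely moved. The fourth band hit diminishing returns: the last row adds two points of coherence while the cost stays flat (2.10 to 2.11), which the gigabyte work later explained as a data-starvation artifact, not a real ceiling.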
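The attribution can be checked by differencing the scorecard row by row. The snippet below only restates the table's numbers (cost, overfit gap, phrase coherence per configuration); the tuple layout and the helper name `step_deltas` are illustrative, and `None` marks the overfit gaps the table leaves blank.

```python
# (configuration, cost, overfit gap, phrase coherence in %)
rows = [
    ("1 character column", 2.40, 1.03, 81),
    ("3 character columns (wider)", 2.12, 0.57, 73),
    ("+ a word level (deeper)", 2.10, None, 92),
    ("+ a phrase band", 2.11, None, 94),
]


def step_deltas(rows):
    """Change in each dial from one configuration to the next."""
    out = []
    for prev, cur in zip(rows, rows[1:]):
        d_cost = round(cur[1] - prev[1], 2)
        if prev[2] is None or cur[2] is None:
            d_gap = None  # gap not reported for one of the two rows
        else:
            d_gap = round(cur[2] - prev[2], 2)
        d_coherence = cur[3] - prev[3]
        out.append((cur[0], d_cost, d_gap, d_coherence))
    return out


for name, d_cost, d_gap, d_coherence in step_deltas(rows):
    print(f"{name}: cost {d_cost:+.2f}, gap {d_gap}, coherence {d_coherence:+d} points")
```

Read this way, almost all of the cost improvement arrives with width, and almost all of the coherence improvement arrives with the word level.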

The lesson

"Better" is multi-dial. Width buys calibration, depth buys coherence, and the combiner is a knob you tune for honest cost and fluent text at once. The whole system is one part repeated.

This is the spine the rest of the blog extends. Every later experiment is a new way of wiring the same Column. The honest caveat stands: the output is still locally fluent word-salad. A sample reads "argentina argentine nation of the absence of autistic people s states or the common ancestor of the formal naming convention." Global coherence is untouched, and named as the frontier.

Lineage

Grew from the voting loss, which named the right combiner, and the word-level payoff, which proved the experts that this post folds into one part.

Led to the gigabyte run that made the Column fast, and the flat fourth level that tested stacking it deeper.

Threads: the right combiner, and scale. One part, repeated, wired bigger.