An open research notebook
Kinogaki Cortex
We are building a model of text the way a brain reads: online, never forgetting, made of one part repeated, and written down as a document you can open. Every experiment becomes a post here, the wins and the losses alike. This is the running story.
The bet
Today's language models are one enormous frozen function that has memorized the statistics of text. We are building the opposite: a model that reads the way a person does. It learns from every sentence, never retrains, never forgets, and writes down what it learns as things you can open and read.
The plan is brain-inspired. Thousands of tiny predictors each guess the next character from their own vantage point. They vote. When their agreement lurches, the system carves a boundary and mints a concept, and stores it as a durable named object in a Prism document. Concepts stack themselves into higher concepts. The whole mind stays a readable file, not a black box.
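To make the idea concrete, here is a minimal sketch in Python of counting predictors that vote on the next character, with a boundary marked where surprise jumps. It is an illustration under assumptions, not the project's code. The names `CountingPredictor`, `surprise_trace`, and `find_boundaries` are hypothetical, the "vantage points" are assumed to be different context lengths, the vote is a plain average, and the `jump` threshold is an arbitrary choice.

```python
import math
from collections import Counter, defaultdict


class CountingPredictor:
    """Online next-character predictor keyed by the last `order` characters.

    Hypothetical illustration: it learns by counting only, with additive
    smoothing so that unseen characters keep a nonzero probability.
    """

    def __init__(self, order, alphabet_size=256, alpha=0.5):
        self.order = order
        self.alphabet_size = alphabet_size
        self.alpha = alpha
        self.counts = defaultdict(Counter)

    def _key(self, context):
        # Guard order == 0: context[-0:] would return the whole string.
        return context[-self.order:] if self.order else ""

    def prob(self, context, char):
        seen = self.counts.get(self._key(context), Counter())
        total = sum(seen.values())
        return (seen[char] + self.alpha) / (total + self.alpha * self.alphabet_size)

    def update(self, context, char):
        self.counts[self._key(context)][char] += 1


def surprise_trace(text, orders=(1, 2, 3)):
    """Return the per-character surprise in bits, learning as it reads."""
    predictors = [CountingPredictor(k) for k in orders]
    max_order = max(orders)
    trace = []
    for i, ch in enumerate(text):
        context = text[max(0, i - max_order):i]
        # Assumed vote: a plain average of the predictors' probabilities.
        p = sum(pred.prob(context, ch) for pred in predictors) / len(predictors)
        trace.append(-math.log2(p))
        # Predict first, then learn from the character just read.
        for pred in predictors:
            pred.update(context, ch)
    return trace


def find_boundaries(trace, jump=1.0):
    """Positions where surprise rises by more than `jump` bits in one step."""
    return [i for i in range(1, len(trace)) if trace[i] - trace[i - 1] > jump]
```

Calling `find_boundaries(surprise_trace(some_text))` returns candidate boundary positions. The sketch stops there: minting a concept and storing it in a Prism document are not modeled.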
This blog is the lab notebook. Every experiment gets a post: the question, what we tried, the one number that mattered, and the honest lesson. The wins and the losses both. How we work explains why we publish the failures and how we decide what counts as a win.
The story so far
We found a real, working, end-to-end validation of the architecture. Not an AGI breakthrough, and we say so. The through-line that holds across every post: each concept level helps predict the level it operates on, learned online, by counting, with no gradient descent, combined by a product of expert opinions, and kept inspectable the whole way.
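For readers who have not met a product of experts, here is a small Python sketch of the standard form: multiply the experts' probabilities for each symbol, then renormalize. The function name `product_of_experts`, the optional `weights`, and the `floor` value are assumptions made for this example. The posts do not spell out the exact combiner here, so treat this as one plausible reading rather than the implementation.

```python
import math


def product_of_experts(distributions, weights=None, floor=1e-12):
    """Combine expert distributions by a weighted product, then renormalize.

    `distributions` is a non-empty list of dicts mapping symbol -> probability.
    A symbol missing from one expert is floored at `floor` (an assumed
    choice) so that a single expert cannot zero it out entirely.
    """
    if weights is None:
        weights = [1.0] * len(distributions)
    symbols = set().union(*distributions)

    # Work in log space so that many small probabilities do not underflow.
    log_scores = {
        s: sum(w * math.log(max(d.get(s, 0.0), floor))
               for d, w in zip(distributions, weights))
        for s in symbols
    }

    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(log_scores.values())
    unnormalized = {s: math.exp(v - top) for s, v in log_scores.items()}
    z = sum(unnormalized.values())
    return {s: v / z for s, v in unnormalized.items()}
```

One property worth noticing: when two experts agree, the product is sharper than either alone, and an expert with a flat opinion leaves the others' opinion unchanged.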
The frontier we have not crossed is global coherence: the model's output is fluent locally but reads as word salad over longer spans. The next swings aim there.
How we got here
Each post grew from an earlier one, most of them from a failure that pointed somewhere new. The chain runs like this.
Counting beating the neural net chose the substrate. Finding where one word ends proved that surprise carves a boundary, and that signal climbs to phrases, to a memory of change, to a vote that remembers. Words lowering the cost of letters became the word-level payoff, and both folded into one part repeated: the Column, which the rest of the work wires bigger. Scale sent us to the gigabyte, which made the Column fast.
Then a flat result forked the road. Stacking more fixed levels stopped paying. That sent us mining the research corpus, which queued four swings: counted attention, accumulated evidence, topic ignition, and predicting the kind. The frontier those aim at, global coherence, was named by the scorecard. And two negatives earned their keep: the meaning-map that did not predict and the vote that made it worse wrote our rule that an idea is judged on the axis it can win, never killed on the headline metric alone.
The posts
Newest first.
When the letters lie, it leans on the idea
2026-06-26 · qualified result. Pour noise on the input only. The flat model collapses; the concept stack degrades 2.7 times more slowly, and the gate hands prediction up to the topic level (concept mass 86% to 95%) with no signal telling it that the input is noisy. Read.
One brain part, or many? We gave each level its own job
2026-06-26 · negative result. A specialized stack, a different job per level, loses to the uniform Column on bits-per-char. The lone clean win is the gate: dynamic routing beats static pooling by 0.9 bits. The combiner is load-bearing. Read.
We gave the map its best shot
2026-06-26 · negative result. A fair rematch for proximity: the graph form, inside the best stack, on the rare-context slice. Accumulated evidence earns its keep there. Proximity still has no niche and is parked deeper, with a reason. Read.
Predicting the kind, not the word
2026-06-26 · qualified result. Hide a word and predict its class instead of its spelling. The class is predictable for rare words where the exact word is hopeless. Counted, it cannot collapse. Read.
When the whole room agrees on a topic
2026-06-26 · qualified result. A higher level commits to one topic and broadcasts it down. It helps exactly where the local context has run dry, and hurts where it has not. Read.
You can't write your signature backwards
2026-06-26 · clear win. A memory of change rather than content transfers to words it has never seen, runs in one direction only, and lets a cue prime what comes next. Read.
Attention, but counted instead of trained
2026-06-26 · clear win. Attention with no queries, keys, values, or gradients. Just counts keyed by position. It beats fixed n-grams on calibration by a factor of three and proves it is not a bag of words. Read.
A vote that remembers what it just saw
2026-06-26 · qualified result. A vote that accumulates over time instead of starting fresh. It shrugs off noise far better, and fails honestly as a boundary detector. Read.
Meaning is a map, not a road
2026-06-26 · negative result. We placed words in a space where nearness means similar meaning. The space is real and beautiful. It still does not predict the next word. A clarifying loss. Read.
More data helps, all the way to a gigabyte
2026-06-26 · win. Prediction keeps improving with data up to a full gigabyte, and a custom GPU path makes that gigabyte learn in seconds. Read.
Finding phrases the way you'd guess them
2026-06-26 · mixed. The same surprise signal that finds word boundaries, one level up, discovers real phrases on its own. Topic boundaries are harder. Read.
The level that reaches past the last few words
2026-06-25 · clarifying. Stacking more fixed local levels stops paying. The lesson points at what kind of level is worth building next. Read.
One part, repeated, wired bigger
2026-06-25 · win. The whole system collapses to one part, a Column, repeated wide and stacked deep. Wire it bigger and it gets better, and you can say exactly why. Read.
Building the dials that bits-per-char hides
2026-06-25 · win. One number was hiding the truth. We built a scorecard that measures generalization, real-word rate, and phrase coherence, and it told a sharper story. Read.
How big a brain the data wants
2026-06-25 · win. A small model saturates early. A bigger one keeps learning, but only once it has enough data. The right size grows with the corpus. Read.
The hierarchy pays off at the right altitude
2026-06-25 · win. Measured at the word level instead of the character level, the concept hierarchy halves perplexity. Read.
When combining the experts made it worse
2026-06-25 · negative result. We combined every level of expert into one vote. It scored worse than a simple two-expert mix. The reason taught us which combiner is right. Read.
Words that lower the cost of letters
2026-06-25 · win. Teaching the model words, with no labels, makes it a fifth better at predicting letters. The discovered words are real English, sitting in a document you can read. Read.
The counter beat the neural net
2026-06-25 · win. A plain online counter beat two gradient-trained networks at predicting text. It learns online, never forgets, and needs no backprop. It became the substrate. Read.
Finding where one word ends
2026-06-25 · qualified win. The first experiment. Prediction error alone recovers word boundaries from raw characters, at a quality the literature respects. One fashionable signal scored below random. Read.