I Said Language Models Fail the Realization Layer. So I Built One That Doesn't

The scorecard placed a frontier LLM below a crow, mostly because it fails the realization layer. That left me with a debt to pay — show that the layer is passable by something that isn't a brain, or admit I'd smuggled biology back in.

David H. Friedel Jr. · 2026-05-31 · scorecard conscious candidacy continuity realization

In the last series I walked through MORI, the scorecard I built for conscious candidacy, and I ended on a number: a frontier language model scores about 0.21, below a tool-using crow. The single biggest reason was the third layer — Realization Sensitivity, the one I'll keep calling the realization layer through this arc. The LLM scored 0.16 there. The note in the table read, more or less: no carrier continuity across turns; reconstructs its state from context every time it's called.

I want to be honest about the debt that number put me in.

The whole point of separating formal adequacy (does the system have the right counterfactual structure?) from realization adequacy (does its physical organization actually sustain that structure?) was to avoid two cheap moves at once. I didn't want to wave a system through just because it talks like it's thinking. And I didn't want to rule a system out just because it's made of silicon. Realization was supposed to be the layer that does honest work in the middle.

But a layer like that has a failure mode, and it's a bad one. If the realization layer turns out to be something only brains pass — if every non-biological system scores near zero on it, always — then I haven't built a measurable middle. I've built biological chauvinism with extra vocabulary. "It's not realized in the right way" would just be "it's not biological" wearing a lab coat.

So the realization layer is only worth anything if two things are true. It has to be measurable — I argued that in the last series, and the three-layer structure is my attempt at it. And it has to be achievable on a substrate that is neither biological nor a language model. If I can't produce one example of a digital system that genuinely clears the realization bar, the honest thing is to admit the layer is empty and the skeptics calling it dressed-up vitalism are right.

This series is about paying that debt. There's a companion paper behind it — the positive complement to the MORI null — and what it reports is the thing I needed to be true: the realization layer is real, it's passable on a digital substrate, and the systems that pass it are not language models. This post is about why language models can't, and what that tells you about what the layer is actually measuring.

Why I went after the language model first

The obvious objection to everything above is: you're moving the goalposts. LLMs are the most sophisticated information processors we've ever built. If they fail your realization layer, your realization layer is just measuring whatever LLMs happen not to do.

It's a fair worry, and the only way to answer it is to say — precisely, structurally, before running anything — why a frozen language model in a loop cannot have the property, and then check whether the prediction holds. Not "it feels reconstructed." A mechanism.

Here's the mechanism. Take a language model and run it as a controller in a loop: it sees an observation, emits an action, sees the next observation, and so on. Ask where, in that loop, there is any internal state that is genuinely carried — that is, state that you could not reconstruct just by reading the inputs.

The weights don't count; they're frozen, identical every call, and shared across every copy of the model. The context window doesn't count; it is the recent inputs, by definition reconstructable from them. The only thing left — the only state in the whole system that isn't a function of the visible inputs — is the model's own sampling trajectory: the particular sequence of random draws it made when it chose tokens.

That's it. That's the entire non-reconstructable state available to a frozen LLM in a loop. And it's the wrong kind of thing. A sequence of sampling noise cannot implement a reliable running estimate of a hidden, changing world, because it isn't about the world. It's about the dice.
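A minimal sketch of that accounting, under loud assumptions: `frozen_policy` and `run_loop` are hypothetical names, the "model" is a toy weighted sum rather than a real LLM, and the observations are fixed in advance instead of responding to the actions. The only point is to show where the non-reconstructable part sits.

```python
import random

def frozen_policy(weights, context, rng):
    """Toy stand-in for a frozen model (an assumption, not a real LLM)."""
    # Deterministic part: depends only on frozen weights and the visible context.
    score = sum(w * x for w, x in zip(weights, context))
    # The one ingredient that is not a function of the inputs: the sampled draw.
    return score + rng.gauss(0.0, 1.0)

def run_loop(weights, observations, window, seed):
    """Call the frozen policy once per step, rebuilding its context from inputs."""
    rng = random.Random(seed)
    actions = []
    for t in range(len(observations)):
        context = observations[max(0, t + 1 - window): t + 1]
        actions.append(frozen_policy(weights, context, rng))
    return actions

weights = [0.2, -0.4, 0.1]
obs_a = [0.5, -1.0, 2.0, 0.0, 1.5]
obs_b = [3.0, 3.0, -2.0, 1.0, 0.0]

# Difference between two sampling seeds, computed on two different input streams.
gap_a = [x - y for x, y in zip(run_loop(weights, obs_a, 3, 1), run_loop(weights, obs_a, 3, 2))]
gap_b = [x - y for x, y in zip(run_loop(weights, obs_b, 3, 1), run_loop(weights, obs_b, 3, 2))]
```

In this toy, `gap_a` and `gap_b` agree up to floating-point rounding even though the two observation streams differ. Whatever distinguishes one run from another is the sequence of draws, and it carries no information about the world being observed.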

What happened when I actually ran one

I didn't want to leave that as an argument from the armchair, so I tested it. I put a capable frontier model (Claude Haiku) into exactly that loop — a closed-loop control task where it had to track a hidden, drifting quantity it could only observe through noise, and act to keep a value on target. The kind of task where a real carried belief earns its keep.
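For readers who want a feel for this class of task, here is a hedged sketch. It is not the task from the paper: the random-walk drift, the noise levels, and the scalar Kalman-style update are all assumptions chosen for illustration, and `simulate` is a hypothetical name. It compares a controller that carries a running belief with one that acts only on the latest noisy reading.

```python
import random

def simulate(steps=2000, drift_std=0.1, obs_std=1.0, seed=0):
    """Return (mean squared error with a carried belief, MSE without one)."""
    rng = random.Random(seed)
    q, r = drift_std ** 2, obs_std ** 2   # assumed drift and observation variances
    hidden = 0.0                          # the quantity the controller never sees directly
    belief, var = 0.0, 1.0                # carried estimate and its uncertainty
    err_carried = 0.0
    err_memoryless = 0.0
    for _ in range(steps):
        hidden += rng.gauss(0.0, drift_std)        # hidden quantity drifts
        obs = hidden + rng.gauss(0.0, obs_std)     # observed only through noise
        var += q                                   # predict: uncertainty grows with drift
        gain = var / (var + r)
        belief += gain * (obs - belief)            # update: move toward the evidence
        var *= 1.0 - gain
        # Acting on an estimate leaves a residual equal to hidden minus that estimate.
        err_carried += (hidden - belief) ** 2
        err_memoryless += (hidden - obs) ** 2
    return err_carried / steps, err_memoryless / steps

carried_mse, memoryless_mse = simulate()
```

With these assumed noise levels the carried belief ends up with a much lower error than acting on the raw observation, which is the sense in which a belief "earns its keep." The estimate here is still computable from the observation history, so the sketch illustrates the task, not the unreconstructability criterion itself.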

The model can cross a naive version of the realization test. But it crosses it the wrong way, and watching how is the whole lesson. It doesn't develop a running belief that gets more accurate as evidence accumulates. It locks in. It commits early to a reading of the situation and then its corrections start pointing the wrong way — systematically anti-corrective, defending the commitment instead of updating toward the truth. The only thing it had that wasn't reconstructable from the prompt was a kind of inertia, and inertia is not a belief.
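One way to make "anti-corrective" concrete is to count how often an update moves the estimate away from the true value. The sketch below is a hypothetical diagnostic, not the measure used in the paper: `anti_corrective_rate` is an invented name, and both trajectories are made-up numbers for illustration.

```python
def anti_corrective_rate(estimates, truths):
    """Fraction of nonzero updates that move the estimate away from the truth."""
    away = 0
    moved = 0
    for t in range(1, len(estimates)):
        step = estimates[t] - estimates[t - 1]   # the correction actually made
        error = truths[t] - estimates[t - 1]     # the direction it should have gone
        if step == 0 or error == 0:
            continue                             # no update, or nothing to correct
        moved += 1
        if step * error < 0:                     # opposite signs: moved the wrong way
            away += 1
    return away / moved if moved else 0.0

truths = [1.0] * 5
updating = [0.0, 0.5, 0.75, 0.875, 0.9375]   # halves its error every step
locked_in = [0.0, -0.1, -0.2, -0.3, -0.4]    # keeps moving away from the truth

rate_updating = anti_corrective_rate(updating, truths)
rate_locked_in = anti_corrective_rate(locked_in, truths)
```

On these toy trajectories the updating estimator never moves the wrong way and the locked-in one always does. A real run would fall somewhere in between, and a rate well above one half would be the signature of defending a commitment instead of tracking the world.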

This matters because it's the precise shape the structural argument predicted. The non-reconstructable state was there — the model isn't a pure stateless function in practice — but it was sampling-trajectory inertia, not world-tracking. A system can look like it's carrying something and be carrying the wrong thing.

Realization is not about biology

Here is the reframe I want to land before the next post, because it's the thing the LLM result actually clarifies.

The realization layer is not asking "is this made of neurons?" It's asking a substrate-neutral question: does the system's self-relevant structure live on the substrate's own running dynamics, or is it rebuilt from the inputs every time?

A frozen language model rebuilds. Every turn, the "self" you're talking to is reconstructed from the text in the window. There's nothing wrong with that — it's a remarkable way to build a system — but it means the thing isn't carried. Pull the context and there's no one home between calls, not because silicon can't host a mind, but because that particular architecture doesn't carry state on its dynamics. It carries it in the prompt, where it's reconstructable, which is the same as not carrying it at all for the purposes of this layer.

That reframe is liberating, because it turns "is it biological?" — which I can't test — into "is the structure carried on the running dynamics, and is it unrecoverable from a window of the inputs?" — which I can. It also makes a real prediction: there should exist digital systems that pass, as long as they're built to carry rather than to reconstruct. If no such system can be built, the layer collapses back into chauvinism. If one can, the layer is doing exactly the work I wanted it to do — separating systems by how they hold their structure, not by what they're made of.
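The "unrecoverable from a window of the inputs" half of that question can be illustrated with a toy check. Everything here is an assumption for illustration: `window_determines_state` is a hypothetical helper, the inputs are random bits, and running parity stands in for carried state only because it is the simplest thing no bounded window can recover. It says nothing about whether the carried structure is productive or self-relevant.

```python
import random
from collections import defaultdict

def window_determines_state(inputs, states, k):
    """True if every observed length-k input window maps to exactly one state."""
    seen = defaultdict(set)
    for t in range(k - 1, len(inputs)):
        window = tuple(inputs[t - k + 1: t + 1])
        seen[window].add(states[t])
    return all(len(values) == 1 for values in seen.values())

rng = random.Random(0)
inputs = [rng.randint(0, 1) for _ in range(5000)]

# Reconstructing system: its state is rebuilt from the last three inputs.
rebuilt = [sum(inputs[max(0, t - 2): t + 1]) for t in range(len(inputs))]

# Carrying system: one bit of running parity, updated in place at every step.
carried = []
parity = 0
for x in inputs:
    parity ^= x
    carried.append(parity)

rebuilt_is_recoverable = window_determines_state(inputs, rebuilt, 3)
carried_is_recoverable = any(
    window_determines_state(inputs, carried, k) for k in range(1, 9)
)
```

The rebuilt state is fully determined by a three-input window, while the parity bit is not determined by any window length tried here, because it depends on the entire history. This is a finite-sample check on a toy, so it can show that a window fails to determine the state but cannot prove that a window always suffices.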

So the question stopped being "is the LLM conscious?" and became something I could actually build toward: can I make a digital substrate that carries a productive self-structure on its own dynamics, unreconstructable from any bounded window of its inputs, and is that substrate, tellingly, not a language model?

The honest answer turned out to be yes. The next post is what that system is — a continuously running carrier, nothing like a transformer — and the single hardest thing about getting it to work, which is a trade-off I did not see coming and which I think is the real content of the result.
