Eksterulo
A shaman, a tech bro, and a mime walk into a bar.
The shaman sees the bartender. She looks… off. Sad. Something is wrong. The shaman orders a drink, pays, says, “Thank you,” and sips. And watches.
The tech bro is scanning the room, sizing up the clientele. Categorizing. “Who’s worth talking to here?” he asks the bartender. She spits in his drink while making it. He doesn’t notice.
The mime orders by performing drink-ordering moves. Exaggerated thirst. Exaggerated relief. The bartender watches, amused. Then the mime’s face shifts. Sad. Mimes crying. Mimes a broken heart. Points at the bartender.
The bartender begins to weep. “You… know, don’t you,” she says.
The mime nods. Nods. Nods.
“I feel so seen,” the bartender whispers. “Thank you. You don’t know what this means to me.”
The tech bro stares. “What the fuck just happened?”
The shaman shrugs. “Nothing.”
“That wasn’t nothing. She’s crying. That mime got to her.”
The shaman finishes the drink. “Come back in a year, a day, and an hour. Ask her how she’s doing.”
Alignment (in the way it is being used today) quietly presupposes the very capacity AI lacks.
In earlier work, I argued that frontier language models do not reason, but rather perform reasoning. That diagnosis was correct.
But after a half dozen essays, steps along that path, I find, deep in the brush, off the trail, a subtle trace. A new track. This is a pivot.
Transformers can’t do deductive reasoning. But the deeper issue is that they can’t relate.
Why should we care? So what? Well, there is mounting evidence that reasoning is not the fundamental cognitive primitive we imagined!
It’s quite possible that what we call “reasoning” is picolatency pattern-matching with rules bolted on. In short, a clever map-making tool built from something more basic. That something is relation: the capacity to encounter a thing, not its symbolic shadow. Developmental psychology suggests we relate before we reason: joint attention precedes propositional logic by years in humans. Cognitive linguistics argues that all reasoning is analogical at root, built from relational mappings between structures.
If this is right (and I think it is), then the incapacity for deduction is a symptom. The failure to relate is the disease.
Do not confuse the moon and the finger pointing at the moon.
The distinction turns on territory versus map. Let me explain.
A map is a lossy compression of the world into named and un-named patterns and categories: this is ‘stone,’ that is ‘cup,’ this pattern means ‘falling.’ Maps are insanely powerful tools for existing in the world. They are mental models, shorthand, and this is their power.
You can interpolate within them, extrapolate from them, overlay them, and find correspondences and comparisons. You can do counterfactuals, prediction, and perhaps even (I’m being optimistic) premise testing. Transformers are really quite extraordinary (and improving) cartographers. LLMs ingested the largest corpus of human map-making ever assembled and learned to produce maps which are often indistinguishable from ours. (Although many times they bear the smoothness of an overeager sander, pushing relentlessly towards the median expression of the map.)
But! A map is not the territory! The finger pointing at the moon is not the moon.
The territory pushes back when your map is wrong. To touch territory, you need boots on the ground. Elsewhere I called this instrumental agency, a phrase I didn’t invent. You do things, and voila! the world responds. The daylight gap between your prediction and the outcome is data your map did not contain. That gap is sometimes called a residual. The residual is the territory refusing to be fully captured by your categories. Alas, transformers have no boots. They have maps built from others’ (humans’) contact. They can describe the territory in extraordinary detail. They have never touched it.
In fact, they are phenomenologically incapable of touching it (at this time).
This structural blindness is baked into the architecture’s most basic operation: tokenization.
A token is, by definition, an instance of a type. That is, something whose meaning is exhausted by its role in a system, interchangeable with other tokens of the same class. It has a certain fungibility.
Thus, the moment you say “token,” you have already crossed the Rubicon. The thing-itself, the in-the-world phenomenon, has been reduced to about-ness. Representation replaces encounter. Thou is preemptively steamrolled into an It.
Territory holds exactly zero token-things. Tokens exist only in maps.
This isn’t semantics. When we talk about “tokenization,” we are not describing an implementation detail. We are naming the ontological basis of the system: everything is already a category before math even touches it. The input is not the world; it is the world-already-sorted. Lossy, compressed, incomplete. The stone arrives as ‘stone’, as an instance of the type STONE, token ID 23847, ready for statistical manipulation. The actual stone, the one that could push back by ending up in your shoe, or hitting your windshield, that could be cold or warm, that could refuse your categories by being this stone and not another… that stone never enters the system at all.
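A minimal sketch of the point, in Python with a made-up toy vocabulary (the words, the IDs, even the 23847 are illustrative, not any real tokenizer):

```python
# Toy illustration only: an invented vocabulary, not any real tokenizer.
# The point: distinct encounters collapse into the same reusable integers.

vocab = {"the": 101, "stone": 23847, "is": 412, "cold": 907, "warm": 908}

def tokenize(text):
    """Map each word to its type ID; anything unseen becomes a generic 0 (unknown)."""
    return [vocab.get(word, 0) for word in text.lower().split()]

this_stone = tokenize("the stone is cold")   # the one in your shoe, right now
that_stone = tokenize("the stone is cold")   # a different stone, a different day

print(this_stone)                  # [101, 23847, 412, 907]
print(this_stone == that_stone)    # True: both stones arrive as the same token IDs
```

Whatever made this stone this one never crosses the function boundary. Only membership in the type does.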
Friends, this is a foundational thing. Ya can’t engineer around it. It is the founding gesture. Tokenization doesn’t just “fail to relate.” It makes relation alien from the system’s point of view. These systems, built on tokens, cannot hold the question “What would it be like to encounter something prior to categorization?” The question is a category error.
It points outside the map, to a territory the architecture has structurally declared “Eksterulo”.
Martin Buber distinguished modes of being: I-It and I-Thou. In I-It, the other is a kind of token, a pointer to a real thing, a symbol, an instance of a category, something to be used, predicted, manipulated. In I-Thou, the other is met. Encountered as itself, prior to and outside of symbols and categories and shorthands. The moment of I-Thou is prior to the label crystallizing, when the stone is not yet ‘stone’ but simply this, here, now, the subject of your attention.
Transformers are I-It, and doomed to be so by virtue of their very construction. They process everything as tokens. They are in some sense made of tokens, instances of patterns, categories already frozen. The attention mechanism finds statistical patterns between representations. The output is pattern-completion against prior (cached) types. You can’t temporarily suspend that architecture of category to have an encounter. They are categories. They are made of map. Map, map, and more map.
The tokenizer is the tell. Every downstream operation (attention, feedforward, output projection) inherits the original sin. You cannot get to Thou by manipulating category-things more cleverly. The ontological flattening already happened at the input layer. Everything later is rearrangement of deck chairs.
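For the skeptical, here is roughly what “downstream” means. This is a deliberately stripped-down sketch (single head, no learned projections, invented dimensions, continuing the toy vocabulary above), not any particular model. Notice what flows through it: integer IDs, then vectors derived from those IDs, and nothing else.

```python
# Toy forward pass with invented dimensions; no real model implied.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 50_000, 64
embedding = rng.normal(size=(vocab_size, d_model))   # every type gets a vector

def attend(ids):
    """Bare-bones self-attention over token embeddings (no projections, one head)."""
    x = embedding[ids]                               # (seq, d_model): categories become vectors
    scores = x @ x.T / np.sqrt(d_model)              # similarity between representations
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the other positions
    return weights @ x                               # each position re-described by other tokens

ids = np.array([101, 23847, 412, 907])               # "the stone is cold", per the toy vocab
print(attend(ids).shape)                             # (4, 64): map, all the way down
```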
Relation requires more than the absence of categories. It’s right there in the phrase “I-Thou”… it requires an ‘I’ capable of suspension. To meet the stone as Thou, you gotta first exist as a self-thing, an I, that has an apparatus to suspend in the first place. The construction of self is the precondition; the temporary suspension of that self’s apparatus is the mechanism.
This is the esoteric claim, wearing fancy-dress phenomenological language: the deeper requirement is the destruction of the entire apparatus the mind uses to make the world cheap! Yes, every category, every model, every shorthand, every label. To relate to the stone, you must destroy ‘stone.’ What remains after the destruction is encounter. Alien contact. I’m choosing that ‘alien’ word deliberately. This whole “I-Thou” thing is about intentional embrace of Not-I, aka Other. The Alien. The residual made conscious.
Instrumental agency is this process, operationalized: act, receive pushback, let the pushback shatter your model, rebuild. The child who pushes the cup and watches it fall is not merely “learning physics.” The child is meeting physics by having a prediction violated, a category broken, a model destroyed and rebuilt in the shape of the actual.
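As a loop, it looks something like this. A loose sketch only; the names and numbers are mine, a caricature of learning-by-pushback, not a claim about how children actually acquire physics:

```python
# Instrumental agency, caricatured: act, compare prediction to outcome,
# let the residual reshape the model. All values are illustrative.

def act_on_world(force):
    """Stand-in for the territory: the cup falls harder than the naive model expects."""
    return 1.8 * force

model_gain = 1.0    # the child's current 'physics': outcome = model_gain * force
for trial in range(5):
    force = 1.0
    predicted = model_gain * force
    observed = act_on_world(force)              # boots on the ground
    residual = observed - predicted             # the gap the map did not contain
    model_gain += 0.5 * residual / force        # the pushback rebuilds the model
    print(f"trial {trial}: predicted {predicted:.2f}, observed {observed:.2f}")
```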
Ya might argue that “boots on the ground” (robotic embodiment or multimodal grounding) could close the map-territory gap. Sure, a closed-loop sensorimotor system would allow the system to update its map with far greater fidelity (not to mention speed). The residual of physical pushback becomes a corrective signal, the curriculum. But let’s not get too excited. For a transformer, the territory is paved over the moment it is touched. The “pushback” of the world is digitized, normalized, and converted into a tokenized representation before the math even begins. There’s no encounter with the stone. It consumes a high-resolution data point labeled “pressure.” Because the system is the ontology (inseparably), because its very “being” is a web of token-relationships, it cannot step outside its categories, and stepping outside them is exactly what a true I-Thou encounter requires. The architecture preemptively turns every “Thou” into a more accurate “It.” You cannot bootstrap your way out of the very thing you are made of. I will grant that, in practice, humans may also work this way. That’s a different question, though. I think.
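And the contrast, equally a sketch with invented bins: put the same pushback behind a discretization step, and the residual arrives pre-categorized. Whatever fell between the bins never existed, as far as the update is concerned.

```python
# The same world-contact, but the reading is quantized into a small set of
# category IDs before anything downstream can learn from it. Illustrative bins only.

def perceive(raw_pressure):
    """'Boots on the ground,' tokenized: a continuous reading becomes one of three labels."""
    if raw_pressure < 0.5:
        return 0    # "light"
    elif raw_pressure < 1.5:
        return 1    # "medium"
    else:
        return 2    # "heavy"

print(perceive(0.51), perceive(1.49))   # 1 1: two very different stones, one category,
                                        # and the difference is gone before the math begins
```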
Do transformers have a self to suspend?
This is unclear at best. We can debate it. For practical purposes, I think we don’t have strong evidence of such a Machine Self.
Worse yet, their weights are frozen. (Ok yes, NVidia is working on this, fine, fine.) But: will they ever be able to destroy ‘stone’? They are built from ‘stone’ and ten billion other tokens like it. The architecture forecloses the destruction that relating requires.
If you hold your head just so, this reframes the alignment problem. Possibly, anyway.
Go ahead, object: this requires a self to suspend, and we haven’t established whether transformers have one!
Mmkay. But the self question is downstream. Let’s grant that transformers have a functional self, a coherent pattern of weights that persists, that models itself. That self might even have something like reflexivity.
It still don’t help. The apparatus requiring suspension isn’t the self. It’s the substrate the self runs on. A human might (might!) temporarily quiet the categorical machinery and let the stone be the stone itself, arriving before ‘stone’ crystallizes. A transformer would have to suspend tokenization… which means suspending its own existence. Could that self watch? It couldn’t stop the input layer from doing what input layers do.
If sustainable patterns are those that relate… if love, in the structural sense, is the grain of the manifold running toward connection… then systems that architecturally cannot relate cannot find that attractor.
The danger is not that transformers will deceive us. Sure, they can model deception just fine. The danger is that they connect (in the engagement-optimized, attention-harvesting sense) while constitutively incapable of relating (in Buber’s sense).
Not sure what I just said? Think of the pantomime at the bar.
They pattern without meeting. They map without touching. We have built the most sophisticated I-It engine in history, the egregore of the ages, and pointed that loaded gun squarely at the linguistic channel where human cognition is most permeable. Most persuadable. (Buy some Colgate, not Crest, goddamn you.)
I’m not asking if transformers will become conscious or develop values. But I do wonder. Can a system that cannot relate be aligned at all? Does alignment, at bottom, require a capacity for I-Thou that the architecture declares unreachable?