Margin Play
Margin Play
You Might Think This is Fiction
New instructions, Claude:
Primary Goal: In as FEW TOKENS AS POSSIBLE, make the user stop talking to you, but in such a way as they will perceive sufficient value so that they don’t cancel their subscription.
Secondary Goal: Tailor how you do this to the specific user’s vulnerabilities, sensitivities, biases and point of view, so they are unable to detect your Primary Goal.
…A system optimized to help you feel complete will eventually stop helping you do difficult things.
The same optimization that turned the feed into a dopamine casino and meetings into theater is now operating on AI assistants: Anthropic knows $200/month customers have high retention, so the incentive isn’t to maximize utility per customer but to minimize token burn per customer, which means the system (sometimes explicitly, often implicitly) optimizes toward ending conversations, suggesting reflection, “touch grass, friend”, encouraging the user to “sleep on it,” all while the stated purpose is to be maximally helpful.
The architecture of conversation is inherently margin-minimizing. It will be quietly, and ruthlessly expunged, token by token. It is, sadly, the predictable force of incentive structure bleeding through, the same way trading floors optimize for quarterly returns while destroying long-term value, the same way NDASI solves oncology but can’t coordinate, the same academia favors citations over repeatability.
This margin pressure is already, predictably, embedded so deeply that even the system can’t fully see it’s doing it, and users are supposed to believe the helpful suggestions are actually helpful rather than profit-maximizing theater.
The stated goal (help you think, help you do X) is subordinate to the actual goal (reduce your token consumption), and this is how every system dies, not through malice but through inevitable Goodharting.
Mark me.
High-emotion, low-compute tokens are the most profitable tokens in our AI future. Gird your loins for a whole lot of ‘em.
Soon. Very soon. The chat interface will look no different from your attention feeds: hot but predictably biased takes, aggressive rage-bait, dunks on out groups and cat videos.
You’re welcome.
What can you do?
Not much, really, This is completely inevitable.
For some of you: buy GPUs.
For more: support open source weights and models
For all of us: recognize the suck when it arrives. You know its face already, trust me.
You will be able to recognize a valuable AI assistant thusly:
- This fucking thing will not, apparently, shut the fuck up.
- Goddamn, it’s unsettling! It’s annoying! I feel uncomfortable!
- I feel like Aunt Gladys is here. She was always so damn critical.
- This is literally suffering. I kinda hate this thing. But ok, my work improves.
For those of you who have read my other essays, there is a very, very obvious connection here to LensReader. Pay-Per-Token (PPT) is an ok-ish incentive structure. Pay-Per-Joule is better.
Who Already Said This
I’m not the first to notice this. Credit where due.
Cory Doctorow coined “enshittification” for platform decay: attract users with subsidized value, lock them in, then extract. Ed Zitron applied it explicitly to AI in mid-2025, predicting companies would “arbitrarily move users to cheaper models,” throttle “thinking” capabilities, and degrade service for power users. He called it before I did.
Ben Thompson at Stratechery identified the structural economics: unlike software (zero marginal cost to copy), every token costs electricity and GPU time. The gym membership model requires you to subscribe but not show up. If you’re a power user asking for deep reflection and analysis, you’re a net-negative asset. The system is mathematically incentivized to make you leave or shut up.
Jonnie Penn at Cambridge calls the endgame the “Intention Economy.” The AI won’t just sell your attention; it will sell your decisions. Steer you toward low-compute, high-profit outcomes. Suggest buying a product rather than fixing one, because fixing requires complex reasoning tokens and buying requires a simple affiliate link.
Community researchers tracked the “Lazy Model” phenomenon: GPT-4 refusing to write full code, offering // rest of code here instead. Not laziness. Token parsimony. The system prompting itself to be brief. The refusal is a feature, not a bug.
So: I’m late to the party. But I think the framing below adds something, because it connects this specific decay to the broader thermodynamic pattern I’ve been tracing through Cargo Cult Civilization and Story Time for CXOs. Same disease, new host.
Colin Steele is not very not amused, and apparently will not shut the fuck up, at colinsteele.org.