Assumption Laundering
The model isn’t the problem.
John Rousseau published something worth reading this week. The Agency Field Map is a sensemaking framework that was tested against the 2026 Stanford AI Index Report, using Claude to simulate the future of knowledge work. What caught my attention was a quiet admission in the methodology section. Rousseau ran five iterations. He checked citations. He wrestled with the prompt to stop the model restating the executive summary or elevating minor findings into significant patterns. He called this the jagged frontier. He’s spot on about the limits, and it prompted me to question where those limits actually sit.
The dominant concern about AI-assisted analysis is hallucination. Fabricated citations, confident wrongness. This is a real problem and an obvious one; you can design for it, build adversarial pipelines, and catch the lies. Hallucination is a visible failure mode.
The deeper danger is assumption laundering.
When you use a capable model for complex analysis, it doesn’t just process your inputs; it also reflects your framing back at you coherently, fluently, with apparent confidence. If you approach a domain with a particular lens, the model constructs an articulate case for that lens. It finds the supporting material. It organises the argument more clearly than you might have done alone. The output looks like analysis. It feels like discovery. But it might just be you, with better sentences.
Your prior goes in, and your prior comes out, cleaned up and formalised, harder to question because it now exists as structured text rather than nascent intuition. (I’m borrowing ‘prior’ from Bayesian thinking — it means the beliefs and assumptions you bring to a question before the evidence arrives.) The laundering doesn’t introduce new errors. It embeds existing ones more deeply.
This is different from hallucination, though the two aren’t fully separable: many hallucinations reflect the framing the user brought to the prompt, which makes them a kind of laundering in their own right. Hallucination is the model making things up (from our perspective; technically the model is performing as designed, and in that sense it’s not making anything up). Assumption laundering is the model making you more convincing to yourself, and its damage lands on human cognition and trust.
It’s related to confirmation bias, but more insidious. Confirmation bias is something you do to yourself. You seek evidence that fits what you already believe. Assumption laundering externalises the bias, structures it, and returns it to you looking like independent analysis. You’re no longer the one doing the selecting, so you can’t catch yourself doing it.
I run a biological computing foresight newsletter called HScan11 with my collaborator David Sloly. Over the past several months we’ve built a multi-agent scanning pipeline — agents handling discrete roles across scanning, filtering, synthesis, and drafting; a SQLite backend; and specialist scanners across academic, industry, defence, and media streams. The pipeline includes a deliberate adversarial QA layer: Claude drafts, ChatGPT critiques, and Gemini checks the output against HScan11’s voice contract.
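For concreteness, here’s that QA layer as a minimal sketch. The prompts, function names, and the PASS/FAIL fallback are illustrative rather than the actual HScan11 code; assume each model is simply a callable from prompt text to response text.

```python
from typing import Callable

# Hypothetical sketch of the three-model adversarial QA layer.
# Each "model" is just a callable from prompt text to response text;
# in the real pipeline these wrap the Claude, ChatGPT, and Gemini APIs.
Model = Callable[[str], str]

def adversarial_qa(signal: str, drafter: Model, critic: Model,
                   voice_checker: Model) -> str:
    """Draft, critique, revise, then check against the voice contract."""
    draft = drafter(f"Draft a signal write-up:\n{signal}")
    critique = critic(
        "Critique this draft for execution errors only: misattribution, "
        f"unsupported inference, overstatement.\n{draft}"
    )
    revised = drafter(
        f"Revise the draft using this critique.\nDRAFT:\n{draft}\n"
        f"CRITIQUE:\n{critique}"
    )
    verdict = voice_checker(
        "Does this match the HScan11 voice contract? "
        f"Answer PASS or FAIL.\n{revised}"
    )
    # Illustrative fallback: hold the original draft for human review
    # rather than publishing a piece that failed the voice check.
    return revised if verdict.strip().upper().startswith("PASS") else draft
```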
It all sounds rigorous, but it isn’t epistemically robust in the way I originally thought.
The adversarial QA catches execution errors: a misattributed signal, an inference that doesn’t follow, a duplicated signal, a claim that overstates the source material. But the system prompts, the scanner briefs, the CIPHER classification structure, and the Scene/Discontinuity/Vector framing — all of that encodes assumptions about what biological computing is, what counts as a significant signal, and what the relevant actors and timeframes are. The adversarial layer never reaches that level. It operates within the frame. It was designed within the frame.
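The blindness is easy to show in miniature. In the sketch below (the categories are placeholders, not the real CIPHER structure or briefs), the critic’s prompt is generated from the same frame as the scanner’s, so it can only ever police classification within the taxonomy, never the taxonomy itself.

```python
# Placeholder frame: the real CIPHER taxonomy differs, but the
# structural point holds whatever the categories are.
FRAME = {
    "taxonomy": ["category_a", "category_b", "category_c"],
    "streams": ["academic", "industry", "defence", "media"],
}

def scanner_brief(frame: dict) -> str:
    # The scanner sorts the world using the frame's categories.
    return (f"Classify each signal into one of {frame['taxonomy']}, "
            f"drawing on the {', '.join(frame['streams'])} streams.")

def critic_brief(frame: dict) -> str:
    # The critic polices work *within* the same categories. Nothing
    # here ever asks whether the taxonomy itself is the right one.
    return (f"Check that each signal is correctly assigned to one of "
            f"{frame['taxonomy']} and that no claim overstates its source.")
```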
This isn't unique to our pipeline. Any system that encodes a fixed taxonomy or classification scheme, including commercial intelligence platforms built around proprietary frameworks and taxonomies, has the same structural blindness. The more polished the system, the more invisible the frame. The more confident the brand, the less likely the user is to interrogate the underlying assumptions.
The pipeline is reliable at doing what it is told to do. It is completely blind to whether it has been told to do the right thing. Adding more agents only partly resolves this.
The design space is wider than “add another critic”; adversarial framings, cross-domain corpora, and prompting for contradiction rather than validation can all introduce friction at the framing layer. But there’s a ceiling. You’re still the one writing the system prompts, still the one deciding which outputs to trust. You can’t reliably step outside your own prior simply by adding another model. More sophisticated tooling just executes your assumptions more efficiently.
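To give a flavour of the last of those, prompting for contradiction can be as simple as the contrast below; both wordings are hypothetical, not anything from the pipeline.

```python
# Validation asks the model to work within your frame; contradiction
# asks it to attack a premise. Both wordings are hypothetical.
VALIDATION_PROMPT = "Review this analysis and strengthen its argument."
CONTRADICTION_PROMPT = (
    "Assume this analysis rests on at least one wrong premise. "
    "Name the premise and make the strongest case against it."
)
```

Useful friction, but still friction I wrote, aimed where I already suspected the weakness might be.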
I find this uncomfortable. The more careful you are, the more trustworthy the outputs feel. But “trustworthy within the frame” and “trustworthy” are not the same thing, and the former can crowd out awareness of the latter.
This has implications beyond individual practice. Trust in AI systems is largely being established at the execution layer — does the model perform reliably, cite correctly, follow instructions. But if the deeper risk lives at the framing layer, then the more trustworthy our tools become in the conventional sense, the less visible the unconventional risk gets. An organisation confident in its AI-assisted analysis may be more exposed to framing errors than one that remains appropriately sceptical, precisely because confidence reduces the scrutiny that might catch them. And the influence runs both ways: what the model returns shapes what the human comes to find natural to think next.
So I ran an experiment. I fed a draft of this blog to a model and asked for feedback.
What came back was fluent, structured, and confident. It gave my central concept a name upgrade, “second-order problem of AI”. It introduced a distinction I hadn’t articulated cleanly: verification versus validation or, in its framing, a Type III error — correctly solving the wrong problem.
But then it proposed a solution. An “Anti-Frame Agent”, a model whose system prompt assumes the CIPHER classification is fundamentally flawed and proposes an entirely different ontological structure. It presented this as moving beyond pipelines toward something more dialectical.
It’s a pipeline solution. Adversarial tooling with a differently oriented brief, operating at the execution layer. It’s the opposite of what the blog concludes, dressed in language that sounds like agreement.
I wouldn’t have noticed that if I hadn’t been looking for it. The surrounding fluency is persuasive enough to carry the contradiction past your guard. Which is, of course, exactly the mechanism the blog is trying to describe.
The noticing is important. And in this case, the model didn’t do it. One instance doesn’t prove a pattern, but it illustrates the mechanism.
I’m also aware this blog operates within its own frame, one that treats assumption laundering as the central risk of AI-assisted analysis. Other framings are available, and I hold this one tentatively.
I’m not arguing human judgement is categorically superior or that the feedback was useless; parts of it genuinely improved the piece. The model can extend your argument, sharpen your vocabulary, and occasionally hand you a better tool. What it can’t reliably do is stand outside the frame you’ve both been operating in and tell you the frame is the problem.
What this points towards is less a hierarchy, human judgement over machine capability, and more an assemblage: human and machine intelligence combining in ways that neither fully controls. The model shapes what I notice, how I frame it, and which connections feel natural. I shape what the model is asked to do and how its outputs get used. Understanding that mutual influence, particularly at the framing layer where assumptions live, feels like one of the more important and least examined questions in how we actually work with these systems.
It also unsettles the standard governance metaphors. “Human in the loop” and “human over the loop” both presuppose a hierarchy that this kind of mutual influence makes harder to sustain, and that most current AI oversight quietly treats as intact.
All this thinking about frames and assemblages is prompting bigger questions.
Framing isn’t just a methodological concern; it’s a cognitive one. Cognition doesn’t passively receive the world. It selects from it, organises it, and renders it into something coherent and usable.
This is sensemaking, and it’s what the framing layer in AI-assisted analysis is doing too, just externalised into a tool.
Which raises a question that I’m actively pondering: what is the nature of sensemaking when it is distributed across an assemblage of intelligences?
Other questions follow. If part of our sensemaking now happens in tools we didn’t build, trained on corpora we didn’t curate, and optimised for objectives we didn’t set, then cognitive sovereignty becomes a real concern, and cognitive security, a question of which influences are shaping what we find natural to think next. The cognitive attack surface has expanded, and most of us haven't yet noticed.
These threads will need their own explorations. The next two will stay close to the framing question, first through scenario workshops, where the practitioner's work is to suspend an existing frame long enough that different questions become reachable, and then through C-K theory, which gives a more formal account of what that suspension actually constructs and why it matters.
I need to cogitate on the cognitive sovereignty thread a little longer...