Issue 01 · Founder voice

Composition, not stack — clinical edition.

Cooperating reasoning models, not a tower of bigger and bigger general-purpose models. Why the F5/reasoner architecture is composition, what that buys, and why the clinical edition needed it.

By Bill Faruki·Apr 9, 2026·9 min read

The phrase “AI stack” suggests a shape I do not believe is right for clinical work. A stack stacks: a bigger model on top, a smaller model below, retrieval as a layer, agentic orchestration as a wrapper. Each layer defers to the one above it, so a failure in the top layer is final.

Chiron is not built that way. Chiron is built as a composition of cooperating reasoning models, none of which is asked to do work it is not suited for. The clinical edition required this shape, because no single model is ready to do all of clinical reasoning, and pretending otherwise is how clinical AI products fail under scrutiny.

What the F5 in F5/reasoner actually is

F5 is not the model that does the reasoning. It is a fast routing classifier, based on Phi-3, whose job is to read the incoming clinical query and decide which downstream reasoner configuration the work should land on. Routine differential? F5 routes to the core clinical-reasoning adapter on Phi-4. Imaging study? F5 loads the radiology specialty adapter on top. Workers' compensation (WC) causation analysis? F5 loads the occupational medicine (OM) regulatory adapter and triggers the forensic-reasoning posture.

That routing layer is not glue. It is the architectural commitment that no specialty has to accept general-purpose reasoning, and no general workflow has to load specialty depth it does not need. Routing is composition. The composition is what the platform actually is.
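A minimal sketch of that routing decision, for illustration only. Every name here (QueryKind, ReasonerConfig, ROUTES, classify, route) is hypothetical, the keyword rules merely stand in for the Phi-3-based classifier, and the adapter identifiers are placeholders rather than Chiron's real ones. Whether the core adapter stays loaded underneath the OM regulatory adapter is an assumption the sketch does not try to settle.

```python
from dataclasses import dataclass
from enum import Enum


class QueryKind(Enum):
    ROUTINE_DIFFERENTIAL = "routine_differential"
    IMAGING_STUDY = "imaging_study"
    WC_CAUSATION = "wc_causation"


@dataclass(frozen=True)
class ReasonerConfig:
    base: str                  # base weights stay the same for every route
    adapters: tuple[str, ...]  # adapters loaded on top of the base, in order
    posture: str = "clinical"


# Hypothetical routing table: one reasoner configuration per query kind.
ROUTES = {
    QueryKind.ROUTINE_DIFFERENTIAL: ReasonerConfig(
        "phi-4", ("core-clinical-reasoning",)
    ),
    QueryKind.IMAGING_STUDY: ReasonerConfig(
        "phi-4", ("core-clinical-reasoning", "radiology")
    ),
    QueryKind.WC_CAUSATION: ReasonerConfig(
        "phi-4", ("om-regulatory",), posture="forensic"
    ),
}


def classify(query: str) -> QueryKind:
    """Stand-in for the routing classifier: keyword rules, not a model."""
    text = query.lower()
    if "causation" in text or "workers' comp" in text:
        return QueryKind.WC_CAUSATION
    if any(word in text for word in ("imaging", "mri", "x-ray", "radiograph")):
        return QueryKind.IMAGING_STUDY
    return QueryKind.ROUTINE_DIFFERENTIAL


def route(query: str) -> ReasonerConfig:
    """Pick the configuration the query should land on."""
    return ROUTES[classify(query)]
```

The point of the shape is that the routing table is data: a query only ever sees the adapters listed for its own kind.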

Why this is not a stack

A stack runs every query through every layer. A composition runs each query through the cooperating reasoners that are appropriate to it. The radiology consultant is not in the path of a routine pharmacology question. The OM regulatory adapter is not in the path of a primary-care visit. The base Phi-4 weights stay stable; what shifts is which adapters load on top.

That gives us two properties a stack cannot give. First, latency is bounded by what the query actually needs, not by the depth of the deepest layer. Second, capability extension does not require retraining the base. We can ship a cardiology specialty adapter without touching the radiology adapter or the core. New adapter, new corpus, new evaluation, ship. The base does not have to know it happened.
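The second property can be sketched as a registry in which shipping one adapter never touches another, or the base. This is an illustrative sketch under stated assumptions: Adapter, AdapterRegistry, and ship are hypothetical names, the corpus version string is a placeholder, and the evaluation gate is reduced to a boolean.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Adapter:
    name: str
    corpus_version: str      # placeholder label, not a real corpus release
    passed_evaluation: bool  # stands in for the per-release evaluation run


@dataclass
class AdapterRegistry:
    base_model: str
    adapters: dict[str, Adapter] = field(default_factory=dict)

    def ship(self, adapter: Adapter) -> None:
        """Register an adapter; only its own slot in the registry changes."""
        if not adapter.passed_evaluation:
            raise ValueError(f"{adapter.name} has not passed evaluation")
        self.adapters[adapter.name] = adapter

    def names(self) -> list[str]:
        return sorted(self.adapters)


registry = AdapterRegistry(base_model="phi-4")
registry.ship(Adapter("radiology", "placeholder-corpus-v1", passed_evaluation=True))
radiology_before = registry.adapters["radiology"]

# Shipping cardiology leaves the radiology adapter and the base untouched.
registry.ship(Adapter("cardiology", "placeholder-corpus-v1", passed_evaluation=True))
```

After the second call, the radiology entry is the same object it was before and base_model is unchanged.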

Why this is not a vertical SaaS wrapper

I want to be careful here, because the phrase “cooperating models” is sometimes a euphemism for a vertical SaaS wrapper that calls a frontier API for the hard parts. We do not do that. The clinical reasoner is in the path. The radiology consultant is in the path. The retrieval layer is in the path. The substrate the reasoner inherits from Eve-Genesis is in the path. None of these is a wrapper around a model whose cognitive shape someone else owns. We own the shape. The composition is ours.

The implication is awkward. We have to maintain more parts. The corpus team has to keep Eve-Genesis (Clinical Edition) growing on its quarterly cadence. The adapter team has to re-fit the LoRA weights on the expanded corpus. The evaluation team has to hold out traces and re-run them per release. None of this is amortised by leaning on a frontier-lab roadmap. We own the roadmap.
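One way to keep held-out traces held out while the corpus grows every quarter is to assign each trace to a split by hashing its identifier. This is a sketch of a common technique, not a description of how the evaluation team actually does it. The function names, the trace ID format, and the 10 percent holdout share are all assumptions.

```python
import hashlib


def is_held_out(trace_id: str, holdout_percent: int = 10) -> bool:
    """Deterministically assign a trace to the held-out split by its ID."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < holdout_percent


def split_corpus(trace_ids, holdout_percent: int = 10):
    """Return (train, held_out) lists; the split depends only on each ID."""
    train, held_out = [], []
    for trace_id in trace_ids:
        if is_held_out(trace_id, holdout_percent):
            held_out.append(trace_id)
        else:
            train.append(trace_id)
    return train, held_out
```

Because the assignment depends only on the trace ID, a trace held out in one release stays held out in the next, so re-fitting an adapter on the expanded corpus cannot absorb it into training.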

What composition buys at decision time

At 11:28, when the clinician is reading a 12-lead ECG and asking for a differential, the platform does not consult a generic chatbot and pray. It consults the routing classifier first; the classifier identifies the query as a rhythm interpretation; the rhythm specialty adapter loads; the core reasoning adapter is already loaded; the structured ranked differential renders with the substrate-grade abductive chain visible to the clinician. The latency budget is bounded by what the work actually needs.

At 14:12, when the WC physician is composing a PR-2 report, the platform routes to the OM regulatory adapter. The forensic posture comes online. The §3600 causation framework loads. The MTUS treatment-guideline alignment runs. The structured report drafts; the physician edits and signs. None of this pushed the query through a clinical-reasoning configuration it did not need. The composition keeps each query in its right cognitive lane.

Why this commitment is durable

The market will keep producing bigger and bigger general-purpose models, and the general-purpose models will keep getting better at clinical work. We will benefit from that: the base Phi-4 weights ride the open-weight roadmap. What we will not do is let the base eat the architecture. The composition holds because the cognitive work in clinical practice is not uniform. Differential diagnosis is not radiology second-look, and neither one is WC causation analysis. A platform that pretends those are the same problem produces the same fluent-sounding wrong answer to all of them.

The architecture matches the cognition. The reasoning the platform produces holds up at decision time because we designed it to.

The full F5/reasoner architecture is described on the technology page. The composition between the core reasoner and the specialty adapters is in the F5 architecture whitepaper.