Keelpilot

Safety architecture

The LLM cannot sign a trade.

Four architectural firewalls keep language models out of the decision path. Here is how each one works, and why it matters when your CRO reads this page.

Firewall 01

Deterministic math, narrow language-model role

Weights, risk figures, and portfolio constraints are produced by deterministic code — not by a model’s next-token prediction. Language models are confined to narrow, auditable roles: articulating rationale, summarising context, translating between representations.

Numbers that bind are traceable to inputs and to code. Nothing about the weights shown in the IC memo originates in a probabilistic head.

Why it matters. You can reproduce every number shown to the Investment Committee. There is no “the model decided.” There is math, and a person who chose to run it.
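
The split can be sketched in a few lines. This is a minimal illustration, not Keelpilot's actual model: the weighting rule here is a hypothetical inverse-volatility scheme, and `rationale_stub` stands in for the narrow language-model role. The point is structural — binding numbers come from a deterministic function, and the text layer never produces or alters them.

```python
def inverse_volatility_weights(vols: dict[str, float]) -> dict[str, float]:
    """Deterministic: the same inputs always reproduce the same weights."""
    inverse = {asset: 1.0 / vol for asset, vol in vols.items()}
    total = sum(inverse.values())
    return {asset: inv / total for asset, inv in inverse.items()}

def rationale_stub(weights: dict[str, float]) -> str:
    # Stand-in for the confined LLM role: it phrases the rationale,
    # but every number it mentions is traceable to the function above.
    return "Weights tilt toward lower-volatility assets."

weights = inverse_volatility_weights({"AAA": 0.10, "BBB": 0.20})
```

Run it twice with the same inputs and the weights match exactly — that reproducibility is what the Investment Committee audit relies on.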

Firewall 02

Peer-review firewall

A dedicated reviewer layer inspects each proposal. Its outputs never reach the decision maker directly. Dissents are logged, surfaced, and archived with the decision record — but the decider sees a governed summary, not raw reviewer text.

Reviewers challenge. They do not steer.

Why it matters. You get the benefit of independent challenge without the risk of a persuasive reviewer pulling an outcome off-piste. Every dissent is preserved and testable in audit.
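
A minimal sketch of the governed-summary pattern, with hypothetical names (`DecisionRecord`, `governed_summary` are illustrative, not Keelpilot's API): dissents are archived verbatim with the decision record, but the decider receives only a count and flag, never raw reviewer text.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionRecord:
    proposal: str
    dissents: list[str] = field(default_factory=list)  # archived verbatim for audit

def governed_summary(record: DecisionRecord) -> str:
    # The decider sees that dissent exists, not the reviewer's persuasive prose.
    n = len(record.dissents)
    if n == 0:
        return f"{record.proposal}: no dissents"
    return f"{record.proposal}: {n} dissent(s) on file"

rec = DecisionRecord("Rebalance Q3")
rec.dissents.append("Concentration risk in sector X exceeds mandate.")
summary = governed_summary(rec)
```

The raw dissent text survives in the archive and is testable in audit; it simply never flows to the decision maker unfiltered.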

Firewall 03

Schema-validated inter-agent boundaries

Every message between internal components is validated against a typed schema at the boundary. Malformed or out-of-bounds outputs are rejected before they propagate. There is no silent field drift, no loose stringly-typed contract between stages.

Prompt injection cannot leak through as structured data, because there is no path by which unstructured text becomes structured input without validation.

Why it matters. The system fails loudly at the boundary — not quietly two stages downstream where the corruption is hard to trace.
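
The boundary check can be sketched with a typed message and an explicit validator. The schema here (`StageMessage` with a symbol and a bounded weight) is a hypothetical example, not Keelpilot's wire format; the pattern is what matters — malformed input raises at the boundary instead of drifting downstream.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StageMessage:
    symbol: str
    weight: float  # contract: must lie in [0, 1]

def validate(raw: dict) -> StageMessage:
    """Reject malformed or out-of-bounds input before it propagates."""
    symbol = raw.get("symbol")
    if not isinstance(symbol, str):
        raise ValueError("symbol must be a string")
    weight = raw.get("weight")
    if not isinstance(weight, (int, float)) or not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be a number in [0, 1]")
    return StageMessage(symbol, float(weight))

msg = validate({"symbol": "AAA", "weight": 0.4})
```

A string smuggled into the `weight` field — the classic injection vector — fails loudly here, not two stages later.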

Firewall 04

Circuit breakers and bounded-retry gates

Retries are capped. Anomalies trip circuit breakers and return control to a human operator. Degraded states are explicit: the system announces it is degraded and halts, rather than quietly continuing on best-effort.

There are no silent retries, no optimistic completions, no exception-swallowed edge cases.

Why it matters. Failure modes are observable and contained, not creeping. Incident response has a clear handoff point — and an audit trail that ends at a named person, not at a stack trace.
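
A minimal sketch of the bounded-retry gate, with hypothetical names (`CircuitOpen`, `call_with_retries` are illustrative): attempts are capped, and when the cap is hit the system raises an explicit degraded-state signal rather than continuing on best-effort.

```python
class CircuitOpen(Exception):
    """Breaker tripped: the system halts and control returns to a human operator."""

def call_with_retries(fn, max_attempts: int = 3):
    # Retries are capped and counted; there is no silent extra attempt.
    last_error = None
    for _ in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
    raise CircuitOpen(f"halted after {max_attempts} attempts: {last_error}")
```

The `CircuitOpen` exception is the handoff point: it carries the failure context, and whoever catches it is a named operator, not a retry loop.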

Autonomy posture

Capped at L4. Not L5.

Human sign-off is contractual, not a toggle. We do not ship a level of autonomy in which the system commits on its own — not now, and not on roadmap. If a vendor tells you otherwise, ask them who signs the memo.

  L0 · Manual

    Human produces every decision.

  L1 · Assistive

    Tooling accelerates analysis; human decides.

  L2 · Partial automation

    Routine paths automated; human reviews outcomes.

  L3 · Conditional automation

    System executes within guardrails; human approves scope.

  L4 · High automation — our cap

    System proposes, dissents, drafts the memo, and surfaces the decision for a human to sign. Human sign-off is contractual.

  L5 · Full autonomy — not on offer

    The system signs. Not available, not on roadmap. We do not ship this level to any customer, at any price.

Pilot posture

Paper trade by default.

Pilots begin in paper-trade mode. Committed allocations require your sign-off. Your Investment Policy Statement is the ground truth, not ours. We configure to it — we do not reinterpret it.

Review with your team

If your CRO is reading this page, send us a note. Demos are delivered live by a principal, under NDA, within one business day.