# GaugeWright Workbench — Architecture & Security

*An enterprise architecture and security reference for engineers, security
reviewers, and compliance teams evaluating GaugeWright. Written to be verified:
every claim is grounded in the product repository, and capability claims carry
an honest status. For a short summary, see the [security overview](brief.html);
for a control-by-control crosswalk, see the [CAIQ-style appendix](caiq.html).*

**Status legend** — **Available**: in the shipped product today · **Built**:
implemented and tested in code, not yet operationally deployed · **Planned**:
committed, not yet built · **Not implemented**: absent today.

**Frameworks referenced:** SOC 2 (AICPA TSC), ISO/IEC 27001, NIST CSF 2.0 /
SP 800-53, NIST SSDF (SP 800-218), NIST AI RMF (+ GenAI Profile), OWASP Top 10
for LLM Applications (2025), MITRE ATLAS, SLSA / SBOM, CSA CCM/CAIQ.

---

## 1. Executive trust summary

GaugeWright runs an expert's **method** (an AI agent — instructions, skills,
tools) against a client's **private context** (their data) under an enforced
**boundary**, so neither the method nor the data leaks to the other party or to
the runtime. The boundary is the product; its guarantees are **structural** —
expressed as invariants that are **machine-checked** (42 formal models, each with
an adversarial "teeth" test), not as policy that can be misconfigured.

What a reviewer should take away up front:

- **Shipped today** is a single-party **desktop workbench**: local orchestration,
  encrypted local storage, kernel-enforced method isolation, an append-only
  audit log. — *Available*
- **Inference is remote.** The agent's reasoning calls the third-party LLM
  provider you configure; your prompts and the in-scope context are sent to that
  provider over the network. The model provider is in the trust boundary today.
  — *Available; confidential inference Planned*
- **Hosted, relayed, attested, and enterprise-identity** capabilities range from
  code-complete-and-tested to design-only, and are **not operationally
  available**. — *Built / Planned*
- **No third-party attestations yet.** No SOC 2, ISO 27001, or penetration test.
  SOC 2 Type II and a SAML-scoped penetration test are committed and prioritized;
  ISO 27001 has no committed roadmap (§12). — *Planned*

The security *model* is verified; operational *readiness* for a regulated
deployment is in progress, and this document states exactly where each piece is.

## 2. System context & stakeholders

The trust model has four principals and two external dependencies:

| Principal | Role | Trusts |
|---|---|---|
| **Method-owner** (consultant/expert) | Builds and packages the agent | The boundary not to leak their method to the client |
| **Context-owner** (client) | Provides private data | The boundary not to leak their data to the method-owner or runtime |
| **Runtime / operator** (local machine today; hosted later) | Executes the agent | Is *not* trusted with payload — handles convey no access |
| **End-user** (public-hosting mode) | An identified principal *inside* a consultant's authority, not a federation authority | The consultant's scoping |
| **LLM provider** (external) | Performs inference | **In the trust boundary today** (sees prompts + context) |
| **Relay** (external, federation) | Routes encrypted bytes between authorities | Cannot read payload (`INV-14`) |

```
        ┌─────────────────────── local trust boundary ───────────────────────┐
        │                                                                     │
  method│   ┌───────────┐      admitted work       ┌──────────────────┐       │
  owner ├──▶│  Workbench │─────(handles + basis)───▶│  Boundary /      │       │
        │   │  (Tauri)  │                           │  sandbox (Pi)    │       │
 context│   └───────────┘                           └────────┬─────────┘       │
  owner ─▶  records · event log · content store              │                │
        │                                                     │ prompts +      │
        └─────────────────────────────────────────────────── │ context ───────┘
                                                              ▼  (egress, plaintext)
                                                   ┌────────────────────┐
                                                   │   LLM PROVIDER     │  ◀ in TCB today
                                                   │  (OpenAI/Anthropic │
                                                   │   /Azure OpenAI)   │
                                                   └────────────────────┘
```

*Authority & scope (`INV-1`):* every durable fact names the authority responsible
for it and the scope it touches; no authority writes outside its scope. Projects
and tenants are isolated by scope, not by row filters. — *Available*
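
The scope rule can be sketched as a check that runs before any fact is written. This is a minimal illustration, assuming scopes are slash-separated paths and that an authority owns one scope prefix; `Authority`, `Fact`, `WriteError`, and `admit_write` are hypothetical names, not taken from the repository.

```rust
/// Hypothetical illustration of INV-1; these types are not from the GaugeWright codebase.
#[derive(Debug, Clone, PartialEq)]
pub struct Authority {
    pub id: String,
    /// Scope this authority is responsible for, e.g. "tenant-a/project-1".
    pub scope: String,
}

/// A durable fact always names its authority and the scope it touches.
#[derive(Debug, Clone, PartialEq)]
pub struct Fact {
    pub authority_id: String,
    pub scope: String,
    pub body: String,
}

#[derive(Debug, PartialEq)]
pub enum WriteError {
    OutOfScope { authority: String, attempted: String },
}

/// True when `target` equals `owned` or is nested beneath it.
/// Comparing whole segments keeps "tenant-a" from matching "tenant-ab".
fn scope_contains(owned: &str, target: &str) -> bool {
    let mut owned_parts = owned.split('/');
    let mut target_parts = target.split('/');
    loop {
        match (owned_parts.next(), target_parts.next()) {
            (None, _) => return true,        // every owned segment matched
            (Some(_), None) => return false, // target is shallower than owned
            (Some(o), Some(t)) if o == t => continue,
            _ => return false,               // segments differ
        }
    }
}

/// Refuses the write unless the target scope lies inside the authority's scope.
pub fn admit_write(authority: &Authority, scope: &str, body: &str) -> Result<Fact, WriteError> {
    if !scope_contains(&authority.scope, scope) {
        return Err(WriteError::OutOfScope {
            authority: authority.id.clone(),
            attempted: scope.to_string(),
        });
    }
    Ok(Fact {
        authority_id: authority.id.clone(),
        scope: scope.to_string(),
        body: body.to_string(),
    })
}
```

The point of the sketch is that isolation falls out of where a write is allowed to land, rather than from a filter applied when rows are read back.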

## 3. Architecture & deployment views

**Components.** A Rust core (pure, event-sourced reducers) + an application layer
(stores, identity, crypto, networking) + a Tauri desktop shell + the **Pi** LLM
runtime spawned as a sandboxed subprocess (`--mode rpc`) + a Node sidecar for
SAML XML-dsig verification. The Rust backend is a Cargo workspace
(`crates/core`, `store`, `workspace`, `boundary`, `pi-bridge`, `app`).
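
For readers unfamiliar with the pattern, a pure event-sourced reducer can be sketched as follows. The `Event` and `State` shapes here are invented for illustration and are much simpler than anything in `crates/core`; the sketch only shows why replaying the same log always yields the same state.

```rust
/// Hypothetical events; the real event vocabulary lives in the product repository.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    RunAdmitted { run_id: u64 },
    RunCompleted { run_id: u64 },
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct State {
    pub open_runs: Vec<u64>,
    pub completed: u64,
}

/// Pure reducer: no I/O, no clock, no randomness, so the result depends
/// only on the previous state and the event.
pub fn reduce(mut state: State, event: &Event) -> State {
    match event {
        Event::RunAdmitted { run_id } => {
            // Admitting the same run twice has no additional effect.
            if !state.open_runs.contains(run_id) {
                state.open_runs.push(*run_id);
            }
        }
        Event::RunCompleted { run_id } => {
            let before = state.open_runs.len();
            state.open_runs.retain(|r| r != run_id);
            // Count a completion only if the run was actually open.
            if state.open_runs.len() < before {
                state.completed += 1;
            }
        }
    }
    state
}

/// Rebuilds state from scratch by folding the ordered log through the reducer.
pub fn replay(log: &[Event]) -> State {
    log.iter().fold(State::default(), reduce)
}
```

Because `reduce` is deterministic, any projection built this way can be discarded and rebuilt from the log, which is the property the append-only, ordered, replayable invariants rely on.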

**Deployment modes** — the *same* protection model applies in every mode;
governance is added, not re-architected:

| Mode | Description | Status |
|---|---|---|
| **Local desktop** | Orchestration + storage on your machine; federation opt-in; **inference calls your configured LLM provider** | **Available** |
| **Multi-authority federation** | Expert and client collaborate across machines; cert-pinned TLS; relay routes opaque bytes only | **Available** (loopback + NAT-isolated CI harness) |
| **Hosted multi-tenant** | Cloud-hosted relay + compute for consultants' deployments | **Planned** (needs infra) |
| **Attested compute** | AMD SEV-SNP confidential VM + Azure Key Vault Secure Key Release; both parties verify the measurement | **Built** verifier; **Planned** live host |
| **Public hosting / embed** | Browser-embeddable agent for end-users, per-session isolation, origin allowlist + budget caps | **Planned** |

## 4. Data: classification, flow, residency, retention

**Data model (`ADR 0007`).** Four kinds, kept distinct: **Records** (durable
declarations — metadata, versions, membership), **Streams** (append-only
events/commands/observations; events are product truth), **Content** (protected
payload — prompts, tool results, outputs, transcripts — addressed by **handle**
only, `INV-10`), and **Projections** (rebuildable views, never authority,
`INV-5`). — *Available*
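
The "handle only" rule for Content can be illustrated with a small store in which holding a handle is never enough to read payload. This is a sketch under assumptions: `ContentStore`, `Handle`, `Denied`, and the grant table are hypothetical, and the real access-basis evaluation is richer than a set lookup.

```rust
use std::collections::{HashMap, HashSet};

/// A handle is only a name for content; it carries no read right (cf. INV-10).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Handle(pub String);

#[derive(Debug, PartialEq)]
pub enum Denied {
    NoBasis,
    UnknownHandle,
}

/// Hypothetical store: payloads and access grants are kept separately.
#[derive(Default)]
pub struct ContentStore {
    payloads: HashMap<Handle, Vec<u8>>,
    /// (principal, handle) pairs for which an explicit access basis exists.
    grants: HashSet<(String, Handle)>,
}

impl ContentStore {
    pub fn put(&mut self, handle: Handle, payload: Vec<u8>) {
        self.payloads.insert(handle, payload);
    }

    pub fn grant(&mut self, principal: &str, handle: &Handle) {
        self.grants.insert((principal.to_string(), handle.clone()));
    }

    /// Fail-closed read: the basis is checked first, so a caller without one
    /// learns nothing, not even whether the handle exists.
    pub fn read(&self, principal: &str, handle: &Handle) -> Result<&[u8], Denied> {
        if !self.grants.contains(&(principal.to_string(), handle.clone())) {
            return Err(Denied::NoBasis);
        }
        self.payloads
            .get(handle)
            .map(|p| p.as_slice())
            .ok_or(Denied::UnknownHandle)
    }
}
```

Under this shape, handles can appear freely in records, events, and audit entries without those structures becoming a path to the payload.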

**Where data lives.** An append-only **SQLite** event log + a **git-backed content
store**. Single-machine by default; multi-machine via relay; hosted/attested is
roadmap. — *Available (local)*

**Data-flow during a run (trust-boundary crossings):**

```
 client context (handle)
        │  access basis evaluated at boundary (INV-10/12, fail-closed INV-20)
        ▼
 ┌──────────────┐    admitted work     ┌────────────────────┐   prompts+context
 │  boundary    │────(no ambient ──────▶│  Pi runtime        │──────(plaintext)─────▶  LLM PROVIDER
 │  admission   │      authority)       │  (OS-sandboxed)    │                          (external, in TCB)
 └──────────────┘                       └─────────┬──────────┘
   egress OPEN by default;                        │ output
   isolation opt-in per project                   ▼
   (network_isolated, off)         review lifecycle ──▶ release only to
                                           (derived-output SOUND_RELEASE)   authorized stakeholders (INV-13/22)
```

The **single highest-scrutiny flow** is the egress to the LLM provider: prompts
and in-scope context leave in plaintext today (OWASP LLM02). By product default a
project's **network egress is open** (a run can reach the model out of the box);
an operator **opts into per-project network isolation** (`network_isolated`,
default off), which restores kernel-enforced network containment. There is no
per-host model-endpoint allowlist proxy yet, so isolation today is all-or-nothing
(no egress vs. open egress), not a filtered allowlist. — *Available*
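
The all-or-nothing behavior described above can be made concrete with a small sketch. Only the `network_isolated` flag name and its default come from this document; `ProjectNetworkPolicy`, `Egress`, and `egress_for` are hypothetical, and real enforcement happens in the kernel sandbox rather than in application code like this.

```rust
/// Hypothetical per-project policy mirroring the `network_isolated` flag.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct ProjectNetworkPolicy {
    /// Off by default: a new project can reach the network.
    pub network_isolated: bool,
}

#[derive(Debug, PartialEq)]
pub enum Egress {
    Open,
    Blocked,
}

impl ProjectNetworkPolicy {
    /// The host is deliberately ignored: with no allowlist proxy, the decision
    /// is the same for every destination, including the model endpoint.
    pub fn egress_for(&self, _host: &str) -> Egress {
        if self.network_isolated {
            Egress::Blocked
        } else {
            Egress::Open
        }
    }
}
```

A reviewer can read two consequences off this shape: an isolated project cannot reach the LLM provider at all, and a non-isolated project can reach any host, not just the provider.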

**Retention & erasure (`ADR 0008`).** **Revocation** stops future use without
rewriting the past (`INV-18`); **erasure** tombstones content payload while
preserving audit handles/metadata. Modeled (`content-erasure.qnt`, teeth
`CANT_UNERASE`) and implemented in the reducer. — *Available* (model + reducer);
bulk/admin erasure UI and a GDPR DPA are *Planned*.
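
The difference between revocation and erasure can be sketched as two events applied to one content entry. The types below are hypothetical, and reading `CANT_UNERASE` as "a later write cannot restore an erased payload" is an assumption made for this illustration, not a statement of what the Quint model checks.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Payload {
    Present(Vec<u8>),
    /// Tombstone: the bytes are gone, the entry remains.
    Erased,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ContentEntry {
    pub handle: String,
    pub revoked: bool,
    pub payload: Payload,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ContentEvent {
    Revoked,
    Erased,
    PayloadWritten(Vec<u8>),
}

/// Hypothetical reducer step for one content entry.
pub fn apply(mut entry: ContentEntry, event: &ContentEvent) -> ContentEntry {
    match event {
        // Revocation blocks future use but leaves the stored payload untouched.
        ContentEvent::Revoked => entry.revoked = true,
        // Erasure drops the payload; handle and metadata stay for audit.
        ContentEvent::Erased => entry.payload = Payload::Erased,
        // Assumed reading of the teeth probe: an erased entry stays erased.
        ContentEvent::PayloadWritten(bytes) => {
            if entry.payload != Payload::Erased {
                entry.payload = Payload::Present(bytes.clone());
            }
        }
    }
    entry
}

/// Content is usable only if it is neither revoked nor erased.
pub fn usable(entry: &ContentEntry) -> bool {
    !entry.revoked && matches!(entry.payload, Payload::Present(_))
}
```

Both operations leave the handle in place, so earlier audit entries that reference it remain meaningful after the payload is gone.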


**Residency.** Local-only today; hosted residency options are on the roadmap.

## 5. Security architecture → control families (the invariant crosswalk)

The differentiator: each protection invariant maps to a recognized control
family **and** to a machine-checked proof. This turns assertions into evidence.

| Invariant | Guarantee | Control family (SOC 2 / ISO 27001 / NIST) | Proof model · teeth |
|---|---|---|---|
| **INV-10** Handles don't grant access | Reference conveys no payload read | Confidentiality (CC6.1) · A.8.3 · AC-3/AC-24 | `boundary.qnt` SAFE_EGRESS · `EGRESS_LEAKED…` |
| **INV-12** Method & context reads both explicit | No implicit read inside the boundary | Confidentiality · Least privilege (AC-6) | `boundary.qnt` · `METHOD_HIDDEN_FROM_B` |
| **INV-13** Cross-authority needs source + target | Nothing crosses without both sides | Confidentiality · ZTA (800-207) | `federation.qnt` · `SIGNATURE_FORGERY` |
| **INV-14** Relays aren't payload authorities | Routing ≠ reading | Confidentiality | `federation.qnt` · `RELAY_READS_PAYLOAD` |
| **INV-22** Confidentiality goal | Protected payload never reaches a non-stakeholder w/o basis | SOC 2 Confidentiality | `engagement-taint.qnt` SOUND · `UNSOUND_RELEASE` |
| **INV-11** Execution consumes admitted work | No ambient authority | Least privilege (AC-6) · ZTA | `run-admission.qnt` · `SKIP_ADMISSION` |
| **INV-20** Fail-closed | Uncertainty denies | Access control (CC6.x) | `fail-closed.qnt` · `FAIL_OPEN` |
| **INV-24** Method is edit-authored | Agent can't rewrite its own method | Integrity · Change management (CC8.1) | `method-integrity.qnt` · `USE_WRITE_LEAKS` |
| **INV-6/7/8** Append-only, ordered, replayable | Tamper-resistant audit; deterministic state | Processing integrity · Audit (CC7.x) · AU-9 | event-store semantics |
| **INV-19** Idempotent admission | Effect applied at most once | Processing integrity | `idempotency.qnt` · `NO_DEDUP` |

CI enforces both that each invariant *holds* (`quint run --invariant`) and that
its teeth test *bites*: with the adversarial probe switched on, the same check
must fail. — *Available*
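
The holds-and-bites idea can be shown with a toy state-space search. This is a Rust analogy only: the real models are written in Quint and checked by the Quint tooling, and the two-flag `ModelState`, the `probe_skip_admission` switch, and `holds` are invented here to mirror the `SKIP_ADMISSION` probe named in the table.

```rust
use std::collections::{HashSet, VecDeque};

/// Toy model state: has work been admitted, and is a run executing?
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ModelState {
    pub admitted: bool,
    pub running: bool,
}

/// Successor states. The probe enables an adversarial transition that
/// starts a run without admission.
pub fn next_states(s: ModelState, probe_skip_admission: bool) -> Vec<ModelState> {
    let mut out = vec![ModelState { admitted: true, ..s }]; // admit work
    if s.admitted || probe_skip_admission {
        out.push(ModelState { running: true, ..s }); // start a run
    }
    out
}

/// Invariant in the spirit of INV-11: a run implies admitted work.
pub fn invariant(s: ModelState) -> bool {
    !s.running || s.admitted
}

/// Breadth-first search over reachable states; false on the first violation.
pub fn holds(probe_skip_admission: bool) -> bool {
    let init = ModelState { admitted: false, running: false };
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    seen.insert(init);
    queue.push_back(init);
    while let Some(s) = queue.pop_front() {
        if !invariant(s) {
            return false;
        }
        for n in next_states(s, probe_skip_admission) {
            if seen.insert(n) {
                queue.push_back(n);
            }
        }
    }
    true
}
```

A gate built on this would require `holds(false)` to be true and `holds(true)` to be false. The second condition guards against an invariant that passes only because it is too weak to detect the attack.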

## 6. Identity & access management

- **OIDC** — id-token verifier (JWKS signature, `iss`/`aud`/`exp`/`nbf`, claim →
  authority mapping), fail-closed; verified per-commit against a self-hosted
  Keycloak in CI. — *Available (verifier)*
- **SAML 2.0** — verification delegated to a hardened Node sidecar
  (`@node-saml/node-saml`) behind the same seam; **single-use assertion** replay
  defense (assertion id + `NotOnOrAfter` enforced). — *Available (verifier)*
- **SCIM** — inbound provisioning/de-provisioning endpoint. — *Built*; outbound
  sync *Planned*.
- **RBAC/ABAC** — role assignments + an ABAC policy evaluator. — *Built*; admin
  console *Planned*.
- **MFA** — *Not implemented* in the product (org-level enforcement is roadmap).
- **Build-vs-buy** — SSO and SCIM are built in-house, with no third-party broker
  in the auth path (`ADR 0056`); assurance for that choice is meant to come from
  SOC 2 and a SAML-scoped penetration test, both *Planned*.

Live interop with specific IdP vendors (Okta, Entra, Google) and deploy-time
secret wiring are *Planned*. The shipped desktop product is local and needs no
account.
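
The single-use assertion defense can be sketched as a small replay cache keyed by assertion id. This is an illustration under assumptions: the shipped check lives behind the Node sidecar seam, and `ReplayGuard`, `Reject`, and the use of Unix seconds are hypothetical choices for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Reject {
    Expired,
    Replayed,
}

/// Hypothetical replay cache: assertion id mapped to its NotOnOrAfter time.
#[derive(Default)]
pub struct ReplayGuard {
    seen: HashMap<String, u64>,
}

impl ReplayGuard {
    /// Accepts an assertion at most once, and only inside its validity window.
    /// Times are Unix seconds supplied by the caller so the logic stays testable.
    pub fn accept(
        &mut self,
        assertion_id: &str,
        not_on_or_after: u64,
        now: u64,
    ) -> Result<(), Reject> {
        // Forget ids whose window has closed; the expiry check below still
        // rejects them, so the cache stays bounded without reopening replay.
        self.seen.retain(|_, expiry| *expiry > now);

        if now >= not_on_or_after {
            return Err(Reject::Expired);
        }
        if self.seen.contains_key(assertion_id) {
            return Err(Reject::Replayed);
        }
        self.seen.insert(assertion_id.to_string(), not_on_or_after);
        Ok(())
    }
}
```

Pairing the id cache with `NotOnOrAfter` is what lets old entries be evicted safely: once an assertion has expired, the time check alone rejects it.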

## 7. Threat model

Methodology: **STRIDE** for the platform, **OWASP LLM Top 10 (2025)** and **MITRE
ATLAS** for the AI surface.

**STRIDE → mitigation:**

| Threat | Mitigation | Status |
|---|---|---|
| **Spoofing** | Actor authenticity verified before admission (`INV-21`); OIDC id-token and SAML assertion verification; signed governance envelopes | Available |
| **Tampering** | Append-only immutable events (`INV-6`); AEAD at rest; method surface read-only at kernel (`INV-24`) | Available |
| **Repudiation** | Per-actor append-only audit log; SIEM export. *Cryptographic non-repudiation not yet shipped* | Available / Planned |
| **Information disclosure** | Handles don't grant access (`INV-10`); both reads explicit (`INV-12`); opt-in per-project network isolation (egress open by default); relay opacity (`INV-14`) | Available |
| **Denial of service** | Idempotent admission (`INV-19`); entitlement/budget caps for paid runs. *Platform-level rate limiting and quotas are limited* | Available / Partial |
| **Elevation of privilege** | No ambient authority (`INV-11`); retries can't widen scope (`INV-17`); fail-closed (`INV-20`); kernel sandbox | Available |

**OWASP LLM Top 10 (2025) → posture:**

| Risk | Posture | Status |
|---|---|---|
| **LLM01 Prompt injection** | Agent has no ambient authority; acts only on admitted work; tool calls gated at the boundary; egress open by default with opt-in per-project network isolation | Available |
| **LLM02 Sensitive-info disclosure** | Egress is the known exposure: prompts + context reach the LLM provider in plaintext. Disclosed, not hidden; confidential inference Planned | Available (disclosed) |
| **LLM06 Excessive agency** | Execution consumes admitted work only (`INV-11`); kernel sandbox bounds tool/file/network reach | Available |
| **LLM07 System-prompt leakage** | Method definition runs read-only, kernel-enforced; a work chat cannot read/rewrite the definition to exfiltrate it (`INV-24`) | Available (Linux/macOS) |
| **LLM05 Improper output handling** | Output review lifecycle (`SOUND_RELEASE`); release gated on stakeholder taint | Built |
| **LLM10 Unbounded consumption** | Entitlement gate + per-engagement budget caps (paid/embed) | Built |
| **LLM03 Supply chain** | Pinned `Cargo.lock`; reproducible image + measurement digest. *No SBOM / dependency-CVE scanning yet* | Partial |
| **LLM04 Data/model poisoning · LLM08 Vector weaknesses · LLM09 Misinformation** | Out of current scope (no first-party training/RAG/fact-checking) | Not implemented |

**MITRE ATLAS:** the sandbox + admitted-work model addresses ML-model
exfiltration and unauthorized-use tactics; model extraction via the provider
remains a residual risk until confidential inference ships.

## 8. Cryptography & key management

- **At rest** — **AES-256-GCM** (AEAD via the `ring` crate; no OpenSSL),
  `nonce(12)‖ciphertext‖tag(16)`, fresh nonce per call; tamper/wrong-key fails the
  auth tag. — *Available*
- **Key management** — **envelope encryption**: a 256-bit data key (DEK) wrapped
  by a KMS key-encryption key (KEK) in **Azure Key Vault**, via a `KeyWrap` seam;
  KMS integration verified live (`keyvault_live`). — *Built* (local encryptor
  Available; live KMS deployment needs a service principal).
- **Signatures** — real **P-256 ECDSA** (`p256`/`ecdsa`, pure-Rust) for governance
  envelopes and attestation reports. — *Available*
- **Attestation** — real **AMD SEV-SNP** quote verifier: parses the report,
  validates the **ARK→ASK→VCEK** chain and the **ECDSA-P384/SHA-384** signature,
  checks `report_data` freshness, and consults a measurement allow-list
  (`MeasurementStore`). Tested against **real Milan vectors** on the green path.
  — *Built*; fresh-quote generation needs a confidential VM (*Planned*).
- **TLS** — cert-pinned egress seam for attested model endpoints
  (`pinned_tls.rs`). — *Planned* (live pinning ships with confidential inference).
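
The at-rest layout `nonce(12)‖ciphertext‖tag(16)` can be illustrated by splitting a sealed blob into its three parts. This sketch shows framing only and performs no cryptography; in the product the AEAD operation is done with the `ring` crate, and `SealedFrame`, `FrameError`, and `split_frame` are hypothetical names.

```rust
pub const NONCE_LEN: usize = 12;
pub const TAG_LEN: usize = 16;

/// Borrowed view over one sealed blob; nothing is copied or decrypted here.
#[derive(Debug, PartialEq)]
pub struct SealedFrame<'a> {
    pub nonce: &'a [u8],
    pub ciphertext: &'a [u8],
    pub tag: &'a [u8],
}

#[derive(Debug, PartialEq)]
pub enum FrameError {
    TooShort { len: usize },
}

/// Splits `nonce(12) || ciphertext || tag(16)` into its parts.
/// A blob shorter than nonce + tag cannot be a valid frame and is rejected.
pub fn split_frame(blob: &[u8]) -> Result<SealedFrame<'_>, FrameError> {
    if blob.len() < NONCE_LEN + TAG_LEN {
        return Err(FrameError::TooShort { len: blob.len() });
    }
    let (nonce, rest) = blob.split_at(NONCE_LEN);
    let (ciphertext, tag) = rest.split_at(rest.len() - TAG_LEN);
    Ok(SealedFrame { nonce, ciphertext, tag })
}
```

The fixed 28 bytes of overhead per sealed value follow directly from this layout, and a zero-length plaintext still produces a valid frame consisting of nonce and tag alone.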

## 9. Audit, logging, monitoring & incident response

- **Audit log** — per-actor, append-only entries `{actor, action, target}` in a
  reserved scope; references only, never payload (`INV-10`); position-ordered,
  filterable. — *Available*
- **SIEM export** — `HttpAuditSink` POSTs each entry as JSON to a
  customer-configured collector (Splunk/Datadog/webhook) with auth headers, over
  rustls. — *Available*
- **Tamper-evidence** — append-only is enforced *semantically* (immutable event
  log), **not yet cryptographically** (no signature or Merkle chain). Cross-party
  log non-repudiation is a *Planned* federation invariant.
- **Monitoring / alerting / incident response** — **Not implemented**; no
  production observability, runbooks, or IR procedures in tree today.
- **Uptime / SLA** — *Planned* (status page + SLA tracked, not built).
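
The audit-log shape described in the first bullet can be sketched as an in-memory log whose API offers append and read but no update or delete. This is an assumption-laden illustration: `AuditLog` and its methods are hypothetical, and, matching the tamper-evidence caveat above, nothing here is cryptographically chained.

```rust
/// One audit entry: who did what to which target, at which log position.
/// `target` holds a handle or other reference, never payload bytes.
#[derive(Debug, Clone, PartialEq)]
pub struct AuditEntry {
    pub position: u64,
    pub actor: String,
    pub action: String,
    pub target: String,
}

/// Hypothetical append-only log: the API has no way to edit or remove entries.
#[derive(Default)]
pub struct AuditLog {
    entries: Vec<AuditEntry>,
}

impl AuditLog {
    /// Appends an entry and returns its position in the log.
    pub fn append(&mut self, actor: &str, action: &str, target: &str) -> u64 {
        let position = self.entries.len() as u64;
        self.entries.push(AuditEntry {
            position,
            actor: actor.to_string(),
            action: action.to_string(),
            target: target.to_string(),
        });
        position
    }

    /// Entries for one actor, in log order.
    pub fn by_actor(&self, actor: &str) -> Vec<&AuditEntry> {
        self.entries
            .iter()
            .filter(|e| e.actor.as_str() == actor)
            .collect()
    }
}
```

Append-only here is a property of the interface, which is the "semantic" enforcement the document describes; anyone with raw storage access could still alter entries until a signature or hash chain is added.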

## 10. AI governance (NIST AI RMF)

Mapped to the AI RMF functions:

- **Govern** — AI use is bounded by the same authority/scope/admission model as
  everything else; method changes are edit-authored and audited (`INV-24`).
- **Map** — the LLM provider is named in the trust boundary; data-to-model flow is
  documented (§4). BYO-credentials (`ADR 0053`) lets the LLM relationship be the
  customer's own subprocessor. — *Built (core)*
- **Measure** — protection properties machine-checked (§13); no model-performance
  / bias / hallucination evaluation today (out of scope).
- **Manage** — output review lifecycle gates release; budget caps bound
  consumption; per-project network isolation is opt-in (egress open by default).

ISO/IEC 42001 (certifiable AI management system) is named as a future target, not
a current certification. — *Planned*

## 11. SDLC & software supply chain

**CI gates every push** (`.github/workflows/ci.yml`): `cargo fmt --check`,
`cargo clippy -- -D warnings`, `cargo test --workspace`, web typecheck + unit, the
SAML sidecar tests, `quint typecheck` + invariant/teeth model checks, and
`scripts/audit-gate.py` (a tracker cannot close a gate while any coverage row is
unfinished). Tier-1 adds a real networked relay + governance handshake; a Compose
lane exercises NAT-isolated federation. — *Available*
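
The coverage-gate rule can be sketched as a single function. The actual check is the Python script `scripts/audit-gate.py`, whose internals are not shown in this document; the Rust types below, and the choice to refuse an empty tracker, are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RowStatus {
    Done,
    InProgress,
    NotStarted,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CoverageRow {
    pub name: String,
    pub status: RowStatus,
}

/// Returns Ok(()) only when every coverage row is Done; otherwise returns
/// the names of the rows that block closing the gate.
pub fn can_close_gate(rows: &[CoverageRow]) -> Result<(), Vec<String>> {
    // Assumed fail-closed choice: a tracker with no rows proves nothing.
    if rows.is_empty() {
        return Err(vec!["<no coverage rows>".to_string()]);
    }
    let unfinished: Vec<String> = rows
        .iter()
        .filter(|r| r.status != RowStatus::Done)
        .map(|r| r.name.clone())
        .collect();
    if unfinished.is_empty() {
        Ok(())
    } else {
        Err(unfinished)
    }
}
```

Returning the blocking rows rather than a bare boolean makes a CI failure message actionable without opening the tracker.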

**Mapped to NIST SSDF (SP 800-218):** version control + `Cargo.lock` (PO1.1);
fmt/clippy enforced (PO2.1, PS2.1); provenance via reproducible image + digest
(PO2.2, *Built*); property/fuzz-style proptests (PS3.1); coverage-gate review
records (PS4.1).

**Honest supply-chain gaps:**

| Expectation | State |
|---|---|
| Pinned dependency lockfile | **Available** (`Cargo.lock`) |
| Reproducible build + measurement digest | **Built** (`flake.nix`, `image-digest.yml`) |
| SBOM (CycloneDX/SPDX) | **Not implemented** |
| SLSA provenance attestation | **Not implemented** |
| Dependency CVE scanning (`cargo audit` / Dependabot) | **Not implemented** |
| SAST / secret scanning in CI | **Not implemented** |
| Code signing / notarization of desktop builds | **Not implemented** (deferred; builds are unsigned) |

These are known, named gaps — not oversights — and are the most actionable
near-term hardening items.

## 12. Compliance posture & roadmap

| Item | State |
|---|---|
| SOC 2 Type II | **Planned** (highest priority; the trust source for own-built SSO) |
| ISO 27001 | **Not implemented** (no roadmap committed) |
| Penetration test (SAML-scoped) | **Planned** |
| DPA + published subprocessor list | **Planned** |
| GDPR right-to-erasure | **Built** (erasure model); DPA *Planned* |

**Subprocessors** (see also the [CAIQ appendix](caiq.html)): your configured LLM
provider(s); Microsoft Azure (Key Vault, confidential VM — for Built/Planned
modes); Stripe (billing — *Planned*); customer-operated IdP. With BYO-credentials,
the LLM provider is the *customer's* subprocessor.

## 13. Assurance evidence

- **42 Quint formal models** in `specs/models/`, covering the protection
  invariants; CI verifies each invariant holds and each adversarial "teeth" probe
  bites. — *Available*
- **Property tests** in the pure core (`crates/core`) tie the Rust reducers to the
  models (run admission, boundary, revocation, idempotency, federation, …). —
  *Available*
- **Live verifications** in CI/tooling: Keycloak OIDC per-commit; Azure Key Vault
  KMS wrap/unwrap; real SEV-SNP Milan attestation vectors; NAT-isolated
  federation. — *Available / Built*
- **No independent third-party penetration test or audit yet.** — *Planned*

---

*Security contact:* jack@gaugewright.com · Source & formal specs:
github.com/jamesjscully/un-tie · Reviewed against spec rev 2026-06.

**Change log** — 2026-06: first enterprise edition (structure per arc42 + a
security/AI overlay; control crosswalk; STRIDE + OWASP LLM Top 10 threat model;
honest supply-chain and compliance gaps).
