GaugeWright Workbench — Architecture & Security

An enterprise architecture and security reference for engineers, security reviewers, and compliance teams. Written to be verified: every claim is grounded in the product repository, and capability claims carry an honest status. For a short summary see the security overview; for a control-by-control crosswalk see the CAIQ-style appendix.

Status legend: Available = in the shipped product today · Built = implemented and tested, not yet operationally deployed · Planned = committed, not built · Not implemented = absent today.

Frameworks referenced: SOC 2 (AICPA TSC), ISO/IEC 27001, NIST CSF 2.0 / SP 800-53, NIST SSDF (SP 800-218), NIST AI RMF (+ GenAI Profile), OWASP Top 10 for LLM Applications (2025), MITRE ATLAS, SLSA / SBOM, CSA CCM/CAIQ.

1. Executive trust summary

GaugeWright runs an expert's method (an AI agent — instructions, skills, tools) against a client's private context (their data) under an enforced boundary, so neither the method nor the data leaks to the other party or to the runtime. The boundary is the product; its guarantees are structural — expressed as invariants that are machine-checked (42 formal models, each with an adversarial "teeth" test), not as policy that can be misconfigured.

  • Shipped today is a single-party desktop workbench: local orchestration, encrypted local storage, kernel-enforced method isolation, an append-only audit log. Available
  • Inference is remote. The agent's reasoning calls the third-party LLM provider you configure; your prompts and the in-scope context are sent to that provider over the network. The model provider is in the trust boundary today. Available; confidential inference Planned
  • Hosted, relayed, attested, and enterprise-identity capabilities range from code-complete to design-only, and are not operationally available. Built / Planned
  • No third-party attestations yet — no SOC 2, ISO 27001, or penetration test. Committed and prioritized. Planned

2. System context & stakeholders

Principal | Role | Trusts
Method-owner | Builds and packages the agent | The boundary not to leak their method to the client
Context-owner (client) | Provides private data | The boundary not to leak their data to the method-owner or runtime
Runtime / operator | Executes the agent (local today; hosted later) | Is not trusted with payload; handles convey no access
End-user (public hosting) | Identified principal inside a consultant's authority | The consultant's scoping
LLM provider (external) | Performs inference | In the trust boundary today (sees prompts + context)
Relay (external) | Routes encrypted bytes between authorities | Cannot read payload (INV-14)
[Figure: system context diagram. Regions: local trust boundary; in TCB today. Nodes: Method-owner, Context-owner, Workbench (Tauri), boundary / sandbox, Pi runtime (admitted work only), stores (records, append-only event log, content store with content addressed by handle, INV-10), LLM provider (OpenAI / Anthropic / Azure). Labeled flows: admitted work (handles + basis); prompts + context (plaintext egress).]
System context. The local trust boundary contains the workbench, sandboxed runtime, and stores. The highest-scrutiny flow is the plaintext egress to the external LLM provider, which is in the trust boundary today.

Authority & scope (INV-1): every durable fact names the authority responsible for it and the scope it touches; no authority writes outside its scope. Projects and tenants are isolated by scope, not by row filters. Available
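The scope rule can be illustrated with a minimal Python sketch. It is not the product's Rust implementation: `Fact`, `ScopedStore`, `ScopeViolation`, and the shape of the grants table are hypothetical names chosen for illustration. It shows a write being refused when the authority has no grant for the scope, and storage partitioned per scope so that a read never passes over another scope's facts.

```python
from dataclasses import dataclass


class ScopeViolation(Exception):
    """Raised when an authority tries to write outside its granted scopes."""


@dataclass(frozen=True)
class Fact:
    authority: str  # who is responsible for this fact
    scope: str      # what the fact touches, e.g. "project/alpha"
    body: str


class ScopedStore:
    def __init__(self, grants):
        # grants maps authority -> scopes it may write (hypothetical shape).
        self._grants = {a: frozenset(s) for a, s in grants.items()}
        # One partition per scope: isolation is structural, not a query filter.
        self._partitions = {}

    def append(self, fact):
        allowed = self._grants.get(fact.authority, frozenset())
        if fact.scope not in allowed:
            raise ScopeViolation(f"{fact.authority} may not write {fact.scope}")
        self._partitions.setdefault(fact.scope, []).append(fact)

    def read(self, scope):
        # Only the named partition is ever touched.
        return list(self._partitions.get(scope, []))
```

An unknown authority has an empty grant set, so it is refused by the same check that refuses an out-of-scope write.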

3. Architecture & deployment views

Components. A Rust core (pure, event-sourced reducers) + an application layer (stores, identity, crypto, networking) + a Tauri desktop shell + the Pi LLM runtime spawned as a sandboxed subprocess (--mode rpc) + a Node sidecar for SAML XML-dsig verification. The Rust backend is a Cargo workspace (crates/core, store, workspace, boundary, pi-bridge, app).

Deployment mode | Description | Status
Local desktop | Orchestration + storage on your machine; federation opt-in; inference calls your configured LLM provider | Available
Multi-authority federation | Expert and client collaborate across machines; cert-pinned TLS; relay routes opaque bytes only | Available
Hosted multi-tenant | Cloud-hosted relay + compute for consultants' deployments | Planned
Attested compute | AMD SEV-SNP confidential VM + Azure Key Vault Secure Key Release; both parties verify the measurement | Verifier Built; live Planned
Public hosting / embed | Browser-embeddable agent for end-users; per-session isolation, origin allowlist + budget caps | Planned

4. Data: classification, flow, residency, retention

Data model (ADR 0007). Four kinds, kept distinct: Records (durable declarations), Streams (append-only events/commands/observations; events are product truth), Content (protected payload — prompts, tool results, outputs, transcripts — addressed by handle only, INV-10), and Projections (rebuildable views, never authority, INV-5). Available
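To make the split between Streams, Content, and Projections concrete, here is a small Python sketch under stated assumptions: the event kinds (`output_produced`, `output_released`), the `Event` fields, and the use of a SHA-256 digest as the handle format are all illustrative, not taken from the repository. Events carry handles rather than payloads, and the projection is a pure fold over the stream, so it can be discarded and rebuilt at any time.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    position: int  # order in the append-only stream
    kind: str      # hypothetical kinds: "output_produced", "output_released"
    handle: str    # reference to Content; never the payload itself


def content_handle(payload: bytes) -> str:
    """Content address; SHA-256 stands in for the product's real handle format."""
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def project_released(events):
    """Projection: the set of released handles, rebuilt by folding the stream."""
    produced, released = set(), set()
    for ev in sorted(events, key=lambda e: e.position):
        if ev.kind == "output_produced":
            produced.add(ev.handle)
        elif ev.kind == "output_released" and ev.handle in produced:
            released.add(ev.handle)
    return released
```

Because the fold orders events by position before applying them, the same stream always yields the same view, which is what lets a projection be treated as disposable rather than as a source of truth.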

Where data lives. An append-only SQLite event log + a git-backed content store. Single-machine by default; multi-machine via relay; hosted/attested is roadmap. Available (local)

[Figure: data flow during a run. Stages: client context (handle); boundary admission (basis check, fail-closed); Pi runtime (OS-sandboxed, no ambient auth); output → review → release to authorized stakeholders only (INV-13/22). Side flow: plaintext egress to the LLM provider (external).]
Data flow during a run. Context is admitted under a fail-closed basis; the sandboxed runtime has no ambient authority; egress to the LLM provider is plaintext (the known exposure); output release is gated to authorized stakeholders.
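The admission step in this flow can be sketched as follows. This is an illustrative Python model, not the product's code: `Basis`, `Denied`, `admit`, and `run` are hypothetical names, and the real basis check is certainly richer. The point is the shape: holding a handle grants nothing, every unclear case denies, and the runtime is only ever handed what was admitted.

```python
from dataclasses import dataclass
from typing import Optional


class Denied(Exception):
    """Admission refused; the default outcome whenever anything is unclear."""


@dataclass(frozen=True)
class Basis:
    authority: str       # who granted the read
    handles: frozenset   # exactly which content handles are covered
    revoked: bool = False


def admit(handle: str, basis: Optional[Basis]) -> str:
    """Return the handle as admitted work, or raise Denied."""
    if basis is None:
        raise Denied("no basis presented")  # a bare handle is not enough
    if basis.revoked:
        raise Denied("basis revoked")
    if handle not in basis.handles:
        raise Denied("handle not covered by basis")
    return handle


def run(admitted_handles, content_store):
    """The runtime resolves only the handles it was explicitly given."""
    return [content_store[h] for h in admitted_handles]
```

There is no code path in `admit` that returns a handle without passing every check, which is the fail-closed property in miniature.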

By product default a project's network egress is open (a run can reach the model out of the box); an operator opts into per-project network isolation (network_isolated, default off), which restores kernel-enforced network containment. There is no per-host model-endpoint allowlist proxy yet, so isolation today is all-or-nothing (no egress vs. open egress), not a filtered allowlist. Available
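A short Python sketch of what "all-or-nothing" means in practice. Only the `network_isolated` flag and its default come from the text; `ProjectConfig` and `egress_allowed` are hypothetical names, and actual enforcement happens at the kernel level rather than in a function like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectConfig:
    name: str
    network_isolated: bool = False  # product default per the text: egress open


def egress_allowed(config: ProjectConfig, host: str) -> bool:
    """All-or-nothing: the destination host plays no part in the decision."""
    del host  # no per-host allowlist exists yet, so the host is ignored
    return not config.network_isolated
```

The ignored `host` argument is the gap the paragraph describes: until an allowlist proxy exists, a project cannot permit the model endpoint while blocking everything else.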

Retention & erasure (ADR 0008). Revocation stops future use without rewriting the past (INV-18); erasure tombstones content payload while preserving audit handles/metadata. Modeled (content-erasure.qnt, teeth CANT_UNERASE) and implemented in the reducer. Available — bulk/admin erasure UI and a GDPR DPA are Planned.
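Tombstoning can be pictured with the Python sketch below. It is an assumption-laden illustration, not the reducer itself: `ContentStore`, `ContentEntry`, and `Erased` are hypothetical, and reading CANT_UNERASE as "an erased handle can never regain a payload" is an interpretation of the name rather than something the text spells out.

```python
from dataclasses import dataclass, field
from typing import Optional


class Erased(Exception):
    """The payload for this handle has been tombstoned."""


@dataclass
class ContentEntry:
    handle: str
    payload: Optional[bytes]  # None once erased
    metadata: dict = field(default_factory=dict)
    erased: bool = False


class ContentStore:
    def __init__(self):
        self._entries = {}

    def put(self, handle, payload, **metadata):
        existing = self._entries.get(handle)
        if existing is not None and existing.erased:
            raise Erased(f"{handle} is tombstoned and cannot be restored")
        self._entries[handle] = ContentEntry(handle, payload, dict(metadata))

    def erase(self, handle):
        entry = self._entries[handle]
        entry.payload = None  # drop the payload
        entry.erased = True   # keep handle + metadata for the audit trail

    def read(self, handle):
        entry = self._entries[handle]
        if entry.erased:
            raise Erased(f"{handle} was erased")
        return entry.payload

    def metadata(self, handle):
        return dict(self._entries[handle].metadata)
```

After `erase`, the audit trail can still name the handle and its metadata, but no call returns the payload again.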

5. Security architecture → control families (the invariant crosswalk)

The differentiator: each protection invariant maps to a recognized control family and to a machine-checked proof. This turns assertions into evidence.

Invariant | Guarantee | Control family (SOC 2 / ISO / NIST) | Proof · teeth
INV-10 | Handles don't grant access | Confidentiality (CC6.1) · A.8.3 · AC-3 | boundary.qnt SAFE_EGRESS
INV-12 | Method & context reads both explicit | Least privilege (AC-6) | boundary.qnt
INV-13 | Cross-authority needs source + target | Confidentiality · ZTA (800-207) | federation.qnt · SIGNATURE_FORGERY
INV-14 | Relays aren't payload authorities | Confidentiality | federation.qnt · RELAY_READS_PAYLOAD
INV-22 | Confidentiality goal | SOC 2 Confidentiality | engagement-taint.qnt SOUND
INV-11 | Execution consumes admitted work | Least privilege · ZTA | run-admission.qnt · SKIP_ADMISSION
INV-20 | Fail-closed | Access control (CC6.x) | fail-closed.qnt · FAIL_OPEN
INV-24 | Method is edit-authored | Integrity · Change mgmt (CC8.1) | method-integrity.qnt · USE_WRITE_LEAKS
INV-6/7/8 | Append-only, ordered, replayable | Processing integrity · Audit (AU-9) | event-store semantics
INV-19 | Idempotent admission | Processing integrity | idempotency.qnt · NO_DEDUP

CI enforces both that each invariant holds and that its tooth bites (flips the probe true and asserts failure). Available
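The "tooth" idea is easiest to see on a toy model. The Python sketch below is not a Quint model and not the CI harness; it borrows the INV-19 / NO_DEDUP pairing from the table and assumes, for illustration, that the tooth works by switching deduplication off. The check passes only if the invariant holds on the real model and is violated on the deliberately broken one.

```python
def admit_all(requests, dedup=True):
    """Toy admission model: returns the list of admitted request ids."""
    admitted, seen = [], set()
    for request_id in requests:
        if dedup and request_id in seen:
            continue  # idempotent: a repeat admits nothing new
        seen.add(request_id)
        admitted.append(request_id)
    return admitted


def invariant_no_double_admission(admitted):
    return len(admitted) == len(set(admitted))


def check_invariant_and_tooth(requests):
    """Pass only if the invariant holds AND the broken variant is caught."""
    holds = invariant_no_double_admission(admit_all(requests, dedup=True))
    tooth_bites = not invariant_no_double_admission(admit_all(requests, dedup=False))
    return holds and tooth_bites
```

With the trace `["r1", "r2", "r1"]` the check passes. With `["r1", "r2"]` it fails, because a trace that never retries cannot expose the missing deduplication; that is the kind of vacuous proof a teeth test is meant to rule out.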

6. Identity & access management

  • OIDC — id-token verifier (JWKS signature, iss/aud/exp/nbf, claim → authority mapping), fail-closed; verified per-commit against self-hosted Keycloak. Available
  • SAML 2.0 — verification delegated to a hardened Node sidecar behind the same seam; single-use assertion replay defense. Available
  • SCIM — inbound provisioning/de-provisioning. Built; outbound sync Planned
  • RBAC/ABAC — role assignments + an ABAC policy evaluator. Built; admin console Planned
  • MFA: Not implemented in the product (org-level enforcement is on the roadmap)
  • Build-vs-buy — own SSO/SCIM, no broker in the auth path (ADR 0056); trust source is SOC 2 + a SAML-scoped pen test (Planned)

Live interop with specific IdP vendors (Okta, Entra, Google) and deploy-time secret wiring are Planned. The shipped desktop product is local and needs no account.
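The claim checks listed for the OIDC verifier (iss/aud/exp/nbf, then claim → authority mapping) follow a standard pattern, sketched here in Python. This is a hedged illustration and not the product's verifier: it assumes the JWKS signature has already been verified elsewhere, it assumes the `sub` claim is the one mapped to an authority, and `authority_from_claims`, `Rejected`, and `claim_map` are hypothetical names.

```python
import time


class Rejected(Exception):
    """The token is not accepted; raised for every failure, never a default."""


def authority_from_claims(claims, *, issuer, audience, claim_map, now=None, leeway=0):
    """Validate signature-verified id-token claims and map them to an authority."""
    now = time.time() if now is None else now
    try:
        if claims["iss"] != issuer:
            raise Rejected("wrong issuer")
        aud = claims["aud"]  # may be a single string or a list of strings
        audiences = [aud] if isinstance(aud, str) else list(aud)
        if audience not in audiences:
            raise Rejected("wrong audience")
        if now >= claims["exp"] + leeway:
            raise Rejected("expired")
        if "nbf" in claims and now < claims["nbf"] - leeway:
            raise Rejected("not yet valid")
        return claim_map[claims["sub"]]  # unmapped subject -> KeyError -> Rejected
    except (KeyError, TypeError) as err:
        raise Rejected(f"missing or malformed claim: {err}") from err
```

A missing `exp`, an unknown subject, or a malformed `aud` all end in `Rejected`; there is no branch that returns an authority by default.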

7. Threat model

Methodology: STRIDE for the platform, OWASP LLM Top 10 (2025) and MITRE ATLAS for the AI surface.

STRIDE | Mitigation | Status
Spoofing | Actor authenticity verified before admission (INV-21); OIDC/SAML; signed governance envelopes | Available
Tampering | Append-only immutable events (INV-6); AEAD at rest; method surface read-only at kernel (INV-24) | Available
Repudiation | Per-actor append-only audit + SIEM export; cryptographic non-repudiation not yet shipped | Available / Planned
Information disclosure | Handles don't grant access (INV-10); both reads explicit (INV-12); opt-in per-project network isolation (egress open by default); relay opacity (INV-14) | Available
Denial of service | Idempotent admission (INV-19); entitlement/budget caps; platform rate-limiting limited | Partial
Elevation of privilege | No ambient authority (INV-11); retries can't widen scope (INV-17); fail-closed (INV-20); kernel sandbox | Available

OWASP LLM Top 10 (2025) | Posture | Status
LLM01 Prompt injection | No ambient authority; acts only on admitted work; tool calls gated; egress open by default with opt-in per-project network isolation | Available
LLM02 Sensitive-info disclosure | The known exposure: prompts + context reach the LLM provider in plaintext. Disclosed, not hidden; confidential inference Planned | Disclosed
LLM06 Excessive agency | Execution consumes admitted work only (INV-11); kernel sandbox bounds tool/file/network reach | Available
LLM07 System-prompt leakage | Method definition runs read-only, kernel-enforced; a work chat cannot read/rewrite it (INV-24) | Available (Linux/macOS)
LLM05 Improper output handling | Output review lifecycle (SOUND_RELEASE); release gated on stakeholder taint | Built
LLM10 Unbounded consumption | Entitlement gate + per-engagement budget caps | Built
LLM03 Supply chain | Pinned Cargo.lock; reproducible image + measurement digest. No SBOM / CVE scanning yet | Partial
LLM04 / LLM08 / LLM09 | Poisoning / vector weaknesses / misinformation: out of current scope (no first-party training/RAG/fact-checking) | Not implemented

MITRE ATLAS: the sandbox + admitted-work model addresses ML-model exfiltration and unauthorized-use tactics; model-extraction via the provider remains a residual until confidential inference.

8. Cryptography & key management

  • At rest: AES-256-GCM (AEAD via ring; no OpenSSL), stored as nonce(12)‖ciphertext‖tag(16), fresh nonce per call; tampering or a wrong key fails the auth tag. Available
  • Key management: envelope encryption, with a 256-bit DEK wrapped by a KMS KEK in Azure Key Vault via a KeyWrap seam; KMS integration verified live. Built
  • Signatures — real P-256 ECDSA (pure-Rust) for governance envelopes and attestation reports. Available
  • Attestation — real AMD SEV-SNP quote verifier: ARK→ASK→VCEK chain + ECDSA-P384/SHA-384 signature + report_data freshness + measurement allow-list; tested against real Milan vectors. Built — quote generation needs a confidential VM (Planned)
  • TLS — cert-pinned egress seam for attested model endpoints. Planned
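The at-rest layout nonce(12)‖ciphertext‖tag(16) can be shown without any cryptography. The Python sketch below only packs and splits bytes; it performs no encryption or authentication, which in the product is done by the AEAD library named above. `pack_blob` and `unpack_blob` are hypothetical helper names.

```python
NONCE_LEN = 12  # bytes, per the layout in the text
TAG_LEN = 16    # bytes, the AES-GCM authentication tag


def pack_blob(nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Concatenate the three parts in storage order."""
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("nonce must be 12 bytes and tag must be 16 bytes")
    return nonce + ciphertext + tag


def unpack_blob(blob: bytes):
    """Split a stored blob; refuse anything too short to hold nonce and tag."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("blob too short to be nonce||ciphertext||tag")
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:len(blob) - TAG_LEN]
    tag = blob[len(blob) - TAG_LEN:]
    return nonce, ciphertext, tag
```

Storing the nonce alongside the ciphertext is what makes a fresh nonce per call practical: the reader needs no side channel to recover it, and the fixed 28-byte overhead gives a cheap sanity check before any decryption is attempted.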

9. Audit, logging, monitoring & incident response

  • Audit log — per-actor, append-only {actor, action, target} in a reserved scope; references only (INV-10); position-ordered, filterable. Available
  • SIEM export: HttpAuditSink POSTs each entry as JSON to a customer-configured collector (Splunk/Datadog/webhook) over rustls. Available
  • Tamper-evidence — append-only is enforced semantically, not yet cryptographically (no signature/merkle chain). Planned
  • Monitoring / alerting / incident response: Not implemented (no production observability, runbooks, or IR procedures in tree today)
  • Uptime / SLA: Planned
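The gap between semantic and cryptographic append-only, noted in the tamper-evidence bullet, can be illustrated with a hash chain. The Python sketch below shows one common way such a chain could look; it is not the planned design, which the text does not describe. The entry fields beyond {actor, action, target}, the `GENESIS` value, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def _digest(prev_hash, entry):
    body = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, actor, action, target):
    """Append {actor, action, target}; target is a handle, never a payload."""
    entry = {"position": len(log), "actor": actor, "action": action, "target": target}
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({**entry, "hash": _digest(prev_hash, entry)})


def verify_chain(log):
    """Recompute every link; an edited, reordered, or mid-log removal fails."""
    prev_hash = GENESIS
    for position, record in enumerate(log):
        entry = {k: v for k, v in record.items() if k != "hash"}
        if entry["position"] != position or record["hash"] != _digest(prev_hash, entry):
            return False
        prev_hash = record["hash"]
    return True
```

A bare chain like this does not detect truncation of the newest entries, and anyone who can rewrite the whole log can recompute it. Closing those gaps needs signatures or an externally anchored head hash.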

10. AI governance (NIST AI RMF)

  • Govern — AI use is bounded by the same authority/scope/admission model as everything else; method changes are edit-authored and audited (INV-24).
  • Map — the LLM provider is named in the trust boundary; data-to-model flow is documented (§4). BYO-credentials (ADR 0053) lets the LLM relationship be the customer's own subprocessor. Built (core)
  • Measure — protection properties machine-checked (§13); no model-performance / bias / hallucination evaluation today (out of scope).
  • Manage — output review gates release; budget caps bound consumption; per-project network isolation is opt-in (egress open by default).

ISO/IEC 42001 (certifiable AI management system) is named as a future target, not a current certification. Planned

11. SDLC & software supply chain

CI gates every push (.github/workflows/ci.yml): cargo fmt --check, cargo clippy -D warnings, cargo test --workspace, web typecheck + unit, the SAML sidecar tests, quint typecheck + invariant/teeth model checks, and scripts/audit-gate.py (a tracker cannot close a gate while any coverage row is unfinished). Tier-1 adds a real networked relay + governance handshake; a Compose lane exercises NAT-isolated federation. Available
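The gate rule in parentheses (a gate cannot close while any coverage row is unfinished) reduces to a small predicate. The Python sketch below is not the contents of scripts/audit-gate.py; the row shape (`name`, `status`) and the `"done"` value are hypothetical, as is the choice to treat an empty row list as not closable.

```python
def gate_may_close(coverage_rows):
    """A gate may close only if it has rows and every row is finished."""
    rows = list(coverage_rows)
    if not rows:
        return False  # fail closed: no evidence is not full coverage
    return all(row.get("status") == "done" for row in rows)


def unfinished(coverage_rows):
    """Names of the rows still blocking the gate, for a CI failure message."""
    return [row["name"] for row in coverage_rows if row.get("status") != "done"]
```

A row with a missing or misspelled status counts as unfinished, so a malformed tracker blocks the gate instead of silently passing it.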

Mapped to NIST SSDF: version control + Cargo.lock (PO1.1); fmt/clippy enforced (PO2.1, PS2.1); provenance via reproducible image + digest (PO2.2, Built); proptests (PS3.1); coverage-gate review records (PS4.1).

Supply-chain expectation | State
Pinned dependency lockfile | Available (Cargo.lock)
Reproducible build + measurement digest | Built (flake.nix, image-digest.yml)
SBOM (CycloneDX/SPDX) | Not implemented
SLSA provenance attestation | Not implemented
Dependency CVE scanning (cargo audit / Dependabot) | Not implemented
SAST / secret scanning in CI | Not implemented
Code signing / notarization of desktop builds | Not implemented (builds are unsigned)

These are known, named gaps — not oversights — and are the most actionable near-term hardening items.

12. Compliance posture & roadmap

Item | State
SOC 2 Type II | Planned (highest-priority; trust source for own-built SSO)
ISO 27001 | Not implemented (no roadmap committed)
Penetration test (SAML-scoped) | Planned
DPA + published subprocessor list | Planned
GDPR right-to-erasure | Built (erasure model); DPA Planned

Subprocessors (see also the CAIQ appendix): your configured LLM provider(s); Microsoft Azure (Key Vault, confidential VM — Built/Planned modes); Stripe (billing — Planned); customer-operated IdP. With BYO-credentials the LLM provider is the customer's subprocessor.

13. Assurance evidence

  • 42 Quint formal models covering the protection invariants; CI verifies each invariant holds and each adversarial "teeth" probe bites. Available
  • Property tests in the pure core tie the Rust reducers to the models. Available
  • Live verifications: Keycloak OIDC per-commit; Azure Key Vault KMS; real SEV-SNP Milan vectors; NAT-isolated federation. Available / Built
  • No independent third-party penetration test or audit yet. Planned

Security contact: jack@gaugewright.com · Source & formal specs: github.com/jamesjscully/un-tie · Reviewed against spec rev 2026-06.

Change log — 2026-06: first enterprise edition (arc42 + security/AI overlay; control crosswalk; STRIDE + OWASP LLM Top 10 threat model; honest supply-chain and compliance gaps).