Architecture
SKI v3 is built around a two-phase architecture. Phase 1 (offline, probabilistic) compiles regulations into a signed v3 Knowledge Graph (typed obligations, jurisdictional scope, effective-date intervals, precedent edges). Phase 2 (runtime) evaluates telemetry against that graph via a KG-grounded local LLM whose formalizable assertions are mechanically cross-checked by the Symbolic Verifier — all inside the operator's sovereignty boundary. The v3 specification and RFC 0002 are the normative references; this page is the orientation map.
High-level dataflow
flowchart LR
subgraph P1["Phase 1 — Compilation (outside sovereign boundary)"]
Reg["Regulatory<br/>documents"]
Ext["kg-extractor<br/>(LLM-assisted)"]
Val["kg-validator<br/>(human review)"]
KG[("Signed v3<br/>Knowledge Graph")]
Reg --> Ext --> Val --> KG
end
subgraph P2["Phase 2 — Runtime (inside sovereign boundary)"]
Tel["Telemetry<br/>(SCADA, sensors, ETL)"]
SC[Sidecar]
SM[SKI Model service]
SCOPE["KG scoping<br/>jurisdiction + effective date"]
LL["LLM Evaluator<br/>local, T=0, structured<br/>generation (Ollama)"]
SV["Symbolic Verifier<br/>independent check of<br/>formalizable assertions"]
RTG{{Risk-Tier Governor}}
AM{{Agreement monitor}}
TB[(Telemetry buffer)]
AL[("Audit ledger<br/>signed transcript +<br/>envelope")]
V((Verdict envelope))
Tel --> SC --> SM
SM --> SCOPE --> LL --> V
LL --> SV --> V
SM --> RTG --> V
SV --> AM
SM --> TB
V --> AL
end
KG -. "one-way<br/>signed transfer" .-> SM
classDef boundary fill:#f9f9f9,stroke:#666,stroke-dasharray: 5 5;
class P1,P2 boundary;
The dashed arrow is the only thing that crosses the boundary: a signed KG file. No operational data ever moves the other way.
Component breakdown
Phase 1: Compilation
kg-extractor
Reads regulatory documents and emits structured rule candidates. Uses an LLM backend (configurable — not bound to any vendor). Output is never trusted directly; every rule is reviewed in the next step.
kg-validator
Human-in-the-loop validation of the v3 typed graph: schema validation against the closed obligation enumeration (spec §3.3) plus the §3.6 referential-integrity passes (every edge endpoint resolves, every obligation carries a citation and jurisdiction scope). No auto-approval.
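The two referential-integrity checks named above can be pictured with a minimal Python sketch. It is illustrative only: the dictionary layout, the `id` key, the direction of the `cited_by` and `scoped_to` edges (assumed to start at the obligation), and the helper name `integrity_errors` are all assumptions, not the validator's actual code. The normative rules are in spec §3.6.

```python
def integrity_errors(kg: dict) -> list[str]:
    """Return referential-integrity problems found in a KG dict (hypothetical helper)."""
    nodes = kg["nodes"]
    # Collect every node id across all node collections.
    known_ids = {n["id"] for group in nodes.values() for n in group}
    errors = []

    # Pass 1: every edge endpoint must resolve to a known node.
    for edge in kg["edges"]:
        for end in ("from", "to"):
            if edge[end] not in known_ids:
                errors.append(f"edge {edge['type']}: unresolved {end}={edge[end]!r}")

    # Pass 2: every obligation needs a citation and a jurisdiction scope.
    # Assumption: both are expressed as edges leaving the obligation.
    for ob in nodes.get("obligations", []):
        outgoing = {e["type"] for e in kg["edges"] if e["from"] == ob["id"]}
        for required in ("cited_by", "scoped_to"):
            if required not in outgoing:
                errors.append(f"obligation {ob['id']}: missing {required} edge")
    return errors


# Made-up example graph: ob-2 cites a node that does not exist and has no scope.
kg = {
    "nodes": {
        "obligations": [{"id": "ob-1"}, {"id": "ob-2"}],
        "citations": [{"id": "cit-1"}],
        "jurisdictions": [{"id": "jur-eu"}],
    },
    "edges": [
        {"type": "cited_by", "from": "ob-1", "to": "cit-1"},
        {"type": "scoped_to", "from": "ob-1", "to": "jur-eu"},
        {"type": "cited_by", "from": "ob-2", "to": "cit-404"},
    ],
}
for problem in integrity_errors(kg):
    print(problem)
```

In this sketch a non-empty result would block the graph from reaching the human reviewer's approval step; how the real tool surfaces failures is not specified here.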
Knowledge Graph
The compiled artifact — a typed graph, not a rule list:
classDiagram
class KnowledgeGraph {
+metadata
+nodes
+edges[]
+signature
}
class Nodes {
+subjects[]
+rules[]
+obligations[]
+definitions[]
+exemptions[]
+precedents[]
+jurisdictions[]
+citations[]
}
class Obligation {
+obligation_type: closed enum
+metric
+value
+unit
+effective_date_start
+summary
}
class Edge {
+type: applies_to | consists_of | scoped_to | cited_by | ...
+from
+to
}
KnowledgeGraph "1" --> "1" Nodes
KnowledgeGraph "1" --> "*" Edge
Nodes "1" --> "*" Obligation
The runtime refuses to load an unsigned KG (Ed25519; see ski-model-deploy). See Knowledge Graph schema for the full shape.
Phase 2: Runtime
Sidecar
Passive, read-only telemetry intake. Reads from file, http, or
kafka (selected via TELEMETRY_SOURCE). Forwards normalised records
to the SKI Model service over mTLS. Rejects any incoming record that
carries a rule_id field — producers must not pre-route.
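As a rough illustration of the "no pre-routing" rule, the sketch below rejects any record with a top-level `rule_id` before normalising it. The exception name, the function name `normalise`, and the forwarded field names (`subject`, `metric`, `value`, `timestamp`) are assumptions chosen for the example; the sidecar's real record schema may differ.

```python
class PreRoutedRecordError(ValueError):
    """Raised when a producer tries to pre-route a record (hypothetical name)."""


def normalise(record: dict) -> dict:
    # Producers must not pre-route: reject any record carrying rule_id.
    # This sketch only inspects top-level keys.
    if "rule_id" in record:
        raise PreRoutedRecordError("record carries rule_id; rejected")
    # Hypothetical normalisation: coerce types and keep only forwarded fields.
    return {
        "subject": str(record["subject"]),
        "metric": str(record["metric"]),
        "value": float(record["value"]),
        "timestamp": str(record["timestamp"]),
    }


raw = {"subject": "pump-7", "metric": "pressure", "value": "4.2",
       "timestamp": "2025-01-01T00:00:00Z"}
print(normalise(raw))

try:
    normalise({**raw, "rule_id": "r-17"})
except PreRoutedRecordError as exc:
    print("rejected:", exc)
```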
SKI Model service
The runtime's core. For every telemetry record:
sequenceDiagram
autonumber
participant Sidecar
participant SKIModel as SKI Model
participant Buffer as Telemetry buffer
participant LLM as LLM Evaluator (local)
participant Verifier as Symbolic Verifier
participant Ledger as Audit ledger
Sidecar->>SKIModel: POST /api/evaluate
SKIModel->>Buffer: append(record)
SKIModel->>SKIModel: scope KG to jurisdiction + effective date
alt subject not in scoped KG
SKIModel-->>Ledger: append(NULL_UNMAPPED envelope)
else subject mapped
SKIModel->>LLM: evaluate(scoped KG slice, telemetry)<br/>T=0, fixed seed, structured generation
LLM-->>SKIModel: verdict + reasoning + KG citations<br/>+ formalizable assertions
SKIModel->>Verifier: check each formalizable assertion
Verifier->>Buffer: window queries (stateful predicates)
Verifier-->>SKIModel: per-assertion status<br/>(AGREED / LLM_CONTRADICTION /<br/>NEURO_SYMBOLIC_DIVERGENCE / UNVERIFIABLE)
SKIModel->>SKIModel: Risk-Tier Governor derives strictest tier from KG
SKIModel-->>Ledger: append(envelope + signed LLM transcript)
end
Key invariants:
- One worker. SKI_MODEL_WORKERS=1 is enforced. Concurrent writes to the buffer + ledger would break the sequence-number monotonicity guarantee. See docs/CONCURRENCY.md.
- Buffer-before-evaluate. The current record is written to the buffer before evaluation, so self-referential window queries see the record they're being asked about.
- Authoritative clock. The telemetry's timestamp is the "now" for stateful predicates and effective-date scoping. Wall-clock at arrival is never consulted.
- Disagreement is a signal, not an error. A verifier status other than AGREED is recorded in the envelope and feeds the agreement monitor; it never silently overrides or is overridden.
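The authoritative-clock invariant is easiest to see in the scoping step. The sketch below filters obligations by jurisdiction and effective date using only the telemetry timestamp. The `ScopedObligation` type, the `effective_date_end` field, and the half-open interval convention are assumptions made for illustration; only `effective_date_start` appears in the schema shown on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ScopedObligation:
    obligation_id: str
    jurisdiction: str
    effective_date_start: datetime
    effective_date_end: Optional[datetime] = None  # hypothetical; None = open-ended


def scope(obligations, jurisdiction: str, telemetry_ts: datetime):
    """Keep obligations in force at the telemetry timestamp, never at wall-clock time."""
    return [
        ob for ob in obligations
        if ob.jurisdiction == jurisdiction
        and ob.effective_date_start <= telemetry_ts
        # Assumed half-open interval: [start, end)
        and (ob.effective_date_end is None or telemetry_ts < ob.effective_date_end)
    ]


utc = timezone.utc
obligations = [
    ScopedObligation("ob-old", "EU", datetime(2020, 1, 1, tzinfo=utc),
                     datetime(2024, 1, 1, tzinfo=utc)),
    ScopedObligation("ob-new", "EU", datetime(2024, 1, 1, tzinfo=utc)),
    ScopedObligation("ob-us", "US", datetime(2020, 1, 1, tzinfo=utc)),
]

# A record stamped mid-2023 is judged against the rule in force at that time,
# regardless of when it arrives.
late_record_ts = datetime(2023, 6, 1, tzinfo=utc)
print([ob.obligation_id for ob in scope(obligations, "EU", late_record_ts)])
# → ['ob-old']
```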
Symbolic Verifier
Mechanically re-evaluates every formalizable assertion the LLM emits,
against the same telemetry. It covers the stateless predicates
(must_not_exceed, must_be_at_least, must_be_within,
must_equal, must_not_equal) and the stateful window predicates
(window_count, window_sum, window_avg), which are backed by the
telemetry buffer. Assertions outside the formalizable subset are
reported explicitly as UNVERIFIABLE, never silently skipped.
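A minimal sketch of this kind of mechanical re-check follows. The predicate names come from the list above, but their semantics (for example, `must_not_exceed` meaning `value <= limit`), the assertion dictionary shape (`predicate`, `limit`, `low`, `high`, `compare`), and the function names are assumptions. The string `NOT_AGREED` is a placeholder for the spec's non-AGREED statuses, whose exact assignment rules are not modelled here.

```python
import operator
from typing import Optional, Sequence

# Assumed semantics for the stateless two-argument predicates.
STATELESS = {
    "must_not_exceed": operator.le,
    "must_be_at_least": operator.ge,
    "must_equal": operator.eq,
    "must_not_equal": operator.ne,
}

# Window aggregates over values already selected from the telemetry buffer.
WINDOW = {
    "window_count": len,
    "window_sum": sum,
    "window_avg": lambda xs: sum(xs) / len(xs),
}

PLACEHOLDER_DISAGREEMENT = "NOT_AGREED"  # stand-in, not a real SKI status


def recompute(assertion: dict, value: float,
              window: Sequence[float] = ()) -> Optional[bool]:
    """Re-evaluate one assertion; None means outside the formalizable subset."""
    pred = assertion.get("predicate")
    if pred in STATELESS:
        return STATELESS[pred](value, assertion["limit"])
    if pred == "must_be_within":
        return assertion["low"] <= value <= assertion["high"]
    if pred in WINDOW:
        compare = assertion.get("compare")
        # An average over an empty window is undefined, so it is not checkable.
        if compare not in STATELESS or (pred == "window_avg" and not window):
            return None
        return STATELESS[compare](WINDOW[pred](window), assertion["limit"])
    return None


def status(assertion: dict, llm_claim: bool, value: float,
           window: Sequence[float] = ()) -> str:
    result = recompute(assertion, value, window)
    if result is None:
        return "UNVERIFIABLE"
    return "AGREED" if result == llm_claim else PLACEHOLDER_DISAGREEMENT


print(status({"predicate": "must_not_exceed", "limit": 5.0}, True, 4.2))
print(status({"predicate": "window_avg", "compare": "must_not_exceed", "limit": 5.0},
             True, 4.2, window=[4.0, 7.0, 4.2]))
print(status({"predicate": "is_reasonable"}, True, 4.2))
```

The three calls exercise one case of each kind: a stateless check that matches the LLM's claim, a window average that contradicts it, and a predicate the sketch cannot formalize.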
Agreement monitor
The monitor keeps a rolling window over verifier statuses and computes
agreement_rate = AGREED / total, the share of statuses in the window
that are AGREED. /api/canary (name kept from v2 for operator
continuity) returns the snapshot; a sustained drop below the threshold
(default 0.95) is the page-someone signal. It replaces the v2
determinism canary.
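The rolling agreement rate can be sketched in a few lines of Python. The class name, the window size of 1000, the snapshot keys, and the choice to treat an empty window as healthy are assumptions for illustration; only the 0.95 default threshold comes from the text above, and the "sustained" part of the alerting rule is not modelled.

```python
from collections import deque


class AgreementMonitor:
    """Rolling agreement rate over verifier statuses (hypothetical sketch)."""

    def __init__(self, window_size: int = 1000, threshold: float = 0.95):
        # deque with maxlen drops the oldest status once the window is full.
        self.statuses = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, verifier_status: str) -> None:
        self.statuses.append(verifier_status)

    def snapshot(self) -> dict:
        total = len(self.statuses)
        agreed = sum(1 for s in self.statuses if s == "AGREED")
        # Assumption: an empty window reports 1.0 rather than raising.
        rate = agreed / total if total else 1.0
        return {
            "agreement_rate": rate,
            "total": total,
            "below_threshold": rate < self.threshold,
        }


monitor = AgreementMonitor(window_size=4)
for s in ["AGREED", "AGREED", "UNVERIFIABLE", "AGREED"]:
    monitor.record(s)
print(monitor.snapshot())
# → {'agreement_rate': 0.75, 'total': 4, 'below_threshold': True}
```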
Risk-Tier Governor
Tier is declared per rule in the KG (spec §5.4); the caller cannot self-declare. The strictest tier across the applicable obligations wins.
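The "strictest tier wins" rule reduces to taking a maximum over an ordered set of tiers. In the sketch below the tier names (`LOW`, `MEDIUM`, `HIGH`) and the function name `govern` are hypothetical; the actual tiers are defined in spec §5.4.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Hypothetical tier names; a higher value means a stricter tier.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def govern(applicable_rule_tiers: list[Tier], caller_declared: object = None) -> Tier:
    """Return the strictest KG-declared tier; any caller-supplied tier is ignored."""
    del caller_declared  # deliberately unused: callers cannot self-declare
    if not applicable_rule_tiers:
        raise ValueError("no applicable obligations; nothing to govern")
    return max(applicable_rule_tiers)


# The caller asks for LOW, but one applicable rule is declared HIGH in the KG.
result = govern([Tier.LOW, Tier.HIGH, Tier.MEDIUM], caller_declared=Tier.LOW)
print(result.name)
# → HIGH
```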
Telemetry buffer
Postgres-backed, RANGE-partitioned by telemetry_ts. Append-only at
the database layer (same trigger pattern as the ledger). Retention is
configured per tenant in the tenants table — no default; the
operator must set it explicitly. See
RFC 0001.
Audit ledger
Append-only Postgres table. Each v3 row carries the full provenance:
- sequence_number (monotonic, unique) and entry_hash chained to previous_entry_hash,
- telemetry_hash (joins to buffer rows),
- envelope_json + envelope_hash: the complete verdict envelope (verdict, reasoning, KG citations, assertions, verifier result, model provenance hashes),
- transcript_json + transcript_signature + signing_key_id: the Ed25519-signed LLM transcript,
- verifier_status, knowledge_graph_version, schema_version, recorded_at.
UPDATE / DELETE / TRUNCATE are blocked by triggers. The canonical
serialization is documented in tools/audit-ledger/src/audit_ledger/canonical.py
so any third party can re-verify.
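To show what third-party re-verification of a hash chain involves, here is a self-contained sketch using an in-memory list in place of the Postgres table. It is not the project's implementation: SHA-256, the sorted-key JSON serialization, and the function names are stand-ins chosen for the example, and the real canonical form is whatever canonical.py defines.

```python
import hashlib
import json


def canonical(row: dict) -> bytes:
    # Stand-in serialization for this sketch only; the real rules may differ.
    return json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")


def entry_hash(row: dict) -> str:
    # Hash every field except the hash itself (SHA-256 is an assumption).
    body = {k: v for k, v in row.items() if k != "entry_hash"}
    return hashlib.sha256(canonical(body)).hexdigest()


def append(ledger: list[dict], envelope_json: str) -> dict:
    row = {
        "sequence_number": len(ledger) + 1,
        "previous_entry_hash": ledger[-1]["entry_hash"] if ledger else None,
        "envelope_json": envelope_json,
    }
    row["entry_hash"] = entry_hash(row)
    ledger.append(row)
    return row


def verify_chain(ledger: list[dict]) -> bool:
    previous_hash, last_seq = None, 0
    for row in ledger:
        if row["sequence_number"] <= last_seq:       # must be strictly increasing
            return False
        if row["previous_entry_hash"] != previous_hash:  # link to prior row
            return False
        if row["entry_hash"] != entry_hash(row):     # row content unchanged
            return False
        previous_hash, last_seq = row["entry_hash"], row["sequence_number"]
    return True


ledger: list[dict] = []
for verdict in ("PASS", "FAIL", "PASS"):
    append(ledger, json.dumps({"verdict": verdict}))
print(verify_chain(ledger))   # chain as written

ledger[1]["envelope_json"] = json.dumps({"verdict": "PASS"})  # simulate tampering
print(verify_chain(ledger))   # chain after an in-place edit
```

Editing any stored row changes its recomputed hash, so the check fails at that row; rewriting the hash as well would break the link from the following row instead.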
Conformance levels
graph TD
L1[Provenance]
L2[Durability]
L3[Sovereignty]
L1 -->|"adds: signed KG, strict governor,<br/>append-only ledger, replay"| L2
L2 -->|"adds: air-gapped boot, tamper-evidence,<br/>no-egress, signed transcripts"| L3
L1 -.->|test suite| L1T[conformance/provenance/]
L2 -.->|test suite| L2T[conformance/durability/]
L3 -.->|test suite| L3T["conformance/sovereignty/<br/>4 of 6 runnable; rest v3.1"]
The conformance test suite is the executable specification. See Conformance.
Threat model
See Threat model for the complete list of in-scope threats, defences, and out-of-scope concerns.