SKI Framework v3.0.0: Neuro-Symbolic Pivot
Released: 2026-06-01
The first version of the SKI Framework we market. RFC 0002 (Accepted) is implemented end-to-end. Every architectural commitment in the v2 line — strict mypy on the deterministic core, replay determinism, the canary, the conformance suite — is rebuilt around a different shape of defensibility: verifiable provenance of a neuro-symbolic decision instead of bit-identical replay of a rule engine.
What's in the box
A KG-grounded sovereign LLM is the primary reasoner. On every
verdict, the local LLM (Ollama by default; V3LLMBackend protocol for
pluggability) reads the obligations applicable to the tenant's
jurisdiction and the measurement's effective date, runs structured
generation with temperature=0 and a fixed seed, and emits a verdict,
reasoning, KG citations, and a structured set of formalizable
assertions.
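As a rough illustration of the deterministic generation settings described above, the sketch below builds a request body for Ollama's POST /api/generate endpoint with temperature=0, a fixed seed, and a JSON schema passed as the output format. The schema field names, the helper name build_generate_request, and the model name are placeholders chosen for this example; they are not taken from the SKI specification or the V3LLMBackend protocol.

```python
import json

# Hypothetical output schema: the field names mirror the four outputs listed
# above (verdict, reasoning, KG citations, formalizable assertions) but are
# not taken from the SKI specification.
VERDICT_SCHEMA = {
    "type": "object",
    "properties": {
        "verdict": {"type": "string"},
        "reasoning": {"type": "string"},
        "citations": {"type": "array", "items": {"type": "string"}},
        "assertions": {"type": "array", "items": {"type": "object"}},
    },
    "required": ["verdict", "reasoning", "citations", "assertions"],
}


def build_generate_request(model: str, prompt: str, seed: int = 0) -> dict:
    """Build a JSON body for Ollama's POST /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "format": VERDICT_SCHEMA,  # constrain the output to the schema
        "stream": False,           # one complete response instead of chunks
        "options": {"temperature": 0, "seed": seed},
    }


# "example-model" and the prompt text are placeholders.
body = build_generate_request("example-model", "Obligations and telemetry go here", seed=42)
print(json.dumps(body["options"]))
```

Pinning both temperature and seed is what lets the same prompt be re-run later for comparison; how the real backend assembles the prompt from the scoped KG is outside this sketch.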
The Symbolic Verifier mechanically cross-checks every formalizable
assertion. For each assertion, the verifier evaluates the underlying
predicate against the same telemetry and emits AGREED,
LLM_CONTRADICTION, NEURO_SYMBOLIC_DIVERGENCE, or UNVERIFIABLE.
Five stateless predicates plus three stateful (window_count,
window_sum, window_avg) are supported.
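The cross-check can be pictured with a minimal sketch like the one below. It models the three stateful predicates as aggregates over a window of readings and compares the recomputed result with what the LLM asserted. The function signature and the "aggregate exceeds threshold" assertion shape are assumptions made for illustration. These notes do not state the rule that separates NEURO_SYMBOLIC_DIVERGENCE from LLM_CONTRADICTION, so the sketch never returns the former.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class VerifierStatus(Enum):
    AGREED = "AGREED"
    LLM_CONTRADICTION = "LLM_CONTRADICTION"
    NEURO_SYMBOLIC_DIVERGENCE = "NEURO_SYMBOLIC_DIVERGENCE"  # not produced below
    UNVERIFIABLE = "UNVERIFIABLE"


# The three stateful predicates, modeled as aggregates over a window of readings.
WINDOW_AGGREGATES: dict[str, Callable[[Sequence[float]], float]] = {
    "window_count": lambda xs: float(len(xs)),
    "window_sum": lambda xs: float(sum(xs)),
    "window_avg": lambda xs: sum(xs) / len(xs),
}


def verify(predicate: str, window: Optional[Sequence[float]], threshold: float,
           llm_claims_exceeded: bool) -> VerifierStatus:
    """Re-evaluate 'aggregate(window) > threshold' and compare with the LLM's claim."""
    aggregate = WINDOW_AGGREGATES.get(predicate)
    if aggregate is None or window is None:
        return VerifierStatus.UNVERIFIABLE  # unknown predicate or no telemetry
    try:
        actual = aggregate(window) > threshold
    except ZeroDivisionError:  # window_avg over an empty window
        return VerifierStatus.UNVERIFIABLE
    if actual == llm_claims_exceeded:
        return VerifierStatus.AGREED
    return VerifierStatus.LLM_CONTRADICTION


print(verify("window_avg", [1.0, 2.0, 3.0], 1.5, True))   # VerifierStatus.AGREED
print(verify("window_sum", [1.0, 2.0, 3.0], 10.0, True))  # VerifierStatus.LLM_CONTRADICTION
```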
The Risk-Tier Governor is strict. Risk tier per obligation is
declared in the KG (spec §5.4). The caller cannot self-declare a tier.
The strictest tier across the applicable obligations wins; default is
tier-2.
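One way to read the "strictest tier wins" rule is sketched below. It rests on two assumptions that these notes do not confirm: that a lower tier number is stricter, and that the tier-2 default applies both to an obligation with no declared tier and to an empty set of applicable obligations. The function name governing_tier is hypothetical.

```python
from typing import Iterable, Optional

DEFAULT_TIER = 2  # "default is tier-2" per the notes above


def governing_tier(declared: Iterable[Optional[int]]) -> int:
    """Return the strictest tier among the applicable obligations.

    ASSUMPTION: a lower number is a stricter tier (tier 1 is stricter than tier 3).
    ASSUMPTION: an obligation with no declared tier counts as DEFAULT_TIER,
    and so does an empty set of obligations.
    """
    tiers = [DEFAULT_TIER if t is None else t for t in declared]
    return min(tiers, default=DEFAULT_TIER)


print(governing_tier([3, 1, 2]))  # 1
print(governing_tier([]))         # 2
```

The function deliberately takes no caller-supplied tier: the only inputs are the tiers declared on the KG obligations, which matches the rule that a caller cannot self-declare.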
Signed LLM transcripts. Every evaluation produces an
LLMTranscript signed with the runtime's own ed25519 key
(auto-provisioned at $SKI_TRANSCRIPT_KEY_PATH). Auditors can
independently replay any verdict via the signature plus the ledger's
transcript_json and envelope_json columns. Backend-agnostic by
construction — no provider wire format reaches the ledger.
Jurisdiction-scoped KG snapshots. KnowledgeGraph.scope_to returns
only the obligations applicable to a tenant's jurisdiction (and
effective at the measurement's timestamp). Real-sized KGs no longer
blow the LLM context window, and the snapshot's scope block travels
in the signed transcript so an auditor can confirm what was sent.
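The scoping rule can be sketched as a filter over jurisdiction and effective date. This is an illustration only, not the real KnowledgeGraph.scope_to signature: the Obligation fields, the scope_obligations helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Obligation:
    obligation_id: str
    jurisdiction: str
    effective_from: date
    effective_to: Optional[date] = None  # None means still in force


def scope_obligations(obligations, jurisdiction: str, measured_on: date):
    """Keep obligations for one jurisdiction that were in force on measured_on."""
    return [
        o for o in obligations
        if o.jurisdiction == jurisdiction
        and o.effective_from <= measured_on
        and (o.effective_to is None or measured_on <= o.effective_to)
    ]


kg = [
    Obligation("ob-1", "EU", date(2024, 1, 1)),
    Obligation("ob-2", "EU", date(2027, 1, 1)),                      # not yet effective
    Obligation("ob-3", "US", date(2024, 1, 1)),                      # other jurisdiction
    Obligation("ob-4", "EU", date(2020, 1, 1), date(2023, 12, 31)),  # expired
]
scoped = scope_obligations(kg, "EU", date(2026, 6, 1))
print([o.obligation_id for o in scoped])  # ['ob-1']
```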
Agreement monitor. Replaces the v2 determinism canary. A rolling
window of the last N verifier statuses; agreement_rate = AGREED /
total. Pages on a sustained drop below the configured threshold
(default 0.95).
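The rolling computation can be sketched with a bounded deque, as below. The class name, the window size of 100, and the choice to treat an empty window as healthy are assumptions for illustration. What counts as a "sustained" drop is not specified in these notes, so the sketch only reports whether the current rate is below the threshold.

```python
from collections import deque


class AgreementMonitor:
    """Rolling agreement rate over the last `window_size` verifier statuses."""

    def __init__(self, window_size: int = 100, threshold: float = 0.95):
        self.threshold = threshold
        self._statuses = deque(maxlen=window_size)  # oldest entries fall off

    def record(self, status: str) -> None:
        self._statuses.append(status)

    @property
    def agreement_rate(self) -> float:
        if not self._statuses:
            return 1.0  # assumption: an empty window is treated as healthy
        agreed = sum(1 for s in self._statuses if s == "AGREED")
        return agreed / len(self._statuses)

    def below_threshold(self) -> bool:
        return self.agreement_rate < self.threshold


monitor = AgreementMonitor(window_size=4)
for status in ["AGREED", "AGREED", "LLM_CONTRADICTION", "AGREED", "AGREED"]:
    monitor.record(status)
print(monitor.agreement_rate)     # 0.75 (the first AGREED has left the window)
print(monitor.below_threshold())  # True
```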
Conformance reorganised around verifiable provenance. Three levels:
- Provenance — every verdict envelope is complete, the Symbolic Verifier ran, citations exist, the agreement monitor is mounted, and the verdict taxonomy is exactly the five canonical values.
- Durability — the KG is signed; the Risk-Tier Governor is strict; the audit ledger is append-only at the DB layer; the hash chain recomputes entry hashes (not just chain linkage); replay reproduces historical verdicts.
- Sovereignty — operable air-gapped, tamper-evident, end-to-end signed. (Scaffolded; harness is the v3.1 milestone.)
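The difference between checking chain linkage and recomputing entry hashes, mentioned under Durability, can be seen in a small sketch. The entry layout (prev_hash, payload, entry_hash), the SHA-256 choice, and the canonical-JSON encoding are assumptions for illustration and are not the audit ledger's actual schema.

```python
import hashlib
import json


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry's canonical JSON payload together with its predecessor's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(entries: list, payload: dict, genesis: str = "0" * 64) -> None:
    """Append a new entry linked to the current tail of the chain."""
    prev = entries[-1]["entry_hash"] if entries else genesis
    entries.append({"prev_hash": prev, "payload": payload,
                    "entry_hash": entry_hash(prev, payload)})


def verify_chain(entries: list, genesis: str = "0" * 64) -> bool:
    """Check linkage AND recompute each stored hash from the entry's content."""
    prev = genesis
    for e in entries:
        if e["prev_hash"] != prev:
            return False  # broken linkage
        if entry_hash(e["prev_hash"], e["payload"]) != e["entry_hash"]:
            return False  # content edited after the hash was stored
        prev = e["entry_hash"]
    return True


ledger = []
append(ledger, {"verdict": "example-a"})
append(ledger, {"verdict": "example-b"})
print(verify_chain(ledger))                   # True
ledger[0]["payload"]["verdict"] = "tampered"  # stored hashes and linkage untouched
print(verify_chain(ledger))                   # False
```

A linkage-only check would still pass after the tampering step, because every prev_hash still matches the stored entry_hash before it; recomputing the hash from the payload is what exposes the edit.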
Tools
kg-extractor and kg-validator ship at 3.0.0 and emit / consume
the v3 typed-graph shape directly. The extractor's
ConfidenceLevel → ExtractionQuality rename reflects that this is the
extractor's authoring-time signal, separate from the runtime's
prohibited confidence score.
Breaking changes from the v2 line
- MeasurementRecord.risk_tier is removed. The KG-side governor wins. v2-shape payloads parse without error (Pydantic silently drops the unknown field); a regression test pins the behavior.
- ski_model/canary.py and ski_model/backends.py are gone. The agreement monitor replaces the canary; the ski_model.v3.backends package replaces the v2 inference-backend abstraction.
- kg-validator no longer accepts the flat-rule-list (v2) shape. The CLI exposes a single validate subcommand. The review, detect-conflicts, detect-duplicates, and HTML report subcommands are retired.
- confidence on extracted rules is renamed extraction_quality.
- Conformance markers level1 / level2 / level3 are renamed provenance / durability / sovereignty. Pytest invocations that select by marker must be updated.
What's NOT in v3.0.0
- Sovereignty conformance harness. The six tests are scaffolded with spec citations and pytest.skip(). The harness (network sandbox, --network=none container, destructive-DB tamper rig, subprocess startup, transcript inspection) is the v3.1 milestone.
- Full-fidelity LLM-emitted typed obligations. kg-extractor produces v3 KGs today via a deterministic wrap of its flat-rule output (emit_v3_kg); a follow-up will have the LLM emit the typed obligation directly.
- Horizontal scaling. Per-shard scaling, shard router, ledger partitioning, and the Kubernetes operator land in v3.2.
Migrating from v0.2.x
No schema migration is required. v0.2 ledgers upgrade in place — v3
adds nullable columns for the signed transcript, model provenance
hashes, KG citations, and verifier status; existing rows continue to
verify under audit-ledger verify.
API changes:
- Stop sending risk_tier on /api/evaluate requests (the field is silently dropped, but a warning is logged once per payload shape).
- Switch CI invocations from pytest -m level1 to pytest -m provenance (and level2 → durability).
- If you call kg-validator validate programmatically, drop the --schema flag; v3 is the only path.
- If you build on kg_extractor.ComplianceRule, rename confidence → extraction_quality.
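For clients that build request bodies or store serialized rules as plain dictionaries, the first and last changes above might be handled with small helpers like these. Both function names are hypothetical, the payload shapes are invented for the example, and the sketch assumes the rename affects only the key and not the stored value.

```python
def strip_risk_tier(payload: dict) -> dict:
    """Return a copy of an /api/evaluate request body without risk_tier."""
    return {k: v for k, v in payload.items() if k != "risk_tier"}


def rename_confidence(rule: dict) -> dict:
    """Return a copy of a serialized rule with confidence renamed."""
    migrated = dict(rule)
    if "confidence" in migrated:
        migrated["extraction_quality"] = migrated.pop("confidence")
    return migrated


# Illustrative payloads; the real field sets are defined by the v3 specification.
print(strip_risk_tier({"tenant": "acme", "risk_tier": 1}))
print(rename_confidence({"rule_id": "r-1", "confidence": "high"}))
```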
Acknowledgements
The release closes the v3 pivot proposed in RFC 0002. Thanks to everyone who reviewed the architectural direction during the RFC's feedback window and to the upstream Ollama, Pydantic, and FastAPI maintainers whose work this builds on.
Full ship log
See CHANGELOG.md for the complete list of changes. Source-of-truth references:
- Specification: docs/specification-v3.md
- RFC: docs/RFCs/0002-v3-neuro-symbolic-pivot.md
- Conformance methodology: docs/conformance.md