Architecture¶

This page is the why. For the API, see the Reference; for the record of specific decisions, the Design decisions.

model-ledger is built on four load-bearing choices. Each was made against a real alternative, and each carries a cost we accepted on purpose.

The shape¶

graph TB
    subgraph consumers ["Consumers"]
        direction LR
        A["Agents<br/><small>MCP</small>"] ~~~ R["Frontends<br/><small>REST</small>"] ~~~ S["Scripts<br/><small>SDK</small>"] ~~~ C["CLI"]
    end
    subgraph protocol ["Agent protocol — consolidated tools"]
        direction LR
        T["discover · record · investigate · query · trace · changelog · tag"]
    end
    subgraph sdk ["Ledger SDK (tool-shaped)"]
        L["register · record · add · connect · trace · history · inventory_at · composites"]
    end
    subgraph sources ["Discovery"]
        direction LR
        CO["SourceConnector protocol<br/><small>sql · rest · github · yours</small>"]
    end
    subgraph storage ["Storage"]
        direction LR
        B["LedgerBackend protocol<br/><small>memory · sqlite · json · snowflake · http</small>"]
    end
    consumers --> protocol --> sdk
    sdk --> sources
    sdk --> storage
    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
    classDef ox fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
    class protocol ink;

The consumers are interchangeable because they all bottom out in the same tool-shaped SDK. Discovery and storage are both protocols, so the core stays tiny and the ecosystem extends it without forking.

1. The inventory is an event log, not a registry¶

A registry stores current state and overwrites it. model-ledger stores what happened and never overwrites anything: a model is an identity (ModelRef), and every change is an immutable, content-addressed Snapshot.

Why. The question a governance regime actually asks is "show me the complete history of every change, approval, and validation" — and "what was true on this past date?" A mutable registry structurally cannot answer the second question; an append-only log answers both for free, and content-addressing makes the chain tamper-evident.

The cost we accepted. More storage, and reconstruction (inventory_at) is a replay rather than a row read. We trade write-time simplicity for an audit trail that can't be quietly edited — the right trade for a system of record. → ADR 0001

2. Everything is a DataNode¶

An ML model, a heuristic rule, an ETL job, and an alert queue are the same shape: each consumes some inputs and produces some outputs. So they're one type — DataNode with typed ports — and the dependency graph assembles itself when an output port name matches an input port name.

Why. Discovery scales only if connectors stay dumb. A connector emits nodes with their ports and knows nothing about the rest of the graph; the cross-platform edges (an ETL job in your warehouse → a model in MLflow → a queue in your alerting system) fall out of port matching, with no shared ID scheme to maintain.

The cost we accepted. Two models can legitimately write a table with the same name. Bare names would over-link, so DataPort carries optional schema discriminators to keep edges precise. We rejected per-platform model types and a fixed metadata schema — both too rigid to span platforms. → ADR 0002

3. Agents are the primary interface¶

The SDK is tool-shaped: each method maps to one consolidated agent tool, exposed identically over MCP and REST. The verb set is deliberately small (discover, record, investigate, query, trace, changelog, tag) rather than a sprawl of endpoints.

Why. The most natural way to ask "which high-risk models changed this week and haven't been validated?" is to ask. Designing for the agent first (per Anthropic's tool-writing guidance) makes the SDK and REST surfaces cleaner as a side effect — consolidated, orthogonal, hard to misuse.

The cost we accepted. Fewer, broader tools mean a single call does more, which is a worse fit for fine-grained REST conventions. We optimize for the agent's working memory over endpoint granularity. → ADR 0003

4. Framework-agnostic core, pluggable everything¶

Storage, discovery, introspection, and compliance are all @runtime_checkable Protocols discovered via entry points. Regulations live in profiles — a plugin layer — not in the core. The core depends only on httpx + pydantic.

Why. model-ledger is an inventory for any organization with deployed models, not a single-regulation tool. Keeping regulations as a thin, swappable layer means a renumbered rule (SR 11‑7 → SR 26‑2) is a profile change, not a core change — see Governance. The tiny core is also what lets a downstream package add org-specific connectors and auth without touching it. → ADR 0004 · ADR 0005

The cost we accepted. record() takes a schema-free payload; envelope validation is the caller's (or a profile's) responsibility. We trade a rigid schema for the freedom to record whatever a platform actually has.

What model-ledger is not¶

Stating the boundary is part of the design:

Not a feature store or a serving layer. It inventories and relates models; it does not store features or serve predictions.
Not a monitoring/metrics system. It records that a validation or retrain happened (as an event); it doesn't compute drift or accuracy.
Discovery is point-in-time, not streaming. Connectors run on a schedule and snapshot what they find; last_seen lets you detect models that have gone silent, but the graph is as fresh as the last sync.
Connectors that need live credentials run from the SDK, not the agent. rest and prefect are pure-config and run through the discover tool; sql/github need a live connection or a callable and are driven from the SDK. The agent gets an actionable error, never a crash.

Where to go next¶

The primitives, in three ideas → Concepts
The guarantees the event log provides → Snapshots & the event log
The record of each decision and its alternatives → Design decisions